May 4, 2017
Smart Grid & Fog: Taking steps towards a prototype Software Platform for Fog Computing
by Gabor Karsai, Professor of Electrical Engineering and Computer Science and Senior Research Scientist at the Institute for Software-Integrated Systems at Vanderbilt University
The US power grid is a machine on a truly gigantic scale. It supplies reliable and stable electric power to more than 300 million customers via its 450,000 miles of high-voltage transmission lines. It is also inherently distributed and decentralized: there is no single central controlling authority, but rather a set of connected, interacting organizations that make energy management decisions in their own scope and coordinate with each other.
The power grid is also changing: from the radial model where the power generated in large power plants is distributed to consumers to a model where customers are both producers and consumers of energy. New technologies like solar and wind introduce further complexities as they are intermittent and require rapid, yet coordinated control actions over a geographically large area.
All these properties make the power system a prime target for fog computing applications. The sheer volume of data produced by a single measurement sensor (about 1 GB/day) and the timing requirements of some control loops (some control response times have to be less than 60 milliseconds) necessitate rapid, robust, distributed sensor processing and decision making that can only be performed on fog platforms.
Our project, titled Resilient Information Architecture Platform for Smart Grid (RIAPS), aims at building a software infrastructure that enables the construction of distributed, fault-tolerant, real-time applications on a fog computing platform. RIAPS adheres to the base principles of the OpenFog Reference Architecture.
The project is supported by the Department of Energy’s ARPA-E and is being conducted at Vanderbilt University, North Carolina State University, and Washington State University. The results of the project, namely the RIAPS design and the prototype implementation, will be open source.
RIAPS is very different from conventional platforms as it was designed for inherently distributed and decentralized applications. An application is composed of interconnected real-time software components (similar to micro-services) that can be event- and/or time-triggered and that interact via well-defined communication patterns, including publish/subscribe and synchronous and asynchronous service invocations. Such components are location transparent and agnostic about the underlying messaging framework. They are also (mostly) single-threaded, so developers should never need to write multi-threaded code.
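To make the component model above more concrete, here is a minimal sketch of a time-triggered publisher and an event-triggered subscriber sharing one single-threaded dispatch loop. It is not the RIAPS API: the names `Bus`, `Sensor`, `Averager`, and the `"voltage"` topic are hypothetical, and the in-process queue merely stands in for whatever messaging framework sits underneath.

```python
from collections import defaultdict, deque


class Bus:
    """Hypothetical in-process stand-in for the messaging framework."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of handlers
        self.queue = deque()                  # pending (topic, message) pairs

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        self.queue.append((topic, message))

    def run_until_idle(self):
        # A single thread drains the queue, so handlers never run concurrently.
        while self.queue:
            topic, message = self.queue.popleft()
            for handler in self.subscribers[topic]:
                handler(message)


class Sensor:
    """Time-triggered component: publishes one reading per timer tick."""

    def __init__(self, bus, readings):
        self.bus = bus
        self.readings = iter(readings)

    def on_timer(self):
        self.bus.publish("voltage", next(self.readings))


class Averager:
    """Event-triggered component: reacts to each published reading."""

    def __init__(self, bus):
        self.values = []
        bus.subscribe("voltage", self.on_voltage)

    def on_voltage(self, value):
        self.values.append(value)

    def mean(self):
        return sum(self.values) / len(self.values)


def demo():
    bus = Bus()
    sensor = Sensor(bus, [119.0, 120.0, 121.0])
    averager = Averager(bus)
    for _ in range(3):  # three simulated timer ticks
        sensor.on_timer()
        bus.run_until_idle()
    return averager.mean()
```

Neither component knows where the other one runs or how messages travel, which is the point of location transparency: swapping the in-process queue for a network transport would not change the component code.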
Beyond this, RIAPS also provides a set of generic services available to all applications. RIAPS applications are deployed on (industrial) fog nodes, in the field, at power system devices, so they are remotely managed: installed, activated, deactivated, and uninstalled. This is helped by a deployment service.
The hardware nodes, software applications, and their components can join and leave at any time, possibly due to reconfiguration or failures. The software components ‘find each other’ through a fault-tolerant distributed discovery service that acts as a decentralized registry of apps, components, and their services. This service is based on a distributed hash table that each node participates in, continuously sharing and propagating changes in the registry in a peer-to-peer manner.
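The idea behind a hash-table-based registry can be sketched in a few lines. The following is an illustrative simulation only, not the RIAPS discovery service: the class `DiscoveryRing`, the replica count, and the node and service names are all assumptions. Each service key is hashed onto a ring and stored on the next few nodes along that ring, so a lookup still succeeds after one of those nodes disappears.

```python
import hashlib
from bisect import bisect_right


def ring_position(name: str) -> int:
    """Map a node name or service key to a point on the hash ring."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")


class DiscoveryRing:
    """Hypothetical simulation of a replicated, hash-partitioned registry."""

    def __init__(self, replicas: int = 2):
        self.replicas = replicas
        self.stores = {}  # node name -> that node's local {key: endpoint}

    def join(self, node: str) -> None:
        self.stores[node] = {}

    def fail(self, node: str) -> None:
        # The failed node's local copy of the registry is lost.
        del self.stores[node]

    def holders(self, key: str) -> list:
        """Nodes responsible for a key: its successors on the ring."""
        ordered = sorted(self.stores, key=ring_position)
        positions = [ring_position(n) for n in ordered]
        start = bisect_right(positions, ring_position(key))
        count = min(self.replicas, len(ordered))
        return [ordered[(start + i) % len(ordered)] for i in range(count)]

    def register(self, key: str, endpoint: str) -> None:
        for node in self.holders(key):
            self.stores[node][key] = endpoint

    def lookup(self, key: str) -> str:
        for node in self.holders(key):
            if key in self.stores[node]:
                return self.stores[node][key]
        raise KeyError(key)
```

A short usage example: register a service on a three-node ring, remove the primary holder, and look the service up again.

```python
ring = DiscoveryRing(replicas=2)
for name in ("node-a", "node-b", "node-c"):
    ring.join(name)
ring.register("microgrid/voltage", "tcp://10.0.0.7:5555")
ring.fail(ring.holders("microgrid/voltage")[0])
print(ring.lookup("microgrid/voltage"))
```

The sketch deliberately omits re-replication after a failure and the peer-to-peer propagation of changes, both of which a real distributed hash table needs.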
Obviously anything can fail at any time, yet the applications must be resilient and thus able to recover. This is supported by a fault management service that runs at each node. The service monitors various ‘health indicators’ (e.g. network link is alive, application components are running, etc.) and upon detecting an anomaly it triggers an automatic recovery action (e.g. application component restart), possibly informing the application components themselves about the cause of the anomaly. The point is that the fault management service acts as an overseer that reasons about the cause of the anomaly, yet it also involves the application in the recovery process.
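The monitor, recover, and notify cycle described above could look roughly like the sketch below. The `FaultManager` and `Component` classes and their method names are hypothetical and only illustrate the division of labor: the manager detects the anomaly and triggers the recovery action, and the application is told what happened.

```python
class Component:
    """Hypothetical application component with a simple liveness flag."""

    def __init__(self):
        self.running = True
        self.restarts = 0
        self.notices = []

    def is_alive(self):
        return self.running

    def restart(self):
        self.running = True
        self.restarts += 1

    def on_fault(self, cause):
        # Application-level hook: the component learns why it was recovered.
        self.notices.append(cause)


class FaultManager:
    """Hypothetical per-node overseer that polls health indicators."""

    def __init__(self):
        self.checks = []  # (indicator name, probe, recover, notify)

    def watch(self, name, probe, recover, notify=None):
        self.checks.append((name, probe, recover, notify))

    def poll(self):
        """Run every probe once; recover and notify on each anomaly."""
        anomalies = []
        for name, probe, recover, notify in self.checks:
            if not probe():
                recover()
                if notify is not None:
                    notify(name)
                anomalies.append(name)
        return anomalies


def demo():
    component = Component()
    manager = FaultManager()
    manager.watch("component-running", component.is_alive,
                  component.restart, component.on_fault)
    component.running = False    # simulate a crash
    first = manager.poll()       # detects the anomaly and restarts
    second = manager.poll()      # healthy again, nothing to do
    return first, second, component
```

In a deployed system `poll` would be driven by a timer, and the probes would check real indicators such as network link status or process liveness.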
In real-time applications the notion of physical time is of extreme importance: application logic has to be aware of it and must operate ‘in time’. This is helped in RIAPS by a high-precision time synchronization service that keeps the local clocks of the RIAPS nodes precisely synchronized. Applications can read the clock and learn about its value and accuracy, can measure how much time a message is delayed in the messaging framework and the network, and schedule activities (e.g. taking a control action) at precisely defined points in time, in the future.
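Two of the capabilities mentioned above, measuring message delay and scheduling an action at a precise future time, follow directly from having synchronized clocks. The sketch below assumes a clock reading carries a value and an accuracy bound, both in integer nanoseconds. `ClockReading`, `message_delay`, and `Scheduler` are hypothetical names, not part of any RIAPS interface.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockReading:
    time_ns: int      # synchronized clock value, in nanoseconds
    accuracy_ns: int  # bound on the clock error, in nanoseconds


def message_delay(sent: ClockReading, received: ClockReading):
    """Return (delay, uncertainty) for a message stamped on two nodes.

    With synchronized clocks the delay is a plain difference; the
    worst-case error is the sum of the two accuracy bounds.
    """
    delay = received.time_ns - sent.time_ns
    uncertainty = sent.accuracy_ns + received.accuracy_ns
    return delay, uncertainty


class Scheduler:
    """Runs actions once the synchronized clock reaches their due time."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so actions themselves are never compared

    def at(self, due_ns, action):
        heapq.heappush(self._heap, (due_ns, self._seq, action))
        self._seq += 1

    def run_due(self, now_ns):
        """Execute, in due-time order, every action due at or before now_ns."""
        results = []
        while self._heap and self._heap[0][0] <= now_ns:
            _, _, action = heapq.heappop(self._heap)
            results.append(action())
        return results
```

For example, a message stamped at 1,000,000,000 ns and received at 1,040,000,000 ns was in flight for 40 ms, and with clock accuracies of 500 ns and 700 ns that figure is trustworthy to within 1,200 ns.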
Distributed applications often rely on classic distributed algorithms (e.g. group membership) that are somewhat tricky to implement, and boring to re-implement. In RIAPS these generic algorithms are implemented in the form of a distributed coordination service and accessed via generic APIs. The service supports group membership, leader election, and consensus protocols and relies heavily on asynchronous notifications with application-controlled timeouts. This makes it easy to implement more complex distributed algorithms (e.g. distributed averaging) that are critical for some power systems.
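Distributed averaging, mentioned above, is a good example of why such algorithms matter: every node ends up with the global average while only ever talking to its direct neighbors. The sketch below simulates the textbook consensus iteration, in which each node moves its value toward those of its neighbors. It is a generic illustration rather than RIAPS code; the four-node line topology, the step size, and the round count are assumptions chosen so that the iteration converges (the step size must stay below one over the largest number of neighbors).

```python
def averaging_step(values, neighbors, step):
    """One synchronous round: each node nudges its value toward its peers."""
    return {
        node: value + step * sum(values[peer] - value for peer in neighbors[node])
        for node, value in values.items()
    }


def distributed_average(values, neighbors, step=0.25, rounds=200):
    """Repeat the local update; on a connected graph all values approach the mean."""
    for _ in range(rounds):
        values = averaging_step(values, neighbors, step)
    return values


# Hypothetical topology: four nodes in a line, n1 - n2 - n3 - n4.
NEIGHBORS = {
    "n1": ["n2"],
    "n2": ["n1", "n3"],
    "n3": ["n2", "n4"],
    "n4": ["n3"],
}

# Hypothetical local measurements, e.g. available power at each node.
INITIAL = {"n1": 10.0, "n2": 20.0, "n3": 30.0, "n4": 40.0}

if __name__ == "__main__":
    for node, value in distributed_average(INITIAL, NEIGHBORS).items():
        print(node, round(value, 6))
```

Because every exchange between two neighbors is symmetric, the sum of all values is preserved in each round, which is why the common value the nodes settle on is the average of the initial measurements.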
Security is an essential ingredient of RIAPS. RIAPS is envisioned as an open, yet managed, platform, where applications of various pedigree and provenance operate side by side. Hence robust information flow and access control are needed so that no malicious application can interfere with the system. In other words, RIAPS is protected from the apps, the apps are protected from each other, and they can access only what they are authorized to. The application installation and management infrastructure is a potential threat vector, so it is protected by encrypted communication, cryptographically signed packages, and other best practices from industry. Applications are also resource-constrained so that when they misbehave, they are prevented from performing a denial-of-service attack on the system. Communications are encrypted, and their setup is authenticated to prevent confidentiality and integrity violations.
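As a rough illustration of checking a package before installation, the sketch below attaches an integrity tag to a package and rejects anything that was modified afterwards. Note the substitution: it uses an HMAC with a shared secret from the Python standard library as a stand-in, whereas the cryptographically signed packages mentioned above would normally rely on public-key signatures. The function names, the key, and the package contents are hypothetical.

```python
import hashlib
import hmac


def package_tag(package: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 integrity tag over the package bytes."""
    return hmac.new(key, package, hashlib.sha256).digest()


def verify_package(package: bytes, tag: bytes, key: bytes) -> bool:
    """Accept the package only if its tag matches (constant-time compare)."""
    expected = hmac.new(key, package, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # assumption: provisioned out of band
    package = b"app: microgrid-control\nversion: 1\n"
    tag = package_tag(package, key)
    print(verify_package(package, tag, key))                  # untouched package
    print(verify_package(package + b"malicious", tag, key))   # tampered package
```

With a public-key scheme the deployment service would hold only the verification key, so a compromised node could not produce valid packages itself; the shared-key version here does not have that property.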
RIAPS is an active project and it is entering its second year. A prototype is available for experimentation – the prototype supports the basic software component model, the discovery and deployment services, and has some power system examples, including integrations with industrial device protocols (like MODBUS and C37.118) and a microgrid control application. If you are interested, please visit http://riaps.isis.vanderbilt.edu or drop us an email.
About the author:
Gabor Karsai is Professor of Electrical Engineering and Computer Science and Senior Research Scientist at the Institute for Software-Integrated Systems at Vanderbilt University. He conducts research in the area of design tools and software platforms for cyber-physical systems, the theory and practice of model-integrated computing, and on resilient, decentralized, autonomous systems. Currently he is leading the ARPA-E project titled Resilient Information Architecture Platform for Smart Grid.