rt-ai: real time stream processing and inference at the edge enables intelligent IoT

For a change, the “rt” part of rt-ai doesn’t just stand for “richardstech”; it also stands for “real-time”. Real-time inference at the edge allows decision making in the local loop with low latency and no dependence on the cloud. rt-ai includes a flexible and intuitive infrastructure for joining together stream processing pipelines in distributed environments with restricted processing power. It is very easy for anyone to add new pipeline elements that fully integrate with rt-ai pipelines. This leverages some of the concepts originally prototyped in rtndf, while other parts of the rt-ai infrastructure have been in 24/7 use for several years, proving their intrinsic reliability.

Edge processing and control is essential if there is to be scalable use of intelligent IoT. I believe that dumb IoT, where everything has to be sent to a cloud service for processing, is a broken and unscalable model. The bandwidth requirements alone of sending all the data back to a central point will rapidly become unworkable. Latency guarantees are difficult to impossible in this model. Two advantages of rt-ai are the keys to scalable intelligent IoT: keeping raw data at the edge where it belongs, with only salient information upstreamed to the cloud, and minimizing the CPU cycles required in power-constrained environments.

Manifold: getting ready for the new <timestamp, object> store

Previously, Manifold did not directly deal with timestamps. Instead, rtndf includes a timestamp in the JSON part of the JSON and binary messages that are passed around over the Manifold. However, I am working on a <timestamp, object> store (i.e. a storage system that is indexed via a timestamp) and it makes sense to include this as a Manifold node since it is general purpose and not specific to rtndf over Manifold. Consequently, the latest version of Manifold has been modified to include a timestamp in the MANIFOLD_EHEAD data structure that is at the start of every Manifold message (it’s Manifold’s equivalent of the IP header in some ways).

This has had a knock-on effect of changing APIs since now the timestamp has to be passed explicitly to Manifold library functions. This means that the C++ and Python core libraries needed small modifications. Always a pain to have to go back and tweak all of the Manifold and rtndf C++ and Python nodes but it was worth it in this case. The <timestamp, object> storage system needs to support very high throughputs for both writes and reads so passing the timestamp around as a double in binary form makes a lot of sense. The idea is that all data flowing through the Manifold can be captured by the <timestamp, object> store which can then be accessed by other nodes at some other time. The store can be searched by absolute or nearest timestamp, allowing easy correlation of data across multiple sources.
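To make the absolute and nearest timestamp searches concrete, here is a minimal in-memory sketch in Python. It is not the actual store: the `TimestampStore` class and its method names are hypothetical, and it assumes timestamps are plain floats (seconds) as suggested by the double in the message header.

```python
import bisect


class TimestampStore:
    """Toy in-memory <timestamp, object> store kept sorted by timestamp."""

    def __init__(self):
        self._timestamps = []  # sorted list of float timestamps (seconds)
        self._objects = []     # objects, parallel to self._timestamps

    def put(self, timestamp, obj):
        # Insert while keeping both lists ordered by timestamp.
        index = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(index, timestamp)
        self._objects.insert(index, obj)

    def get_exact(self, timestamp):
        # Absolute lookup: the object stored at exactly this timestamp, or None.
        index = bisect.bisect_left(self._timestamps, timestamp)
        if index < len(self._timestamps) and self._timestamps[index] == timestamp:
            return self._objects[index]
        return None

    def get_nearest(self, timestamp):
        # Nearest lookup: compare the neighbours on either side of the
        # insertion point and return the closer (timestamp, object) pair.
        if not self._timestamps:
            return None
        index = bisect.bisect_left(self._timestamps, timestamp)
        candidates = [i for i in (index - 1, index) if 0 <= i < len(self._timestamps)]
        best = min(candidates, key=lambda i: abs(self._timestamps[i] - timestamp))
        return self._timestamps[best], self._objects[best]


store = TimestampStore()
store.put(10.0, "video frame")
store.put(10.5, "imu sample")
print(store.get_nearest(10.4))
```

Correlating two sources then amounts to taking a timestamp from one stream and asking for the nearest entry from the other. A real store would of course need persistence and much higher write throughput than a Python list insert can offer.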

The pyramid – an rtn data flow point of presence

The pyramid was originally put together for another project but has received a new lease of life as an rtn data flow point of presence. It uses a Logitech C920 webcam for video and audio and has powered speakers for text to speech or direct audio output. The top of the pyramid has an LED panel that indicates the current state of the pyramid:

  • Idle – waiting for wakeup phrase.
  • Listening – collecting input.
  • Processing – performing speech recognition and processing.
  • Speaking – indicates that the pyramid is generating sound.

The pyramid has a Raspberry Pi 2 internally along with a USB-connected Teensy 3.1 with an OctoWS2811 to run the LED panel. The powered speakers came out of some old Dell PC speakers and the case was 3D printed.

It runs these rtndf/Manifold nodes:

  • uvccam – generates a 1280 x 720 video stream at 30fps.
  • audio – generates a PCM audio stream suitable for speech recognition.
  • tts – text to speech node to convert text to speech.
  • tty – a serial interface used to communicate with the Teensy 3.1.

Speech recognition is performed by the speechdecode node that runs on a server, as is object recognition (recognize), motion detection (modet) and face recognition (facerec).

The old project had an intelligent agent that took the output of the various stream processors and generated the messages to control the pyramid. This has yet to be moved over to rtndf.

Containerizing of Manifold and rtndf (almost) complete

I’ve certainly been learning a fair bit about Docker lately. Didn’t realize that it is reasonably easy to containerize GUI nodes as well as console mode nodes so now rtnDocker contains scripts to build and run almost every rtndf and Manifold node. There are only a few that haven’t been successfully moved yet. imuview, which is an OpenGL node to view data from IMUs, doesn’t work for some reason. The audio capture node (audio) and the audio part of avview (the video and audio viewer node) also don’t work as there’s something wrong with mapping the audio devices. It’s still possible to run these outside of a container so it isn’t the end of the world but it is definitely a TODO.

Settings files for relevant containerized nodes are persisted at the same locations as the un-containerized versions making it very easy to switch between the two.

rtnDocker has an all script that builds all of the containers locally. These include:

  • manifoldcore. This is the base Manifold core built on Ubuntu 16.04.
  • manifoldcoretf. This uses the TensorFlow container as the base instead of raw Ubuntu.
  • manifoldcoretfgpu. This uses the TensorFlow GPU-enabled container as the base.
  • manifoldnexus. This is the core node that constructs the Manifold.
  • manifoldmanager. A management tool for Manifold nodes.
  • rtndfcore. The core rtn data flow container built on manifoldcore.
  • rtndfcoretf. The core rtn data flow container built on manifoldcoretf.
  • rtndfcoretfgpu. The core rtn data flow container built on manifoldcoretfgpu.
  • rtndfcoretfcv2. The core rtn data flow container built on rtndfcoretf and adding OpenCV V3.0.0.
  • rtndfcoretfgpucv2. The core rtn data flow container built on rtndfcoretfgpu and adding OpenCV V3.0.0.

The last two are good bases to use for anything combining machine learning and image processing in an rtn data flow PPE. The OpenCV build instructions were based on the very helpful example here. For example, the recognize PPE node, an encapsulation of Inception-v3, is based on rtndfcoretfgpucv2. The easiest way to build these is to use the scripts in the rtnDocker repo.

rtndf gets JSON + native binary transport

Using JSON for streaming data flows is very convenient. It allows pipeline processing elements (PPEs) to insert extracted features that enhance the value of the data in a very natural manner – just adding new JSON fields to the message.
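As a rough illustration of what that looks like in practice, here is a small Python sketch of a PPE step that adds its results to a message. The `add_features` helper and the field names (`timestamp`, `topic`, `motion`, `faces`) are made up for this example and are not the actual rtndf message schema.

```python
import json


def add_features(message_json, features):
    """Return a new serialized message with extra fields merged in."""
    message = json.loads(message_json)
    # Fields already in the message are kept; a clashing key would be overwritten.
    message.update(features)
    return json.dumps(message)


# Hypothetical incoming message and extracted features.
incoming = json.dumps({"timestamp": 1480000000.25, "topic": "video"})
outgoing = add_features(incoming, {"motion": True, "faces": 2})
print(outgoing)
```

Downstream PPEs that don’t care about the new fields can simply ignore them, which is what makes this style of enrichment so convenient.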

JSON does have one big limitation though: it has no way to transport binary data natively. I have been using base64 encoding for high rate sensor data (such as video and audio) but it always seemed ugly. So all of rtndf, with support from Manifold, now supports a joint JSON + native binary transport.

There’s no great rocket science here; the payloads just look like this:

<Serialized JSON message length (4 bytes)>
<Binary data length (4 bytes)>
<Serialized JSON message>
<Binary data>

Simple but effective in eliminating wasteful base64 encodes/decodes. I know that there are some binary JSON-like libraries but this seems like a better solution for this specific application where typically there is only one big binary field that needs to be transported.
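The layout above can be sketched in a few lines of Python. This is only an illustration and not the rtndf or Manifold library code: the `pack_message` and `unpack_message` names are hypothetical, and the byte order of the two length fields (big-endian here) and the UTF-8 encoding of the JSON are assumptions, since neither is specified above.

```python
import json
import struct

# Two unsigned 32-bit lengths. Network (big-endian) byte order is an
# assumption; the layout description does not state the byte order.
HEADER = struct.Struct("!II")


def pack_message(message, binary=b""):
    """Build <json length><binary length><json><binary> as one bytes object."""
    json_bytes = json.dumps(message).encode("utf-8")
    return HEADER.pack(len(json_bytes), len(binary)) + json_bytes + binary


def unpack_message(payload):
    """Split a payload back into (decoded JSON message, raw binary data)."""
    json_len, binary_len = HEADER.unpack_from(payload, 0)
    json_start = HEADER.size
    binary_start = json_start + json_len
    message = json.loads(payload[json_start:binary_start].decode("utf-8"))
    binary = payload[binary_start:binary_start + binary_len]
    return message, binary


# Hypothetical video message: metadata in JSON, the frame as raw bytes.
frame = bytes(range(256))
payload = pack_message({"topic": "video", "width": 1280, "height": 720}, frame)
message, binary = unpack_message(payload)
print(message, len(binary))
```

The binary part is never touched by the JSON encoder, so a PPE that only adds metadata can rewrite the JSON section and pass the binary bytes through unchanged.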

Another reason for being concerned about base64 encoding is that I want to be able to move uncompressed video frames around using shared memory links when the pipeline section is running within one machine. Right now, video frames move as JPEGs internally to keep data rates manageable but this uncompressed transport concept could eliminate a lot of wasteful JPEG encode/decodes too.

rtndf now running on Manifold

rtndf has now been (mostly) updated to run on the Manifold networking infrastructure instead of MQTT. This move opens up a lot of new possibilities, especially since a lot more existing code can be ported into the Manifold/rtndf environment very easily. This includes useful things such as timestamp-addressable data stores which make searching for related events straightforward.

Manifold – a new networking infrastructure for rtndf

While I have been using MQTT so far for rtndf, I always had in mind using my own infrastructure. I have been developing the concepts on and off since about 2003 and there’s a direct line from the early versions (intended for clusters of robots to form ad-hoc meshes), through SyntroNet and SNC to the latest incarnation called Manifold. It has some nice features such as auto-discovery, optimized distributed multicast, easy resilience and a distributed directory system that makes node discovery really easy.

The Manifold is made up of nodes. The most important node is ManifoldNexus which forms the hyper-connected fabric of the Manifold. The plan is for rtndf apps to become Manifold nodes to take advantage of the capabilities of Manifold. Manifold has APIs for C++ and Python.

Even though it is very new, Manifold is working quite well. Using Python source and sink scripts, it’s possible to get throughput of around 2 GB per second for both end to end (E2E) and multicast traffic. This figure was obtained using 5000 packets per second, each of 400,000 bytes, on an i7-5820K machine. Between machines, rates are obviously limited by link speeds for large packets. Round-trip E2E latency is around 50µs for small packets, which could probably be improved. Maximum E2E message rate is about 100,000 per second between two nodes.
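The throughput figure follows directly from the packet rate and size, as this quick check shows. It only uses the numbers quoted above; the per-message time is simply the inverse of the maximum message rate and is not a separately measured value.

```python
packets_per_second = 5000
bytes_per_packet = 400_000

# Sustained throughput in bytes per second.
throughput = packets_per_second * bytes_per_packet
print(throughput / 1e9, "GB/s")

# Average time available per message at the maximum E2E message rate.
max_messages_per_second = 100_000
per_message_budget_us = 1e6 / max_messages_per_second
print(per_message_budget_us, "microseconds per message")
```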

Manifold does potentially lend itself to being used with poll mode Ethernet links and shared memory links. Poll mode shared memory links are especially effective as latency is minimized and data predominantly bounces off the CPU’s caches. DPDK links are a further option for inter-machine connectivity. Plenty of work left to do…