Adding a schemaless, timestamp searchable data store to rt-ai Edge using Manifold

The MQTT-based heart of rt-ai Edge is ideal for constructing stream processing networks (SPNs) that are intended to run continuously. rt-ai Edge tools (such as rtaiDesigner) make it easy to modify and re-deploy SPNs across multiple nodes during the design phase but, once in full-time operation, these SPNs just run by themselves. An existing stream processing element (SPE), PutNiFi, allows data from an rt-ai Edge network to be stored and processed by big data tools, using Elasticsearch for example. However, these types of big data tools aren’t always appropriate, especially if low latency access is required, as Java garbage collection can cause unpredictable delays.

For many applications, much simpler but reliably low latency storage is desirable. The Manifold system already has a storage app, ManifoldStore, that is optimized for timestamp-based searches of historical data. A new SPE called PutManifold allows data from an SPN to flow into a Manifold networking surface. The SPN screen capture above shows two instances of the PutManifold SPE used to transfer audio and video data from the SPN. ManifoldStore grabs passing data and stores it using the timestamp as the key. Manifold applications can then access historical data flows using streamId/timestamp pairs. Because every stream is indexed by time in the same way, it is particularly simple to coordinate access across multiple data streams. This is very useful when trying to correlate events across multiple data sources at a particular point or window in time.
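
To make the streamId/timestamp idea concrete, here is a minimal in-memory sketch of a timestamp-keyed store. It is not ManifoldStore code: the class name `TimestampStore` and its methods are hypothetical, and it only illustrates how sorted timestamps make point and window lookups across several streams straightforward.

```python
import bisect
from collections import defaultdict


class TimestampStore:
    """Toy in-memory store keyed by (stream_id, timestamp). Hypothetical, not ManifoldStore."""

    def __init__(self):
        # Per stream: parallel lists of timestamps and records, kept sorted by timestamp.
        self._timestamps = defaultdict(list)
        self._records = defaultdict(list)

    def put(self, stream_id, timestamp, record):
        """Insert a record while keeping the stream sorted by timestamp."""
        index = bisect.bisect_right(self._timestamps[stream_id], timestamp)
        self._timestamps[stream_id].insert(index, timestamp)
        self._records[stream_id].insert(index, record)

    def at_or_before(self, stream_id, timestamp):
        """Return the latest (timestamp, record) at or before the given time, or None."""
        stamps = self._timestamps.get(stream_id, [])
        index = bisect.bisect_right(stamps, timestamp)
        if index == 0:
            return None
        return stamps[index - 1], self._records[stream_id][index - 1]

    def window(self, stream_id, start, end):
        """Return all (timestamp, record) pairs with start <= timestamp <= end."""
        stamps = self._timestamps.get(stream_id, [])
        records = self._records.get(stream_id, [])
        lo = bisect.bisect_left(stamps, start)
        hi = bisect.bisect_right(stamps, end)
        return list(zip(stamps[lo:hi], records[lo:hi]))


store = TimestampStore()
store.put("video", 10.0, "video frame A")
store.put("audio", 10.2, "audio chunk A")
store.put("video", 10.5, "video frame B")

# Correlate both streams at the same point in time.
for stream_id in ("video", "audio"):
    print(stream_id, store.at_or_before(stream_id, 10.3))
```

The same timestamp can be used to query every stream, which is what makes correlating audio and video at one instant a single loop rather than a join.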

ManifoldStore is intrinsically schemaless in that it can store anything that consists of a JSON part and a binary data part, as used in rt-ai Edge. A new application called rtaiView is a universal viewer that allows multiple streams of all types to be displayed in a traditional split-screen monitoring format. It uses ManifoldStore for its underlying storage and provides a window into the operation of the SPN.
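
As an illustration of what "a JSON part and a binary data part" can look like in practice, the sketch below packs both into one byte string and splits them again. The framing used here (a 4-byte length prefix followed by the JSON and then the raw bytes) and the field names are assumptions made for the example; the actual rt-ai Edge and Manifold wire formats are not described in this post.

```python
import json
import struct


def pack_message(metadata, payload):
    """Combine a JSON-serializable dict and raw bytes into a single blob.

    Assumed layout: 4-byte big-endian JSON length, JSON bytes, binary payload.
    """
    header = json.dumps(metadata).encode("utf-8")
    return struct.pack(">I", len(header)) + header + payload


def unpack_message(blob):
    """Split a blob produced by pack_message back into (metadata, payload)."""
    (header_len,) = struct.unpack(">I", blob[:4])
    metadata = json.loads(blob[4:4 + header_len].decode("utf-8"))
    return metadata, blob[4 + header_len:]


# Hypothetical field names, chosen only for the example.
blob = pack_message({"streamId": "cam0", "timestamp": 1528000000.25}, b"\xff\xd8 jpeg bytes")
metadata, payload = unpack_message(blob)
print(metadata["streamId"], len(payload))
```

Because the store never needs to interpret either part, any message shaped like this can be saved and retrieved without a schema.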

Manifold is designed to be very flexible, with various features that reduce configuration for ad-hoc uses. This makes it very easy to perform offline processing of stored data as and when required, which is ideal for offline machine learning applications.

Moving into the big (data) league with rt-ai Edge and Apache NiFi

The main reason for rt-ai Edge’s existence is to reduce large volumes of raw data into much smaller amounts of data with high semantic content. Sometimes this can be acted upon in the local loop (i.e. within the edge space) when that makes sense or low latency is critical. Even when the data is acted upon locally, it may still be useful to store the information extracted from the raw streams for later offline processing such as machine learning. Since Apache NiFi has all the required interfaces, it makes sense that rt-ai Edge can pass data into Apache NiFi, using it as a gateway to big data type applications.

For this simple example, I am storing recovered license plate data in Elasticsearch. The screen capture above shows the rt-ai Edge stream processing network (SPN) with the new PutNiFi stream processing element (SPE). PutNiFi transfers any rt-ai message desired into an Apache NiFi instance using MQTT for transport.

This screen capture shows the very simple Apache NiFi design. The ConsumeMQTT processor is used to collect messages from the PutNiFi SPE and then passes these to Elasticsearch for storage. Obviously a lot more could be going on here if required.

Automatic license plate recognition with OpenALPR and rt-ai Edge

I came across OpenALPR a little while ago when thinking about the general problem of enhancing the value of video feeds. It has an easy to use Python binding so it didn’t take very long to create an rt-ai Edge stream processing element (SPE). Actually, the OpenALPR part of it is one line of code – it takes a jpeg from the video stream and adds any recognized plate info as metadata to the output message. The trivial stream processing network in the screen capture above shows its operation as an inline semantic enhancer of a video stream. The OpenALPR SPE only outputs a video frame if it either already contains metadata or else the OpenALPR SPE has added metadata. In this way, multiple recognizers can be applied to the same frame using a pipeline of SPEs.
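
The forwarding rule can be sketched in a few lines. This is not the SPE's actual code: the message layout (`jpeg` and `metadata` keys) and the `recognize` callable are hypothetical stand-ins, with `recognize` taking the place of the real OpenALPR call so the example runs without the library installed.

```python
def process_frame(message, recognize):
    """Apply a recognizer to one frame and decide whether to forward it.

    message:   dict with a 'jpeg' bytes entry and an optional 'metadata' dict
               (hypothetical layout, assumed for this sketch).
    recognize: callable taking jpeg bytes and returning a list of plate strings;
               a stand-in for the real OpenALPR binding.
    Returns the message to publish, or None if the frame should be dropped.
    """
    metadata = dict(message.get("metadata") or {})
    plates = recognize(message["jpeg"])
    if plates:
        metadata["plates"] = plates
    if not metadata:
        # Nothing upstream tagged this frame and no plate was found: drop it.
        return None
    return {**message, "metadata": metadata}


def fake_recognizer(jpeg):
    """Pretend recognizer used only for the example."""
    return ["ABC123"] if b"car" in jpeg else []


print(process_frame({"jpeg": b"empty street"}, fake_recognizer))
print(process_frame({"jpeg": b"a car"}, fake_recognizer)["metadata"])
```

Passing through frames that already carry metadata is what lets several recognizers sit in series without one stage discarding another stage's findings.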

While I can now see a few private houses starting to sprout specialized license plate reading cameras (which are optimized for this purpose, especially for night operation), I don’t have anything set up as yet so I had to make do with printing car images and waving them in front of a webcam. Seemed to work fine but it would be nice to have a proper setup.

Recognized license plate metadata then becomes another feature that can be used for machine learning and inference within the edge environment – another step on the path to sentient spaces perhaps.

Real time edge inference monitoring with rt-ai Edge

rt-ai Edge is progressing nicely and now supports multi-node operation (i.e. multiple networked servers participating in a processing network) along with real-time monitoring. The screen capture shows a simple processing network where the video feed from a camera is passed through a DeepLab-v3+ stream processing element (SPE) and then on to two separate media viewers. At the top of each SPE block in the Designer window is some text like Cam(Default). Here, Cam is the name given to the SPE while Default is the name of the node (server) on which the SPE is running. In this design there are two nodes, Default and rtai0.

The code underlying the common SPE API communicates with the Designer window and supplies the stats about bytes and messages in and out. Soon, this path will also allow SPE-specific real-time parameter tweaking from the Designer window.

To add a node to the system, it just needs to have all of the prerequisites installed and run a special NodeManager SPE. This also communicates with the Designer and supports SPE deployment and runtime control, activated when the user presses the Deploy design button. Moving an SPE between nodes is just a case of reassigning it, generating the design and then deploying the design again.

The green outlines around each SPE indicate the state of the SPE and the node on which it is running. When it is all green, as in the first screen capture, this indicates that both SPE and node are running. For the second screen capture, I manually terminated the View2 SPE on rtai0. The inner part of the outline has now gone red. This indicates that the node is up but the SPE is down. If the outline is all red, it means that the node is down and not communicating with the Designer.
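
The mapping from node and SPE state to outline colors is small enough to write down directly. The function below is only an illustration of the rule as described here, with a made-up name and return format, and is not taken from rtaiDesigner.

```python
def outline_colors(node_up, spe_up):
    """Return (outer, inner) outline colors for an SPE block.

    Hypothetical helper: the outer part reflects the node, the inner part the SPE.
    """
    if not node_up:
        # A node that is down cannot report on its SPEs, so everything is red.
        return ("red", "red")
    return ("green", "green" if spe_up else "red")


print(outline_colors(node_up=True, spe_up=True))
print(outline_colors(node_up=True, spe_up=False))
print(outline_colors(node_up=False, spe_up=False))
```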

It’s interesting to note that DeepLab-v3+ is processing around 5 frames per second using a GTX-1080 GPU. The input rate from the camera is 30 frames per second. The processor drops frames while it is still processing an earlier frame, ensuring that queues do not build up and latency is kept to a minimum.
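
The effect of dropping frames instead of queuing them can be shown with a small simulation. This is a sketch of the policy only, not the rt-ai Edge implementation; time is measured in whole camera frame periods (1/30 of a second) so that the arithmetic stays exact, and a processing time of 6 periods corresponds to roughly 5 frames per second.

```python
def frames_processed(arrival_times, processing_time):
    """Return the indices of frames that get processed under a drop-if-busy policy.

    A frame is accepted only if the processor has finished the previous one;
    frames arriving while it is busy are discarded rather than queued.
    """
    processed = []
    busy_until = float("-inf")
    for index, arrival in enumerate(arrival_times):
        if arrival >= busy_until:
            processed.append(index)
            busy_until = arrival + processing_time
        # Otherwise the frame is dropped, so no backlog can build up.
    return processed


# One second of camera input: 30 frames, one per time unit.
processed = frames_processed(range(30), processing_time=6)
print(processed)
print(len(processed), "of 30 frames processed")
```

Every processed frame is started the moment it arrives, so the output is never more than one processing time behind the camera, which would not be true if the 25 skipped frames were queued instead.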

DeepLabv3+ Stream Processing Element (SPE) for rt-ai Edge

Integrating DeepLabv3+ with rt-ai Edge turned out to be pretty straightforward and follows from an existing TensorFlow-based Inception Stream Processing Element (SPE). The screen capture above shows an example of what it can do when given a video stream, where the DeepLab SPE has removed all pixels that aren’t part of recognized objects. The model I am using was trained on the PASCAL VOC dataset, so it can only recognize a finite set of object categories. This is why I am waving a bottle of beer about (and not because it is after 5pm). Waving a cow about didn’t seem practical, hence the bottle. This is the original frame from the camera:

The DeepLab SPE also allows a specific category to be selected. In the case of the capture below, this was just the bottle:

On the right hand side of the media viewer screen you can see the metadata that has been generated by the DeepLab SPE. This is an example of how rt-ai Edge SPEs can be used to enhance the semantic content of data – video frames in this case.
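
The two masking modes described above (keep every recognized object, or keep a single category) can be sketched without any image library. This is an illustration only and not the DeepLab SPE's code: it assumes the segmentation result is a per-pixel grid of class indices with 0 meaning background and 5 meaning bottle, following the order of the PASCAL VOC category list, and it uses plain nested lists in place of real image arrays.

```python
BACKGROUND = 0  # assumed index of the background class
BLACK = (0, 0, 0)


def mask_frame(pixels, labels, category=None):
    """Blank out pixels that should not be shown.

    pixels:   rows of (r, g, b) tuples.
    labels:   rows of class indices, same shape as pixels.
    category: None keeps every non-background pixel; an integer keeps
              only pixels labeled with that class index.
    """
    masked = []
    for pixel_row, label_row in zip(pixels, labels):
        row = []
        for pixel, label in zip(pixel_row, label_row):
            if category is None:
                keep = label != BACKGROUND
            else:
                keep = label == category
            row.append(pixel if keep else BLACK)
        masked.append(row)
    return masked


# A one-row "frame": background, bottle (5), person (15).
pixels = [[(10, 10, 10), (20, 20, 20), (30, 30, 30)]]
labels = [[0, 5, 15]]
print(mask_frame(pixels, labels))              # all recognized objects kept
print(mask_frame(pixels, labels, category=5))  # bottle only
```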

It is pretty easy to configure the DeepLab SPE using rtaiDesigner:

This is the design screen showing the fairly trivial flow used for this test. Cam is a webcam capture SPE and Audio is an audio capture SPE. The DeepLab SPE is connected in the flow between the webcam capture SPE and the media view SPE.

An interesting feature of rt-ai Edge is how SPEs can be configured. An SPE consists of some code (Python scripts in these cases) and a module spec (mspec) file. The mspec file contains information about subscriber and publisher ports as well as a section that is used to generate a configuration dialog. An example for the DeepLab SPE module dialog is shown above. This is the mspec file that generated it:

{
    "ModuleType" : "DeepLab",

    "ModuleDialog" : {
        "DialogName" : "DeepLab",
        "DialogDesc" : "Settings dialog for DeepLab semantic segmentation",

        "DialogData" : [
            {
                "VarName" : "OutputFormat",
                "VarDesc" : "Output frame format",
                "VarType" : "ConfigSelection",
                "VarValue" : "0",
                "VarStringArray" : [{ "VarEntry" : "Color map"},{"VarEntry" : "Masked image" },{"VarEntry" : "Single category masked image" }]
            },
            {
                "VarName" : "Category",
                "VarDesc" : "Single category selector",
                "VarType" : "ConfigSelection",
                "VarValue" : "15",
                "VarStringArray" : [
                    {"VarEntry" : "background"},
                    {"VarEntry" : "aeroplane"},
                    {"VarEntry" : "bicycle" },
                    {"VarEntry" : "bird" },
                    {"VarEntry" : "boat" },
                    {"VarEntry" : "bottle" },
                    {"VarEntry" : "bus" },
                    {"VarEntry" : "car" },
                    {"VarEntry" : "cat" },
                    {"VarEntry" : "chair" },
                    {"VarEntry" : "cow" },
                    {"VarEntry" : "diningtable" },
                    {"VarEntry" : "dog" },
                    {"VarEntry" : "horse" },
                    {"VarEntry" : "motorbike" },
                    {"VarEntry" : "person" },
                    {"VarEntry" : "pottedplant" },
                    {"VarEntry" : "sheep" },
                    {"VarEntry" : "sofa" },
                    {"VarEntry" : "train" },
                    {"VarEntry" : "tv" }
                ]
            },
            {
                "VarName" : "Preview",
                "VarDesc" : "Enable preview",
                "VarType" : "ConfigBool",
                "VarValue" : "false"
            }
        ]
    },
    
    "ModulePubSubs" : {
        "Pubs" : {
            "VideoOut" : "VideoMJPEG"
        },

        "Subs" : {
            "VideoIn" : "VideoMJPEG"
        }
    }
}

This makes it very easy to try out different settings. Use the module’s dialog to change something, regenerate the design using the Generate design button and then restart the network. Right now, for testing, rtaiDesigner generates start.sh and stop.sh scripts that can be used to quickly implement changes. Hopefully, in the future, configuration changes will be possible on the fly without having to restart the stream processing network.
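
For anyone wanting to read an mspec file from a script, the sketch below turns the dialog section into a plain dictionary of default settings. It is not rt-ai Edge code: the embedded JSON is a trimmed copy of the DeepLab mspec above, the helper name `default_settings` is made up, and it assumes that `VarValue` for a `ConfigSelection` is an index into `VarStringArray`, which matches the file but is not stated anywhere.

```python
import json

# Trimmed copy of the DeepLab mspec, kept short for the example.
MSPEC_TEXT = """
{
    "ModuleType": "DeepLab",
    "ModuleDialog": {
        "DialogData": [
            {
                "VarName": "OutputFormat",
                "VarType": "ConfigSelection",
                "VarValue": "0",
                "VarStringArray": [
                    {"VarEntry": "Color map"},
                    {"VarEntry": "Masked image"},
                    {"VarEntry": "Single category masked image"}
                ]
            },
            {"VarName": "Preview", "VarType": "ConfigBool", "VarValue": "false"}
        ]
    }
}
"""


def default_settings(mspec):
    """Map each dialog variable name to a decoded default value."""
    settings = {}
    for var in mspec["ModuleDialog"]["DialogData"]:
        value = var["VarValue"]
        if var["VarType"] == "ConfigSelection":
            # Assumption: the stored value indexes the list of entries.
            value = var["VarStringArray"][int(value)]["VarEntry"]
        elif var["VarType"] == "ConfigBool":
            value = value.lower() == "true"
        settings[var["VarName"]] = value
    return settings


settings = default_settings(json.loads(MSPEC_TEXT))
print(settings)
```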

Semantic image segmentation with TensorFlow using DeepLab


I have been trying out a TensorFlow application called DeepLab that uses deep convolutional neural nets (DCNNs) along with some other techniques to segment images into meaningful objects and then label what they are. Using a script included in the DeepLab GitHub repo, the Pascal VOC 2012 dataset is used to train and evaluate the model. One of the results is shown above. It has managed to extract some pretty ugly furniture from a noisy background quite nicely. Here are a couple more examples:


The software has done a nice job of extracting the foreground objects in another very noisy scene.


The person in the background is picked up pretty nicely here – I didn’t even notice the person at first.

Incidentally, to get local_test.sh to work on Ubuntu 16.04 I had to change the call to download_and_convert_voc2012.sh to use bash instead of sh, otherwise it generated an error. Also, I needed to install cuDNN 7.0.4 for CUDA 9.0 rather than cuDNN 7.1.1 in order to get the Jupyter notebook example operating.

What I would like to do now is to create an rt-ai Edge Stream Processing Element (SPE) based on this code to act as a preprocessor stage in order to isolate and identify salient objects in a video stream in real time. One of my interests is understanding behaviors from video and this could be a valuable component in that pipeline by allowing later stages to focus on what’s important in each frame.

Why not just use NiFi and MiNiFi instead of rt-ai Edge?

Any time I start a project I always wonder if I am just reinventing the wheel. After all, there is so much software out there (on GitHub and elsewhere) that almost everything already exists in some form. The most obvious analog to rt-ai Edge is Apache NiFi and Apache MiNiFi. NiFi provides a very rich environment of processor blocks and great tools for joining them together to create stream processing pipelines. However, there are some characteristics of NiFi that I don’t particularly like. One is the reliance on the JVM and the consequent garbage collection issues that mess up latency guarantees. Tuning a NiFi installation can be a bit tricky – check here for example. However, many of these things are the price that is inevitably paid for having such a rich environment.

rt-ai Edge was designed to be a much simpler and lower overhead way of creating flexible stream processing pipelines in edge processors with low latency connections and no garbage collection issues. That isn’t to say that an rt-ai Edge pipeline module could not be written using a managed memory language if desired (it certainly could) but instead that the infrastructure does not suffer from this problem.

In fact, rt-ai Edge and NiFi can play together extremely well. rt-ai Edge is ideal at the edge, while NiFi is ideal at the core. While MiNiFi is the NiFi solution for embedded and edge processors, rt-ai Edge can either replace or work with MiNiFi to feed into a NiFi core. So maybe it’s not a case of reinventing the wheel so much as making the wheel more effective.