Fresh from success with YOLOv3 on the desktop, a question came up of whether this could be made to work on the Movidius Neural Compute Stick and therefore run on the Raspberry Pi.
The NCS is a neat little device and because it connects via USB, it is easy to develop on a desktop and then transfer everything needed to the Pi.
The app zoo, on the ncsdk2 branch, has a tiny_yolo_v2 implementation that I used as the basis for this. It only took about an hour to get this working on the desktop – integration with rt-ai was very easy. The Raspberry Pi end was not – all kinds of version number issues and things like that. However, even though not all of the tools would compile, I just moved the compiled graph from the desktop to the Pi and that worked fine.
This is the design. The main difference here from the usual test designs is that the MYOLO SPE is assigned to node pi34 (the Raspberry Pi) rather than the desktop (Default). Just assigning the MYOLO SPE to the Pi saved me from having to connect a Picam or uvc camera to the Pi and also allowed me to get a better feel for the pure performance of the Pi with the NCS.
As can be seen from the first screen capture it worked fine although, because it supports only a subset (20 of 91) of the usual COCO labels, it did not pick up the mouse or the keyboard. Performance-wise, it was running at about 1fps and 30% CPU. Just for reference, I was getting about 8fps on the i7 desktop.
Docker containers are a great way of reducing the headaches caused by pre-requisites and software versions when deploying code in general and rt-ai SPEs in particular. So it made sense to add support for SPEs in Docker containers in addition to the existing bare metal SPEs. The screen capture above shows the test design in rtaiDesigner using the Docker containerized version of the existing TensorFlow object detector. It is essentially identical to the bare metal version, just with the object detection SPE replaced with the Dockerized version. The container was based on the TensorFlow GPU image.
SPE code is deployed to nodes as a package that includes start and stop scripts. Normally, the start script is something very simple: a single line kicking off a Python script for example. Docker SPEs use a slightly more complex start script that first tries to pull the required Docker image from a defined registry location and then invokes the container in the required manner (using nvidia-docker if necessary).
No changes were required to the SPE code itself in this case – just customization of the start and stop scripts and I added some files used to build the container and install it in the local registry so that the build and update process is very straightforward. Plus, as this test design shows, bare metal and containerized SPEs can be mixed without limitation as the stream interfaces are identical in all cases.
I decided that it would be fun to try out a Google AIY Vision Kit as a sort of warm-up for the potentially much more significant Edge TPU.
The Vision Kit is basically the same configuration as the ZeroSensor camera except with an extra board in the camera path that can perform inference on the captured images. The kit comes with some frozen graphs that can be used to detect a few things but I thought it would be interesting to try training a MobileNet SSD network with the Pascal VOC 2012 training data which can identify 20 different objects. The instructions for how to do this are here.
Once that was all running, the next step was to integrate it with rt-ai Edge. It’s pretty similar to the earlier full-blown TensorFlow version so it didn’t take too long to get working.
The design is much the same as usual except with the new VisionKit object detection SPE instead of TFObjectDetect or Deeplab. Note that the PiCam and VisionKit SPEs are running on the AIY Vision Kit, whereas the MediaView SPE is running on a desktop.
This is the output from the MediaView SPE. The metadata has been formatted to look exactly the same as the previous TensorFlow detector so that they can be used interchangeably in stream processing networks. I am getting about 2 fps with 640 x 360 images which is actually better than I expected.
I really like using JSON encoding as a way of transferring messages between processes as it is machine and language independent. Plus, it is very well suited to stream processing networks (such as rt-ai Edge) as arbitrary fields can be added to existing JSON messages and passed along. Contrast this with compiled IDLs which typically have no flexibility whatsoever.
One problem though is that binary data cannot be included in JSON messages directly. Typically base64 encoding is used to convert binary data into text. However, this is inefficient, especially in a stream processing network where base64 decoding and encoding might have to be done several times.
There are a variety of modifications to JSON around but it is very simple to just add binary data on to the end of a JSON message to form a complete message that can be transferred via MQTT for example.
In Python, an MQTT message can be published like this:
def publish(topic, jsonData, binData = None):
jsonDump = json.dumps(jsonData)
jsonString = struct.pack('>I', len(jsonDump)) + jsonDump + binData
Here, jsonData contains the normal JSON message text, binData contains the binary data to be sent along with it. To receive the message, use something like this:
def onMessage(client, userdata, message):
jsonLength = struct.unpack('>I', message.payload[0:4])
jsonData = json.loads(message.payload[4:4+jsonLength])
binData = message.payload[4+jsonLength:]
rtndf now has Python PPEs that support streaming data from a variety of environmental sensors. The sensehat PPE streams data from all of the sensors on the Raspberry Pi Sense HAT. The sensors PPE streams data from a variety of common environmental sensors:
- ADX345 accelerometer
- BMP180 pressure/temperature sensor
- HTU21D humidity sensor
- MCP9808 temperature sensor
- TMP102 temperature sensor
- TSL2561 light sensor
The specific sensors in use can be enabled by selectively commenting out lines in the sensors Python script.
sensorview is another new PPE that can display the sensor streams generated by sensehat and sensors. The screenshot shows the data from a sensehat for example.
rtndf now has a simple Python script that allows a Raspberry Pi fitted with PiCamera to be used as a PPE to generate an MJPEG video stream over MQTT. The resulting stream looks identical to that generated by uvccam (which works with UVC webcams) so picam and uvccam can be used interchangeably as video sources in rtndf pipelines.
The idea of joining together separate, lightweight processing elements to form complex pipelines is nothing new. DirectX and GStreamer have been doing this kind of thing for a long time. More recently, Apache NiFi has done a similar kind of thing but with Java classes. While Apache NiFi does have a lot of nice features, I really don’t want to live in Java hell.
I have been playing with MQTT for some time now and it is a very easy to use publish/subscribe system that’s used in all kinds of places. Seemed like it could be the glue for something…
So that’s really the background for rtnDataFlow or rtndf as it is now called. It currently uses MQTT as its pub/sub infrastructure but there’s nothing too specific there – MQTT could easily be swapped out for something else if required. The repo consists of a number of pipeline processing elements that can be used to do some (hopefully) useful things. The primary language is Python although there’s nothing stopping anything being used provided it has an MQTT client and handles the JSON messages correctly. It will even be able to include pipeline processing elements in Docker containers. This will make deployment of new, complex, pipeline processing elements very simple.
The pipeline processing elements are all joined up using topics. Pipeline processing elements can publish to one or more topics and/or subscribe to one or more topics. Because pub/sub systems are intrinsically multicasting, it’s very easy to process data in multiple ways in parallel (for redundancy, performance or functionality). MQTT also allows pipeline processing elements to be distributed on multiple systems, allowing load sharing and heterogeneous computing systems (where only some machines might be fitted with GPUs for example).
Obviously, tools are required to design the pipelines and also to manage them at runtime. The design aspect will come from an old code generation project. While that actually generates C and Python code from a design that the user inputs via a graphical interface, the rtnDataFlow version will just make sure all topic names and broker addresses line up correctly and then produce a pipeline configuration file. A special app, rtnFlowControl, will run on each system and will be responsible for implementing the pipeline design specified.
So what’s the point of all of this? I’m tired of writing (or reworking) code multiple times for slightly different applications. My goal is to keep the pipeline processing elements simple enough and tightly focused so that the specific application can be achieved by just wiring together pipeline processing elements. There’ll end up being quite a few of these of course and probably most applications will still need custom elements but it’s better than nothing. My initial use of rtnDataFlow will be to assist with experiments to see how machine learning tools can be used with IoT devices to do interesting things.