rtaiView: an rt-ai app for viewing real-time and historic sensor data

I am now pulling things together so that I can use the ZeroSensors to perform long-term data collection. Data generated by the rt-ai Edge design is passed into the Manifold and then captured by ManifoldStore, one of the standard Manifold nodes. Obviously it would be nice to know that meaningful data is being stored and that’s where rtaiView comes in. The screen capture above shows the real-time display when it has been configured to receive streams from the video and data components of the ZeroSensor streams. This is showing the streams from a couple of ZeroSensors but more can be added and the display adjusts accordingly.

This is the simple ZeroSpace design as seen in the rtaiDesigner editor window. The hardware setup consists of the ZeroSensors running the SensorZero synth stream processor element (SPE) and a server running the DeepLabv3 SPEs and the ManifoldZero synths. The ManifoldZero synths consist of a couple of PutManifold SPEs that take each stream from the ZeroSensor and map it to a Manifold stream.

ManifoldStore captures these streams and persists them to disk as can be seen from the screen capture above.

This allows rtaiView to display the real-time data coming from the ZeroSensors and historic data based on timecode.

The screen capture above shows rtaiView in historic (or DVR) mode. The control widget (at the top right) allows the user to scan through periods of time and visualize the data. The same timecode is used for all streams displayed, making it easy to correlate events between them.
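As a rough illustration of how one shared timecode can drive several stored streams at once, here is a minimal Python sketch. The stream names, the in-memory record layout and the helper functions are all hypothetical; how ManifoldStore actually lays out and indexes its data on disk is not described here.

```python
from bisect import bisect_right

def record_at(records, timecode):
    """Return the latest (timestamp, payload) record at or before timecode, or None."""
    timestamps = [timestamp for timestamp, _ in records]
    index = bisect_right(timestamps, timecode)
    return records[index - 1] if index > 0 else None

def frame_for_all_streams(streams, timecode):
    """Resolve one shared timecode against every stream being displayed."""
    return {name: record_at(records, timecode) for name, records in streams.items()}

# Hypothetical stored streams: (timestamp in seconds, payload), sorted by timestamp.
streams = {
    "zerosensor1/video": [(10.0, "frame-a"), (10.5, "frame-b"), (11.0, "frame-c")],
    "zerosensor1/data": [(9.8, {"temperature": 21.5}), (10.8, {"temperature": 21.6})],
}

# Scrubbing the DVR control to 10.6 s picks the newest record per stream
# that is not later than that timecode.
view = frame_for_all_streams(streams, 10.6)
print(view)
```

Because every stream is resolved against the same timecode, whatever is on screen for the video and data panels refers to the same moment, which is what makes correlating events straightforward.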

rtaiView is a useful tool for checking that the rt-ai Edge design is operating correctly and that the data stored is useful. In these examples, I have set DeepLabv3 to color map recognized objects. However, this is not the desired mode as I just want to store images that have people detected in them and then have the images only contain the people. The ultimate goal is to use these image sequences along with other sensor data to detect anomalous behavior and also to predict actions so that the rt-ai Edge enabled sentient space can be proactive in taking actions.

Scaling dynamic sentient spaces to multiple locations

One of the fundamental concepts of the rt-xr and rt-ai Edge projects is that it should be possible to experience a remote sentient space in a telepresent way. The diagram above shows the idea. The main sentient space houses a ManifoldNexus instance that supplies service discovery, subscription and message passing functions to all of the other components. Not shown is the rt-ai Edge component that deals with real-time intelligent processing, both reactive and proactive, of real-world sensor data and controls. However, rt-ai Edge interconnects with ManifoldNexus, making data and control flows available in the Manifold world.

Co-located with ManifoldNexus are the various servers that implement the visualization part of the sentient space. The SpaceServer allows occupants of the space to download a space definition file that is used to construct a model of the space. For VR users, this is a virtual model of the space that can be used remotely. For AR and MR users, only augmentations and interaction elements are instantiated so that the real space can be seen normally. The SpaceServer also houses downloadable asset bundles that contain augmentations that occupants have placed around the space. This is why it is referred to as a dynamic sentient space – as an occupant either physically or virtually enters the space, the relevant space model and augmentations are downloaded. Any changes that occupants make get merged back to the space definition and model repository to ensure that all occupants are synced with the space correctly. The SharingServer provides real-time transfer of pose and audio data. The Home Automation server provides a way for the space model to be linked with networked controls that physically exist in the space.

When everything is on a single LAN, things just work. New occupants of a space auto-discover sentient spaces available on that LAN and, via a GUI on the generic viewer app, can select the appropriate space. Normally there would be just one space but the system allows for multiple spaces on a single LAN if required. The issue then is how to connect VR users at remote locations. As shown in the diagram, ManifoldNexus has the ability to use secure tunnels between regions. This does require that one of the gateway routers has a port forwarding entry configured but otherwise requires no configuration other than security. There can be several remote spaces if necessary and a tunnel can support more than one sentient space. Once the Manifold infrastructure is established, integration is total in that auto-discovery and message switching behave for remote occupants in exactly the same way as for local occupants. What is also nice is that multicast services can be replicated for remote users in the remote LAN so data never has to be sent more than once on the tunnel itself. This optimization is implemented automatically within ManifoldNexus.
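The "sent only once on the tunnel" idea can be pictured with a toy model. This Python sketch is not the ManifoldNexus API: the ToyNexus class, its method names and the topic string are invented for illustration, and a real tunnel would carry the message over a secure network link rather than a direct method call.

```python
from collections import defaultdict

class ToyNexus:
    """Toy stand-in for a message switch with local subscribers and tunnel peers."""

    def __init__(self, name):
        self.name = name
        self.subscribers = defaultdict(list)  # topic -> local callbacks
        self.peers = []                       # nexus instances reached via a tunnel
        self.tunnel_sends = 0                 # messages pushed across tunnels

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def connect(self, peer):
        self.peers.append(peer)

    def deliver_local(self, topic, message):
        # Replicate the message to every subscriber on this LAN.
        for callback in self.subscribers.get(topic, []):
            callback(message)

    def publish(self, topic, message):
        self.deliver_local(topic, message)
        for peer in self.peers:
            # One copy per tunnel, and only if the far side has any subscriber.
            if peer.subscribers.get(topic):
                self.tunnel_sends += 1
                peer.deliver_local(topic, message)

main = ToyNexus("main")
remote = ToyNexus("remote")
main.connect(remote)

received = []
for viewer in ("vr1", "vr2", "vr3"):
    remote.subscribe(
        "zerosensor1/video",
        lambda message, viewer=viewer: received.append((viewer, message)),
    )

main.publish("zerosensor1/video", "frame-1")
print(main.tunnel_sends, len(received))
```

In this toy, three remote viewers receive the frame but it crosses the tunnel a single time, with the fan-out happening on the remote side.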

Dynamic sentient spaces (where a standard viewer is customized for each space by the servers) are now basically working on the five platforms (Windows desktop, macOS, Windows Mixed Reality, Android and iOS). Persistent ad-hoc augmentations using downloadable assets are the next step in this process. I am probably going to start with the virtual sticky note – this is where an occupant can leave a persistent message for other occupants. This requires a lot of the general functionality of persistent dynamic augmentations and is actually kind of useful for a change!

Accessing the iOS WiFi IP address within a Unity app

At the moment, Manifold requires that a client supplies an appropriate IP address (although I might change this in the future). Mostly it is pretty easy to do but the .NET way of doing things (using Dns.GetHostEntry()) didn’t seem to pick up the WiFi IP address on an iPad. After wasting a lot of time, I decided to go back to basics and create a native plugin.

The basis of the code comes from here – it just needed the right wrapping to get it to work with Unity. I will be the first to admit that I know nothing about native iOS coding but, following the Bonjour example here, the code below seemed to work just fine when placed in the Assets/Plugins/iOS directory of the project.

IPAddress.h:

#import <Foundation/Foundation.h>

@interface IPAddressDelegate : NSObject

- (NSString *)getAddress;
@end

IPAddress.m:

#include <ifaddrs.h>
#include <arpa/inet.h>
#include <stdlib.h>
#include <string.h>

#import "IPAddress.h"
@implementation IPAddressDelegate

- (id)init
{
    self = [super init];
    return self;
}

- (NSString *)getAddress {
    NSString *address = @"error";
    struct ifaddrs *interfaces = NULL;
    struct ifaddrs *temp_addr = NULL;
    int success = 0;
    success = getifaddrs(&interfaces);
    if (success == 0) {
        temp_addr = interfaces;
        while(temp_addr != NULL) {
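            // ifa_addr can be NULL for some interfaces, so check before dereferencing
            if(temp_addr->ifa_addr != NULL && temp_addr->ifa_addr->sa_family == AF_INET) {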
                if([[NSString stringWithUTF8String:temp_addr->ifa_name] isEqualToString:@"en0"]) {
                    address = [NSString stringWithUTF8String:inet_ntoa(((struct sockaddr_in *)temp_addr->ifa_addr)->sin_addr)];
                }
            }
            temp_addr = temp_addr->ifa_next;
        }
    }
    // Free memory
    freeifaddrs(interfaces);
    return address;
}
@end

static IPAddressDelegate* delegateObject = nil;

char* MakeStringCopy (const char* string)
{
    if (string == NULL)
        return NULL;
    
    char* res = (char*)malloc(strlen(string) + 1);
    strcpy(res, string);
    return res;
}

const char * getLocalWifiIpAddress()
{
    if (delegateObject == nil)
        delegateObject = [[IPAddressDelegate alloc] init];
    
    return MakeStringCopy([[delegateObject getAddress] UTF8String]);
}

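Using the plugin is pretty straightforward. Just add this declaration to a C# class (DllImport lives in the System.Runtime.InteropServices namespace, so the file needs a using directive for it):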

	[DllImport ("__Internal")]
	private static extern string getLocalWifiIpAddress();

Then call getLocalWifiIpAddress() to get the dotted address string. If no IPv4 address is found on en0, the plugin returns the string "error".

rt-xr sentient space visualization now on iOS!

I have to admit, I am in a state of shock right now. For some reason today I decided to try to get the rt-xr Viewer software working on iOS. After all, it worked fine on Windows desktop, UWP (Windows MR), macOS and Android so why not? However, I expected endless trouble with the Manifold library but, as it turned out, getting it to work on iOS was trivial. I guess Unity and .NET magic came together so I didn’t have to do too much work once again. In fact, the hardest part was working out how to sort out microphone permission and that wasn’t too hard – this thread certainly helped with that. Avatar pose sharing, audio sharing, proxy objects, video and sensor feeds all work perfectly.

The nice thing now is that most (if not all) of the further development is intrinsically multi-platform.

rt-xr: VR, MR and AR visualization for augmented sentient spaces

It was becoming pretty clear that the Unity/XR parts of rt-ai Edge were taking on a life of their own so they have now been broken out into a new project called rt-xr. rt-ai Edge is an always on, real-time and long-lived stream processing system whereas rt-xr is ideal for ad-hoc networking where components come and go as required. In particular, the XR headsets of real and virtual occupants of a sentient space can come and go on a random basis – the sentient space is persistent and new users just get updated with the current state upon entering the space. In terms of sentient space implementation, rt-ai Edge provides the underlying sensing, intelligent processing and reaction processing (somewhat like an autonomic system) whereas rt-xr provides a more user-orientated system for visualizing and interacting with the space at the conscious level (to keep the analogy going) along with the necessary servers for sharing state, providing object repositories etc.

Functions include:

  • Visualization: A model of the sentient space is used to derive a virtual world for VR headset-wearing occupants of the space and augmentations for MR and AR headset-wearing occupants of the space. The structural model can be augmented with various assets, including proxy objects that provide a UI for remote services.
  • Interaction: MR/AR occupants physically within a space can interact with objects in the space while VR users can interact with virtual analogs within the same space for a telepresent experience.
  • Sharing: VR users in a space see avatars representing MR/AR users physically within the space while MR/AR users see avatars representing VR users within the space. Spatially located audio enhances the reality of the shared experience, allowing users to converse in a realistic manner.

rt-xr is based on the Manifold networking surface, which greatly simplifies dynamic, ad-hoc architectures. It is supported by efficient multicast and point-to-point communication services and easy service discovery.

A key component of rt-xr is the rt-xr SpaceServer. This provides a repository for all augmentation objects and models within a sentient space. The root object is the space definition that models the physical space. This allows a virtual model to be generated for VR users while also locating augmentation objects for all users. When a user first enters a space, either physically or virtually, they receive the space definition file from the rt-xr SpaceServer. Depending on their mode, this is used to generate all the objects and models necessary for the experience. The space definition file can contain references to standard objects in the rt-xr viewer apps (such as video panels) or else references to proxy objects that can be downloaded from the rt-xr SpaceServer or any other server used as a proxy object repository.

The rt-xr SharingServer is responsible for distributing camera transforms and other user state data between occupants of a sentient space allowing animation of avatars representing virtual users in a space. It also provides support for the spatially located audio system.

The rt-xr Viewers are Unity apps that provide the necessary functionality to interact with the rest of the rt-xr system:

  • rt-xr Viewer3D is a Windows desktop viewer.
  • rt-xr ViewerMR is a UWP viewer for Windows Mixed Reality devices.
  • rt-xr ViewerAndroid is a viewer for Android devices.

Proxy objects: Unity assets that are UI extensions of remote servers

For some reason I often end up back at the analog clock for trying out new ideas. I guess it is because it is pretty trivial to operate a clock – just supply three angles. In this case, the clock is a proxy object which is in many ways just a simple extension of the system that animates the avatars for other occupants of a sentient space. A proxy object is a conventional Unity GameObject hierarchy that has certain specially named child nodes. By itself, there’s nothing special about the Unity asset part of a proxy object – it could be an asset included in the app or an asset downloaded from a server using Unity’s asset bundle system. Either way, these specially named nodes can be linked to external servers. In this case, the SharingServer generates an analog clock stream that animates the clock hands. The clock definition is contained in the space definition file that instantiates all the other parts of the scene.

In principle, interaction (i.e. sending stuff back to the remote server) can be added by using specially named nodes to attach scripts that are hard-coded in the app. I haven’t tried this yet but see no reason why it wouldn’t work. The key point is that proxy objects leverage standard scripts in the app as opposed to customized scripts for every asset.

Right now, you can modify the local scale, local position, local orientation, color and text (if associated with a TextMesh) of any of the GameObjects in an asset’s hierarchy. This could easily be extended to other things including updating a texture with a new image. For example, a virtual fireplace could be created where the flames are animated by constantly varying the textures being displayed. The system is still simplistic however as there are no mechanisms for controlling transitions (such as lerping between positions or fading between textures) but this could certainly be added without too much difficulty.
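To make the missing transition step concrete, here is a small Python sketch of what lerping between two positions or two hand angles could compute. It is purely illustrative: nothing like this exists in the system yet, and the function names are made up for the example.

```python
def lerp(start, end, t):
    """Linear interpolation between two numbers for t in [0, 1]."""
    return start + (end - start) * t

def lerp_position(start, end, t):
    """Interpolate each component of an (x, y, z) position."""
    return tuple(lerp(a, b, t) for a, b in zip(start, end))

def lerp_angle(start, end, t):
    """Interpolate between two angles in degrees along the shortest arc."""
    delta = (end - start + 180.0) % 360.0 - 180.0
    return (start + delta * t) % 360.0

# Halfway between two positions.
print(lerp_position((0.0, 0.0, 0.0), (2.0, 4.0, 6.0), 0.5))

# Halfway from 350 to 10 degrees goes through 0 rather than back through 180.
print(lerp_angle(350.0, 10.0, 0.5))
```

The shortest-arc handling matters for something like a clock hand, where a naive interpolation from 350 to 10 degrees would sweep the hand almost a full turn backwards.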

Just for reference, the analog clock stream message looks like this:

{
    "type": "proxyobject",
    "updateList": [
        {
            "name": "PO_AnalogClock_Second",
            "orientation": {
                "x": 0,
                "y": 222,
                "z": 0
            },
            "orientationValid": true
        },
        {
            "name": "PO_AnalogClock_Minute",
            "orientation": {
                "x": 0,
                "y": 342,
                "z": 0
            },
            "orientationValid": true
        },
        {
            "name": "PO_AnalogClock_Hour",
            "orientation": {
                "x": 0,
                "y": 568,
                "z": 0
            },
            "orientationValid": true
        }
    ]
}

Here the y value encodes the relevant hand angle in degrees. The hour angle can be greater than 360 degrees as the system uses a 24 hour clock, but the hand ends up in the same position either way: 568 degrees is one full turn plus 208 degrees.
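For anyone wanting to generate a compatible stream, the sketch below builds a message of the same shape in Python. The angle formula is an assumption inferred from the sample values (6 degrees per second and per minute, 30 degrees per hour on a 24 hour count plus half a degree per minute, truncated to whole degrees); the actual SharingServer code may compute it differently. The function names are hypothetical.

```python
import json

HAND_NAMES = ("PO_AnalogClock_Second", "PO_AnalogClock_Minute", "PO_AnalogClock_Hour")

def clock_angles(hour, minute, second):
    """Return (second, minute, hour) hand angles in whole degrees (assumed formula)."""
    return second * 6, minute * 6, int(hour * 30 + minute * 0.5)

def clock_message(hour, minute, second):
    """Build a proxy object update message shaped like the analog clock stream."""
    return {
        "type": "proxyobject",
        "updateList": [
            {
                "name": name,
                "orientation": {"x": 0, "y": angle, "z": 0},
                "orientationValid": True,
            }
            for name, angle in zip(HAND_NAMES, clock_angles(hour, minute, second))
        ],
    }

# 18:57:37 gives the same angles as the sample message: 222, 342 and 568.
message = clock_message(18, 57, 37)
print(json.dumps(message, indent=4))

# Reducing the hour angle modulo 360 gives the position on a 12 hour face.
hour_y = message["updateList"][2]["orientation"]["y"]
print(hour_y % 360)
```

Under this assumed formula, the sample message corresponds to a time of 18:57:37.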

Sentient space sharing avatars with Windows desktop, Windows Mixed Reality and Android apps


One of the goals of the rt-ai Edge system is that users can use whatever device they have available to interact with it and extract value from it. Unity is a tremendous help given that Unity apps can be run on pretty much everything. The main task was integration with Manifold so that all apps can receive and interact with everything else in the system. Manifold currently supports Windows, UWP, Linux, Android and macOS. iOS is a notable absentee and will hopefully be added at some point in the future. However, I perceive Android support as more significant as it also leads to multiple MR headset support.

The screen shot above and video below show three instances of the rt-ai viewer apps running on Windows desktop, Windows Mixed Reality and Android interacting in a shared sentient space. Ok, so the avatars are rubbish (I call them Sad Robots) but that’s just a detail and can be improved later. The wall panels are receiving sensor and video data from ZeroSensors via an rt-ai Edge stream processing network while the light switch is operated via a home automation server and Insteon.

Sharing is mediated by a SharingServer that is part of Manifold. The SharingServer uses Manifold multicast and end to end services to implement scalable sharing while minimizing the load on each individual device. Ultimately, the SharingServer will also download the space definition file when the user enters a sentient space and also provide details of virtual objects that may have been placed in the space by other users. This allows a new user with a standard app to enter a space and quickly create a view of the sentient space consistent with existing users.

While this is all kind of fun, the more interesting thing is when this is combined with a HoloLens or similar MR headset. The MR headset user in a space would see any VR users in the space represented by their avatars. Likewise, VR users in a space would see avatars representing MR users in the space. The idea is to get as close to a telepresent experience for VR users as possible without very complex setups. It would be much nicer to use Holoportation but that would require every room in the space to have a very complex and expensive setup, which really isn’t the point. The idea is to make it very easy and low cost to implement an rt-ai Edge based sentient space.

Still lots to do of course. One big thing is audio. Another is representing interaction devices (pointers, motion controllers etc) to all users. Right now, each app just sends out the camera transform to the SharingServer which then distributes this to all other users. This will be extended to include PCM audio chunks and transforms for interaction devices so that everyone will be able to create a meaningful scene. Each user will receive the audio stream from every other user. The reason for this is that then each individual audio stream can be attached to the avatar for each user giving a spatialized sound effect using Unity capabilities (that’s the hope anyway). Another very important thing is that the apps work differently if they are running on VR type devices or AR/MR type devices. In the latter case, the walls and related objects are not drawn and just the colliders instantiated although virtual objects and avatars will be visible. Obviously AR/MR users want to see the real walls, light switches etc, not the virtual representations. However, they will still be able to interact in exactly the same way as a VR user.
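The current sharing flow, where each app sends its camera transform and the SharingServer passes it on to everyone else, can be sketched in a few lines of Python. This is a toy model only: the SharingRelay class, its methods and the transform layout are hypothetical, and the real SharingServer uses Manifold multicast and end to end services rather than in-memory lists.

```python
class SharingRelay:
    """Toy relay: forwards each user's camera transform to all other users."""

    def __init__(self):
        self.inboxes = {}  # user id -> list of (sender id, transform)

    def join(self, user_id):
        self.inboxes[user_id] = []

    def leave(self, user_id):
        self.inboxes.pop(user_id, None)

    def send_transform(self, sender_id, transform):
        # Everyone except the sender gets the update, tagged with the sender
        # so the receiver knows which avatar to move.
        for user_id, inbox in self.inboxes.items():
            if user_id != sender_id:
                inbox.append((sender_id, transform))

relay = SharingRelay()
for user in ("desktop", "mr-headset", "android"):
    relay.join(user)

# Hypothetical transform layout: position plus Euler orientation.
transform = {"position": (1.0, 1.6, 2.0), "orientation": (0.0, 90.0, 0.0)}
relay.send_transform("desktop", transform)
print(relay.inboxes)
```

Tagging each update with the sender is also what would let a per-user audio stream be attached to the right avatar once PCM audio chunks are added.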