Multi-platform interaction styles for rt-xr and Unity

The implementation of sticky notes in rt-xr opened a whole can of worms but really just forced the development of a set of capabilities that will be needed for the general case where occupants of a sentient space can download assets from anywhere, instantiate them in a space and then interact with them. In particular, the need to be able to create a sticky note, position it and add text to it when being used on the supported platforms (Windows and mac OS desktop, Android, iOS and Windows Mixed Reality) required a surprising amount of work. I decided to standardize on a three button mouse model and map the various interaction styles into mouse events. This means that the bulk of the code doesn’t need to care about the interaction style (mouse, motion controller, touch screen etc) as all the complexity is housed in one script.

The short video below shows this in operation on a Windows desktop.

It ended up running a bit fast but that was due to the video recorder setup – I can’t really do things that fast!

I am still just using opaque devices – where is my HoloLens 2 or Magic Leap?!!! However, things should map across pretty well. Note how the current objects are glued to the virtual walls. Using MR devices, the spatial maps would be used for raycasting so that the objects would be glued to the real walls. I do need to add a mode where you can pull things off walls and position them arbitrarily in space but that’s just a TODO right now.

What doesn’t yet work is sharing of these actions. When an object is moved, that move should be visible to all occupants of the space. Likewise, when a new object is created or text updated on a sticky note, everyone should see that. That’s the next step, followed by making all of this persistent.

Anyway, here is how interaction works on the various platforms.

Windows and mac OS desktop

For Windows, the assumption is that three button (middle wheel) mouse is used. The middle button is used to grab and position objects. The right mouse button opens up the menu for that object. The left button is used for selection and resizing. On the Mac, which doesn’t have a middle button, the middle button is simulated by holding down the Command key which maps the left button into the middle button.

Navigation is via the SpaceMouse on both platforms.

Windows Mixed Reality

The motion controllers have quite a few controls and buttons available. I am using the Grab button to grab and position objects. The trigger is used for selection and resizing while the menu button is used to bring up the object menu. Pointing at the sticky note and pressing the trigger causes the virtual keyboard to appear.

Navigation uses the standard joystick-based teleport system.

Android and iOS

My solution here is a little ugly. Basically, the number of fingers used for a tap and/or hold dictates which mouse button the action maps to. A single touch means the left mouse button, two touches means the right mouse button while three touches means the middle button. It works but it is pretty amusing trying to get three simultaneous touches on an object to initiate a grab on a small screen device like a phone!

Navigation is via single or dual touch. Single touch and slide moves in x and y directions. Dual touch and slide rotates around the y axis. Since touches are used for other things, navigation touches need to be made away from objects or else they will be misinterpreted. Probably there is a better way of doing this. However, in the longer term, see-through mode using something like ARCore or ARKit will eliminate the navigation issue which is only a problem in VR (opaque) mode. I assume the physical occupants of a space will use see-through mode with only remote occupants using VR mode.

I haven’t been using ARCore or ARKit yet, mainly because they haven’t seemed good enough to create a spatial map that is useful for rt-xr. This is changing (ARKit 2 for example) but the question is whether it can cope with multiple rooms. For example, objects behind a real wall should not be visible – they need to be occluded by the spatial map. The HoloLens can do this however and is the best available option right now for multi-room MR with persistence.


Scaling dynamic sentient spaces to multiple locations

One of the fundamental concepts of the rt-xr and rt-ai Edge projects is that it should be possible to experience a remote sentient space in a telepresent way. The diagram above shows the idea. The main sentient space houses a ManifoldNexus instance that supplies service discovery, subscription and message passing functions to all of the other components. Not shown is the rt-ai Edge component that deals with real-time intelligent processing, both reactive and proactive, of real-world sensor data and controls. However, rt-ai Edge interconnects with ManifoldNexus, making data and control flows available in the Manifold world.

Co-located with ManifoldNexus are the various servers that implement the visualization part of the sentient space. The SpaceServer allows occupants of the space to download a space definition file that is used to construct a model of the space. For VR users, this is a virtual model of the space that can be used remotely. For AR and MR users, only augmentations and interaction elements are instantiated so that the real space can be seen normally. The SpaceServer also houses downloadable asset bundles that contain augmentations that occupants have placed around the space. This is why it is referred to as a dynamic sentient space – as an occupant either physically or virtually enters the space, the relevant space model and augmentations are downloaded. Any changes that occupants make get merged back to the space definition and model repository to ensure that all occupants are synced with the space correctly. The SharingServer provides real-time transfer of pose and audio data. The Home Automation server provides a way for the space model to be linked with networked controls that physically exist in the space.

When everything is on a single LAN, things just work. New occupants of a space auto-discover sentient spaces available on that LAN and, via a GUI on the generic viewer app, can select the appropriate space. Normally there would be just one space but the system allows for multiple spaces on a single LAN if required. The issue then is how to connect VR users at remote locations. As shown in the diagram, ManifoldNexus has to ability to use secure tunnels between regions. This does require that one of the gateway routers has a port forwarding entry configured but otherwise requires no configuration other than security. There can be several remote spaces if necessary and a tunnel can support more than one sentient space. Once the Manifold infrastructure is established, integration is total in that auto-discovery and message switching all behave for remote occupants in exactly the same way as local occupants. What is also nice is that multicast services can be replicated for remote users in the remote LAN so data never has to be sent more than once on the tunnel itself. This optimization is implemented automatically within ManifoldNexus.

Dynamic sentient spaces (where a standard viewer is customized for each space by the servers) is now basically working on the five platforms (Windows desktop, macOS, Windows Mixed Reality, Android and iOS). Persistent ad-hoc augmentations using downloadable assets is the next step in this process. Probably I am going to start with the virtual sticky note – this is where an occupant can leave a persistent message for other occupants. This requires a lot of the general functionality of persistent dynamic augmentations and is actually kind of useful for change!

Accessing the iOS WiFi IP address within a Unity app

At the moment, Manifold requires that a client supplies an appropriate IP address (although I might change this in the future). Mostly it is pretty easy to do but the .NET way of doing things (using Dns.GetHostEntry()) didn’t seem to pick up the WiFi IP address on an iPad. After wasting a lot of time, I decided to go back to basics and create a native plugin.

The basis of the code comes from here – it just needed the right wrapping to get it to work with Unity. I will be the first to admit that I know nothing about native iOS coding but, following the Bonjour example here, the code below  seemed to work just fine when placed in the Assets/Plugins/iOS directory of the project.


#import <Foundation/Foundation.h>

@interface IPAddressDelegate : NSObject

- (NSString *)getAddress;


#include <ifaddrs.h>
#include <arpa/inet.h>

#import "IPAddress.h"
@implementation IPAddressDelegate

- (id)init
    self = [super init];
    return self;

- (NSString *)getAddress {
    NSString *address = @"error";
    struct ifaddrs *interfaces = NULL;
    struct ifaddrs *temp_addr = NULL;
    int success = 0;
    success = getifaddrs(&interfaces);
    if (success == 0) {
        temp_addr = interfaces;
        while(temp_addr != NULL) {
            if(temp_addr->ifa_addr->sa_family == AF_INET) {
                if([[NSString stringWithUTF8String:temp_addr->ifa_name] isEqualToString:@"en0"]) {
                    address = [NSString stringWithUTF8String:inet_ntoa(((struct sockaddr_in *)temp_addr->ifa_addr)->sin_addr)];
            temp_addr = temp_addr->ifa_next;
    // Free memory
    return address;

static IPAddressDelegate* delegateObject = nil;

char* MakeStringCopy (const char* string)
    if (string == NULL)
        return NULL;
    char* res = (char*)malloc(strlen(string) + 1);
    strcpy(res, string);
    return res;

const char * getLocalWifiIpAddress()
    if (delegateObject == nil)
        delegateObject = [[IPAddressDelegate alloc] init];
    return MakeStringCopy([[delegateObject getAddress] UTF8String]);

To use the plugin is pretty straightforward. Just add this declaration to a C# class:

	[DllImport ("__Internal")]
	private static extern string getLocalWifiIpAddress();

Then call getLocalWiFiIpAddress() to get the dotted address string.

rt-xr sentient space visualization now on iOS!

I have to admit, I am in a state of shock right now. For some reason today I decided to try to get the rt-xr Viewer software working on iOS. After all, it worked fine on Windows desktop, UWP (Windows MR), macOS and Android so why not? However, I expected endless trouble with the Manifold library but, as it turned out, getting it to work on iOS was trivial. I guess Unity and .NET magic came together so I didn’t have to do too much work once again. In fact, the hardest part was working out how to sort out microphone permission and that wasn’t too hard – this thread certainly helped with that. Avatar pose sharing, audio sharing, proxy objects, video and sensor feeds all work perfectly.

The nice thing now is that most (if not all) of the further development is intrinsically multi-platform.