An rt-xr SpaceObjects tour de force

rt-xr SpaceObjects are now working very nicely. It’s easy to create, configure and delete SpaceObjects as needed using the menu switch, which has been placed just above the light switch in my office model above.

The video below shows all of this in operation.

The typical process is to instantiate an object, place and size it, and then attach it to a Manifold stream if it is a Proxy Object. Persistence, sharing and collaboration work for all relevant SpaceObjects across the supported platforms (Windows and macOS desktop, Windows MR, Android and iOS).
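To make that process concrete, here is a rough sketch of the flow as a Unity script. SpaceObjectSpawner, ProxyObject and ManifoldClient are illustrative stand-ins of my own invention, not the actual rt-xr or Manifold classes:

    using UnityEngine;

    // Illustrative sketch of the SpaceObject lifecycle described above.
    public class SpaceObjectSpawner : MonoBehaviour
    {
        public GameObject spaceObjectPrefab;   // asset chosen from the object menu

        public void SpawnProxyObject(Vector3 position, Vector3 scale, string streamName)
        {
            // Instantiate the object where the user pointed
            GameObject obj = Instantiate(spaceObjectPrefab, position, Quaternion.identity);

            // Place and size it
            obj.transform.localScale = scale;

            // If it is a Proxy Object, attach it to a Manifold stream so it
            // starts receiving live data (video, sensor values etc)
            var proxy = obj.GetComponent<ProxyObject>();   // hypothetical component
            if (proxy != null)
                proxy.AttachStream(ManifoldClient.Instance.Subscribe(streamName));   // hypothetical API
        }
    }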

This is a good place to leave rt-xr for the moment while I wait for some sort of AR headset to arrive so that I can support local users of an rt-xr enhanced sentient space. Unfortunately, Magic Leap won’t deliver to my zip code (sigh), so that’s that for the moment. There are lots of teasers about the HoloLens 2 right now, and that might be the best way to go…eventually.

Now the focus moves back to rt-ai Edge. While this is working pretty well, it needs a few bugs fixed and some production modes added (such as auto-starting SPNs when server nodes are started). Then begins the process of data collection for machine learning. ZeroSensors will collect data from each monitored room, and this will be saved by ManifoldStore for later use. The idea is to classify normal and abnormal situations and also to respond proactively to the needs of occupants of the sentient space.

Multi-platform interaction styles for rt-xr and Unity

The implementation of sticky notes in rt-xr opened a whole can of worms, but really it just forced the development of a set of capabilities that will be needed for the general case, where occupants of a sentient space can download assets from anywhere, instantiate them in a space and then interact with them. In particular, being able to create a sticky note, position it and add text to it on all of the supported platforms (Windows and macOS desktop, Android, iOS and Windows Mixed Reality) required a surprising amount of work. I decided to standardize on a three-button mouse model and map the various interaction styles into mouse events. This means that the bulk of the code doesn’t need to care about the interaction style (mouse, motion controller, touch screen etc), as all the complexity is housed in one script.
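A rough sketch of what I mean (the names here are illustrative rather than the actual rt-xr script):

    using System;
    using UnityEngine;

    // All interaction styles are funneled into virtual mouse button events,
    // so downstream object scripts never need to know the input device.
    // Values match Unity's mouse button indices (0 = left, 1 = right, 2 = middle).
    public enum VirtualMouseButton { Left, Right, Middle }

    public class InputMapper : MonoBehaviour
    {
        // Object scripts subscribe to this instead of polling platform input
        public event Action<VirtualMouseButton, Vector3> ButtonDown;

        void Update()
        {
            // Desktop: pass real mouse buttons straight through. The touch and
            // motion controller mappings described below raise the same event,
            // so the rest of the code doesn't care which device was used.
            for (int b = 0; b < 3; b++)
            {
                if (Input.GetMouseButtonDown(b))
                    ButtonDown?.Invoke((VirtualMouseButton)b, Input.mousePosition);
            }
        }
    }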

The short video below shows this in operation on a Windows desktop.

The video ended up running a bit fast, but that was due to the video recorder setup – I can’t really do things that fast!

I am still just using opaque devices – where is my HoloLens 2 or Magic Leap?!!! However, things should map across pretty well. Note how the current objects are glued to the virtual walls. On MR devices, the spatial maps would be used for raycasting so that the objects would be glued to the real walls instead. I do need to add a mode where you can pull things off walls and position them arbitrarily in space, but that’s just a TODO right now.

What doesn’t yet work is sharing of these actions. When an object is moved, that move should be visible to all occupants of the space. Likewise, when a new object is created or the text on a sticky note is updated, everyone should see that. That’s the next step, followed by making all of this persistent.

Anyway, here is how interaction works on the various platforms.

Windows and macOS desktop

For Windows, the assumption is that a three-button (middle wheel) mouse is used. The middle button is used to grab and position objects, the right button opens up the menu for that object, and the left button is used for selection and resizing. On the Mac, which doesn’t have a middle button, the middle button is simulated by holding down the Command key, which maps the left button to the middle button.
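Something along these lines handles the Command key simulation – a sketch, not the actual code:

    using UnityEngine;

    // Sketch of the Mac middle button simulation: Command + left button is
    // reported as a middle button press.
    public static class MacMouse
    {
        static bool CommandHeld()
        {
            return Input.GetKey(KeyCode.LeftCommand) || Input.GetKey(KeyCode.RightCommand);
        }

        public static bool GetMiddleButtonDown()
        {
            // A real middle button, or Command + left button on the Mac
            return Input.GetMouseButtonDown(2) || (CommandHeld() && Input.GetMouseButtonDown(0));
        }

        public static bool GetLeftButtonDown()
        {
            // Suppress the left button while Command is held so a simulated
            // middle click isn't also seen as a selection click
            return !CommandHeld() && Input.GetMouseButtonDown(0);
        }
    }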

Navigation is via the SpaceMouse on both platforms.

Windows Mixed Reality

The motion controllers have quite a few controls and buttons available. I am using the Grab button to grab and position objects. The trigger is used for selection and resizing while the menu button is used to bring up the object menu. Pointing at the sticky note and pressing the trigger causes the virtual keyboard to appear.
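Unity’s InteractionManager events make this mapping straightforward. A sketch of the idea, with the actual event raising elided:

    using UnityEngine;
    using UnityEngine.XR.WSA.Input;

    // Sketch: map motion controller presses onto the three-button mouse
    // model. Grasp = grab/position, trigger (Select) = selection/resizing,
    // Menu = object menu.
    public class WmrInputMapper : MonoBehaviour
    {
        void OnEnable()  { InteractionManager.InteractionSourcePressed += OnSourcePressed; }
        void OnDisable() { InteractionManager.InteractionSourcePressed -= OnSourcePressed; }

        void OnSourcePressed(InteractionSourcePressedEventArgs args)
        {
            switch (args.pressType)
            {
                case InteractionSourcePressType.Grasp:  /* raise middle button event */ break;
                case InteractionSourcePressType.Select: /* raise left button event   */ break;
                case InteractionSourcePressType.Menu:   /* raise right button event  */ break;
            }
        }
    }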

Navigation uses the standard joystick-based teleport system.

Android and iOS

My solution here is a little ugly. Basically, the number of fingers used for a tap and/or hold dictates which mouse button the action maps to: a single touch means the left mouse button, two touches mean the right mouse button, and three touches mean the middle button. It works, but it is pretty amusing trying to get three simultaneous touches on an object to initiate a grab on a small-screen device like a phone!
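A sketch of the mapping logic (raising the virtual mouse button events is elided):

    using UnityEngine;

    // Sketch: 1 touch = left button, 2 touches = right, 3 touches = middle.
    public class TouchMouseMapper : MonoBehaviour
    {
        void Update()
        {
            if (Input.touchCount == 0)
                return;

            // Only act when the most recent finger has just gone down
            Touch latest = Input.GetTouch(Input.touchCount - 1);
            if (latest.phase != TouchPhase.Began)
                return;

            switch (Input.touchCount)
            {
                case 1: /* raise left button event at latest.position   */ break;
                case 2: /* raise right button event at latest.position  */ break;
                case 3: /* raise middle button event at latest.position */ break;
            }
        }
    }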

Navigation is via single or dual touch. A single touch and slide moves in the x and y directions, while a dual touch and slide rotates around the y axis. Since touches are also used for object interaction, navigation touches need to be made away from objects or they will be misinterpreted; there is probably a better way of doing this. In the longer term, however, see-through mode using something like ARCore or ARKit will eliminate the navigation issue, which is only a problem in VR (opaque) mode. I assume that the physical occupants of a space will use see-through mode, with only remote occupants using VR mode.
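The navigation part looks something like this sketch (the speed constants are arbitrary, and the real code also has to ignore touches that land on objects, as noted above):

    using UnityEngine;

    // Sketch: a one-finger drag translates the rig in x and y, a two-finger
    // drag rotates it around the y axis.
    public class TouchNavigation : MonoBehaviour
    {
        public float moveSpeed = 0.005f;
        public float rotateSpeed = 0.1f;

        void Update()
        {
            if (Input.touchCount == 1)
            {
                Vector2 delta = Input.GetTouch(0).deltaPosition;
                transform.Translate(delta.x * moveSpeed, delta.y * moveSpeed, 0f);
            }
            else if (Input.touchCount == 2)
            {
                // Average the horizontal movement of both fingers
                float dx = (Input.GetTouch(0).deltaPosition.x +
                            Input.GetTouch(1).deltaPosition.x) * 0.5f;
                transform.Rotate(0f, dx * rotateSpeed, 0f);
            }
        }
    }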

I haven’t been using ARCore or ARKit yet, mainly because they haven’t seemed good enough to create a spatial map that is useful for rt-xr. This is changing (ARKit 2, for example), but the question is whether they can cope with multiple rooms. For example, objects behind a real wall should not be visible – they need to be occluded by the spatial map. The HoloLens can do this, however, and it is the best available option right now for multi-room MR with persistence.

rt-xr sentient space visualization now on iOS!

I have to admit, I am in a state of shock right now. For some reason today I decided to try to get the rt-xr Viewer software working on iOS. After all, it worked fine on Windows desktop, UWP (Windows MR), macOS and Android, so why not? I expected endless trouble with the Manifold library but, as it turned out, getting it to work on iOS was trivial. I guess Unity and .NET magic came together so that, once again, I didn’t have to do too much work. In fact, the hardest part was working out how to sort out microphone permission, and that wasn’t too hard – this thread certainly helped with that. Avatar pose sharing, audio sharing, proxy objects, and video and sensor feeds all work perfectly.
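For anyone hitting the same issue, the relevant Unity call is the coroutine-based permission request shown below. On iOS you also need a microphone usage description set in the Player settings so that it ends up in the Info.plist:

    using System.Collections;
    using UnityEngine;

    // Minimal sketch of requesting microphone access at startup.
    public class MicPermission : MonoBehaviour
    {
        IEnumerator Start()
        {
            yield return Application.RequestUserAuthorization(UserAuthorization.Microphone);

            if (Application.HasUserAuthorization(UserAuthorization.Microphone))
                Debug.Log("Microphone access granted");
            else
                Debug.Log("Microphone access denied - audio sharing disabled");
        }
    }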

The nice thing now is that most (if not all) of the further development is intrinsically multi-platform.

Sentient space sharing avatars with Windows desktop, Windows Mixed Reality and Android apps

One of the goals of the rt-ai Edge system is that users can use whatever devices they have available to interact with it and extract value from it. Unity is a tremendous help here, given that Unity apps can run on pretty much everything. The main task was integration with Manifold so that all the apps can receive and interact with everything else in the system. Manifold currently supports Windows, UWP, Linux, Android and macOS. iOS is a notable absentee and will hopefully be added at some point in the future. However, I perceive Android support as more significant, as it also leads to support for multiple MR headsets.

The screenshot above and the video below show three instances of the rt-ai viewer apps running on Windows desktop, Windows Mixed Reality and Android, interacting in a shared sentient space. OK, so the avatars are rubbish (I call them Sad Robots), but that’s just a detail and can be improved later. The wall panels are receiving sensor and video data from ZeroSensors via an rt-ai Edge stream processing network, while the light switch is operated via a home automation server and Insteon.

Sharing is mediated by a SharingServer that is part of Manifold. The SharingServer uses Manifold multicast and end-to-end services to implement scalable sharing while minimizing the load on each individual device. Ultimately, the SharingServer will also download the space definition file when the user enters a sentient space and provide details of virtual objects that may have been placed in the space by other users. This allows a new user with a standard app to enter a space and quickly create a view of the sentient space consistent with what existing users see.
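As an illustration of the kind of thing being shared, here is a sketch of a pose update message. The actual Manifold message format isn’t shown in this post, so the field names here are mine:

    using UnityEngine;

    // Illustrative pose update multicast via the SharingServer so that
    // everyone can position this user's avatar.
    [System.Serializable]
    public class PoseUpdate
    {
        public string userId;        // identifies whose avatar to move
        public Vector3 position;     // camera position in space coordinates
        public Quaternion rotation;  // camera orientation

        public static PoseUpdate FromCamera(string userId, Transform cameraTransform)
        {
            return new PoseUpdate
            {
                userId = userId,
                position = cameraTransform.position,
                rotation = cameraTransform.rotation
            };
        }
    }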

While this is all kind of fun, the more interesting thing is when this is combined with a HoloLens or similar MR headset. The MR headset user in a space would see any VR users in the space represented by their avatars. Likewise, VR users in the space would see avatars representing the MR users. The idea is to get as close to a telepresent experience for VR users as possible without very complex setups. It would be much nicer to use Holoportation, but that would require every room in the space to have a very complex and expensive setup, which really isn’t the point. The idea is to make it very easy and low-cost to implement an rt-ai Edge based sentient space.

Still lots to do, of course. One big thing is audio. Another is representing interaction devices (pointers, motion controllers etc) to all users. Right now, each app just sends out the camera transform to the SharingServer, which then distributes it to all other users. This will be extended to include PCM audio chunks and transforms for interaction devices so that everyone will be able to create a meaningful scene. Each user will receive the audio stream from every other user. The reason for this is that each individual audio stream can then be attached to the avatar for that user, giving a spatialized sound effect using Unity capabilities (that’s the hope anyway). Another very important thing is that the apps work differently depending on whether they are running on VR type devices or AR/MR type devices. In the latter case, the walls and related objects are not drawn; only their colliders are instantiated, although virtual objects and avatars are still visible. Obviously AR/MR users want to see the real walls, light switches etc, not the virtual representations. However, they will still be able to interact in exactly the same way as a VR user.
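As a sketch of the spatialized sound idea, something like this would attach a streamed voice to an avatar and let Unity handle the 3D positioning. IncomingAudioQueue is a hypothetical buffer fed with PCM chunks arriving from the SharingServer:

    using UnityEngine;

    // Sketch: play a remote user's audio stream from their avatar's position.
    public class AvatarAudio : MonoBehaviour
    {
        public int sampleRate = 16000;

        void Start()
        {
            AudioSource source = gameObject.AddComponent<AudioSource>();
            source.spatialBlend = 1.0f;   // fully 3D, positioned at the avatar

            // A streaming clip whose samples are pulled on demand as Unity
            // needs them
            source.clip = AudioClip.Create("remoteVoice", sampleRate, 1,
                                           sampleRate, true, OnAudioRead);
            source.loop = true;
            source.Play();
        }

        void OnAudioRead(float[] data)
        {
            // Hypothetical: copy decoded PCM from the network queue,
            // zero-filling if nothing has arrived yet
            IncomingAudioQueue.Fill(data);
        }
    }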

Using Unity and Manifold with Android devices to visualize sentient spaces

This may not look impressive to you (or to my wife, as it turns out), but it has a lot of promise for the future. Following on from 3DView, there’s now an Android version called (shockingly) AndroidView, which is essentially the same thing running on an Android phone. The screen capture above shows the current basic setup displaying sensor data. Since Unity is basically portable across platforms, the main challenge was integrating with Manifold to get the sensor data being generated by ZeroSensors in an rt-ai Edge stream processing network.

I did actually have a Java implementation of a Manifold client from previous work – the challenge was integrating it with Unity. This meant building the client into an aar file and then using Unity’s AndroidJavaObject to pass data across the interface. Now that I understand how that works, it really is quite powerful, and I was able to do everything needed for this application.
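A minimal sketch of the pattern – the Java class and method names here are placeholders, not the real Manifold client’s:

    using UnityEngine;

    // Sketch: drive a Java Manifold client (packaged in the .aar) from Unity.
    public class ManifoldAndroidBridge
    {
        private AndroidJavaObject client;

        public void Connect(string serverAddress)
        {
            // Instantiate the Java client class across the JNI boundary
            client = new AndroidJavaObject("com.example.manifold.ManifoldClient");
            client.Call("connect", serverAddress);
        }

        public string ReadSensorJson()
        {
            // Pull the next sensor message across as a string
            return client.Call<string>("nextMessage");
        }
    }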

There are going to be more versions of the viewer. For example, in the works is rtXRView, which is designed to run on Windows MR headsets. The way I like to structure this is to have separate Unity projects for each target and then move common stuff between them via Unity’s package system. With a bit of discipline, this works quite well. The individual projects can then have any special libraries (such as MixedRealityToolkit), special cameras, input processing etc without getting too cute.

Once the basic platform work is done, it’s back to sorting out the modeling of the sentient space and the positioning of virtual objects within that space. Multi-user collaboration and persistent sentient space configuration are going to require a new Manifold app to be called SpaceServer. Manifold is ideal for coordinating real-time changes using its natural multicast capability. For Unity reasons, I may integrate a webserver into SpaceServer so that assets can be dynamically loaded using standard Unity functions. This supports the idea that a new user walking into a sentient space is able to download all necessary assets and configuration using a standard app. Still, that’s all a bit in the future.
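The “standard Unity functions” part would look something like this sketch, with the URL pointing at SpaceServer’s embedded webserver (the URL and asset name are placeholders):

    using System.Collections;
    using UnityEngine;
    using UnityEngine.Networking;

    // Sketch: fetch an asset bundle from SpaceServer and instantiate a prefab.
    public class SpaceAssetLoader : MonoBehaviour
    {
        IEnumerator LoadAsset(string url, string assetName)
        {
            using (UnityWebRequest req = UnityWebRequestAssetBundle.GetAssetBundle(url))
            {
                yield return req.SendWebRequest();

                if (req.isNetworkError || req.isHttpError)
                {
                    Debug.LogError("Asset download failed: " + req.error);
                    yield break;
                }

                AssetBundle bundle = DownloadHandlerAssetBundle.GetContent(req);
                Instantiate(bundle.LoadAsset<GameObject>(assetName));
            }
        }
    }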

Google’s WorldSense

Lenovo just announced the Mirage Solo VR headset with Google’s WorldSense inside-out tracking capability. The result is an untethered VR headset which presumably has spatial mapping capabilities, allowing spatial maps to be saved and shared. If so, this would be a massive advance over ARKit- and ARCore-based AR, which makes persistence and collaboration all but impossible (the post here goes into a lot of detail about the various issues related to persistence and collaboration with current technology). The lack of a tether also gives it an edge over Microsoft’s (so-called) Mixed Reality headsets.

Google’s previous Tango system (that’s a Lenovo Phab 2 Pro running it above) had much more interesting capabilities than ARCore but has fallen by the wayside. In particular, Tango had an area learning capability that is missing from ARCore. I am very much hoping that something like this will exist in WorldSense so that virtual objects can be placed persistently in spaces and spatial maps can be shared, allowing multiple headsets to see exactly the same virtual objects in exactly the same places in the real space. Of course this isn’t all that helpful when used with a VR headset – but maybe someone will manage a pass-through or see-through mixed reality headset using WorldSense that enables persistent spatial augmentation, hopefully at a reasonable cost for ubiquitous use. If it were also able to perform real-time occlusion (where virtual objects can be occluded by real objects), that would be even better!

An interesting complement to this is the Lenovo Mirage stereo camera. This is capable of taking 180-degree videos and stills suitable for use with stereoscopic 3D displays, such as the Mirage headset. It suddenly occurred to me that this might be a way of hacking a pass-through AR capability for the Mirage before someone does it for real :-). This is kind of what Stereolabs are doing for existing VR headsets with their ZED Mini, except that that is a tethered solution. The nice thing would be to do this in an untethered way.