HoloLens Spectator View…without the HoloLens

I’ll explain the photo above in a moment. Microsoft’s Spectator View is a great device but not that practical in the general case. For example, the original requires modifications to the HoloLens itself and a fairly costly camera capable of outputting clean 1080p, 2k or 4k video on an HDMI port. Total cost can be more than $6000 depending on the camera used. My goal is to do much the same thing but without requiring a HoloLens and at a much lower cost – just using a standard camera with fairly simple calibration. Not only that, but I want to stream the mixed reality video across the internet using WebRTC for both conventional and stereo headsets (such as VR headsets).

So, why is there a HoloLens in the photo? This is the calibration setup. The camera that I am using for this Mixed Reality streaming system is a Stereolabs ZED. I have been working with this quite a bit lately and it seems to work extremely well. Notably it can produce a 2K stereo 3D output, a depth map and a 6 DoF pose, all available via a USB 3 interface and a very easy to use SDK.

Unlike Spectator View, the Unity Editor is not used on the desktop. Instead, a standard HoloLens UWP app is run on a Windows 10 desktop, along with a separate capture, compositor and WebRTC streamer program. There is some special code in the HoloLens app that talks to the rest of the streaming system. This code can stay in the app when it runs on a real HoloLens without causing problems (it just remains inactive in this case).

The calibration process determines, amongst other things, the actual field of view of the ZED and its orientation and position in the Unity scene used to create the virtual part of the mixed reality scene. This is essential in order to correctly render the virtual scene in a form that can be composited with the video coming from the ZED. This is why the HoloLens is placed in this prototype rig in the photo. It puts the HoloLens camera roughly in the same vertical plane as the ZED camera with a small (known) vertical offset. It’s not critical to get the orientation exactly right when fitting the HoloLens to the rig – this can be calibrated out very easily. The important thing is that the cameras see roughly the same field. That’s because the next step matches features in each view and, from the positions of the matches, can derive the field of view of the ZED and its pose offset from the HoloLens. This then makes it possible to set the Unity camera on the desktop in exactly the right position and orientation so that the scene it streams is correctly composed.
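To give a feel for how matched features can yield a field of view, here is a minimal sketch. It is not the actual calibration code. It assumes the two cameras are already rotationally aligned, that the features are far enough away for the small offset between the cameras to be ignored, and that the focal length of the HoloLens camera in pixels is known. All function names and numbers are hypothetical.

```python
import math

def estimate_focal_px(ref_pts, ref_focal_px, ref_center, zed_pts, zed_center):
    """Least-squares fit of the ZED focal length (in pixels).

    ref_pts / zed_pts are matched (x, y) pixel positions of the same
    features in the reference (HoloLens) image and the ZED image.
    Only the horizontal coordinate is used in this simplified sketch.
    """
    num = 0.0
    den = 0.0
    for (x_ref, _), (x_zed, _) in zip(ref_pts, zed_pts):
        # Tangent of the horizontal bearing angle seen by the reference camera.
        t = (x_ref - ref_center[0]) / ref_focal_px
        # Horizontal pixel offset of the same feature in the ZED image.
        u = x_zed - zed_center[0]
        # Pinhole model: u = f * t, so minimise sum((u - f * t)^2) over f.
        num += u * t
        den += t * t
    if den == 0.0:
        raise ValueError("matches must not all lie on the optical axis")
    return num / den

def horizontal_fov_deg(width_px, focal_px):
    """Horizontal field of view of a pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(width_px / (2.0 * focal_px)))

if __name__ == "__main__":
    # Hypothetical matches: reference camera 1280 px wide, ZED image 1920 px wide.
    ref = [(740.0, 300.0), (440.0, 400.0), (940.0, 360.0)]
    zed = [(1030.0, 500.0), (820.0, 560.0), (1170.0, 540.0)]
    f = estimate_focal_px(ref, 1000.0, (640.0, 360.0), zed, (960.0, 540.0))
    print("focal length (px):", f)
    print("horizontal FOV (deg):", horizontal_fov_deg(1920, f))
```

A real calibration would also have to solve for the rotation and translation between the cameras and reject bad matches; this sketch only shows the field-of-view part of the idea.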

Once the calibration step has completed, the HoloLens can be removed and used as required. The prototype version looks very ungainly like this! The real version will have a nice 3D printed bracket system that will also have the advantage of reducing the vertical separation and limiting the possible offsets.

In operation, the HoloLens apps running on both the HoloLens(es) and the desktop must share data about the Unity scene so that each device can compute exactly the same scene. In this way, everyone sees the same thing. I am actually using Arvizio’s own sharing system but any sharing system could be used. The Unity scene generated on the desktop is then composited with the ZED camera’s video feed and streamed over WebRTC. The nice thing about using WebRTC is that almost anyone with a Chrome or Firefox browser can display the mixed reality stream without having to install any plugins or extensions. It is also worth mentioning that the ZED does not have to remain fixed in place after calibration. Because it is able to measure its pose with respect to its surroundings, the ZED could potentially pan, tilt and dolly if that is required.
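The idea of letting the ZED move after calibration can be sketched as a chain of two transforms: the fixed pose found during calibration, and the motion the ZED has reported since then. The sketch below is an illustration only. It assumes that the ZED's tracking origin is its pose at calibration time, that all poses are already expressed in one consistent coordinate convention, and that only yaw rotation is needed for the example. The function names are hypothetical and are not taken from the ZED SDK or Unity.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def pose(yaw_deg, tx, ty, tz):
    """4x4 rigid transform: rotation about the vertical (y) axis, then translation."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return [[c,   0.0, s,   tx],
            [0.0, 1.0, 0.0, ty],
            [-s,  0.0, c,   tz],
            [0.0, 0.0, 0.0, 1.0]]

def virtual_camera_pose(world_from_zed_at_calib, zed_at_calib_from_zed_now):
    """Pose of the virtual camera in the shared scene.

    world_from_zed_at_calib:   result of the calibration step (fixed).
    zed_at_calib_from_zed_now: motion reported by the ZED since calibration.
    """
    return mat_mul(world_from_zed_at_calib, zed_at_calib_from_zed_now)

if __name__ == "__main__":
    calib = pose(90.0, 1.0, 1.5, 2.0)   # hypothetical calibration result
    moved = pose(0.0, 0.0, 0.0, 1.0)    # ZED has dollied 1 m along its own z axis
    cam = virtual_camera_pose(calib, moved)
    print("virtual camera position:", [row[3] for row in cam[:3]])
```

Any drift in the reported motion goes straight into the virtual camera pose, so the composited scene is only as stable as the tracking.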

6 thoughts on “HoloLens Spectator View…without the HoloLens”

  1. Could you please share some code that shows the conversion of the ZED’s data so that the SharingService accepts it to display the holograms in the SpectatorView, even when you move the ZED around.
    Will an object located under a table not be shown if the ZED is above that table, pointing its lens down at the table?

    1. Unfortunately the code is proprietary and does not work with the standard HoloLens sharing service anyway. As far as I am aware the HoloLens spatial anchor system is opaque. My solution was a calibration step where the offset between the ZED’s coordinate system and the HoloLens’s could be determined. I am not aware of any way that ZED data can be used to simulate a HoloLens spatial anchor.

      If you use the ZED Unity plugin, I believe it does do the kind of occlusion to which you are referring.

  2. Hi Richard,
    thanks for an interesting article!

    I’m just wondering why you need the HoloLenses at all in the SpectatorView? Can’t you just use the ZED for that? The ZED has 6DoF position tracking “built in”, like the HoloLenses, and it also has its camera function.


    1. Funny you should say that. I did actually develop code that used the ZED’s tracking capabilities in order to avoid the need for the HoloLens. However, there were some issues. One is that the ZED would easily lose tracking or drift. This may have been improved now but, at the time, meant that it was impractical to move the ZED once calibrated. Another is related to HoloLens sharing, where people with real HoloLenses are interacting with and modifying the scene. The HoloLens has great spatial mapping and anchor sharing capabilities but unfortunately they are pretty opaque and I could not see how to relate the information to the ZED. In the end, using a HoloLens for its spatial mapping and anchor sharing capabilities is just easier.

      1. I see. So the ZED loses tracking. I guess that’s an important thing in the ZED. Do you mean that the problem is only related to using the ZED as a tracking device together with the SharedHologram app? I guess one should notice that problem in other functions of the ZED too.

      2. I found that the ZED could not reliably track in my test environments, independent of the HoloLens sharing issue. For example, if you panned the ZED and then returned to the original position, the ZED would not always return its pose to the original pose (origin shift). I am sure there are scene-dependent limits to the capabilities of short baseline stereo depth sensing. The structured light system used by the HoloLens seems better in this regard but is by no means perfect – origin shifts have been seen with the HoloLens too. Accurate indoor pose tracking in unstructured environments is a very challenging problem! For many applications, some tracking error or origin shift would be acceptable though.

        What the ZED is great at though is real-time occlusion. The HoloLens typically loses its realism when a real object that should occlude a virtual object does not. As there is almost no downside to performing continuous depth sensing with the ZED, it is able to perform real-time occlusion processing.
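The real-time occlusion discussed in this thread comes down to a per-pixel depth test when the virtual scene is composited over the camera image. The following is a minimal sketch of that test, not the code used in the system described above. It assumes the real and virtual depth maps are aligned to the same pixels and use the same units, that `None` in the virtual depth means "no virtual content here", and that `None` in the real depth means "no valid measurement" and is treated as far away. All names are hypothetical.

```python
def composite(real_rgb, real_depth, virt_rgb, virt_depth):
    """Composite a virtual frame over a camera frame using a depth test.

    Each argument is a list of rows. A virtual pixel is shown only where
    virtual content exists and is closer than the measured real surface.
    """
    out = []
    for real_row, rdepth_row, virt_row, vdepth_row in zip(
            real_rgb, real_depth, virt_rgb, virt_depth):
        row = []
        for real_px, rz, virt_px, vz in zip(
                real_row, rdepth_row, virt_row, vdepth_row):
            # Assumption: a missing real depth is treated as "far away".
            virtual_visible = vz is not None and (rz is None or vz < rz)
            row.append(virt_px if virtual_visible else real_px)
        out.append(row)
    return out

if __name__ == "__main__":
    grey, red = (10, 10, 10), (255, 0, 0)
    real = [[grey, grey, grey, grey]]
    real_z = [[1.0, 3.0, None, 1.0]]   # metres from the camera
    virt = [[red, red, red, red]]
    virt_z = [[2.0, 2.0, 2.0, None]]   # last pixel has no virtual content
    print(composite(real, real_z, virt, virt_z))
```

In the example frame, the first virtual pixel is hidden because the real surface at 1 m is in front of the virtual object at 2 m, which is exactly the case of a real object occluding a hologram.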
