I wasn’t able to talk about this before, but now I guess the cat’s out of the bag. About two years ago, we helped a team of archaeologists and filmmakers to visualize a very large high-resolution aerial LiDAR scan of a chunk of dense Honduran rain forest in the CAVE. Early analyses of the scan had found evidence of ruins hidden under the foliage, and using LiDAR Viewer in the CAVE, we were able to get a closer look. The team recently mounted an expedition, and found untouched remains of not one, but two lost cities in the jungle. Read more about it at National Geographic and The Guardian. I want to say something cool and Indiana Jones-like right now, but I won’t.
“I decided to spectate a race he was in. I then discovered I could watch him race from his passenger seat. in VR. in real time. I can’t even begin to explain the emotions i was feeling sitting in his car, in game, watching him race. I was in the car with him. … I looked over to ‘him’ and could see all his steering movements, exactly what he was doing. I pictured his intense face as he was pushing for 1st.”
I don’t know if this effect has a name, or even needs one, but it parallels something we’ve observed through our work with Immersive 3D Telepresence:
Microsoft just announced HoloLens, which “brings high-definition holograms to life in your world.” A little while ago, Google invested heavily in Magic Leap, who, in their own words, “bring magic back into the world.” A bit longer ago, CastAR promised “a magical experience of a 3D, holographic world.” Earlier than that, zSpace started selling displays they used to call “virtual holographic 3D.” Then there is the current trailblazer in mainstream virtual reality, the Oculus Rift, and other, older, VR systems such as CAVEs.
While these things are quite different from a technical point of view, from a user’s point of view, they have a large number of things in common. Wouldn’t it be nice to have a short, handy term that covers them all, has a well-matching connotation in the minds of the “person on the street,” and distinguishes these things from other things that might be similar technically, but have a very different user experience?
How about the term “holographic?” Continue reading
We had a couple of visitors from Intel this morning, who wanted to see how we use the CAVE to visualize and analyze Big Datatm. But I also wanted to show them some aspects of our 3D video / remote collaboration / tele-presence work, and since I had just recently implemented a new multi-camera calibration procedure for depth cameras (more on that in a future post), and the alignment between the three Kinects in the IDAV VR lab’s capture space is now better than it has ever been (including my previous 3D Video Capture With Three Kinects video), I figured I’d try something I hadn”t done before, namely remotely interacting with myself (see Figure 1).
I caved and uploaded a snapshot of the current optical tracking sources, including a pre-release snapshot of upcoming Vrui-3.2-001 (please don’t use it outside of the tracking project; it’s bound to change some before it’s really released), to github: http://github.com/Doc-Ok/OpticalTracking.
In the previous part of this ongoing series of posts, I described how the Oculus Rift DK2′s tracking LEDs can be identified in the video stream from the tracking camera via their unique blinking patterns, which spell out 10-bit binary numbers. In this post, I will describe how that information can be used to estimate the 3D position and orientation of the headset relative to the camera; the first important step in full positional head tracking.
3D pose estimation, or the problem of reconstructing the 3D position and orientation of a known object relative to a single 2D camera, also known as the perspective-n-point problem, is a well-researched topic in computer vision. In the special case of the Oculus Rift DK2, it is the foundation of positional head tracking. As I tried to explain in this video, an inertial measurement unit (IMU) by itself cannot track an object’s absolute position over time, because positional drift builds up rapidly and cannot be controlled without an external 3D reference frame. 3D pose estimation via an external camera provides exactly such a reference frame. Continue reading
The final update/edit to my previous post was to report that I had managed to synchronize the DK2′s tracking LEDs to its camera’s video stream by following pH5′s ouvrt code, and that I was able to extract 5-bit IDs for each LED by observing changes in that LED’s brightness over time. Unfortunately I’ll have to start off right away by admitting that I made a bad mistake.
Understanding the DK2′s camera
Once I started looking more closely, I realized that the camera was only capturing 30 frames per second when locked to the DK2′s synchronization cable, instead of the expected 60. After downloading the data sheet for the camera’s imaging sensor, the Aptina MT9V034, and poring over the documentation, I realized that I had set a wrong vertical blanking interval. Instead of using a value of 5, as the official run-time and pH5′s code, I was using a value of 57, because that was the original value I found in the vertical blanking register before I started messing with the sensor. As it turns out, a camera — or at least this camera — captures video in the same way as a monitor displays it: padded with a horizontal and vertical blanking period. By leaving the vertical blanking period too large, I had extended the time it takes the camera to capture and send a frame across its host interface. Extended by how much? Well, the camera has a usable frame size of 752×480 pixels, a horizontal blanking interval of 94 pixels, and a (fixed) pixel clock of 26.66MHz. Using a vertical blanking interval of 5 lines, the total frame time is ((752+94)*(480+5)+4)/26.66MHz = 15.391ms (in case you’re wondering where the “+4″ comes from, so am I. It’s part of the formula in the data sheet). Using 57 as vertical blanking interval, the total frame time becomes ((752+94)*(480+57)+4)/26.66MHz = 17.041ms. Notice something? 17.041ms is longer than the synchronization pulse interval of 16.666ms. Oops. The exposure trigger for an odd frame arrives at a time when the camera is still busy processing the preceding even frame, and is therefore ignored, resulting in the camera skipping every odd frame and capturing at 30Hz. Lesson learned.
Over the weekend, a bunch of people from all over got together on reddit to try and figure out how the Oculus Rift DK2′s optical tracking system works. This was triggered by a call for help to develop an independent SDK from redditor /u/jherico, in response to the lack of an official SDK that works under Linux. That thread became quite unwieldy quickly, with lots of speculation, experimentation, and outright wrong information being thrown around, and then later corrected, but with the corrections nowhere near the wrong bits, etc. etc.
To get some order into things, I want to summarize what we have learned over the weekend, to serve as a starting point for further investigation. In a nutshell, we now know:
- How to turn on the tracking LEDs integrated into the DK2.
- How to extract the 3D positions and maximum emission directions of the tracking LEDs, and the position of the DK2′s inertial measurement unit in the same coordinate system.
- How to get proper video from the DK2′s tracking camera.
Here’s what we still don’t know:
- How to properly control the tracking LEDs and synchronize them with the camera. Update: We got that.
- How to extract lens distortion and intrinsic camera parameters for the DK2′s tracking camera. Update: Yup, we got that, too. Well, sort of.
- And, the big one, how to put it all together to calculate a camera-relative position and orientation of the DK2.
Let’s talk about all these points in a bit more detail. Continue reading
Now that I’ve gotten my Oculus Rift DK2 (mostly) working with Vrui under Linux, I’ve encountered the dreaded artifact often referred to as “black smear.” While pixels on OLED screens have very fast switching times — orders of magnitude faster than LCD pixels — they still can’t switch from on to off and back instantaneously. This leads to a problem that’s hardly visible when viewing a normal screen, but very visible in a head-mounted display due to a phenomenon called “vestibulo-ocular reflex.”
Basically, our eyes have built-in image stabilizers: if we move our head, this motion is detected by the vestibular apparatus in the inner ear (our “sense of equilibrium”), and our eyes automatically move the opposite way to keep our gaze fixed on a fixed point in space (interestingly, this even happens with the eyes closed, or in total darkness).
I’ve been getting a lot of questions about using the Rift DK2 under Linux with Vrui recently, so I figured I’d post a little progress report here instead of answering them individually.
The good news is that I have the DK2 working to the level of the DK1, i.e., I have orientational tracking, lens distortion correction, and chromatic aberration correction. I also have low persistence, but that came for free.
What I don’t have, and most probably won’t have until an official Linux SDK drops, is positional tracking. In order to replicate the work a team of computer vision experts at Oculus have been doing for the last year or so, I’d need a few clones and a time machine. That said, I am working on combining the DK1/DK2′s built-in IMU with other external tracking systems, such as Intersense IS-900 or NaturalPoint OptiTrack. That’s a much easier (but still tricky) problem, and would allow using the Rift as a headset for large-area VR. Probably not interesting for home users, but being able to walk around freely in an 18′x10′x7′ volume opens up entirely different VR applications.
I’m currently working hard on the next release of the Vrui toolkit (version 3.2-001), which will have at least the level of DK2 support that I have internally now (combined tracking might or might not make it, but that can already be faked, see 3D Video Capture With Three Kinects).
The reason why I’m not releasing right now is that I’m still trying to optimize the “user experience” by integrating the ideas I described in A Trip Down the Graphics Pipeline. The idea is that plugging in a Rift and starting a Vrui application should just work. I have most of that going; the only issue is telling OpenGL to sync to the vertical retrace on the Rift’s display, no matter what. Right now that can only be done via environment variable, and I’m looking for the right place in Vrui to set that variable from inside a program. It’s a work-around until Nvidia expose that functionality via their NV-CONTROL X extension, or, even better, via a GLX extension (are you listening, Nvidia?). Or, why not change the implementation of GLX_SGI_video_sync, which is already bound to a display and drawable, such that it always syncs to the first video controller servicing that drawable? Wouldn’t even require a specification change. Just an idea.
And last but not least, once I got the DK2 and its low-persistence screen working, I realized how cavalier I’ve been about low-level timing issues in Vrui. With screen-based VR and LCD-based HMDs it has simply never been an issue before, but now it’s pretty obvious. Good thing is, I think I have a handle on it.
In summary: it’ll be a little bit longer, but I’m on it. Will I be able to release before Oculus does their Linux SDK? Sure hope so! And just in case you think I’ve been sitting on my hands for the last six months: there are already about 300 large and small changes between 3.1-002 and 3.2-001.
And here is today’s unrelated picture: