I caved and uploaded a snapshot of the current optical tracking sources, including a pre-release snapshot of the upcoming Vrui-3.2-001 (please don’t use it outside of the tracking project; it’s bound to change somewhat before it’s really released), to GitHub: http://github.com/Doc-Ok/OpticalTracking.
In the previous part of this ongoing series of posts, I described how the Oculus Rift DK2’s tracking LEDs can be identified in the video stream from the tracking camera via their unique blinking patterns, which spell out 10-bit binary numbers. In this post, I will describe how that information can be used to estimate the 3D position and orientation of the headset relative to the camera, which is the first important step in full positional head tracking.
3D pose estimation, or the problem of reconstructing the 3D position and orientation of a known object relative to a single 2D camera, also known as the perspective-n-point problem, is a well-researched topic in computer vision. In the special case of the Oculus Rift DK2, it is the foundation of positional head tracking. As I tried to explain in this video, an inertial measurement unit (IMU) by itself cannot track an object’s absolute position over time, because positional drift builds up rapidly and cannot be controlled without an external 3D reference frame. 3D pose estimation via an external camera provides exactly such a reference frame.
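To put a rough number on how quickly IMU-only position drifts, here is a minimal sketch that double-integrates a small constant accelerometer bias. The bias of 0.01 m/s² and the 1000 Hz sample rate are illustrative assumptions, not measured DK2 values, and the function name `drift_from_bias` is made up for this example.

```python
def drift_from_bias(bias_m_s2, duration_s, rate_hz=1000):
    """Double-integrate a constant accelerometer bias.

    Returns the accumulated position error in meters. The bias and the
    sample rate are assumed values for illustration only.
    """
    dt = 1.0 / rate_hz
    velocity = 0.0
    position = 0.0
    for _ in range(int(duration_s * rate_hz)):
        velocity += bias_m_s2 * dt   # first integration: acceleration -> velocity
        position += velocity * dt    # second integration: velocity -> position
    return position


if __name__ == "__main__":
    for seconds in (1, 10, 60):
        error = drift_from_bias(0.01, seconds)
        print(f"after {seconds:2d} s: {error:.3f} m of position error")
```

The error grows with the square of elapsed time (roughly 0.5 · bias · t²), which is why nothing inside the IMU can rein it in and an external reference such as the camera is needed.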
The final update/edit to my previous post was to report that I had managed to synchronize the DK2’s tracking LEDs to its camera’s video stream by following pH5’s ouvrt code, and that I was able to extract 5-bit IDs for each LED by observing changes in that LED’s brightness over time. Unfortunately I’ll have to start off right away by admitting that I made a bad mistake.
Understanding the DK2’s camera
Once I started looking more closely, I realized that the camera was only capturing 30 frames per second when locked to the DK2’s synchronization cable, instead of the expected 60. After downloading the data sheet for the camera’s imaging sensor, the Aptina MT9V034, and poring over the documentation, I realized that I had set the wrong vertical blanking interval. Instead of using a value of 5, as the official run-time and pH5’s code do, I was using a value of 57, because that was the original value I found in the vertical blanking register before I started messing with the sensor.
As it turns out, a camera (or at least this camera) captures video in the same way as a monitor displays it: padded with a horizontal and vertical blanking period. By leaving the vertical blanking period too large, I had extended the time it takes the camera to capture and send a frame across its host interface. Extended by how much? Well, the camera has a usable frame size of 752×480 pixels, a horizontal blanking interval of 94 pixels, and a (fixed) pixel clock of 26.66MHz. Using a vertical blanking interval of 5 lines, the total frame time is ((752+94)*(480+5)+4)/26.66MHz = 15.391ms (in case you’re wondering where the “+4” comes from, so am I. It’s part of the formula in the data sheet). Using 57 as vertical blanking interval, the total frame time becomes ((752+94)*(480+57)+4)/26.66MHz = 17.041ms.
Notice something? 17.041ms is longer than the synchronization pulse interval of 16.666ms. Oops. The exposure trigger for an odd frame arrives at a time when the camera is still busy processing the preceding even frame, and is therefore ignored, resulting in the camera skipping every odd frame and capturing at 30Hz. Lesson learned.
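The frame-time arithmetic above is easy to replay for other blanking values. The following is only a sketch of the formula as quoted in the text; the function name `frame_time_ms` and the constant names are made up for this example, and the unexplained "+4" is passed through as `extra_clocks`.

```python
def frame_time_ms(width, height, h_blank, v_blank, pixel_clock_hz, extra_clocks=4):
    """Total frame time in milliseconds, per the formula quoted from the data sheet."""
    total_clocks = (width + h_blank) * (height + v_blank) + extra_clocks
    return total_clocks / pixel_clock_hz * 1000.0


# Values taken from the text above.
WIDTH, HEIGHT = 752, 480     # usable frame size in pixels
H_BLANK = 94                 # horizontal blanking interval in pixels
PIXEL_CLOCK_HZ = 26.66e6     # fixed pixel clock
SYNC_INTERVAL_MS = 16.666    # interval between synchronization pulses

if __name__ == "__main__":
    for v_blank in (5, 57):
        t = frame_time_ms(WIDTH, HEIGHT, H_BLANK, v_blank, PIXEL_CLOCK_HZ)
        if t < SYNC_INTERVAL_MS:
            verdict = "fits within the sync interval"
        else:
            verdict = "too long, every other trigger is ignored"
        print(f"v_blank={v_blank:2d}: {t:.3f} ms ({verdict})")
```

Run as a script, this reports 15.391 ms for a vertical blanking interval of 5 and 17.041 ms for 57, matching the numbers above.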
Over the weekend, a bunch of people from all over got together on reddit to try and figure out how the Oculus Rift DK2’s optical tracking system works. This was triggered by a call for help to develop an independent SDK from redditor /u/jherico, in response to the lack of an official SDK that works under Linux. That thread became quite unwieldy quickly, with lots of speculation, experimentation, and outright wrong information being thrown around, and then later corrected, but with the corrections nowhere near the wrong bits, etc. etc.
To get some order into things, I want to summarize what we have learned over the weekend, to serve as a starting point for further investigation. In a nutshell, we now know:
- How to turn on the tracking LEDs integrated into the DK2.
- How to extract the 3D positions and maximum emission directions of the tracking LEDs, and the position of the DK2’s inertial measurement unit in the same coordinate system.
- How to get proper video from the DK2’s tracking camera.
Here’s what we still don’t know:
- How to properly control the tracking LEDs and synchronize them with the camera. Update: We got that.
- How to extract lens distortion and intrinsic camera parameters for the DK2’s tracking camera. Update: Yup, we got that, too. Well, sort of.
- And, the big one, how to put it all together to calculate a camera-relative position and orientation of the DK2. 🙂 Update: Aaaaand, we got that, too.
Let’s talk about all these points in a bit more detail.
Step 1: System Preparation
If you are already running Linux, good for you. Skip the next paragraph.
If you don’t have Linux yet, go and grab it. I personally prefer Fedora, but it’s generally agreed that Ubuntu is the easiest to install for new Linux users, so let’s go with that. The Ubuntu installer makes it quite easy to install alongside an existing Windows OS on your system. Don’t bother installing Linux inside a virtual machine, though: that way Vrui won’t get access to your high-powered graphics cards, and performance will be abysmal. It won’t be able to talk to your Rift, either.
One of the first things to do after a fresh Linux install is to install the vendor-supplied drivers for your graphics card (if you don’t have a discrete Nvidia or ATI/AMD graphics card, go buy a GeForce!). Installing binary drivers is much easier these days. Here are instructions for Nvidia and ATI/AMD cards. If you happen to be on Fedora, enable the rpmfusion repositories and get the appropriate driver packages from there.