I know, the Oculus Rift DK2 is obsolete equipment, but nonetheless — there are a lot of them still out there, it’s still a decent VR headset for seated applications, I guess they’re getting cheaper on eBay now, and I put in all the work back then to support it in Vrui, so I might as well describe how to use it. If nothing else, the DK2 is a good way to watch DVD movies, or panoramic mono- or stereoscopic videos, in VR.
With the first commercial version of the Oculus Rift (Rift CV1) now trickling out of warehouses, and Rift DK2, HTC Vive DK1, and Vive Pre already being in developers’ hands, it’s time for a more detailed comparison between these head-mounted displays (HMDs). In this article, I will look at these HMDs’ lenses and optics in the most objective way I can, using a calibrated fish-eye camera (see Figures 1, 2, and 3).
I’ve been involved in some arguments about the inner workings of the Oculus Rift’s and HTC/Valve Vive’s tracking systems recently, and while I don’t want to get into any of that right now, I just did a little experiment.
The tracking update rate of the Oculus Rift DK2, meaning the rate at which Oculus’ tracking driver sends different position/orientation estimates to VR applications, is 1000 Hz. However, the updates arrive in pairs spaced 2 ms apart: the driver updates the position/orientation, and then updates it again immediately afterwards, 500 times per second.
This is not surprising at all, given my earlier observation that the DK2 samples its internal IMU at a rate of 1000 Hz, and sends data packets containing 2 IMU samples each to the host at a rate of 500 Hz. The tracking driver is then kind enough to process these samples individually, and pass updated tracking data to applications after it’s done processing each one. That second part is maybe a bit superfluous, but I’ll take it.
Here is a (very short excerpt of a) dump from the test application I wrote:
0.00199484: -0.0697729, -0.109664, -0.458555
6.645e-06 : -0.0698003, -0.110708, -0.458532
0.00199313: -0.069828 , -0.111758, -0.45851
6.012e-06 : -0.0698561, -0.112813, -0.458488
0.00200075: -0.0698847, -0.113875, -0.458466
6.649e-06 : -0.0699138, -0.114943, -0.458445
0.0019885 : -0.0699434, -0.116022, -0.458427
5.915e-06 : -0.0699734, -0.117106, -0.45841
0.0020142 : -0.070004 , -0.118196, -0.458393
5.791e-06 : -0.0700351, -0.119291, -0.458377
0.00199589: -0.0700668, -0.120392, -0.458361
6.719e-06 : -0.070099 , -0.121499, -0.458345
0.00197487: -0.0701317, -0.12261 , -0.45833
6.13e-06  : -0.0701651, -0.123727, -0.458314
0.00301248: -0.0701991, -0.124849, -0.458299
5.956e-06 : -0.0702338, -0.125975, -0.458284
0.00099399: -0.0702693, -0.127107, -0.458269
5.971e-06 : -0.0703054, -0.128243, -0.458253
0.0019938 : -0.0703423, -0.129384, -0.458238
5.938e-06 : -0.0703799, -0.130529, -0.458223
0.00200243: -0.0704184, -0.131679, -0.458207
7.434e-06 : -0.0704576, -0.132833, -0.458191
0.0019831 : -0.0704966, -0.133994, -0.458179
5.957e-06 : -0.0705364, -0.135159, -0.458166
0.00199577: -0.0705771, -0.136328, -0.458154
5.974e-06 : -0.0706185, -0.137501, -0.458141
The first column is the time interval between each row and the previous row, in seconds. The second to fourth columns are the reported (x, y, z) position of the headset.
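To make the pairing pattern explicit, the intervals can be split into long gaps (a new packet arrived) and short gaps (the second sample from the same packet). The following is a minimal sketch and not the test application itself: the function name and the 0.5 ms threshold are assumptions chosen for illustration, and the intervals are copied from the first eight rows of the dump.

```python
# Time intervals in seconds, copied from the first eight rows of the dump.
INTERVALS = [
    0.00199484, 6.645e-06, 0.00199313, 6.012e-06,
    0.00200075, 6.649e-06, 0.0019885, 5.915e-06,
]


def summarize_intervals(intervals, threshold=0.0005):
    """Classify update intervals and derive update and packet rates.

    threshold (seconds) is an assumed cut-off separating "a new packet
    arrived" gaps from "second sample of the same packet" gaps.
    """
    long_gaps = [dt for dt in intervals if dt >= threshold]
    short_gaps = [dt for dt in intervals if dt < threshold]
    total_time = sum(intervals)
    return {
        "updates_per_second": len(intervals) / total_time,
        "packets_per_second": len(long_gaps) / total_time,
        "mean_long_gap_ms": 1e3 * sum(long_gaps) / len(long_gaps),
        "mean_short_gap_us": 1e6 * sum(short_gaps) / len(short_gaps),
    }


if __name__ == "__main__":
    for name, value in summarize_intervals(INTERVALS).items():
        print(f"{name}: {value:.2f}")
```

On these eight rows, this works out to roughly 1000 updates per second, arriving in about 500 bursts per second.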
I hope this puts the myth to rest that the DK2 only updates its tracking data when it receives a new frame from the tracking camera, which is 60 times per second, and confirms that the DK2’s tracking is based on dead reckoning with drift correction. Now, while it is possible that the commercial version of the Rift does things differently, I don’t see a reason why it should.
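As a rough illustration of what "dead reckoning with drift correction" means, here is a one-dimensional toy sketch. It is not Oculus' actual algorithm: the function name, the blending gain, the bias value, and the way camera fixes are applied are all assumptions made for the example. The position is integrated from every IMU sample, and nudged toward a camera measurement only when one happens to be available.

```python
def dead_reckon(accels, camera_fixes, dt=0.001, gain=0.2):
    """Integrate accelerations into positions, correcting with camera fixes.

    accels: one acceleration (m/s^2) per IMU sample, dt seconds apart.
    camera_fixes: dict mapping a sample index to a camera-measured position (m).
    gain: assumed fraction of the position error removed per camera fix.
    Returns one position estimate per IMU sample.
    """
    position, velocity = 0.0, 0.0
    estimates = []
    for index, accel in enumerate(accels):
        # Dead reckoning: integrate acceleration twice.
        velocity += accel * dt
        position += velocity * dt
        # Drift correction: pull the estimate toward the camera measurement.
        if index in camera_fixes:
            position += gain * (camera_fixes[index] - position)
        estimates.append(position)
    return estimates


if __name__ == "__main__":
    # A stationary headset whose accelerometer has a small, hypothetical bias.
    biased = [0.1] * 1000                          # one second at 1000 Hz
    fixes = {i: 0.0 for i in range(0, 1000, 17)}   # roughly 60 fixes per second
    free = dead_reckon(biased, {})
    corrected = dead_reckon(biased, fixes)
    print(f"drift without camera: {free[-1] * 100:.2f} cm")
    print(f"drift with camera:    {corrected[-1] * 100:.2f} cm")
```

Note that the sketch produces a new estimate for every IMU sample, whether or not a camera frame arrived, which is the behavior the dump shows.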
PS: If you look closely, you’ll notice an outlier in rows 15 and 17: the first interval is 3ms, and the second interval is only 1ms. One sample missed the 1000 Hz sample clock, and was delivered on the next update.
I’ve been busy finalizing the upcoming 4.0 release of the Vrui VR toolkit (it looks like I will have full support for Oculus Rift DK2 just before it is obsoleted by the commercial version, haha), and needed a short break.
So I figured I’d do something I’ve never done before in VR, namely, watch a full-length theatrical movie. I’m still getting DVDs from Netflix like it’s 1999, and I had “Avengers: Age of Ultron” at hand. The only problem was that I didn’t have a VR-enabled movie player.
Well, how hard can that be? Not hard at all, as it turns out. I installed the development packages for the xine multimedia framework, browsed through their hacker’s guide, figured out where to intercept audio buffers and decoded video frames, and three hours later I had a working prototype. A few hours more, and I had a user interface, full DVD menu navigation, a scrub bar, and subtitles. In 737 lines of code, a big chunk of which is debugging output to trace the control and data flow of the xine library. So yeah, libxine is awesome.
Then it was time to pull the easy chair into the office, start VruiXine, put on the Rift, map DVD navigation controls to the handy SteelSeries Stratus XL bluetooth gamepad they were giving away at Oculus Connect2, and relax (see Figure 1).
I finally managed to get the Oculus Rift DK2 fully supported in my Vrui VR toolkit, and while there are still some serious issues, such as getting the lens distortion formulas and internal HMD geometry exactly right, I’ve already noticed something really neat.
I have a bunch of graphically simple applications that run at ridiculous frame rates (some get several thousand fps on an Nvidia GeForce GTX 770), and with some new rendering configuration options in Vrui 4.0 I can disable vsync, and render directly into the display window’s front buffer. In other words, I can let these applications “race the beam.”
There are two main results of disabling vsync and rendering into the front buffer: For one, the CPU and graphics card get really hot (so this is not something you want to do naively). But second, let’s assume that some application can render 1,000 fps. This means, every millisecond, a new complete video frame is rendered into video scan-out memory, where it gets picked up by the video controller and sent across the video link immediately. In other words, almost every line of the Rift’s display gets a “fresh” image, based on the most up-to-date tracking data, and flashes this image to the user’s retina without further delay. Put differently, total motion-to-photon latency for the entire screen is now down to around 1ms. And the result of that is by far the most solid VR I’ve ever seen.
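The arithmetic behind that estimate can be sketched as follows. The 75 Hz refresh rate and the 1920 scan lines are assumed values for the DK2's panel, blanking intervals are ignored, and the function names are made up for this example. The point is only that at 1,000 fps, each rendered frame stays current for a small slice of the screen before a fresher one replaces it.

```python
def lines_per_rendered_frame(render_fps, refresh_hz=75.0, scan_lines=1920):
    """Number of display lines scanned out while one rendered frame is current.

    refresh_hz and scan_lines are assumed display parameters; vertical
    blanking is ignored for simplicity.
    """
    lines_per_second = refresh_hz * scan_lines
    return lines_per_second / render_fps


def frame_interval_ms(render_fps):
    """Time between two consecutively completed frames, in milliseconds."""
    return 1000.0 / render_fps


if __name__ == "__main__":
    for fps in (75, 1000):
        print(f"{fps} fps: a new frame every {frame_interval_ms(fps):.2f} ms, "
              f"covering {lines_per_rendered_frame(fps):.0f} lines each")
```

Under these assumptions, an application rendering at the display's own refresh rate covers the whole screen with a single frame, while one rendering at 1,000 fps swaps in a new frame every 144 lines.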
Not entirely useful, but pretty cool nonetheless.
Last Friday I made a trek down to the San Francisco peninsula, to visit and chat with a couple of other VR folks: Cyberith, SVVR, and AltspaceVR. In the process, I also had the chance to try a couple of VR devices I hadn’t seen before.
Virtual locomotion, and its nasty side effect, simulator sickness, are a pretty persistent problem and timely topic with the arrival of consumer VR just around the corner. Many enthusiasts want to use VR to explore large virtual worlds, as in taking a stroll through the frozen tundra of Skyrim or the irradiated wasteland of Fallout, but as it turns out, that’s one of the hardest things to do right in VR.
We had a couple of visitors from Intel this morning, who wanted to see how we use the CAVE to visualize and analyze Big Data™. But I also wanted to show them some aspects of our 3D video / remote collaboration / tele-presence work, and since I had just recently implemented a new multi-camera calibration procedure for depth cameras (more on that in a future post), and the alignment between the three Kinects in the IDAV VR lab’s capture space is now better than it has ever been (including my previous 3D Video Capture With Three Kinects video), I figured I’d try something I hadn’t done before, namely remotely interacting with myself (see Figure 1).
I caved and uploaded a snapshot of the current optical tracking sources, including a pre-release snapshot of upcoming Vrui-3.2-001 (please don’t use it outside of the tracking project; it’s bound to change some before it’s really released), to GitHub: http://github.com/Doc-Ok/OpticalTracking.
In the previous part of this ongoing series of posts, I described how the Oculus Rift DK2’s tracking LEDs can be identified in the video stream from the tracking camera via their unique blinking patterns, which spell out 10-bit binary numbers. In this post, I will describe how that information can be used to estimate the 3D position and orientation of the headset relative to the camera; the first important step in full positional head tracking.
3D pose estimation, or the problem of reconstructing the 3D position and orientation of a known object relative to a single 2D camera, also known as the perspective-n-point problem, is a well-researched topic in computer vision. In the special case of the Oculus Rift DK2, it is the foundation of positional head tracking. As I tried to explain in this video, an inertial measurement unit (IMU) by itself cannot track an object’s absolute position over time, because positional drift builds up rapidly and cannot be controlled without an external 3D reference frame. 3D pose estimation via an external camera provides exactly such a reference frame.
The final update/edit to my previous post was to report that I had managed to synchronize the DK2’s tracking LEDs to its camera’s video stream by following pH5’s ouvrt code, and that I was able to extract 5-bit IDs for each LED by observing changes in that LED’s brightness over time. Unfortunately I’ll have to start off right away by admitting that I made a bad mistake.
Understanding the DK2’s camera
Once I started looking more closely, I realized that the camera was only capturing 30 frames per second when locked to the DK2’s synchronization cable, instead of the expected 60. After downloading the data sheet for the camera’s imaging sensor, the Aptina MT9V034, and poring over the documentation, I realized that I had set a wrong vertical blanking interval. Instead of using a value of 5, as the official run-time and pH5’s code do, I was using a value of 57, because that was the original value I found in the vertical blanking register before I started messing with the sensor.
As it turns out, a camera, or at least this camera, captures video in the same way as a monitor displays it: padded with a horizontal and vertical blanking period. By leaving the vertical blanking period too large, I had extended the time it takes the camera to capture and send a frame across its host interface.
Extended by how much? Well, the camera has a usable frame size of 752×480 pixels, a horizontal blanking interval of 94 pixels, and a (fixed) pixel clock of 26.66 MHz. Using a vertical blanking interval of 5 lines, the total frame time is ((752+94)*(480+5)+4)/26.66 MHz = 15.391ms (in case you’re wondering where the “+4” comes from, so am I. It’s part of the formula in the data sheet). Using 57 as vertical blanking interval, the total frame time becomes ((752+94)*(480+57)+4)/26.66 MHz = 17.041ms. Notice something? 17.041ms is longer than the synchronization pulse interval of 16.666ms. Oops. The exposure trigger for an odd frame arrives at a time when the camera is still busy processing the preceding even frame, and is therefore ignored, resulting in the camera skipping every odd frame and capturing at 30Hz. Lesson learned.
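The frame time arithmetic above is easy to reproduce. The sketch below uses the formula and sensor values quoted in the text; the function names are made up for this example, and the capture-rate function is a simplified model that assumes any trigger arriving while the camera is still busy with the previous frame is simply dropped.

```python
import math


def frame_time_ms(vblank_lines, width=752, height=480, hblank_pixels=94,
                  pixel_clock_hz=26.66e6):
    """Total frame time in ms, using the formula quoted from the data sheet."""
    pixel_clocks = (width + hblank_pixels) * (height + vblank_lines) + 4
    return 1e3 * pixel_clocks / pixel_clock_hz


def capture_rate_hz(vblank_lines, sync_hz=60.0):
    """Capture rate under a simplified model: triggers that arrive while the
    camera is still busy with the previous frame are ignored."""
    sync_interval_ms = 1e3 / sync_hz
    triggers_per_frame = math.ceil(frame_time_ms(vblank_lines) / sync_interval_ms)
    return sync_hz / triggers_per_frame


if __name__ == "__main__":
    for vblank in (5, 57):
        print(f"vblank={vblank}: {frame_time_ms(vblank):.3f} ms, "
              f"{capture_rate_hz(vblank):.0f} Hz")
```

With a vertical blanking interval of 5 the frame fits inside one synchronization interval and the model yields 60 Hz; with 57 it spills over into the next interval and the rate drops to 30 Hz.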