I just got back from the Silicon Valley Virtual Reality Conference & Expo in the awesome Computer History Museum in Mountain View, just across the street from Google HQ. There were talks, there were round tables, there were panels (I was on a panel on non-game applications enabled by consumer VR, livestream archive here), but most importantly, there was an expo for consumer VR hardware and software. Without further ado, here are my early reports on what I saw and/or tried.
I finally got to try the DK2 for the first time, and it lived up to my expectations. Oculus were showing the “Couch Knights” demo, but I must admit I wasn’t really paying much attention to it (which is why the Oculus guy in the other seat beat me silly, or at least that’s my excuse). I was primarily checking out the positional head tracking, the higher resolution (it’s higher), the new subpixel arrangement (it’s a different, less distracting screen pattern than DK1’s “screen door”), and the low-persistence display (it blurs less). I feel that the DK2’s screen is at the point where it’s not a problem anymore.
So let’s focus on the big thing, camera-based positional head tracking. I’m glad to report that it works. Tracking has no noticeable jitter and low latency, but I couldn’t judge the overall accuracy because the demo was not adjusted for my IPD, which is on the narrow side, and everything, including tracking, was off scale. That said, based on the technology used and the high quality of the rest of the system, I don’t expect any issues.
In my nixed ANTVR diatribe I make the bold claim that the qualitative difference in experience between a fully position-tracked HMD like the DK2, and an orientation-tracked HMD like the DK1, is on the same order of magnitude as that between an orientation-tracked HMD and a regular computer screen (which is why the ANTVR cannot be better than DK2/Morpheus regardless of its other tech specs, but never mind that now). I now have to amend that claim: the improvement from a position-tracked HMD isn’t quite as big when your ass is glued to a seat. I understand why Oculus is preaching a “seated VR experience” (for liability reasons if nothing else), but I hope they’ll tell developers, sub rosa if necessary, that their software should work in, and fully exploit, a standing experience, even if the current tracking system doesn’t quite support it. Your body knows where the floor should be, and if you are sitting down, but your player avatar is standing up, it just doesn’t feel right. The inverse of this rule is, of course, that racing or other cockpit games should only be played in the seated position. To meander back to the topic, this explains why Couch Knights was chosen as a demo. The camera’s tracking volume was large enough that I stayed inside it no matter how hard I leaned out of the chair. I did not stand up, but others who did told me tracking cut out for them. The fix is to adjust the camera position based on how you want to use the system.
The biggest thing, however, that occurred to me while trying the DK2 with Couch Knights — and this is no criticism of the hardware or software — was embodiment. In my recent Kinect video I muse about the uncanny valley, the observation that the closer a visual representation of a human gets to reality, the more minor flaws are magnified, and the less real it looks. In Couch Knights, there is a fully developed player avatar that sits in the same pose as the player, and even holds the same game controller. When the Oculus guy handed me the controller, I looked down (already wearing the Rift) to place my thumbs on the analog sticks on the unfamiliar controller layout. But according to the avatar, my thumbs were already on the sticks. As I had just worked with 3D video embodiment before, and subconsciously expected that the virtual view of my hands would represent reality, that brought my brain to a grinding halt, and it took me a long time to get around that disconnect and find the sticks. I had to force myself to close my eyes and feel around for the sticks, because visual cues from the avatar were completely misleading. Even after that, the player avatar never felt like my body. Couch Knights tries very hard, even animating the upper avatar body to match head position, but by trying so hard, it failed even harder. That’s the uncanny valley in a nutshell. I haven’t heard this specific complaint from anybody else, so it might just be my personal problem. Maybe playing with 3D video embodiment heightened my sensitivity for this discrepancy.
Bottom line: The Oculus Rift DK2 is a proper VR HMD, albeit slightly limited by the recommended “seated VR experience.” Not a problem if developers and users are willing to “void the warranty.”
The Morpheus (let’s just call it that for now) was a surprise to me. There has been a lot of skepticism about it, especially on the Oculus subreddit (maybe a bit of brand loyalty there), but I found it an unqualified success. Just like with DK2, I completely ignored what I was supposed to do in the demo, and focused on head tracking, screen resolution, latency, etc. I think the demo was set up to progress and then self-terminate after certain player actions, so by never doing those like I was supposed to, I spent way longer in there than the Sony guy wanted me to, but that was his problem (there was no line). Unlike Couch Knights, Sony’s demo didn’t have a full-body avatar, only floating gauntlets, but that didn’t bother me much.
Anyway, I’m confident in saying that the Morpheus is, taken by itself, on the same quality level as the DK2. I wasn’t able to try the two back-to-back, so I cannot judge which of the two is maybe slightly better than the other, but I think the difference is at most minor. I did notice that the Morpheus wasn’t fully enclosed, and I was able to see a sliver of reality underneath the screens. Chalk one up for Oculus. The big difference was that Sony was bold enough to aim for a standing VR experience, and that paid off big time for me. I was able to walk around the training dummy I was supposed to punch and slice, peer through the gaps in its armor, poke my head through its chest, and all those fun things. There were a few minor glitches in tracking as I walked around, but nothing disruptive. After I was done I realized I had forgotten to try tracking while facing away from the camera, which should work due to the Morpheus’ back-facing LEDs. Oops.
The biggest revelation about the Morpheus system, for me, were the Playstation Move controllers. I have been working with the Move in my copious spare time for a while, to develop a hybrid inertial/optical tracking driver for full 6-DOF, low-latency tracking in the Vrui VR toolkit. Idea in a nutshell: low-latency inertial tracker tracks position and orientation via dead reckoning, high-latency camera corrects for positional drift retroactively. Result: globally accurate tracking with the latency of the inertial tracker. Problem: really tricky. I have long been convinced that it would work great, but seeing Sony’s implementation really motivated me to go back to it and get it done as soon as possible. It simply worked, even when aiming a virtual crossbow at far-away targets. The controllers were tracked by the same Playstation 4 Eye camera as the headset itself. Granted, the DK2’s game controller was more appropriate to remote-control a mini-knight in Couch Knights, but Morpheus let me dismember a life-size training dummy at 1:1 scale. Guess which I’d rather do. Random thought: did Sony intentionally bring the “standing knight” demo to one-up Couch Knights? If so: well played, Sony.
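I obviously can’t show Sony’s implementation, and my own Vrui driver isn’t done yet, but here is a minimal sketch of the hybrid tracking idea as described above: dead reckoning at IMU rate, with delayed camera measurements applied as retroactive position corrections. All names, the 0.5 s history window, and the blending gain are my own assumptions:

```cpp
#include <array>
#include <deque>
#include <utility>

using Vec3 = std::array<double, 3>;

class HybridTracker {
    Vec3 position{{0.0, 0.0, 0.0}}; // current dead-reckoned position
    Vec3 velocity{{0.0, 0.0, 0.0}};
    std::deque<std::pair<double, Vec3>> history; // (timestamp, position)

public:
    // Called at IMU rate (hundreds of Hz) with gravity-compensated,
    // world-frame linear acceleration:
    void integrate(double t, const Vec3& accel, double dt) {
        for (int i = 0; i < 3; ++i) {
            velocity[i] += accel[i] * dt;
            position[i] += velocity[i] * dt;
        }
        history.emplace_back(t, position);
        while (!history.empty() && history.front().first < t - 0.5)
            history.pop_front(); // keep half a second of history
    }

    // Called at camera rate (tens of Hz). tExposure is when the camera
    // actually saw the device, which is already in the past by the time
    // the measurement arrives here:
    void correct(double tExposure, const Vec3& cameraPos, double gain) {
        // Find what we *reported* at the camera's exposure time:
        Vec3 predicted = position;
        for (const auto& h : history)
            if (h.first >= tExposure) { predicted = h.second; break; }
        // The residual is the drift accumulated up to tExposure. Since
        // drift varies slowly, applying (part of) it to the current state
        // corrects the past error without adding the camera's latency.
        // A real filter would also correct velocity, and possibly
        // re-integrate the IMU samples received after tExposure.
        for (int i = 0; i < 3; ++i)
            position[i] += (cameraPos[i] - predicted[i]) * gain;
    }

    const Vec3& getPosition() const { return position; }
};
```

The point of the history buffer is that the camera measurement is compared against what the tracker believed at exposure time, not at arrival time — that’s the “retroactively” part, and it’s what makes the whole thing tricky to get right.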
Bottom line: Sony’s Project Morpheus is a proper VR HMD, and as a system combined with two Playstation Move controllers, it’s very close in experience to my CAVE, and a total blast. Since Oculus Rift DK2 could be combined with Playstation Move controllers as well, there is no clear winner — besides the users, that is.
I’m a big fan of the Razer Hydra, warts and all, and I was looking forward to trying its descendant, the STEM. Sixense were showing two demos: Portal 2, and a simple shooting gallery where the player could pick up two revolvers by using the tracked handles, and then aim them at targets and blast away. I agreed with the Sixense folks that the shooting gallery made a better demo because it focused on the STEM’s tracking abilities, without distracting the player with puzzles, a story, and glorious vistas. Yet again I ignored the demo completely, and instead specifically aimed for the Hydra’s weak points in tracking: latency (felt good); global field warp, i.e., large-scale positional displacements inherent to the electro-magnetic technology; and orientational dependence of positional tracking, i.e., lateral displacements in response to purely rotational movements, caused by sub-par magnetometer calibration inside the tracked handles.
My experiments must have looked really strange to onlookers. For example, I was holding the two handles such that the virtual guns’ muzzles touched in virtual space, and then rotated one handle around the point in real space where I felt its real muzzle would be. In a perfect tracker, the virtual muzzle would have stayed precisely in place, and while there was still significant displacement, it was a lot less bad than with the Hydra. Field warp, on the other hand, was just as before. Field warp is a property of the local environment itself, and completely outside the control of the tracking hardware. There is no way to detect and correct it without an independent secondary tracking system using another technology, such as an optical one. It might be possible to correct for field warp using two different electro-magnetic technologies simultaneously, say AC vs pulsed DC, but I don’t think STEM is doing that. That said, in most cases field warp is not a deal breaker. It is not good when there is something else in the virtual environment that is tracked with global accuracy, such as when combining 3D video with Hydra handles, but in other circumstances the player can adapt to it.
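For completeness: if the warp is static (fixed furniture, no moving metal), the classic offline workaround is to measure the tracker’s error once at known positions — using some ground-truth reference — and then correct reports at runtime by interpolating in the stored grid. A minimal sketch, with all names and the grid layout my own invention:

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <vector>

using Vec3 = std::array<float, 3>;

struct WarpGrid {
    Vec3 origin;              // world position of grid vertex (0,0,0)
    float cellSize;           // grid spacing in meters
    int nx, ny, nz;           // number of grid vertices per axis
    std::vector<Vec3> error;  // (true - reported) position at each vertex

    const Vec3& at(int i, int j, int k) const {
        return error[(k * ny + j) * nx + i];
    }

    // Correct a reported position by adding the trilinearly interpolated
    // error vector:
    Vec3 correct(const Vec3& p) const {
        int dims[3] = {nx, ny, nz};
        int i0[3];
        float f[3];
        for (int d = 0; d < 3; ++d) {
            float g = (p[d] - origin[d]) / cellSize;
            i0[d] = std::max(0, std::min(dims[d] - 2, (int)std::floor(g)));
            f[d] = std::max(0.0f, std::min(1.0f, g - (float)i0[d]));
        }
        Vec3 result = p;
        for (int c = 0; c < 8; ++c) { // the 8 surrounding grid vertices
            int i = i0[0] + (c & 1);
            int j = i0[1] + ((c >> 1) & 1);
            int k = i0[2] + ((c >> 2) & 1);
            float w = ((c & 1) ? f[0] : 1.0f - f[0])
                    * (((c >> 1) & 1) ? f[1] : 1.0f - f[1])
                    * (((c >> 2) & 1) ? f[2] : 1.0f - f[2]);
            for (int d = 0; d < 3; ++d)
                result[d] += w * at(i, j, k)[d];
        }
        return result;
    }
};
```

This doesn’t contradict what I said above — the grid still has to be measured with an independent system, just once instead of continuously, and it breaks the moment the environment changes.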
Apart from that, there was a little bit of jitter, particularly noticeable when aiming the virtual guns at far-away targets. I think it will require fine-tuning of filter coefficients, trading off jitter against latency, to get this dialed in just so. I think this will have to be exposed to the end user, say via a slider in a UI (Vrui does it via a configuration file setting).
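To illustrate the kind of knob I mean, here is the simplest possible version: a one-pole exponential smoothing filter with a single user-facing coefficient. This is my own sketch, not Sixense’s actual filtering:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

class JitterFilter {
    Vec3 state{{0.0, 0.0, 0.0}};
    bool primed = false;
    double smoothing; // in [0,1): the knob a UI slider or config file exposes

public:
    explicit JitterFilter(double smoothing) : smoothing(smoothing) {}

    // smoothing = 0 passes raw samples through (minimal latency, maximal
    // jitter); values near 1 smooth heavily (less jitter, more lag):
    Vec3 filter(const Vec3& raw) {
        if (!primed) { state = raw; primed = true; return state; }
        for (int i = 0; i < 3; ++i)
            state[i] = smoothing * state[i] + (1.0 - smoothing) * raw[i];
        return state;
    }
};
```

Real trackers use fancier filters than this, but the fundamental tradeoff — and therefore the need for an exposed tuning parameter — is the same.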
There was one major glitch: while I was messing around, one of the handles froze — not completely, but only in position. It still reacted to rotations, but I couldn’t move it any more. I don’t think it was a hardware issue (there is no way to explain that behaviour given the tracking technology), but a driver problem. Major clue: a restart fixed it. I’m apparently good at breaking demos (I “broke” the Hydra in The Gallery: Six Elements as well).
Looking forward, the Sixense folks mentioned their plan of putting accelerometers and gyroscopes into the tracked handles, to reduce latency and improve orientational stability. That sounds like a good idea to me, I hope they’ll be able to pull it off.
Bottom line: It’s still an electro-magnetic tracker with all the concomitant issues, but much improved compared to Razer Hydra, and with a reasonable chance for further improvements before hitting the market.
This one I was really curious about, because I hadn’t seen much about it, and was skeptical of some of its claims (such as zero-drift inertial positional tracking). After having looked at it very closely (though sadly they didn’t let me try it myself), I think I understand exactly how it works. The trick to achieving zero-drift inertial positional tracking is to — wait for it — not do inertial positional tracking at all. But then how does it track the position of your hands and feet? Via zero-drift orientational tracking, which isn’t quite as hard.
Take the “PrioVR Lite,” which has one IMU on each upper arm, one on each lower arm, and one on each hand. If you assume that the body’s center of gravity (CoG) and the shoulders don’t move, then the orientation of the upper arm implies the position of the elbow joint based on the shoulder position and the length of the humerus (upper arm bone). To that, add the orientation of the lower arm and the length of the radius and ulna, and you have the wrist position, and so forth. It’s basic forward kinematics. It’s the same for the head; head orientation and relative position of head over the shoulders yield head position (just like the neck model in the Rift DK1). Finally, the chest sensor measures upper body orientation, i.e., leaning, and defines the root point for the upper body kinematic chains.
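Here is a minimal sketch of that kinematic chain for one arm. The rest-pose convention (arm hanging straight down along -Y) and all names are my own assumptions; the only inputs are the two IMU orientations and the user’s bone lengths:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
struct Quat { double w, x, y, z; }; // unit quaternion

// Rotate vector v by unit quaternion q (v' = q v q^-1):
Vec3 rotate(const Quat& q, const Vec3& v) {
    Vec3 u{q.x, q.y, q.z};
    // t = 2 * cross(u, v); v' = v + w * t + cross(u, t)
    Vec3 t{2.0 * (u[1] * v[2] - u[2] * v[1]),
           2.0 * (u[2] * v[0] - u[0] * v[2]),
           2.0 * (u[0] * v[1] - u[1] * v[0])};
    Vec3 c{u[1] * t[2] - u[2] * t[1],
           u[2] * t[0] - u[0] * t[2],
           u[0] * t[1] - u[1] * t[0]};
    return {v[0] + q.w * t[0] + c[0],
            v[1] + q.w * t[1] + c[1],
            v[2] + q.w * t[2] + c[2]};
}

struct ArmPose { Vec3 elbow, wrist; };

// Elbow and wrist positions from a fixed shoulder position, the two IMU
// orientations (in a common world frame), and the user's bone lengths:
ArmPose forwardKinematics(const Vec3& shoulder,
                          const Quat& upperArm, double humerusLength,
                          const Quat& lowerArm, double forearmLength) {
    Vec3 rest{0.0, -1.0, 0.0}; // rest pose: arm hanging straight down
    Vec3 upperDir = rotate(upperArm, rest);
    Vec3 lowerDir = rotate(lowerArm, rest);
    ArmPose pose;
    for (int i = 0; i < 3; ++i) {
        pose.elbow[i] = shoulder[i] + upperDir[i] * humerusLength;
        pose.wrist[i] = pose.elbow[i] + lowerDir[i] * forearmLength;
    }
    return pose;
}
```

Note that position never appears as a sensor input — it falls entirely out of orientations and bone lengths, which is why there is no positional drift, and also why the bone lengths have to be right.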
The problem, of course, is that the skeleton model used for forward kinematics must match the user’s real body fairly well, or the tracked position of the extremities won’t match reality 1:1. This showed up very obviously in the PrioVR demo: the guy wearing the suit was sending text messages on his cellular telephone a lot, so he was standing still with his hands held closely together for extended periods of time. Great opportunity for me to sneak up to the big screen and check the hand positions of his avatar, which were at least one hip width apart. This means PrioVR didn’t bother to calibrate the kinematics skeleton to the guy demoing their system. Oops.
Now I said up there “assume that the shoulders don’t move.” What happens if they do? I asked the guy to show me what happens when he lifts his shoulders while leaving his arms hanging freely (at that point he already thought I was a weirdo, so no further harm), and, confirming my hypothesis, the arms and hands of his avatar didn’t move at all. So that’s how they do it. Inertial positional tracking would have picked up that motion, but then, it would never work as a system due to drift.
That leaves the question: how does PrioVR do locomotion? One selling point of the suit is that it can track the user’s position throughout larger spaces. Here’s how: I don’t think the “Lite” suit supports locomotion at all (confirmed via web site). The suit demoed at the expo was a “Core” suit, which tracks upper and lower legs. As before, we assume that the body’s CoG doesn’t move (or, thinking differently, that the entire body moves relative to the CoG). Then the orientation of the upper and lower legs yields ankle position, and we can get foot position from that assuming the ankles don’t rotate. If the user stands still, both feet will be at the same elevation below the CoG, and the software can decide that they are both on the ground, so it will lock the user’s avatar to the virtual floor (that works for crouching, too). Now the user lifts one leg, which the software can detect easily, and puts it down again. Did it go straight back down, or forward to take a step? The software can estimate the foot’s arc based on the same forward kinematics, and therefore predict the new location of the foot on the virtual floor once it comes down again. Apply half the difference from old to new foot position to the CoG, and the avatar just took a half step forward. Now follow with the other foot and repeat, and the avatar is walking. Basically: whichever foot is lower below the CoG gets locked to the floor (this entire description is simplified, of course). Now the big question: can the PrioVR suit detect a jump, and distinguish a jump straight up from a jump forward, and estimate jump distance? I don’t think so, but if it does at least one of these, major kudos. Update: comments on reddit from PrioVR themselves confirm my hypothesis: correct operation of PrioVR requires at least one foot on the ground at all times. So no jumping or running, and orientation-based forward kinematics is the best explanation of how the PrioVR works.
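Here is a rough sketch of the foot-locking scheme I just described — my guess at the approach, not PrioVR’s actual code. Whichever foot is lower relative to the CoG is assumed planted, and the avatar root is moved so that foot stays fixed in the world:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

class FootLockLocomotion {
    Vec3 avatarPos{{0.0, 0.0, 0.0}};    // avatar root in world space
    Vec3 plantedWorld{{0.0, 0.0, 0.0}}; // world position of the locked foot
    int plantedFoot = 0;                // 0 = left, 1 = right
    bool primed = false;

public:
    // leftFoot/rightFoot are foot positions relative to the CoG, as
    // produced by the leg forward kinematics. Y is up.
    Vec3 update(const Vec3& leftFoot, const Vec3& rightFoot) {
        const Vec3* feet[2] = {&leftFoot, &rightFoot};
        int lower = (leftFoot[1] <= rightFoot[1]) ? 0 : 1;

        if (!primed || lower != plantedFoot) {
            // Foot strike: lock the newly planted foot to the world where
            // the kinematics say it came down.
            plantedFoot = lower;
            primed = true;
            for (int i = 0; i < 3; ++i)
                plantedWorld[i] = avatarPos[i] + (*feet[lower])[i];
        } else {
            // Same foot still planted: move the avatar root so that foot
            // stays fixed in the world while the body moves over it.
            for (int i = 0; i < 3; ++i)
                avatarPos[i] = plantedWorld[i] - (*feet[plantedFoot])[i];
        }
        return avatarPos;
    }
};
```

Deciding the planted foot by a raw height comparison, as done here, is also exactly the mechanism that turns measurement jitter into spurious mini-steps — which brings us to the next point.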
As it turns out, in practice locomotion has some problems. Jitter in orientational measurements leads to noisy foot positions, and applying a hard threshold (I guess) for step detection means that the avatar might take spurious mini-steps, or miss some small intended steps. And that’s exactly what happened: while the demo guy was standing still, his avatar’s feet were continuously dancing a little jig (which looked really funny on screen, because it was a very serious-looking avatar) and the avatar was (very slowly) moonwalking through the virtual environment. Finally, the avatar’s step size noticeably differed from real step size, but that was most probably due to skeleton mismatch, just like with the hands (where it was easier to confirm). The “Pro” suit might improve locomotion somewhat because it tracks foot orientation, so it can detect the user standing on tiptoes, but there will still be drift.
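The standard guard against such spurious steps would be hysteresis on the foot-height comparison, so the planted foot only switches once the other foot is clearly lower. Something like this, with the threshold made up for illustration:

```cpp
// Returns the new "left foot is planted" state; the planted foot only
// switches when the other foot is lower by a clear margin (Schmitt trigger):
bool updatePlantedFoot(bool leftPlanted, double leftY, double rightY) {
    const double margin = 0.04; // meters; tuned against measured jitter
    if (leftPlanted && rightY < leftY - margin) return false;  // switch to right
    if (!leftPlanted && leftY < rightY - margin) return true;  // switch to left
    return leftPlanted; // otherwise keep the current foot planted
}
```

The tradeoff is the same as always: a margin big enough to suppress the jig will also swallow genuinely small steps.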
One last note: the PrioVR’s measurements were surprisingly noisy. The avatar’s feet were visibly tapdancing, and the hands were shaking like leaves. That was very strange, because inertial tracking is integrative, i.e., low-pass filtered, meaning it suffers from little jitter. It is possible that this was due to radio interference between the suit and the base station; dropping samples from an inertial tracker is a very bad thing. I would hope that tracking is much smoother in a more controlled environment.
Bottom line: PrioVR uses a forward kinematics approach to provide drift-free 6-DOF tracking data for limbs, and specifically hands, as advertised. However, it requires careful per-user calibration, which the system cannot do by itself: skeleton dimensions must be measured manually with a good old-fashioned measuring tape, or provided by an external body measurement system like a Kinect. It also suffers from global inaccuracy and drift in overall avatar position, and from significant measurement noise, at least in an uncontrolled environment like an expo floor. I like it overall; just like the Hydra, if you know its problems, you can work around them. If they integrated the trackers into a complete spandex suit and gave it neon trim, I’d get one in a heartbeat.
In my initial review of the Leap Motion I classified it as good hardware hamstrung by an ill-fitting use case (keyboard and mouse replacement) and bad software and interface approaches, and I think the lukewarm reception it received has borne that out. However, there’s now a new skeleton-based finger tracking SDK, and beta testers have said good things about it. I got to try it, and I can confirm that it works much better than ever before. The correspondence between one’s real hand and the extracted skeleton is still often tenuous, and there are the inherent occlusion issues, but it might now have crossed the threshold from novelty into practical usability.
I think the fundamental problem with optical hand tracking is that it is unreliable for triggering events. Due to occlusion, tracking breaks down exactly at the point when it is most crucial, namely when the user pinches two fingers to indicate her intent to, say, pick up a virtual object. Unlike with a physical button of some sort, the user cannot just assume that the system detected that event, but has to wait for visual or other feedback from the system before proceeding. In an application like the Nanotech Construction Kit, where the user has to interact quickly with small building blocks to effectively build complex molecules, that would be a major problem. Instead of the ideal “grab – drag – release” sequence, the user now has to grab, then wiggle a little and watch to check whether the grab actually registered, then drag, then release. Which is a pity, because 6-DOF tracking of the overall skeleton, after the initial pinch event, looks really good.
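To make the problem concrete, here is a sketch of the state machine a careful application ends up implementing: debounce the pinch over several frames, confirm it with feedback, and only then start the drag. The pinch test (thumb-to-index distance) and all thresholds are my assumptions, not Leap Motion’s API:

```cpp
#include <cstdio>

class PinchGrab {
    enum State { Idle, Pending, Grabbing } state = Idle;
    int pendingFrames = 0;

public:
    // Called once per tracking frame with the thumb-to-index-tip distance
    // (in meters) extracted from the hand skeleton:
    void update(double thumbIndexDistance) {
        bool pinched = thumbIndexDistance < 0.02;
        switch (state) {
        case Idle:
            if (pinched) { state = Pending; pendingFrames = 0; }
            break;
        case Pending: // debounce: require several consecutive pinched frames
            if (!pinched) { state = Idle; }
            else if (++pendingFrames >= 5) {
                state = Grabbing;
                std::puts("grab confirmed"); // feedback the user must wait for
            }
            break;
        case Grabbing:
            if (!pinched) { state = Idle; std::puts("released"); }
            break;
        }
    }
};
```

Every frame spent in the Pending state, and every glance at the feedback, is overhead a physical button simply doesn’t have — which is where the slowdown for rapid sequenced interactions comes from.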
Bottom line: Much better than before, and good for “analog” inputs such as hand position and orientation, but problematic for “binary” controls such as pinch events. OK for single events with immediate feedback, but probably major slowdown for rapid sequenced interactions.
The Dive is just one example of the cheap head-mounted displays that rely completely on a smartphone for display, (orientational) head tracking, and application logic and 3D rendering, but after having tried several different ones now, I see systemic problems. Part of it is physical integration. By necessity, the screen needs to be removable from the head-mount enclosure, and that means there are large tolerances. And tolerances are not really tolerable in displays that sit right in front of your eyes. Besides theoretical rendering calibration issues, at least with the Dive I had problems with the basic optics. The lenses required for HMD viewing are adjustable in two directions, and their mounts are somewhat flimsy as a result. I fiddled with them for a while, but still couldn’t get the lenses to line up to the point where I got a fused stereo image. There was significant, fusion-breaking vertical displacement between the two images, which required me to keep pushing one lens adjuster up with my thumb the entire time I used it. Granted, this was a demo unit probably used by dozens of people before me, but it didn’t instill confidence.
But more importantly, it seems that even current-generation smartphones are simply not up to the task of running VR. With all systems I tried, there was severe display lag (at least several hundred milliseconds), and I can’t explain that by lack of raw graphics power or overly high scene complexity. Is it possible that smartphone screens run at low frame rates, and additionally delay output by a few frames for compositing or post-processing to make media playback look better? Whatever the reason, I couldn’t stand it for long.
On top of that, at least in the Dive I tried, the smartphone inside it seemed to have a faulty inertial sensor. The scene rotated about the vertical axis at maybe five degrees per second, and after a short while the horizon started tilting, too. I can’t say if this was just a bad sensor in that one particular phone (didn’t check make and model), or if it’s par for the course for smartphones. After all, smartphone inertial sensors aren’t meant for this application at all.
Bottom line: It may be cheap (although I just found out the Dive is 57 euros), but given the experience I had, I’d rather keep the money and not play smartphone games in VR. Actually, I’d rather the Dive et al. wouldn’t advertise themselves as VR devices at all. Call it “3D viewer for smartphone games,” or “a virtual IMAX screen on your smartphone” or whatever, but let’s not equate this to a real VR device. To misquote some guy: “The only thing that can kill VR now is bad VR.”
Another smartphone-based HMD, with similar issues as the Durovis Dive. But this one was interesting because it has an option for AR. The Seebright works by placing the smartphone and core optics (two large lenses, Fresnel or regular) at the viewer’s forehead, and a mirror in front of the viewer’s eyes. This leads to a somewhat narrow field of view (45° according to the engineer I talked to), but it allows for a semi-transparent screen by replacing the mirror with a beam splitter. They had one to show (a 60/40 split between VR imagery and real world), and it was good. They didn’t have any AR demos yet, but I could imagine how those would work.
Bottom line: Is limited by using a smartphone as all-in-one sensor, processor, and display source, but has option for see-through AR.
This is embarrassing. There was a booth showing a Rift with a faceplate-mounted stereo camera for pass-through AR. I had wanted to try one of those for a long time, because I had doubts how well they would work. Now I finally got to try one, but I forgot to take a business card or write down the name of the company. Dear reader, if you know what I’m talking about (see Figure 3), please let me know below. Update: Thanks to Kent Bye, I was able to fix the article.
Now that I’ve tried it, I still have my doubts. The two big ones are latency and stereo calibration. First off, latency on the device I tried was very high; it felt like almost half a second. Initially I didn’t understand why, because video pass-through should be much faster, but then I realized they were doing AR position tracking based on the camera feeds to align virtual objects with the video feed, and that explains most if not all of the delay.
Stereo calibration is a more subtle, but in my opinion just as important issue. When seeing real 3D objects through the video pass-through, they need to be at the proper scale, and at the proper depth. To achieve this, the projection parameters of the cameras must match the display parameters of the HMD. The only way to really do that is to place the focal point of the camera at the user’s pupil positions, and have the (real or virtual) sensor chips match the post-distortion position of the HMD’s screens. That’s pretty much impossible, of course, because there is already something in those positions — the user’s eyes, and the Rift’s screens. For any other physical setup, there will be a difference in projection between the capture and display parts, and that difference manifests as objects being off scale, and the depth dimension being squished or exaggerated. Unfortunately, once regular video is captured from some viewpoint, the viewpoint cannot be adjusted in post-processing without major effort (essentially turning the stereo camera into a 3D camera via stereo reconstruction), and introducing major artifacts. Just piping video directly from the left/right cameras to the left/right screens and applying lens distortion correction is not enough.
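Under a deliberately simplified model (parallel pinhole cameras, per-eye centered virtual screens, no lens distortion), the mismatch can even be quantified: lateral scale is the ratio of IPD to camera baseline, and depth is additionally scaled by the ratio of the tangents of the half-fields-of-view. A toy calculation, with all numbers made up for illustration:

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double pi = 3.14159265358979;

    double ipd = 0.062;      // viewer's eye separation, meters
    double baseline = 0.075; // camera separation, meters
    double camFov = 60.0;    // camera horizontal field of view, degrees
    double dispFov = 90.0;   // HMD horizontal field of view, degrees

    // An object at lateral offset X is perceived at (ipd/baseline) * X:
    double lateralScale = ipd / baseline;

    // Depth is scaled by the lateral scale *times* the FOV mismatch, so
    // this factor distorts the depth-to-width aspect ratio of objects:
    double depthVsLateral = std::tan(camFov * pi / 360.0)
                          / std::tan(dispFov * pi / 360.0);

    std::printf("lateral scale : %.2f (world looks %s)\n", lateralScale,
                lateralScale < 1.0 ? "shrunk" : "enlarged");
    std::printf("depth/lateral : %.2f (%s)\n", depthVsLateral,
                depthVsLateral < 1.0 ? "depth squished: cardboard cutouts"
                                     : "depth exaggerated");
    return 0;
}
```

With a camera baseline wider than the IPD the world shrinks, and with a camera FOV narrower than the display FOV the depth dimension is compressed relative to object size — both effects at once on the device I tried, apparently.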
And I did see that projection mismatch very clearly. For me, all real objects appeared shrunk significantly, and the depth dimension appeared squished. There was depth to the view, but objects (like my hands) looked almost like cardboard cutouts or billboards standing in space.
Now the big question is how distracting/annoying this would be to a regular user. I am heavily biased because I have been working with 3D video, and while that has a host of other problems, at least scale and depth perception are top notch. If the goal is to allow a user to interact with the real world without removing the headset, then the distortion won’t be a problem. It’s like looking through somebody else’s prescription glasses; things don’t quite look right, but it’s possible to get stuff done. But if the goal is to pretend that the headset is completely invisible, to replicate the look and feel of see-through AR, then it’s not there.
The only way, I think, to make pass-through AR work without distortion is to use either a real 3D camera (something like the Kinect), so that the captured 3D scene can be re-projected using the HMD’s proper display parameters, or a lightfield camera with a large enough aperture to cover a range of pupil positions, for basically the same effect.
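In code terms, the difference is that a 3D camera lets you unproject each pixel using the camera’s intrinsics and re-project the resulting 3D point using the HMD’s per-eye parameters, so the two projections no longer need to match. A minimal sketch with placeholder intrinsics:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

struct Intrinsics { double fx, fy, cx, cy; }; // pinhole model, in pixels

// Depth pixel (u, v) with depth in meters -> 3D point in camera space:
Vec3 unproject(const Intrinsics& cam, double u, double v, double depth) {
    return {(u - cam.cx) / cam.fx * depth,
            (v - cam.cy) / cam.fy * depth,
            depth};
}

// 3D point in eye space (after applying the rigid camera-to-eye transform,
// not shown here) -> pixel in that eye's display image:
std::array<double, 2> project(const Intrinsics& eye, const Vec3& p) {
    return {eye.fx * p[0] / p[2] + eye.cx,
            eye.fy * p[1] / p[2] + eye.cy};
}
```

The catch, of course, is that re-projection from a new viewpoint exposes holes where the camera couldn’t see — the usual 3D video problem — but scale and depth come out right.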
Bottom line: Pass-through AR in general, and the device I saw in particular, can work for many applications, but if you expect to get the feeling of the HMD becoming invisible, then you might be disappointed.
Disclaimer: I was not involved in development of the Infinadeck, but I am hoping to work with the inventor on developing a closed-loop control system, and integration with VR displays via the Vrui VR toolkit.
The Infinadeck, as shown at SVVR ’14, is a prototype of an omni-directional treadmill that provides a natural walking experience. The Infinadeck’s surface can move independently in X and Y, and counter the user’s movement to keep them centered on the treadmill regardless of which direction they are walking. The current prototype does not have an automatic controller yet (treadmill speed was hand-controlled by the inventor, George Burger), and is not yet integrated into a VR display system by combining it with a head-mounted or screen-based VR display.
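For the curious, here is the rough shape of the closed-loop controller mentioned in the disclaimer: a simple PD loop per axis that drives the belts so the tracked user drifts back toward the deck’s center. The gains and the interface are entirely hypothetical, and a real controller would also have to limit acceleration so it doesn’t yank the user off their feet:

```cpp
#include <array>

using Vec2 = std::array<double, 2>; // (x, y) on the treadmill surface

class CenteringController {
    Vec2 prevOffset{{0.0, 0.0}};
    double kp, kd; // proportional and derivative gains

public:
    CenteringController(double kp, double kd) : kp(kp), kd(kd) {}

    // userOffset: tracked head/waist position relative to the deck center.
    // Returns the belt velocity command; the belt moves the user opposite
    // to their offset, pulling them back toward the center.
    Vec2 update(const Vec2& userOffset, double dt) {
        Vec2 cmd;
        for (int i = 0; i < 2; ++i) {
            double derivative = (userOffset[i] - prevOffset[i]) / dt;
            cmd[i] = kp * userOffset[i] + kd * derivative;
            prevOffset[i] = userOffset[i];
        }
        return cmd;
    }
};
```

The hard part isn’t the loop itself, but tuning it so the deck responds fast enough to keep up with a walking user without feeling like it’s pulling the floor out from under them.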
I tried the Infinadeck at SVVR ’14 for the first time, and while it doesn’t yet react to the user’s movements automatically, walking on it felt really good. The surface moves smoothly (with a slight wobble along the belt direction that will be fixed), and has very little bounce — it feels like walking on a solid floor, not on a trampoline. I think the Infinadeck has great potential, but see the disclaimer above.
Bottom line: It’s an early prototype, but I believe in its potential.