“It can’t be comfortable or healthy to stare at a screen a few inches in front of your eyes.”
The popularity of Google Cardboard, and the upcoming commercial releases of the Oculus Rift, HTC Vive, and other modern head-mounted displays (HMDs) have raised interest in virtual reality and VR devices in parts of the population who have never been exposed to, or had reason to care about, VR before. Together with the fact that VR, as a medium, is fundamentally different from other media with which it often gets lumped in, such as 3D cinema or 3D TV, this leads to a number of common misunderstandings and frequently-asked questions. Therefore, I am planning to write a series of articles addressing these questions one at a time.
First up: How is it possible to see anything on a screen that is only a few inches in front of one’s face?
Short answer: In HMDs, there are lenses between the screens (or screen halves) and the viewer’s eyes to solve exactly this problem. These lenses project the screens out to a distance where they can be viewed comfortably (for example, in the Oculus Rift CV1, the screens are rumored to be projected to a distance of two meters). This also means that, if you need glasses or contact lenses to clearly see objects several meters away, you will need to wear your glasses or lenses in VR.
Last Friday I made a trek down to the San Francisco peninsula, to visit and chat with a couple of other VR folks: Cyberith, SVVR, and AltspaceVR. In the process, I also had the chance to try a couple of VR devices I hadn’t seen before.
Virtual locomotion, and its nasty side effect, simulator sickness, are a pretty persistent problem and timely topic with the arrival of consumer VR just around the corner. Many enthusiasts want to use VR to explore large virtual worlds, as in taking a stroll through the frozen tundra of Skyrim or the irradiated wasteland of Fallout, but as it turns out, that’s one of the hardest things to do right in VR.
Figure 1: Cyberith Virtualizer, driven by an experienced user (Tuncay Cakmak). Yes, you can jump and run, with some practice.
Yesterday, I attended the second annual Silicon Valley Virtual Reality Conference & Expo in San Jose’s convention center. This year’s event was more than three times bigger than last year’s, with around 1,400 attendees and a large number of exhibitors.
Unfortunately, I did not have as much time as I would have liked to visit and try all the exhibits. There was a printing problem at the registration desk in the morning, and as a result the keynote and first panel were pushed back by 45 minutes, overlapping the expo time; additionally, I had to spend some time preparing for and participating in my own panel on “VR Input” from 3pm-4pm.
The panel was great: we had Richard Marks from Sony (Playstation Move, Project Morpheus), Danny Woodall from Sixense (STEM), Yasser Malaika from Valve (HTC Vive, Lighthouse), Tristan Dai from Noitom (Perception Neuron), and Jason Jerald as moderator. There was lively discussion of questions posed by Jason and the audience. Here’s a recording of the entire panel:
One correction: when I said I had been following Tactical Haptics‘ progress for 2.5 years, I meant to say 1.5 years, since the first SVVR meet-up I attended. Brainfart. Continue reading →
We had a couple of visitors from Intel this morning, who wanted to see how we use the CAVE to visualize and analyze Big Datatm. But I also wanted to show them some aspects of our 3D video / remote collaboration / tele-presence work, and since I had just recently implemented a new multi-camera calibration procedure for depth cameras (more on that in a future post), and the alignment between the three Kinects in the IDAV VR lab’s capture space is now better than it has ever been (including my previous 3D Video Capture With Three Kinects video), I figured I’d try something I hadn”t done before, namely remotely interacting with myself (see Figure 1).
Figure 1: How to properly pat yourself on the back using time-delayed 3D video.
Now that I’ve gotten my Oculus Rift DK2 (mostly) working with Vrui under Linux, I’ve encountered the dreaded artifact often referred to as “black smear.” While pixels on OLED screens have very fast switching times — orders of magnitude faster than LCD pixels — they still can’t switch from on to off and back instantaneously. This leads to a problem that’s hardly visible when viewing a normal screen, but very visible in a head-mounted display due to a phenomenon called “vestibulo-ocular reflex.”
Basically, our eyes have built-in image stabilizers: if we move our head, this motion is detected by the vestibular apparatus in the inner ear (our “sense of equilibrium”), and our eyes automatically move the opposite way to keep our gaze fixed on a fixed point in space (interestingly, this even happens with the eyes closed, or in total darkness).
I just moved all my Kinects back to my lab after my foray into experimental mixed-reality theater a week ago, and just rebuilt my 3D video capture space / tele-presence site consisting of an Oculus Rift head-mounted display and three Kinects. Now that I have a new extrinsic calibration procedure to align multiple Kinects to each other (more on that soon), and managed to finally get a really nice alignment, I figured it was time to record a short video showing what multi-camera 3D video looks like using current-generation technology (no, I don’t have any Kinects Mark II yet). See Figure 1 for a still from the video, and the whole thing after the jump.
Figure 1: A still frame from the video, showing the user’s real-time “holographic” avatar from the outside, providing a literal kind of out-of-body experience to the user.
Using Vrui, implementing this was a piece of cake. Instead of modifying the existing Quikwrite tool, I created a new transformation tool that converts a two-axis analog joystick, e.g., a thumbstick on a game controller, to a virtual 6-DOF input device moving inside a flat square. Then, when binding the unmodified Quikwrite tool to that virtual input device, exactly the expected happens: the directions of the thumbstick translate 1:1 to the character selection regions of the Quikwrite square. I’m expecting that this new transformation tool will come in handy for other applications in the future, so that’s another benefit.
Text entry in virtual environments is one of those old problems that never seem to get solved. The core issue, of course, is that users in VR either don’t have keyboards (because they are in a CAVE, say), or can’t effectively use the keyboard they do have (because they are wearing an HMD that obstructs their vision). To the latter point: I consider myself a decent touch typist (my main keyboard doesn’t even have key labels), but the moment I put on an HMD, that goes out the window. There’s an interesting research question right there — do typists need to see their keyboards in their peripheral vision to use them, even when they never look at them directly? — but that’s a topic for another post.
Until speech recognition becomes powerful and reliable enough to use as an exclusive method (and even then, imagining having to dictate “for(int i=0;i<numEntries&&entries[i].key!=searchKey;++i)” already gives me a headache), and until brain/computer interfaces are developed and we plug our computers directly into our heads, we’re stuck with other approaches.
Unsurprisingly, the go-to method for developers who don’t want to write a research paper on text entry, but just need text entry in their VR applications right now, and don’t have good middleware to back them up, is a virtual 3D QWERTY keyboard controlled by a 2D or 3D input device (see Figure 1). It’s familiar, straightforward to implement, and it can even be used to enter text.
Figure 1: Guilty as charged — a virtual keyboard in the Vrui toolkit, implemented as a GLMotif pop-up window with rows and columns of buttons.
A little more than a month ago, I wrote a pair of articles investigating and explaining the internals of the Rift’s display, and how small deviations in calibration have a large effect on the perceived size of the virtual world, and the degree of “solidity” (for lack of a better word) of the virtual objects therein. In those posts, I pointed out that a single lens distortion correction formula doesn’t suffice, because lens distortion parameters depend on the position of the viewers’ eyes relative to the lenses, particularly the eye/lens distance, otherwise known as “eye relief.” And guess what: I just got an email via the Oculus developer mailing list announcing the (preview) release of SDK version 0.3.1, which lists eye relief-dependent lens correction as one of its major features.
Maybe I should keep writing articles on the virtues of 3D pupil tracking, and the obvious benefits of adding an inertially/optically tracked 6-DOF input device to the consumer-level Rift’s basic package, and those things will happen as well. 🙂
Disclaimer: Presence research is not my area of expertise. I’m basically speaking as an interested layperson, and just writing down some vaguely related observations that have re-occurred to me recently.
So, presence. What is presence, and why should we care? Libraries full of papers have been written about it, and there’s even a long-running journal of that title. I guess one could say that presence is the sensation of bodily being in a place or environment where one knows one is not. And why is it important in the discussion of virtual reality? Because it is often trotted out as the distinguishing feature between the medium of VR (yes, VR is the medium, not the content) and other media, such as film or interactive 3D graphics; in other words, it is often a feature that’s used to sell the idea of VR (not that there’s anything wrong with that).
But how does one actually measure presence, and know that one has achieved it? Some researchers do it by putting users into fMRI machines, but that’s not really something you can do at home. So here are a few things I’ve observed over sixteen years of working in VR, and showing 3D display environments and 3D software to probably more than 1,000 people, both experts and members of the general public: