Mixed-reality recording, i.e., capturing a user interacting with a virtual 3D environment by embedding their real body into that environment, has finally become the accepted method of demonstrating virtual reality applications through standard 2D video footage (see Figure 1 for a mixed-reality recording made in VR’s stone age). The fundamental method behind this recording technique is to create a virtual camera whose intrinsic parameters (focal length, lens distortion, …) and extrinsic parameters (position and orientation in space) exactly match those of the real camera filming the user; to capture a virtual video stream from that virtual camera; and then to composite the virtual and real streams into the final video.
Figure 1: Ancient mixed-reality recording from inside a CAVE, captured directly on a standard video camera without any post-processing.
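For the technically inclined, the intrinsic half of that matching is straightforward pinhole-camera math. Here’s a minimal sketch, assuming a simple pinhole model without lens distortion (all structures and names are mine, for illustration only):

    // Sketch: matching a virtual camera's intrinsics to a calibrated real
    // camera, assuming a simple pinhole model (no lens distortion).
    #include <cstdio>

    struct Intrinsics {
        double fx, fy;     // focal lengths in pixels
        double cx, cy;     // principal point in pixels
        int width, height; // image size in pixels
    };

    // Compute glFrustum()-style parameters that reproduce the real camera's
    // field of view and principal-point offset at the near plane distance.
    void frustumFromIntrinsics(const Intrinsics& in, double nearD,
                               double& left, double& right,
                               double& bottom, double& top) {
        left   = -in.cx * nearD / in.fx;
        right  = (in.width - in.cx) * nearD / in.fx;
        bottom = -(in.height - in.cy) * nearD / in.fy; // image rows grow downwards
        top    = in.cy * nearD / in.fy;
    }

    int main() {
        Intrinsics cam = {1000.0, 1000.0, 640.0, 360.0, 1280, 720};
        double l, r, b, t;
        frustumFromIntrinsics(cam, 0.1, l, r, b, t);
        std::printf("glFrustum(%g, %g, %g, %g, 0.1, <far>)\n", l, r, b, t);
        // The extrinsic half is simply the inverse of the real camera's pose
        // in tracking coordinates, applied as the modelview matrix.
        return 0;
    }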
I promised I would keep off-topic posts to a minimum, but I have to make an exception for this. I just found out that Roger Ebert died today, at age 70, after a long battle with cancer. This is very sad, and a great loss. There are three reasons why I always stayed aware of Mr. Ebert’s output: I love movies, video games, and 3D, and he had strong opinions on all three.
When hearing about a movie, my first step is always the Internet Movie Database, and the second step is a click-through to Mr. Ebert’s review. While I didn’t always agree with his opinions, his reviews were always very useful in forming an opinion; and anyway, after having listened to his full-length commentary track on Dark City — something that everybody with even a remote interest in movies or science fiction should check out — he could do no wrong in my book.
I do not want to weigh in on the “video games as art” discussion, because that’s neither here nor there.
However, I do want to address Mr. Ebert’s opinions on stereoscopic movies (I’m not going to say 3D movies!), because that’s close to my heart (and this blog… hey, we’re on topic again!). In a nutshell, he did not like them. At all. And the thing is, I don’t really think they work either. Where I strongly disagreed with him is the reason why they don’t work. For Mr. Ebert, 3D itself was a fundamentally flawed idea in principle. For me, the current implementation of stereoscopy as seen in most movies is deeply flawed (am I going to see “Jurassic Park 3D?” Hell no!). What I’m saying is, 3D can be great; it’s just not done right in most stereoscopic movies, and maybe applying it properly will require a change in the entire idea of what a movie is. I always felt that the end goal of 3D movies should not be to watch the proceedings on a stereoscopic screen from far away, but to be in the middle of the action, as in viewing a theater performance by being on stage amidst the actors.
I had always hoped that Mr. Ebert would at some point see how 3D is supposed to be, and then nudge movie makers towards that ideal. Alas, it was not to be.
“Virtual Worlds Using Head-mounted Displays” is the most complex video I’ve made so far, and I figured I should explain how it was done (maybe as a response to people who might say I “cheated”).
I’m on vacation in Mexico right now, and yesterday evening my brother-in-law took my wife and me to see “The Hobbit,” in 3D, in quite the fancy movie theater, with reclining seats and footrests and to-the-seat service and such.
I don’t want to talk about the movie per se, short of mentioning that I liked it, a lot, but about the 3D. Or the “stereo,” I should say, as I mentioned previously. My overall impression was that it was done very well. Obviously, the movie was shot in stereo (otherwise I’d have refused to see it that way), and obviously a lot of planning went into that aspect of it. There was also no apparent eye fatigue, or any other typical side effect of bad stereo, and considering how damn long the movie was, and that I was consciously looking for convergence problems or artifacts, that means someone was doing something right.

As a technical note to cinemas: there was a dirty spot on the screen, a bit off to the side (it looked as if someone had thrown a soda at the screen a while ago), which either degraded the screen polarization or was otherwise slightly visible in the image, and was a bit distracting. So keep your stereo screens immaculately clean! Another very slightly annoying thing was the subtitles (the entire movie was shown in English with Spanish subtitles, plus the added subtitles when characters spoke Elvish or the Dark Tongue): even though I didn’t read them, I still automatically looked at them whenever they popped up, and that was distracting because they were sticking out from the screen quite a bit.
I’ve already mentioned KeckCAVES‘ involvement in NASA‘s newest Mars mission, the Mars Science Laboratory, in a previous post, but now I have an update. Dawn Sumner, UC Davis‘ member of the Curiosity science team, was interviewed last week for “Onward California,” which I guess is some new system-wide outreach and public relations effort to get the public’s mind off last fall’s “unpleasantries.” Just kidding UC, you know I love you.
Anyway… Dawn decided that the best way to talk about her work on Mars would be to do the interview in the CAVE, showing how our software, particularly Crusta Mars, was used during the planning stages of the mission, specifically for landing site selection. I then suggested that it would be really nice to do part of the interview about the rover itself, using a life-size and high-resolution 3D model of the rover. So Dawn went to her contacts at the Jet Propulsion Laboratory, and managed to get us a very detailed 3D model, with several million polygons and high-resolution textures, to load into the CAVE.
What someone posing with a life-size 3D model of the Mars Curiosity rover might look like.
As it so happens, I have a 3D mesh viewer that was able to load and render the model (which came in Alias|Wavefront OBJ format), albeit with some missing features, specifically specularity and bump mapping. The renderer is fast enough to draw the full, undecimated mesh at a frame rate sufficient for immersive display, around 30 frames per second.
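As an aside, the geometry part of the OBJ format is almost trivially simple to parse, which is one reason it’s such a common interchange format. Here’s a bare-bones sketch of a loader that reads only vertex positions and fan-triangulates faces, ignoring normals, texture coordinates, and material definitions; it’s an illustration, not the loader my viewer actually uses:

    // Minimal OBJ geometry reader: vertex positions and faces only.
    // Ignores normals, texture coordinates, and .mtl materials entirely.
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <vector>

    struct Vec3 { float x, y, z; };

    bool loadObj(const char* fileName,
                 std::vector<Vec3>& vertices,
                 std::vector<unsigned int>& triangles) {
        std::FILE* f = std::fopen(fileName, "r");
        if (!f) return false;
        char line[1024];
        while (std::fgets(line, sizeof(line), f)) {
            if (line[0] == 'v' && line[1] == ' ') {      // vertex position
                Vec3 v;
                if (std::sscanf(line + 2, "%f %f %f", &v.x, &v.y, &v.z) == 3)
                    vertices.push_back(v);
            } else if (line[0] == 'f') {                 // face, 1-based indices
                std::vector<unsigned int> idx;
                for (char* tok = std::strtok(line + 1, " \t\r\n"); tok;
                     tok = std::strtok(nullptr, " \t\r\n"))
                    idx.push_back((unsigned int)std::atoi(tok) - 1); // "7/2/5" parses as 7
                for (size_t i = 2; i < idx.size(); ++i) { // fan-triangulate polygons
                    triangles.push_back(idx[0]);
                    triangles.push_back(idx[i - 1]);
                    triangles.push_back(idx[i]);
                }
            }
        }
        std::fclose(f);
        return true;
    }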
The next problem, then, was how to film the beautiful rover model in the CAVE without making it look like garbage, another topic about which I’ve posted before. The film team, from the Department of the 4th Dimension, was fortunately on board, and filmed the interview in several segments, using hand-held and static camera setups.
We have pretty much figured out how to film hand-held video using a secondary head tracker attached to the camera, but static setups where the camera is outside the CAVE, and hence outside the tracking system’s range, always take a lot of trial and error to set up. For good video quality, one has to precisely measure the 3D position of the camera lens relative to the CAVE and then configure that in the CAVE software.
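For the curious, the math that makes the measured lens position usable is the well-known generalized (off-axis) perspective construction, described for example in Robert Kooima’s write-up. A minimal self-contained sketch, assuming each screen is specified by three of its corners:

    // Off-axis frustum for one CAVE screen, seen from an arbitrary point
    // (here, the measured position of a camera lens in CAVE coordinates).
    #include <cmath>

    struct V3 { double x, y, z; };
    static V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
    static double dot(V3 a, V3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static V3 cross(V3 a, V3 b) {
        return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
    }
    static V3 normalize(V3 a) {
        double l = std::sqrt(dot(a, a));
        return {a.x/l, a.y/l, a.z/l};
    }

    // pa, pb, pc: screen's lower-left, lower-right, upper-left corners.
    // pe: camera lens position. Results go straight into glFrustum().
    void screenFrustum(V3 pa, V3 pb, V3 pc, V3 pe, double nearD,
                       double& l, double& r, double& b, double& t) {
        V3 vr = normalize(sub(pb, pa));   // screen right axis
        V3 vu = normalize(sub(pc, pa));   // screen up axis
        V3 vn = normalize(cross(vr, vu)); // screen normal, towards the viewer
        V3 va = sub(pa, pe), vb = sub(pb, pe), vc = sub(pc, pe);
        double d = -dot(va, vn);          // distance from lens to screen plane
        l = dot(vr, va) * nearD / d;
        r = dot(vr, vb) * nearD / d;
        b = dot(vu, va) * nearD / d;
        t = dot(vu, vc) * nearD / d;
        // Follow with a rotation taking (vr, vu, vn) to the coordinate axes
        // and a translation by -pe, applied to the modelview matrix.
    }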
I used to do that by guesstimating the camera position, entering the values into the configuration file, and then using a Vrui calibration utility to visually judge the setup’s correctness. This involves looking at the image, figuring out why it’s wrong, mentally changing the camera position to correct for the wrongness, editing the configuration file, and repeating the whole process until it looks OK. Quite annoying, that, especially if there’s an entire film crew sitting in the room, checking their watches and rolling their eyes.
After that filming session, I figured that Vrui could use a more interactive way of setting up CAVE filming, a user interface to set up and configure several different filming modes without having to leave a running application. So I added a “filming support” vislet, and to properly test it, filmed myself posing and playing with the Curiosity rover (MSL Design Courtesy NASA/JPL-Caltech):
Pay particular attention to the edges and corners of the CAVE, and how the image of the 3D model and the image backdrop seamlessly span the three visible CAVE screens (left, back, floor). That’s what a properly set up CAVE video is supposed to look like. Also note that I set up the right CAVE wall to be rendered for my own point of view, in stereo, so that I could properly interact with the 3D model and knew what I was pointing at. Without such a split-CAVE setup, it’s very hard to use the CAVE when in filming mode.
The filming support vislet supports head-tracked recording, static recording, split-CAVE recording (where some screens are rendered for the user, and some for the camera), setting up custom light sources, and a draggable calibration grid and input device markers to simplify calibrating a static camera setup when the camera is outside the tracking system’s range and cannot be measured directly.
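To illustrate the split-CAVE part (and only as a toy sketch; this is not Vrui’s actual vislet code): per frame, rendering reduces to picking, for every screen, whose viewpoint feeds the off-axis projection shown above.

    // Toy sketch of the split-CAVE decision: each screen is drawn either in
    // stereo for the tracked user, or in mono from the film camera's
    // measured lens position.
    #include <set>
    #include <string>

    struct V3d { double x, y, z; };

    struct FilmingState {
        std::set<std::string> cameraScreens; // screens rendered for the camera
        V3d cameraPos;                       // camera lens position in CAVE space
    };

    // Fills 'out' with the eye position(s) from which 'screen' is rendered
    // this frame and returns their count; left/right come from head tracking.
    int eyesForScreen(const FilmingState& fs, const std::string& screen,
                      V3d leftEye, V3d rightEye, V3d out[2]) {
        if (fs.cameraScreens.count(screen)) {
            out[0] = fs.cameraPos;           // mono view for the camera
            return 1;
        }
        out[0] = leftEye;                    // stereo pair for the user
        out[1] = rightEye;
        return 2;
    }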
All in all, it works quite well, and is a significant improvement over the previous setup method. It is now possible to change filming modes and camera setups from within a running application, without having to exit, edit configuration files, and restart.
At the risk of looking like a lame copycat, I’ll still do it, because the technical angle I have in mind is different from the A.V. Club’s approach; if you disagree, tell me off in the comments.
Let’s get going, with a completely subjective selection, and in no particular order.
Star Wars, 1977
What? There’s no VR in it! True, but there are “holograms” in it. And because it’s an extremely common misconception, and I get it thrown at me all the time, I need to say it: real holograms don’t work that way! You know the scene I’m referring to:
The thing is that real holograms need to be “supported” by a piece of holographic screen behind them — you can only see the part of the hologram that’s between your eyes and the screen. Holograms are free-standing — just not as free-standing as most people subconsciously assume; holographic projectors such as R2-D2’s here are fiction. It’s important because the argument goes: once we get real-time holograms, we won’t need to build CAVEs anymore. Technically true, yes, but you’d need to build a space enclosed by holographic screens to get the same effect as a CAVE, so basically the same thing. Sorry.
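Why the “between your eyes and the screen” rule holds is elementary ray geometry: a hologram point is visible exactly when the sight line from your eye through that point continues on to hit the holographic screen behind it. A quick sketch of that test for a rectangular screen (setup and names are mine):

    // Visibility test for one hologram point: the point is visible from the
    // eye only if the ray from the eye through it hits the screen behind it.
    #include <cmath>

    struct P3 { double x, y, z; };
    static P3 sub(P3 a, P3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
    static double dot(P3 a, P3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    // Screen given by a corner 'origin' and full edge vectors 'right', 'up';
    // 'normal' must be perpendicular to both.
    bool hologramPointVisible(P3 eye, P3 point,
                              P3 origin, P3 right, P3 up, P3 normal) {
        P3 dir = sub(point, eye);
        double denom = dot(dir, normal);
        if (std::fabs(denom) < 1.0e-12)
            return false;                 // sight line parallel to the screen
        double lambda = dot(sub(origin, eye), normal) / denom;
        if (lambda < 1.0)
            return false;                 // screen is not behind the point
        P3 hit = {eye.x + dir.x*lambda, eye.y + dir.y*lambda, eye.z + dir.z*lambda};
        P3 rel = sub(hit, origin);
        double u = dot(rel, right) / dot(right, right);
        double v = dot(rel, up) / dot(up, up);
        return u >= 0.0 && u <= 1.0 && v >= 0.0 && v <= 1.0; // inside the rectangle?
    }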
Verdict: Fiction!
Disclosure, 1994
But this one’s in the A.V. Club article! True, and I feel bad about copying them even more blatantly. But I have to amend what they’re saying. I have no beef with their evaluation on the ridiculousness scale, but from a technical point of view, VR as depicted in Disclosure, at least in the following scene, exists and is used today:
Let’s see: tracked head-mounted display, tracked data glove, omni-directional treadmill, 3D scanner that captures a real-time 3D image of the user and projects it into the virtual space — I have all that in my lab, minus the treadmill (unfortunately). I’m even working with architecture firms. Walking across a virtual cathedral to access files, and a bottomless chasm in the middle of your database server for no reason? Yeah, that’s silly.
Verdict: Nailed it!
Minority Report, 2002
This one’s interesting. There are two VR bits in it: the famous “maestro-style” free-hand GUI, and the 3D home movie. Let’s tackle the simple one first, the 3D home movie:
The 3D video itself looks exactly like the kind of video you can capture with a 3D camera like the Kinect, down to the fringe triangle artifacts (someone on YouTube even made a mash-up between this and my first Kinect video; it’s uncanny). The projection system is another story: at first glance, it’s another completely free-standing hologram (fiction!), but a bit of fanwank can explain that it was actually a projection onto a 3D multi-viewpoint fog projection display (exists! just not quite as good yet).
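About those fringe triangles: they’re what you get when you triangulate a depth image naively, because triangles that span a depth discontinuity get stretched into long slivers connecting foreground to background. The usual mitigation, sketched below, is to drop any triangle whose corner depths differ by more than a threshold; the movie’s footage looks as if no such filter was applied, which is exactly why it resembles raw Kinect video:

    // Naive depth-image triangulation with fringe-triangle rejection:
    // drop any triangle whose corner depths differ by more than a threshold,
    // because such triangles span a depth discontinuity and show up as the
    // long stretched slivers visible in raw Kinect-style 3D video.
    #include <algorithm>
    #include <vector>

    void meshFromDepth(const std::vector<float>& depth, int w, int h,
                       float maxDepthGap, std::vector<unsigned int>& tris) {
        for (int y = 0; y + 1 < h; ++y)
            for (int x = 0; x + 1 < w; ++x) {
                unsigned int i00 = y*w + x, i10 = i00 + 1;
                unsigned int i01 = i00 + w, i11 = i01 + 1;
                const unsigned int quad[2][3] = {{i00, i01, i10}, {i10, i01, i11}};
                for (int q = 0; q < 2; ++q) {
                    float d0 = depth[quad[q][0]];
                    float d1 = depth[quad[q][1]];
                    float d2 = depth[quad[q][2]];
                    float dMin = std::min(d0, std::min(d1, d2));
                    float dMax = std::max(d0, std::max(d1, d2));
                    if (dMax - dMin <= maxDepthGap)  // keep only "solid" triangles
                        tris.insert(tris.end(), quad[q], quad[q] + 3);
                }
            }
    }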
Partial verdict: Nailed it!
The part with which I have a gripe is the 3D GUI:
From a technical point of view, we could have built that in 2002: tracked data gloves (I had them in my lab in 1998, albeit with wires), projection onto a translucent screen (nothing to it), a gesture interface, and we could have rigged up a physical data transfer module (it’s basically a transparent USB stick, right?), etc.
So here’s my gripe: the whole thing makes no sense. Some people take issue with the manual data transfer (why not send the data over the network?), but you could fanwank that as a security measure. No, the problem is why he uses a 3D user interface in the first place. Look at exactly what he’s doing. All the data he’s interacting with are 2D: text, images, movies. All the interactions are 2D: he moves things and pinch-zooms, he rotates in the screen plane. Oh, and kind folks who did the annotation? It’s not a “holoscreen” when it only shows 2D images; it’s simply a “screen.”
There is no free 3D manipulation, so why is he using a free 3D user interface? It’s bad, ergonomically. Holding your hands out like that for precise work over an extended time (more than a few minutes) is painful. The syndrome is called “Gorilla Arm.” The ideal hardware and UI for this type of work is a multi-touch surface device, probably set not vertically, but at an angle like a drafting table. Then your hands and fingers have something to rest on and push against for the interactions, which makes them much easier and less painful.
Why am I harping on this? People are rushing to recreate this interface, now that the hardware is cheaply available, because it looks extremely cool in the movie. It fooled me the first two times I watched it. So people are working hard on an interface that’s literally painful to use, anybody actually trying it will hate it, and the backlash will hurt us all. Please, don’t do it.
Partial verdict: Nailed it technically, but failed ergonomics
Iron Man, 2008
This one I love:
It starts out like the Minority Report GUI, but then it gets good the moment the suit’s 3D model appears over the virtual workbench. I’m wondering if that’s intentional one-upmanship: start out just like the other, and then blow it away.
Anyway, let’s look at the technology: free-standing 3D display above a virtual workbench, hand tracking and gesture interpretation without data gloves. Pushing it, but we have the Kinect, we may soon have the Leap, and we can always imagine that he could be wearing VR goggles in Tony Stark’s inimitable style. Alternatively, if what we’re seeing in the movie is a representation of what Tony sees, rather than what another person standing in the camera’s place would see, then he might be seeing only the part of the 3D model that lies between him and the workbench screen, and that screen could be auto-stereoscopic; in that case, it’s entirely today’s technology.
So with a bit of squinting and allowing for the Hollywood glitz filter, yes, we can build that. As for the interaction: tell me it doesn’t look exactly like this, again accounting for the glitz filter, and for me using only one hand (we have a second input device now):
Now you might ask: why am I lauding free-space 3D interactions here, when I decried them in Minority Report? Simple: here they are used for actual 3D manipulation, where you accept a bit of discomfort because there’s no better alternative. You’ll also notice that he’s holding his arms in a more comfortable position, not at shoulder height (or only for as long as it takes to grab an object). That makes a huge difference, and it’s what our users do when they spend long hours in the CAVE.
Verdict: a bit shinier than what we can do today, but overall Nailed it!
Iron Man 2, 2010
Several scenes in this one. The first is the coffee table scene:
Pretty standard multi-touch surface display and interactions. Not really VR, as it’s all 2D, but worth a mention anyway. Verdict: Nailed it!
The workshop walk-through scene:
Similar to the scene from the first Iron Man, this one features completely free-standing 3D imagery, implied to be free-standing holograms, and therefore fiction. But in the context of the movie, it’s entirely possible that his entire workshop is panelled in auto-stereoscopic displays, and that the movie is only showing us what Tony sees. That could be done today, but it’s nowhere close to practical, and a huge stretch; combined with the common misconception about holograms, I’ll have to give it a demerit. Add to that the fact that the user interface here is a lot more “do what I mean” than in the first movie’s scene. There, the gestures he performs correspond directly enough to actions on the 3D model that a good 3D UI can explain them; here, it’s over the line. This UI, as depicted, can only work if a strong AI is running it. Since we already know that Tony employs a strong AI as an assistant, that makes sense in the context of the movie, but sadly it’s fiction.
Overall verdict: Fiction!
All right, that’s my list for now. I’m not going to touch the Matrix, Thirteenth Floor, eXistenz, et al., because those are obviously pure fiction. But if I forgot anything that deserves mention, because it depicts an internally consistent combination of display hardware and user interface that may or may not exist or be theoretically feasible, please let me know below. I have Netflix.
D’oh, I forgot one, especially embarrassing because I mentioned Babylon 5. How could I!
Babylon 5, And The Sky Full Of Stars, 1994
Can’t find a clip, but here’s the episode recap on the Lurker’s Guide. Synopsis: the station’s commander gets kidnapped and interrogated by being strapped into a virtual reality system, so that the interrogators can mindscrew him and break him more easily. The VR system itself is not thought through enough to be analyzed, except the display bit itself: it’s a retinal projector, shining the image of the virtual 3D world directly into the user’s eyes (only into one eye in the episode, sad oversight). Exists!
The input part of the system, on the other hand, must use some kind of neural interface, because the user (or captive in this case) can move inside the virtual world normally while being strapped into a chair in the real world, so Fiction!
How the interrogator, or the commander’s virtual body, gets mapped into the virtual world is not even addressed, so: Didn’t think about it!
Let’s just say I like this episode in spite of the VR stuff, not because of it. It’s just a TV show, after all.