The most interesting aspect of this talk, for me, was that the art project, and all the software development for it, is done by the “other” part of the KeckCAVES project, the more mathematically and complex-systems-oriented cluster around Jim Crutchfield of UC Davis’ Complexity Sciences Center and his post-docs and graduate students. In practice, this means that I saw some of the software for the first time, and also heard about some problems the developers ran into that I was completely unaware of. This is interesting because it means that the Vrui VR toolkit, on which all this software is based, is maturing from a private pet project into something that’s actually being used by parties who are not directly collaborating with me.
Figure 1: Danny Rey, Tribal Historic Preservation Officer, and Marcos Guerrero, Cultural Resources Manager, representatives of the United Auburn Indian Community, viewing a high-resolution 3D scan of the Maidu Historic Trail and Site in the KeckCAVES immersive visualization facility. In the background are Joe Dumit of UC Davis’ Science and Technology Studies, and myself. Photo provided by Marshall Millett.
Marshall has been using KeckCAVES software, particularly LiDAR Viewer (about which I should really write a post), as well as the KeckCAVES facility itself and related technology, to visualize his high-resolution 3D models at 1:1 scale, and to experience them in ways that are not normally possible (most of these sites are fragile and/or sacred, and not open to the public). Part of this work was a series of visits by community representatives to the KeckCAVES facility to view their digitally reconstructed historic site (see Figure 1).
But let’s back up a bit. When it comes to VR, there are three prevalent opinions:
It’s a dead technology. It had its day in the early nineties, and there hasn’t been anything new since. After all, the CAVE was invented in ’91 and is basically still the same, and head-mounted displays have been around even longer.
It hasn’t been born yet. But maybe if we wait 10 more years, and there are some significant breakthroughs in display and computer technology, it might become interesting or feasible.
It’s fringe technology. Some weirdos keep picking at it, but it hasn’t ever led to anything interesting or useful, and never will.
I’ve talked about “holographic displays” a lot, most recently in my analysis of the upcoming zSpace display. What I haven’t talked about is how exactly such holographic displays work, what makes them “holographic” as opposed to just stereoscopic, and why that is a big deal.
Teaser: A user interacting with a virtual object inside a holographic display.
We are currently involved in an NSF-funded project to study the changes in global ocean flow patterns in response to past climate change, specifically the difference in flow patterns between the last glacial maximum (otherwise known as the “Ice Age”, ~25000 years ago) and the Holocene (otherwise known as “today”).
In layman’s terms, the basic idea is to use differences in the chemical composition, particularly the abundance of isotopes of carbon (13C) and oxygen (18O), of benthic core samples collected from the ocean floor all around the world to establish correlations between sampling sites, and from those derive a global flow model that best explains the correlations. (By the way, 13C is not the carbon isotope used in radiocarbon dating; that honor goes to 14C.)
This is a multi-institution collaborative project. The core sample isotope ratios are collected and collated by Lorraine Lisiecki and her graduate students at UC Santa Barbara, and the mathematical method to reconstruct flow patterns based on those samples is developed by Jake Gebbie at Woods Hole Oceanographic Institution. Howard Spero at UC Davis is the overall principal investigator of the project, and UC Davis’ contribution is visualization and analysis software, building on the strengths of the KeckCAVES project. I’ve posted previously about our efforts to construct low-cost immersive display systems at our collaborators’ sites so that they can use the visualization software developed by us in its native habitat, and also collaborate with us and each other remotely in real-time using Vrui’s collaboration infrastructure.
So here is the first major piece of visualization software developed specifically for this project. It was developed by Rolf Westerteiger, a visiting PhD student from Germany, based on the Vrui VR toolkit. Here is Rolf himself, using his application in the CAVE:
PhD student Rolf Westerteiger using his immersive visualization application in the KeckCAVES CAVE.
This application reads a database of core sample compositions created by Lorraine Lisiecki, and a reconstructed 3D flow field created by Jake Gebbie, and puts both into a global three-dimensional context. The software shows a block model of the Earth’s global ocean floor (at the same resolution as the 3D flow field, and vertically exaggerated by a significant factor), and allows a user to interactively query and explore the 3D flow.
The primary flow visualization method is line integral convolution (LIC), which creates dense and intuitive visualizations of complex flows. As LIC works best when applied to 2D surfaces instead of 3D volumes, Rolf’s application is based on a set of interactively controllable surfaces (one sphere of constant depth, two cones of constant latitude, two semicircles of constant longitude) which slice through the implicitly-defined 3D LIC volume. To indicate flow direction, the LIC texture is animated by cycling through a phase offset, and color-coded by either flow velocity or water temperature.
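For readers unfamiliar with LIC: the idea is to smear a white-noise texture along the flow’s streamlines, so that coherent flow shows up as coherent streaks. The application computes this on the GPU (more on that below), but as a rough illustration of the principle, here is a minimal CPU sketch for a single 2D slice; the array layout and helper names are made up for this example and are not the application’s actual code.

```cpp
// Minimal CPU sketch of line integral convolution (LIC) on one 2D slice.
// This is NOT the application's GPU/GLSL implementation; it only shows the
// principle. The flow field, noise texture, and output image are assumed
// to be row-major arrays of size w*h.

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Nearest-cell lookup into the 2D flow field (indices clamped to the grid):
static Vec2 sampleField(const std::vector<Vec2>& field, int w, int h, float x, float y)
{
    int ix = std::max(0, std::min(w - 1, int(x)));
    int iy = std::max(0, std::min(h - 1, int(y)));
    return field[iy * w + ix];
}

// For each pixel, trace a short streamline backward and forward through the
// flow and average the white-noise texture along it; coherent flow smears
// the noise into visible streaks. Animating the flow direction (as in the
// real application) is done by cycling a phase offset over this filter.
std::vector<float> computeLIC(const std::vector<Vec2>& field,
                              const std::vector<float>& noise,
                              int w, int h, int steps = 20, float stepSize = 0.5f)
{
    std::vector<float> image(w * h, 0.0f);
    for (int py = 0; py < h; ++py)
        for (int px = 0; px < w; ++px)
        {
            float sum = noise[py * w + px];
            int count = 1;
            for (int dir = -1; dir <= 1; dir += 2)   // backward, then forward
            {
                float x = px + 0.5f, y = py + 0.5f;
                for (int s = 0; s < steps; ++s)
                {
                    Vec2 v = sampleField(field, w, h, x, y);
                    float len = std::sqrt(v.x*v.x + v.y*v.y);
                    if (len < 1.0e-6f) break;        // stagnation point; stop tracing
                    x += dir * stepSize * v.x / len; // unit-speed advection
                    y += dir * stepSize * v.y / len;
                    if (x < 0.0f || y < 0.0f || x >= float(w) || y >= float(h)) break;
                    sum += noise[int(y) * w + int(x)];
                    ++count;
                }
            }
            image[py * w + px] = sum / float(count); // box-filtered convolution result
        }
    return image;
}
```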
The special thing about this LIC visualization is that the LIC textures are not pre-computed, but generated in real time using the GPU and a set of GLSL shaders. This allows for even more interactive exploration than shown in this first result; a user could specify arbitrary slicing surfaces using tracked 3D input devices, and see the LIC pattern displayed on those surfaces immediately. From our experience with the 3D Visualizer software, which is based on very similar principles, we believe that this will lead to a very powerful exploratory tool.
A secondary flow visualization method is tracer particles, which can be injected into the global ocean at arbitrary positions using a tracked 3D input device, and which leave behind a trail of their past positions. Together, these two methods provide rich insight into the structure of these reconstructed flows, and especially into their evolution over geologic time.
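To make the particle idea concrete, here is a hypothetical sketch of a tracer particle that is advected through the flow field once per frame and keeps a trail of its past positions. flowAt() stands in for however the real application samples the reconstructed flow; here it is just a simple test flow.

```cpp
// Hypothetical sketch of the tracer-particle method: a particle injected at
// a 3D position is advected through the flow field once per frame and keeps
// a trail of its past positions.

#include <cstddef>
#include <deque>

struct Vec3 { float x, y, z; };

// Stand-in flow field: a solid rotation around the z axis.
static Vec3 flowAt(const Vec3& p)
{
    return Vec3{ -p.y, p.x, 0.0f };
}

const std::size_t maxTrailLength = 200;  // how many past positions to keep

struct Tracer
{
    Vec3 pos;                // current particle position
    std::deque<Vec3> trail;  // past positions, oldest first

    void advance(float dt)
    {
        // Midpoint (second-order) integration is usually plenty for display:
        Vec3 v1 = flowAt(pos);
        Vec3 mid { pos.x + 0.5f*dt*v1.x, pos.y + 0.5f*dt*v1.y, pos.z + 0.5f*dt*v1.z };
        Vec3 v2 = flowAt(mid);
        pos.x += dt * v2.x;
        pos.y += dt * v2.y;
        pos.z += dt * v2.z;

        // Append the new position and drop the oldest one when the trail is full:
        trail.push_back(pos);
        if (trail.size() > maxTrailLength)
            trail.pop_front();
    }
};

// Usage idea: when the user clicks with the tracked 3D input device, create
// a Tracer at the device's position; call advance() once per frame and draw
// the trail as a line strip behind the particle.
```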
A third visualization method puts the raw data that were used to create the flow models into context. A set of labels, one for each core sample in the database, each showing the relative abundances of the important isotopes, is mapped onto the virtual globe at the samples’ proper positions to enable visual inspection of the flow reconstruction method.
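Placing those labels boils down to converting each sample’s geographic position (latitude, longitude, water depth) into a Cartesian point on the vertically exaggerated globe. The sketch below shows the basic conversion; the radius and exaggeration values are illustrative, not the application’s actual ones.

```cpp
// Sketch of how a core-sample label might be placed on the virtual globe:
// convert (latitude, longitude, depth) into a Cartesian position, applying
// the same vertical exaggeration as the ocean-floor block model.

#include <cmath>

struct Position { double x, y, z; };

Position samplePosition(double latDeg, double lonDeg, double depthMeters,
                        double earthRadius = 6371000.0,
                        double vertExaggeration = 20.0)
{
    const double degToRad = 3.14159265358979323846 / 180.0;
    double lat = latDeg * degToRad;
    double lon = lonDeg * degToRad;

    // Depth below sea level, scaled by the vertical exaggeration factor:
    double r = earthRadius - depthMeters * vertExaggeration;

    // Spherical-to-Cartesian conversion (treating the Earth as a sphere):
    return Position{ r * std::cos(lat) * std::cos(lon),
                     r * std::cos(lat) * std::sin(lon),
                     r * std::sin(lat) };
}
```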
Unfortunately, Rolf had to return to Germany before we were able to film a video showing off all features of his visualization application, so I had to make a video with myself standing in for him:
The next development steps are to replace the ocean floor block model read from the flow file with a high-resolution bathymetry model (see below), and to integrate the visualization application with Vrui’s remote collaboration infrastructure such that it can be used by all collaborators for virtual joint data exploration sessions.
Global high-resolution bathymetry model at 75x vertical exaggeration. View is centered on Northern Atlantic.
I’ve already mentioned KeckCAVES’ involvement in NASA’s newest Mars mission, the Mars Science Laboratory, in a previous post, but now I have an update. Dawn Sumner, UC Davis’ member of the Curiosity science team, was interviewed last week for “Onward California,” which I guess is some new system-wide outreach and public relations effort to get the public’s mind off last fall’s “unpleasantries.” Just kidding UC, you know I love you.
Anyway… Dawn decided that the best way to talk about her work on Mars would be to do the interview in the CAVE, showing how our software, particularly Crusta Mars, was used during the planning stages of the mission, specifically landing site selection. I then suggested that it would be really nice to do part of the interview about the rover itself, using a life-size and high-resolution 3D model of the rover. So Dawn went to her contacts at the Jet Propulsion Laboratory, and managed to get us a very detailed 3D model, made of several million polygons and high-resolution textures, to load into the CAVE.
What someone posing with a life-size 3D model of the Mars Curiosity rover might look like.
As it so happens, I have a 3D mesh viewer that was able to load and render the model (which came in Alias|Wavefront OBJ format), albeit with some missing features, specifically specular highlights and bump mapping. The renderer is fast enough to draw the full, undecimated mesh at a frame rate sufficient for immersive display, around 30 frames per second.
The next problem, then, was how to film the beautiful rover model in the CAVE without making it look like garbage, another topic about which I’ve posted before. The film team, from the Department of the 4th Dimension, fortunately was on board, and filmed the interview in several segments, using hand-held and static camera setups.
We have pretty much figured out how to film hand-held video using a secondary head tracker attached to the camera, but static setups where the camera is outside the CAVE, and hence outside the tracking system’s range, always take a lot of trial and error to set up. For good video quality, one has to precisely measure the 3D position of the camera lens relative to the CAVE and then configure that in the CAVE software.
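The reason the lens position matters so much is that each CAVE screen is rendered with an off-axis perspective projection whose apex sits at the viewpoint, which for a static camera is the lens itself. The following sketch computes the frustum extents for one rectangular screen from a measured eye position; it illustrates the principle and is not Vrui’s actual code.

```cpp
// Each CAVE wall is rendered with an off-axis perspective projection whose
// apex is the viewpoint; for a static camera that viewpoint is the measured
// lens position in CAVE coordinates. This computes glFrustum-style extents
// for one rectangular screen (an illustration, not Vrui's actual code).

#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return Vec3{ a.x-b.x, a.y-b.y, a.z-b.z }; }
static double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b)
{ return Vec3{ a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
static Vec3 normalize(const Vec3& v)
{ double l = std::sqrt(dot(v, v)); return Vec3{ v.x/l, v.y/l, v.z/l }; }

// Screen corners (lower-left, lower-right, upper-left) and eye position are
// all given in the same CAVE coordinate system:
void offAxisFrustum(const Vec3& ll, const Vec3& lr, const Vec3& ul, const Vec3& eye,
                    double nearDist,
                    double& left, double& right, double& bottom, double& top)
{
    Vec3 vr = normalize(sub(lr, ll));   // screen right axis
    Vec3 vu = normalize(sub(ul, ll));   // screen up axis
    Vec3 vn = normalize(cross(vr, vu)); // screen normal, pointing toward the eye

    Vec3 va = sub(ll, eye);             // eye to lower-left corner
    Vec3 vb = sub(lr, eye);             // eye to lower-right corner
    Vec3 vc = sub(ul, eye);             // eye to upper-left corner

    double d = -dot(va, vn);            // perpendicular distance from eye to screen plane
    left   = dot(vr, va) * nearDist / d;
    right  = dot(vr, vb) * nearDist / d;
    bottom = dot(vu, va) * nearDist / d;
    top    = dot(vu, vc) * nearDist / d;
    // These extents feed a standard off-axis projection matrix; getting the
    // eye (lens) position wrong by a few centimeters visibly breaks the
    // continuity of the image across adjacent CAVE screens.
}
```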
I used to do that by guesstimating the camera position, entering the values into the configuration file, and then using a Vrui calibration utility to visually judge the setup’s correctness. This involves looking at the image, figuring out why it’s wrong, mentally adjusting the camera position to correct for the error, editing the configuration file, and repeating the whole process until it looks OK. Quite annoying, that, especially if there’s an entire film crew sitting in the room checking their watches and rolling their eyes.
After that filming session, I figured that Vrui could use a more interactive way of setting up CAVE filming, a user interface to set up and configure several different filming modes without having to leave a running application. So I added a “filming support” vislet, and to properly test it, filmed myself posing and playing with the Curiosity rover (MSL Design Courtesy NASA/JPL-Caltech):
Pay particular attention to the edges and corners of the CAVE, and how the image of the 3D model and the image backdrop seamlessly span the three visible CAVE screens (left, back, floor). That’s what a properly set up CAVE video is supposed to look like. Also note that I set up the right CAVE wall to be rendered for my own point of view, in stereo, so that I could properly interact with the 3D model and knew what I was pointing at. Without such a split-CAVE setup, it’s very hard to use the CAVE when in filming mode.
The filming support vislet supports head-tracked recording, static recording, split-CAVE recording (where some screens are rendered for the user, and some for the camera), setting up custom light sources, and a draggable calibration grid and input device markers to simplify calibrating a static camera setup when the camera is outside the tracking system’s range and cannot be measured directly.
All in all, it works quite well, and is a significant improvement over the previous setup method. It is now possible to change filming modes and camera setups from within a running application, without having to exit, edit configuration files, and restart.
You might have heard that NASA has a new rover on Mars. What you might not know is that KeckCAVES is quite involved with that mission. One of KeckCAVES’ core scientists, Dawn Sumner, is a member of the Curiosity Science Team. Dawn talks about her experiences as a tactical long-term planner for the rover’s science mission, and as a co-investigator on several of the rover’s cameras, on her blog, Dawn on Mars.
Immersive 3D visualization has been used at several stages of mission planning and preparation, including selection of the rover’s landing site. Crusta, the virtual globe software developed by KeckCAVES, was used to create a high-resolution global topography model of Mars, merging the best-quality data available for the entire planet and each of the originally proposed landing sites. Crusta’s ability to run in an immersive 3D display environment such as KeckCAVES’ CAVE, allowing users to virtually walk on the surface of Mars at 1:1 (or any other) scale, and to create maps by drawing directly on the 3D surface, was important in weighing the relative merits of the four proposed sites from engineering and scientific viewpoints.
Dawn made the following video after Gale Crater, her preferred landing site, had been selected for the mission to illustrate her rationale. The video is stereoscopic and can be viewed using red/blue anaglyphic glasses or several other stereo viewing methods:
We filmed this video entirely virtually. Dawn was working with Crusta on a low-cost immersive 3D environment based on a 3D TV, which means she perceived Crusta’s Mars model as a tangible 3D object, was able to interact with it via natural gestures using an optically-tracked Nintendo Wii controller as input device, and could point out features of interest on the surface using her fingers. Dawn herself was filmed by two Kinect 3D video cameras, and the combination of virtual Mars and virtual Dawn was rendered into a stereo movie file in real time while she was working with the software.
Now that Curiosity is on Mars, we are planning to continue using Crusta to visualize and evaluate its progress, and we hope that Crusta will soon help planning and executing the rover’s journey up Mt. Sharp (NASA have their own 3D path planning software, but we believe Crusta has useful complementary features).
Furthermore, as the rover progresses, it will send high-resolution stereo images from its mast-mounted navigation camera. Several KeckCAVES developers are working on software to convert these stereo images into ultra-high resolution digital terrain models, and to register these to, and integrate them with, Crusta’s existing Mars topography model as they become available.
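The heart of that conversion is classic stereo photogrammetry: for a rectified image pair, depth is inversely proportional to disparity. The real pipeline involves full camera models and registration to the orbital data, but the core triangulation looks roughly like the sketch below (parameter names are illustrative).

```cpp
// Core relation behind converting a rectified stereo image pair into 3D
// points: depth is inversely proportional to disparity. The actual
// photogrammetry pipeline is far more involved; this is only the central
// back-projection formula.

struct Point3 { double x, y, z; };

// focalLength in pixels, baseline in meters, (u,v) pixel coordinates in the
// left image relative to the principal point, disparity in pixels:
Point3 triangulate(double u, double v, double disparity,
                   double focalLength, double baseline)
{
    double z = focalLength * baseline / disparity;  // depth along the optical axis
    return Point3{ u * z / focalLength,             // back-project through the left camera
                   v * z / focalLength,
                   z };
}
```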
We already tried this process with stereo imagery from the previous two Mars rovers, Spirit and Opportunity. We took the highest-resolution orbital topography data available, collected by the HiRISE camera, and merged it with the rover data, which is approximately 1000 times more dense. The following figure shows the result (click to embiggen):
The white arrow in panel A shows the location of the rover’s high-resolution data patch shown in panels B and C. In panel C, a stratum of rock — identified by its different color — was selected, and a plane was fit to the selected points (highlighted in green) to measure the stratum’s bedding angle.
The above images were created with LiDAR Viewer, another KeckCAVES software package. LiDAR Viewer is used to visually analyze very large 3D point clouds, such as those resulting from laser scanning surveys, or, in this case, orbital and terrestrial stereo imagery.
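For the curious, the kind of computation behind that bedding-angle measurement is a least-squares plane fit followed by reading off the plane’s dip from the horizontal. LiDAR Viewer’s actual primitive-fitting code may differ; the sketch below only illustrates the principle, and assumes the stratum is nowhere near vertical.

```cpp
// Sketch of a bedding-angle measurement: fit a plane z = a*x + b*y + c to
// the selected points by least squares and report its dip angle from the
// horizontal. Illustrative only; not LiDAR Viewer's actual code.

#include <cmath>
#include <vector>

struct Point3 { double x, y, z; };

// Returns the dip angle, in degrees, of the best-fit plane through the points.
double beddingAngle(const std::vector<Point3>& pts)
{
    // Accumulate the normal equations for the unknowns [a b c]:
    double sxx=0, sxy=0, sx=0, syy=0, sy=0, sxz=0, syz=0, sz=0;
    double n = double(pts.size());
    for (const Point3& p : pts)
    {
        sxx += p.x*p.x; sxy += p.x*p.y; sx += p.x;
        syy += p.y*p.y; sy += p.y;
        sxz += p.x*p.z; syz += p.y*p.z; sz += p.z;
    }

    // Solve the 3x3 symmetric system via Cramer's rule:
    // | sxx sxy sx | |a|   |sxz|
    // | sxy syy sy | |b| = |syz|
    // | sx  sy  n  | |c|   |sz |
    double det = sxx*(syy*n - sy*sy) - sxy*(sxy*n - sy*sx) + sx*(sxy*sy - syy*sx);
    double a = (sxz*(syy*n - sy*sy) - sxy*(syz*n - sy*sz) + sx*(syz*sy - syy*sz)) / det;
    double b = (sxx*(syz*n - sy*sz) - sxz*(sxy*n - sx*sy) + sx*(sxy*sz - syz*sx)) / det;

    // The plane's gradient magnitude sqrt(a^2+b^2) gives the tangent of the
    // dip angle between the plane and the horizontal:
    return std::atan(std::sqrt(a*a + b*b)) * 180.0 / 3.14159265358979323846;
}
```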
The terrain data we expect from Curiosity’s stereo cameras will be even higher resolution than that. The end result will be an integrated global Martian topography model with local patches down to millimeter resolution, allowing a scientist in the CAVE to virtually pick up individual pebbles.
I know I’m several years late to the party talking about the recent 3D movie renaissance, but bear with me. I want to talk not about 3D movies, but about their influence on the VR field, good and bad.
First, the good. It’s impossible to deny the huge impact 3D movies have had on VR, simply by commodifying 3D display hardware. I’m going to go out on a limb and say that without Avatar, you wouldn’t be able to go into an electronics store and pick up a 70″ 3D TV for $2100. And without that crucial component, we would not be able to build low-cost fully-immersive 3D display systems for $7000. And we wouldn’t have neat toys like Sony’s HMZ-T1 or the upcoming Oculus Rift either — although the latter is designed for gaming from the ground up, I don’t think the Kickstarter would have taken off if 3D movies weren’t a thing right now.
And the effect goes beyond simply making real VR cheaper. It is that now real VR is affordable for a much larger segment of people. $7000 is still a bit much to spend for home entertainment, but it’s inside the equipment budget for many scientists. And those are my target audience. We are not selling low-cost VR systems per se, but we’re giving away the designs to build them, and the software to run them. And we’ve “sold” dozens of them, primarily to scientists who work with 3D data that is too complex to meaningfully analyze with desktop 3D visualization, but who don’t have the budget to build “professional” systems. Now, dozens is absolutely zilch in mainstream terms, but for our niche it’s a big deal, and it’s just taking off. We’re even getting them into high schools now. And we’re not the only ones “selling” them.
The end result is that many more people are getting exposed to real immersive 3D display environments, and to the practical benefits that they offer for their daily work. That will benefit us all.
But there are some downsides to the 3D movie renaissance as well, and while those can be addressed, we first need to be aware of them. For one, while 3D movies are definitely in the public consciousness, I have found that hardly anybody is exactly bonkers about them. Roger Ebert is an extreme example (I think that Mr. Ebert is wrong in the sense that he claims 3D does not work in principle, whereas I think 3D does not work in many concrete implementations seen in theaters right now, but that’s a topic for another post), but the majority of people I speak to are decidedly “meh” about 3D movies. They say “3D doesn’t work for me” or “I get headaches” or “I get dizzy” etc.
Now that is a problem for VR as a whole, because there is no distinction in the public mind between 3D movies and real immersive 3D graphics. Meaning that people think that VR doesn’t work. But it does. I just did a quick guesstimate, and in the seven years we’ve had our CAVE, I’ve probably brought 1000 people through there, from every segment of the population. It has worked for every single one of them. How do I know? Everyone who enters the CAVE goes through the training course — a beach ball-sized globe hanging in the middle of the CAVE, shown in this video:
(Oh boy, just looking at this six-year-old video, the user interface in Vrui has improved so much. It’s almost embarrassing.)
I ask every single person to step in, touch the globe, and then indicate how big it is. And they all do the same thing: use both hands to make a cradling gesture around a virtual object that’s not actually there. If the 3D effect didn’t work for them, they couldn’t do it. QED. Before you ask: I’m aware that a significant percentage of the general population have no stereo vision at all, but immersive 3D graphics works for them as well because it provides motion parallax. I know because one of my best friends has monocular vision, and it works for him. He even co-stars with me in a silly video.
The upshot is that the conversation goes differently now. It used to be that I would talk to “VR virgins” about what I do; they had no preconceptions about 3D, were curious, tried the CAVE, and it worked for them and they liked it. These days, I talk about the CAVE, they immediately say that 3D doesn’t work for them, and they’re very reluctant to try the CAVE. I twist their arms to get them in there nonetheless, and it works for them, and they like it. This is not a problem if I have someone there in person, but it is a problem when I can’t just stuff the person I’m describing VR to into a VR system, as in, say, when you’re writing a proposal to beg for money. And that’s bad news, big time (but it’s a topic for another post).
There is another interesting change in behavior: let’s say I have a group of people coming in for a tour (yeah, we sometimes get strongarmed into doing those). Used to be, they would come into the CAVE room, and stand around not sure what to expect or what to do. These days, they immediately sit down at the conference table, grab a pair of 3D glasses if they find one, and get ready to be entertained. I then have to tell them that no, that’s not how it works, would they please put the non-head tracked glasses down until later, get up, and get ready to get into the CAVE itself and see it properly? It’s pretty funny, actually.
The other downside is that the use of the word “3D” for movies has watered down that term even more. Now there are:
“3D graphics” for projected 2D images of 3D scenes, i.e., virtual and real photos or movies: basically everything anybody has ever done. The end results of 3D graphics are decidedly 2D, but the term was coined to distinguish it from 2D graphics, i.e., pictures of scenes playing out in flatland.
“3D movies” meaning stereoscopic movies shown on stereoscopic displays. In my opinion, a better term would be “2D plus depth” movies (or they could just go with “stereo movies,” you know), because most directors at this time treat the stereoscopic dimension as a separate entity from the other two dimensions, as something that can be tweaked and played with. And I think that’s one cause of the problem, because they’re messing with people’s brains. And don’t even get me started on “upconverted” 3D movies, oh my.
“3D displays” meaning stereoscopic displays, those used to show 3D movies. They are a necessary component to create 3D images, but not 3D by themselves.
“3D displays” meaning immersive 3D displays like CAVEs. The distinguishing feature of these is that they show three-dimensional scenes and objects in a way similar enough to how we would perceive the same scenes and objects if they were real that our brains accept the illusion, and allow us to work with them as if they were real — and this last bit is really the main point. The difference between this and “3D movies” cannot be overstated. I would rather call these displays “holographic,” but then I get flak from the “holograms are only holograms if they’re based on lasers and interference” crowd, who are technically correct (and isn’t that the best form of correctness?) because that’s how the word was defined, but it’s wrong because these displays look and feel exactly like holograms — they are free-standing, solid-appearing, touchable virtual objects. After all, “hologram,” loosely translated from Greek, means “shows the whole thing.” And that’s exactly what immersive 3D displays do.
And I probably missed a few. So there’s clearly a confusion of terms, and we need to find ways to distinguish what real immersive 3D graphics does from what 3D movies do, and need to do it in ways that don’t create unrealistic expectations, either. Don’t reference “the Matrix,” try not to mention holodecks (but it’s so tempting!), don’t say it’s an indistinguishable replication of reality (in other words, don’t say “virtual reality,” ha!). Ideally, don’t say anything — show them.
In summary, “3D” is now widely embedded in the public consciousness, and the VR community has to deal with it. There are obvious and huge benefits, but there are some downsides as well, and those have to be addressed. They can be addressed — fortunately, immersive 3D graphics are not the same as 3D movies — but it takes care and effort. Time to get started.
Leap Motion’s Leap, an optical tracking system that lets users interact with computers in three dimensions directly with their hands, has been the talk of the town recently. So what’s my take on it, and particularly on its use for immersive graphics?
Cool story, bro. Two months ago, a group of researchers from UC Davis and I visited the company in their San Francisco offices to see the device for ourselves. Several of Leap Motion’s engineers had seen our booth at the recent Bay Area Maker Faire, and invited us to bring one of our low-cost semi-immersive displays (a 3D TV with a Razer Hydra 6-DOF input device) and show our stuff. We obliged, packed our things, and down along I-80 to SF we went. We showed them ours, they showed us theirs, and fun was had by all.
So what’s the intelligence gathered from this visit? There’s good news, and there’s bad news. The good news is the hardware. Leap Motion have been touting the Leap as a much more precise alternative to the Kinect, and they have that absolutely right. The precision, resolution, and responsiveness of the device are exactly what they claim. Interestingly, I did not glean that insight from the actual software demos they were showing, but from a very simple utility that just showed the raw 3D point cloud of everything that entered the device’s capture space, and identified hands, fingers, and other gadgets such as pencils accurately and in real time. Having done extensive work with the Kinect, I can say that it’s an entirely different kind of tracking, altogether.
So what’s the bad news? Well, as usual, it’s the software and application side. Leap Motion’s company line is that the Leap will make mouse and keyboard obsolete. Not so fast there, buckaroo. Probably 99.99% of computer interactions done by normal people are two-dimensional in nature, and the mouse/keyboard are really good at those. You would not want to use a free-space 3D interface for intrinsically 2D interactions, which is, incidentally, my only gripe with the famous Minority Report interface (but that’s a topic for another post). The end result from doing that already has a fitting name: “Gorilla Arm.” I think I can speak to that because that’s exactly what happens when you’re doing 2D tasks (like using a web browser or filling in a spreadsheet) in an immersive display environment. Trust me, it’s not something you want to do if you can avoid it.
On the other hand, if you’re one of the minority of people who use their computers for 3D tasks, e.g., 3D modeling, sculpting, or, naturally, immersive 3D graphics, it’s an entirely different story. For such applications in the desktop realm, the Leap is a godsend. Instead of having to do the mental gymnastics of using a 2D input device to perform 3D interactions, you just interact directly with the 3D data. This is, again, exactly what’s happening in immersive graphics, and yes, it’s something you definitely do want to do.
So that’s good news, right? Well, yeah, but… The problem here is, and it’s a big problem, that in order to pipe 3D interactions captured by a device like the Leap into a 3D application, you have to punch through the existing 2D-based user interface of that application. The previous approach companies developing novel 3D input devices (think all the data gloves, 3D mice, etc. that have come out and failed over the years) have taken is to provide some form of mouse emulation, so that their devices can be used immediately with existing software. This does not work, ever. In this setup, 3D interactions performed with the device are first boiled down to 2D by the device’s driver, fed into the application, and then turned back into 3D interactions using whatever interface paradigm the application is using. The first step, going from 3D to 2D, is already awkward, and the second step is typically optimized for particular 2D devices, such as mice, which a “simulated” mouse device is most decidedly not. In other words, there are two levels of ill-fitting interface paradigms stacked on top of each other.
So what needs to be done? The answer is quite simple: if you want to effectively use the Leap with a piece of 3D software, that software has to explicitly support the Leap, and needs to use appropriate direct 3D interaction metaphors. Meaning the application developers have to buy into the Leap, dream up good problem-specific 3D interaction metaphors, do studies or experiments to fine-tune them, and then include them in their software. That takes a lot of time and money, and they won’t do it unless there is high demand, i.e., the Leap is already a widely-used device. But it won’t become a widely-used device unless a lot of widely-used 3D software already supports it in an effective way.
So it’s a classical chicken-and-egg problem. Unless you happen to use a certain VR development toolkit that is based around exactly this idea: providing device-optimized 3D interaction metaphors outside of an application’s purview, so that hardware developers can integrate their devices into existing applications without having to change those applications in any way, or even getting to their source code. But I digress…
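To make that idea a bit more concrete, here is a purely hypothetical sketch (emphatically not Vrui’s actual API) of what such a device-abstraction layer looks like in principle: the application subscribes to abstract 6-DOF interaction events, and per-device bindings, maintained outside the application, translate whatever hardware is present into those events.

```cpp
// Hypothetical sketch of the device-abstraction idea (NOT Vrui's actual
// API): the application only ever sees abstract 6-DOF input events, and a
// separate binding layer maps whatever hardware is present (Leap, Razer
// Hydra, tracked wand, ...) onto those events. New devices then work with
// existing applications without touching application code.

#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Pose              // position + orientation of a tracked point (e.g. a fingertip)
{
    double pos[3];
    double quat[4];
};

struct InputEvent
{
    Pose pose;           // where the interaction happens, in application space
    bool buttonDown;     // generic "activate" state (pinch, trigger, button, ...)
};

class InputManager
{
public:
    using Callback = std::function<void(const InputEvent&)>;

    // The application registers interaction callbacks against named abstract events:
    void subscribe(const std::string& eventName, Callback cb)
    { callbacks.push_back({ eventName, cb }); }

    // Device bindings (a Leap binding, a Hydra binding, ...) call this; the
    // application never sees which device generated the event:
    void dispatch(const std::string& eventName, const InputEvent& ev)
    {
        for (auto& c : callbacks)
            if (c.first == eventName)
                c.second(ev);
    }

private:
    std::vector<std::pair<std::string, Callback>> callbacks;
};

// Application side: grab-and-drag a 3D object with no idea whether the pose
// comes from a Leap fingertip or a tracked wand (model.moveTo is a
// hypothetical application call):
//
// inputManager.subscribe("dragTool", [](const InputEvent& ev) {
//     if (ev.buttonDown)
//         model.moveTo(ev.pose);
// });
```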
Back on topic, what Leap Motion need to do is find at least one “killer application,” and do their utmost to get that application just exactly right. And then they have to bundle that application with every device sold. If the people buying their device are stuck with playing Fruit Ninja, or navigating with Google Earth (another thing a mouse is really good at, because Google successfully boiled down the interaction to 2D, and Leap’s Google Earth plug-in doesn’t add any new functionality) or have to use the device to write emails, they won’t recommend it to their friends.
By the way: will the Leap work out-of-the box for 3D video games? Hard to say, but I’m skeptical. They show a “finger gun” control scheme for first-person shooters — again implemented via mouse emulation — but doing that for more than a few minutes will lead to a very sore shoulder. Not that it’s a bad idea in itself — see below for a video showing exactly that interface in a CAVE — but unless the Leap is integrated into a fully calibrated desktop system, it won’t allow a player to actually aim with the “finger gun;” it will be just an equally indirect replacement for moving the mouse left-to-right.
On their web site, Leap Motion mention CAD and clay modeling as applications that inspired them to develop it. Could these be killer applications? Time will tell, but it’s at least a good starting point. So, go ahead and do it! I happen to have a 3D virtual clay modeling application with direct 3D interaction metaphors lying around, just saying…
Now, to restate my overall point after all this skepticism. From what I’ve personally seen, the Leap is an awesome device. I will definitely buy at least one when it comes out. That’s because all the software I’m developing and using on a daily basis is already poised to work with it, due to its input abstraction paradigm. Give me a low-level driver, and the rest is gravy — please, give me a low-level driver! But will the device succeed in the mainstream market, given the issues discussed here? Will it sell hundreds of millions of units, as they hope? For that to happen, I think, they’ll have to do significantly more than what they showed us. Maybe that’s why they pushed back the release date by half a year — here’s hoping.