Build your own Augmented Reality Sandbox

Update: There is now an AR Sandbox support forum with detailed, complete installation instructions starting from a blank/new PC, and a video walking through those same instructions. You’re welcome to read the rest of this article for context and background, but please ignore the outdated hardware recommendations and installation instructions below. Instead, use the up-to-date hardware recommendations from the AR Sandbox project page, and follow the instructions linked above.

Earlier this year, I branched out into augmented reality (AR) to build an AR Sandbox:

Photo of AR Sandbox, with a central “volcano” and several surrounding lakes. The topographic color map and contour lines are updated in real time as the real sand surface is manipulated, and virtual water flows over the real sand surface realistically.

I am involved in an NSF-funded project on informal science education for lake ecosystems, and while my primary part in that project is creating visualization software to drive 3D displays for larger audiences, creating a hands-on exhibit combining a real sandbox with a 3D camera, a digital projector, and a powerful computer seemed like a good idea at the time. I didn’t invent this from whole cloth; the project got started when I saw a video of such a system done by a group of Czech students on YouTube. I only improved on that design by adding better filters, topographic contour lines, and a physically correct water flow simulation.

The idea is to have these AR sandboxes as more or less unsupervised hands-on exhibits in science museums, and allow visitors to informally learn about geographical, geological, and hydrological principles by playing with sand. The above-mentioned NSF project has three participating sites: the UC Davis Tahoe Environmental Research Center, the Lawrence Hall of Science, and the ECHO Lake Aquarium and Science Center. The plan is to take the current prototype sandbox, turn it into a more robust, museum-worthy exhibit (with help from the exhibit designers at the San Francisco Exploratorium), and install one sandbox each at the three sites.

But since I published the video shown above on YouTube, where it went viral and gathered around 1.5 million views, there has been a lot of interest from other museums, colleges, high schools, and private enthusiasts in building their own versions of the AR sandbox using our software. Fortunately, the software itself is freely available and runs under Linux and Mac OS X, and all the hardware components are available off the shelf. One only needs a Kinect 3D camera, a data projector, a recent-model PC with a good graphics card (an Nvidia GeForce 480 or comparable to run the water simulation; pretty much anything if the water simulation is turned off) — and an actual sandbox, of course.

In order to assist do-it-yourself efforts, I’ve recently created a series of videos illustrating the core steps necessary to add the AR component to an already existing sandbox. There are three main steps: two to calibrate the Kinect 3D camera with respect to the sandbox, and one to calibrate the data projector with respect to the Kinect 3D camera (and, by extension, the sandbox). These videos elaborate on steps described in words in the AR Sandbox software’s README file, but sometimes videos are worth more than words. In order, these calibration steps are:

Step 1 is optional and will get a video as time permits, and steps 3, 6, and 8 are better explained in words.

Important update: when running the SARndbox application, don’t forget to add the -fpv (“fix projector view”) command line argument. Without it, the SARndbox won’t use the projector calibration matrix you so carefully created in step 7. It’s in the README file, but apparently nobody ever reads that. 😉
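Purely for illustration, here is a minimal launch sketch in Python showing where the -fpv argument goes. The binary path is an assumption and depends on where you built the AR Sandbox package; only the -fpv flag itself comes from the README and the note above.

```python
"""Minimal sketch: launch SARndbox with the -fpv flag so it applies the
projector calibration matrix created during projector calibration.
The binary location below is an assumption; adjust it to your own build."""
import subprocess

SARNDBOX = "./bin/SARndbox"  # assumed path to the compiled SARndbox executable

# -fpv ("fix projector view") makes SARndbox use the projector calibration;
# without it, the carefully measured calibration matrix is ignored.
subprocess.run([SARNDBOX, "-fpv"], check=True)
```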

The only component that’s completely left up to each implementer is the sandbox itself. Since it’s literally just a box of sand with a camera and projector hanging above, and since its exact layout depends a lot on its intended environment, I am not providing any diagrams or blueprints at this point, except a few photos of our prototype system.

Basically, if you already own a fairly recent PC, a Kinect, and a data projector, knock yourself out! It should be possible to jury-rig a working system in a matter of hours (add 30 minutes if you need to install Linux first). It’s fun for the whole family!

VR in the movies

I’m mad at the Onion A.V. Club right now (no, not really, I love those guys). In my post about the Leap Motion Leap I briefly mentioned my one gripe with the way VR is presented in Minority Report, and said that I should write a post about it. That evolved into a post on the larger topic of how realistic (or how crazy out-there) VR depictions in movies are in general, and when I opened the A.V. Club this morning to read my weekly dose of Babylon 5 reviews (oh yes, I am an unapologetic fan), I saw this: The future won’t look like this: 11 unintentionally ridiculous depictions of virtual reality. Curse you, A.V. Club!

At the risk of looking like a lame copycat, I’ll still do it, because the technical angle I had in mind is different from the A.V. Club’s approach, but if you disagree, tell me off in the comments.

Let’s get going, with a completely subjective selection, and in no particular order.

Star Wars, 1977

What? There’s no VR in it! True, but there are “holograms” in it. And because it’s an extremely common misconception, and I get it thrown at me all the time, I need to say it: real holograms don’t work that way! You know the scene I’m referring to:

The thing is that real holograms need to be “supported” by a piece of holographic screen behind them — you can only see the part of the hologram that’s between your eyes and the screen. Holograms do appear free-standing — just not as free-standing as most people subconsciously assume: holographic projectors such as R2-D2’s here are fiction. This matters because the argument goes: once we get real-time holograms, we won’t need to build CAVEs anymore. Technically true, yes, but you’d need to build a space enclosed by holographic screens to get the same effect as a CAVE, so basically the same thing. Sorry.

Verdict: Fiction!

Disclosure, 1994

But this one’s in the A.V. Club article! True, and I feel bad about copying them even more blatantly. But I have to amend what they’re saying. I have no beef with their evaluation on the ridiculousness scale, but from a technical point of view, VR as depicted in Disclosure, at least in the following scene, exists and is used today:

Let’s see: tracked head-mounted display, tracked data glove, omni-directional treadmill, 3D scanner that captures a real-time 3D image of the user and projects it into the virtual space — I have all that in my lab, minus the treadmill (unfortunately). I’m even working with architecture firms. Walking across a virtual cathedral to access files, and a bottomless chasm in the middle of your database server for no reason? Yeah, that’s silly.

Verdict: Nailed it!

Minority Report, 2002

This one’s interesting. There are two VR bits in it: the famous “maestro-style” free-hand GUI, and the 3D home movie. Let’s tackle the simple one first, the 3D home movie:

The 3D video itself looks exactly like the kind of video you can capture with a 3D camera like the Kinect, down to the fringe triangle artifacts (someone on YouTube even made a mash-up between this and my first Kinect video; it’s uncanny). The projection system is another story: at first glance, it’s another completely free-standing hologram (fiction!), but a bit of fanwank can explain that it was actually a projection onto a 3D multi-viewpoint fog projection display (exists! just not quite as good yet).

Partial verdict: Nailed it!

The part with which I have a gripe is the 3D GUI:

From a technical point of view, we could have built that in 2002: tracked data gloves (had them in my lab in 1998, albeit with wires), projection onto a translucent screen (nothing to it), gesture interface, we could have rigged up a physical data transfer module (it’s basically a transparent USB stick, right?), etc.

So here’s my gripe: the whole thing makes no sense. Some people take issue with the manual data transfer — why not send the data over the network? — but you could fanwank that away as a security measure. No, the problem is: why use a 3D user interface in the first place? Look at exactly what he’s doing. All the data he’s interacting with are 2D — text, images, movies. All the interactions are 2D: he moves and pinch-zooms, he rotates in the screen plane. Oh, and to the kind folks who did the annotation: it’s not a “holoscreen” — it only shows 2D images, so it’s simply a “screen.”

There is no free 3D manipulation, so why is he using a free 3D user interface? It’s bad, ergonomically. Holding your hands out like that for precise work over an extended time (more than a few minutes) is painful. The syndrome is called “Gorilla Arm.” The ideal hardware and UI for this type of work is a multi-touch surface device, probably set not vertically, but at an angle like a drafting table. Then your hands and fingers have something to rest on and push against for the interactions, which makes them much easier and less painful.

Why am I harping on this? People are rushing to recreate this interface, now that the hardware is cheaply available, because it looks extremely cool in the movie. It fooled me the first two times I watched it. So people are working hard to build an interface that’s literally painful to use, the people who actually try it will hate it, and the backlash will hurt us all. Please, don’t do it.

Partial verdict: Nailed it technically, but failed ergonomics

Iron Man, 2008

This one I love:

It starts out like the Minority Report GUI, but then it gets good the moment the suit’s 3D model appears over the virtual workbench. I’m wondering if that’s intentional one-upmanship: start out just like the other, and then blow it away.

Anyway, let’s look at the technology: free-standing 3D display above a virtual workbench, hand tracking and gesture interpretation without data gloves. Pushing it, but we have the Kinect, we may soon have the Leap, and we can always imagine that he’s wearing inconspicuous VR goggles in Tony Stark’s inimitable style. Or, alternatively, assume that what we’re seeing in the movie is a representation of what Tony sees, not what another person standing at the camera’s position would see; then the imagery could be only the part of the 3D model between him and the workbench screen, which could be auto-stereoscopic, and it’s entirely today’s technology.

So with a bit of squinting and allowing for the Hollywood glitz filter, yes, we can build that. As for the interaction: tell me it doesn’t look exactly like this, again accounting for the glitz filter, and me using only one hand (we have a second input device now):

Now you might ask: why am I lauding free-space 3D interactions here, when I decried them in Minority Report? Simple: because here they are used for actual 3D manipulation, where you accept a bit of discomfort because there is no better alternative. You’ll also notice that he holds his arms in a more comfortable position, not at shoulder height (or only for as long as it takes to grab an object). That makes a huge difference, and it’s what our users do when they spend long hours in the CAVE.

Verdict: a bit shinier than what we can do today, but overall Nailed it!

Iron Man 2, 2010

Several scenes in this one. The first is the coffee table scene:

Pretty standard multi-touch surface display and interactions. Not really VR, as it’s all 2D, but worth a mention anyway. Verdict: Nailed it!

The workshop walk-through scene:

Similar to the scene from the first Iron Man, this one features completely free-standing 3D imagery, implied to be free-standing holograms, and therefore fiction. In the context of the movie, it’s entirely possible that his entire workshop is panelled in auto-stereoscopic displays, and that the movie is only showing us what Tony sees. That could be done today, but it’s nowhere near practical, a huge stretch, and because of the common misconception about holograms, I’ll have to give it a demerit. Add to that the fact that the user interface here is a lot more “do what I mean” than in the first movie’s scene. There, the gestures he performs correspond directly enough to actions on the 3D model that a good 3D UI could explain them; here, it’s over the line. This UI, as depicted, can only work if a strong AI is running it. Since we already know that Tony employs a strong AI as an assistant, that makes sense in the context of the movie, but sadly it’s fiction.

Overall verdict: Fiction!

All right, that’s my list for now. I’m not going to touch the Matrix, Thirteenth Floor, eXistenz, et al., because those are obviously pure fiction. But if I forgot anything that deserves mention, because it depicts an internally consistent combination of display hardware and user interface that may or may not exist or be theoretically feasible, please let me know below. I have Netflix.

D’oh, I forgot one, especially embarrassing because I mentioned Babylon 5. How could I!

Babylon 5, And The Sky Full Of Stars, 1994

Can’t find a clip, but here’s the episode recap on the Lurker’s Guide. Synopsis: the station’s commander gets kidnapped and interrogated by being strapped into a virtual reality system, so that the interrogators can mindscrew him and break him more easily. The VR system itself is not thought through enough to be analyzed, except the display bit itself: it’s a retinal projector, shining the image of the virtual 3D world directly into the user’s eyes (only into one eye in the episode, sad oversight). Exists!

The input part of the system, on the other hand, must use some kind of neural interface, because the user (or captive in this case) can move inside the virtual world normally while being strapped into a chair in the real world, so Fiction!

How the interrogator, or the commander’s virtual body, gets mapped into the virtual world is not even addressed, so Didn’t think about it!

Let’s just say I like this episode in spite of the VR stuff, not because of it. It’s just a TV show, after all.

KeckCAVES on Mars

You might have heard that NASA has a new rover on Mars. What you might not know is that KeckCAVES is quite involved with that mission. One of KeckCAVES’ core scientists, Dawn Sumner, is a member of the Curiosity Science Team. Dawn talks about her experiences as tactical long term planner for the rover’s science mission, and co-investigator on several of the rover’s cameras, on her blog, Dawn on Mars.

Immersive 3D visualization has been used at several stages of mission planning and preparation, including selection of the rover’s landing site. Crusta, the virtual globe software developed by KeckCAVES, was used to create a high-resolution global topography model of Mars, merging the best-quality data available for the entire planet and each of the originally proposed landing sites. Crusta’s ability to run in an immersive 3D display environment such as KeckCAVES’ CAVE, allowing users to virtually walk on the surface of Mars at 1:1 (or any other) scale, and to create maps by drawing directly on the 3D surface, was important in weighing the relative merits of the four proposed sites from engineering and scientific viewpoints.

Dawn made the following video after Gale Crater, her preferred landing site, had been selected for the mission to illustrate her rationale. The video is stereoscopic and can be viewed using red/blue anaglyphic glasses or several other stereo viewing methods:

We filmed this video entirely virtually. In it, Dawn is working with Crusta in a low-cost immersive 3D environment based on a 3D TV, which means she perceived Crusta’s Mars model as a tangible 3D object and was able to interact with it via natural gestures, using an optically-tracked Nintendo Wii controller as an input device and pointing out features of interest on the surface with her fingers. Dawn herself was filmed by two Kinect 3D video cameras, and the combination of virtual Mars and virtual Dawn was rendered into a stereo movie file in real time while she was working with the software.

Now that Curiosity is on Mars, we are planning to continue using Crusta to visualize and evaluate its progress, and we hope that Crusta will soon help plan and execute the rover’s journey up Mt. Sharp (NASA has its own 3D path planning software, but we believe Crusta offers useful complementary features).

Furthermore, as the rover progresses, it will send high-resolution stereo images from its mast-mounted navigation camera. Several KeckCAVES developers are working on software to convert these stereo images into ultra-high resolution digital terrain models, and to register these to, and integrate them with, Crusta’s existing Mars topography model as they become available.
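To make the geometry behind that conversion concrete, here is a small sketch of the core step: turning a rectified stereo disparity map into 3D terrain points by triangulation. This is only an illustration under stated assumptions; the camera parameters below are made-up placeholders rather than Curiosity Navcam values, and the actual KeckCAVES pipeline (stereo matching, filtering, registration to Crusta) is far more involved.

```python
"""Illustrative sketch: depth from a rectified stereo pair via triangulation.
All camera parameters here are placeholder values, not real Navcam numbers."""
import numpy as np

def disparity_to_points(disparity, focal_px, baseline_m, cx, cy):
    """Convert a disparity map (in pixels) into 3D points in the left camera frame."""
    rows, cols = disparity.shape
    u, v = np.meshgrid(np.arange(cols), np.arange(rows))
    valid = disparity > 0                           # ignore unmatched pixels
    z = focal_px * baseline_m / disparity[valid]    # depth = f * B / d
    x = (u[valid] - cx) * z / focal_px              # back-project to camera X
    y = (v[valid] - cy) * z / focal_px              # back-project to camera Y
    return np.column_stack([x, y, z])

# Placeholder example: a synthetic 4x4 disparity map with constant disparity.
d = np.full((4, 4), 20.0)
pts = disparity_to_points(d, focal_px=1000.0, baseline_m=0.4, cx=2.0, cy=2.0)
print(pts.shape)  # (16, 3) terrain points
```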

We already tried this process with stereo imagery from the previous two Mars rovers, Spirit and Opportunity. We took the highest-resolution orbital topography data available, collected by the HiRISE camera, and merged it with the rover data, which is approximately 1000 times denser. The following figure shows the result:

The white arrow in panel A shows the location of the rover’s high-resolution data patch shown in panels B and C. In panel C, a stratum of rock — identified by its different color — was selected, and a plane was fit to the selected points (highlighted in green) to measure the stratum’s bedding angle.
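As a rough illustration of that last measurement (and not the LiDAR Viewer code itself), here is a short sketch that fits a plane to a set of selected 3D points by least squares and reports the dip angle relative to horizontal; the points are synthetic and the function name is my own.

```python
"""Sketch: least-squares plane fit (via SVD) and dip ("bedding") angle.
Synthetic data only; assumes z is the vertical axis."""
import numpy as np

def bedding_angle_deg(points):
    """points: (N, 3) array of x, y, z samples on a rock stratum."""
    centered = points - points.mean(axis=0)
    # The plane normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    # Dip angle = angle between the plane and horizontal
    #           = angle between the plane normal and the vertical axis.
    cos_dip = abs(normal[2]) / np.linalg.norm(normal)
    return np.degrees(np.arccos(cos_dip))

# Synthetic stratum dipping ~10 degrees, plus a little measurement noise.
rng = np.random.default_rng(0)
xy = rng.uniform(-1, 1, size=(200, 2))
z = np.tan(np.radians(10)) * xy[:, 0] + rng.normal(0, 0.002, 200)
print(round(bedding_angle_deg(np.column_stack([xy, z])), 1))  # ~10.0
```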

The above images were created with LiDAR Viewer, another KeckCAVES software package. LiDAR Viewer is used to visually analyze very large 3D point clouds, such as those resulting from laser scanning surveys, or, in this case, orbital and terrestrial stereo imagery.

The terrain data we expect from Curiosity’s stereo cameras will be even higher resolution than that. The end result will be an integrated global Martian topography model with local patches down to millimeter resolution, allowing a scientist in the CAVE to virtually pick up individual pebbles.