While I was in Santa Barbara recently to install a low-cost VR system, I also took the chance to visit the Allosphere. One of the folks behind the Allosphere is Tobias Höllerer, a computer science professor at UCSB who I’ve known for a number of years; on this visit, I also met JoAnn Kuchera-Morin, the director of the Allosphere Research Facility, and Matthew Wright, the media systems engineer.
Allosphere Hardware
The Allosphere is an audacious design for a VR environment: a sphere ten meters in diameter, completely illuminated by more than a dozen projectors. Visitors stand on a bridge crossing the sphere at the equator, five meters above the ground. While I did take my camera, I didn't manage to take any good pictures; Figure 1 gives a pretty good impression of what the whole affair looks like.
Why is this such a bold design? Besides the basic difficulty of building a 10m diameter front-projected sphere, the real challenge is to set it up as a proper immersive display environment. To recap: building a VR environment is not just throwing stereo images at a screen (although even that is easily screwed up), but ensuring that the geometry of the displayed images exactly matches the projections of the virtual 3D objects, as seen through the viewer’s eyes. In an immersive display with flat screens, this is relatively easy: all that’s needed is the exact position, orientation, and size of each screen in 3D space, and the positions of the viewer’s eyes in that same space.
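For flat screens, this boils down to one off-axis frustum per screen and per eye, along the lines of the well-known "generalized perspective projection" construction. Here is a minimal C++ sketch of that calculation; the names and setup are mine and not taken from Vrui or the Allosphere's software:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
	return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
static Vec3 normalize(Vec3 v) {
	double l = std::sqrt(dot(v, v));
	return {v.x/l, v.y/l, v.z/l};
}

struct Frustum { double left, right, bottom, top, zNear, zFar; };

// pa, pb, pc: lower-left, lower-right, upper-left screen corners in world space;
// eye: position of one of the viewer's eyes in the same space.
Frustum offAxisFrustum(Vec3 pa, Vec3 pb, Vec3 pc, Vec3 eye, double zNear, double zFar)
{
	Vec3 right = normalize(sub(pb, pa));       // screen's horizontal axis
	Vec3 up = normalize(sub(pc, pa));          // screen's vertical axis
	Vec3 normal = normalize(cross(right, up)); // points from the screen towards the viewer

	// Vectors from the eye to the screen corners:
	Vec3 va = sub(pa, eye), vb = sub(pb, eye), vc = sub(pc, eye);

	double dist = -dot(va, normal);            // perpendicular eye-to-screen distance

	Frustum f;
	f.left = dot(right, va)*zNear/dist;
	f.right = dot(right, vb)*zNear/dist;
	f.bottom = dot(up, va)*zNear/dist;
	f.top = dot(up, vc)*zNear/dist;
	f.zNear = zNear; f.zFar = zFar;
	return f; // feed into glFrustum(), once per eye and per screen, together with
	          // a model-view transform that rotates into the screen's basis and
	          // translates by the negative eye position
}
```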
Not so on a curved screen: here, the projection for an entire screen cannot be described by a single 4×4 matrix; instead, the 3D position of each pixel on the screen needs to be considered individually, and the color of each pixel in the left/right views needs to be determined by shooting a ray through it from the positions of the viewer's left and right eyes, respectively.
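To make that concrete, here is a hypothetical CPU-side sketch of the per-pixel idea; in a real system this work would happen on the GPU, and traceScene() is merely a stand-in for whatever produces the virtual scene:

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
struct Color { float r, g, b; };

// Stand-in for an actual renderer: returns the color seen along a ray.
Color traceScene(const Vec3& origin, const Vec3& direction)
{
	return Color{0.0f, 0.0f, 0.0f}; // a real scene intersection would go here
}

// pixelPos: per-pixel 3D positions from the calibration; eye: left or right eye position.
void renderCurvedScreen(const std::vector<Vec3>& pixelPos, const Vec3& eye,
                        std::vector<Color>& frame)
{
	for(std::size_t i = 0; i < pixelPos.size(); ++i)
	{
		// The view ray for this pixel simply goes from the eye through the
		// pixel's calibrated position on the sphere surface:
		Vec3 dir{pixelPos[i].x - eye.x, pixelPos[i].y - eye.y, pixelPos[i].z - eye.z};
		frame[i] = traceScene(eye, dir);
	}
}
```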
Another issue, which upon closer examination is very closely related, is that it is not possible to cover a sphere with rectangular projection images without either gaps or overlaps. In other words, one has to manage the projected images very carefully to ensure that the overall image appears seamless. Fortunately, there exists a very good method for that. UCI's Aditi Majumder has developed semi-automatic camera-based calibration methods that simultaneously correct geometric distortion, by calculating the position of every pixel of every projection in the same 3D space, and color-balance the entire display, by calculating overall gamut correction and per-pixel blending factors. In short, Dr. Majumder's methods deliver exactly the information needed to generate correct immersive stereo images.
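I don't know the internals of the Allosphere's calibration pipeline, but the photometric part of such a method typically ends up as per-pixel correction data applied at the very end of rendering. Here is a deliberately simplified sketch, with made-up names and a single gamma value standing in for a full per-channel gamut correction:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct CalibPixel
{
	float blend; // 0..1 cross-fade weight in projector overlap regions
	float gamma; // simplified per-pixel response correction (a real gamut
	             // correction would be per-channel and more involved)
};

// frame: one projector's image, 3 floats (RGB) per pixel; calib: same resolution.
// On real hardware this would run as a fragment shader, not on the CPU.
void applyPhotometricCalibration(std::vector<float>& frame,
                                 const std::vector<CalibPixel>& calib)
{
	for(std::size_t i = 0; i < calib.size(); ++i)
		for(int c = 0; c < 3; ++c)
		{
			float v = frame[i*3 + c];
			v = std::pow(v, calib[i].gamma); // match this projector's response curve
			v *= calib[i].blend;             // attenuate smoothly where projections overlap
			frame[i*3 + c] = v;
		}
}
```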
There is one caveat, though. 3D pixel positions are sufficient for ray tracing, but that's not how 99% of graphics and visualization applications work. Enabling OpenGL-based graphics applications to render properly requires significant additional work. There are two major approaches: one can modify the front-end of the graphics pipeline, for example by subdividing polygons in a geometry shader and then applying per-vertex "ray tracing" in a vertex shader, or one can modify the back-end by first rendering into a virtual "flat" intermediate buffer, and then using a fragment shader in a second rendering pass to warp the flat image to match the actual curved projection. The drawback of the first approach is that it typically requires changing an application's rendering code (especially if the application itself uses any form of shaders) and has a significant performance impact; the second method's drawback is that it involves resampling one pixel image into another of similar size, which introduces sampling artifacts.
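To illustrate the second approach, here is a rough sketch of the warping lookup: the scene is first rendered into a flat intermediate image using an ordinary 4×4 projection, and a second pass then finds, for every physical display pixel, where that pixel's calibrated 3D position lands in the intermediate image. This is a CPU-side stand-in for what would really be a fragment shader, and all names are illustrative:

```cpp
struct Vec3 { double x, y, z; };
struct Vec4 { double x, y, z, w; };

// Apply a row-major 4x4 projection*modelview matrix to a 3D point:
Vec4 transform(const double m[16], const Vec3& p)
{
	return Vec4{m[0]*p.x + m[1]*p.y + m[2]*p.z + m[3],
	            m[4]*p.x + m[5]*p.y + m[6]*p.z + m[7],
	            m[8]*p.x + m[9]*p.y + m[10]*p.z + m[11],
	            m[12]*p.x + m[13]*p.y + m[14]*p.z + m[15]};
}

// For one display pixel with calibrated 3D position pixelPos, find the texture
// coordinate (u, v) at which to sample the flat intermediate image. A fragment
// shader would do exactly this per pixel and then read the intermediate texture.
bool warpLookup(const double pm[16], const Vec3& pixelPos, double& u, double& v)
{
	Vec4 clip = transform(pm, pixelPos);
	if(clip.w <= 0.0)
		return false;                // pixel is behind the virtual image plane
	u = clip.x/clip.w*0.5 + 0.5;     // NDC -> [0, 1] texture coordinates
	v = clip.y/clip.w*0.5 + 0.5;
	return u >= 0.0 && u <= 1.0 && v >= 0.0 && v <= 1.0;
}
```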
That said: what are the benefits of a spherical display that are worth the extra effort? Besides proving that it can indeed be done, I can think of two: after proper calibration, the display can appear seamless (remaining seams are sub-pixel in size), and the large size of the sphere gives the immersive display a larger "sweet spot," meaning it can support immersion for more than one person, or even in the absence of head tracking (as in the Allosphere at this point).
While the Allosphere is still under construction, I’m going to give my opinion of the system as it is right now. Geometric calibration is already very good — much better than I expected, in fact. There are still a few visible borders showing up as dotted lines formed by individual pixels mistreated by the correction algorithm, but those are probably due to software bugs and fixable. Color correction is also very good; while the screens still show visible differences, especially in black level when using point renderings on a black background, like in a planetarium application, I would say those would not be noticeable in actual use unless one is looking for them.
As for the big question: is it so much better than a standard CAVE that it's worth the extra effort? I'm heavily biased, obviously, but I'm not sure. The physical screen gaps caused by the CAVE construction are already only around one pixel in size (at least in ours; Mechdyne has a patent on that). After proper calibration assuming flat projection (which every CAVE should have, but doesn't), the only remaining source of misalignment is non-linear projection distortion, such as bowtie or pincushion — but those effects are small, at least with our projectors. One could even remove them using the same calibration method as in the Allosphere, but that's not a fair comparison, because one would pay the same graphics penalty. On the other hand, the larger "sweet spot" of the Allosphere is a clear benefit. Multiple users in a CAVE have to stick closely together to avoid major distortion; less so in the larger sphere.
I see one additional drawback of the sphere design, but it's heavily dependent on how the system is used. Our style of working is based on users interacting with virtual objects using their hands, and when looking at their hands, users look down. In the CAVE there's a projected floor that shows the virtual objects users are touching; in the Allosphere there's a walkway without projection. That means our interaction style would not work as well in the Allosphere; it would lead to Gorilla Arm unless one stands right at the railing and works above the lower hemisphere.
In the final analysis, I would say that the Allosphere is extremely impressive as a technological achievement, but I wouldn’t go as far as to suggest ripping out existing VR systems and installing spheres instead. Caveat: it’s still under construction; the lower hemisphere is not yet illuminated, and there were computer problems such as crashes, probably driver issues. But I believe the overall conclusions stand. One funny observation: the Allosphere is extremely impressive when it’s turned off, because one is standing on a suspended walkway! Inside a giant sphere! Compare that to a CAVE, which, when off, is just a boring box. When it’s turned on, on the other hand, the Allosphere looks exactly like a CAVE, by definition.
Software
Given that getting a spherical immersive display to work at all is such a huge undertaking, it's no surprise that the suite of Allosphere applications is still smaller than what's available for "vanilla" VR systems. But the applications I did see were very interesting, because they exhibit a quite different approach to interaction than that taken in KeckCAVES applications. Tech demos aside, I saw a virtual globe, which was a neat effect because the globe's surface was mapped 1:1 to the physical sphere, and a planetarium-like application based on the same code, showing patterns in celestial observations instead of actual stars. Neither seemed particularly interactive; Matthew was using a gamepad to rotate the globe or the sky sphere, but I think that was it. It's possible that these were primarily tools to judge geometry and color calibration.
The first real application I saw was a visualization of an fMRI scan of a brain, using animated tracer particles and autonomous agents to visualize neural pathways "lighting up" depending on the subject's thoughts. This was exactly the kind of data we're working with (or at least our med school collaborators are), and the difference in approach was interesting. In our application, users would directly manipulate the 3D scan by picking it up and moving it around with their hands, slicing it, extracting contour surfaces, taking measurements, etc. Here, the main input device was again a gamepad, with indirect controls to navigate (in this case, full translation and rotation, but I'm not sure about scaling), and comparatively abstract means to affect the visualization. Which, by the way, consisted of a (wireframe) contour surface, but that seemed to be static geometry. Anyway, the interactivity here was in controlling the agents tracing out neural pathways. I did not get a detailed explanation, but there was a button or gesture to "call" the agents, drawn as rotating colored boxes, to the user's position, after which they would return into the 3D data and locate features of interest automatically, driven by the fMRI data themselves.
In short, the interactivity was in controlling parameters that in turn controlled the visualization, instead of controlling the visualization directly. From talking to JoAnn, I got another example of a high-dimensional simulation, where — and I’m paraphrasing from memory — multiple users could control the simulation as a group by each one individually controlling a subset of parameters, each directly mapped to an input device axis. I did not see this particular application, but it supports my impression that interaction in the Allosphere is based not on natural interaction, in the sense of mapping 3D motions to equivalent 3D interactions (pick-up-and-move to navigate, for example), but instead on mapping spatial gestures and input device manipulations to abstract control dimensions.
The second application was an art installation, where viewers are mapped into an underwater environment with simulated critters via one or more Kinect cameras. Concretely, users' bodies are point-sampled by the Kinect(s), and the 3D points are turned into food particles for the lower rungs of the simulated food chain. Those critters then swim towards the users' outlines, and larger predators slowly follow to eat them. All in all, a very cool (and very pretty) virtual art piece, with natural interaction to boot, because the users' motions are directly mapped to their food particle clouds, and quick movements even induce flow in the virtual water. But obviously this is a one-off application with custom interaction methods that have little in common with other applications, and not an instance of an overall paradigm.
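I don't know how the installation is implemented, but the point-sampling step presumably amounts to back-projecting the Kinect's depth image into 3D points. A hypothetical sketch, with the camera intrinsics as placeholders and without the final transformation into the sphere's coordinate space:

```cpp
#include <cstdint>
#include <vector>

struct Point3 { float x, y, z; };

// depth: row-major Kinect depth image in millimeters, 0 = no reading;
// fx, fy, cx, cy: depth camera intrinsics; stride: keep only every n-th pixel.
std::vector<Point3> depthToFoodParticles(const std::vector<std::uint16_t>& depth,
                                         int width, int height,
                                         float fx, float fy, float cx, float cy,
                                         int stride)
{
	std::vector<Point3> particles;
	for(int y = 0; y < height; y += stride)
		for(int x = 0; x < width; x += stride)
		{
			std::uint16_t d = depth[y*width + x];
			if(d == 0)
				continue;           // no depth reading at this pixel
			float z = d*0.001f;     // millimeters -> meters
			// Back-project through the pinhole model into camera space; a real
			// system would still transform these points into the sphere's space:
			particles.push_back(Point3{(x - cx)*z/fx, (y - cy)*z/fy, z});
		}
	return particles;
}
```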
This was a short and short-notice visit (I am very grateful that JoAnn, Matthew, and Tobias took the time), and there was no time to explore these approaches in more detail. I'm looking forward to a much longer discussion of the different approaches in the future, to find out how they compare, and how they might complement each other.
Now here’s the big question I’ve been asking myself: would Vrui, and Vrui applications, work directly in the Allosphere? The answer is yes, in principle — the system is driven by a Linux cluster, with quad-buffer stereo — but the issue is that Vrui does not properly support non-planar screens right now. The only currently possible work-around would be to virtually split each projection into several tiles, and approximate each tile with a plane. The calibration parameters for these tiles could easily be derived from the Allosphere’s per-pixel calibration data, and the remaining trade-off is between accuracy (number of tiles per projection) and performance (each tile must be rendered individually). Post-rendering warping would fit into the overall architecture, but it’s not implemented yet. Maybe I’ll have a chance to try either one of these approaches at some point.