Intel’s “perceptual computing” initiative

I went to the Sacramento Hacker Lab last night, to see a presentation by Intel about their soon-to-be-released “perceptual computing” hardware and software. Basically, this is Intel’s answer to the Kinect: a combined color and depth camera with noise- and echo-cancelling microphones, and an integrated SDK giving access to derived head tracking, finger tracking, and voice recording data.

Figure 1: What perceptual computing might look like at some point in the future, according to the overactive imaginations of Intel marketing people. Original image name: “Security Force Field.jpg” Oh, sure.

First, the hardware. The actual device is much smaller than the Kinect, about 4-5″ wide; it's only a bit wider than a regular webcam. Initially I was worried about the short distance between the camera's "eyes," because the resolution of triangulation-based depth reconstruction, as done by the Kinect using structured light, is proportional to the stereo baseline. But it appears (the Intel guys there were mute about technical details of any kind) that this camera is based on time-of-flight. That means two things: for one, stereo separation doesn't matter because there is no stereo; second, depth resolution is independent of distance, instead of degrading with the square of distance as in triangulation-based approaches; and third (there's always a third), the depth camera has only one "eye," not two. I really should have noticed that last one earlier.
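The scaling difference is easy to see in a back-of-the-envelope calculation. The sketch below is illustrative only: the baseline, focal length, disparity error, and time-of-flight timing error are made-up round numbers, not specs of the Kinect or of Intel's camera.

```python
# Sketch: how depth precision scales with distance for the two approaches.
# All parameter values are illustrative assumptions, not device specs.

def triangulation_depth_error(z, baseline=0.075, focal_px=580.0, disparity_err_px=0.1):
    """Structured-light / stereo triangulation: from disparity d = f*b/z,
    a disparity error dd causes a depth error dz ~ z^2 / (f*b) * dd,
    i.e. precision degrades with the square of distance."""
    return z**2 / (focal_px * baseline) * disparity_err_px

def tof_depth_error(z, timing_err_m=0.005):
    """Time-of-flight: each pixel measures round-trip time directly,
    so to first order the depth error does not grow with distance."""
    return timing_err_m

for z in (0.5, 1.0, 2.0, 4.0):
    print(f"z={z:4.1f} m  triangulation: {triangulation_depth_error(z)*1000:6.1f} mm"
          f"   TOF: {tof_depth_error(z)*1000:5.1f} mm")
```

Doubling the distance quadruples the triangulation error while leaving the time-of-flight error unchanged, which is why a tiny-baseline (or, here, zero-baseline) TOF camera isn't handicapped the way a tiny-baseline stereo camera would be.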

There was no mention at all of the depth camera's pixel resolution, but given that time-of-flight depth cameras return a measurement for every pixel, not just for every 25th pixel or so like the Kinect, the effective resolution, in x, y, and z, of this small device might turn out to be significantly better than the Kinect's. We'll have to see. While it is specifically aimed at desktop-range tracking, this device (which doesn't appear to have an actual name yet) might work well even for longer-range applications.
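To put a rough number on that: taking the "every 25th pixel" figure above at face value, and assuming a QVGA (320×240) time-of-flight sensor (a commenter below reports exactly that), the sample counts per frame compare as follows.

```python
# Back-of-the-envelope: independent depth samples per frame.
# Assumes a QVGA (320x240) TOF sensor, and takes the rough
# "one measurement per ~25 pixels" Kinect figure at face value.

tof_samples = 320 * 240             # one true measurement per pixel
kinect_samples = (640 * 480) // 25  # sparse structured-light samples

print(tof_samples, kinect_samples)  # 76800 vs. 12288
```

Under these (admittedly crude) assumptions, the smaller sensor would deliver several times as many independent depth samples per frame as the Kinect.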

On the software side, there's not much to tell yet. On the upside, Intel's SDK is comprehensive: it covers all the aspects one would need to create a holographic display with 3D interaction (head and finger/hand tracking; just add a stereo monitor, since head tracking without stereo is useless). The voice recognition is apparently based on Dragon technology, which bodes well. On the downside, the SDK demos Intel showed were lame. One would think that with the number of people Intel has had working on this for the last two years (when I first talked to Intel about their "perceptual computing" plans), and the number of Kinect applications from which to crib, they'd have come up with something impressive. Coincidentally, they showed a virtual clay modeling demo less than two hours after I had uploaded a video of the same to YouTube.

All in all, it would be really nice to integrate the tracking and voice components of the SDK into my Vrui VR development toolkit as virtual input device drivers, to immediately make the entire suite of Vrui applications available, in full glorious holographic 3D, for computers equipped with the camera. Alas, the SDK is Windows-only (and Vrui is everything except Windows). At least the Intel guys weren't immediately dismissive about the possibility of a future Linux SDK, but Windows is clearly their number one priority for the foreseeable future.

I should also mention that there is a developer competition for the perceptual computing SDK going on right now. That was the reason for yesterday's presentation; the Hacker Lab will host a hackathon next Friday and Saturday, with prizes for the most impressive demos or applications. Yesterday was a pre-hack meeting, where Intel handed out free cameras to teams and developers who had signed up. I think there's still a chance to sign up, so if you live in the Sacramento area, I recommend going for it. The main prize is $7k, nothing to scoff at. I briefly considered signing up myself, but it would have been a highly unfair fight: I haven't done any Windows development in almost 20 years, and don't even know how to open a window in the Windows API, let alone how to put 3D graphics into it. But combining, say, the Nanotech Construction Kit with the SDK would be a very cool setup indeed.

Oh, and how does this device compare to the Leap Motion? Well, first off, unlike the Leap, it actually exists at this time; in fact, they were handing them out like candy yesterday. Size-wise, it's pretty comparable. Like the Leap, this device is supposed to be able to track fingers reliably, but the demos I saw didn't really show off that capability, or give a good indication of how well finger tracking will work in practice. But in addition, this device has a built-in color camera, is able to capture colored 3D video (and exposes it via the SDK's API), and has a built-in microphone with voice recognition. So it holds up pretty well, I'd say.

3 thoughts on “Intel’s “perceptual computing” initiative”

  1. The TOF sensor in the camera is a QVGA sensor from SoftKinetic/Texas Instruments and gives an effective resolution of 4x that of the Kinect.

    Demos/content will probably be more robust next quarter, and Intel will be announcing the first round of contest winners at GDC next week. Intel hasn’t gotten around to it, but an independent developer is keeping track of content uploaded on YouTube here: http://grasshoppernetwork.com/showthread.php?tid=1004 Most of these are just OK, but there is an impressive clay modeling video put up over a month ago. The more impressive content right now is from the Ultimate Coder Challenge:
    http://software.intel.com/en-us/ultimatecoder2/

  2. Pingback: The Kinect 2.0 | Doc-Ok.org