Monday, September 21, 2009

Prof. V. Michael Bove, Jr. of MIT Media Lab talks Holographic Video!

Hi, Michael here with a glimpse at what many people consider to be the holy grail of display and entertainment technology: holographic video. Prof. V. Michael Bove, Jr. (of the Object Based Media Lab at the MIT Media Lab) and his team are hard at work creating an inexpensive desktop monitor that displays holographic video images in real time.

Bove, Jr.'s enthusiasm for holovideo is contagious. As he leads me around his Cambridge, Massachusetts lab, bustling with students and packed with all sorts of devices and pieces of future technology, we arrive at a circa-1994 holography table. "We are replacing pretty much everything from here down, except for one moving mirror, with this chip," Bove, Jr. tells me. "That 19-inch rack is being replaced by four small boards like this. This gas laser is being replaced with a tiny semiconductor laser. In a lab around the corner, we are working on packaging everything up. The new one is 440 scan lines. We're doing both red and green. One of the problems is that once the display is boxed up, it's just going to be a black box with a window on the front; it's not going to be Frankensteinian like that [holography table]. So there's something sexy about a holovideo display that makes noise and has high voltage (laughs). It's like 1920s TV! Hey, there's mechanical stuff in there! There's motors! There's galvanometers! At the same time, I hate working in there, since it's so scary in there!"

I didn't get to see the system in action, but from what I learned during my visit to MIT, affordable holographic video will be arriving sooner than you might expect. Read below for details on the innovative tech that has gone into shrinking a dining room table-sized system to something you can fit into your pocket, the exciting possibilities that holovideo enables, and more.

Prof. Bove, Jr. explains how his group got involved with holovideo, and discusses the advantages of holovideo over autostereo technologies:

In about 1989 / 1990 [Stephen Benton’s team] created the world’s first holographic video display, which was called the Mark 1. My group got involved because we were doing hardware and software for processing video and graphics in real time. So we ended up getting involved in some of the electronics and software to drive the electro-optics that Steve’s crew were building. In the mid-90’s they built a 2nd generation display, the Mark 2 display. We were continuing to be involved in the development of that. And there are lots of interesting things that you can do with a holographic video display. It’s not just that it’s no glasses – autostereoscopic – but it’s also that you can, like with any hologram, you can make the object be in front of the physical display – subject to the fact that you’re vignetted by the frame of the screen so you can’t have objects hang out beyond the edges. But subject to that, you can make objects be out in front, which means that you could do haptic interfaces, gestural interfaces, interact directly with things. We had a student do a PhD dissertation here about 8 years ago on haptic interfaces to objects displayed in holovideo, which most volumetric displays don’t let you do, because the view volume is inside a box, or under a dome, or something like that. So we were very interested in the interaction possibilities.

On his decision to turn holovideo into a consumer technology:

When Steve passed away we decided to move the whole holovideo project up into my group because we had been working on it for more than 10 years with Steve. We decided to take a slightly different spin on it. At the time – and this was roughly in the 2004-2005 time frame – I noted that there was going to be a push for 3D in the home, and I said, “OK, let’s try to make holographic video into a consumer technology.” So, what does that mean? It means, obviously, that you need a display technology that’s inexpensive – a few hundred dollars. You need content. You need processing. And you need a distribution mechanism. So what we predicted was that there was going to be a lot of stereo, and even multiview stereo content already available, so we wouldn’t need to deal with that. And there were going to be lots of 3D models. Every video game is represented as a 3D model. And if you can turn a 3D model into a hologram on the receiving end in real time, then the content and distribution is already taken care of – although there isn’t really a standard for transmitting 3D, people who have done online 3D games have demonstrated that it is practical to distribute 3D models in real time of reasonably high quality.

So we decided to concentrate on the processing and display technology. In terms of the processing, we have since about 2004 been using off-the-shelf GPUs where you take a 3D model, and instead of rendering it as an image you render it as a diffraction pattern for the hologram. Our initial results were between a half and 2 frames per second, and we’re now in the 15 to 30 frame per second range for standard definition TV resolution.

Michael: Wow, that’s impressive. I had heard about the University of Arizona hologram, which can show a new image every few minutes...

Prof. Bove, Jr.: What they have is an electrically erasable photopolymer. Which means that they have to write an image, using some technology, onto the hologram. Unlike other photopolymer holograms, they can erase theirs and rewrite it.

So there are two different aspects to the refresh. One is the physical display. Like any display, it has an inherent refresh rate – and all the electronic displays we've built around here for years have had a 30 Hz refresh rate – which isn't quite fast enough, but it's pretty good. We'd like to get it to 60. We can, but you give up something in return. So we'd rather have higher resolution at 30 than lower resolution at 60. The other aspect is the computation; the display can be refreshing at full speed, but you might not be changing what it's displaying. So when I say we have 15 to 30 Hz, I mean that the GPU can generate 15 to 30 holograms per second. And that's for scenes that are kind of PS2 quality – they're not super, super high quality graphics – but it's pretty good.
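The core trick Bove describes – rendering a diffraction pattern instead of an image – can be sketched on the CPU with the classic point-source approach: each scene point contributes a spherical wave, which interferes with a reference beam to produce the fringe pattern for one scan line. This is a minimal illustrative sketch, not the group's actual shader code; the wavelength, pixel pitch, and reference angle are assumptions, and a real system would run this as a GPU shader over millions of pixels per line.

```python
import numpy as np

WAVELENGTH = 532e-9           # green laser, in metres (illustrative)
PIXEL_PITCH = 0.5e-6          # hologram pixel pitch, on the order of the wavelength
K = 2 * np.pi / WAVELENGTH    # wavenumber

def scanline_hologram(points, n_pixels):
    """Fringe pattern for one horizontal scan line from 3D point sources.

    points: iterable of (x, z) positions in metres, z > 0 behind the display plane.
    Returns a real-valued pattern normalised to [0, 1].
    """
    x = (np.arange(n_pixels) - n_pixels / 2) * PIXEL_PITCH
    field = np.zeros(n_pixels, dtype=complex)
    for px, pz in points:
        r = np.sqrt((x - px) ** 2 + pz ** 2)    # distance from each pixel to the point
        field += np.exp(1j * K * r) / r         # spherical wave emitted by the point
    # Off-axis plane-wave reference beam; interference gives the recordable fringes.
    reference = np.exp(1j * K * x * np.sin(np.radians(10)))
    fringes = np.abs(field + reference) ** 2
    return (fringes - fringes.min()) / (np.ptp(fringes) + 1e-12)

holo = scanline_hologram([(0.0, 0.05), (1e-4, 0.08)], n_pixels=4096)
```

Because every pixel's value is an independent sum over scene points, the computation maps naturally onto the per-fragment parallelism of a GPU shader, which is why commodity graphics hardware fits the problem so well.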

On how to make the display technology:

Prof. Bove, Jr.: So the issue is: how do you make the display technology? Because we’re starting with something that, when Steve’s group was working on it, was dining room table size, around $50,000 worth of stuff, and a whole 19-inch rack full of electronics, and what we wanted to do was package all that stuff in something like a CRT monitor. Unfortunately, the way we are doing it right now you can’t make it flat. You can make it relatively shallow, but you can’t make it totally flat. So our technology is going to look like a CRT when it’s packaged up. But, it’ll then just have one or two DVI connectors or HDMI connectors going to your PC or game console as the interconnect.

Now, there’s a catch to all of this. The catch is, with most display technologies, you have a fixed-size image. You know, HD is 1920 by 1080. Well, when you make the screen bigger, you just make the pixels bigger. So it’s still 1920 by 1080 on a big screen or a little screen. With a hologram, because you’re using the physics of diffraction to steer the light, the pixels have to stay the same size, and you need more of them as the display gets bigger.

Michael: Even 4k panels are very expensive. So is packing more pixels a cost issue, a technology issue, or both?

Prof. Bove, Jr.: Essentially, realistically, you need pixels that are about the size of the wavelength of light. So that means for every meter of screen width, you need about 2 million pixels [on each horizontal line]. Now, it’s further the case that you need the same pixel pitch vertically to do a full parallax hologram. And that’s really hard to do, because that's A LOT of pixels. You’re talking about rasters [data structures] that are about 2 million by a million, let's say. And just moving that many pixels around – just building the interconnect for something like that is murderous. It's doable, but it’s not something that’s going to be scaled down to consumer costs any time in the immediate future. So what we and some others have been doing with horizontal-parallax-only holograms is to say that it has 2 million pixels per scan line per meter, but it’s the same vertical resolution as an SDTV or HDTV. What that means is that you don’t have vertical parallax. So, if you move your head left and right you can look around objects. If you move your head up and down, the scene doesn’t change.

Michael: Which you usually don’t do anyway when you’re sitting, watching something.

Prof. Bove, Jr.: Right. And given that your eyes are arranged side-by-side, unless you are lying on the couch sideways, you don't really notice the lack of vertical parallax. Now, for things like medical visualization, or certain kinds of engineering design, or CAD [computer aided design], you really want to be able to look over objects and not just around them, so that might not be a suitable trick to use for those high-end applications. But for low-end applications, and for consumer applications, it means that you are actually down to something practical. What do you mean by practical? You mean that you could, with the kinds of interconnects we have available and the kinds of GPUs we have available, at least do a small – by small I mean postcard-sized – HDTV display driven by one GPU, a normal GPU right in your PC, over the cables that are coming out of your PC. And so we're doing proof-of-concept work.

Think of it as post-card-sized SDTV resolution. The actual number of pixels in there is a whole lot more than that. So imagine SDTV with maybe 25 to 30 degrees of look-around. And you can do that on the GPU you’ve already got.
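The arithmetic behind the pixel counts Bove quotes is easy to check. A quick back-of-the-envelope script (all figures rounded and purely illustrative) shows why full parallax is so daunting and why horizontal-parallax-only is tractable:

```python
# Pixel pitch must be roughly the wavelength of visible light.
pixel_pitch_m = 0.5e-6                          # 0.5 micron
pixels_per_metre = round(1 / pixel_pitch_m)     # ~2 million per metre per scan line

# Full parallax: the same pitch vertically, for a 1 m wide x 0.5 m tall screen.
full_parallax_pixels = pixels_per_metre * (pixels_per_metre // 2)   # ~2M x 1M raster

# Horizontal parallax only: ordinary SDTV line count vertically.
hpo_pixels = pixels_per_metre * 480

print(f"full parallax: {full_parallax_pixels:.1e} pixels")
print(f"HPO at SDTV:   {hpo_pixels:.1e} pixels")
```

The full-parallax raster lands around 10^12 pixels per frame, versus roughly 10^9 for horizontal-parallax-only – a factor of about a thousand, which is the difference between "murderous interconnect" and something a single GPU can feed.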

Michael: That’s amazing. I can see that being implemented in a next generation video game console or something similar.

Prof. Bove, Jr.: Sure. Given that we can do that – we thought the computation was going to be a lot harder. And two researchers in my group – Dr. Quinn Smithwick and graduate student Jim Barabas – have really managed to take advantage of the processing pipelines that are already out there. So we're doing this – we did it in OpenGL, we're now using Cg – but we're just doing shader programming. So it's just the same kinds of optimizations that gamers are using.

One of the things that we're doing: we have to make the scan lines really long, but GPUs tend to have a limit on how long a scan line can be. It's hardcoded in somewhere, both in the hardware and in the drivers. So we have to make the display treat multiple frame-buffer scan lines as one physical scan line. The higher-end [off-the-shelf] GPUs let you basically make the rasters bigger, so you want to use one of those, because your frame buffer is just larger.
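The scan-line stitching Bove describes can be sketched in a few lines: the GPU sees a raster of width-limited rows, and the display electronics read several consecutive rows back-to-back as one physical hologram line. The limits and sizes below are illustrative stand-ins, not the lab's actual numbers.

```python
import numpy as np

GPU_MAX_WIDTH = 16384      # typical GPU texture/raster width limit (illustrative)
HOLO_LINE_PIXELS = 2**17   # pixels in one physical hologram scan line (illustrative)

# One physical hologram line spans several frame-buffer rows.
rows_per_line = HOLO_LINE_PIXELS // GPU_MAX_WIDTH   # 8 rows here

# Frame buffer for 480 physical hologram lines.
frame_buffer = np.zeros((480 * rows_per_line, GPU_MAX_WIDTH), dtype=np.uint8)

def physical_scanline(fb, line):
    """Concatenate the frame-buffer rows that make up one hologram scan line."""
    rows = fb[line * rows_per_line:(line + 1) * rows_per_line]
    return rows.reshape(-1)   # rows joined end to end into one long line

assert physical_scanline(frame_buffer, 0).size == HOLO_LINE_PIXELS
```

Nothing per-pixel changes in this remapping; it is purely an addressing convention between the GPU's raster and the display's scan-out order.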

But it's not really all that magical, what we're doing, in terms of the processing. It turns out that for various coincidental reasons, the kinds of things that people – ATI, NVIDIA – are optimizing GPUs for in regular graphics rendering turn out to be the same kinds of things we need to do in generating holograms. So that's all good. So the display itself then poses a problem, because you need something that has the ability to make *extremely* small pixels. And there are a variety of ways of doing that. People have taken a variety of approaches. One of the things you can do – and some people in Japan whom we know quite well have done this – is take the chips for HD projectors or the chips for HD electronic viewfinders – which have really small pixels – and tile a whole bunch of them together.

And you can imagine that the wiring of something like that has to be monstrous. You've got a whole bunch of HDTV rasters in something that's the size of a paperback book. The other thing is just to use a very, very different kind of technology. One of my grad students, Dan Smalley, is making chips, and they don't actually look like much, but there's a kind of chip that is in, for example, most wireless phones and other things, called a surface acoustic wave chip, or SAW chip, and they are used as filters for very high frequencies. The way they work is that if you have a material that is piezoelectric, you can turn an electrical signal into, effectively, a sound wave that travels across its surface, and for various reasons you can make very cheap and efficient filters that way. Well, there is a similar phenomenon – if you have a material like that that's transparent, you can make a chip that's called a "guided wave device," where you turn the electrical signal carrying the diffraction pattern into a sound wave that travels across the surface. Now it's a very funny kind of sound wave, because it's a sound wave at a gigahertz. But it's a sound wave. And if you then run monochromatic light through the material, the signal going across the surface diffracts it.

So, this is a particularly good match to horizontal-parallax-only holography, because effectively what you do is you shine a laser in here and you get out one whole scan line of a hologram. So your display basically requires just a vertical scanner to scan out, say, a rotating mirror which is moving pretty slowly to scan out all the scan lines. And then you just *look* at this thing – and the scan line diverges out when it comes out of here, so this device can be a lot smaller than the physical scan line. So, these are cheap and we can make them here on campus in a lab across the street.

So this is what we have been pursuing over about the last 3 and a half years: how to make these into a commercial display. And there's one other catch. The catch is that the idea of using diffraction to make a 2D display actually goes back to the 1930s. There was a company called Scophony. When CRTs were still expensive and unreliable, Scophony was making diffractive 2D projection TVs. The problem with diffraction is that the sound waves travelling across the surface of this are travelling at the speed of sound, so the *picture* is travelling horizontally at the speed of sound. And up until now, to do something like this you needed a rotating mirror going in the opposite direction at the same speed so that you get a stationary picture. My student Dan has worked out a way to [get rid of the mirror]. That mirror is moving really, really fast, because it's moving at the horizontal scan rate, not the vertical scan rate. He's figured out a way to make the system so you don't need the mirror and you've still got a stationary image.

Michael: The very idea of a refreshable hologram is very new. Most people still think of bulky tables and still images. Holovideo would strike many people as almost like magic.

Prof. Bove, Jr.: Remind your readers that holographic video is *exactly* the same as any other kind of hologram in that it's something you look toward. You can't make the light go across the room like Princess Leia. Photons do not want to travel halfway across the room and suddenly decide to make a left turn toward your eye, so you need to look toward the display, but objects can be out in front of it. And that's one of the reasons for wanting to scale up holographic video displays: obviously, as the frame gets bigger, you can have objects seemingly hang out farther in front. And of course you've got a view volume that goes back into the screen as well. So, now that we can actually render these in real time, we can do a lot of the stuff we've talked about for years – like take advantage of the 3D volume of interactive video games.

If you think about it, you don't just have screen space the way you do in a normal video game – you have a view volume that's pretty deep. And so you might imagine something as simple as: you're playing tennis and the ball's coming toward you, AND you have force feedback in the paddle. When the ball hits the paddle, you can feel it. So we're just starting to prototype some games. I don't have anything I can show you along any of those lines yet; we'll probably be showing those off publicly sometime next spring.

The notion of thinking about a volume in which you can interact is something that people really need to start thinking about seriously. I mean it’s hard to do that just with stereoscopic 3D, for a lot of reasons. Including the fact that with stereoscopic 3D you don’t have motion parallax. So you can’t move your head to look around the thing in the front to see the thing in the back. And so that quickly limits your options for laying things out in 3D space.

Michael: Head tracking and motion tracking makes more sense for home use. It seemingly wouldn’t be effective in a large theater.

Prof. Bove, Jr.: With this kind of thing, you don't have to track heads. There is a company that is prototyping holovideo displays called SeeReal, a startup out of Germany. They are solving some of the computation and other problems of the display by tracking your eyes. So they have a camera looking out at you that tracks your eyes, and they only generate the part of the hologram that you could see where your eyes are.

Michael: That’s fascinating. It sounds like it has a lot of potential.

Prof. Bove, Jr.: It's very interesting. And they've produced some lovely demos. But they're not looking to manufacture this technology, from what I can tell. My understanding is that they are looking to license it. And they are making several other things as well. We, on the other hand, don't want to go through all that trouble. We just want to generate everything everywhere. That means we need to do more computation, but we have a little less complication in the overall system design. So when I say motion parallax, I'm just saying it's free space. You move your head around and you can look around things [because] it's generating all those parallax views all the time. But you also gain some other things: focus cues, which you can't use in a normal stereoscopic display, because there your eyes are converged at one distance and focused at a different distance. In a holographic display, since you're actually creating wavefronts of light, you can have the light diverge from any point.

Michael: So you can refocus your eyes and avoid eyestrain.

Prof. Bove, Jr.: That's right. So you can have an object in the foreground – if you have enough depth – and you really can focus your eyes on the thing in the background, then focus your eyes on the thing in the foreground. Which adds some computational complexity. We are just starting to explore some of the perceptual and user interface issues, and how much you can get away with on that. But the fact that you do have focus and parallax – as with any hologram – makes viewing these things just more like looking at an object.

Michael: I’m meeting Prof. Raskar later [read our interview with him here]. He gives the example of a hologram of a flower sitting next to a regular flower, and how the hologram wouldn't respond to outside lighting.

Prof. Bove, Jr.: When you're making a synthetic hologram, you can relight it all you want. If you have, particularly, a stationary hologram of a flower, and you have the [real] flower, if the sun moves overhead, the flower's shadow is going to move, and the illumination is going to change, and the hologram is going to look the same. However, if you're computing the hologram on the fly, you can change it. You can put the lighting wherever you want, you can make a synthetic focal length of whatever you want, you can make a synthetic aperture. You can play with depth of field, you can play with all kinds of stuff. Now, in order to make it actually respond to the sun moving overhead, you obviously need some kind of sensor that is going to detect the sun's direction – a camera or something – and then re-render the hologram to do that.

But that's a characteristic of static holograms. You have to be computing the hologram on the fly. A lot of what has been done in holovideo over the years is that you compute a single synthetic hologram, store those pixel values in a very high resolution frame buffer, and then just display that image. You can't interact with it. It is not changing dynamically. Indeed, at least the first 10 years of holovideo around here [MIT] were largely precomputation. Gradually, in the late 1990s, we were getting away from doing total precomputation. But even doing that with specialized hardware, which we were doing at the time – it was just a whole lot of work. So the fact that GPUs have developed in a direction that allows them to do the kinds of computations we need to do, and that there are other market forces pushing GPUs to get faster and higher resolution: that's good news for us.

One of the things that is particularly interesting to us is the fact that a lot of the standardization activities in the regular stereoscopic space are potentially of service to holovideo as well. In particular, now that there's a multiview extension to H.264/AVC, that means you can send a lot of parallax views, and you can send them efficiently. And there are codecs for dealing with it. And there's a standard for dealing with it. So that means somebody can go out with a camera rig – just a whole bunch of webcams – and shoot for an integral [integral imaging] screen, or just a lenticular or parallax-barrier autostereo screen.

A holovideo display is just sort of a superset of all of those other parallax displays. So we can take those multiple parallax views, and in real time put them on a holovideo display. There’s a point in our processing pipeline where if you’ve got a bunch of parallax views you can stick them in, and then do the computation downstream to make each parallax view come out at a particular direction. So if there is multiview content floating around, that’s perfect!

Michael: What are your optimistic and pessimistic guesses for when holovideo will be on the market for consumers?

Prof. Bove, Jr.: I would say there will probably be some kind of diffractive, holovideo-type display on the market in 5 to 7 years. And it probably won’t be big. [I get this] from projecting what people are doing in displays, and the ability to put that many pixels, using one technology or another, together, and to interface them with the appropriate driver circuitry. That doesn’t require a super-breakthrough.

Michael: It seems Moore’s Law will take care of much of that. We're seeing a push to 4K projectors and panels much faster than expected.

Prof. Bove, Jr.: The other thing that's happened that we are taking advantage of is that you can now get red-green-blue laser triplets relatively cheaply – consumer-grade cheaply – because they are being developed for 2D projectors, particularly for pico projectors. And for the particular way we are doing holovideo, we don't actually need good coherence, we just need monochromaticity. And the reason for that is that different wavelengths diffract at different angles. So if you have a very wide-band source of red light, like what you'd get from a filtered white light source, you get a fuzzy picture, because things diverge as they get farther from the screen, so you can't really get a good 3D effect. And even ordinary LEDs aren't quite monochromatic enough for us. But the fact that there are cheap semiconductor lasers at fairly high power, available in R, G, and B, means again that we don't have to solve that problem.
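The grating equation makes the monochromaticity requirement quantitative: a fringe pattern of period d sends wavelength λ to the angle sin θ = λ/d, so a source's spectral bandwidth smears each diffracted ray over a range of angles. A small worked example (the fringe period and bandwidth figures are illustrative, not measured values):

```python
import math

FRINGE_PERIOD = 1.0e-6   # hologram fringe spacing in metres (illustrative)

def diffraction_angle_deg(wavelength_m):
    """First-order angle from the grating equation: sin(theta) = wavelength / period."""
    return math.degrees(math.asin(wavelength_m / FRINGE_PERIOD))

def angular_smear_deg(centre_m, bandwidth_m):
    """Spread of diffraction angles across a source's spectral bandwidth."""
    half = bandwidth_m / 2
    return diffraction_angle_deg(centre_m + half) - diffraction_angle_deg(centre_m - half)

laser_smear = angular_smear_deg(640e-9, 1e-9)    # ~1 nm laser line
led_smear = angular_smear_deg(640e-9, 20e-9)     # ~20 nm LED bandwidth
# The LED smears each diffracted ray roughly 20x more than the laser,
# and the resulting blur grows with an object's depth from the screen.
```

This is why an inexpensive laser diode works but an LED does not: the smear scales linearly with bandwidth, and only points to the angular blur, which then translates into spatial blur proportional to distance from the display plane.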

So most of what we’re doing around here right now is monochrome. Just because going to color is pretty simple. I mean, color is just a factor of 3. Which means you either have to do the computations 3 times as fast, or stack 3 of your light modulator chips on top of each other. We’re not showing the latest holovideo display publicly yet, because it’s still dim and it’s still fuzzy. The efficiency needs to go up a bit to get more of the input light out into the hologram.

And with that, Prof. Bove, Jr. showed me around the lab before getting back to work.

A huge thanks to Prof. V. Michael Bove, Jr. for meeting with me in his Cambridge office, and to Alexandra Kahn for arranging the meeting!

Head over to the Spatial Imaging Group site for more information on holographic video.

V. Michael Bove, Jr. holds an SBEE, an MS in visual studies, and a PhD in Media Technology, all from MIT, where he is currently head of the Object-Based Media Group at the Media Laboratory, co-director of the Center for Future Storytelling, and director of the consumer electronics program CELab.

He is the author or co-author of over 60 journal or conference papers on digital television systems, video processing hardware/software design, multimedia, scene modeling, visual display technologies, and optics.

He is co-author, with the late Stephen A. Benton, of Holographic Imaging (Wiley, 2008). He is on the Board of Editors of the Journal of the Society of Motion Picture and Television Engineers, and associate editor of Optical Engineering.

Contact Me

Jim Dorey
jim (at) marketsaw (dot) com

All contents Copyright © 2006-2018, MarketSaw Media. All Rights Reserved. All copyrights and trademarks on this website belong to their respective owners.