Andrew D. Wilson, Hrvoje Benko. "Combining Multiple Depth Cameras and Projectors for Interactions On, Above, and Between Surfaces". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010.
Author Bios
Andrew D. Wilson is a senior researcher at Microsoft Research. He received his bachelor's degree from Cornell and subsequently his master's and Ph.D. at MIT. He helped found the Surface Computing group at Microsoft.
Hrvoje Benko is also a researcher at Microsoft. He received his Ph.D. from Columbia University. His interests revolve mostly around augmented reality and discovering new ways to blur the line between 2D computing and our 3D world.
Summary
- Hypothesis - This paper did not have a hypothesis; it was simply a discussion and description of a design concept.
- Method/Content - The main concept behind this design was to let users interact with projected displays using normal tables (or other ordinary flat surfaces) as touch-screens. It used a suspended apparatus containing 3 IR and depth cameras and 3 projectors. The system decided what the user was doing by building a 3D mesh of them and simulating their movements in a virtual 3D space. It built the mesh by using its cameras to capture depth maps from different angles. Because all 3 cameras and projectors shared a single virtual space, there were no real discrepancies or major mistakes in gesture recognition. Available interactions included dragging a projected object off of a table, holding that object, putting it back on the table (or on a different one), moving objects from a table to a vertical screen and back, transferring objects from one person to another, and an interactive menu. The menu moved through its options based on the height of the user's hand and selected an option if the user held their hand there for 2 seconds. Holding an object worked loosely like holding a ball: the user held their hand level and carried the "ball" where they wanted it to go. They could let go and drop the ball at any time.
- Results - User feedback on this system was very good overall. During the public demonstration the authors discovered a number of limitations that had not been readily apparent but could be fixed. One limitation was that with more than 6 people in the room, the system got confused about who was who because everyone stood too close together, which made it hard to tell which person was trying to perform which action. This could be addressed by increasing the space in the room and the range of the cameras and projectors. Another limitation was that having 3 or more people in the room slowed the system drastically: the refresh rate of the system (and thus of the projectors) dropped below the cameras' refresh rate. This can also be fixed, up to a point, by using a more powerful computer to render the 3D space and meshes. There is obviously an upper bound on the number of people the system can accommodate (due to both size constraints and computing power), but it can definitely perform better than what has already been implemented.
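The height-based menu described in the Method/Content item can be sketched roughly as follows. This is only an illustration of the idea as summarized here, not the authors' implementation: the `HeightMenu` class, its field names, the menu height range, and the sample option labels are all assumptions made for the example. Only the 2-second dwell time comes from the summary.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DWELL_SECONDS = 2.0  # hold time before a selection fires, as stated in the summary


@dataclass
class HeightMenu:
    """Hypothetical menu: hand height picks an option, holding still confirms it."""
    options: List[str]
    min_height: float  # assumed bottom of the menu column, in metres
    max_height: float  # assumed top of the menu column, in metres
    _current: Optional[int] = field(default=None, init=False)  # option under the hand
    _since: float = field(default=0.0, init=False)  # time the hand entered that option

    def option_at(self, height: float) -> Optional[int]:
        """Map a hand height to an option index, or None if outside the menu."""
        if not (self.min_height <= height < self.max_height):
            return None
        band = (self.max_height - self.min_height) / len(self.options)
        index = int((height - self.min_height) / band)
        return min(index, len(self.options) - 1)  # guard against float rounding

    def update(self, height: float, now: float) -> Optional[str]:
        """Feed one hand sample; return an option once it has been held long enough."""
        index = self.option_at(height)
        if index != self._current:
            # The hand moved to another band or left the menu: restart the timer.
            self._current = index
            self._since = now
            return None
        if index is not None and now - self._since >= DWELL_SECONDS:
            self._since = now  # do not fire again on the very next sample
            return self.options[index]
        return None


# Usage: the hand stays at 1.2 m for two seconds (all values are made up).
menu = HeightMenu(["Open", "Copy", "Delete"], min_height=0.9, max_height=1.5)
choice = None
for t in (0.0, 1.0, 2.0):
    choice = menu.update(height=1.2, now=t)
print(choice)  # prints Copy
```

In a real system the `height` samples would come from the tracked hand position in the shared 3D space, and `now` from a clock. Passing the time in explicitly just keeps the sketch easy to test.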
Again, I loved this paper. The more I read, the more I realize that we are much closer to virtual reality environments than I thought. This system has a huge range of applications, from meetings to showcases, from artists to engineers, from product design to video games. The concept of moving objects from one surface to another is not really what excites me; it's the system itself. The fact that they can use relatively simple and inexpensive cameras to track multiple people without the users wearing any external apparatus (i.e., dots or markers) is amazing. I would absolutely love to have this system in my house, if only to play around with it and maybe customize it to perform different actions.