3D Videoconference

3D videoconferencing is a teleconferencing system that projects images onto a spinning-mirror display to give them a three-dimensional appearance.
Using information and communications technology, a face or any other image can be transmitted in real time, and viewers can interact with it. Pre-recorded images can be rotated, manipulated, and viewed from different points of view while remaining consistent.
Motivation
Nowadays, a wide range of graphical content is modeled in 3D, although most of it is still displayed on 2D planes.
Various forms of 3D displays have been contemplated and constructed for at least one hundred years (Gabriel Lippmann), but only recent advances in digital capture, computation, and display have made functional and practical 3D displays possible.
During a face-to-face conversation, eye contact and gaze direction provide important visual cues that express emotion, attention, and interest. Conventional 2D video cannot convey these cues.
When a remote participant looks directly at the camera, every viewer of a 2D video stream sees the same image regardless of their position in the room, so the direction of the gaze is lost. Point-to-multipoint 3D videoconferencing reproduces gaze direction and eye contact faithfully.
To support this kind of communication, the system must meet the following fundamental requirements:
* A display that emits light over 360° with correct horizontal parallax.
* A face detection system to produce correct vertical parallax.
* Software and hardware capable of processing data in real time.
* An OpenGL rendering algorithm capable of rendering multiple centers of projection onto an anisotropic display surface, with correct vertical perspective for any viewer at any position.
Technology
The system is based on a display composed of the following elements:
* High-Speed DLP Projector.
* Spinning mirror covered by a holographic diffuser.
* Motion-control motor.
* Standard PC.
The projected image is reflected by the mirror, which spins quickly enough to present the image over 360 degrees. With proper synchronization, slightly different images can be shown to the left eye and the right eye, producing a 3D image.
High-speed Projector
To achieve a high frame rate, the projector's FPGA takes each 24-bit color frame of video and displays each bit sequentially as a separate binary frame.
Thus, if the incoming digital video signal is 60 Hz, the projector displays 60 × 24 = 1,440 frames per second.
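The bit-sequential scheme described above can be sketched in a few lines of Python. This is only an illustration: the pixel packing, the bit order, and the names split_bit_planes and binary_frame_rate are assumptions, not details of the projector's actual firmware.

```python
def split_bit_planes(pixels, bits=24):
    """Split one frame of packed pixel values into `bits` binary frames.

    `pixels` is a flat list of integers in [0, 2**bits). Plane k holds
    bit k of every pixel (k = 0 is the least significant bit).
    The bit order is an assumption made for this sketch.
    """
    return [[(p >> k) & 1 for p in pixels] for k in range(bits)]


def binary_frame_rate(video_hz, bits=24):
    """Binary frames per second when each bit is shown as its own frame."""
    return video_hz * bits


# Four illustrative (hypothetical) 24-bit pixels.
frame = [0x000000, 0xFFFFFF, 0x000001, 0x800000]
planes = split_bit_planes(frame)

print(len(planes))            # 24 binary frames per video frame
print(planes[0])              # [0, 1, 1, 0]
print(binary_frame_rate(60))  # 1440
```

Each incoming video frame therefore carries 24 one-bit images, which is why the binary frame rate is 24 times the video rate.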
To reach the optimum system rate, the video frequency is set to 200 Hz and two DLP projectors are used, achieving up to 8,640 frames per second with a specially encoded DVI video signal.
Spinning Mirror
The display works by projecting high-speed video from the projector onto a spinning mirror. As the mirror rotates, it reflects a different and accurate image to each viewer. The size, geometry, and material of the rotating surface have been optimized for displaying an image the size of a human face. Its two-sided shape gives each viewer two passes of the display surface per full rotation, yielding a visual update rate of 30 Hz at 900 rpm.
The mirror reflects 144 unique views of the scene across a 180-degree field of view, with an angular separation of 1.25 degrees.
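The figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The function names are hypothetical and chosen only for this illustration.

```python
def update_rate_hz(rpm, sides=2):
    """Visual update rate seen by one viewer.

    The mirror makes rpm / 60 revolutions per second, and each of its
    `sides` passes the viewer once per revolution.
    """
    return rpm / 60.0 * sides


def angular_separation_deg(field_of_view_deg, views):
    """Angle between neighbouring views spread evenly over the field of view."""
    return field_of_view_deg / views


print(update_rate_hz(900))               # 30.0
print(angular_separation_deg(180, 144))  # 1.25
```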
Synchronization Motor
The mirror surface rotates synchronously with the images displayed by the projector, using the PC's output signal as the master.
The projector's FPGA decodes each frame and communicates directly with the synchronous motor.
Because each viewer sees the display surface refreshed 30 times per second, the human visual system integrates the light, recreating the image of an object or person floating at the center of the mirror.
Real-time 3D Face Scanner
The face of the remote participant is scanned at 30 Hz using a structured-light scanning system based on the phase-unwrapping technique.
The system uses a monochrome camera capturing frames at 120 Hz and a greyscale video projector with a frame rate of 120 Hz.
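Phase unwrapping can be illustrated in one dimension. The sketch below is not the scanner's actual implementation, which works on 2D images and whose details are not given here. It only shows the core idea: a phase measured modulo 2π is made continuous again by removing the 2π jumps. The function name unwrap_phase and the test ramp are assumptions.

```python
import math


def unwrap_phase(wrapped):
    """Remove 2*pi jumps from a 1D sequence of wrapped phase values.

    Assumes the true phase changes by less than pi between neighbours.
    """
    if not wrapped:
        return []
    unwrapped = [wrapped[0]]
    offset = 0.0
    for prev, cur in zip(wrapped, wrapped[1:]):
        delta = cur - prev
        if delta > math.pi:       # wrapped value jumped up: step back 2*pi
            offset -= 2 * math.pi
        elif delta < -math.pi:    # wrapped value jumped down: step forward 2*pi
            offset += 2 * math.pi
        unwrapped.append(cur + offset)
    return unwrapped


# A steadily increasing phase ramp, wrapped into [-pi, pi] as a sensor would measure it.
true_phase = [0.5 * i for i in range(20)]
wrapped = [math.atan2(math.sin(p), math.cos(p)) for p in true_phase]
recovered = unwrap_phase(wrapped)
```

In a structured-light scanner, the recovered continuous phase at each pixel is what relates the projected pattern to depth.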
The system can also calculate depth maps of the person's face. For this, two half-lit images with opposite illumination are acquired for each frame. The two images are then subtracted, and the zero crossings of the difference are detected to obtain the absolute 3D position of the pixels at the center of the face.
Conveniently, the maximum of these two half-lit images provides a fully illuminated texture map for the face, while the minimum gives the amount of ambient light in the scene. The result is a depth map for the face, transmitted along with the facial texture maps.
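The per-pixel operations on the two half-lit images can be sketched on a single scanline. The pixel values and the names combine_half_lit and zero_crossings are hypothetical, and the real system operates on full 2D camera images.

```python
def combine_half_lit(img_a, img_b):
    """Combine two oppositely lit scanlines pixel by pixel.

    Returns (texture, ambient, difference):
      texture    = per-pixel maximum (fully lit appearance)
      ambient    = per-pixel minimum (light present in both images)
      difference = img_a - img_b (changes sign where the lighting flips)
    """
    texture = [max(a, b) for a, b in zip(img_a, img_b)]
    ambient = [min(a, b) for a, b in zip(img_a, img_b)]
    difference = [a - b for a, b in zip(img_a, img_b)]
    return texture, ambient, difference


def zero_crossings(difference):
    """Indices i where the difference changes sign between i and i + 1."""
    return [i for i in range(len(difference) - 1)
            if (difference[i] > 0) != (difference[i + 1] > 0)]


# Hypothetical scanline: left half lit in one image, right half in the other.
lit_left = [200, 200, 200, 20, 20, 20]
lit_right = [20, 20, 20, 200, 200, 200]

texture, ambient, difference = combine_half_lit(lit_left, lit_right)
print(texture)                     # [200, 200, 200, 200, 200, 200]
print(ambient)                     # [20, 20, 20, 20, 20, 20]
print(zero_crossings(difference))  # [2]
```

The zero crossing marks the boundary between the two lit halves, which gives a reference position on the face.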
Face Tracking for Vertical Parallax

To provide accurate gaze and eye contact, the rendered face of the remote participant must appear to be rendered correctly into world space coordinates as seen by all audience members.
Rendering the face for the same viewing height and distance for all audience members can make the face appear to be gazing at an inaccurately high or low angle to some viewers, even though the natural horizontal-parallax of the display will provide generally accurate horizontal perspective to all viewers.
Although vertical gaze direction is detected with less sensitivity than horizontal gaze direction, a true sense of eye contact requires both to be within a few degrees of accuracy.
To correct the vertical perspective, viewers' faces are tracked with OpenCV face detection, based on the Viola-Jones detector, and a Kalman filter is applied to reduce additive white noise. Thus, horizontal parallax is provided by the display itself, which offers a binocular stereo image without any delay, while vertical parallax is achieved through face tracking.
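The smoothing step can be illustrated with a scalar Kalman filter applied to a tracked face height. This is a minimal sketch under stated assumptions: the face is modeled as roughly stationary, the noise variances and measurements are invented for the example, and the class name ScalarKalman is hypothetical. The Viola-Jones detection that would supply the measurements is not shown.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a constant-position model."""

    def __init__(self, process_var, measurement_var, initial=0.0, initial_var=1.0):
        self.q = process_var      # how much the true value may drift per step
        self.r = measurement_var  # variance of the detector's white noise
        self.x = initial          # current estimate
        self.p = initial_var      # variance of the current estimate

    def update(self, z):
        """Fold one noisy measurement z into the estimate and return it."""
        self.p += self.q                 # predict: uncertainty grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct toward the measurement
        self.p *= (1.0 - k)              # uncertainty shrinks after the update
        return self.x


# Hypothetical noisy face heights (in metres) from a face detector.
measurements = [1.60, 1.72, 1.58, 1.69, 1.61, 1.70, 1.59, 1.71]
kf = ScalarKalman(process_var=1e-4, measurement_var=0.01, initial=measurements[0])
smoothed = [kf.update(z) for z in measurements]
```

The smoothed heights vary much less than the raw detections, which keeps the rendered vertical perspective from jittering.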
Display Surfaces
The shape of the display surface determines how the emitted light is directed toward the viewers.
Flat, concave, or convex surfaces can be used. Each offers different advantages and disadvantages, which underscores the utility of being able to project onto arbitrary surfaces.
Flat surface: it is set at a steeper angle to better match the shape of a face, and it has two sides, which effectively doubles the frame rate of the display. The diverging beam of the projector continues to diverge horizontally after reflection by the flat display surface, so that approximately a 20° wedge of the audience area observes some reflected pixels from the projector for any given mirror position. The flat mirror is the simplest to build and calibrate, though other shapes can provide more useful optical properties.
Convex surface: its advantage is that, at any moment, every audience member can see light reflected from the projector. This type of surface is useful for detecting audience members. The display can render the proper vertical perspective for each viewer directly, using a single table of heights and distances.
The shape also affects the range of the display. The region presented to the audience is made up of the different parts of the mirror that are illuminated as it rotates. For a flat mirror, the focal surface is a cone centered on the axis of the mirror. Concave and convex surfaces have asymmetric focal surfaces that may change with viewing angle.
Convex mirrors produce a set of concave focal planes; concave mirrors produce a set of convex focal planes. This represents another advantage of the concave mirror, as the human face is shaped more like a convex cylinder than a concavity. When the shape of the focal surface approximates the object being displayed, accommodation cues are more accurate and aliasing is minimized.
Future
In general, systems of this kind aim at progressively improving the quality of the image and of the communication.
Color could be achieved by placing multiple synchronized projectors in the same optical path.
Gray-level reproduction may be improved by applying advanced halftoning algorithms, but these would need to be optimized to work at thousands of frames per second.
One drawback is that the remote participant does not receive a three-dimensional view of the people in the audience, even though the screen is positioned and calibrated to match the actual audience position as closely as possible. Replacing the 2D display with an autostereoscopic monitor could solve this issue.
Overall, a 3D teleconferencing system takes a significant step towards maintaining the many nonverbal cues used in face-to-face human communication.
Links
* Best video conference ever: 3D hologram, the best video conference (in English)
* 3D teleconference (in English)
 