Figure 7: Network traffic since starting server program.
the user’s viewpoint. Figure 6 confirms that each user can view the same remote site from a different viewpoint simultaneously. Figure 7 shows the network traffic transmitted from the server during the experiment of Figure 6. The live video server started transferring images and masks at time 0; user1 started telepresence 6 seconds later, and user2 joined the system 30 seconds after the server started. The network traffic remains constant even as the number of users increases. In the present implementation, however, the server sends data even when no clients are connected; it should stop transmitting while no client is connected.
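The suggested fix above can be sketched as a streaming loop that is gated on the set of registered clients. This is a minimal illustration, not the paper's implementation; the class and method names are hypothetical.

```python
class LiveVideoServer:
    """Sketch: transmit frames only while at least one client is registered.
    All names here are hypothetical, not from the paper's implementation."""

    def __init__(self):
        self.clients = set()
        self.frames_sent = 0

    def register(self, client_id):
        self.clients.add(client_id)

    def unregister(self, client_id):
        self.clients.discard(client_id)

    def tick(self, frame):
        # Send the frame (image + mask) only if someone is listening;
        # otherwise skip it, avoiding the wasted traffic noted above.
        if self.clients:
            self.frames_sent += 1  # stands in for the multi-cast send
            return True
        return False

server = LiveVideoServer()
server.tick("frame0")      # no clients yet -> nothing sent
server.register("user1")
server.tick("frame1")      # user1 connected -> frame sent
server.unregister("user1")
server.tick("frame2")      # no clients again -> nothing sent
print(server.frames_sent)  # -> 1
```

With this gating, an idle server generates no traffic, while the constant-traffic property under multiple connected users is unchanged.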
With the multi-cast protocol, multiple videos are transferred asynchronously as packets, which makes their reception control difficult. A smaller number of packets eases reception control, but each image must then be compressed at a high ratio, which lowers the quality of the videos. Ease of reception control and video quality are thus in a trade-off. In this experiment, we determined the balance empirically.
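The trade-off can be made concrete with a small calculation: the number of packets per frame falls as the compression ratio rises, at the cost of image quality. The payload size and frame dimensions below are illustrative assumptions, not values from the paper.

```python
import math

MTU_PAYLOAD = 1400  # assumed bytes of video payload per packet

def packets_per_frame(raw_bytes, compression_ratio):
    """Fewer packets per frame eases reception control, but requires a
    higher compression ratio, i.e. lower image quality."""
    compressed = math.ceil(raw_bytes / compression_ratio)
    return math.ceil(compressed / MTU_PAYLOAD)

raw = 640 * 480 * 3  # one raw RGB frame: 921,600 bytes (assumed size)
print(packets_per_frame(raw, 10))   # light compression -> many packets
print(packets_per_frame(raw, 100))  # heavy compression -> few packets
```

Sweeping the compression ratio in a sketch like this is one way to pick the operating point empirically, as done in the experiment.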
6 CONCLUSIONS
We have proposed a novel view telepresence system with high scalability using multi-casted omni-directional videos. In our system, multiple users can simultaneously see a virtualized remote site from an arbitrary viewpoint and in an arbitrary view direction. Furthermore, the network traffic remains constant as the number of users increases. In experiments with the prototype system, we have confirmed that the system can give users the feeling of walking through a remote site. When the viewpoint lies between the cameras and close to dynamic objects, the quality of the novel view images degrades; this is caused by the geometric error of the approximate representation using visual hulls with few cameras. Future work includes improving the quality of the novel view images and developing a method for automatically establishing point correspondences among the input images.
NOVEL VIEW TELEPRESENCE WITH HIGH-SCALABILITY USING MULTI-CASTED OMNI-DIRECTIONAL
VIDEOS