TY - GEN
T1 - Tele-Aloha
T2 - 2024 Special Interest Group on Computer Graphics and Interactive Techniques Conference - Conference Papers, SIGGRAPH 2024
AU - Tu, Hanzhang
AU - Shao, Ruizhi
AU - Dong, Xue
AU - Zheng, Shunyuan
AU - Zhang, Hao
AU - Chen, Lili
AU - Wang, Meili
AU - Li, Wenyu
AU - Ma, Siyan
AU - Zhang, Shengping
AU - Zhou, Boyao
AU - Liu, Yebin
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/7/13
Y1 - 2024/7/13
AB - In this paper, we present a low-budget and high-authenticity bidirectional telepresence system, Tele-Aloha, targeting peer-to-peer communication scenarios. Compared to previous systems, Tele-Aloha utilizes only four sparse RGB cameras, one consumer-grade GPU, and one autostereoscopic screen to achieve high-resolution (2048x2048), real-time (30 fps), low-latency (less than 150 ms) and robust distant communication. As the core of Tele-Aloha, we propose an efficient novel view synthesis algorithm for the upper body. Firstly, we design a cascaded disparity estimator for obtaining a robust geometry cue. Additionally, a neural rasterizer via Gaussian Splatting is introduced to project latent features onto the target view and to decode them into a reduced resolution. Further, given the high-quality captured data, we leverage a weighted blending mechanism to refine the decoded image into the final resolution of 2K. Exploiting a world-leading autostereoscopic display and low-latency iris tracking, users are able to experience a strong three-dimensional sense even without any wearable head-mounted display device. Altogether, our telepresence system demonstrates the sense of co-presence in real-life experiments, inspiring the next generation of communication.
KW - human performance rendering
KW - real-time free-view synthesis
KW - telecommunication
KW - telepresence
KW - videoconferencing
UR - https://www.scopus.com/pages/publications/85199921980
U2 - 10.1145/3641519.3657491
DO - 10.1145/3641519.3657491
M3 - Conference contribution
AN - SCOPUS:85199921980
T3 - Proceedings - SIGGRAPH 2024 Conference Papers
BT - Proceedings - SIGGRAPH 2024 Conference Papers
A2 - Spencer, Stephen N.
PB - Association for Computing Machinery, Inc
Y2 - 28 July 2024 through 1 August 2024
ER -