BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260114T163653Z
LOCATION:Meeting Room C4.9+C4.10\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231213T142500
DTEND;TZID=Australia/Melbourne:20231213T144000
UID:siggraphasia_SIGGRAPH Asia 2023_sess164_papers_490@linklings.com
SUMMARY:SAILOR: Synergizing Radiance and Occupancy Fields for Live Human P
 erformance Capture
DESCRIPTION:Zheng Dong (State Key Laboratory of CAD & CG, Zhejiang Univers
 ity); Ke Xu (City University of Hong Kong); Yaoan Gao (State Key Laborator
 y of CAD & CG, Zhejiang University); Qilin Sun (The Chinese University of 
 Hong Kong, Shenzhen); Hujun Bao and Weiwei Xu (State Key Laboratory of CAD
  & CG, Zhejiang University); and Rynson W.H. Lau (City University of Hong 
 Kong)\n\nImmersive user experiences in live VR/AR performances require a f
 ast and accurate free-view rendering of the performers. Existing methods a
 re mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radia
 nce Fields (NeRF). However, while PIFu-based methods usually fail to produ
 ce photorealistic view-dependent textures, NeRF-based methods typically la
 ck local geometry accuracy and are computationally heavy (e.g., dense samp
 ling of 3D points, additional fine-tuning, or pose estimation). In this wo
 rk, we propose a novel generalizable method, named SAILOR, to create photo
 realistic human free-view videos from very sparse RGBD streams with low la
 tency. To produce photorealistic view-dependent textures while preserving 
 locally accurate geometry, we integrate PIFu and NeRF such that they work 
 synergistically by conditioning the PIFu on depth and then rendering view-
 dependent textures through NeRF. Specifically, we propose a novel network,
  named SRONet, for this hybrid representation to reconstruct and render li
 ve free-view videos. SRONet can handle unseen performers without fine-tuni
 ng. Both geometric and colorimetric supervision signals are exploited to e
 nhance SRONet's capability of capturing high-quality details. Besides, a n
 eural blending-based ray interpolation scheme, a tree-based data structure
 , and a parallel computing pipeline are incorporated for fast upsampling, 
 efficient points sampling, and acceleration. To evaluate the rendering per
 formance, we construct a real-captured RGBD benchmark from 40 performers. 
 Experimental results show that SAILOR outperforms existing human reconstru
 ction and performance capture methods.\n\nRegistration Category: Full Acce
 ss\n\nSession Chair: Parag Chaudhuri (Indian Institute of Technology Bomba
 y)\n\n
URL:https://asia.siggraph.org/2023/full-program?id=papers_490&sess=sess164
END:VEVENT
END:VCALENDAR