Fuwen TAN

Graduate Student

Department of Computer Science

University of Virginia

85 Engineer's Way, Box 400740

Charlottesville, VA 22904


About me

I am a Ph.D. student in the Computer Science Department at U.Va., working with Dr. Vicente Ordóñez Román on Vision and Language. I am especially interested in learning compositional representations of images and language, and their applications to visual recognition, retrieval, and synthesis.

Before joining U.Va., I was a Research Associate at the BeingThere center of Nanyang Technological University, working on Computer Graphics and 3D Telepresence. Here is my CV.


Text2Scene: Generating Compositional Scenes from Textual Descriptions

Fuwen Tan, Song Feng, Vicente Ordonez
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019, Oral presentation

We propose Text2Scene, a model that interprets input natural language descriptions in order to generate various forms of compositional scene representations, from abstract cartoon-like scenes to synthetic images. Unlike recent works, our method does not use generative adversarial networks, but a combination of an encoder-decoder model with a semi-parametric retrieval-based approach. ...

[ Arxiv paper ]    [ code (coming soon) ]    [ bibtex ]

Where and Who? Automatic Semantic-Aware Person Composition

Fuwen Tan, Crispin Bernier, Benjamin Cohen, Vicente Ordonez, Connelly Barnes
IEEE Winter Conf. on Applications of Computer Vision (WACV), 2018

Image compositing is a popular and successful method used to generate realistic yet fake imagery. Much previous work in compositing has focused on improving the appearance compatibility between a given object segment and a background image. However, most previous work does not investigate the topic of automatically selecting semantically compatible segments and predicting their locations and sizes given a background image. ...

[ paper ]     [ supplemental PDF ]     [ code ]     [ video ]     [ bibtex ]

FaceCollage: A Rapidly Deployable System for Real-time Head Reconstruction for On-The-Go 3D Telepresence

Fuwen Tan, Chi-Wing Fu, Teng Deng, Jianfei Cai, Tat Jen Cham
ACM Multimedia (ACM MM), 2017, Full paper

This paper presents FaceCollage, a robust and real-time system for head reconstruction that can be used to create easy-to-deploy telepresence systems, using a pair of consumer-grade RGBD cameras that provide a wide range of views of the reconstructed user. A key feature is that the system is very simple to rapidly deploy, with autonomous calibration and requiring minimal intervention from the user, other than casually placing the cameras. ...

[ paper ]    [ video ]    [ poster ]    [ bibtex ]

High-Quality Kinect Depth Filtering For Real-time 3D Telepresence

Mengyao Zhao, Fuwen Tan, Chi-Wing Fu, Chi-Keung Tang, Jianfei Cai, Tat Jen Cham
IEEE International Conf. on Multimedia and Expo (ICME), 2013

3D telepresence is a next-generation multimedia application, offering remote users an immersive and natural videoconferencing environment with real-time 3D graphics. Kinect sensor, a consumer-grade range camera, facilitates the implementation of some recent 3D telepresence systems. However, conventional data filtering methods are insufficient to handle Kinect depth error because such error is quantized rather than just randomly-distributed ...

[ IEEE Xplore ]     [ bibtex ]

Field-guided Registration for Feature-conforming Shape Composition

Hui Huang, Minglun Gong, Daniel Cohen-Or, Yaobin Ouyang, Fuwen Tan, Hao Zhang
ACM Transactions on Graphics (Proc. SIGGRAPH Asia), 2012

We present an automatic shape composition method to fuse two shape parts which may not overlap and possibly contain sharp features, a scenario often encountered when modeling man-made objects. At the core of our method is a novel field-guided approach to automatically align two input parts in a feature-conforming manner. The key to our field-guided shape registration is a natural continuation of one part into the ambient field ...

[ project ]    [ paper ]    [ bibtex ]