Fuwen TAN

Graduate Student

Department of Computer Science

University of Virginia

85 Engineer's Way, Box 400740

Charlottesville, VA 22904


About me

I am a Ph.D. student in the Computer Science Department of U.Va., working with Dr. Vicente Ordóñez Román on Vision and Language. I am especially interested in learning instance-level representations of images and language, and their applications to visual recognition, retrieval, and synthesis.

Before joining U.Va., I was a Research Associate at the BeingThere center of Nanyang Technological University, working on Computer Graphics and 3D Telepresence. Here is my CV.


Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

Fuwen Tan, Paola Cascante-Bonilla, Xiaoxiao Guo, Hui Wu, Song Feng, Vicente Ordonez
Conf. on Neural Information Processing Systems (NeurIPS), 2019

This paper explores the task of interactive image retrieval using natural language queries, where a user progressively provides input queries to refine a set of retrieval results. Moreover, our work explores this problem in the context of complex image scenes containing multiple objects. ...

[ paper ]    [ code ]

Text2Scene: Generating Compositional Scenes from Textual Descriptions

Fuwen Tan, Song Feng, Vicente Ordonez
Conf. on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral presentation + Best Paper Finalist)
Posts from NVIDIA Developer News, IBM Research Blog

We propose Text2Scene, a model that interprets input natural language descriptions in order to generate various forms of compositional scene representations; from abstract cartoon-like scenes to synthetic images. Unlike recent works, our method does not use generative adversarial networks, but a combination of an encoder-decoder model with a semi-parametric retrieval-based approach. ...

[ paper ]    [ code ]    [ poster ]    [ slides ]    [ bibtex ]

Where and Who? Automatic Semantic-Aware Person Composition

Fuwen Tan, Crispin Bernier, Benjamin Cohen, Vicente Ordonez, Connelly Barnes
Winter Conf. on Applications of Computer Vision (WACV), 2018

Image compositing is a popular and successful method used to generate realistic yet fake imagery. Much previous work in compositing has focused on improving the appearance compatibility between a given object segment and a background image. However, most previous work does not investigate the topic of automatically selecting compatible segments and predicting their locations and sizes given a background image. ...

[ paper ]     [ supplemental PDF ]     [ code ]     [ video ]     [ bibtex ]

FaceCollage: A Rapidly Deployable System for Real-time Head Reconstruction for On-The-Go 3D Telepresence

Fuwen Tan, Chi-Wing Fu, Teng Deng, Jianfei Cai, Tat Jen Cham
ACM Multimedia (ACM MM, full paper), 2017

This paper presents FaceCollage, a robust and real-time system for head reconstruction that can be used to create easy-to-deploy telepresence systems, using a pair of consumer-grade RGBD cameras that provide a wide range of views of the reconstructed user. A key feature is that the system is very simple to rapidly deploy, with autonomous calibration and requiring minimal intervention from the user. ...

[ paper ]    [ video ]    [ poster ]    [ bibtex ]

High-Quality Kinect Depth Filtering For Real-time 3D Telepresence

Mengyao Zhao, Fuwen Tan, Chi-Wing Fu, Chi-Keung Tang, Jianfei Cai, Tat Jen Cham
Conf. on Multimedia and Expo (ICME), 2013

3D telepresence is a next-generation multimedia application, offering remote users an immersive and natural videoconferencing environment with real-time 3D graphics. Kinect sensor, a consumer-grade range camera, facilitates the implementation of some recent 3D telepresence systems. However, conventional data filtering methods are insufficient to handle Kinect depth error because such error is quantized ...

[ IEEE Xplore ]    [ bibtex ]

Field-guided Registration for Feature-conforming Shape Composition

Hui Huang, Minglun Gong, Daniel Cohen-Or, Yaobin Ouyang, Fuwen Tan, Hao Zhang
ACM Transactions on Graphics (Proc. SIGGRAPH Asia), 2012

We present an automatic shape composition method to fuse two shape parts which may not overlap and possibly contain sharp features, a scenario often encountered when modeling man-made objects. At the core of our method is a novel field-guided approach to automatically align two input parts in a feature-conforming manner. ...

[ project ]    [ paper ]    [ bibtex ]