Hello, my name is Siva
MS in Computer Science @ University of Virginia
"Acerrimus ex omnibus nostris sensibus est sensus videndi" (The keenest of all our senses is the sense of sight)
A picture is worth a thousand words. How many words will a video be worth, then? Can we find the right words
to represent a video?
Leverage visual information and associated text to build learning machines
on par with human-level perception and understanding.
I work with Ordonez as part of the Vision and Language Research Group
(VISLANG). I am also advised by Prof. Gabriel Robins.
Raspberry Pi + Amazon Echo = recognize faces at your command!
Movie Trailer + Plot Summaries = classify genres using deep learning!
Sorting an album of images and associated captions!