Prof. Yanjun Qi, PhD @ UVA (email: yanjun at   


Basic Information:

The UVA "Machine Learning and Biomedicine" group develops novel machine-learning techniques for important challenges in biomedicine, especially those involving very large data sets. We strive to build and share benchmarked datasets and open-source releases of research prototypes.
(data shared @ DataSharing) (code shared @ ToolSharing)

Representative publications:

Category Representative Papers
All papers

Learning Graph from Data

(Helping researchers effectively translate aggregated data into knowledge that takes the form of graphs)
  • "A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models", AISTATS (2017)(Arxiv)(PDF)
  • "A constrained L1 minimization approach for estimating multiple Sparse Gaussian or Nonparanormal Graphical Models", ICML Combio 2016 (PDF)
  • "Learning the Dependency Structure of Latent Factors", NIPS 2012, (PDF)(Talk)
  • "Transfer String Kernel for Cross-Context DNA-Protein Binding Prediction", TCBB (Arxiv)
Robust Machine Learning
  • "Automatically Evading Classifiers", NDSS 2016 (PDF)(Talk)
  • "A Theoretical Framework for Robustness of (Deep) Classifiers Against Adversarial Samples", (Arxiv)

(Deep) Representation Learning on sequential data

(genome or epigenomic data, product review text, bio-reports text, protein sequence strings, etc.)
  • "DeepChrome: Deep-learning for predicting gene expression from histone modifications", ECCB/Bioinformatics 2016 (PDF)
  • "Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks", PSB17 (Arxiv)
  • "Deep Motif: Visualizing Genomic Sequence Classifications", ICLR2016, the International Conference on Learning Representations (Arxiv)
  • "MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction", AAAI 2016. (PDF) (Talk)

  • "Unsupervised Feature Learning by Deep Sparse Coding", SDM 2014. (PDF)(Talk)
  • "Deep Learning for Character-based Information Extraction", ECIR 2014, (PDF) (SupplementaryDoc)(Talk)
  • "Learning the Dependency Structure of Latent Factors", NIPS 2012, (PDF)(Talk)
  • "A unified multitask architecture for predicting local protein properties", PLoS ONE 2012 (PDF)
  • "Sentiment Classification with Supervised Sequence Encoder", ECML 2012, (PDF)(Talk)
  • "Sentiment Classification Based on Supervised Latent n-gram Analysis", CIKM 2011, (PDF)(Talk)
  • "Semi-Supervised Abstraction-Augmented String Kernel for Multi-Level Bio-Relation Extraction", ECML 2010, (PDF)(Talk)
  • "Semi-Supervised Bio-Named Entity Recognition with Word-Codebook Learning", SDM 2010, (PDF)
  • "Polynomial Semantic Indexing", NIPS 2009, (PDF)
  • "Semi-Supervised Sequence Labeling with Self-Learned Feature", ICDM 2009, (PDF)(Talk)
  • "Combining Labeled and Unlabeled Data for Word-Class Distribution", CIKM 2009, (PDF)
Data Fusion for Relational (Graph) Data
  • "Transfer String Kernel for Cross-Context Transcription Factor Binding Prediction", BioKDD 2015, (PDF)
  • "Semi-Supervised Convolution Graph Kernels for Relation Extraction", SDM 2011, (PDF) (Talk)
  • "Semi-Supervised Multi-Task Learning for Predicting Interactions between HIV-1 and Human Proteins", ECCB 2010, (PDF) (Talk)
  • "Protein Complex Identification by Supervised Graph Clustering", Bioinformatics 2008, (PDF)(Talk)
Machine Learning for Health
  • "Kernelized Information-Theoretic Metric Learning for Cancer Diagnosis using High-Dimensional Molecular Profiling Data", TKDD 2016, (PDF)
  • "MAPer: A Multi-scale Adaptive Personalized Model for Temporal Human Behavior Prediction", CIKM 2015, (PDF) (Talk)
  • "Causal Analysis of Inertial Body Sensors for Enhancing Gait Assessment Separability towards Multiple Sclerosis Diagnosis", IEEE Body Sensor Network (BSN) 2015, (PDF) (Talk)
  • "Piecewise Linear Dynamical Model for Action Clustering from Real-World Deployments of Inertial Body Sensors", BodyNets 2014, (PDF) (Talk)
  • Retrieving Medical Records with sennamed: NEC Labs America at TREC 2012 Medical Record Track, 2012 Text Retrieval Conference, (PDF)

External grant awards in progress:

Name | Role | Summary | Project Website | Year
NSF IIS-1453580 | PI | CAREER: A Data-Driven Network Inference Framework for Context-Conditioned Protein Interaction Graphs | Project Website | 2015-2020
NSF CNS-1619098 | PI | TWC: Small: Automatic Techniques for Evaluating and Hardening Machine Learning Classifiers in the Presence of Adversaries | Project Website | 2016-2019
ONR | co-PI | Deep Learning of Passage Structure for Scalable Semantic Discovery | Project Website | 2015-2018

Completed external grant awards:

Name | Role | Summary | Project Website | Year
NSF CNS-1441875 | co-PI | PI Meeting for Secure and Trustworthy Cyberspace: Matching Meeting Attendees Through Text Mining of their Scientific Reports | Project Website | 2014-2015

