Prof. Yanjun Qi, PhD @ UVA (email: yanjun at virginia.edu)   

Our Collected Benchmark Datasets

Mining Biological Sequential Data

Name | Type | Download | Description
Large-scale functional annotation on genome | DNA Sequences (string) | (Data) | Large-scale benchmark DNA sequences that we used in the 2017 paper "DeepMotif".
Large-scale protein local structure property tagging | Protein Sequences (string) | (Data) (PDF) | Two large-scale benchmark datasets that we used in the 2016 paper "MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction".
Large-scale protein local structure labeling prediction | Protein Sequences (string) | (Data) (bibTex) | Data we used in our PLoS ONE 2012 paper. Ten protein local structural labeling tasks in total (one label at each position), based on sequence inputs.

Data for Mining Biological Networks

Name | Type | Download | Description
Cancer Diagnosis using High-Dimensional Molecular Profiling Data | Classification | (Data) (PDF) | Data for the paper "Kernelized Information-Theoretic Metric Learning for Cancer Diagnosis using High-Dimensional Molecular Profiling Data".
Human Protein to HIV-1 Virus Interaction Prediction | Graph edge detection (assorted) | (Data) (bibTex) | Data we used in our Bioinformatics-10 paper on predicting human protein interaction partners for the HIV-1 virus. Data provided.
Human Protein to Receptor Interaction Prediction | Graph edge detection (assorted) | (Data) (bibTex) | Data we used in our Proteomics-09 paper on predicting human interaction partners for membrane receptors. Data, code, and web service provided.
Human Protein-Protein Interaction (PPI) Prediction | Graph edge detection (assorted) | (Data) (bibTex) | Data we used in our BMCBio-07 paper on general human protein-protein interaction prediction through information integration. Both feature sets and reference sets provided.
Yeast Protein-Protein Interaction Prediction | Graph edge detection (assorted) | (Data) (bibTex) | Data we used in our Proteins-06 paper on predicting interaction partners for yeast proteins. Data, code, and web service provided.
Protein Complex (i.e. Group) Detection From PPI Graph | SubGraph detection (assorted) | (Data) (bibTex) | Data we used in our Bioinformatics-08 paper on detecting protein groups from PPI networks. Two reference sets of yeast protein complexes (i.e. groups) are shared.

Human Behavior Data from Social Media or Mobile

Name | Type | Download | Description
Human behavior data captured by home sensors | Temporal Sensor data | (Data) (PDF) | Benchmark sensor data we used in the paper "MAPer: A Multi-scale Adaptive Personalized Model for Temporal Human Behavior Prediction".
Large-scale sentiment classification (version 2011) | English text (string) | (Data) (bibTex) | Two large-scale sentiment classification datasets (Amazon & TripAdvisor) that we used in our CIKM-2011 publications. Both data and data splits are shared.
Large-scale sentiment classification (version 2012) | English text (string) | (Data) (bibTex) | Version 2 of the two large-scale sentiment classification datasets (Amazon & TripAdvisor), which we used in our ECML-2012 publications. Both data and data splits are shared.
