Yijun Sun, Ph.D.

 

Interdisciplinary Center for Biotechnology Research &

Department of Electrical and Computer Engineering

University of Florida

Gainesville, FL 32611

Email: sunyijun {at} biotech.ufl.edu

 

 

I have moved to SUNY Buffalo as an Assistant Professor in bioinformatics. Please visit my new homepage: http://www.acsu.buffalo.edu/~yijunsun/lab/index.html.

 

I received M.S. and Ph.D. degrees in electrical engineering from the University of Florida, in 2003 and 2004, respectively. I am an Assistant Scientist at the Interdisciplinary Center for Biotechnology Research and an affiliated faculty member at the Department of Electrical and Computer Engineering at the University of Florida. My research interests include machine learning/data mining and bioinformatics. I am a co-recipient of the 2005 IEEE M. Barry Carlton Best Transactions Paper Award. One of my papers was selected as the Spotlight Paper in the September 2010 issue of the prestigious TPAMI journal. My research is supported by the National Science Foundation, the Florida Cancer Research Program, and the Susan Komen Breast Cancer Foundation, and my work on metagenomics and feature selection has been used by more than 200 research institutes worldwide. [Google Scholar]

 

Patience and diligence, like faith, remove mountains. - William Penn (1644-1718)

Stay hungry, stay foolish. - Steve Jobs (1955-2011)

My Research Interests

o Bioinformatics: metagenomics, sequence analysis, microarray data analysis, microbial community analysis, molecular classification and genetic network modeling for cancer diagnosis and prognosis, microbial network analysis, phylogenetic analysis

 

o Machine Learning/Data Mining: large margin classification/regression, large-scale clustering analysis, ensemble learning, feature selection/extraction, computational learning theory, network analysis, graphical modeling and Bayesian networks

 

My Projects

o   The D Project funded by NSF

o   The L Project funded by Bankhead-Coley Cancer Research Program

o   The E Project

 

My Postdocs and Students

o   Dr. Yunpeng Cai (Postdoc, 2007 - 2011. Now Associate Professor with the Chinese Academy of Sciences)

o   Dr. Xiaoyu Wang (Postdoc, 2010 - )

o   Dr. Ying Tang (Postdoc, 2011 - 2012)

o   Dr. Karthik Gurumoorthy (Postdoc, 2011 – 2012. Now Research Scientist with GE Global Research - India)

o   Jin Yao (PhD student, 2011-)

o   Hedjazi Lyamine (PhD student at University of Toulouse - France, External committee member)

o   Qian Chen (Ph.D. Student, Committee member)

o   Lei Yang (Ph.D. Student, Committee member)

o   Bing Han (Ph.D. Student, Committee member)

o   Jun Xu (Ph.D. Student, Committee member)

o   Taoran Lu (Ph.D. Student, Committee member)

o   Ming Xue (Ph.D. Student, Committee member)

o   Lin Du (Ph.D. Student, Committee member)

o   Jun Ling (Ph.D. Student, Committee member)

o   Yubo Cheng (Master's Student, graduated in 2009)


My Publications

o   Y. Sun and Y. Cai,

Inferring Microbe-microbe Interactions Using Large Collections of 16S rRNA Pyrosequences,

Technical Report, 2011.

 

o   Y. Cai* and Y. Sun*,

ESPRIT-Forest: Taxonomy Independent Analysis of Tens of Millions of 16S rRNA Pyrosequences Using Parallel Computing,

Technical Report, 2011.

 

o   R. Raychoudhury, R. Sen, Y. Cai, Y. Sun, V. Ulrike-Lietze, D. Boucias, and M. Scharf,

Comparative Metatranscriptomic Signatures of Wood and Paper Feeding in the Gut of the Termite Reticulitermes flavipes (Isoptera: Rhinotermitidae)

Genome Biology, submitted, 2012.

 

o   X. Zhang, C. Wang, Y. Zhang, Y. Sun, and Z. Mou,

The Arabidopsis Mediator Complex Subunit 16 Positively Regulates Salicylate-Mediated Systemic Acquired Resistance and Jasmonate/Ethylene-Induced Defense Pathways

The Plant Cell, submitted, 2012

 

o   D. Boucias, Y. Cai, Y. Sun, V. U. Lietze, R. Sen, R. Raychoudhury, and M. Scharf,

The Microbiome of the Lignocellulose-degrading Termite Reticulitermes flavipes: Resistance to Perturbation in Response to Diet,

Microbiology Ecology, submitted, 2012.

 

o   L. Yin, L. Liu, Y. Sun, W. Hou, A. C. Lowe, B. P. Gardner, M. Salemi, W. B. Williams, W. G. Farmerie, J. W. Sleasman, and M. M. Goodenow,

Novel High Resolution Deep Sequencing Analysis Reveals HIV-1 Biodiversity, Population Structure, and Persistence during Natural History of Infection,

Retrovirology, submitted, 2012.

 

o   C. Rosser, V. Urquidi, Y. Cai, Y. Sun, and S. Goodison,

Molecular Biomarker Signature for the Non-Invasive Detection of Bladder Cancer,

Cancer Epidemiology, Biomarkers & Prevention, submitted, 2012.

 

o   M. Ukhanova, T. Culpepper, D. Baer, D. Gordon, S. Kanahori, J. Valentine, J. Neu, Y. Sun, X. Wang, V. Mai,

Gut Microbiota Correlates with Energy Gain from a Dietary Fiber and Appears Associated with Acute and Chronic Intestinal Diseases,

Clinical Microbiology and Infection, Suppl 4, pp. 62-66, 2012. (impact factor: 4.8)

 

o   X. Wang, Y. Cai, Y. Sun, R. Knight, V. Mai,

Secondary Structure Information Does not Improve OTU Picking for 16S rRNA Sequences,

The ISME Journal, vol. 6, no. 7, pp. 1277-1280, 2012. (impact factor: 6.2)

 

o   A-L. Paul, A. Zupanska, D. Ostrow, Y. Zhang, Y. Sun, J. Li, S. Shanker, W. Farmerie, C. Amalfitano, R. J. Ferl,

Spaceflight Transcriptomes: Unique Responses to a Novel Environment,

Astrobiology, vol. 12, no. 1, pp. 40-56, 2012. (impact factor: 2.4)

 

o   Y. Sun*, Y. Cai*, S. Huse, R. Knight, W. Farmerie, X. Wang and V. Mai, (*equal contribution)

A Large-scale Benchmark Study of Existing Algorithms for Taxonomy-Independent Microbial Community Analysis,

Briefings in Bioinformatics, vol. 13, no. 1, pp. 107-21, 2012. (impact factor: 9.3)

 

o   V. Mai, C. M. Young, M. Ukhanova, X. Wang, Y. Sun, G. Casella, D. Theriaque, N. Li, R. Sharma, M. Hudak, J. Neu,

Fecal Microbiota in Premature Infants Prior to Necrotizing Enterocolitis,

PLoS ONE, vol. 6, no. 6, e20647, 2011. (impact factor: 4.4)

 

o   Y. Cai* and Y. Sun*, [Website]

ESPRIT-Tree: Hierarchical Clustering Analysis of Millions of 16S rRNA Pyrosequences in Quasilinear Time,

Nucleic Acids Research, vol. 39, no. 14, e95, 2011. (impact factor: 7.8)

 

o   Y. Sun and Y. Cai,

Estimating Species Richness Using Large Collections of 16S rRNA Pyrosequences,

Handbook of Molecular Microbial Ecology: Metagenomics and Complementary Approaches (Edited by Frans J. de Bruijn), Wiley-Blackwell, 2011.

 

o   Y. Cai, H. Lyamine, Y. Sun, and S. Goodison,

Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods,

IEEE Trans. on Pattern Analysis and Machine Intelligence, submitted, 2010.

 

o   Y. Sun*, Y. Cai*, V. Mai, W. Farmerie, F. Yu, J. Li, and S. Goodison, (*equal contribution)

Advanced Computational Algorithms for Microbial Community Analysis Using Massive 16S rRNA Sequence Data,

Nucleic Acids Research, vol. 38, no. 22, e205, 2010 (impact factor: 7.8)

 

o   Y. Sun*, Y. Cai*, L. Liu, F. Yu, M. L. Farrell, W. McKendree, and W. Farmerie, (*equal contribution)

ESPRIT: Estimating Species Richness Using Large Collections of 16S rRNA Pyrosequences,

Nucleic Acids Research, vol. 37, no. 10, e76, 2009. (impact factor: 7.8)

The algorithm has been used by more than 200 major research institutes worldwide.

 

o   Y. Sun, S. Todorovic, and S. Goodison,

Local Learning Based Feature Selection for High Dimensional Data Analysis,

IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 32, no. 9, pp. 1610-1626, 2010. (impact factor: 6.0, the overall top-ranked IEEE transactions journal)

This paper was featured as the Spotlight Paper in the September 2010 issue of TPAMI.

 

o   S. Goodison, Y. Sun, and V. Urquidi,

Review: Derivation of Cancer Diagnostic and Prognostic Signatures from Gene Expression Data,

Bioanalysis, vol. 2, no. 5, pp. 855-862, 2010.

 

o   Y. Sun, V. Urquidi, and S. Goodison,

Derivation of Molecular Signatures for Breast Cancer Recurrence Prediction Using a Two-way Validation Approach,

Breast Cancer Research and Treatment, vol. 119, no. 3, pp. 593-599, 2010. (impact factor: 5.7)

 

o   C. Pascoe, A. Lawande, H. Lam, A. George, Y. Sun, W. Farmerie, and H. Martin

Reconfigurable Supercomputing with Scalable Systolic Arrays and In-Stream Control for Wavefront Genomics Processing

in Proc. 2010 Symposium on Application Accelerators in High-Performance Computing (SAAHPC10), pp. 1-6, July 2010. (oral presentation)

 

o   Y. Cai, Y. Sun, Y. Cheng, J. Li, and S. Goodison,

Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods,

in Proc. 10th SIAM International Conference on Data Mining (SDM), pp. 862-871, April 2010.

(oral presentation/acceptance rate: 82/351 = 23%)

 

o   F. Yu*, Y. Sun*, L. Liu, and W. Farmerie, (*equal contribution)

GSTaxClassifier: A Genomic Signature Based Taxonomic Classifier for Metagenomics Data Analysis,

Bioinformation, vol. 4, no. 1, pp. 46-49, 2009.

 

o   N. Bandyopadhyay, T. Kahveci, S. Ranka, Y. Sun and S. Goodison,

Pathway based Feature Selection Algorithm for Cancer Microarray Data,

Advances in Bioinformatics, 532989, 2009.

 

o   Y. Duan, L. Zhou, D. Hall, W. Li, H. Doddapaneni, H. Lin, L. Liu, C. Vahling, D. Gabriel, K. Williams, A. Dickerman, Y. Sun, and T. Gottwald,

Complete Genome Sequence of Citrus Huanglongbing Bacterium, Candidatus Liberibacter asiaticus Obtained through Metagenomics,

Molecular Plant-Microbe Interactions, vol. 22, no. 8, pp. 1011-1020, 2009. (impact factor: 4.3)

 

o   Y. Sun* and S. Goodison*, (*equal contribution)

Optimizing Molecular Signatures for Predicting Prostate Cancer Recurrence,

The Prostate, vol. 69, no. 10, pp. 1119-27, 2009. (impact factor: 3.7)

This paper was featured in Medical News Today.

 

o   Y. Cai, Y. Sun, J. Li, and S. Goodison,

Online Feature Selection Algorithm with Bayesian L-1 Regularization,

in Proc. 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD09), vol. 5476, pp. 401-413, April 2009. (oral presentation: 39/338 = 12%)

  

o   C. Rosser, L. Liu, Y. Sun, P. Villicana, M. McCullers, S. Porvasnik, and S. Goodison

Bladder Cancer Associated Gene Expression Signatures Identified by Profiling of Exfoliated Urothelia,

Cancer Epidemiology, Biomarkers and Prevention, vol. 18, no. 2, pp. 444-453, 2009. (impact factor: 4.8)

 

o   Y. Sun, and D. Wu

Feature Extraction through Local Learning,

Statistical Analysis and Data Mining, vol. 2, no. 1, pp. 34-47, 2009.

 

o   Y. Cheng, Y. Cai, Y. Sun, and J. Li

Semi-supervised Feature Selection under the Logistic I-RELIEF Framework,

in Proc. 19th International Conference on Pattern Recognition (ICPR08), pp. 1-4, December 2008.

(oral presentation: 18%)

 

o   Y. Sun, Y. Cai, and S. Goodison

Combining Nomogram and Microarray Data for Predicting Prostate Cancer Recurrence,

in Proc. 8th IEEE International Conference on Bioinformatics and Bioengineering (BIBE08), pp. 1-7, October 2008.

 

o   Y. Sun and D. Wu

A RELIEF Based Feature Extraction Algorithm [Matlab code]

in Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 188-195, April 2008.

(oral presentation: 40/282 = 13%)

 

o   Y. Sun, S. Todorovic, and S. Goodison

A Feature Selection Algorithm Capable of Handling Extremely Large Data Dimensionality [Matlab code]

in Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 530-540, April 2008.

(acceptance rate: 72/282 = 25%)

 

o   Y. Sun

Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications [Website]

IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 29, no. 6, pp. 1035-1051, June 2007. (impact factor: 6.0)

 

o   Y. Sun, S. Goodison, J. Li, L. Liu, and W. Farmerie

Improved Breast Cancer Prognosis through the Combination of Clinical and Genetic Markers

Bioinformatics, vol. 23, no. 1, pp. 30-37, January 2007. (impact factor: 5.0)

This paper was among the 50 most-frequently read articles in Bioinformatics: Dec. 2006 (29th), Jan. 2007 (5th), Feb. 2007 (44th). It was featured in MATLAB Digest--Biotech and Pharmaceutical Edition (vol. 1, no. 2, June 2007).

 

o   Y. Sun, S. Todorovic, and J. Li

Unifying Multi-Class AdaBoost Algorithms with Binary Base Learners under the Margin Framework

Pattern Recognition Letters (PRL), vol. 28, no. 5, pp. 631-643, April 2007.

 

o   Y. Sun

Feature Weighting through Local Learning

Computational Methods of Feature Selection, H. Liu and H. Motoda (eds.), Chapman and Hall/CRC Press, October 2007.

 

o   Y. Sun and J. Li

Iterative RELIEF for Feature Weighting

in Proc. International Conference on Machine Learning (ICML06), vol. 29, pp. 1035-1051, June 2006.

(acceptance rate 140/700 = 20%)

 

o   Y. Sun, S. Todorovic, J. Li, and D. O. Wu

Unifying Error-Correcting and Output-Code AdaBoost within the Margin Framework,

in Proc. International Conference on Machine Learning (ICML05), vol. 119, pp. 872-879, August 2005.

(acceptance rate 134/491 = 27%) [Matlab code]

 

o   Y. Sun, Z. Liu, S. Todorovic and J. Li

Adaptive Boosting for Synthetic Aperture Radar Automatic Target Recognition,

IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 43, no. 1, pp. 112-125, January 2007.

 

o   Y. Sun and S. Goodison

Predicting Breast Cancer Metastasis by Integrating Both Clinical and Genetic Markers,

in Proc. International Conference on Bioinformatics and Computational Biology (BIOCOMP07), vol. 1, pp. 229-235, June 2007.

(acceptance rate = 27%)

 

o   Y. Sun, F. Yu, L. Liu, and W. Farmerie

Estimating Microbial Population Densities Based on Genomic Signatures,

in Proc. International Conference on Bioinformatics and Computational Biology (BIOCOMP07), vol. 1, pp. 163-168, June 2007.

(acceptance rate = 27%)

 

o   Y. Sun, L. Liu, M. Popp, and W. Farmerie

Estimation of Cross-hybridization Signals Using Support Vector Regression,

in Proc. IEEE Symposium of Computations in Bioinformatics and Bioscience (SCBB06), vol. 1, pp. 17-21, June 2006.

 

o   Y. Sun, S. Todorovic, and J. Li

Reducing the Overfitting of AdaBoost by Controlling its Data Distribution Skewness,

International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI), vol. 20, no. 7, pp. 1093-1116, November 2006.

 

o   Y. Sun, S. Todorovic, and J. Li,

Increasing the Robustness of Boosting Algorithms within the Linear-Programming Framework,

Journal of VLSI Signal Processing Systems, vol. 48, no. 1-2, pp. 5-20, August 2007.

 

o   Y. Sun and J. Li

Adaptive Learning Approach to Landmine Detection,

IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 41, no. 3, pp. 973-985, July 2005.

 

o   Y. Wang, X. Li, Y. Sun, J. Li and P. Stoica

Adaptive Imaging for Forward-looking Ground Penetrating Radar,

IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 41, no. 3, pp. 922-936, July 2005.

 

o   Y. Sun, X. Li and J. Li

Practical Landmine Detector Using Forward-Looking Ground Penetrating Radar,

IEE Electronics Letters, vol. 41, pp. 97-98, January 2005.

 

o   Y. Sun, S. Todorovic, J. Li, and D. O. Wu

A Robust Linear-programming Based Boosting Algorithm,

in Proc. IEEE International Workshop on Machine Learning for Signal Processing (MLSP05), Mystic, CT, September 2005.

 

o   Y. Sun, J. Li, and W. Hager

Two New Regularized AdaBoost Algorithms,

in Proc. International Conference on Machine Learning and Applications (ICMLA04), pp. 41-48, Louisville, KY, December 2004.

 

o   Y. Sun and J. Li,

Time-frequency Analysis for Buried Plastic Landmine Detection via Forward-looking Ground Penetrating Radar,

IEE Proc.-Radar, Sonar and Navigation, special issue on Time-Frequency Analysis and Feature Extraction, vol. 150, pp. 253-261, August 2003.

 

o   Y. Sun, M. Xue, J. Li, and S. R. Stanfill,

Improving ATR Performance through Distance Metric Learning,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE07), Orlando, FL, April 2007.

 

o   Y. Sun and J. Li,

Landmine Detection Using Forward-looking Ground Penetrating Radar,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5794, pp. 1089-1097, Orlando, FL, April 2005.

 

o   Q. Liu, Y. Sun, and J. Li,

Automatic Target Recognition with Bayesian Networks for Wide-area Airborne Minefield Detection,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5794, pp. 1060-1070, Orlando, FL, April 2005.

 

o   Y. Sun, Z. Liu, S. Todorovic, and J. Li,

SAR Automatic Target Recognition Using AdaBoost,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5808, pp. 282-293, Orlando, FL, April 2005.

 

o   Y. Sun and J. Li,

Boosting a Wavelet Packet Transform Based Landmine Detector,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE04), vol. 5415, pp. 1300-1309, Orlando, FL, April 2004.

 

o   Y. Sun and J. Li,

Detect Buried Plastic Mines Using Time-frequency Analysis,

in Proc. of SPIE on Detection and Remediation Technologies for Mine and Minelike Targets (SPIE03), vol. 5089, pp. 851-862, Orlando, FL, April 2003. 

 

US Patents

o  Y. Sun, and J. Li, Land Mine Detector, Patent No. US 7173560 B2, No. G01V003/12 (International Class), Feb. 2007.

 

o  V. Mai, J. Byatt, and Y. Sun, Microbiota Profiling to Identify Subjects at Increased Risk for CRC, pending, 2011.

 

o  V. Mai, Y. Sun, and J. G. Morris, Jr., Methods and Systems for Screening for Risk of Colorectal Polyps and Cancer, Serial No. 61/622,128, 2012.

 

o  Y. Sun, S. Goodison, and L. Liu, Methods of Feature Selection through Local Learning; Breast and Prostate Cancer Prognostic Markers, WO/2009/067655, International Application No.: PCT/US2008/084325, 2009.

 

Reviewer for:

Nucleic Acids Research, PLoS ONE, Bioinformatics, Genome Research, Genome Biology, IEEE Trans. on Pattern Analysis and Machine Intelligence, IEEE Trans. on Knowledge and Data Engineering, Applied and Environmental Microbiology, Frontiers in Bioscience, The ISME Journal, IEEE Trans. on Neural Networks, IEEE Trans. on Fuzzy Systems, Journal of Pattern Recognition Research, Neurocomputing, Pattern Recognition, Pattern Recognition Letters, Computational Optimization and Applications, IEE Proceedings - Radar, Sonar and Navigation, IEEE Signal Processing Letters, IEEE Trans. on Aerospace and Electronic Systems, IEEE Trans. on Instrumentation and Measurement, IEEE Trans. on Geoscience and Remote Sensing, Biomedical Signal Processing and Control, Chinese Optics Letters, Knowledge and Information Systems, Artificial Intelligence

 

Review Editorial Board:

Frontiers in Evolutionary and Genomic Microbiology (2011-)

Frontiers in Genetics (2011-)

 

Some Useful Links:

o Bioinformatics

o BMC Bioinformatics

o IEEE Trans. on Pattern Analysis and Machine Intelligence

o Journal of Machine Learning Research

o International Conference on Machine Learning

o SIAM International Conference on Data Mining

o KDnuggets-Analytics and Data Mining Resources

 

Some Pictures of Beautiful China:

o   Taken by Dr. Dimitri Bertsekas

o  Taken by R. Todd King

 

 

o   If we knew what it was we were doing, it would not be called research, would it? - Albert Einstein (1879-1955)

o   A stroke of genius: striving for greatness in all you do. - Richard Hamming (1915-1998)

 

 

 

 

 

 

 

 

 

 

 
