Yijun Sun, Ph.D.

 

Interdisciplinary Center for Biotechnology Research &

Department of Electrical and Computer Engineering

University of Florida

Gainesville, FL 32611

Email: sun {at} dsp.ufl.edu

 

 

I received the M.Sc. and Ph.D. degrees in electrical engineering from the University of Florida, Gainesville, in 2003 and 2004, respectively. I am a research scientist at the Interdisciplinary Center for Biotechnology Research and an affiliated faculty member of the Department of Electrical and Computer Engineering at the University of Florida. My research interests include machine learning/data mining and bioinformatics. I am a co-recipient of the 2005 IEEE M. Barry Carlton Best Transactions Paper Award.

 

Postdoc Association Position

 

Research Interests

o       Bioinformatics: metagenomics, sequence analysis, microbial community analysis, molecular classification, genetic network modeling for cancer diagnosis and prognosis.

 

o       Machine Learning/Data Mining: large margin classification/regression, ensemble learning, feature selection/extraction, computational learning theory, graphical models and Bayesian networks.

 
Postdocs and Students

o       Dr. Yunpeng Cai (Postdoc)

o       Yubo Cheng (Ph.D. Student)

o       Bing Han (Ph.D. Student, Committee member)

o       Jun Xu (Ph.D. Student, Committee member)

 

Publications

o        Y. Cai, Y. Sun, Y. Cheng, J. Li, and S. Goodison,

Molecular Profiling for Predicting Disease Outcomes of ER- Breast Cancer Patients,

Technical Report, 2009.

 

o        Y. Sun, and Y. Cai,

Analyzing Microbe-microbe Interactions Using Large Collections of 16S rRNA Pyrosequences,

Technical Report, 2009.

 

o       Y. Sun, S. Todorovic, and S. Goodison,

Local Learning Based Feature Selection for High Dimensional Data Analysis, [Website]

IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), accepted. (impact factor: 6.0, the overall top-ranked IEEE and CS transactions journal)

 

o       Y. Cai, Y. Sun, Y. Cheng, J. Li, and S. Goodison,

Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods, [Website]

SIAM International Conference on Data Mining (SDM), submitted.

 

o       Y. Sun, Y. Cai, W. Farmerie, F. Yu, and S. Goodison,

Comparative Community Analysis of Human Gut Flora Reveals Obesity-associated Microbial Signatures,

Technical Report, 2009.

 

o       M. L. Farrell, L. Liu, Y. Sun, V. T. Ryan, K. L. Brown, W. G. Norrie, W. G. Farmerie, R. A. Winegar, W. McKendree,

Mesopotamian Air Genome Reveals Unprecedented Bacterial Diversity,

Applied and Environmental Microbiology, submitted, 2009.

 

o        F. Yu*, Y. Sun*, L. Liu, and W. Farmerie, (*equal contribution)

GSTaxClassifier: A Genomic Signature Based Taxonomic Classifier for Metagenomics Data Analysis, [Website]

Bioinformation, accepted, 2009.

 

o       N. Bandyopadhyay, T. Kahveci, S. Ranka, Y. Sun and S. Goodison,

Pathway based Feature Selection Algorithm for Cancer Microarray Data

Advances in Bioinformatics, accepted, 2009.

 

o        Y. Sun*, Y. Cai*, L. Liu, F. Yu, M. L. Farrell, W. McKendree, and W. Farmerie, (*equal contribution)

ESPRIT: Estimating Species Richness Using Large Collections of 16S rRNA Pyrosequences, [Website]

Nucleic Acids Research, vol. 37, no. 10 e76, May 2009. (impact factor: 7.0)

This paper is among the 50 most-frequently read articles in Nucleic Acids Research: June 2006 (42nd).

 

o        L. Yin, L. Liu, Y. Sun, R. R. Gray, A. C. Lowe, W. Hou, J. W. Sleasman, and M. M. Goodenow,

Ultra-deep Pyrosequencing Captured Low Frequency CXCR4 Virus Populations Co-archived with CCR5 Virus in Peripheral Blood Lymphocytes from HIV-infected Therapy-naive Children,

The 17th Conference on Retroviruses and Opportunistic Infections (CROI 2010), San Francisco, CA, February 2010.

 

o       Y. Duan, L. Zhou, D. Hall, W. Li, H. Doddapaneni, H. Lin, L. Liu, C. Vahling, D. Gabriel, K. Williams, A. Dickerman, Y. Sun, and T. Gottwald,

Complete Genome Sequence of Citrus Huanglongbing Bacterium, Candidatus Liberibacter Asiaticus Obtained through Metagenomics,

Molecular Plant-Microbe Interactions, vol. 22, no. 8, pp. 1011-1020, 2009. (impact factor: 4.3)

 

o       Y. Sun, V. Urquidi, and S. Goodison,

Derivation of Molecular Signatures for Breast Cancer Recurrence Prediction Using a Two-way Validation Approach,

Breast Cancer Research and Treatment, DOI: 10.1007/s10549-009-0365-6, 2009. (impact factor: 5.7)

 

o       Y. Sun* and S. Goodison*, (*equal contribution)

Optimizing Molecular Signatures for Predicting Prostate Cancer Recurrence,

The Prostate, vol. 69, no. 10, pp. 1119-1127, 2009. (impact factor: 3.7)

This paper is featured in Medical News Today.

 

o       Y. Cai, Y. Sun, J. Li, and S. Goodison,

Online Feature Selection Algorithm with Bayesian L-1 Regularization,

in Proc. 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD09), Bangkok, Thailand, April 2009. (oral presentation: 39/338 = 12%)

 

o       M. L. Farrell, W. Norrie, V. Ryan, K. Brown, R. Winegar, Y. Sun, L. Liu, W. Farmerie, and W. McKendree,

Metagenomic Analysis of Air,

The 2009 Gordon Research Conferences on Chemical and Biological Terrorism Defense, Galveston, TX, January 2009.

 

o       C. Rosser, L. Liu, Y. Sun, P. Villicana, M. McCullers, S. Porvasnik, and S. Goodison

Bladder Cancer Associated Gene Expression Signatures Identified by Profiling of Exfoliated Urothelia,

Cancer Epidemiology, Biomarkers and Prevention, vol. 18, no. 2, pp. 444-453, 2009. (impact factor: 4.8)

 

o       Y. Sun, and D. Wu

Feature Extraction through Local Learning,

Statistical Analysis and Data Mining, vol. 2, no. 1, pp. 34-47, 2009.

 

o       Y. Cheng, Y. Cai, Y. Sun, and J. Li

Semi-supervised Feature Selection under the Logistic I-RELIEF Framework,

in Proc. 19th International Conference on Pattern Recognition (ICPR08), Tampa, FL, December 2008.

(oral presentation: 18%)

 

o       Y. Sun, Y. Cai, and S. Goodison

Combining Nomogram and Microarray Data for Predicting Prostate Cancer Recurrence

in Proc. 8th IEEE International Conference on Bioinformatics and Bioengineering (BIBE08), pp. 1-7, Athens, Greece, October 2008.

 

o       Y. Sun and D. Wu

A RELIEF Based Feature Extraction Algorithm [Matlab code]

in Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 188-195, April 2008.

(oral presentation: 40/282 = 13%)

 

o       Y. Sun, S. Todorovic, and S. Goodison

A Feature Selection Algorithm Capable of Handling Extremely Large Data Dimensionality [Matlab code]

in Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 530-540, April 2008.

(acceptance rate: 72/282 = 25%)

 

o       Y. Sun

Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications [Matlab code]

IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 29, no. 6, pp. 1035-1051, June 2007. (impact factor: 6.0)

 

o       Y. Sun, S. Goodison, J. Li, L. Liu, and W. Farmerie

Improved Breast Cancer Prognosis through the Combination of Clinical and Genetic Markers

Bioinformatics, vol. 23, no. 1, pp. 30-37, January 2007. (impact factor: 5.0)

This paper was among the 50 most-frequently read articles in Bioinformatics: Dec. 2006 (29th), Jan. 2007 (5th), Feb. 2007 (44th). It is featured in MATLAB Digest--Biotech and Pharmaceutical Edition (vol. 1, no. 2, June 2007).

 

o       Y. Sun, S. Todorovic, and J. Li

Unifying Multi-Class AdaBoost Algorithms with Binary Base Learners under the Margin Framework

Pattern Recognition Letters (PRL), vol. 28, no. 5, pp. 631-643, April 2007.

 

o       Y. Sun

Feature Weighting through Local Learning

Computational Methods of Feature Selection, H. Liu and H. Motoda (eds.), Chapman and Hall/CRC Press, October 2007.

 

o       Y. Sun and J. Li

Iterative RELIEF for Feature Weighting

in Proc. International Conference on Machine Learning (ICML06), vol. 29, pp. 1035-1051, June 2006.

(acceptance rate 140/700 = 20%)

 

o       Y. Sun, S. Todorovic, J. Li, and D. O. Wu

Unifying Error-Correcting and Output-Code AdaBoost within the Margin Framework,

in Proc. International Conference on Machine Learning (ICML05), vol. 119, pp. 872-879, August 2005.

(acceptance rate 134/491 = 27%) [Matlab code]

 

o       Y. Sun, Z. Liu, S. Todorovic and J. Li

Adaptive Boosting for Synthetic Aperture Radar Automatic Target Recognition,

IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 43, no. 1, pp. 112-125, January 2007.

 

o       Y. Sun and S. Goodison

Predicting Breast Cancer Metastasis by Integrating Both Clinical and Genetic Markers,

in Proc. International Conference on Bioinformatics and Computational Biology (BIOCOMP07), vol. 1, pp. 229-235, June 2007.

(acceptance rate = 27%)

 

o       Y. Sun, F. Yu, L. Liu, and W. Farmerie

Estimating Microbial Population Densities Based on Genomic Signatures,

in Proc. International Conference on Bioinformatics and Computational Biology (BIOCOMP07), vol. 1, pp. 163-168, June 2007.

(acceptance rate = 27%)

 

o       Y. Sun, L. Liu, M. Popp, and W. Farmerie

Estimation of Cross-hybridization Signals Using Support Vector Regression,

in Proc. IEEE Symposium of Computations in Bioinformatics and Bioscience (SCBB06), vol. 1, pp. 17-21, June 2006.

 

o       Y. Sun, S. Todorovic, and J. Li

Reducing the Overfitting of AdaBoost by Controlling its Data Distribution Skewness,

International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI), vol. 20, no. 7, pp. 1093-1116, November 2006.

 

o       Y. Sun, S. Todorovic, and J. Li

Increasing the Robustness of Boosting Algorithms within the Linear-Programming Framework,

Journal of VLSI Signal Processing Systems, vol. 48, no. 1-2, pp. 5-20, August 2007.

 

o       Y. Sun and J. Li

Adaptive Learning Approach to Landmine Detection,

IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 41, no. 3, pp. 973-985, July 2005.

 

o       Y. Wang, X. Li, Y. Sun, J. Li and P. Stoica

Adaptive Imaging for Forward-looking Ground Penetrating Radar,

IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 41, no. 3, pp. 922-936, July 2005.

 

o       Y. Sun, X. Li and J. Li

Practical Landmine Detector Using Forward-Looking Ground Penetrating Radar,

IEE Electronics Letters, vol. 41, pp. 97-98, January 2005.

 

o       Y. Sun, S. Todorovic, J. Li, and D. O. Wu

A Robust Linear-programming Based Boosting Algorithm,

in Proc. IEEE International Workshop on Machine Learning for Signal Processing (MLSP05), Mystic, CT, September 2005.

 

o       Y. Sun, J. Li, and W. Hager

Two New Regularized AdaBoost Algorithms,

in Proc. International Conference on Machine Learning and Applications (ICMLA04), pp. 41-48, Louisville, KY, December 2004.

 

o       Y. Sun and J. Li,

Time-frequency Analysis for Buried Plastic Landmine Detection via Forward-looking Ground Penetrating Radar,

IEE Proc.-Radar, Sonar and Navigation, special issue on Time-Frequency Analysis and Feature Extraction, vol. 150, pp. 253-261, August 2003.

 

o       Y. Sun, M. Xue, J. Li, and S. R. Stanfill,

Improving ATR Performance through Distance Metric Learning,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE07), Orlando, FL, April 2007.

 

o       Y. Sun and J. Li,

Landmine Detection Using Forward-looking Ground Penetrating Radar,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5794, pp. 1089-1097, Orlando, FL, April 2005.

 

o       Q. Liu, Y. Sun, and J. Li,

Automatic Target Recognition with Bayesian Networks for Wide-area Airborne Minefield Detection,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5794, pp. 1060-1070, Orlando, FL, April 2005.

 

o       Y. Sun, Z. Liu, S. Todorovic, and J. Li,

SAR Automatic Target Recognition Using AdaBoost,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5808, pp. 282-293, Orlando, FL, April 2005.

 

o       Y. Sun and J. Li,

Boosting a Wavelet Packet Transform Based Landmine Detector,

in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE04), vol. 5415, pp. 1300-1309, Orlando, FL, April 2004.

 

o       Y. Sun and J. Li,

Detect Buried Plastic Mines Using Time-frequency Analysis,

in Proc. of SPIE on Detection and Remediation Technologies for Mine and Minelike Targets (SPIE03), vol. 5089, pp. 851-862, Orlando, FL, April 2003. 

 

US Patents

o       Y. Sun, and J. Li, Land Mine Detector, Patent No. US 7173560 B2, No. G01V003/12 (International Class), Feb. 2007.

 

o       Y. Sun, S. Goodison, and J. Li, Accurate Breast Cancer Prognostic System via Genetic and Clinical Markers, pending, 2006.

 

o       Y. Sun, and S. Goodison, A Feature Selection Algorithm Capable of Handling Extremely Large Data Dimensionality, pending, 2007.

 

 

Some Links:

o Bioinformatics

o BMC Bioinformatics

o IEEE Trans. on Pattern Analysis and Machine Intelligence

o Journal of Machine Learning Research

o International Conference on Machine Learning (ICML09)

o SIAM International Conference on Data Mining (SDM09)

 

 

Some Pictures of Beautiful China:

o Taken by Dr. Dimitri Bertsekas

o Taken by R. Todd King

 

 

If we knew what it was we were doing, it would not be called research, would it? ----- Albert Einstein

 

A Stroke of Genius: Striving for Greatness in All You Do. ----- Richard Hamming