Yijun Sun, Ph.D.
Interdisciplinary Department of Electrical and Computer Engineering
University of Florida
Email: sunyijun {at} biotech.ufl.edu
I have moved to SUNY Buffalo as an Assistant
Professor in bioinformatics. Please visit my new homepage: http://www.acsu.buffalo.edu/~yijunsun/lab/index.html.
I received M.S. and Ph.D. degrees in electrical engineering from the University of Florida in 2003 and 2004, respectively. I am an Assistant Scientist at the Interdisciplinary Center for Biotechnology Research and an affiliated faculty member at the Department of Electrical and Computer Engineering at the University of Florida. My research interests include machine learning/data mining and bioinformatics. I am a co-recipient of the 2005 IEEE M. Barry Carlton Best Transactions Paper Award. One of my papers was selected as the Spotlight Paper in the September 2010 issue of the prestigious TPAMI journal. My research is supported by the National Science Foundation, the Florida Cancer Research Program, and the Susan Komen Breast Cancer Foundation, and my work on metagenomics and feature selection has been used by more than 200 research institutes worldwide. [Google Scholar]
Patience and diligence, like faith, remove mountains. - William Penn (1644-1718)
Stay hungry, stay foolish. - Steve Jobs (1955-2011)
o Bioinformatics: metagenomics, sequence analysis, microarray data analysis, microbial community analysis, molecular classification and genetic network modeling for cancer diagnosis and prognosis, microbial network analysis, phylogenetic analysis.
o Machine Learning/Data Mining: large margin classification/regression, large-scale clustering analysis, ensemble learning, feature selection/extraction, computational learning theory, network analysis, graphical modeling and Bayesian networks.
o The D Project funded by NSF
o The L Project funded by Bankhead-Coley Cancer Research Program
o The E Project
o Dr. Yunpeng Cai (Postdoc, 2007 - 2011. Now Associate Professor with the Chinese Academy of Sciences)
o Dr. Xiaoyu Wang (Postdoc, 2010 - )
o Dr. Ying Tang (Postdoc, 2011 - 2012)
o Dr. Karthik Gurumoorthy (Postdoc, 2011 - 2012. Now Research Scientist with GE Global Research - India)
o Jin Yao (Ph.D. Student, 2011 - )
o Hedjazi Lyamine (Ph.D. Student at University of Toulouse - France, External committee member)
o Qian Chen (Ph.D. Student, Committee member)
o Lei Yang (Ph.D. Student, Committee member)
o Bing Han (Ph.D. Student, Committee member)
o Jun Xu (Ph.D. Student, Committee member)
o Taoran Lu (Ph.D. Student, Committee member)
o Ming Xue (Ph.D. Student, Committee member)
o Lin Du (Ph.D. Student, Committee member)
o Jun Ling (Ph.D. Student, Committee member)
o Yubo Cheng (Master's Student, graduated in 2009)
o Y. Sun and Y. Cai,
Inferring Microbe-microbe Interactions Using Large Collections of 16S rRNA Pyrosequences,
Technical Report, 2011.
o Y. Cai* and Y. Sun*,
ESPRIT-Forest: Taxonomy Independent Analysis of Tens of Millions of 16S rRNA Pyrosequences Using Parallel Computing,
Technical Report, 2011.
o R. Raychoudhury, R. Sen, Y. Cai, Y. Sun, V. Ulrike-Lietze, D. Boucias, and M. Scharf,
Comparative Metatranscriptomic Signatures of Wood and Paper Feeding in the Gut of the Termite Reticulitermes flavipes (Isoptera: Rhinotermitidae),
Genome Biology, submitted, 2012.
o X. Zhang, C. Wang, Y. Zhang, Y. Sun, and Z. Mou,
The Arabidopsis Mediator Complex Subunit 16 Positively Regulates Salicylate-Mediated Systemic Acquired Resistance and Jasmonate/Ethylene-Induced Defense Pathways,
The Plant Cell, submitted, 2012.
o D. Boucias, Y. Cai, Y. Sun, V. U. Lietze, R. Sen, R. Raychoudhury, and M. Scharf,
The Microbiome of the Lignocellulose-degrading Termite Reticulitermes flavipes: Resistance to Perturbation in Response to Diet,
Microbiology Ecology, submitted, 2012.
o L. Yin, L. Liu, Y. Sun, W. Hou, A. C. Lowe, B. P. Gardner, M. Salemi, W. B. Williams, W. G. Farmerie, J. W. Sleasman, and M. M. Goodenow,
Novel High Resolution Deep Sequencing Analysis Reveals HIV-1 Biodiversity, Population Structure, and Persistence during Natural History of Infection,
Retrovirology, submitted, 2012.
o C. Rosser, V. Urquidi, Y. Cai, Y. Sun, and S. Goodison,
Molecular Biomarker Signature for the Non-Invasive Detection of Bladder Cancer,
Cancer Epidemiology, Biomarkers & Prevention, submitted, 2012.
o M. Ukhanova, T. Culpepper, D. Baer, D. Gordon, S. Kanahori, J. Valentine, J. Neu, Y. Sun, X. Wang, V. Mai,
Gut Microbiota Correlates with Energy Gain from a Dietary Fiber and Appears Associated with Acute and Chronic Intestinal Diseases,
Clinical Microbiology and Infection, Suppl 4, pp. 62-66, 2012. (impact factor: 4.8)
o X. Wang, Y. Cai, Y. Sun, R. Knight, V. Mai,
Secondary Structure Information Does not Improve OTU Picking for 16S rRNA Sequences,
The ISME Journal, vol. 6, no. 7, pp. 1277-1280, 2012. (impact factor: 6.2)
o A-L. Paul, A. Zupanska, D. Ostrow, Y. Zhang, Y. Sun, J. Li, S. Shanker, W. Farmerie, C. Amalfitano, R. J. Ferl,
Spaceflight Transcriptomes: Unique Responses to a Novel Environment,
Astrobiology, vol. 12, no. 1, pp. 40-56, 2012. (impact factor: 2.4)
o Y. Sun*, Y. Cai*, S. Huse, R. Knight, W. Farmerie, X. Wang and V. Mai, (*equal contribution)
Briefings in Bioinformatics, vol. 13, no. 1, pp. 107-21, 2012. (impact factor: 9.3)
o V. Mai, C. M. Young, M. Ukhanova, X. Wang, Y. Sun, G. Casella, D. Theriaque, N. Li, R. Sharma, M. Hudak, J. Neu,
Fecal Microbiota in Premature Infants Prior to Necrotizing Enterocolitis,
PLoS ONE, vol. 6, no. 6, e20647, 2011. (impact factor: 4.4)
o Y. Cai* and Y. Sun*, [Website]
Nucleic Acids Research, vol. 39, no. 14, e95, 2011. (impact factor: 7.8)
o Y. Sun and Y. Cai,
Estimating Species Richness Using Large Collections of 16S rRNA Pyrosequences,
Handbook of Molecular Microbial Ecology: Metagenomics and Complementary Approaches (Edited by Frans J. de Bruijn), Wiley-Blackwell, 2011.
o Y. Cai, H. Lyamine, Y. Sun, and S. Goodison,
Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods,
IEEE Trans. on Pattern Analysis and Machine Intelligence, submitted, 2010.
o Y. Sun*, Y. Cai*, V. Mai, W. Farmerie, F. Yu, J. Li, and S. Goodison, (*equal contribution)
Nucleic Acids Research, vol. 38, no. 22, e205, 2010. (impact factor: 7.8)
o Y. Sun*, Y. Cai*, L. Liu, F. Yu, M. L. Farrell, W. McKendree, and W. Farmerie, (*equal contribution)
ESPRIT: Estimating Species Richness Using Large Collections of 16S rRNA Pyrosequences,
Nucleic Acids Research, vol. 37, no. 10, e76, 2009. (impact factor: 7.8)
The algorithm has been used by more than 200 major research institutes worldwide.
o Y. Sun, S. Todorovic, and S. Goodison,
Local Learning Based Feature Selection for High Dimensional Data Analysis,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 32, no. 9, pp. 1610-1626, 2010. (impact factor: 6.0, the overall top-ranked IEEE transactions journal)
This paper was featured as the Spotlight Paper in the September 2010 issue of TPAMI.
o S. Goodison, Y. Sun, and V. Urquidi,
Review: Derivation of Cancer Diagnostic and Prognostic Signatures from Gene Expression Data,
Bioanalysis, vol. 2, no. 5, pp. 855-862, 2010.
o Y. Sun, V. Urquidi, and S. Goodison,
Breast Cancer Research and Treatment, vol. 119, no. 3, pp. 593-599, 2010. (impact factor: 5.7)
o C. Pascoe, A. Lawande, H. Lam, A. George, Y. Sun, W. Farmerie, and H. Martin,
in Proc. 2010 Symposium on Application Accelerators in High-Performance Computing (SAAHPC10), pp. 1-6, July 2010. (oral presentation)
o Y. Cai, Y. Sun, Y. Cheng, J. Li, and S. Goodison,
Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods,
in Proc. 10th SIAM International Conference on Data Mining (SDM), pp. 862-871, April 2010. (oral presentation/acceptance rate: 82/351 = 23%)
o F. Yu*, Y. Sun*, L. Liu, and W. Farmerie, (*equal contribution)
GSTaxClassifier: A Genomic Signature Based Taxonomic Classifier for Metagenomics Data Analysis,
Bioinformation, vol. 4, no. 1, pp. 46-49, 2009.
o N. Bandyopadhyay, T. Kahveci, S. Ranka, Y. Sun and S. Goodison,
Pathway based Feature Selection Algorithm for Cancer Microarray Data,
Advances in Bioinformatics, 532989, 2009.
o Y. Duan, L. Zhou, D. Hall, W. Li, H. Doddapaneni, H. Lin, L. Liu, C. Vahling, D. Gabriel, K. Williams, A. Dickerman, Y. Sun, and T. Gottwald,
Molecular Plant-Microbe Interactions, vol. 22, no. 8, pp. 1011-1020, 2009. (impact factor: 4.3)
o Y. Sun* and S. Goodison*, (*equal contribution)
Optimizing Molecular Signatures for Predicting Prostate Cancer Recurrence,
The Prostate, vol. 69, no. 10, pp. 1119-27, 2009. (impact factor: 3.7)
This paper was featured in Medical News Today.
o Y. Cai, Y. Sun, J. Li, and S. Goodison,
Online Feature Selection Algorithm with Bayesian L-1 Regularization,
in Proc. 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD09), vol. 5476, pp. 401-413, April 2009. (oral presentation: 39/338 = 12%)
o C. Rosser, L. Liu, Y. Sun, P. Villicana, M. McCullers, S. Porvasnik, and S. Goodison,
Bladder Cancer Associated Gene Expression Signatures Identified by Profiling of Exfoliated Urothelia,
Cancer Epidemiology, Biomarkers and Prevention, vol. 18, no. 2, pp. 444-453, 2009. (impact factor: 4.8)
o Y. Sun and D. Wu,
Feature Extraction through Local Learning,
Statistical Analysis and Data Mining, vol. 2, no. 1, pp. 34-47, 2009.
o Y. Cheng, Y. Cai, Y. Sun, and J. Li,
Semi-supervised Feature Selection under the Logistic I-RELIEF Framework,
in Proc. 19th International Conference on Pattern Recognition (ICPR08), pp. 1-4, December 2008. (oral presentation: 18%)
o Y. Sun, Y. Cai, and S. Goodison,
Combining Nomogram and Microarray Data for Predicting Prostate Cancer Recurrence,
in Proc. 8th IEEE International Conference on Bioinformatics and Bioengineering (BIBE08), pp. 1-7, October 2008.
o Y. Sun and D. Wu,
A RELIEF Based Feature Extraction Algorithm, [Matlab code]
in Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 188-195, April 2008. (oral presentation: 40/282 = 13%)
o Y. Sun, S. Todorovic, and S. Goodison,
A Feature Selection Algorithm Capable of Handling Extremely Large Data Dimensionality, [Matlab code]
in Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 530-540, April 2008. (acceptance rate: 72/282 = 25%)
o Y. Sun,
Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications, [Website]
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 29, no. 6, pp. 1035-1051, June 2007. (impact factor: 6.0)
o Y. Sun, S. Goodison, J. Li, L. Liu, and W. Farmerie,
Improved Breast Cancer Prognosis through the Combination of Clinical and Genetic Markers,
Bioinformatics, vol. 23, no. 1, pp. 30-37, January 2007. (impact factor: 5.0)
This paper was among the 50 most frequently read articles in Bioinformatics: Dec. 2006 (29th), Jan. 2007 (5th), Feb. 2007 (44th). It was featured in MATLAB Digest--Biotech and Pharmaceutical Edition (vol. 1, no. 2, June 2007).
o Y. Sun, S. Todorovic, and J. Li,
Unifying Multi-Class AdaBoost Algorithms with Binary Base Learners under the Margin Framework,
Pattern Recognition Letters (PRL), vol. 28, no. 5, pp. 631-643, April 2007.
o Y. Sun,
Feature Weighting through Local Learning,
Computational Methods of Feature Selection, H. Liu and H. Motoda (eds.), Chapman and Hall/CRC Press, October 2007.
o Y. Sun and J. Li,
Iterative RELIEF for Feature Weighting,
in Proc. International Conference on Machine Learning (ICML06), vol. 29, pp. 1035-1051, June 2006. (acceptance rate 140/700 = 20%)
o Y. Sun, S. Todorovic, J. Li, and D. O. Wu,
Unifying Error-Correcting and Output-Code AdaBoost within the Margin Framework,
in Proc. International Conference on Machine Learning (ICML05), vol. 119, pp. 872-879, August 2005. (acceptance rate 134/491 = 27%) [Matlab code]
o Y. Sun, Z. Liu, S. Todorovic and J. Li,
Adaptive Boosting for Synthetic Aperture Radar Automatic Target Recognition,
IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 43, no. 1, pp. 112-125, January 2007.
o Y. Sun and S. Goodison,
Predicting Breast Cancer Metastasis by Integrating Both Clinical and Genetic Markers,
in Proc. International Conference on Bioinformatics and Computational Biology (BIOCOMP07), vol. 1, pp. 229-235, June 2007. (acceptance rate = 27%)
o Y. Sun, F. Yu, L. Liu, and W. Farmerie,
Estimating Microbial Population Densities Based on Genomic Signatures,
in Proc. International Conference on Bioinformatics and Computational Biology (BIOCOMP07), vol. 1, pp. 163-168, June 2007. (acceptance rate = 27%)
o Y. Sun, L. Liu, M. Popp, and W. Farmerie,
Estimation of Cross-hybridization Signals Using Support Vector Regression,
in Proc. IEEE Symposium of Computations in Bioinformatics and Bioscience (SCBB06), vol. 1, pp. 17-21, June 2006.
o Y. Sun, S. Todorovic, and J. Li,
Reducing the Overfitting of AdaBoost by Controlling its Data Distribution Skewness,
International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI), vol. 20, no. 7, pp. 1093-1116, November 2006.
o Y. Sun, S. Todorovic, and J. Li,
Increasing the Robustness of Boosting Algorithms within the Linear-Programming Framework,
Journal of VLSI Signal Processing Systems, vol. 48, no. 1-2, pp. 5-20, August 2007.
o Y. Sun and J. Li,
Adaptive Learning Approach to Landmine Detection,
IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 41, no. 3, pp. 973-985, July 2005.
o Y. Wang, X. Li, Y. Sun, J. Li and P. Stoica
Adaptive Imaging for Forward-looking Ground Penetrating Radar,
IEEE Trans. on Aerospace and Electronic Systems (TAES), vol. 41, no. 3, pp. 922-936, July 2005.
o Y. Sun, X. Li and J. Li
Practical Landmine Detector Using Forward-Looking Ground Penetrating Radar,
IEE Electronics Letters, vol. 41, pp. 97-98, January 2005.
o Y. Sun, S. Todorovic, J. Li, and D. O. Wu,
A Robust Linear-programming Based Boosting Algorithm,
in Proc. IEEE International Workshop on Machine Learning for Signal Processing (MLSP05), Mystic, CT, September 2005.
o Y. Sun, J. Li, and W. Hager,
Two New Regularized AdaBoost Algorithms,
in Proc. International Conference on Machine Learning and Applications (ICMLA04), pp. 41-48, Louisville, KY, December 2004.
o Y. Sun and J. Li,
IEE Proc.-Radar, Sonar and Navigation, special issue on Time-Frequency Analysis and Feature Extraction, vol. 150, pp. 253-261, August 2003.
o Y. Sun, M. Xue, J. Li, and S. R. Stanfill,
Improving ATR Performance through Distance Metric Learning,
in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE07), Orlando, FL, April 2007.
o Y. Sun and J. Li,
Landmine Detection Using Forward-looking Ground Penetrating Radar,
in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5794, pp. 1089-1097, Orlando, FL, April 2005.
o Q. Liu, Y. Sun, and J. Li,
Automatic Target Recognition with Bayesian Networks for Wide-area Airborne Minefield Detection,
in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5794, pp. 1060-1070, Orlando, FL, April 2005.
o Y. Sun, Z. Liu, S. Todorovic, and J. Li,
SAR Automatic Target Recognition Using AdaBoost,
in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE05), vol. 5808, pp. 282-293, Orlando, FL, April 2005.
o Y. Sun and J. Li,
Boosting a Wavelet Packet Transform Based Landmine Detector,
in Proc. of SPIE on Technologies and Systems for Defense and Security (SPIE04), vol. 5415, pp. 1300-1309, Orlando, FL, April 2004.
o Y. Sun and J. Li,
Detect Buried Plastic Mines Using Time-frequency Analysis,
in Proc. of SPIE on Detection and Remediation Technologies for Mine and Minelike Targets (SPIE03), vol. 5089, pp. 851-862, Orlando, FL, April 2003.
US Patents
o Y. Sun and J. Li, Land Mine Detector, Patent No. US 7173560 B2, No. G01V003/12 (International Class), Feb. 2007.
o V. Mai, J. Byatt, and Y. Sun, Microbiota Profiling to Identify Subjects at Increased Risk for CRC, pending, 2011.
o V. Mai, Y. Sun, and J. G. Morris, Jr., Methods and Systems for Screening for Risk of Colorectal Polyps and Cancer, Serial No. 61/622,128, 2012.
o Y. Sun, S. Goodison, and L. Liu, Methods of Feature Selection through Local Learning; Breast and Prostate Cancer Prognostic Markers, WO/2009/067655, International Application No.: PCT/US2008/084325, 2009.
Nucleic Acids Research, PLoS ONE, Bioinformatics, Genome Research, Genome Biology, IEEE Trans. on Pattern Analysis and Machine Intelligence, IEEE Trans. on Knowledge and Data Engineering, Applied and Environmental Microbiology, Frontiers in Bioscience, The ISME Journal, IEEE Trans. on Neural Networks, IEEE Trans. on Fuzzy Systems, Journal of Pattern Recognition Research, Neurocomputing, Pattern Recognition, Pattern Recognition Letters, Computational Optimization and Applications, IEE Proceedings - Radar, Sonar and Navigation, IEEE Signal Processing Letters, IEEE Trans. on Aerospace and Electronic Systems, IEEE Trans. on Instrumentation and Measurement, IEEE Trans. on Geoscience and Remote Sensing, Biomedical Signal Processing and Control, Chinese Optics Letters, Knowledge and Information Systems, Artificial Intelligence Review
Editorial Board:
Frontiers in Evolutionary and
Genomic Microbiology (2011-)
Frontiers in Genetics (2011-)
Some Useful Links:
o IEEE Trans. on Pattern Analysis and Machine Intelligence
o Journal of Machine Learning Research
o International Conference on Machine Learning
o SIAM International Conference on Data Mining
o KDnuggets-Analytics and Data Mining Resources
Some Pictures of Beautiful China:
o Taken by Dr. Dimitri Bertsekas
o If we knew what it was we were doing, it would not be called research, would it? - Albert Einstein (1879-1955)
o A stroke of genius: striving for greatness in all you do. - Richard Hamming (1915-1998)