My main research focus is developing new kernel-based machine learning methods to solve real-life data mining problems arising in neuroscience. I study the optimization problems that constitute the core of machine learning methods. My objective is to address new problems in mining and interpreting neural information, to develop new models and formulations, and to find efficient methods to solve them.
I am currently finishing my Ph.D. degree in the
Industrial and Systems Engineering Department at the
My research experience thus far has involved working with researchers in other disciplines, analyzing their data and solving their problems with both standard and novel optimization-based methods.
Dr. Pardalos and I collaborate with Dr. Mingzhou Ding from the Biomedical
Engineering department at the
I am also interested in other research areas,
especially network optimization theory. I collaborate with Dr. Ravindra Ahuja (
My current and future research interests involve pursuing many potential variations of the machine learning algorithms we have recently developed. I also have access to a large variety of neural data to which these new methods can be applied. I would like to extend the application areas to other complex information domains with a similarly repetitive structure, such as stock market data.
I am also involved in other research-related activities. I was the chair of the local committee for the conference on Data Mining, Systems Analysis and Optimization in Neuroscience, which was held in
I strongly believe that a researcher should be well-rounded and technically sound; however, demanding problems are rarely solved alone. In my opinion, collaboration with colleagues is the key to productive, high-quality research in any institution. I think that interactive seminars on focused discussion topics are one of the best ways to initiate collaborative projects, from Ph.D. students to senior faculty. I would strongly advocate including a series of parallel focused seminars in the graduate curriculum. This could motivate faculty at all levels to schedule weekly meetings with their students and increase faculty-student interaction. I also believe that Ph.D. students should be given opportunities to learn about the requirements of academic life by involving them in other research-related activities such as conference organization, proposal writing and peer review.
Below, I have included more details about my publications, my current and future research directions, and other research-related activities I am involved in.
PUBLICATIONS
Selective Support Vector Machines
O. Şeref, E. O. Kundakcioglu, O. A. Prokopyev, P. M. Pardalos
Journal of Combinatorial Optimization, 2007 (forthcoming)
Neural time series recorded during higher-level cognitive tasks show high variation over the time line from one trial to another. To reduce this variation and align the time series, we introduced the selective classification problem. The solution to this problem led to the development of Selective Support Vector Classification (SelSVM), a highly combinatorial problem. We studied a number of relaxations of the SelSVM formulation. SelSVM dominated well-known alignment methods in reducing variation and produced clear results from which strong conclusions can be drawn.
Detecting Spatiotemporal Effects of a Visuomotor Task Using Kernel Methods
O. Şeref, C. Cifarelli, P. M. Pardalos, M. Ding
Annals of Biomedical Engineering (working paper, estimated completion: January 2007)
This study involves a go/no-go visuomotor task in which a macaque monkey responds to a visual input. This cognitive task comprises three main phases: visual processing, categorical classification and response generation. The neural data can be split into two classes with respect to each main phase in order to detect differentiation between the phases over the time line. We apply kernel methods such as SVM classification to detect the temporal distribution of these phases, and SVM-based feature selection to assess the contribution of each cortical region. The data are first processed with the SelSVM method to select the more relevant points, after which both classification and feature selection perform significantly better.
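The two analysis steps described above, SVM classification across task phases and SVM-based feature selection, can be sketched roughly as follows. The data here are synthetic stand-ins for the neural recordings, and scikit-learn's recursive feature elimination (RFE) is used as a generic stand-in for the SVM-based feature selection; the real study's methods and data differ.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC, LinearSVC

# Synthetic "trials": 200 samples, 20 features (e.g. channels).
rng = np.random.default_rng(0)
n_trials, n_features = 200, 20
X = rng.normal(size=(n_trials, n_features))
y = rng.integers(0, 2, size=n_trials)
X[y == 1, :5] += 1.0          # only the first 5 "channels" carry class signal

# (1) SVM classification between the two phases (cross-validated accuracy)
acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()

# (2) SVM-based ranking of feature (channel) contributions via RFE
rfe = RFE(LinearSVC(max_iter=10000), n_features_to_select=5).fit(X, y)
informative = np.flatnonzero(rfe.support_)
print(f"cv accuracy: {acc:.2f}, selected features: {informative}")
```

With the planted signal, the classifier separates the classes well and RFE recovers mostly the informative channels.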
Incremental Network Optimization: Theory and Algorithms
O. Şeref, R. K. Ahuja, J. Orlin
Networks (working paper, estimated completion: December 2006)
The motivation for this problem can be summarized as follows: the real-life practice in a network system, which is usually suboptimal, may be substantially different from the optimal solution to the underlying network problem, so that changing from the current practice to the optimal solution may be infeasible or unrealistic. Given an upper bound on the allowed change, however, the current practice may be modified incrementally to achieve the best possible improvement. We extend this motivation to incremental versions of well-known network optimization problems such as minimum spanning tree, minimum cost flow, shortest path, minimum cost assignment and minimum cut.
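The incremental setting can be illustrated with a toy minimum spanning tree example: starting from a suboptimal current tree, allow at most k edge changes and greedily apply the best single-edge swap (add a non-tree edge, drop the heaviest edge on the cycle it closes). This greedy sketch only illustrates the problem setting, not the algorithms from the paper; the graph and budget are invented.

```python
from collections import defaultdict

def tree_path(tree_edges, u, v):
    """Return the list of tree edges on the unique u-v path."""
    adj = defaultdict(list)
    for a, b, w in tree_edges:
        adj[a].append((b, (a, b, w)))
        adj[b].append((a, (a, b, w)))
    stack, seen, parent = [u], {u}, {}
    while stack:
        x = stack.pop()
        if x == v:
            break
        for y, e in adj[x]:
            if y not in seen:
                seen.add(y)
                parent[y] = (x, e)
                stack.append(y)
    path, x = [], v
    while x != u:
        x, e = parent[x]
        path.append(e)
    return path

def incremental_mst(edges, tree, k):
    """Improve `tree` with at most k single-edge swaps (greedy)."""
    tree = list(tree)
    for _ in range(k):
        best = None  # (gain, edge_in, edge_out)
        for e in edges:
            if e in tree:
                continue
            u, v, w = e
            out = max(tree_path(tree, u, v), key=lambda t: t[2])
            gain = out[2] - w
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, e, out)
        if best is None:
            break            # no improving swap within the budget
        tree.remove(best[2])
        tree.append(best[1])
    return tree

edges = [("a", "b", 1), ("b", "c", 5), ("a", "c", 2), ("c", "d", 1), ("b", "d", 7)]
current = [("a", "b", 1), ("b", "c", 5), ("b", "d", 7)]  # current practice, weight 13
improved = incremental_mst(edges, current, k=1)
print(sum(w for _, _, w in improved))  # weight 7 after one allowed change
```

With k=1 the best single swap replaces the weight-7 edge by the weight-1 edge (total weight 7); with k=2 the greedy sketch reaches the true minimum spanning tree of weight 4.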
A Classification Method Based on Generalized Eigenvalue Problems
M. R. Guarracino, C. Cifarelli, O. Şeref, P. M. Pardalos
Optimization Methods and Software, 2007 (in print)
In this study, we worked on a machine learning method known as the Generalized Eigenvalue Classifier (GEC). We proposed an extension to GEC, called the Regularized Generalized Eigenvalue Classifier (ReGEC), based on a new regularization technique that requires the solution of a single eigenvalue problem, compared to two eigenvalue problems in the original GEC. We showed that the new method is faster and has classification accuracy comparable to the original method and other classification methods.
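A rough sketch of the generalized eigenvalue classification idea: each class is fitted with a hyperplane, and with a Tikhonov-style regularization both planes can be read off the extreme eigenvectors of a single generalized eigenvalue problem. The data, the regularization constant delta and the simplified formulation below are illustrative assumptions, not the paper's exact ReGEC.

```python
import numpy as np
from scipy.linalg import eigh

def fit_planes(A, B, delta=1e-3):
    """Fit one hyperplane w.x = gamma per class from one eigenproblem."""
    GA = np.hstack([A, -np.ones((len(A), 1))])   # rows are [x_i, -1]
    GB = np.hstack([B, -np.ones((len(B), 1))])
    G = GA.T @ GA + delta * np.eye(GA.shape[1])  # regularized scatter, class A
    H = GB.T @ GB + delta * np.eye(GB.shape[1])  # regularized scatter, class B
    vals, vecs = eigh(G, H)                      # ascending eigenvalues
    z_min, z_max = vecs[:, 0], vecs[:, -1]
    # the smallest eigenvector fits class A, the largest fits class B
    return (z_min[:-1], z_min[-1]), (z_max[:-1], z_max[-1])

def predict(x, plane_a, plane_b):
    """Assign x to the class whose plane is closer."""
    da = abs(x @ plane_a[0] - plane_a[1]) / np.linalg.norm(plane_a[0])
    db = abs(x @ plane_b[0] - plane_b[1]) / np.linalg.norm(plane_b[0])
    return 0 if da <= db else 1

rng = np.random.default_rng(1)
A = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))  # class 0 cloud
B = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(50, 2))  # class 1 cloud
pa, pb = fit_planes(A, B)
preds = [predict(x, pa, pb) for x in np.vstack([A, B])]
acc = float(np.mean(np.array(preds) == np.array([0] * 50 + [1] * 50)))
print(f"training accuracy: {acc:.2f}")
```

The single `eigh(G, H)` call is what distinguishes this regularized variant from solving two separate eigenproblems, one per class.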
A Parallel Classification Method for Genomic and Proteomic Problems
M. R. Guarracino, C. Cifarelli, O. Şeref, P. M. Pardalos
AINA (2) Conference Proceedings, January 2006: 588-592
Most biomedical data sets are extremely large and therefore require special computational techniques. In this study we propose an implementation of the ReGEC method for parallel computers. We present computational results from a comparative analysis of the efficiency of the parallel implementation, especially on large-scale genomic classification problems.
Incremental Classification with Generalized Eigenvalues
C. Cifarelli, M. R. Guarracino, O. Şeref, S. Cuciniello, P. M. Pardalos
Journal of Classification, May 2006 (submitted)
This is an incremental version of the Regularized Generalized Eigenvalue Classification method (I-ReGEC) with a fast learning rate on large training sets. The incremental set of training patterns chosen by this method constitutes a significantly smaller subset of the original training set, yet achieves better generalization. I-ReGEC performs comparably to the original ReGEC and other classification methods and gives more consistent classification results.
K-T.R.A.C.E: A Kernel k-means Procedure for Classification
C. Cifarelli, L. Nieddu, O. Şeref, P. M. Pardalos
Computers and Operations Research, 2007 (in print)
This method is a kernelized version of a k-means clustering method known as T.R.A.C.E., and it successfully creates nonlinear associations between potential clusters that would remain separate under linear metrics. The nonlinear mapping produces significantly fewer clusters with increased clustering accuracy.
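The kernel k-means mechanics behind such a method can be sketched as follows: distances to cluster centroids are computed implicitly through kernel matrix entries, so no explicit feature map is ever formed. The two-blob data and RBF kernel width below are invented to keep the demo easy to verify; the nonlinear benefit shows on data where linear metrics fail, given a suitable kernel.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Lloyd-style iterations using only kernel entries."""
    n = K.shape[0]
    labels = np.random.default_rng(seed).integers(0, k, n)
    for _ in range(n_iter):
        D = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x) - mu_c||^2 expanded entirely in kernel entries
            D[:, c] = (np.diag(K) - 2.0 * K[:, mask].sum(axis=1) / m
                       + K[np.ix_(mask, mask)].sum() / m**2)
        new = D.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

rng = np.random.default_rng(2)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(100, 2))
blob_b = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(100, 2))
X = np.vstack([blob_a, blob_b])
labels = kernel_kmeans(rbf_kernel(X), k=2)
print(np.bincount(labels))
```

The centroid never appears explicitly: each distance is assembled from the diagonal, a row sum and a block sum of K, which is exactly what allows nonlinear kernels to be swapped in.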
A System Approach to Remote Sensing of Vegetation Using Kernel Methods
V. A. Yatsenko, O. Şeref, P. A. Khandriga
International Journal of Remote Sensing (working paper)
This study analyzes spectral curves using support vector regression to estimate the chlorophyll content of vegetation, which depends on parameters such as soil type and vegetation density. SVM classification is also used to assess the validity of these parameters.
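A hypothetical sketch of support vector regression on spectral curves: the synthetic reflectance spectra below have an absorption dip whose depth grows with chlorophyll content, so the regressor can recover the target from the curve shape. All wavelengths, constants and model parameters here are invented for illustration and are not from the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(3)
n_samples, n_bands = 300, 50
wavelengths = np.linspace(400, 900, n_bands)      # nm, hypothetical bands
chlorophyll = rng.uniform(10, 60, n_samples)      # target variable

# toy reflectance: an absorption dip near 680 nm deepens with chlorophyll
dip = np.exp(-((wavelengths - 680) ** 2) / (2 * 30**2))
X = (0.5 - 0.004 * chlorophyll[:, None] * dip
     + rng.normal(scale=0.01, size=(n_samples, n_bands)))

X_tr, X_te, y_tr, y_te = train_test_split(X, chlorophyll, random_state=0)
model = SVR(kernel="rbf", C=100, epsilon=1.0).fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)                      # R^2 on held-out spectra
print(f"test R^2: {r2:.2f}")
```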
CURRENT AND FUTURE RESEARCH
My current research involves possible extensions to the selective classification problems and their applications to a number of brain-related data mining problems. Below is a list of the general topics I have started working on, concerning selective classification problems and the various neural data sets to which I will be applying SelSVM and these new extensions.
Extensions to Selective Classification
Selective classification algorithms have performed very well on the neural data, which provides strong motivation to explore more options. Due to the structure of the core optimization problem, many new extensions and alternative solutions are possible. Below is a list of current problems I have identified and started working on:
Data Mining on Various Brain-Related Problems
The Center for Applied Optimization collaborates with the Department of Biomedical Engineering, the McKnight Brain Institute and Shands Hospital. We have neural data from a wide variety of studies. All of the neural data sets share the same repetitive characteristic, with different phases over time, which makes them ideal applications for standard and selective kernel methods.
Network-Based Data Mining
The brain can also be considered a densely connected network that follows the structure of social networks. This research area has emerged recently and is very promising. I would like to combine my previous research in network optimization with data mining and machine learning, developing network-based data mining approaches to study the social network structure of the brain.
RESEARCH-RELATED ACTIVITIES
Research cannot be reduced to publishing papers alone. There are other important activities that foster research. Below is a list of other activities I am involved in, such as organizing conferences, preparing grant proposals, organizing special discussion groups and peer review.
Organizing Conferences
I was the chair of the local committee for a DIMACS workshop on Data Mining, Systems Analysis and Optimization in Neuroscience, which was held in
Grant Proposals
We have recently prepared a grant proposal regarding our study on the detection of visuomotor integration in the brain, which is currently under review. Such experiences have been quite educational, improving my research writing skills and motivating new research problems. I believe that these opportunities should be introduced to Ph.D. students before they graduate.
Leading Discussion Groups
I have been leading a discussion group on Data Mining and Optimization in Biomedicine (DaMOB), which meets every week for two hours. We also exchange ideas, references and presentations on a platform hosted on the internet. I believe that sharing research ideas and attacking new problems together makes a significant difference in the productivity of a specialized research group. I would be more than willing to initiate a series of parallel seminars in collaboration with other interested faculty to build a strong social network among the students as well as the faculty.
Peer Review
I think that one of the best ways to learn to write good-quality research papers is to review other papers. Reviewing is an excellent way of improving research writing skills, since the reviewer sees different kinds of errors and can reflect on his or her own writing. I am currently a registered reviewer for the Journal of Combinatorial Optimization, the Journal of Global Optimization and Optimization Letters.