My main research focus is developing new kernel-based machine learning methods to solve real-life data mining problems arising in neuroscience. I study the optimization problems that constitute the core of machine learning methods. My objective is to address new problems in mining and interpreting neural information, to develop new models and formulations, and to find efficient methods to solve them.

 

I am currently finishing my Ph.D. in the Industrial and Systems Engineering Department at the University of Florida. I work in the Center for Applied Optimization (CAO) with my advisor, Dr. Panos Pardalos. I collaborate with many students in the CAO on projects such as developing new classification algorithms, implementing them efficiently, and applying them to neural data. I also collaborate with international visiting scholars on new kernel-based machine learning algorithms and their applications, such as the classification of large genomic data sets and remote sensing of vegetation.

 

My research experience thus far has involved working with other disciplines, analyzing their data and solving their problems using both standard and novel methods built around optimization problems. Dr. Pardalos and I collaborate with Dr. Mingzhou Ding of the Biomedical Engineering Department at the University of Florida and with his graduate students. I attend Dr. Ding's weekly seminars in the Biomedical Engineering Department to strengthen my background in biomedical problems. We work on a variety of neural data supplied by Dr. Ding's group, ranging from intracranial recordings of monkeys to EEG recordings of human subjects. The CAO also collaborates closely with the McKnight Brain Institute at the University of Florida. We work on EEG data on epilepsy control through vagus nerve stimulation supplied by Dr. George Ghacibeh, and on MEG data from dyslexic subjects supplied by Dr. Richard Frye. I have also completed a Ph.D.-level bioinformatics course in the Computer Science Department, in which I worked on a project involving kernel-based classification methods for remote homology detection.

 

I am also interested in other research areas, especially network optimization theory. I collaborate with Dr. Ravindra Ahuja (University of Florida) and Dr. James Orlin (MIT) on the theory and algorithms of incremental network optimization for fundamental network problems. I believe that a researcher should be well rounded and open to other research directions. I plan to direct my interest in network optimization toward developing network-based data mining methods.

 

My current and future research interests involve pursuing the many potential variations of the machine learning algorithms we have developed recently. I also have access to a large variety of neural data to which these new methods can be applied. I would like to extend the application areas to other complex information domains with a similar repetitive structure, such as stock market data.

 

I am also involved in other research-related activities. I was the chair of the local committee for the conference on Data Mining, Systems Analysis and Optimization in Neuroscience, held in Gainesville, FL in February 2006. I am among the organizers of a similar conference that will be held in March 2007. I have experience with grant proposal writing, and I also lead a seminar on Data Mining and Optimization in Biomedicine.

 

I strongly believe that a researcher should be well rounded and technically sound; however, demanding problems are rarely solved alone. In my opinion, collaboration with colleagues is the key to productive, high-quality research in any institution. I think that interactive seminars on focused discussion topics are one of the best ways to initiate collaborative projects, involving everyone from Ph.D. students to senior faculty. I would strongly advocate including a series of parallel, focused seminars in the graduate curriculum. This may motivate faculty at all levels to schedule weekly meetings with their students and increase faculty-student interaction. I also believe that Ph.D. students should be given opportunities to learn about the demands of academic life by involving them in other research-related activities such as conference organization, proposal writing, and peer review.

 

Below, I include more details about my publications, my current and future research directions, and other research-related activities I am involved in.

 

 

PUBLICATIONS

 

Selective Support Vector Machines

O. Şeref, E. O. Kundakcioglu, O. A. Prokopyev, P. M. Pardalos

Journal of Combinatorial Optimization, 2007 (forthcoming)

Neural time series recorded during higher-level cognitive tasks show high variation over the timeline from one trial to another. In order to reduce this variation and align the time series, we introduced the selective classification problem. Its solution led to Selective Support Vector Classification (SelSVM), a highly combinatorial problem, and we studied a number of relaxations of the SelSVM formulation. SelSVM outperformed well-known alignment methods in reducing variation and produced clearer results from which stronger conclusions can be drawn.
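To make the selection idea concrete, here is a minimal, purely illustrative Python sketch of selective training on synthetic data. It is not the SelSVM formulation or its relaxations from the paper; it replaces them with a simple alternating heuristic (train, then re-select from each trial the candidate point the current classifier places deepest on the correct side of the margin), and all sizes and names are made up for the example.

```python
# Illustrative selective-training heuristic (NOT the SelSVM formulation).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_candidates, n_features = 40, 5, 10

# Synthetic data: each trial supplies several candidate feature vectors and one label.
X = rng.normal(size=(n_trials, n_candidates, n_features))
y = rng.choice([-1, 1], size=n_trials)
X[y == 1] += 0.5  # shift positive-class trials so the classes are roughly separable

selected = np.zeros(n_trials, dtype=int)  # start with the first candidate of each trial
for _ in range(5):  # alternate between training and re-selecting candidates
    clf = SVC(kernel="linear").fit(X[np.arange(n_trials), selected], y)
    scores = np.stack([clf.decision_function(X[:, j, :]) for j in range(n_candidates)], axis=1)
    # keep, per trial, the candidate the classifier places deepest on the correct side
    selected = (scores * y[:, None]).argmax(axis=1)

clf = SVC(kernel="linear").fit(X[np.arange(n_trials), selected], y)
print("training accuracy on selected points:", clf.score(X[np.arange(n_trials), selected], y))
```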


Detecting spatiotemporal effects of a visuomotor task using kernel methods

O. Şeref, C. Cifarelli, P. M. Pardalos, M. Ding

Annals of Biomedical Engineering (working paper, estimated completion: January 2007)

This study involves a go/no-go visuomotor task in which a macaque monkey responds to a visual input. The cognitive task comprises three main phases: visual processing, categorical classification, and response generation. The neural data can be split into two classes with respect to each main phase in order to detect differentiation between them over the timeline. We apply kernel methods such as SVM classification to detect the temporal distribution of these phases, and SVM-based feature selection to assess the contribution of each cortical region. The data are first processed with the SelSVM method to select the most relevant points, after which both classification and feature selection perform significantly better.
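The following is a hedged sketch, on synthetic data, of the kind of analysis this describes: standard sliding-window SVM classification of two task conditions to locate when they become distinguishable over time. It illustrates the general approach only, not the study's actual pipeline; all dimensions and parameters are invented.

```python
# Sliding-window SVM classification over time (illustrative only).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_trials, n_channels, n_times = 60, 8, 100
X = rng.normal(size=(n_trials, n_channels, n_times))
y = rng.choice([0, 1], size=n_trials)
X[y == 1, :, 40:70] += 0.8  # the two conditions differ only in a middle time window

window = 10
for start in range(0, n_times - window + 1, window):
    feats = X[:, :, start:start + window].reshape(n_trials, -1)
    acc = cross_val_score(SVC(kernel="rbf", gamma="scale"), feats, y, cv=5).mean()
    print(f"t = {start:3d}-{start + window:3d}  accuracy = {acc:.2f}")
```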

 

Incremental Network Optimization: Theory and Algorithms

O. Şeref, R. K. Ahuja, J. Orlin

Networks (working paper, estimated completion: December 2006)

The motivation for this problem can be summarized as follows: the real-life practice in a network system, which is usually suboptimal, may differ substantially from the optimal solution of the underlying network problem, so that switching from the current practice to the optimal solution may be infeasible or unrealistic. However, given an upper bound on the allowed change, the current practice may be modified incrementally in a way that yields the best improvement. We extend this idea to incremental versions of well-known network optimization problems such as the minimum spanning tree, minimum cost flow, shortest path, minimum cost assignment, and minimum cut problems.
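For intuition, here is a small, hedged illustration of the incremental idea on the minimum spanning tree problem, using networkx on a synthetic graph. It is not the algorithm from the paper; it simply penalizes edges outside the current tree by a parameter lambda and sweeps lambda until the new tree changes at most k edges from the current practice.

```python
# Incremental MST illustration via a penalty sweep (not the paper's algorithm).
import random
import networkx as nx

random.seed(2)
G = nx.complete_graph(8)                      # small synthetic network
for u, v in G.edges:
    G[u][v]["w"] = random.randint(1, 20)      # true edge costs

# Pretend the current (usually suboptimal) practice T0 is a BFS spanning tree.
T0 = {frozenset(e) for e in nx.bfs_tree(G, 0).edges}

k = 2  # at most k edges of the new tree may lie outside the current practice
for lam in range(0, 25):
    for u, v in G.edges:                      # penalize deviating from T0 by lam
        G[u][v]["pw"] = G[u][v]["w"] + (0 if frozenset((u, v)) in T0 else lam)
    T = {frozenset(e) for e in nx.minimum_spanning_tree(G, weight="pw").edges}
    if len(T - T0) <= k:                      # within the allowed amount of change
        cost = sum(G[u][v]["w"] for u, v in (tuple(e) for e in T))
        print(f"lambda={lam}: changed {len(T - T0)} edges, true cost {cost}")
        break
```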


A Classification Method based on Generalized Eigenvalue Problems

M. R. Guarracino, C. Cifarelli, O. Şeref, P. M. Pardalos

Optimization Methods and Software, 2007 (in print)

In this study, we worked on a machine learning method known as the Generalized Eigenvalue Classifier (GEC). We proposed an extension of GEC, which we call the Regularized Generalized Eigenvalue Classifier (ReGEC), based on a new regularization technique that requires the solution of a single eigenvalue problem instead of the two eigenvalue problems in the original GEC. We showed that the new method is faster and has classification accuracy comparable to the original method and to other classification methods.
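To make the mechanism concrete, below is a small, hedged numerical sketch of the generalized-eigenvalue classification idea in the spirit of GEC: the two class hyperplanes are read off the eigenvectors with the smallest and largest eigenvalues of one generalized eigenvalue problem. It does not reproduce the exact ReGEC regularization; the delta term here is an ordinary Tikhonov perturbation added only for illustration, and the data are synthetic.

```python
# Generalized-eigenvalue classification sketch (not the exact ReGEC regularization).
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(3)
A = rng.normal(loc=0.0, size=(50, 2))      # class A points
B = rng.normal(loc=3.0, size=(50, 2))      # class B points

def augment(M):
    # [M, -e] so that [M, -e] @ [w; gamma] = M w - gamma
    return np.hstack([M, -np.ones((M.shape[0], 1))])

G = augment(A).T @ augment(A)
H = augment(B).T @ augment(B)
delta = 1e-4                               # small Tikhonov term for stability
vals, vecs = eig(G + delta * np.eye(3), H + delta * np.eye(3))
vals = vals.real
z_A = vecs[:, np.argmin(vals)].real        # hyperplane closest to A, farthest from B
z_B = vecs[:, np.argmax(vals)].real        # hyperplane closest to B, farthest from A

def classify(x, z_A, z_B):
    # assign x to the class whose hyperplane w.x = gamma it is closer to
    d = lambda z: abs(x @ z[:2] - z[2]) / np.linalg.norm(z[:2])
    return "A" if d(z_A) < d(z_B) else "B"

print(classify(np.array([0.1, -0.2]), z_A, z_B),
      classify(np.array([2.9, 3.1]), z_A, z_B))
```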

 

A Parallel Classification Method for Genomic and Proteomic Problems

Mario R. Guarracino, C. Cifarelli, O. Şeref, Panos M. Pardalos

AINA (2) Conference Proceedings, January 2006: 588-592

Most biomedical data sets are extremely large and therefore require special computational techniques. In this study we propose a parallel implementation of the ReGEC method. We present computational results from a comparative analysis of the efficiency of the parallel implementation, especially on large-scale genomic classification problems.

 

Incremental Classification with Generalized Eigenvalues

C. Cifarelli, Mario R. Guarracino, O. Şeref, S. Cuciniello, P. M. Pardalos

Journal of Classification, May 2006 (submitted)

This paper presents I-ReGEC, an incremental version of the Regularized Generalized Eigenvalue Classifier with a fast learning rate on large training sets. The training patterns chosen incrementally constitute a significantly smaller subset of the original training set, yet yield better generalization. I-ReGEC performs comparably to the original ReGEC and other classification methods and gives more consistent classification results.

 

K-T.R.A.C.E: A Kernel k-means Procedure for Classification

C. Cifarelli, L. Nieddu, O. Şeref, P. M. Pardalos

Computers and Operations Research, 2007 (in print)

This method is a kernelized version of a k-means clustering method known as T.R.A.C.E. It successfully creates nonlinear associations between potential clusters that would otherwise remain separate under linear metrics. The nonlinear mapping yields significantly fewer clusters with increased clustering accuracy.
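To illustrate what kernelizing a k-means-type procedure buys, here is a generic kernel k-means sketch on synthetic data. It is not the T.R.A.C.E./K-T.R.A.C.E algorithm itself; it only shows the standard trick of computing distances to cluster centroids purely through the kernel matrix, so that nonlinear cluster shapes can be handled.

```python
# Plain kernel k-means (illustrative; not the K-T.R.A.C.E procedure).
import numpy as np

def rbf_kernel(X, gamma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_kmeans(K, k, n_iter=20, seed=0):
    n = K.shape[0]
    labels = np.random.default_rng(seed).integers(0, k, size=n)
    for _ in range(n_iter):
        dist = np.zeros((n, k))
        for c in range(k):
            idx = np.where(labels == c)[0]
            if len(idx) == 0:
                dist[:, c] = np.inf
                continue
            # ||phi(x_i) - mu_c||^2, dropping the constant K_ii term
            dist[:, c] = -2.0 * K[:, idx].mean(axis=1) + K[np.ix_(idx, idx)].mean()
        labels = dist.argmin(axis=1)
    return labels

# Two concentric rings: a standard example where linear metrics struggle.
rng = np.random.default_rng(4)
angles = rng.uniform(0, 2 * np.pi, size=200)
radii = np.where(np.arange(200) < 100, 1.0, 3.0)
X = np.c_[radii * np.cos(angles), radii * np.sin(angles)] + 0.05 * rng.normal(size=(200, 2))

labels = kernel_kmeans(rbf_kernel(X, gamma=2.0), k=2)
print("cluster sizes:", np.bincount(labels))
```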

 

 

A System Approach to Remote Sensing of Vegetation Using Kernel Methods

V. A. Yatsenko, O. Şeref, P. A. Khandriga

International Journal of Remote Sensing (working paper)

This study analyzes spectral curves using support vector regression to estimate the chlorophyll content of vegetation, which depends on parameters such as soil type and vegetation density. SVM classification is also used to assess the validity of these parameters.
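As a hedged illustration of this kind of analysis, the sketch below fits a support vector regression model mapping synthetic "spectral curves" to a chlorophyll value. The spectral model, wavelength range, and parameter choices are invented for the example and are not taken from the study.

```python
# Support vector regression on synthetic spectral curves (illustrative only).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
n_samples, n_bands = 200, 50
chlorophyll = rng.uniform(10, 60, size=n_samples)   # target variable
bands = np.linspace(400, 900, n_bands)              # wavelengths in nm (made up)
# Toy reflectance curves whose shape depends (noisily) on chlorophyll content.
spectra = (np.exp(-((bands - 550) ** 2) / 5000.0)[None, :] * chlorophyll[:, None] / 60.0
           + 0.05 * rng.normal(size=(n_samples, n_bands)))

X_tr, X_te, y_tr, y_te = train_test_split(spectra, chlorophyll, random_state=0)
model = SVR(kernel="rbf", C=10.0, epsilon=1.0).fit(X_tr, y_tr)
print("held-out R^2:", round(model.score(X_te, y_te), 3))
```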

 

 

CURRENT AND FUTURE RESEARCH


My current research involves extensions of the selective classification problem and their application to a number of brain-related data mining problems. Below is a list of general topics I have started working on regarding selective classification, together with the neural data sets to which I will be applying SelSVM and these new extensions.

 

Extensions to Selective Classification

Selective classification algorithms have performed very well on the neural data, which provides strong motivation to explore further options. Due to the structure of the core optimization problem, new extensions and alternative solutions may be abundant. Below is a list of current problems that I have identified and started working on:

 

  • Selective Support Vector Regression: I am working on different variations of selective support vector machines. An immediate extension is selective support vector regression (SelSVR), which has a formulation very similar to that of SelSVM; a sketch of the standard regression formulation it builds on appears after this list.
  • Exact Solutions to Selective Classification: Although the relaxations of the selective classification problem perform well even on brain signal data, solving the original problem exactly remains a challenge.
  • Selective Generalized Eigenvalue Classifiers: The concept of selective classification can be combined with generalized eigenvalue classifiers. I have been working on different formulations of the Selective Generalized Eigenvalue Classification (S-GEC) problem. Our previous work on the regularized method and its parallel and incremental versions can be extended to the selective case.
  • Speeding up SelSVM: We use state-of-the-art optimization software to solve the core quadratic optimization problem in SelSVM classification; however, the size of the data requires faster algorithms. I am working on modifying a fast decomposition algorithm known as Sequential Minimal Optimization (SMO) to make it applicable to SelSVM.
  • A comparative study on selective methods: Finally, I am planning a comprehensive comparative study of selective classification methods applied to a wide variety of neuroscience problems.
  • Other possible applications: Selective classification methods fit single-trial brain signal processing very well. However, they can be applied to any time series with similar repetitive patterns, such as stock market data, which resembles the complexity of the brain with its cyclic, repetitive patterns.
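As a reference for the SelSVR item above, the standard epsilon-insensitive support vector regression primal that a selective variant would presumably build on is shown below. This is the textbook formulation only; the selective mechanism of choosing one point per trial is not reproduced here.

\begin{align*}
\min_{w,\,b,\,\xi,\,\xi^*} \quad & \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right) \\
\text{s.t.} \quad & y_i - \langle w, \phi(x_i)\rangle - b \le \varepsilon + \xi_i, \\
& \langle w, \phi(x_i)\rangle + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i,\ \xi_i^* \ge 0, \qquad i = 1,\dots,n.
\end{align*}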

 

 

Data Mining on Various Brain Related Problems

The Center for Applied Optimization collaborates with the Department of Biomedical Engineering, the McKnight Brain Institute, and Shands Hospital. We have neural data from a wide variety of studies. All of these data sets share the same repetitive character with different phases over time, which makes them ideal applications for standard and selective kernel methods.

 

  • Epilepsy control: I am currently involved in a study analyzing the effects of vagus nerve stimulation on epilepsy, which involves continuous EEG recordings from epileptic subjects. The repetitive nature of the neural data allows application of the data mining methods we developed recently.
  • Visuomotor integration in human subjects: The experiments on the visuomotor task studied in macaque monkeys have been repeated with human subjects using EEG recordings. Although the intensity and resolution of the data differ due to the recording equipment, we expect to achieve similar results with human subjects.
  • Dyslexia: I also work on MEG data recorded from normal and dyslexic subjects performing a word recognition task. This study aims to characterize the differences between normal and dyslexic brains.

 

Network Based Data Mining

The brain can also be viewed as a densely connected network whose structure resembles that of social networks. This research area has emerged only recently and is very promising. I would like to combine my previous work in network optimization with data mining and machine learning to develop network-based data mining approaches for studying this network structure in the brain.

 

 

RESEARCH RELATED ACTIVITIES

 

Research cannot be reduced to publishing papers alone; there are other important activities that foster it. Below is a list of other activities I am involved in, such as organizing conferences, preparing grant proposals, organizing focused discussion groups, and peer review.

 

Organizing Conferences

I was the chair of the local committee of the DIMACS workshop on Data Mining, Systems Analysis and Optimization in Neuroscience, which was held in Gainesville, FL in February 2006. I am currently editing a special issue of the Journal of Combinatorial Optimization with Dr. Panos Pardalos and Dr. Wanpracha Chaovalitwongse, consisting of papers presented at this conference. I am also among the organizers of the upcoming conference on Data Mining, Systems Analysis and Optimization in Biomedicine, which will be held in March 2007 in Gainesville, FL.

 

Grant Proposals

We recently prepared a grant proposal on our study of the detection of visuomotor integration in the brain, which is currently under review. Such experiences have been quite educational, improving my research writing skills as well as motivating new research problems. I believe that these opportunities should be introduced to Ph.D. students before they graduate.

 

Leading Discussion Groups

I have been leading a discussion group on Data Mining and Optimization in Biomedicine (DaMOB), which meets every week for two hours. We also exchange ideas, references, and presentations on an online platform. I believe that sharing research ideas and attacking new problems together makes a significant difference in the productivity of a specialized research group. I would be more than willing to initiate a series of parallel seminars in collaboration with other interested faculty to build a strong social network among the students as well as the faculty.

 

Peer Review

I think that one of the best ways to learn to write high-quality research papers is to review other papers. It is an excellent way to improve one's research writing, since the reviewer sees many kinds of errors and can reflect on his or her own writing. I am currently a registered reviewer for the Journal of Combinatorial Optimization, the Journal of Global Optimization, and Optimization Letters.