LIN 6932 (Section 2702)
University of Florida
Spring Semester 2007
Instructor: Hana Filip
Time: T7 (1:55-2:45) & R7-8 (1:55-3:50)
Place: ARCH (Architecture) 120
Office: 370 Dauer
Office hours: T/R6 (12:50-1:40) & by appointment
E-mail: hana.filip@gmail.com
Office phone: 392-2101 ext 217
Web page: http://plaza.ufl.edu/hfilip
Course Description: This course is an introductory overview of the field of natural language processing and computational linguistics. It pursues two main goals. First, it covers finite-state methods, parsing and grammars, computational semantics, information extraction and information retrieval, question answering, web search, discourse processing, and empirical corpus-based linguistics, including the creation and annotation of large-scale corpora. Second, the course is a computer literacy class, designed to enable you to make a computer do exactly what you want it to; to this end, tools for the working computational linguist will be introduced. The focus of this class is on writing scripts that use available online implementations of NLP/CL applications, rather than on implementing complete applications. In this connection, the class will be concerned with the Unix operating system.
• Homework: 7 homework assignments (see the Homework Collaboration Policy below). Homework is due before class starts on the day it is due. LATE HOMEWORK WILL NOT BE ACCEPTED. I will drop your lowest homework grade.
• Readings: To be read before the class period in which they will be discussed. You will be expected to do a significant amount of textbook reading in this course.
• 84%: best 6 homeworks out of 7
• 16%: class participation
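As a rough illustration of how the weighting above combines, here is a minimal sketch. It assumes that every homework and the participation score are on a 0-100 scale and that the six retained homeworks count equally; neither assumption is stated in this syllabus, and the function name `course_grade` is hypothetical.

```python
def course_grade(homework_scores, participation):
    """Combine scores into a final percentage.

    Assumptions (not stated in the syllabus): every score is on a
    0-100 scale, and the six retained homeworks are weighted equally.
    """
    if len(homework_scores) != 7:
        raise ValueError("expected exactly 7 homework scores")
    # Drop the lowest homework grade by keeping the six highest scores.
    best_six = sorted(homework_scores, reverse=True)[:6]
    homework_average = sum(best_six) / len(best_six)
    # 84% of the grade comes from homework, 16% from class participation.
    return 0.84 * homework_average + 0.16 * participation
```

Under these assumptions, a missed homework (a score of 0) is simply the one that gets dropped, so a single missed assignment does not lower the homework average.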
Required Texts
Selected online PDFs of book chapters and articles.
The abbreviation 'J+M' in the syllabus refers to Jurafsky, Daniel and James Martin. 2007. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice-Hall.
Recommended Texts
Wynne, Martin. A Course In The Unix Operating System. http://www.comp.lancs.ac.uk/computing/users/eiamjw/unix/ (online manual)
This is just one of many Unix tutorials and manuals available online. You may want to look for yourself to see what is available. If you'd rather have a book in hand, you may consider purchasing:
Peek, Jerry, Grace Todino-Gonguet, John Strang. 2002. Learning the Unix Operating System. (Fifth Edition.) O'Reilly Media. http://www.oreilly.com/catalog/lunix5/ (also available at amazon.com).
SYLLABUS (subject to changes)

Wk | Date | HW | Lec | Topic and Readings
1 | Jan 9 | | | Introduction
1 | Jan 11 | | | Overview of Computer Speech and Language Processing, Regular Expressions * J+M Old Chapter 1: Introduction
2 | Jan 16 | | | Unix
2 | Jan 18 | | | Finite Automata * J+M Old Chapter 2: Regular Expressions and Automata * OPTIONAL ADVANCED READING: J+M New Chapter 3: Finite-State Transducers, Morphology, and Edit Distance * FUN STUFF: http://www.cs.princeton.edu/introcs/75turing/ Downloadable Turing Machine Simulator (Princeton University, Computer Science Department)
3 | Jan 23 | | | Unix
3 | Jan 25 | | | Part of Speech Tagging and Intro to Probabilistic Modeling * J+M New Chapter 5: Word Classes and Part of Speech Tagging
4 | Jan 30 | | | Unix
4 | Feb 1 | | | Part of Speech Tagging (II) * J+M New Chapter 5: Word Classes and Part of Speech Tagging
5 | Feb 6 | | | Unix
5 | Feb 8 | | | N-grams * J+M New Chapter 6: Hidden Markov Models and Loglinear Models, pages 1-13 only
6 | Feb 13 | | | Unix
6 | Feb 15 | | | Grammars and Parsing
7 | Feb 20 | | | Unix
7 | Feb 22 | | | Grammars and Parsing (II) * Chomsky Hierarchy
8 | Feb 27 | | | Unix
8 | Mar 1 | | | Grammars and Parsing (III) * J+M New Chapter 12: Parsing with Context-Free Grammars
9 | Mar 6 | | | Unix
9 | Mar 8 | | | Question Answering
10 | Mar 13 | | | SPRING BREAK
10 | Mar 15 | | |
11 | Mar 20 | | | Unix
11 | Mar 22 | | | Machine Translation: Statistical MT * J+M New Chapter 24 (pages 1-46)
12 | Mar 27 | | | Unix
12 | Mar 29 | | | Computational Lexical Semantics 1
13 | Apr 3 | | | Unix
13 | Apr 5 | | | Computational Lexical Semantics 2
14 | Apr 10 | | | Unix
14 | Apr 12 | | | Information Extraction: Faustus, TextPro * Introduction to Information Extraction Technology: A Tutorial Prepared for IJCAI-99 by Douglas E. Appelt * Using Information Extraction to Improve Document Retrieval * Douglas Appelt: TextPro Documentation * The (Non)Utility of Predicate-Argument Frequencies for Pronoun Interpretation by Douglas Appelt and Andy Kehler
15 | Apr 17 | | | Unix
15 | Apr 19 | | | Discourse * J+M New Chapter 20
16 | Apr 24 | | | Unix

Homework Collaboration Policy (This policy is taken directly from Chris Manning's CS 224N course held at Stanford University.)
• You may talk to anybody you want about the assignments, including working through problems together in groups. Indeed, we encourage you to work in groups, and to work with different people through the quarter.
• However, for written problem sets (there may not be any for this class; I'll let you know):
1. you must state on your written assignment the people you discussed problems with
2. you are not allowed to take detailed notes in any group sessions that will appear verbatim in assignment write-ups. Everybody has to turn in written homework answers that are written solely by himself/herself.
• Programming parts/projects: These can be done by oneself or in groups of at most 3, and people may submit a joint submission or identical material, which is assumed to be the joint work of all partners.
*********
Plagiarism or cheating on homework assignments will not be tolerated. Any instance of academic dishonesty will be subject to the rules and regulations set forth under the headings "Standard of Ethical Conduct", "Academic Honesty", and "Student Conduct Code" in The University Record 2004-5, Sec.1, pp.8-9.
If you miss more than four sessions, please drop this course. For details, see "Attendance Policies", The University Record 2004-5, Sec.2, p.13. Failure to attend and actively participate will result in a lowering of grades.