Guidelines for final projects
Choice of topic.
The project must involve substantial experimentation or research in the
area of machine learning, and must give you the opportunity to write code and a
report containing interesting results and analysis. Other than this, there are no
restrictions on the topic. One possibility is to
identify a practical machine learning problem and try to get some good results
on it by applying various machine learning techniques (e.g. identifying spam
webpages, using a publicly available data set).
In this case, you might need to write some code simply to get the data
in an acceptable format, and you would probably also write some code to
implement a new classifier, or a variant of some existing Weka classifier. Another possibility is to investigate a
general claim about machine learning algorithms, using multiple data sets (e.g.
determine whether pruning helps the accuracy of decision trees, and if so by
how much). In this case, you might need
to write some code to facilitate performing numerous large-scale experiments,
and you might also implement a new classifier or a variant of an existing
one. A third possibility is to replicate
and further investigate the results of a previously published research
paper. Many other topics are possible
beyond those described here.
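For a project that tests a general claim on multiple data sets, a small harness that runs every learner on every data set under the same cross-validation protocol can save a lot of manual work. The Python sketch below shows one possible way to organize such experiments; it is an illustration under stated assumptions, not a required design. All function names (run_experiments, cross_val_accuracy, make_toy_dataset and so on) are hypothetical, the two learners are trivial stand-ins for real classifiers such as pruned and unpruned decision trees, and the data is synthetic.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(train_fn, X, y, k=5, seed=0):
    """Mean held-out accuracy of the learner train_fn over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in test]
        predict = train_fn([X[i] for i in train_idx],
                           [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


def train_majority(X, y):
    """Stand-in baseline: always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label


def train_1nn(X, y):
    """Stand-in learner: 1-nearest-neighbour with squared Euclidean distance."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[dists.index(min(dists))]
    return predict


def run_experiments(datasets, learners, k=5, seed=0):
    """Evaluate every learner on every data set with the same fold split."""
    results = {}
    for dname, (X, y) in datasets.items():
        for lname, train_fn in learners.items():
            results[(dname, lname)] = cross_val_accuracy(train_fn, X, y, k, seed)
    return results


def make_toy_dataset(n, seed):
    """Synthetic two-class data (an assumption, only for demonstration)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(3.0 * label, 0.5), rng.gauss(3.0 * label, 0.5)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    datasets = {"toy_a": make_toy_dataset(40, 1), "toy_b": make_toy_dataset(60, 2)}
    learners = {"majority": train_majority, "1nn": train_1nn}
    for key, acc in sorted(run_experiments(datasets, learners).items()):
        print(key, round(acc, 3))
```

Because the same seed is passed to every call, all learners are compared on identical folds for a given data set, which keeps the comparison paired. In a real project the stand-in learners would be replaced by the classifiers under study and the toy data by the chosen public data sets.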
Planning and milestones.
It is essential to define clear objectives and milestones for your
project. In particular, you should
define a preliminary milestone which you are absolutely sure is achievable, and
which will produce some acceptable result, even if you hope to achieve much
more.
Written report.
Your report should be written in the style of a scientific research
paper for publication. It should be
targeted at scientists who are not experts in machine learning, so you should
briefly explain the algorithms you have used and give appropriate
citations. The report must clearly state
the objectives of your project, and the conclusions you reached. The report should be divided into several
sections with a logical structure, including at least the following: an
introduction, a section describing the methods used, a section describing the
results of experiments, and a conclusion. The report's length should be between 8 and 20
pages including graphs, figures and bibliography (yes, you can use double
spacing if you want, but you don't have to).
Presentation.
Presentations should be 10 to 15 minutes in length. You may present your project using any
combination of whiteboard and slides. A
maximum of 10 slides is permitted. Text
slides may have no more than 70 words; slides containing any kind of graph or
figure may have no more than 20 words. The presentation must clearly state the
objectives of your project, and the conclusions you reached.
Grading.
As stated in the syllabus, the
project will be marked out of 100, broken down as follows: 40 for the code, 40
for the written report, and 20 for the presentation.