Guidelines for final projects

Choice of topic.  The project must involve substantial experimentation or research in machine learning, and must give you the opportunity to write code and to produce a report containing interesting results and analysis.  Beyond this, there are no restrictions on the topic.  One possibility is to identify a practical machine learning problem and try to obtain good results on it by applying various machine learning techniques (e.g., identifying spam webpages using a publicly available data set).  In this case, you might need to write some code simply to get the data into an acceptable format, and you would probably also write code to implement a new classifier or a variant of an existing Weka classifier.  Another possibility is to investigate a general claim about machine learning algorithms using multiple data sets (e.g., determine whether pruning improves the accuracy of decision trees and, if so, by how much).  In this case, you might need to write code to facilitate running numerous large-scale experiments, and you might also implement a new classifier or a variant of an existing one.  A third possibility is to replicate and further investigate the results of a previously published research paper.  Many other topics not described here are also possible.
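For the second kind of project, the code that runs the experiments might look roughly like the sketch below.  It is only an illustration in plain Python, not a required design and not Weka code: the two toy classifiers (a majority-class baseline and 1-nearest-neighbour), the synthetic data generator, and all function names are assumptions made for the example.  In a real project you would substitute your own data sets and the classifiers you want to compare (for instance, a pruned and an unpruned decision tree).

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_classifier(train_X, train_y):
    """Toy baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbor_classifier(train_X, train_y):
    """Toy 1-nearest-neighbour classifier (squared Euclidean distance)."""
    def predict(x):
        best = min(
            range(len(train_X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
        )
        return train_y[best]
    return predict


def cross_val_accuracy(make_classifier, X, y, k=5, seed=0):
    """Mean test accuracy of a classifier over k cross-validation folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = make_classifier(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def make_toy_dataset(n, seed):
    """Synthetic two-class data (hypothetical): label is 1 when x0 + x1 > 1."""
    rng = random.Random(seed)
    X = [(rng.random(), rng.random()) for _ in range(n)]
    y = [int(a + b > 1) for a, b in X]
    return X, y


def run_experiments(datasets, classifiers, k=5):
    """Return {dataset name: {classifier name: mean accuracy}}."""
    return {
        dname: {
            cname: cross_val_accuracy(make, X, y, k)
            for cname, make in classifiers.items()
        }
        for dname, (X, y) in datasets.items()
    }


if __name__ == "__main__":
    datasets = {f"toy-{s}": make_toy_dataset(100, seed=s) for s in range(3)}
    classifiers = {
        "majority": majority_classifier,
        "1-nn": nearest_neighbor_classifier,
    }
    for dname, row in run_experiments(datasets, classifiers).items():
        print(dname, {c: round(acc, 3) for c, acc in row.items()})
```

Using the same folds (the same seed) for every classifier on a given data set keeps the comparison paired, which makes differences in accuracy easier to interpret.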

Planning and milestones.  It is essential to define clear objectives and milestones for your project.  In particular, you should define a preliminary milestone which you are absolutely sure is achievable, and which will produce some acceptable result, even if you hope to achieve much more. 

Written report.  Your report should be written in the style of a scientific research paper intended for publication.  It should be targeted at scientists who are not experts in machine learning, so you should briefly explain the algorithms you have used and give appropriate citations.  The report must clearly state the objectives of your project and the conclusions you reached.  The report should be divided into several sections with a logical structure, including at least the following: an introduction, a section describing the methods used, a section describing the results of experiments, and a conclusion.  The report's length should be between 8 and 20 pages, including graphs, figures, and bibliography (yes, you can use double spacing if you want, but you don't have to).

Presentation.  Presentations should be 10 to 15 minutes in length.  You may present your project using any combination of whiteboard and slides.  A maximum of 10 slides is permitted.  Text slides may have no more than 70 words; slides containing any kind of graph or figure may have no more than 20 words.  The presentation must clearly state the objectives of your project and the conclusions you reached.

Grading.  As stated in the syllabus, the project will be marked out of 100, broken down as follows: 40 for the code, 40 for the written report, and 20 for the presentation.