SI 618 HW 1
SI 618 Fall 2008
Overview of Homework 1
This assignment is a study of noun phrase (NP) lists generated from the text of web pages, documents, and PDF files. The URLs were gathered by a spider that targeted university pages dealing with institutional diversity.
Objectives
Generate a MontyLingua NP list from a subset of authors, genres, and/or schools. Study the list and try to infer something about its targeted audience. Then generate another list in a similar way and compare the two lists, identifying differences between the selected audiences and genres.
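The comparison step might be sketched as follows. This is only an illustration, not a required method: it assumes the two NP lists have already been extracted and loaded as Python lists of strings, it does not call MontyLingua at all, and the function name compare_np_lists and the sample phrases are hypothetical.

```python
from collections import Counter


def compare_np_lists(nps_a, nps_b):
    """Compare two noun-phrase lists (hypothetical helper, not part of MontyLingua)."""
    # Normalize case and whitespace so "Student Body" and "student body" match.
    counts_a = Counter(p.strip().lower() for p in nps_a if p.strip())
    counts_b = Counter(p.strip().lower() for p in nps_b if p.strip())

    # Guard against empty lists when converting counts to relative frequencies.
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1

    vocab = set(counts_a) | set(counts_b)
    # Positive score: relatively more common in list A; negative: in list B.
    scores = {p: counts_a[p] / total_a - counts_b[p] / total_b for p in vocab}

    return {
        "shared": sorted(set(counts_a) & set(counts_b)),
        "only_a": sorted(set(counts_a) - set(counts_b)),
        "only_b": sorted(set(counts_b) - set(counts_a)),
        # Sort by score (descending), then alphabetically for a stable order.
        "ranked": sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])),
    }


if __name__ == "__main__":
    # Made-up sample phrases, for illustration only.
    list_a = ["diversity office", "student body", "Student Body", "campus climate"]
    list_b = ["student body", "faculty hiring", "faculty hiring"]
    result = compare_np_lists(list_a, list_b)
    print("Shared phrases:", result["shared"])
    print("Only in list A:", result["only_a"])
    print("Only in list B:", result["only_b"])
    for phrase, score in result["ranked"]:
        print(f"{score:+.3f}  {phrase}")
```

Phrases at the top of the ranked output are relatively more frequent in the first list and those at the bottom in the second, which gives a starting point for describing how the two audiences or genres differ. Relative frequency is used rather than raw counts so that lists of different sizes can be compared.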
Deliverables
Create a report that includes the following:
- an abstract of fewer than 150 words
- a description of the data used
- a diary of what was done
- the results (lists)
- a statement of what the lists mean
Document everything on a web page and put a link to it in the class wiki.
