This program, written by Steven Kolln, downloads a zip file of newsgroup archives, unzips it, and then counts the number of articles per newsgroup, counts the number of cross-posted articles, counts how many times the words "computer" and "capital" are used, and collects all email addresses without duplicates. All output is written to a text file.

Ant can be used to download and unzip the archive. Otherwise, download it from http://s3.amazonaws.com/depasquale/datasets/newsgroups.zip and unzip it into the main directory. If you cannot use Ant to run the program, compile and run the Drive.java file located in the edu/tcnj/kollns1/ directory.
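The per-article checks described above can be sketched roughly as follows. This is an illustration only, not the code in Drive.java: the class name `ArticleScanSketch` and its three methods are hypothetical, the email pattern is a simplified assumption, whole-word case-insensitive matching is assumed for the word counts, and an article is assumed to be cross-posted when its `Newsgroups:` header lists more than one group.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names and matching rules are assumptions.
final class ArticleScanSketch {
    // Simplified address pattern (assumption; not a full RFC 5322 matcher).
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+");

    // Counts whole-word, case-insensitive occurrences of word in text.
    static int countWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Collects addresses into a sorted set so duplicates are dropped.
    // Addresses are lower-cased first, so "Bob@x.org" equals "bob@x.org".
    static Set<String> uniqueEmails(String text) {
        Set<String> found = new TreeSet<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group().toLowerCase(Locale.ROOT));
        }
        return found;
    }

    // Treats an article as cross-posted when its Newsgroups header
    // names more than one comma-separated group.
    static boolean isCrossPosted(String newsgroupsHeader) {
        String value = newsgroupsHeader.replaceFirst("(?i)^\\s*Newsgroups:", "");
        int groups = 0;
        for (String part : value.split(",")) {
            if (!part.trim().isEmpty()) {
                groups++;
            }
        }
        return groups > 1;
    }
}
```

A driver could call these methods once per article file and add the results to running totals before writing the report. Whether plurals such as "computers" should count is a design choice; this sketch does not count them.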
stee11/newsgroups
About
A program that downloads a newsgroup archive and searches it for specific items.