Gutenberg Indexer

Overview

This project incrementally downloads books from Project Gutenberg and indexes them. The index is then used for search and item-item similarity.
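To make "item-item similarity" concrete, here is a minimal sketch of one common content-based approach: turn each book's text into a term-frequency vector and compare two books by cosine similarity. This is an illustration only, not this project's implementation. The class and method names (`SimilaritySketch`, `termFrequencies`, `cosine`) are hypothetical, and a real indexer would more likely lean on Lucene's analysis and term statistics than on hand-rolled maps.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration: none of these names come from the project itself.
class SimilaritySketch {

    /** Counts how often each lowercased word occurs in the text. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on anything that is not a letter or a digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Euclidean length of a term-frequency vector. */
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Cosine similarity of two term-frequency vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        // Made-up snippets standing in for full book texts.
        Map<String, Integer> first = termFrequencies("The whale swam through the cold sea.");
        Map<String, Integer> second = termFrequencies("A whale and a ship met on the sea.");
        System.out.println("similarity = " + cosine(first, second));
    }
}
```

Scores range from 0.0 (no shared words) to 1.0 (identical word distributions). Raw counts over-weight common words such as "the", so a practical version would probably add stop-word removal or TF-IDF weighting.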

Status

Extremely alpha: most of the code has not been written yet.

Tech

We will (or do) make use of, and give thanks for:

XStream for XML parsing
Gutenberg's own catalog
Apache Lucene for fast, full-text searching
JUnit

Building

mvn clean install

Why?

I want an awesome search and, more so, kick-ass content-based recommendations for books on Project Gutenberg.
