Trident-CF is a highly scalable recommendation engine. It is built on top of Storm, a distributed stream processing framework that runs on a cluster of machines and supports horizontal scaling.

This library implements a user-based collaborative filtering algorithm with binary ratings.
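As a rough illustration of what binary ratings mean for user similarity: each user reduces to the set of items they have rated, and two users can be compared by how much those sets overlap. The sketch below uses the Jaccard index, which is an assumption made for illustration only; this README does not state which similarity measure Trident-CF applies, and the class name `BinarySimilaritySketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: not part of the Trident-CF API.
final class BinarySimilaritySketch {

    // Jaccard index: size of the intersection divided by size of the union
    // of the two users' item sets. Assumed measure, see the note above.
    static double jaccard(Set<Long> itemsOfA, Set<Long> itemsOfB) {
        Set<Long> common = new HashSet<>(itemsOfA);
        common.retainAll(itemsOfB);
        int unionSize = itemsOfA.size() + itemsOfB.size() - common.size();
        // Two users without any rating share no evidence, so return 0.
        return unionSize == 0 ? 0.0 : (double) common.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Long> alice = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> bob = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        // 2 shared items out of 4 distinct items.
        System.out.println(jaccard(alice, bob)); // prints 0.5
    }
}
```

With binary ratings there is no rating value to compare, so any set-overlap measure could play this role.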

Note that Trident-CF is still in beta and isn't production-ready. For now it only lays out the fundamental algorithm.

Usage

Trident-CF is based on Trident, a high-level abstraction for real-time computing. If you're familiar with high-level batch processing tools like Pig or Cascading, the concepts of Trident will be very familiar.

It's recommended to read the Storm and Trident documentation.

Build collaborative filtering topology

The Trident-CF algorithm is built on a TridentTopology and processes a stream of binary ratings in order to measure similarity between users. Binary ratings are added in real time, while the similarities of changed users are re-computed on demand using a trigger stream. The TridentCollaborativeFilteringBuilder helps you build the recommendation engine.

For the purposes of illustration, this example processes an existing stream of binary ratings and re-computes users' similarities every time the trigger stream emits a tuple.

// Your trident topology
TridentTopology topology = ...;

// Stream which contains the binary ratings
Stream preferenceStream = ...;

// Stream which emits an empty tuple when user similarities must be re-computed
Stream triggerStream = ...;

// Create collaborative filtering topology
TridentCollaborativeFiltering tcf = new TridentCollaborativeFilteringBuilder()
    .use(topology)
    .process(preferenceStream)
    .updateSimilaritiesOn(triggerStream)
    .build();

Note that the preference stream must contain at least two fields, "user" and "item", while the trigger stream doesn't need any fields.

Trident-CF provides two spout implementations which can be used to create the trigger stream: DelayedSimilaritiesUpdateLauncher and PermanentSimilaritiesUpdateLauncher.
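The split described in this section, where ratings are ingested continuously but similarities are refreshed only when a trigger arrives, can be pictured with a small single-JVM model. This is a sketch under stated assumptions, not the library's implementation: the class `OnDemandSimilaritySketch`, its methods, and the use of the Jaccard index are all hypothetical, and nothing here reflects how the two launcher spouts decide when to emit.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: none of these names belong to the Trident-CF API.
final class OnDemandSimilaritySketch {

    private final Map<Long, Set<Long>> preferences = new HashMap<>();
    private final Set<Long> changedUsers = new HashSet<>();
    private final Map<String, Double> similarities = new HashMap<>();

    // Called for every (user, item) rating: cheap, no similarity work is done here.
    void onPreference(long user, long item) {
        boolean isNew = preferences.computeIfAbsent(user, u -> new HashSet<>()).add(item);
        if (isNew) {
            changedUsers.add(user); // remember that this user needs a refresh
        }
    }

    // Called when the trigger fires: refresh only the pairs involving changed users.
    // Returns the number of users whose similarities were refreshed.
    int onTrigger() {
        int refreshed = changedUsers.size();
        for (Long changed : changedUsers) {
            for (Long other : preferences.keySet()) {
                if (!changed.equals(other)) {
                    similarities.put(pairKey(changed, other),
                            jaccard(preferences.get(changed), preferences.get(other)));
                }
            }
        }
        changedUsers.clear();
        return refreshed;
    }

    // Last computed similarity for the pair, or 0 if it was never computed.
    double similarity(long a, long b) {
        return similarities.getOrDefault(pairKey(a, b), 0.0);
    }

    // Order-independent key so that (a, b) and (b, a) share one entry.
    private static String pairKey(long a, long b) {
        return Math.min(a, b) + ":" + Math.max(a, b);
    }

    // Assumed similarity measure (Jaccard index over item sets).
    private static double jaccard(Set<Long> a, Set<Long> b) {
        Set<Long> common = new HashSet<>(a);
        common.retainAll(b);
        int union = a.size() + b.size() - common.size();
        return union == 0 ? 0.0 : (double) common.size() / union;
    }
}
```

In this model, similarities stay stale between triggers, which is the trade-off an on-demand update implies: ingestion stays cheap, and the cost of re-computation is paid only as often as the trigger fires.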

Get item recommendations

Item recommendations are generated by aggregating the preferences of the most similar users. Here's the code to process a recommendation query stream and retrieve item recommendations:

// The Trident-CF algorithm
TridentCollaborativeFiltering tcf = ...;

// Recommendations parameters
int nbItems = 10;
int neighborhoodSize = 100;

// Stream containing recommendation queries
Stream recommendationQueryStream = ...;

// Create a new stream which contains a single field: "recommendedItems" (a List of RecommendedItem)
Stream recommendationStream = tcf.createItemRecommendationStream(recommendationQueryStream, nbItems, neighborhoodSize);

Note that the recommendation query stream must contain a "user" field containing a user id (long).
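To make the roles of `nbItems` and `neighborhoodSize` more concrete, here is a plain-Java sketch of neighborhood-based recommendation: keep the `neighborhoodSize` most similar users, sum their similarity for each item the target user has not rated, and return the `nbItems` best-scored items. The class `NeighborhoodRecommenderSketch`, the Jaccard similarity, and the similarity-weighted scoring are assumptions made for illustration; the README does not specify how Trident-CF scores items internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: not the Trident-CF implementation.
final class NeighborhoodRecommenderSketch {

    // Assumed similarity measure (Jaccard index over item sets).
    static double similarity(Set<Long> a, Set<Long> b) {
        Set<Long> common = new HashSet<>(a);
        common.retainAll(b);
        int union = a.size() + b.size() - common.size();
        return union == 0 ? 0.0 : (double) common.size() / union;
    }

    static List<Long> recommend(long user, Map<Long, Set<Long>> preferences,
                                int nbItems, int neighborhoodSize) {
        Set<Long> own = preferences.getOrDefault(user, new HashSet<>());

        // 1. Rank the other users by decreasing similarity to the target user.
        List<Long> neighbors = new ArrayList<>(preferences.keySet());
        neighbors.remove(Long.valueOf(user));
        neighbors.sort(Comparator.comparingDouble(
                (Long other) -> similarity(own, preferences.get(other))).reversed());

        // 2. Aggregate the preferences of the closest neighbors, weighted by similarity.
        Map<Long, Double> scores = new HashMap<>();
        int limit = Math.min(neighborhoodSize, neighbors.size());
        for (Long neighbor : neighbors.subList(0, limit)) {
            double weight = similarity(own, preferences.get(neighbor));
            if (weight == 0.0) {
                continue; // a user with nothing in common carries no signal
            }
            for (Long item : preferences.get(neighbor)) {
                if (!own.contains(item)) {
                    scores.merge(item, weight, Double::sum);
                }
            }
        }

        // 3. Keep the nbItems best-scored items.
        List<Long> items = new ArrayList<>(scores.keySet());
        items.sort(Comparator.comparingDouble((Long item) -> scores.get(item)).reversed());
        return new ArrayList<>(items.subList(0, Math.min(nbItems, items.size())));
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> prefs = new HashMap<>();
        prefs.put(1L, new HashSet<>(Arrays.asList(10L, 11L, 12L)));
        prefs.put(2L, new HashSet<>(Arrays.asList(10L, 11L, 12L, 20L)));
        prefs.put(3L, new HashSet<>(Arrays.asList(10L, 11L, 21L)));
        prefs.put(4L, new HashSet<>(Arrays.asList(30L)));
        System.out.println(recommend(1L, prefs, 2, 2)); // prints [20, 21]
    }
}
```

A larger neighborhood draws on more users at the cost of including weaker matches, while `nbItems` only truncates the final ranking.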

Configure the Trident-CF topology

You can configure Trident-CF by passing a custom Options instance to the TridentCollaborativeFilteringBuilder:

// Custom options
Options options = ...;

// Create collaborative filtering topology
TridentCollaborativeFiltering tcf = new TridentCollaborativeFilteringBuilder()
    .use(topology)
    .with(options)
    .process(preferenceStream)
    .updateSimilaritiesOn(triggerStream)
    .build();

The Options object lets you specify other StateFactory implementations and different parallelism settings.

Trident-CF states

By default, Trident-CF uses non-transactional in-memory states; it also provides non-transactional Redis states. You can instantiate a pre-configured Options with Redis states:

Options options = Options.redis();

Maven

To use Trident-CF, you'll need the jar on your classpath. Trident-CF is hosted on Clojars (a Maven repository). You can either download the latest jar and add it to your project's classpath, or use Maven and declare Trident-CF as a dependency in your pom.xml:

<repository>
  <id>clojars.org</id>
  <url>http://clojars.org/repo</url>
</repository>

<dependency>
  <groupId>com.github.pmerienne</groupId>
  <artifactId>trident-cf</artifactId>
  <version>0.0.1</version>
</dependency>
