A practical Storm Trident tutorial
This tutorial builds on Pere Ferrera's excellent material for the Trident hackathon at Big Data Beers #4 in Berlin.
Have a look at the accompanying slides as well.
- Go through Part*.java to learn about the basics of Trident
- Implement your own topology using Skeleton.java, or have a look at other examples
    └── src
        └── main
            ├── java
            │   └── tutorial
            │       └── storm
            │           ├── trident
            │           │   ├── example         ------ Complete examples
            │           │   ├── operations      ------ Functions and filters that are used in the tutorial/examples
            │           │   └── testutil        ------ Test utility classes (e.g. test data generators)
            │           │       └── TweetIngestor ------ Creates a local Kafka broker that streams the Twitter public stream
            │           ├── Part*.java          ------ Illustrates usage of Trident step by step
            │           └── Skeleton.java       ------ Stub for writing your own topology
            └── resources
                └── tutorial
                    └── storm
                        └── trident
                            └── testutil        ------ Contains test data and config files
- Install Java 1.6 and Maven 3 (these versions are recommended, but you can also use Java 1.7 and/or Maven 2)
- Clone this repo (if you don't have git, you can also download the source as a zip file and extract it)
- Go to the project folder, execute `mvn clean package`, and check that the build succeeds
The Part*.java classes are primarily meant to be read, but you can run them as well. Before you run a main method, comment out all streams except the one you are interested in (otherwise there will be a lot of output).
The example topologies expect a Kafka spout that streams tweets. The Kafka spout needs a Kafka queue. There is a utility class called TweetIngestor which starts a local Kafka broker, connects to Twitter and publishes tweets. To use this class, however, you must provide a valid Twitter access token in the twitter4j.properties file.
To do that:
- Go to https://dev.twitter.com and register
- Create an application and obtain a consumer key, consumer secret, access token and access token secret
- Copy `twitter4j.properties.template` as `twitter4j.properties` and replace the `*******` with real credentials
- After that, execute
java -cp target/trident-tutorial-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
tutorial.storm.trident.example.TopHashtagAnalysis
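
For reference, the credentials file described in the steps above might look like the sketch below once it is filled in. The four `oauth.*` key names are the standard twitter4j configuration properties. That the template in this repo uses exactly these keys is an assumption, so keep whatever keys the template already contains and only replace the placeholders. The values shown are placeholders, not real credentials.

```properties
# Sketch of a filled-in twitter4j.properties (assumed layout; follow the template in the repo).
# Replace each placeholder with the matching value from your Twitter application page.
oauth.consumerKey=YOUR_CONSUMER_KEY
oauth.consumerSecret=YOUR_CONSUMER_SECRET
oauth.accessToken=YOUR_ACCESS_TOKEN
oauth.accessTokenSecret=YOUR_ACCESS_TOKEN_SECRET
```

Since this file contains secrets, it is best kept out of version control.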