This project is a prototyping bed for real-time NLP work against the twitter firehose. It is designed to support Java, Clojure, Python, Ruby, and Javascript as components within Storm topologies. As I build this framework out, please feel free to fork it and adapt it to your needs.
General purpose TwitterStreaming service. Should be injected into Twitter Spouts to hook into the firehose.
Rolling totals of top n words occurring in scoped FilterQuery of twitter stream.
Captures the following:
Rolling filtered tweet stream, e.g. keyword tracking, location, language
Rolling sentiment analysis against language-specific dictionaries
Rolling hashtags ranking
Rolling entity feed ranking
lein run
Or
mvn clojure:run
See: src/main/clojure/com/joelholder/twitter-fun.clj
(ns com.joelholder.twitter-fun
(:import [com.joelholder.twitter Constants StreamReaderService]
[com.joelholder.topology TwitterFunTopology])
(:gen-class)
)
(require 'clojure.pprint)
(use 'clojure.pprint)
;; run the StreamReaderservice
(defn run-twitter []
(let [service (StreamReaderService.)]
(.readTwitterFeed service)
(Thread/sleep 60000)
))
;; run sentiment and rank analytics toplogy
(defn run-topology []
(. TwitterFunTopology main (make-array String 0)))
In Emacs, C-c M-n on the above buffer to switch to the namespace.
Then you can (run-toplogy)
, etc..
FilterQuery qry = new FilterQuery();
//new String[] keywords = {"#wine", "#cabernet", "#chardonnay"};
//new String[] keywords = {"nfl", "football", "cowboys"};
String[] keywords = { "barbie", "MLP", "monster high" };
qry.track(keywords);
stream.filter(qry);
Got tweet:I adore this concept. :D not sure about it being by barbie but what the hey. https://t.co/nceAEPnYdR Got tweet:Ex-Div Reminder For Front Street US MLP Income Fund Light (MLP) https://t.co/Wu7wpmEJ4q Got tweet:1998 Dance 'til Dawn Barbie 2nd in Great Fashions of 20th Century Series NRFB ! https://t.co/JxY1w83BQF https://t.co/g8gPFj6umJ Got tweet:@thiiiiialves se quiser te dou uma barbie Got tweet:The Amaze Chase | Life in the Dreamhouse | Barbie - https://t.co/8YrAW3IaR3 #sport #sport_review #sport_training https://t.co/LVuorqQmVl Got tweet:RT @Deligracy: WHERE ARE THE BUILDS AND BARBIE? - Update: https://t.co/4n8vT5TPiZ via @YouTube Got tweet:https://t.co/Fi7LpQ9KZh https://t.co/Fi7LpQ9KZh https://t.co/Fi7LpQ9KZh https://t.co/Fi7LpQ9KZh https://t.co/ZVLEXSbemh Got tweet:RT @_jordanmo: Sympa @France2tv qui fait le jeu du #FN et préfère annuler #DPDA plutôt que de laisser la parole aux autres candidats suite … Got tweet:RT @VoceNaoSabiaQ: Se a Barbie fosse real, ela seria magra demais para ter filhos e a desproporcional para andar de cabeça erguida.
// filter language
String[] english = new String[]{"en"};
String[] spanish = new String[]{"es"};
String[] langs = Stream.of(english, spanish)
.flatMap(Stream::of)
.toArray(String[]::new);
FilterQuery langQuery = new FilterQuery();
langQuery.language(langs);
String[] keywords = { "wine", "vino" };
langQuery.track(keywords);
stream.filter(langQuery);
Got tweet:@_SofiaBravo_SNM ya me parecía , seguro te metieron pastillas al vino como a vos tg y quedaste duracel Got tweet:A.M. SMITH 247 249 HENNEPIN AVE MINN MN 1800's LIQUOR WINE BOTTLE JUG GALLON https://t.co/QmWM3tyxyF https://t.co/Eph2ed1FLk Got tweet:RT @Playing_Dad: [At Last Supper] *Jesus raises bread* This is my body *raises wine* & my blood *pulls out 8 of Clubs* & this is your card … Got tweet:Never been happier than when my parents turned up to my flat with 4 bottles of wine, 7L of Irn Bru and two loaves of Parkin. Got tweet:https://t.co/3BWwtVHy8J is for sale via @DomainsMachine buy it now!! #wine #tobacco #alcohol #dutyfree #shopping https://t.co/ZyCd7Y8wk1 Got tweet:RT @Country_Words: She’s a bubble bath and candles, baby come and kiss me, she’s a one glass of wine and she’s feelin’ kinda tipsy. -Brad P… Got tweet:Mulled wine on a Monday afternoon #bestflat #anniemac https://t.co/6x3XuyK9S8 Got tweet:#ibmdidyouknow there is a great wine app called wine4me. Getting it now! #ibminsight https://t.co/YkGmC48sEk Got tweet:RT @Luuccho14: Que hermoso esto de que no vino ningún profesor Got tweet:Whaaaaaat ? #Watson knows wine now? I knew I liked #Watson #ibminsight #race2insight Got tweet:It's our birthday week! The Wine and Cheese chat is sold out, but we still have seats for Thursday's... https://t.co/JV8VmCxchx Got tweet:Amy Gross from VineSleuth is speaking my language. Buying wine at a store - not optimal. #IBMInsight https://t.co/41tYxyScDx Got tweet:RT @IBMServiceMgmt: Looking for the perfect bottle of wine? @vinesleuth provides expertise via mobile app. @AmyCGross #ibminsight https://t… Got tweet:62 million wine drinkers in US. But how easy is it for you to pick your bottle! Wine4me success story from @amycgross #insighteconomy Got tweet:RT @ibminsight: Calling all wine enthusiasts and "aspiring oenophiles" -- let @vinesleuth @Wine4MeApp help pick out your perfect wine. #ibm… Got tweet:now #IBMWatson can help me with wine choices - wow #ibminsight Got tweet:RT @HasnaZarooriHai: Amazing Banner Outside A Wine Shop “If You Love Someone Today, Then You’ll Surely Love Me Someday" Got tweet:RT @joelcomm: @Wine4MeApp helps choose the wine you want #IBMInsight #NewWaytoEngage https://t.co/qhTyoMLxME Got tweet:Chicken?? Whats that? https://t.co/NEaOje0wLQ Got tweet:How would several flights of #Burgundy 2003 wines vary when tasted now, compared to 10 years ago? https://t.co/3pv5AeMOiq #wine #winelover Got tweet:#Job #Nashville Full-Time Cashier Wanted (Midtown Wine & Spirits) (1610 Church Street): Midtown Wine & Spirits... https://t.co/TACIrCkj0x Got tweet:So how have I not heard of the wine4.me app?? Cognitive wine selection makes total sense. #IBMWatson #ibminsight
FilterQuery geoQuery = new FilterQuery();
// cities
double[][] sanFrancisco = new double[][] { { -122.75, 36.8 }, { -121.75, 37.8 } };
double[][] newYorkCity = new double[][] { { -74, 40 }, { -73, 41 } };
double[][] cities = Stream.of(sanFrancisco, newYorkCity).flatMap(Stream::of).toArray(double[][]::new);
// states
double[][] california = new double[][] { { 124.434800, 32.433047 }, { -114.015147, 42.120889 } };
double[][] texas = new double[][] { { -106.728330, 25.745428 }, { -93.438336, 36.605486 } };
double[][] newYork = new double[][] { { -79.842750, 40.495969 }, { -71.783881, 45.072033 } };
double[][] states = Stream.of(california, texas, newYork).flatMap(Stream::of).toArray(double[][]::new);
// countries
double[][] usa = new double[][] { { -125.0011, 24.9493 }, { -66.9326, 49.5904 } };
double[][] closeToNorway = new double[][]{new double[]{3.339844, 53.644638}, new double[]{18.984375,72.395706 }};
double northLatitude = 35.2;
double southLatitude = 25.2;
double westLongitude = 62.9;
double eastLongitude = 73.3;
double[][] pakistan = {{westLongitude, southLatitude},{eastLongitude, northLatitude}};
// set your geofencing ...
// geoQuery.locations(cities);
// geoQuery.locations(texas);
// geoQuery.locations(pakistan);
// geoQuery.locations(usa);
// let's check on Scandinavia... :)
geoQuery.locations(closeToNorway);
stream.filter(geoQuery);
Got tweet:@mattelacchiato yo hatte den falschen Screenshot gepostet und 15s später den Tweet gelöscht ... aber das Netz vergisst nie ;) Got tweet:https://t.co/wzZLYio7NM Got tweet:Her bør FFK følge med, imo. Ikke sett de to i år, men spesielt Hoel var glimrende for 08-jr/2 før Kvik. https://t.co/VBy3Rcv81Y Got tweet:@animizacja ja bym chodzi?a za nimi, boj?c si? ?e co? ukradn? Got tweet:Über Zeug nachdenken. https://t.co/Z6xdoHJ2Kd Got tweet:I just wanna live in harrys hair Got tweet:@TheRealLiont love you Got tweet:@emelieperssson Tack för RT!
run SimplerServer.main()
vertx run com.joelholder.vertx.Dashboard -cp target/my-twitter-play-0.0.1-SNAPSHOT.jar:.
cd src/main/java/com/joelholder/vertx/ vertx run com.joelholder.vertx.Dashboard -cp ../../../../../../target/my-twitter-play-0.0.1-SNAPSHOT.jar:.
cd src/main/rb/com/joelholder/vertx/ vertx run dashboard.rb -cp ../../../../../../target/my-twitter-play-0.0.1-SNAPSHOT.jar:.
cd src/main/js/com/joelholder/vertx/ vertx run dashboard.js -cp ../../../../../../target/my-twitter-play-0.0.1-SNAPSHOT.jar:.
cd src/main/groovy/com/joelholder/vertx/ vertx run dashboard.groovy -cp ../../../../../../target/my-twitter-play-0.0.1-SNAPSHOT.jar:.
https://github.com/vert-x3/vertx-examples
https://github.com/vert-x3/vertx-examples/tree/master/metrics-examples
https://github.com/johanhaleby/perfect-storm
http://www.espertech.com/products/esper.php
https://dev.twitter.com/streaming/overview/request-parameters#locations
http://twitter4j.org/en/code-examples.html
https://en.wikipedia.org/wiki/Category:Geobox_locator_United_States
http://www.gps-coordinates.net/
http://www.nrri.umn.edu/worms/downloads/team/TheGeographicCoordinateSystem.pdf
https://github.com/Sofft/my-own-storm
https://github.com/mbonaci/mbo-storm/wiki/Storm-setup-in-Eclipse-with-Maven,-Git-and-GitHub
https://github.com/mbonaci/mbo-storm
https://github.com/kantega/storm-twitter-workshop
https://github.com/kantega/storm-twitter-workshop/wiki/Basic-Twitter-stream-reading-using-Twitter4j
https://gist.github.com/mbonaci/5996278
https://storm.apache.org/documentation/Tutorial.html
https://github.com/apache/storm/tree/master/examples/storm-starter#getting-started