Skip to content

PatrickCallaghan/datastax-userinteractions-search

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Messages Demo

To create the a single node cluster with replication factor of 1 for standard localhost setup, run the following

mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup"

To run the insert

mvn exec:java -Dexec.mainClass="com.datastax.user.interactions.Main"

The default runs 10,000 distinct visits (login,event*,logout). To change this use -DnoOfVisits= eg

mvn exec:java -Dexec.mainClass="com.datastax.user.interactions.Main" -DnoOfVisits=30000

Now we can query data using the Solr Admin interface at http://localhost:8983/solr/#/datastax_user_interactions_demo.user_interaction/query Or we can use the following cql in cqlsh

Get all interactions of type 'Login'

SELECT * FROM user_interaction WHERE solr_query='{"q":"event_type:Login"}' limit 100;

Get all interactions within the 10 minutes

SELECT * FROM user_interaction WHERE solr_query='{"q":"*:*", "fq":"date:[NOW-10MINUTE TO *]"}' limit 100; 

Get all interactions of type 'Login' within the last 10 minutes sorted in descending date order

SELECT * FROM user_interaction WHERE solr_query='{"q":"*:*", "fq":"date:[NOW-10MINUTE TO *]", "sort":"date desc"}' limit 100;

Get all interactions where the clientid is 'mobilefx' and the browser agent contains 'OS' within the last day sorted in descending date order

SELECT * FROM user_interaction WHERE solr_query='{"q":"clientid:mobilefx", "fq":"date:[NOW-1DAY TO *]", "fq":"user_agent:*OS*", "sort":"date desc"}' limit 100;

To remove the tables and the schema, run the following.

mvn exec:java -Dexec.mainClass="com.datastax.demo.SchemaTeardown"

For Spark, run the following in a spark shell to find counts of different attributes

case class UserInteraction (id: java.util.UUID, clientid: String, correlationid: String, date: java.util.Date, details: String, event_type: String, user_agent: String, user_agent_filterd: Option[String], userid: String, reference: String )

val table = sc.cassandraTable[UserInteraction]("datastax_user_interactions_demo", "user_interaction").cache;

table.count

val googleUsers = table.filter(f => f.user_agent.toLowerCase().contains("google"));
googleUsers.count

val logins = table.filter(f => f.event_type.toLowerCase().contains("login"));
logins.count	

val loginGoogle = googleUsers.filter(f => f.event_type.toLowerCase().contains("login"));
loginGoogle.count

val results = csc.sql("SELECT * from datastax_user_interactions_demo.user_interaction");

##New Queries based on session path

To start the webserver run

mvn jetty:run

Find sessions that started like

select * from datastax.session_paths where forward_path like '"F001 F002 F003"';

Find sessions that ended like

select * from datastax.session_paths where reverse_path like '"Logout F003"';

Get a distinct list of all the paths in each of the sessions

select * from datastax.session_paths where solr_query='{"q":"*:*","useFieldCache":true,"facet":{"field":"forward_path"}}';

Proximity Search

select * from datastax.session_paths_global WHERE solr_query = '{"q":"path:\"A001 A003\"~2"}';

Find all customers who followed a workflow

select count(*) from datastax.session_paths_global WHERE solr_query = '{"q":"path:\"A001 A002 A003\""}';	

Find all customers who followed a workflow but haven't gone into a Branch

select count(*) from datastax.session_paths_global WHERE solr_query = '{"q":"path:\"A001 A002 A003\" NOT path:BRANCH"}';

Find all sessions that have a certain event

select count(*) from datastax.session_paths where forward_path like '"A001"';

select * from datastax.session_paths where forward_path like '"A001"';

Find all incomplete journeys

select * from datastax.session_paths WHERE solr_query = '{"q":"forward_path:\"B002 B003\" NOT forward_path:\"B002 B003 B004\""}';

Find all incomplete journeys in the last day

select * from datastax.session_paths WHERE solr_query = '{"q":"forward_path:\"B002 B003\" NOT forward_path:\"B002 B003 B004\"", "fq":"date:[NOW-1HOUR TO *]"}';

How many people when to a branch with 20 steps of selecting A001

select count(*) from datastax.session_paths_global WHERE solr_query = '{"q":"path:\"A001 BRANCH\"~20"}';

Group count journeys with a session for particular channel eg group count all journeys from F001 to F023

http://localhost:8080/datastax-userinteractions-search/rest/analysis-session/F001/F023	 

Group count journeys across channels eg group count all journeys from A001 to C001

http://localhost:8080/datastax-userinteractions-search/rest/analysis/A001/C001

Using Curl we can find the distinct number of users

curl -d "q=*:*&rows=0&useFieldCache=true&json.facet={ x:'unique(userid)' }" http://localhost:8983/solr/datastax.user_interactions/select

About

A short demo showing how to use DSE to search over millions of user interactions. The main queries are by event type, date and multiple filters.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages