rmpvilaca/UBlog-Benchmark

Description:

UBlog is a performance evaluation toolkit that mimics the usage of the Twitter social network. The workload may be used with both key-value stores and relational databases. Due to its modular design, anyone is free and welcome to write any implementation targeting any other system and have it easily plugged into the framework.

Our workload definition has been shaped by the results of recent studies on Twitter. We consider just the subset of the seven most used operations from the Twitter API (Search and REST API as of March 2010):

  • statuses_user_timeline
  • statuses_friends_timeline
  • statuses_mentions
  • search_contains_hashtag
  • statuses_update
  • friendships_create
  • friendships_destroy

About:

The current version has implementations for Cassandra, Voldemort, and MySQL.

Requirements

Almost all dependencies are automatically fetched by Maven. The others must be placed in a folder named lib, depending on the target implementation:

Cassandra

  • uuid-3.1.jar
  • apache-cassandra-0.6.1.jar
  • libthrift-r917130.jar
  • hector-0.6.0-11.jar

Voldemort

  • voldemort-0.70.1.jar

Build

Once you have downloaded the source, change into its top-level directory and run:

  1. $ cd Bench
  2. $ mvn install
  3. $ cd ..
  4. $ cd GraphServer
  5. $ mvn assembly:assembly
  6. $ cd ..
  7. $ cd TargetImplementation (the directory of the target implementation: Cassandra, MYSQL or Voldemort)
  8. $ mvn assembly:assembly

This creates a tar.gz with the GraphServer and the target implementation.

Configuration

The tar.gz of each target implementation contains the required libraries, a Windows script, a UNIX script, and a folder named conf with the configuration files. The conf folder contains:

  • log4j.properties : To configure the proper logger level.

  • policy : To define the access policy to the RMI server.

  • ublog.properties : To configure several parameters of the benchmark:

    • benchmark.social.initialTweetsFactor : An initial tweet factor of n means that a user with f followers will have n x f initial tweets.
    • benchmark.social.maximumMessagesTimeline : The maximum number of tweets in the timeline of a user.
    • benchmark.social.seedNextOperation, benchmark.social.seedOwner, benchmark.social.seedTopic, benchmark.social.seedStartFollow : The random seeds used in the benchmark. If not defined, System.nanoTime() is used.
    • benchmark.social.probabilities.probabilitySearchPerTopic, benchmark.social.probabilities.probabilitySearchPerOwner, benchmark.social.probabilities.probabilityGetRecentTweets, benchmark.social.probabilities.probabilityGetFriendsTimeline, benchmark.social.probabilities.probabilityStartFollowing, benchmark.social.probabilities.probabilityStopFollowing : The probability that each type of operation occurs. The sum must be less than or equal to one. The remaining probability is assigned to the new tweet operation.
    • benchmark.server.name, benchmark.server.port : The RMI server name and port.
  • A properties file to configure the data connection to the target implementation.
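The constraint on the operation probabilities can be checked before a run. The following is a minimal sketch, not code from UBlog: the class `OperationMix`, its method name, the treatment of a missing key as 0, and the sample values are all assumptions made for illustration. Only the property names come from the list above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Properties;

// Hypothetical helper (not part of UBlog): validates the operation mix
// that would be read from ublog.properties.
final class OperationMix {
    private static final String PREFIX = "benchmark.social.probabilities.";

    // Property suffixes as listed in this README.
    static final List<String> KEYS = List.of(
        "probabilitySearchPerTopic",
        "probabilitySearchPerOwner",
        "probabilityGetRecentTweets",
        "probabilityGetFriendsTimeline",
        "probabilityStartFollowing",
        "probabilityStopFollowing");

    // Returns the probability left over for the new tweet operation.
    static double newTweetProbability(Properties props) {
        double sum = 0.0;
        for (String key : KEYS) {
            // Assumption: a missing key counts as probability 0.
            double p = Double.parseDouble(props.getProperty(PREFIX + key, "0"));
            if (p < 0.0 || p > 1.0) {
                throw new IllegalArgumentException(key + " must be in [0, 1]: " + p);
            }
            sum += p;
        }
        // Small tolerance for floating-point rounding.
        if (sum > 1.0 + 1e-9) {
            throw new IllegalArgumentException("Probabilities sum to " + sum + " (> 1)");
        }
        return Math.max(0.0, 1.0 - sum);
    }

    public static void main(String[] args) throws IOException {
        // Illustrative values only; they are not defaults shipped with UBlog.
        String sample = String.join("\n",
            PREFIX + "probabilitySearchPerTopic=0.15",
            PREFIX + "probabilitySearchPerOwner=0.05",
            PREFIX + "probabilityGetRecentTweets=0.15",
            PREFIX + "probabilityGetFriendsTimeline=0.45",
            PREFIX + "probabilityStartFollowing=0.025",
            PREFIX + "probabilityStopFollowing=0.025");
        Properties props = new Properties();
        props.load(new StringReader(sample));
        System.out.printf("new tweet probability: %.2f%n", newTweetProbability(props));
    }
}
```

With the sample values, the six listed operations add up to 0.85, so the remainder goes to new tweets.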

Cassandra

The cassandra.properties file configures the following parameters:

  • node : The list of nodes in the form hostname:port.
  • maxActiveConnections : The maximum number of active connections per node.
  • partitioner : The partitioner to use, random or ordered.

The conf folder also contains an example storage-conf.xml configured to run this workload on Cassandra.
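To illustrate the hostname:port form of the node parameter, here is a minimal sketch of how such a list could be parsed. It is not UBlog's own parser: the class `NodeListParser` is hypothetical, and the comma separator between entries is an assumption, since this README only specifies the form of each entry. The host names and ports are placeholders.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of UBlog): parses a list of hostname:port entries.
final class NodeListParser {
    // Assumption: entries are separated by commas.
    static List<InetSocketAddress> parse(String value) {
        List<InetSocketAddress> nodes = new ArrayList<>();
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate trailing commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected hostname:port, got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing.
            nodes.add(InetSocketAddress.createUnresolved(host, port));
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Placeholder hosts and ports, for illustration only.
        for (InetSocketAddress node : parse("node1.example.org:9160, node2.example.org:9160")) {
            System.out.println(node.getHostString() + " -> port " + node.getPort());
        }
    }
}
```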

MySQL

The mysql.properties file configures the following parameters:

  • host.name, host.port : The host name and port of the MySQL server.
  • dbName : The database name.
  • userName : The user name.
  • password : The password.
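As an illustration of how these parameters fit together, the sketch below builds a standard MySQL JDBC URL from them. It is only an assumption about how the values might be combined, not the code UBlog uses: the class `MySqlUrl` is hypothetical and the sample values are placeholders. Only the property names come from the list above.

```java
import java.util.Properties;

// Hypothetical helper (not part of UBlog): builds a JDBC URL from mysql.properties values.
final class MySqlUrl {
    // Combines host.name, host.port and dbName into jdbc:mysql://host:port/dbName.
    static String jdbcUrl(Properties props) {
        String host = require(props, "host.name");
        String port = require(props, "host.port");
        String db = require(props, "dbName");
        return "jdbc:mysql://" + host + ":" + port + "/" + db;
    }

    // Fails fast when a required property is absent or blank.
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        Properties props = new Properties();
        props.setProperty("host.name", "localhost");
        props.setProperty("host.port", "3306");
        props.setProperty("dbName", "ublog");
        // userName and password would typically be passed separately,
        // e.g. to java.sql.DriverManager.getConnection(url, user, password).
        System.out.println(jdbcUrl(props));
    }
}
```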

Voldemort

The voldemort.properties file configures the following parameter:

  • node : The list of nodes in the form hostname:port.

The conf folder also contains an example stores.xml configured to run this workload.

Usage

  1. GraphServer

$ ./run.sh hostname

  2. Benchmark

$ ./run.sh sizeTotal size usernameStarter nOperations thinkTime

Where:

  • size: the number of concurrent clients.
  • usernameStarter: the id of the first user to be emulated.
  • nOperations: the total number of operations to be generated by the workload.
  • thinkTime: the time between operations for each client.

Feedback

Updated source and an issue tracker are available at:

https://github.com/rmpvilaca/UBlog-Benchmark

Your feedback is welcome.

Contact

Ricardo Vilaça (rmvilaca@di.uminho.pt)

Francisco Cruz (fmcruz@di.uminho.pt)
