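Go to the Apache Cassandra installation folder > tools > bin > cassandra-stress.
Below is a simple stress test script for Ubuntu with dstat installed. It writes the system stats and the stress tool output to log files; $1 is a label passed as the first argument and used as the log file prefix.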
# Insert phase: dstat records system stats in the background while the stress test runs.
dstat -t -r -m -s -d -c -y -p > /root/$1-vmstat.log &
./cassandra-stress -c 10 -S 100 -n 30000000 -i 1 -o INSERT > /root/$1-stress.log
kill $!   # stop the background dstat before the next phase
# Read phase.
dstat -t -r -m -s -d -c -y -p > /root/$1-read-vmstat.log &
./cassandra-stress -c 10 -S 100 -n 30000000 -i 1 -o READ > /root/$1-read-stress.log
kill $!
# Range slice phase.
dstat -t -r -m -s -d -c -y -p > /root/$1-rangeslice-vmstat.log &
./cassandra-stress -c 10 -S 100 -n 30000000 -i 1 -o RANGE_SLICE > /root/$1-rangeslice-stress.log
kill $!
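Pretty self explanatory: each phase runs with -c 10 (columns per key), -S 100 (column value size in bytes), -n 30000000 (number of keys), -i 1 (progress interval in seconds) and -o (the operation to perform).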
Below are some references on the stress tool, taken from http://www.datastax.com/docs/1.1/references/stress_java.
The cassandra-stress tool is a Java-based stress testing utility for benchmarking and load testing a Cassandra cluster. The binary installation of the tool also includes a daemon, which in larger-scale testing can prevent potential skews in the test results by keeping the JVM warm.
There are different modes of operation:
- Inserting: Loads test data.
- Reading: Reads test data.
- Indexed range slicing: Works with RandomPartitioner on indexed column families.
You can use these modes with or without the cassandra-stressd daemon running (binary installs only).
- Packaged installs: cassandra-stress [options]
- Binary installs: <install_location>/tools/bin/cassandra-stress [options]
The available options are:
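|Option|Description|
|--average-size-values, -V|Generate column values of average rather than specific size.|
|--cardinality <CARDINALITY>, -C <CARDINALITY>|Number of unique values stored in columns. Default is 50.|
|--columns <COLUMNS>, -c <COLUMNS>|Number of columns per key. Default is 5.|
|--column-size <COLUMN-SIZE>, -S <COLUMN-SIZE>|Size of column values in bytes. Default is 34.|
|--compaction-strategy <COMPACTION-STRATEGY>, -Z <COMPACTION-STRATEGY>|Specifies which compaction strategy to use.|
|--comparator <COMPARATOR>, -U <COMPARATOR>|Specifies which column comparator to use. Supported types are: TimeUUIDType, AsciiType, and UTF8Type.|
|--compression <COMPRESSION>, -I <COMPRESSION>|Specifies the compression to use for SSTables. Default is no compression.|
|--consistency-level <CONSISTENCY-LEVEL>, -e <CONSISTENCY-LEVEL>|Consistency level to use (ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY). Default is ONE.|
|--create-index <TYPE>, -x <TYPE>|Type of index to create on column families (KEYS).|
|--enable-cql, -L|Perform queries using CQL (Cassandra Query Language).|
|--family-type <TYPE>, -y <TYPE>|Sets the column family type.|
|--file <FILE>, -f <FILE>|Write output to a given file.|
|--keep-going, -k|Ignore errors when inserting or reading. When set, --keep-trying has no effect. Default is false.|
|--keep-trying <N>, -K <N>|Retry the ongoing operation N times (in case of failure). Use a positive integer. Default is 10.|
|--keys-per-call <KEYS-PER-CALL>, -g <KEYS-PER-CALL>|Number of keys to get_range_slices or multiget per call. Default is 1000.|
|--nodes <NODES>, -d <NODES>|Nodes to perform the test against. Must be comma separated with no spaces. Default is localhost.|
|--nodesfile <NODESFILE>, -D <NODESFILE>|File containing host nodes (one per line).|
|--no-replicate-on-write, -W|Set replicate_on_write to false for counters. Only for counters with a consistency level of ONE (CL=ONE).|
|--num-different-keys <NUM-DIFFERENT-KEYS>, -F <NUM-DIFFERENT-KEYS>|Number of different keys. If less than NUM-KEYS, the same key is re-used multiple times. Default is NUM-KEYS.|
|--num-keys <NUM-KEYS>, -n <NUM-KEYS>|Number of keys to write or read. Default is 1,000,000.|
|--operation <OPERATION>, -o <OPERATION>|Operation to perform: INSERT, READ, RANGE_SLICE, INDEXED_RANGE_SLICE, MULTI_GET, COUNTER_ADD, COUNTER_GET. Default is INSERT.|
|--port <PORT>, -p <PORT>|Thrift port. Default is 9160.|
|--progress-interval <PROGRESS-INTERVAL>, -i <PROGRESS-INTERVAL>|The interval, in seconds, at which progress is output. Default is 10 seconds.|
|--query-names <QUERY-NAMES>, -Q <QUERY-NAMES>|Comma-separated list of column names to retrieve from each row.|
|--random, -r|Use random key generator. When used, --stdev has no effect. Default is false.|
|--replication-factor <REPLICATION-FACTOR>, -l <REPLICATION-FACTOR>|Replication factor to use when creating column families. Default is 1.|
|--replication-strategy <REPLICATION-STRATEGY>, -R <REPLICATION-STRATEGY>|Replication strategy to use (only on insert and when a keyspace does not exist). Default is SimpleStrategy.|
|--send-to <IP-ADDRESS>, -T <IP-ADDRESS>|Sends the command as a request to the cassandra-stressd daemon at the specified IP address. The daemon must already be running at that address.|
|--skip-keys <SKIP-KEYS>, -N <SKIP-KEYS>|Fraction of keys to skip initially. Default is 0.|
|--stdev <STDEV>, -s <STDEV>|Standard deviation. Default is 0.1.|
|--strategy-properties <STRATEGY-PROPERTIES>, -O <STRATEGY-PROPERTIES>|Replication strategy properties in the following format: <dc_name>:<num>,<dc_name>:<num>,… For use with NetworkTopologyStrategy.|
|--threads <THREADS>, -t <THREADS>|Number of threads to use. Default is 50.|
|--unframed, -m|Use unframed transport. Default is false.|
|--use-prepared-statements, -P|(CQL only) Perform queries using prepared statements.|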
Using the Daemon Mode
Usage for the daemon mode in binary installs:
<install_location>/tools/bin/cassandra-stressd start|stop|status [-h <host>]
During stress testing, you can keep the daemon running and send commands to it using the --send-to option.
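The daemon workflow can be sketched as a small wrapper that starts the daemon, sends one command to it and stops it again. This is only an illustration: INSTALL_DIR, DAEMON_ADDR, STRESSD_BIN, STRESS_BIN and stress_via_daemon are hypothetical names, and the sketch assumes the daemon is reachable at DAEMON_ADDR once the start command returns, which may need a short wait in practice.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders; adjust to the actual install location and daemon address.
INSTALL_DIR="${INSTALL_DIR:-/opt/cassandra}"
DAEMON_ADDR="${DAEMON_ADDR:-127.0.0.1}"
STRESSD_BIN="${STRESSD_BIN:-$INSTALL_DIR/tools/bin/cassandra-stressd}"
STRESS_BIN="${STRESS_BIN:-$INSTALL_DIR/tools/bin/cassandra-stress}"

# stress_via_daemon <cassandra-stress options...>
# Starts the daemon, sends one stress command to it, then stops the daemon.
stress_via_daemon() {
    local status=0

    # Give up early if the daemon cannot be started.
    "$STRESSD_BIN" start || return 1

    # Forward all options and route the request through the daemon.
    "$STRESS_BIN" "$@" --send-to "$DAEMON_ADDR" || status=$?

    # Always stop the daemon, even when the stress command failed.
    "$STRESSD_BIN" stop || true
    return "$status"
}

# Example usage (requires a binary install of Cassandra):
#   stress_via_daemon -d 192.168.1.101 -n 1000000
```

For a series of tests, the start and stop calls would normally wrap the whole series rather than each command, since the point of the daemon is to keep the JVM warm between runs.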
- Insert 1,000,000 rows into the given host:
<install_location>/tools/bin/cassandra-stress -d 192.168.1.101
When the number of rows is not specified, one million rows are inserted.
- Read 1,000,000 rows from the given host:
<install_location>/tools/bin/cassandra-stress -d 192.168.1.101 -o read
- Insert 10,000,000 rows across two nodes:
<install_location>/tools/bin/cassandra-stress -d 192.168.1.101,192.168.1.102 -n 10000000
- Insert 10,000,000 rows across two nodes using the daemon mode:
<install_location>/tools/bin/cassandra-stress -d 192.168.1.101,192.168.1.102 -n 10000000 --send-to 188.8.131.52
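Because the node list must be comma separated with no spaces, it can be convenient to build it from separate host arguments. A minimal sketch, where join_nodes is a hypothetical helper and not part of the stress tool:

```shell
#!/usr/bin/env bash
# join_nodes <host>...
# Joins the given hosts with commas and no spaces, the form the -d option expects.
join_nodes() {
    # With IFS set to a comma, "$*" expands to the arguments joined by commas.
    local IFS=,
    echo "$*"
}

# Example usage (requires Cassandra):
#   cassandra-stress -d "$(join_nodes 192.168.1.101 192.168.1.102)" -n 10000000
```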
Interpreting the output of cassandra-stress
The cassandra-stress tool periodically outputs information about the running tests.
Each line reports data for the interval between the last elapsed time and current elapsed time, which is set by the --progress-interval option (default 10 seconds). The following explains this information:
- total: the total number of operations since the start of the test.
- interval_op_rate: the number of operations performed per second during the interval.
- interval_key_rate: the number of keys/rows read or written during the interval (normally the same as interval_op_rate unless doing range slices).
- latency: the average latency for each operation during that interval.
- elapsed: the number of seconds elapsed since the beginning of the test.
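The fields above can be summarized from a saved stress log with a short awk sketch. It assumes the log rows are comma separated in the order listed (total, interval_op_rate, interval_key_rate, latency, then elapsed as the last field); that layout is an assumption to check against the actual log, and summarize_stress_log is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# summarize_stress_log <logfile>
# Prints the number of intervals, final total, mean op rate, mean latency and elapsed time.
summarize_stress_log() {
    awk -F, '
        # Keep only data rows: the first field must be an integer count.
        $1 ~ /^[0-9]+$/ && NF >= 5 {
            rows++
            op_sum  += $2      # interval_op_rate
            lat_sum += $4      # latency for the interval
            total    = $1      # running total; the last row wins
            elapsed  = $NF     # elapsed seconds is the last field
        }
        END {
            if (rows == 0) { print "no data rows found"; exit 1 }
            printf "intervals=%d total_ops=%d avg_op_rate=%.1f avg_latency=%.4f elapsed=%ds\n",
                   rows, total, op_sum / rows, lat_sum / rows, elapsed
        }
    ' "$1"
}

# Example usage:
#   summarize_stress_log /root/mytest-stress.log
```

The averages are unweighted means over the reported intervals, which is a reasonable first look when the progress interval is constant.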