Syntax Highlighter

Thursday, August 5, 2010

Howto: Build and scale a Cassandra cluster in five minutes

Introduction

In the spirit of the software recipes we've been posting recently, I decided to give Cassandra a shot. Cassandra is a distributed key-value store that was open-sourced by Facebook in 2008 and is now under the umbrella of the Apache foundation. It advertises high-performance and robustness to individual node failures while providing eventual consistency. It's been receiving quite a bit of attention recently, and Riptano has appeared to provide commercial support.

Cassandra is a bit of an odd name for a piece of software in my opinion. Cassandra is a figure from Greek mythology, who was both blessed with the ability to see the future and the curse that no one would ever believe her predictions. I struggled for a while to come up with some hilarious software-equivalent joke, but I think we are all better off without one.

I spent a bit of time and installed Cassandra on a Copper virtual cluster. I then wrote a few quick scripts to automatically have clones join a Cassandra cluster when they are created automatically. GridCentric Copper supports persistent local storage through an awesome VFS abstraction. As with Hadoop, I've omitted this part of the setup for now to simplify this post, but I reserve the right to make that post in the near future.

Cassandra does not have different classes of nodes, so I had no need to run anything special on the master of the virtual cluster. Cassandra only requires a seed node, so that a freshly started instance can learn about the others. I use the master for this purpose, since we can assume that it's always around. Other than that, this article details an unmodified Cassandra installation running inside a virtual cluster -- with transient nodes that can be created and destroyed in seconds.

Requirements

A GridCentric Copper deployment (free trial available here, software downloadable here).

Just as in my Hadoop post, I'll assume that you have a virtual cluster already. If you don't know how to create a virtual cluster, you can see the tutorials. I just used the cluster-in-a-box script to fire up a new Ubuntu cluster.

Installing Packages

Cassandra needs a Java6 JRE installed and configured. In Ubuntu, I just did an apt-get install sun-java6-jre, but the exact steps will vary depending on your distribution. The next step is to grab the latest Cassandra packages. Choose a mirror from
http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.6.4/apache-cassandra-0.6.4-bin.tar.gz. Once I had downloaded the latest Cassandra, I extracted the contents and moved the directory to /opt, so I could make a nice little system that corresponds to the Linux Standard Base (or at least closer).
wget <url>
tar -zxvf apache-cassandra-0.6.4-bin.tar.gz
mv apache-cassandra-0.6.4 /opt

Cassandra is actually ready to go with the default configuration and your Java install can be tested at this point. Simply run /opt/apache-cassandra-0.6.4/bin/cassandra -f and you should see it start up. You can press Ctrl-C to kill it, because we want to write a few more scripts and make a cluster.

Tweaking files and writing Scripts

Okay, so one annoying thing about Cassandra is that you need to specify an address inside the configuration file for binding. This doesn't lend itself well to any kind of cluster environment and it actually confused me -- an interface would make much more sense here. We need to abstract away this detail so we can tune the bind address on the clone. In order to do this, we copy /opt/apache-cassandra-0.6.4/conf/storage-conf.xml to /opt/apache-cassandra-0.6.4/conf/storage-conf.xml.in and change the following keys in our new storage-conf.xml.in:
...
  <Seeds>
      <Seed>127.0.0.1</Seed>
  </Seeds>
  ...
  <ListenAddress>localhost</ListenAddress>
  ...
to:
...
  <Seeds>
      <Seed>10.0.0.1</Seed>
  </Seeds>
  ...
  <ListenAddress>PRIVATE_ADDRESS</ListenAddress>
  ...
I changed the Seed key because in our virtual cluster I am going to use the master as the seed. I changed the ListenAddress to PRIVATE_ADDRESS because we are going to write a quick script that replaces that token and generates a unique storage-conf.xml on each clone with the correct addresses.

In order to write that script, I need a quick helper. First, I create a script named get-private-ip and put it in /usr/local/bin.
#!/bin/bash
ifconfig eth0| grep inet | cut -d: -f2| cut -d' ' -f1

Next, I create
recreate-cassandra-config in /usr/local/bin to actually grab the IP we want, and recreate storage-conf.xml from our tokenized storage-conf.xml.in.
#!/bin/bash
cd /opt/apache-cassandra-0.6.4
address=`get-private-ip`
cat conf/storage-conf.xml.in | sed -e "s/PRIVATE_ADDRESS/$address/" > conf/storage-conf.xml

Finally, because I am not going to be using persistent local storage (saving for a future post), I create a script clear-cassandra-data in /usr/local/bin in order to clear the data directory. We will run this on clones before kicking Cassandra.
#!/bin/bash
rm -rf /var/lib/cassandra/data/*

I'm not a fan of hacking ways of starting and stopping services. Before running my mini-Cassandra cluster, I threw together a script that lets me start and stop nicely (at least somewhat nicely). I put this in /etc/init.d/cassandra and then run update-rc.d cassandra defaults in order to install the appropriate symlinks from /etc/init.d/rc*.d.
#!/bin/bash
#
# Simple Cassandra init script.
#
# chkconfig: 345 99 99
# description: Cassandra

### BEGIN INIT INFO
# Provides: 
# Required-Start: $network
# Required-Stop: 
# Should-Start: 
# Should-Stop: 
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: start and stop cassandra daemon
# Description: Cassandra
### END INIT INFO

PROG="/opt/apache-cassandra-0.6.4/bin/cassandra -f"
PROGNAME="Cassandra"
LOGFILE=/var/log/cassandra.log
PIDFILE=/var/run/cassandra.pid

running(){
    PID=`cat $PIDFILE 2>/dev/null`

    # Check that the pid is sane.
    if [ "x$PID" == "x" ] ; then
        return 1
    fi

    # Check that the process is alive.
    ps $PID >/dev/null 2>&1 || return 1

    # Looks okay.
    return 0
}

start(){
    echo -n $"Starting $PROGNAME: "

    # Try to start the program.
    if running; then
        echo "Failed.  Maybe remove $PIDFILE?"
        return 1
    fi

    mkdir -p `dirname $LOGFILE`
    $PROG > $LOGFILE 2>&1 &
    PID=$!
    mkdir -p `dirname $PIDFILE`
    echo $PID > $PIDFILE

    echo "Success."
    return 0
}

stop(){
    echo -n $"Stopping $PROGNAME: "

    # Check if it's already stopped.
    if ! running ; then
        echo "Failed.  Already stopped."
        return 1
    fi 

    # Find the PID and kill it.
    PID=`cat $PIDFILE 2>/dev/null`
    if [ "x$PID" == "x" ] ; then
        echo "Failed."
        return 1
    fi
    # (Try five times to kill it).
    for i in `seq 0 5`; do
        kill $PID
        sleep 1
        if ! running ; then
            break
        fi
    done

    # Check if it is finished.
    if running ; then
        echo "Failed."
        return 1
    fi

    # Clear out the pidfile.
    echo "Success."
    rm -f $PIDFILE
    return 0
}

restart(){
    stop
    start
}

status(){
    echo -n $"Status of $PROGNAME: "

    if running ; then
        echo "Running."
        return 0
    else
        echo "Not running."
        return 1
    fi
}

# See how we were called.
case "$1" in
    start)
 start
 RETVAL=$?
 ;;
    stop)
 stop
 RETVAL=$?
 ;;
    status)
 status
 RETVAL=$?
 ;;
    restart)
 restart
 RETVAL=$?
 ;;
    *)
 echo $"Usage: $0 {start|stop|status|restart}"
 RETVAL=2
esac

exit $RETVAL

Tying everything together, I create a simple post-clone script that will stop Cassandra, reconfigure, reset the data directory, and start Cassandra. This I will put in /etc/gridcentric/post-clone-on-clone.d/cassandra so that it runs on every clone immediately after creation. Recall that we set the Cassanda seed to point to the master, so after cloning each clone will automatically join the ring and discover the others.
/etc/init.d/cassandra stop
recreate-cassandra-config
clear-cassandra-data
/etc/init.d/cassandra start

Creating the cluster

We're finally ready to create the cluster! To start a single instance of Cassandra on the master, we simply run:
/etc/init.d/cassandra start
Now I am free to use nodetool to query Cassandra and start storing data, etc. As an example, you can run /opt/apache-cassandra-0.6.4/bin/nodetool -h localhost ring. This shows the current nodes that are in the Cassandra cluster (should be just one). In order to automatically grow this cluster, we simply clone this virtual machine. Here, I'll create a cluster of size 5:
gc clone 4 # Creates a cluster of size 5.
Now check out /opt/apache-cassandra-0.6.4/bin/nodetool -h localhost ring!

Check out the video below.  I start just before starting Cassandra and creating the cluster. I show the ring continuously after starting Cassandra. This is all in real-time (except when I pause, so you can read the labels!), including the cloning of virtual machines.



Update: Sorry for the somewhat poor quality of the video. I'll see if I can get a higher quality version uploaded soon so that it's clear what's going on. Note that you can click on the video and it will take you to Youtube.

0 comments:

Post a Comment