How To Set Up Cassandra Cluster in Less Than 2 Minutes

by Shahid Ashraf

In this article, I’ll walk you through the process of building a Cassandra cluster. Cassandra is a highly scalable open source database system that achieves great performance when set up with multiple nodes, even across different data centers. For the tutorial below, I’ll be using 4 Ubuntu 12.04 Linux machines that I’ve set up on Linode.

1. First Things First

There are a few prerequisites that you’ll need to make sure are on your boxes before you begin installing Apache Cassandra.

  •      Java 1.6 or higher
  •      Python
  •      Python fabric module
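
If Java is already present on a box, a quick check along these lines can tell whether it meets the 1.6 minimum before running the installer. This is only a sketch: the function name java_version_ok is my own, and it assumes you pass it the text that `java -version` prints (Java writes that text to stderr, not stdout).

```python
import re


def java_version_ok(version_output, minimum=(1, 6)):
    """Return True if the `java -version` text reports at least `minimum`."""
    # Matches e.g.: java version "1.8.0_201"  or  openjdk version "11.0.2"
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return False
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor) >= minimum


print(java_version_ok('java version "1.8.0_201"'))
print(java_version_ok('java version "1.5.0_22"'))
```

The first call should report that Java 1.8 is new enough, and the second that Java 1.5 is not.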

Installing Java:

sudo apt-get update -y
sudo apt-get install software-properties-common python-software-properties -y
sudo add-apt-repository ppa:webupd8team/java -y
sudo apt-get update -y
echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | sudo /usr/bin/debconf-set-selections
sudo apt-get install oracle-java8-set-default -y

Installing Cassandra:

wget -O - http://debian.datastax.com/debian/repo_key | sudo apt-key add -
echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
sudo apt-get update
sudo apt-get install cassandra -y

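2. Configure the Cassandra config file (cassandra.yaml):

This involves setting the following options on every node:

seeds: comma-separated IP addresses of the nodes that a new node contacts to join the cluster
listen_address: the IP address other nodes use to reach this node
rpc_address: the IP address this node listens on for client connections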
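
To make these three settings concrete, here is a minimal sketch that prints the matching cassandra.yaml entries for one node. The function name render_cluster_settings and the 10.0.0.x addresses are placeholders of mine, not values from the actual cluster, and the rest of cassandra.yaml is left out.

```python
def render_cluster_settings(seeds, node_ip):
    """Build the cluster-related cassandra.yaml entries for one node."""
    lines = [
        # In cassandra.yaml the seed list lives under seed_provider.
        "seed_provider:",
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "      parameters:",
        '          - seeds: "{0}"'.format(",".join(seeds)),
        "listen_address: {0}".format(node_ip),
        "rpc_address: {0}".format(node_ip),
    ]
    return "\n".join(lines)


# Hypothetical addresses: two seed nodes and the node being configured.
print(render_cluster_settings(["10.0.0.1", "10.0.0.2"], "10.0.0.3"))
```

Every node gets the same seed list, while listen_address and rpc_address are set to that node's own IP.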

3. Making Things Easy:

To set up the cluster, I have written a script that uses the Python fabric module. It installs Java and Cassandra on each node and then configures the cluster. Clone it from git@github.com:shahidash/Cassandra-Cluster.git.

from StringIO import StringIO

from fabric.api import env, put, run, sudo

__author__ = "shahid@trialx.com"

# Template for cassandra.yaml; clone the git repo for the full text.
# setup_cluster() fills it with the seed list, listen_address and rpc_address.
config_file = """clone git repo for code"""
 
def install_java(hostip, pswd):
    # Point fabric at the target node.
    env.user = "root"
    env.host_string = hostip
    env.password = pswd
    run("sudo apt-get update -y")
    run("sudo apt-get install software-properties-common python-software-properties -y")
    run("sudo add-apt-repository ppa:webupd8team/java -y")
    run("sudo apt-get update -y")
    # Accept the Oracle license up front so the install does not prompt.
    run("echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | sudo /usr/bin/debconf-set-selections")
    run("sudo apt-get install oracle-java8-set-default -y")
 
 
def install_cassandra(hostip, pswd):
    env.user = "root"
    env.host_string = hostip
    env.password = pswd
    # Register the DataStax repository key and source, then install.
    run("wget -O - http://debian.datastax.com/debian/repo_key | sudo apt-key add -")
    run("""echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list""")
    run("sudo apt-get update")
    run("sudo apt-get install cassandra -y")
 
def setup_cluster(hostip, pswd, seeds):
    env.user = "root"
    env.host_string = hostip
    env.password = pswd
    # Fill the template: seed list, listen_address, rpc_address.
    text = config_file.format(','.join(seeds), hostip, hostip)
    obj = StringIO(text)
    put(obj, "/etc/cassandra/cassandra.yaml")
    run("sudo service cassandra stop")
    # Clear the system data so the node starts with the new configuration.
    sudo("rm -rf /var/lib/cassandra/data/system/*")
    run("sudo service cassandra start")
 
 
 
if __name__ == '__main__':
    env.forward_agent = True
    env.disable_known_hosts = True
    host_passwords = "password"
    # List the IP addresses of the nodes to set up; in my case there were four.
    cluster_ip_list = []

    for ip in cluster_ip_list:
        install_java(ip, host_passwords)
        install_cassandra(ip, host_passwords)
        # The first two nodes in the list act as seeds for the whole cluster.
        setup_cluster(ip, host_passwords, seeds=cluster_ip_list[0:2])

You have successfully set up a Cassandra cluster on your machines!
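
One way to confirm that every node actually joined is to look at the output of Cassandra's `nodetool status` command, where each node line starts with a two-letter state such as UN (Up/Normal) or DN (Down/Normal). The sketch below counts the UN lines. The function name count_up_normal is my own, and the two-line input is a hand-written stand-in, not real nodetool output, which has header lines and more columns.

```python
def count_up_normal(nodetool_output):
    """Count the nodes that `nodetool status` reports as Up/Normal."""
    return sum(
        1 for line in nodetool_output.splitlines()
        # The first whitespace-separated field of a node line is its state.
        if line.split()[:1] == ["UN"]
    )


# Hand-written stand-in input with hypothetical addresses.
stand_in = "UN  10.0.0.1\nDN  10.0.0.2\n"
print(count_up_normal(stand_in))
```

With fabric, the text could come from run("nodetool status") on any node; the count should match the length of cluster_ip_list once all nodes are up.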

Conclusion

In this blog post, I described the steps to manually create a Cassandra cluster and also provided a Python script that automates the manual work and lets you create a Cassandra cluster in less than two minutes.

 
