This tutorial will walk you through running your first “Hello World” job on cfncluster.
If you haven’t yet, you will need to following the getting started guide to install cfncluster and configure your CLI.
First, we’ll verify that cfncluster is correctly installed and configured.
$ cfncluster version
This should return the running version of cfncluster. If it gives you a message about configuration, you will need to run the following to configure cfncluster.
$ cfncluster configure
Now it’s time to create our first cluster. Because our workload isn’t performance intensive, we will use the default instance sizes of t2.micro. For production workloads, you’ll want to choose an instance size which better fits your needs.
We’re going to call our cluster “hello-world”.
$ cfncluster create hello-world
You’ll see some messages on your screen about the cluster creating. When it’s finished, it will provide the following output:
Starting: hello-world
Status: cfncluster-hello-world - CREATE_COMPLETE
Output:"MasterPrivateIP"="192.168.x.x"
Output:"MasterPublicIP"="54.148.x.x"
Output:"GangliaPrivateURL"="http://192.168.x.x/ganglia/"
Output:"GangliaPublicURL"="http://54.148.x.x/ganglia/"
The message “CREATE_COMPLETE” shows that the cluster created successfully. It also provided us with the public and private IP addresses of our master node. We’ll need this IP to log in.
You’ll use your OpenSSH pem file and the ec2-user to log into your master instance.
ssh -i /path/to/keyfile.pem ec2-user@54.148.x.x
Once logged in, run the command “qhost” to ensure that your compute nodes are setup and configured.
[ec2-user@ip-192-168-1-86 ~]$ qhost
HOSTNAME ARCH NCPU NSOC NCOR NTHR LOAD MEMTOT MEMUSE SWAPTO SWAPUS
----------------------------------------------------------------------------------------------
global - - - - - - - - - -
ip-192-168-1-125 lx-amd64 2 1 2 2 0.15 3.7G 130.8M 1024.0M 0.0
ip-192-168-1-126 lx-amd64 2 1 2 2 0.15 3.7G 130.8M 1024.0M 0.0
As you can see, we have two compute nodes in our cluster, both with 2 threads available to them.
Now we’ll create a simple job which sleeps for a little while and then outputs it’s own hostname.
Create a file called “hellojob.sh” with the following contents.
#!/bin/bash
sleep 30
echo "Hello World from $(hostname)"
Next, submit the job using “qsub” and ensure it runs.
$ qsub hellojob.sh
Your job 1 ("hellojob.sh") has been submitted
Now, you can view your queue and check the status of the job.
$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
1 0.55500 hellojob.s ec2-user r 03/24/2015 22:23:48 all.q@ip-192-168-1-125.us-west 1
The job is currently in a running state. Wait 30 seconds for the job to finish and run qstat again.
$ qstat
$
Now that there are no jobs in the queue, we can check for output in our current directory.
$ ls -l
total 8
-rw-rw-r-- 1 ec2-user ec2-user 48 Mar 24 22:34 hellojob.sh
-rw-r--r-- 1 ec2-user ec2-user 0 Mar 24 22:34 hellojob.sh.e1
-rw-r--r-- 1 ec2-user ec2-user 34 Mar 24 22:34 hellojob.sh.o1
Here, we see our job script, an “e1” and “o1” file. Since the e1 file is empty, there was no output to stderr. If we view the .o1 file, we can see any output from our job.
$ cat hellojob.sh.o1
Hello World from ip-192-168-1-125
We can see that our job successfully ran on instance “ip-192-168-1-125”.