Terraforming the clouds

Terraform is an infrastructure configuration manager by HashiCorp (the makers of Vagrant), similar to CloudFormation or Heat, supporting
various infrastructure providers including Amazon, VirtualBox, …

Terraform reads `*.tf` files and creates an execution plan containing all resources:

– instances
– volumes
– networks
– …

You can check an example configuration on GitHub.

Unfortunately, it uses a custom (but readable) format instead of YAML.

# Create a 75GB volume on openstack
resource "openstack_blockstorage_volume_v1" "master-docker-vol" {
  name = "mastervol"
  size = 75
}

# Create a Nova VM with the given volume attached
resource "openstack_compute_instance_v2" "machine" {
  name = "test"
  region = "${var.openstack_region}"
  image_id = "${var.master_image_id}"
  flavor_name = "${var.master_instance_size}"
  availability_zone = "${var.openstack_availability_zone}"
  key_pair = "${var.openstack_keypair}"
  security_groups = ["default"]
  metadata {
    ssh_user = "cloud-user"
  }
  volume {
    volume_id = "${openstack_blockstorage_volume_v1.master-docker-vol.id}"
  }
}
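The `var.*` references above come from variable definitions. A minimal sketch of the corresponding variables file follows (the variable names match the config above; the default value is a made-up placeholder):

```hcl
# variables.tf (sketch - the default is a placeholder, not a real flavor)
variable "openstack_region" {}
variable "openstack_availability_zone" {}
variable "openstack_keypair" {}
variable "master_image_id" {}
variable "master_instance_size" {
  default = "m1.medium"
}
```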


Further resources (e.g. OpenStack volumes and floating IPs, DigitalOcean droplets, Docker containers, …)
can be defined via plugins.

At the end of every deployment cycle, Terraform updates the `terraform.tfstate` state file (which may
be stored on S3 or on shared storage) describing the actual infrastructure.

Upon configuration changes, terraform creates and shows a new execution plan,
that you can eventually apply.

As there’s no Ansible provisioner, a `terraform.py` script can be used to extract an inventory file from a `terraform.tfstate`.
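In command form, the cycle above looks roughly like this (a sketch of the classic Terraform CLI; the last line assumes the `terraform.py` inventory script is in the current directory):

```shell
terraform plan -out=plan.out   # build and show the execution plan
terraform apply plan.out       # apply exactly the plan you reviewed
terraform show                 # inspect the resulting state
# use the state file as a dynamic Ansible inventory
ansible all -i terraform.py -m ping
```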

MySQL Cluster: dimensioning redo logs

In MySQL Cluster jargon, the periodic snapshot of the database is named a Local Checkpoint, or LCP. A rule of thumb for redo log dimensioning is: enough to store all incoming create/update transactions between three LCPs.

That is:

2 * (DataMemory / DiskSpeed) * IncomingThroughput
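Plugging in some made-up numbers gives a feel for the size involved: with DataMemory = 2048 MB, disk speed = 50 MB/s and incoming write throughput = 1 MB/s (all hypothetical figures), an LCP takes about 41 s, so:

```shell
# redo log ~= 2 * (DataMemory / DiskSpeed) * IncomingThroughput
# 2 * (2048 MB / 50 MB/s) * 1 MB/s ~= 82 MB
awk 'BEGIN { printf "%.0f MB\n", 2 * (2048 / 50) * 1 }'
```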

During some test importing the famous Employees Database on a two-node amazon cluster I ran into the following error:

ERROR 1297 (HY000) at line 38 in file: 'xao': Got temporary error 410 'REDO log files overloaded (decrease TimeBetweenLocalCheckpoints or increase NoOfFragmentLogFiles)' from NDBCLUSTER

It was strange, because the REDO log size was 512M, while the DB was only 130M. I definitely didn’t want to increase NoOfFragmentLogFiles (aka the REDO log size): 512M was enough. Decreasing TimeBetweenLocalCheckpoints was not useful either, since I had already saturated the Amazon machine I/O at ~5 MB/s.

So I started monitoring the REDO log occupation with the following command:

while sleep 5; do
  mysql -e "select node_id, 100*used/total, used/1024/1024 from ndbinfo.logspaces where node_id=4;"
done

I found that the logs (which are split into 4 logical chunks of 128M named “fragments”) were not uniformly used.
Each table always used the same fragment, and the error was raised while inserting a big table which exhausted “its” 128M chunk.

+---------+----------------+----------------+
| node_id | 100*used/total | used/1024/1024 |
+---------+----------------+----------------+
|       4 |        96.7500 |   123.00000000 |
|       4 |         1.2500 |     2.00000000 |
|       4 |         1.2500 |     2.00000000 |
|       4 |         1.2500 |     2.00000000 |
+---------+----------------+----------------+

I asked MySQL guru Matthew Montgomery, who suggested the following explanation:
“All transactions within a given LCP for a given fragment are stored within the same fragment log.”

The hints he provided follow:

Hint 1: to distribute REDO log usage, manually partition the table. Since the parser rejects a foreign key combined with partitioning in a single statement, the partitioning has to be added afterwards with ALTER TABLE:

CREATE TABLE salaries (
  emp_no INT NOT NULL,
  salary INT NOT NULL,
  from_date DATE NOT NULL,
  to_date DATE NOT NULL,
  KEY (emp_no),
  FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
  PRIMARY KEY (emp_no, from_date)
);
-- We have to split the PARTITION statement from the CREATE one
-- due to http://bugs.mysql.com/bug.php?id=67492
-- Thx to Matthew "MontgoDB" Montgomery for the workaround!
ALTER TABLE salaries PARTITION BY KEY() PARTITIONS 8;

Edit: thanks to this post, Matthew filed this bug and suggested the aforementioned workaround!

Using a single statement, you would have got this error from mysqld:

ERROR 1506 (HY000) at line 97: Foreign key clause is not yet supported in conjunction with partitioning

This is because the mysqld parser (which is InnoDB-oriented) blocks the request: InnoDB still doesn’t support foreign keys combined with partitioning.


Hint 2: use the MAX_ROWS directive and find a suitable value. In our case, the table would contain ~3M rows, but we needed to overbook, allocating space for 40M:

CREATE TABLE salaries (
  emp_no INT NOT NULL,
  salary INT NOT NULL,
  from_date DATE NOT NULL,
  to_date DATE NOT NULL,
  KEY (emp_no),
  FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
  PRIMARY KEY (emp_no, from_date)
) MAX_ROWS=40000000;

The 40M figure was just a heuristic: a script modified the MAX_ROWS value in the CREATE TABLE statement until the insert succeeded 😉
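The script was roughly like the following sketch, where `try_import` is a hypothetical stand-in for “recreate the table with the given MAX_ROWS and replay the dump” (the table definition and dump filename are assumptions, not the original script):

```shell
# Hypothetical: recreate the table with MAX_ROWS=$1 and reload the dump.
try_import() {
  mysql employees -e "DROP TABLE IF EXISTS salaries;
    CREATE TABLE salaries (emp_no INT NOT NULL, salary INT NOT NULL,
      from_date DATE NOT NULL, to_date DATE NOT NULL,
      PRIMARY KEY (emp_no, from_date)) ENGINE=NDBCLUSTER MAX_ROWS=$1;" \
  && mysql employees < salaries.dump
}

# Double MAX_ROWS until the bulk insert succeeds, then print the winner.
find_max_rows() {
  max_rows=$1
  until try_import "$max_rows"; do
    max_rows=$((max_rows * 2))  # overbook more space and retry
  done
  echo "$max_rows"
}
```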

After the change, fragment usage was doubled:

+---------+----------------+----------------+
| node_id | 100*used/total | used/1024/1024 |
+---------+----------------+----------------+
|       4 |        98.4000 |   126.00000000 |
|       4 |         1.2500 |     2.00000000 |
|       4 |         1.2500 |     2.00000000 |
|       4 |        99.2150 |   127.00000000 |
+---------+----------------+----------------+
+---------+----------------+----------------+
| node_id | 100*used/total | used/1024/1024 |
+---------+----------------+----------------+
|       4 |        57.0310 |    73.00000000 |
|       4 |         1.8750 |     3.00000000 |
|       4 |         1.8750 |     3.00000000 |
|       4 |        57.0310 |    73.00000000 |
+---------+----------------+----------------+

S33 the cloud

After all the hype about cloud and Amazon services, I decided to set up a simple S3 server on my laptop. On Google Code I found littleS3, a fun webapp which implements the S3 protocol, and I checked it out.

The guide is straightforward, and the implemented protocol is mainly a GET/PUT server: a free place for testing your “cloud” infrastructure.

Here are some example methods:

1- Create the object foo.html

curl --data "@foo.html" --request PUT --header "Content-Type: text/html" "http://localhost:8080/littleS3-2.1.0/firstBucket/foo.html"

2- Retrieve an object

curl "http://localhost:8080/littleS3-2.1.0/firstBucket/foo.html"

3- Delete an object

curl --request DELETE "http://localhost:8080/littleS3-2.1.0/firstBucket/foo.html"

4- Delete a bucket

curl --request DELETE "http://localhost:8080/littleS3-2.1.0/firstBucket"
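For completeness, in the real S3 protocol a bucket is created with a plain PUT on the bucket URL; presumably littleS3 behaves the same (a sketch, so check the littleS3 guide):

```shell
# create the bucket itself (assumption: littleS3 creates buckets on PUT)
curl --request PUT "http://localhost:8080/littleS3-2.1.0/firstBucket"
```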