MySQL 8.0 InnoDB Cluster looks at MongoDB

MySQL turns 8.0 and the technical preview integrates a new “InnoDB Cluster”. The overall architecture is reminiscent of MongoDB:

– group replication with a single master, similar to replica sets;
– a mysqlsh shell able to create replication groups and local instances, with JS and Python support;
– a MySQL Router acting as a gateway for app servers, to be deployed on each client machine like the mongos.

Once installed, you can create a replication group with a few commands:

su - rpolli
mysqlsh

\py  # enable python mode. Create 5 instances in ~/sandbox-dir/{3310..3350}

ports = (3310, 3320, 3330, 3340, 3350)
for port in ports:
    dba.deploy_local_instance(port, {'password': 'root'})  # later renamed deploy_sandbox_instance

Now we have five MySQL instances listening on various ports. Create a cluster and check the newly created mysql_innodb_cluster_metadata schema.

\connect root:root@localhost:3310

cluster = dba.create_cluster('polli', 'pollikey');

\sql  # switch to sql mode

SHOW DATABASES;
+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
+-------------------------------+

Go back to the python mode and add the remaining instances to the cluster.

\py  # return to python mode again

# Eventually re-get the cluster.
cluster = dba.get_cluster('polli',{'masterKey':'pollikey'})  # masterKey is a shared secret between nodes.

# Add the other nodes
for port in ports[1:]:
    cluster.add_instance('root@localhost:' + str(port),'secret');

# Check status
cluster.status()  # BEWARE! The output is a str :( not a dict
{
    "clusterName": "polli",
    "defaultReplicaSet": {
        "status": "Cluster tolerant to up to 2 failures.",
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310",
                "status": "ONLINE",
                "role": "HA",
                "mode": "R/W",
                "leaves": {
                    "localhost:3320": {
                        "address": "localhost:3320",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    },
                    "localhost:3330": {
                        "address": "localhost:3330",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    }
                }
            }
        }
    }
}

Now check the failover feature.

dba.kill_local_instance(3310)  # Successfully killed

# Parse the output with...
import json
json.loads(cluster.status())["defaultReplicaSet"]["topology"].keys()  # localhost:3320 WOW!
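Since the status is a string, a tiny helper can locate the current primary (the node in R/W mode). A minimal sketch, run here against a sample document shaped like the output above (addresses hypothetical):

```python
import json

# Sample shaped like cluster.status() output (after the failover, 3320 is primary).
raw = json.dumps({
    "clusterName": "polli",
    "defaultReplicaSet": {
        "topology": {
            "localhost:3320": {"status": "ONLINE", "mode": "R/W", "leaves": {
                "localhost:3330": {"status": "ONLINE", "mode": "R/O", "leaves": {}}
            }}
        }
    }
})

def primary_of(status_str):
    """Return the address of the first R/W node found in the topology."""
    topology = json.loads(status_str)["defaultReplicaSet"]["topology"]
    for address, node in topology.items():
        if node.get("mode") == "R/W":
            return address
    return None

print(primary_of(raw))  # localhost:3320
```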

Once the cluster is set up, newly created users span the whole group.

CREATE USER 'admin'@'%' IDENTIFIED BY 'secret';

Now let’s connect to different cluster nodes.

mysql -uadmin -P3310 -psecret -e 'create database this_works_on_master;'  # OK
mysql -uadmin -P3320 -psecret -e 'create database wont_work_on_slave_even_if_admin;'  
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

The default setup allows writes only on the master, *even for admin/super users*. This can be overridden as usual:

mysql> SHOW VARIABLES LIKE '%only';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| read_only                     | ON    |
| super_read_only               | ON    |
+-------------------------------+-------+
mysql> SET GLOBAL super_read_only = OFF;  -- lifts the restriction for SUPER users only
mysql> SET GLOBAL super_read_only = ON;   -- back to the default

mysql> SET GLOBAL read_only = OFF;  -- allows writes for all authorized users

The MongoDB Python driver is topology-aware; MySQL connectors instead rely on mysql-router to reach the current primary.

Provisioning OpenStack on VMware infrastructure

As I didn’t find extensive docs about provisioning Red Hat OpenStack on a VMware infrastructure, I browsed the Python code.

Python is a very expressive and clear language and you can get to the point in a moment!

I was then able to create the following instack.json to power-manage a set of VMware machines.

Despite the many ways to pass ssh_* variables via Ironic, the right way to do it via instack.json is to:

– use `pm_virt_type` instead of `ssh_virt_type`;
– put the ssh_key_content in the pm_password parameter, as shown in the docs;
– set capabilities like profile and boot_option directly.

The key should be JSON-serialized on one line, replacing each newline with the two-character sequence ‘\n’.

{
  "nodes": [
    {
      "capabilities": "profile:control,boot_option:local",
      "pm_virt_type": "vmware",
      "pm_password": "-----BEGIN RSA PRIVATE KEY-----\nMY\nRSA\nKEY\n-----END RSA PRIVATE KEY-----"
    },
    {..other nodes..}
  ]
}
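Rather than escaping the key by hand, a JSON serializer can do it. A minimal Python sketch (the key content is a placeholder, not a real key):

```python
import json

# Multi-line PEM key as read from a file (placeholder content).
ssh_key = "-----BEGIN RSA PRIVATE KEY-----\nMY\nRSA\nKEY\n-----END RSA PRIVATE KEY-----"

node = {
    "capabilities": "profile:control,boot_option:local",
    "pm_virt_type": "vmware",
    "pm_password": ssh_key,  # newlines are escaped to \n by the serializer
}

instack = json.dumps({"nodes": [node]}, indent=2)
print(instack)
```

Round-tripping with json.loads gives back the original multi-line key.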

FullText Indexing IPv6 addresses with MySQL 5.7

MySQL 5.7 supports generated columns. This is particularly useful for searching the string representation of numerically stored IP addresses:

CREATE TABLE catalog (
  ip varbinary(16) not null,
  hostname varchar(64) not null,
  label varchar(64),
  ip_ntoa varchar(64) generated always as (inet6_ntoa(ip)) STORED, -- generate and store the textual address representation
  fulltext key (hostname, ip_ntoa, label)
);

When inserting values

INSERT INTO catalog(ip,hostname,label) VALUES
(inet6_aton(''), 'localhost', 'lo'),
(inet6_aton(''), 'gimli', 'stage,ipv4'),
(inet6_aton('fdfe::5a55:caff:fefa:9089'), 'legolas', 'router,ipv6'),
(inet6_aton('fdfe::5a55:caff:fefa:9090'), 'boromir', 'router,ipv6');

you can search in OR mode with

SELECT hostname FROM catalog WHERE
  MATCH(ip_ntoa, hostname, label)
  AGAINST('9089 router');
-- returns every entry matching ANY needle
hostname: legolas
hostname: boromir

Or require ALL the terms using boolean mode:

SELECT hostname FROM catalog WHERE
  MATCH(ip_ntoa, hostname, label)
  AGAINST('+9089 +router' in boolean mode);
-- returns ONE entry matching ALL needles
hostname: legolas
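The OR vs AND semantics above can be sketched in plain Python; this only illustrates the matching logic, not how MySQL implements fulltext search:

```python
import re

# Rows as (hostname, ip_ntoa, label), mirroring the sample data above.
rows = [
    ("legolas", "fdfe::5a55:caff:fefa:9089", "router,ipv6"),
    ("boromir", "fdfe::5a55:caff:fefa:9090", "router,ipv6"),
]

def tokens(row):
    # Rough approximation of fulltext tokenization: split on non-word characters.
    return set(re.split(r"\W+", " ".join(row).lower())) - {""}

def match_any(row, needles):   # natural-language mode: ANY needle matches
    return bool(tokens(row) & set(needles))

def match_all(row, needles):   # boolean mode with +terms: ALL needles must match
    return set(needles) <= tokens(row)

needles = ["9089", "router"]
print([r[0] for r in rows if match_any(r, needles)])  # ['legolas', 'boromir']
print([r[0] for r in rows if match_all(r, needles)])  # ['legolas']
```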

Adding Docker images to OpenShift 3.1

OpenShift 3.1 is based on Kubernetes and Docker, and provides a small set of images, including JBoss EAP 6.4.

You can add new images in two steps:

1- create an ImageStream, i.e. a Docker image plus a set of labels;
2- create a Template using that ImageStream.

To create the ImageStream, read the following description carefully.

# Create the ImageStream
oc create -f - <<EOF
apiVersion: v1
kind: ImageStream
metadata:
  name: wildfly9-openshift
  namespace: openshift        # Set this to "openshift" if you want to make this image globally visible
spec:
  dockerImageRepository:      # The original docker hub repo
  tags:
  - annotations:
      description: Wildfly 9.0 S2I images.
      iconClass: icon-jboss
      sampleRef: 9.0.x
      supports: wildfly:9,javaee:7,java:8
      tags: builder,javaee,java,jboss
      version: "1.0"
    name: "1.0"
EOF

Back from MongoDB Essentials Training

This week I joined MongoDB Essentials training in Roma.

Mongo is a fast document-oriented database supporting consistency, replication for HA, and sharding for scaling reads OR writes.

Transactions are at the document level, so there are no joins and no isolation levels.
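The practical consequence for schema design: data that an RDBMS would join across tables is embedded in a single document, so a document-level atomic write covers all of it. A plain-Python sketch of a hypothetical order document:

```python
# One document holds what a relational schema would split into orders, customers and items.
order = {
    "_id": 1,
    "customer": {"name": "Ada", "email": "ada@example.com"},  # embedded: no join needed
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B2", "qty": 1},
    ],
}

# Reading the "joined" data is a plain traversal of the document.
total_qty = sum(item["qty"] for item in order["items"])
print(order["customer"]["name"], total_qty)  # Ada 3
```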

A nice training, covering many database design techniques and even giving a theoretical overview of performance and scalability issues.

Concepts like Working Set, Replication types and issues, Indexes side-effects, Sharding and Hashing were introduced both theoretically and practically.

Being a class of 10+ people with mixed backgrounds (MS, Linux, Oracle), it was hard to squeeze all this theory and practice into 3 days. The instructor asked us which parts we’d like to cover in more depth: we picked Schema Design, Replication and Sharding.
Besides, such a large class gave us a lot of discussion and networking opportunities: we even created a freenode chatroom!

People interested in the subject can drop me a line and have a look at this GitHub repo.

Enjoy! R.

Docker multihost network: an epiphany of namespaces

I played with Docker multihost networking this weekend.

With multihost networking you can run communicating containers on different Docker nodes.
The magic relies on:
– a shared KV store (e.g. Consul) for IP addresses;
– a network namespace (netns) holding the VXLAN interface and a bridge, with no processes attached.

Every network created with the overlay driver has its own network namespace.
For every network (and subnet combination), a Linux bridge is created inside that dedicated namespace.
The host end of each veth pair is moved into this namespace and attached to the bridge.
Hence, if you look for the veth pair in the host namespace, you won’t find any :-).

If you look for the vxlan setup on the boot2docker distro you have to dig deep ;).
1- The docker netns files are stored in /var/run/docker/netns. To make them visible to `ip netns` you need to

# ln -s /var/run/docker/netns /var/run

2- Now you can look for the vxlan netns, which has the same id on every machine:

# ip netns ls | while read a; do
    ip netns exec $a ip link | grep -q vxlan && echo $a
  done

The vxlan interface references the UDP port used for communication (e.g. dstport 46354).

87: vxlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UNKNOWN mode DEFAULT group default
    link/ether da:69:8d:4d:b9:39 brd ff:ff:ff:ff:ff:ff promiscuity 1
    vxlan id 256 srcport 0 0 dstport 46354 proxy l2miss l3miss ageing 300

3- Every container with EXPOSEd ports has a veth paired with a veth in the vxlan netns;

4- the veths in the vxlan netns are slaves of br0;

5- br0 has an ip, and is the default gw for containers.

My Fedora is getting fat…

Monitor your Fedora disk usage:

abrt-cli [list|rm]
journalctl --disk-usage
[sudo] yum clean all

Store vm|docker data outside /var:

# libvirt: point the default storage pool elsewhere
virsh pool-edit default
# docker: set the graph directory in /etc/sysconfig/docker
OPTIONS="-g /data/docker"

Enable virsh for `wheel` users:

sudo tee -a /etc/polkit-1/rules.d/80-libvirt.rules << EOF
// Don't ask for a password if the user is in the wheel group.
polkit.addRule(function(action, subject) {
  if (action.id == "org.libvirt.unix.manage" &&
      subject.local && subject.active &&
      subject.isInGroup("wheel")) {
    return polkit.Result.YES;
  }
});
EOF

# Set the SELinux context.
chcon --reference=/etc/polkit-1/rules.d /etc/polkit-1/rules.d/80-libvirt.rules