Brief OpenShift troubleshooting

If you have issues after an automagic openshift-on-openstack deployment:

1. Remember: every buildconfig created *before* the registry is deployed is not authorized to push its images
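
If such a build keeps failing once the registry is up, re-triggering it is often enough; the buildconfig and project names below (myapp, myproject) are placeholders.

oc start-build myapp -n myproject   # retry the build now that the registry can be pushed to
# if it still cannot push, recreating the buildconfig after the registry exists is another option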

2. Remember: hawkular is a java application and its startup is slow. Just open the metrics page again and wait for it to come up
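
To see whether it is still starting, a quick check (assuming the metrics stack lives in the default openshift-infra namespace) is:

oc -n openshift-infra get pods                            # hawkular-metrics should end up Running and Ready
oc -n openshift-infra logs -f hawkular-metrics-<pod-id>   # <pod-id> is a placeholder; follow the startup log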

3. Ansible is your friend. To gather container logs from all the nodes, just run:


ansible all -m shell -a 'ls /var/log/containers/CONTAINER_NAME*'

ansible all -m shell -a 'cat /var/log/containers/CONTAINER_NAME*' > CONTAINER_NAME.log

4. If a container doesn't start up during the deployment, a broken image may have been downloaded:

Jun 1 23:30:36 dev-7-infra-0 atomic-openshift-node: I0601 23:30:36.234103 32913 server.go:608] Event(api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"router-1-deploy", UID:"033670a9-470e-11e7-878f-fa163eac2bf7", APIVersion:"v1", ResourceVersion:"936", FieldPath:""}): type: 'Warning' reason: 'FailedSync' Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"invalid header field value \\\"oci runtime error: container_linux.go:247: starting container process caused \\\\\\\"exec: \\\\\\\\\\\\\\\"/pod\\\\\\\\\\\\\\\": stat /pod: no such file or directory\\\\\\\"\\\\n\\\"\"}"

In that case, clean up the local docker containers and images:


docker ps -aq | xargs docker rm
docker rmi 90e9207f44f0 --force

5. Run oadm diagnostics on the master 😉

6. Check the SDN assignments with `oc get hostsubnet`

OpenShift cockpit quickstart

Enabling openshift cockpit with the latest releases is quite simple, but requires using a local system account.

1- Install cockpit


yum install cockpit cockpit-kubernetes
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 9090 -j ACCEPT
systemctl enable --now cockpit.socket

2- Create a custom user to be used for cockpit administration


useradd -m -k /home/cloud-user cockpit   # use cloud-user's home as the skeleton dir
passwd cockpit                           # set the password used to log into cockpit

3- Access cockpit through an ssh tunnel from the management network, logging in with the cockpit user credentials.


ssh -D 11111 cloud-user@bastion   # open a SOCKS proxy on localhost:11111 through the bastion
firefox http://master-ip:9090     # configure firefox to use that SOCKS proxy, then browse to the master

Openshift 3.4: broken ansible dependencies

The new ansible openshift 3.4 installation playbook is very nice.

Just set the deploy variables in the inventory and everything will rise from the ground magically…

Well, not immediately though. Due to this bug you need to:

– downgrade ansible to 2.2.0.0 (the latest is 2.2.1.0)

Otherwise the playbook will try to serialize python objects which are actually strings.

E.g. if your configuration contains:

– name: "MyServer"

Ansible looks for a MyServer() class instead of using str("MyServer").
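
A minimal downgrade sketch; the exact command depends on how ansible was installed (rpm vs pip), so adjust accordingly:

yum downgrade ansible-2.2.0.0    # rpm-based install; the version-release string may differ in your repo
pip install 'ansible==2.2.0.0'   # pip-based install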

Trace http calls with python-requests

Today python-requests is the de facto standard library for REST calls.

As everything goes over TLS (so you cannot just sniff the wire), you can trace API calls from the client with the following:


import logging
import httplib as http_client  # python 2; on python 3 use http.client

http_client.HTTPConnection.debuglevel = 1  # dump request/response headers on stdout
logging.basicConfig(level=logging.DEBUG)   # needed so the debug log is actually emitted
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True

Set command output as facts with ansible

Having to check ntp configuration on a distributed cluster, I had to parse the `timedatectl` output into a dict and apply various checks.

I did this via the (infamous) 😉 jinja templates|pipelines.

# This is the check_time.yml playbook.

- name: Register the timedatectl output even in check mode. This command doesn't modify server configuration.
  shell: "timedatectl | grep ': '"
  check_mode: no
  register: timedatectl_output

# Note that:
#  - to use timedatectl_output in with_items we need to QUOTE-AND-BRACE it
#  - we can default the previously undefined timedatectl_status dictionary via
#       variable | default(VALUE)
- name: Process timedatectl_output lines one at a time and update repeatedly the timedatectl_status variable using combine().
  set_fact:
    timedatectl_status: >
      {{
        timedatectl_status | default({}) |
        combine(
          dict([ item.partition(': ')[::2]|map('trim') ])
        )
      }}
  with_items: "{{timedatectl_output.stdout_lines}}"

Now we can check 😉


- name: Clock synchronized
  fail: msg="Clock unsynchronized {{timedatectl_status}}"
  when: "{{timedatectl_status['NTP synchronized'] == 'no' }}"

- name: All hw clocks are utc
  fail: msg="hwclock not utc {{timedatectl_status}}"
  when: "{{timedatectl_status['RTC in local TZ'] == 'yes' }}"
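
To run the checks (a sketch; the hosts inventory name is a placeholder). Passing --check is fine because the only shell task is explicitly allowed to run in check mode:

ansible-playbook -i hosts check_time.yml --check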


MySQL JSON fields on the ground!

Having to add a series of custom fields to a quite relational application, I decided to try the new JSON fields.

As of now you can:

– create json fields
– manipulate them with json_extract, json_unquote
– create generated fields from json entries

You can not:

– index json fields directly: you have to create a generated field and index it (see the sketch at the end of this section)
– retain the original json datatype (eg. string, int), as json_extract always returns strings.

Let’s start with a simple flask app:

# requirements.txt
mysql-connector-python
Flask-SQLAlchemy==2.0
SQLAlchemy>=1.1.3

Let’s create a simple flask app connected to a db.

import flask
import flask_sqlalchemy
from sqlalchemy.dialects.mysql import JSON

# A simple flask app connected to a db
app = flask.Flask('app')
app.config['SQLALCHEMY_DATABASE_URI']='mysql+mysqlconnector://root:secret@localhost:3306/test'
db = flask_sqlalchemy.SQLAlchemy(app)

Add a class to the playground and create it on the db. We need sqlalchemy>=1.1 to support the JSON type!

# The model
class MyJson(db.Model):
    name = db.Column(db.String(16), primary_key=True)
    json = db.Column(JSON, nullable=True)

    def __init__(self, name, json=None):
        self.name = name
        self.json = json

# Create table
db.create_all()

Thanks to flask-sqlalchemy we can just use db.session 😉

# Add an entry
entry = MyJson('jon', {'do': 'it', 'now': 1})
db.session.add(entry)
db.session.commit()

We can now verify with a raw select that the entry is serialized on the db:

# Get entry in Standard SQL
entries = db.engine.execute(db.select(columns=['*'], from_obj=MyJson)).fetchall()
(name, json_as_string), = entries  # unpack the result (it's just one row!)
assert isinstance(json_as_string, basestring)  # the json field comes back serialized as a string

A raw select to extract json fields now:

entries = db.engine.execute(db.select(columns=['name', 'json_extract(json, "$.now")'], from_obj=MyJson)).fetchall()

(name, json_now), = entries  # unpack the result (it's just one row!)
assert isinstance(json_now, basestring)  # json_extract returns a string...
assert json_now != entry.json['now']     # ...so '1' != 1
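
As mentioned above, the json column cannot be indexed directly: a generated column is the way to go. A minimal sketch, assuming flask-sqlalchemy named the table my_json and reusing the test schema and the credentials from the connection string above:

# add a typed, indexable generated column extracted from the json field
mysql -uroot -psecret test -e '
  ALTER TABLE my_json
    ADD COLUMN json_now INT GENERATED ALWAYS AS (json_extract(json, "$.now")) STORED,
    ADD INDEX idx_json_now (json_now);
'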

Terraforming the clouds

Terraform is an infrastructure configuration manager by HashiCorp (the makers of Vagrant), similar to CloudFormation or Heat, supporting
various infrastructure providers including Amazon, OpenStack, VirtualBox, …

Terraform reads *.tf files and creates an execution plan containing all the resources:

– instances
– volumes
– networks
– ..

You can check an example configuration here on github:

Unfortunately, it uses a custom (HCL) but readable format instead of yaml.

# Create a 75GB volume on openstack
resource "openstack_blockstorage_volume_v1" "master-docker-vol" {
  name = "mastervol"
  size = 75
}

# Create a nova vm with the given volume attached
resource "openstack_compute_instance_v2" "machine" {
  name = "test"
  region = "${var.openstack_region}"
  image_id = "${var.master_image_id}"
  flavor_name = "${var.master_instance_size}"
  availability_zone = "${var.openstack_availability_zone}"
  key_pair = "${var.openstack_keypair}"
  security_groups = ["default"]
  metadata {
    ssh_user = "cloud-user"
  }
  volume {
    volume_id = "${openstack_blockstorage_volume_v1.master-docker-vol.id}"
  }
}


Further resources (eg. openstack volumes|floating_ip, digitalocean droplets, docker containers, ..)
can be defined via plugins.

At the end of every deployment cycle, terraform updates the `terraform.tfstate` state file (which may
be stored on s3 or on shared storage) describing the actual infrastructure.

Upon configuration changes, terraform creates and shows a new execution plan,
which you can review and then apply.
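
The basic cycle is just two commands:

terraform plan    # compute and show the execution plan
terraform apply   # apply it and update the state file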

As there's no ansible provisioner, a terraform.py script can be used to extract an inventory file from a `terraform.tfstate`.
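
As a sketch, assuming terraform.py is used as an ansible dynamic-inventory script and site.yml is a placeholder playbook:

ansible-playbook -i terraform.py site.yml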

MySQL 8.0 Innodb Cluster looks at MongoDB

MySQL turns 8.0 and the technical preview integrates a new “InnoDB Cluster”. The overall architecture is reminiscent of MongoDB:

– group replication with a single master, similar to replica sets;
– a mysqlsh shell able to create replication groups and local instances, supporting js and python;
– a MySQL Router acting as a gateway for appservers, to be deployed on each client machine like the mongos.

Once installed, you can create a replication group (RG) with a few commands:

su - rpolli
mysqlsh

\py  # enable python mode. Create 5 local instances in ~/sandbox-dir/{3310..3350}

ports = (3310, 3320, 3330, 3340, 3350)
for port in ports:
    dba.deploy_local_instance(port, {"password": "secret"});

Now we have 5 mysql instances listening on various ports. Create a cluster and check the newly created mysql_innodb_cluster_metadata schema.

\connect root:secret@localhost:3310

cluster = dba.create_cluster('polli', 'pollikey');

\sql  # switch to sql mode

SHOW DATABASES;

+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
+-------------------------------+

Go back to the python mode and add the remaining instances to the cluster.

\py  # return to python mode again

# Eventually re-get the cluster.
cluster = dba.get_cluster('polli',{'masterKey':'pollikey'})  # masterKey is a shared secret between nodes.

# Add the other nodes
for port in ports[1:]:
    cluster.add_instance('root@localhost:' + str(port),'secret');

# Check status
cluster.status()  # BEWARE! The output is a str :( not a dict
{
    "clusterName": "polli",
    "defaultReplicaSet": {
        "status": "Cluster tolerant to up to 2 failures.",
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310",
                "status": "ONLINE",
                "role": "HA",
                "mode": "R/W",
                "leaves": {
                    "localhost:3320": {
                        "address": "localhost:3320",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    },
                    "localhost:3330": {
                        "address": "localhost:3330",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    }
                    ....
                }
            }
        }
    }
}

Now check the failover feature.

dba.kill_local_instance(3310)  # Successfully killed

# Parse the output with...
import json
json.loads(cluster.status())["defaultReplicaSet"]["topology"].keys()  # localhost:3320 WOW!


Once the cluster is set up, newly created users span the whole group.

\sql
CREATE USER 'admin'@'%' IDENTIFIED BY 'secret';
GRANT ALL ON *.* TO 'admin'@'%'  WITH GRANT OPTION;

Now let’s connect to different cluster nodes.

mysql -uadmin -P3310 -psecret -e 'create database this_works_on_master;'  # OK
mysql -uadmin -P3320 -psecret -e 'create database wont_work_on_slave_even_if_admin;'  
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

The default setup allows writes only on the master *even for admin|super users*; this can be overridden as usual.

mysql> show variables like '%only';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
...
| read_only                     | ON    |
| super_read_only               | ON    |
...
+-------------------------------+-------+
mysql> set global super_read_only = OFF;  -- just for root and other SUPER users
mysql> set global super_read_only = ON;

mysql> set global read_only = OFF;  -- for all allowed users

The MongoDB python driver is topology-aware. MySQL connectors instead rely on mysql-router to connect to the right primary.
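
As a sketch of what that means for a client, assuming a router bootstrapped with its default classic-protocol ports (6446 for read-write, 6447 for read-only):

mysql -uadmin -psecret -h127.0.0.1 -P6446 -e 'select @@port;'   # routed to the primary
mysql -uadmin -psecret -h127.0.0.1 -P6447 -e 'select @@port;'   # routed to a read-only member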

Provisioning openstack on vmware infrastructure.

As I didn't find extensive docs about provisioning Red Hat OpenStack on a vmware infrastructure, I browsed the python code.

Python is a very expressive and clear language and you can get to the point in a moment!

I was then able to create the following instack.json to power-manage a set of vmware machines.

Despite the many ways to pass ssh_* variables via ironic, the right way to do it via instack.json is to:

– use the `pm_virt_type` instead of `ssh_virt_type`;
– express the ssh_key_content in the pm_password parameter as shown in the docs;
– set capabilities like profile and boot_option directly.

The key should be json-serialized on one line, replacing newlines with '\n'.
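
One way to produce such a string (a sketch; the key path is a placeholder):

python -c 'import json, sys; print(json.dumps(open(sys.argv[1]).read()))' ~/.ssh/id_rsa
# prints the key as a quoted JSON string, ready to paste as the pm_password value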


{
    "nodes":[
        {
            "mac":[
                "00:0c:29:00:00:01"
            ],
            "capabilities": "profile:control,boot_option:local",
            "cpu":"8",
            "memory":"16384",
            "disk":"60",
            "arch":"x86_64",
            "pm_type":"pxe_ssh",
            "pm_virt_type": "vmware",
            "pm_addr":"172.18.0.1",
            "pm_user":"vmadmin",
            "pm_password":"-----BEGIN RSA PRIVATE KEY-----\nMY\nRSA\nKEY\n-----END RSA PRIVATE KEY-----"
        },
{..other nodes..}