Use ALOM-style commands on an ILOM firmware server

Sun/Oracle servers provide different CLIs to manage hardware settings and the console.

Here, we’ll have a look at two common CLIs found in most servers: ILOM and ALOM.

ILOM is the newer CLI: it supports a wider range of commands and it doesn’t require a reset of the Service Processor to commit changes. ALOM, found on “older” servers, is on the contrary simpler and more user friendly. A small example is console access: on ILOM you have to type:

-> start /SP/console

while on ALOM you just use this command:

sc> console

When working with ILOM, you can switch to an ALOM-compatible CLI (kept for backwards compatibility), which is not a widely known fact.

Why would you do that? Well, one reason is that Oracle Support personnel sometimes instruct you to run ALOM commands on servers that have ILOM.

They do not even tell you how to do that.

So, let’s roll and see how:

Log in to the SP as the root user, as usual.

XXXXXXXXXXXXXXXXXX login: root
Password:
Waiting for daemons to initialize…

Daemons ready

Integrated Lights Out Manager

Version 2.0.4.n

Copyright 2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.

Warning: password is set to factory default.

We have to create an administrative account (any name is fine, but we’ll stick with the standard admin user) and set its CLI mode to alom.

-> create /SP/users/admin role=Administrator cli_mode=alom
Creating user…
Enter new password: ********
Enter new password again: ********
Created /SP/users/admin

If the admin user with the Administrator role already exists, you only need to change its CLI mode and (optionally) reset its password.

-> create /SP/users/admin role=Administrator cli_mode=alom
create: /SP/users/admin already exists
Create failed

-> set /SP/users/admin cli_mode=alom
Set 'cli_mode' to 'alom'

-> set /SP/users/admin password
Enter new password: ********
Enter new password again: ********
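
Before logging out, you can double-check the user with a show on its target; it should list cli_mode=alom and the Administrator role:

-> show /SP/users/admin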

Now you can log in to the ILOM again, this time using the admin account:

XXXXXXXXXXXXXXXXXX login: admin
Password:
Waiting for daemons to initialize…

Daemons ready

Sun(TM) Integrated Lights Out Manager

Version 2.0.4.X

Copyright 2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.

sc>

The sc> prompt indicates you are now in the ALOM compatibility shell, and most ALOM commands are available.
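
From here you can run most of the ALOM commands Oracle Support may ask for. A few common ones from the standard ALOM command set (the text after each command is just a description, not part of the input):

sc> showplatform      - platform and domain status
sc> showenvironment   - temperatures, fans and power supplies
sc> showlogs          - SC event log
sc> console           - attach to the host console (#. returns to the sc> prompt)
sc> poweron
sc> poweroff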

vSphere: VM with RDM migration across Virtual Datacenters

Usually, migrating a virtual machine from one VMware Datacenter to another is a piece of cake. You just have to present the LUNs containing the VM data to the new ESX servers, and you can cold migrate the VM in a matter of minutes, or even hot migrate it with a few tricks (no downtime at all).

There’s only one thing one may not consider in the equation: the pesky Raw Device Mapping disk attached to the VM.
What is an RDM disk? It’s a LUN mapped directly to the VM, without a VMFS on it; its pointers are stored alongside the VM in a special mapping VMDK file.

So you’ve just powered off your VM, browsed the datastore in the destination datacenter, and added the VM to the Inventory. As you try to power it on, an error comes before you:

“Virtual disk ‘Hard Disk X’ is a mapped direct-access LUN that is not accessible.”

The RDM strikes with nonsense. You may have already checked the storage for the correct presentation, and vSphere for visibility of the LUN: it’s all there. So why should it not be accessible?

Easier to state than to discover, the problem is that the LUN ID is different in the destination datacenter. LUN presentation, as a matter of fact, follows a numerical order, and vSphere uses that specific LUN ID to map the disk to the VM.

In my case, the RDM LUN ID in the source datacenter was 23, while it was 49 on the destination.

You can check the source ID in the “Physical LUN and Datastore Mapping File” area of the RDM disk properties in the VM Settings. There are several ways to check the correspondence in the destination datacenter, both via vSphere and from the command line. In vSphere, the easiest way is to go to Configuration > Storage > view: Devices and sort by disk size: mine was 1.7TB, easy to spot. If yours has a trickier, more common size, you have to identify and compare the UID of the disk with commands such as “esxcli storage core path list”.
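
For example, from the ESX(i) command line in the destination datacenter, something along these lines should do the job (the NAA identifier is a placeholder, and the syntax assumes the 5.x-style esxcli namespace quoted above):

esxcli storage core device list -d naa.60060160455025001234567890abcdef
esxcli storage core path list -d naa.60060160455025001234567890abcdef | grep -i "LUN:"

The first command confirms you are looking at the right device (size, vendor, display name); the second prints its paths, where the LUN: field is the ID this host sees.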

So, there are ways to solve this problem, but VMware’s proposed solution is actually the least favorable: they ask you to remove the LUN and present it again with the correct ID. In my case, though, ID 23 was already in use in the destination datacenter.
It would be better to just map the RDM to the VM with the new ID, but vSphere won’t even let you see the LUN in the Add Disk wizard: LUNs already used as RDMs are filtered out by default.

What we’re going to do is disable that LUN filtering so we can attach the same LUN with the destination ID.
In the vSphere Client, select Administration > vCenter Server Settings > Advanced Settings.
Then add the following key and value: config.vpxd.filter.rdmFilter; false
Click Add, and click OK.
Now, in the VM properties, first detach the source-ID RDM from the VM, then click Add > Hard Disk > Raw Device Mappings and select the correct LUN.

The RDM is now properly attached, and the VM will finally boot in the destination datacenter.
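
As a side note, if you ever need to recreate the RDM pointer file by hand instead of through the Add Hardware wizard, vmkfstools can do it (device and datastore paths below are placeholders; -z creates a physical compatibility mapping, -r a virtual compatibility one):

vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx /vmfs/volumes/datastore1/vmname/vmname_rdm.vmdk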

Unable to log in as a standard user on a 4.1 ESX server

By default, a 4.1 ESX server denies logins from standard users, while root access via SSH works without problems. This behavior changed from 4.0 and has caused many headaches on systems upgraded to 4.1.

Obviously, this is a security problem (it forces everyone to log in directly as root) and something we do not want.

To protect your ESX server and restore standard user access, you have to replace the system-auth config file. In this case, an older 4.0 version of the file will do the job. Always remember to make a backup first, just in case something goes wrong (if it does and you don’t notice... you’re screwed, so pay attention).
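
A simple copy is enough (the backup name is just a suggestion):

# cp -p /etc/pam.d/system-auth /etc/pam.d/system-auth.orig

Then open the file for editing: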

# vi /etc/pam.d/system-auth

paste this content inside the file:

#%PAM-1.0
# Autogenerated by esxcfg-auth

account    required      /lib/security/$ISA/pam_unix.so

auth       required      /lib/security/$ISA/pam_env.so
auth       sufficient    /lib/security/$ISA/pam_unix.so likeauth nullok
auth       required      /lib/security/$ISA/pam_deny.so

password   requisite     pam_cracklib.so try_first_pass retry=3 dcredit=-1 ucredit=0 ocredit=-1 lcredit=-1 minlen=8
password   required      /lib/security/$ISA/pam_cracklib.so retry=3
password   sufficient    /lib/security/$ISA/pam_unix.so nullok use_authtok md5 shadow
password   required      /lib/security/$ISA/pam_deny.so

session    required      /lib/security/$ISA/pam_limits.so
session    required      /lib/security/$ISA/pam_unix.so

You can now log in to your 4.1 ESX server as a standard user. Now go and harden your server!

Easy vmdk file system extension under Red Hat (LVM)

As a VMware sysadmin, you may be asked to extend a file system not by adding another virtual disk, but by extending the vmdk itself.

Such an operation is relatively risk-free, but it has to be done with the VM powered down, so the first thing to do is shut down the VM.

Log in to any ESX host in your cluster, go to the VM’s datastore path, and run:

# vmkfstools -X nnG vmname.vmdk

In this command, nn is the new total size of the disk (not the amount to add), and vmname is, obviously, the name of your VM.
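
For example, to grow a disk to 60GB in total (size and VM name here are made up), you would run the command against the descriptor vmdk, not the -flat file:

# vmkfstools -X 60G myvm.vmdk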

Now, attach any Red Hat CD/DVD to your VM, power it on, and boot from the CD/DVD.

At the installer boot prompt, enter: linux rescue

Get to the rescue shell, skipping the network setup and, above all, the option to mount the existing file systems, and we’re ready to go.

Assuming you will be extending the root disk, you will have to do the following with fdisk (a sample session follows the list of steps):

# fdisk /dev/sda

delete the sda2 partition (d, 2)
create a new sda2 partition (n, p, 2), accepting the defaults so it starts where the old one did and ends at the end of the enlarged disk
change the partition type to Linux LVM (t, 2, 8e)
write the changes and exit (w)
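
The interactive session looks roughly like this (prompts vary slightly between fdisk versions; before writing, make sure the new partition starts on the same cylinder the old sda2 did):

# fdisk /dev/sda

Command (m for help): d
Partition number (1-4): 2

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (...): <Enter to accept the default>
Last cylinder (...): <Enter to use the rest of the disk>

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 8e

Command (m for help): w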

Now, let’s resize the LVM physical volume so it uses the whole enlarged partition:

# lvm pvresize /dev/sda2

If you want to check that everything is OK, run:

# lvm pvdisplay

Now, activate the Logical Volumes:

# lvm vgchange -a y

If you want to check the configured Logical Volumes, run:

# lvm lvdisplay

It is mandatory to run a file system check on the Logical Volume to be extended (resize2fs will refuse to resize an unchecked file system):

# e2fsck -f /dev/VolGroup00/LogVol00

Now, we can extend the Logical Volume (here by an extra 10GB; adjust to your needs):

# lvm lvextend -L+10G /dev/VolGroup00/LogVol00

And then the file system:

# resize2fs /dev/VolGroup00/LogVol00

Finished, reboot the system and that’s it!
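
Once the VM is back up, a quick sanity check (assuming the same VolGroup00/LogVol00 names used above) could be:

# df -h /
# vgdisplay VolGroup00

df should now show the extra space on the root file system, and vgdisplay how much free space, if any, is left in the volume group.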

Release the lock from a hung VM on VMware

After an HA event or a network/storage outage with VMware ESX servers (3.5 and 4.1 alike), you may end up in a situation where the VM is down and cannot be powered on, even if you try to migrate it or to deregister and re-register it in Virtual Center.
On closer inspection, you might notice that the vswp file is still in the VM folder (a sign the VM might still be active somewhere), yet you cannot delete the file because it is “locked”. In fact, one of the ESX hosts in the cluster holds the lock, even though the VM is not running.
So, with several hosts in the cluster, how do you figure out which one it is? Let’s find out.
First of all, we have to find out which ESX host is preventing the power-on.
Log in to any ESX host that sees the datastore, and run:

tail -f /var/log/vmkernel &

Now go to the locked VM’s folder on the datastore, and try to run:

cat vmname.vmdk

You should get some errors referring to the lock but, more importantly, some vmkernel log lines such as:

Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Lock [type 10c00001 offset 13058048 v 20, hb offset 3499520
Apr 5 09:45:26 Hostname vmkernel: gen 532, mode 1, owner 45feb537-9c52009b-e812-00137266e200 mtime 1174669462]
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Addr <4, 136, 2>, gen 19, links 1, type reg, flags 0x0, uid 0, gid 0, mode 600
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)len 297795584, nb 142 tbz 0, zla 1, bs 2097152
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)FS3: 132:

Now, that log identifies the host locking the file: the last part of the owner field (00137266e200 in the example above) is nothing but the MAC address of the ESX host holding the lock, i.e. 00:13:72:66:e2:00!
Now, to the boring part: you have to log in to every ESX host in the cluster and check whether any network card matches this MAC (a small loop can save some typing, see below):

/sbin/ifconfig -a | grep -i 00:13:72:66:e2:00
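
If SSH is enabled on the hosts, a quick way to check all of them from one place might be something like this (host names are made up; adjust to your cluster):

for h in esx01 esx02 esx03; do
    echo "== $h =="
    ssh root@$h '/sbin/ifconfig -a | grep -i 00:13:72:66:e2:00' && echo "$h holds the lock"
done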

Once identified, the host should be placed in maintenance mode from Virtual Center (DRS should do all the work of migrating its virtual machines) and then rebooted. This will release any stale lock and allow the VM to be finally powered on.