The title is clickbait. Did anyone upgrade 100 VIOSes in one day? I don’t know. I didn’t do it. I usually try to reduce the number of upgrades to a manageable number. I think the maximum I did was something like 20. Could I do more? Yes, definitely. The procedure below is simple, and you can start it in parallel for 100 or 200 VIOSes. It takes time, but not as much as you may fear.
Let’s dive into it.
Assumptions
I assume that you read my previous newsletters. I described there how I do backups, remove old drivers, and install service packs on VIOS. Just a short list of some previous articles:
This is my more or less usual procedure for upgrading VIOS. I execute all of the previous steps one after another. Today’s step is the last one in the procedure.
Oops, I'm lying. Sorry. It is not the last step. There is one more step, at least one I usually do: updating the microcode on the adapters. But that is not for today.
Prerequisites
You need an mksysb image with the new VIOS. It can be an official image from IBM’s site or your own. The image must be available on some NFS share. The NFS server is defined in the variable nfs_server. The NFS share is defined in the variable nfs_repo. And the image name is defined in the variable mksysb.
You must know the version of your mksysb image. It is defined in the variable target_level.
Your VIOS must have an additional disk for the upgrade. We define the disk name in the variable alt_hdisk.
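For reference, these variables could live in your inventory or group_vars. This is only a sketch with placeholder values; the file path, host names, and levels are assumptions, not recommendations:

```yaml
# group_vars/vios.yml (hypothetical path) - all values are placeholders
nfs_server: nim01                  # NFS server that exports the mksysb image
nfs_repo: /export/mksysb           # exported NFS share
mksysb: vios-4.1.0.10.mksysb       # image file name on the share
target_level: 4.1.0.10             # version of the mksysb image
alt_hdisk: hdisk1                  # spare disk for the alternate install
```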
I hope you have all the prerequisites in place and performed all the steps mentioned before. Now we can go deeper.
Check VIOS version
As usual, we first check the VIOS version.
- name: Get current VIOS version
  ansible.builtin.command:
    cmd: /usr/ios/cli/ioscli ioslevel
  register: ioslevel
  changed_when: false
If the VIOS is already upgraded, we can quit without doing anything.
- name: Check if we already upgraded
  when: "ioslevel.stdout is version(target_level, '>=')"
  block:
    - name: Print message
      ansible.builtin.debug:
        msg: "VIOS is already upgraded to {{ ioslevel.stdout }}. Aborting upgrade."
    - name: Abort upgrade
      meta: end_play
Check the available disk for the upgrade
As you remember, we defined our disk in the alt_hdisk variable. We must check if the disk really exists. We update the facts first:
- name: Update facts
  ansible.builtin.setup:
    gather_subset: devices
We have actual information about all devices now. If the disk exists, it must be in the Available state:
- name: Check that hdisk for alt_rootvg exists
  meta: end_play
  when: ansible_facts.devices[alt_hdisk].state | default('') != 'Available'
Before going further, we must clean up the disk:
- name: Clean up hdisk for alt_rootvg
  ansible.builtin.command:
    cmd: "chpv -C {{ alt_hdisk }}"
If you used the disk earlier, it may be a good idea to add commands that remove an old_rootvg or an old altinst_rootvg before cleaning up the disk.
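As a sketch of such a cleanup, assuming the standard AIX alt_rootvg_op command is available in the root shell on your VIOS, the leftover alternate rootvg definitions could be removed like this (a hypothetical task, adjust it to your environment):

```yaml
- name: Remove leftover alternate rootvg definitions (sketch)
  ansible.builtin.command:
    cmd: "alt_rootvg_op -X {{ item }}"
  loop:
    - altinst_rootvg
    - old_rootvg
  # the command fails if the volume group doesn't exist - that's fine here
  failed_when: false
```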
Copy configuration files
We usually need at least two additional files before an upgrade:
list of configuration files to preserve
post-upgrade script
Please check out Jaqui Lynch's article about the VIOS upgrade. She published a list of her important files there.
My list of important files is smaller:
/etc/environment
/etc/netsvc.conf
/etc/resolv.conf
/etc/hosts
/etc/inittab
/etc/ntp.conf
Whether you need a post-upgrade script depends on your environment and your VIOS image. This is part of my post-upgrade script, for reference:
#!/bin/ksh
/usr/ios/cli/ioscli license -accept
/usr/ios/cli/ioscli rules -o modify -t disk/fcp/mpioosdisk -a reserve_policy=no_reserve
mkuser roles=PAdmin,CacheAdm,FSAdmin,pkgadm,SysBoot,isso default_roles=PAdmin,CacheAdm,FSAdmin,pkgadm,SysBoot,isso ansible
chuser su=true root
echo 'ansible:abc123' | chpasswd -c
echo 'root:abc123' | chpasswd -c
echo 'padmin:abc123' | chpasswd -c
mkdir -p /home/ansible/.ssh
chmod 0700 /home/ansible/.ssh
echo 'ssh-rsa AAAA... ansible-key' >/home/ansible/.ssh/authorized_keys
chmod 0600 /home/ansible/.ssh/*
chown -R ansible:staff /home/ansible/.ssh
niminit -a name=$(hostname | cut -f1 -d.) -a master=nim -a master_port=1058 || true
I use the official mksysb image for my upgrade and have to perform some additional tasks in the script:
Accept license and set device parameters.
Create an Ansible user and set all required passwords and keys. Otherwise, the automation fails directly after starting the upgrade.
Re-register VIOS as NIM client.
Copying the files is easy:
- name: Create /tmp/filelist with configuration files to preserve
  ansible.builtin.copy:
    src: filelist
    dest: /tmp/filelist

- name: Create our post-upgrade script
  ansible.builtin.copy:
    src: postupgrade.sh
    dest: /tmp/postupgrade.sh
    mode: "0755"
Mount NFS and check the image
The last check is to make sure that the image exists. First mount the NFS share:
- name: Mount NFS repository with mksysb
  ibm.power_aix.mount:
    state: mount
    node: "{{ nfs_server }}"
    mount_dir: "{{ nfs_repo }}"
    mount_over_dir: /mnt
Now get information about the image:
- name: Check that mksysb image exists
  ansible.builtin.stat:
    path: "/mnt/{{ mksysb }}"
  register: mksysb_stat
Fail if the image is not found:
- name: Clean up if mksysb image doesn't exist
  when: not mksysb_stat.stat.exists
  block:
    - name: Print message
      ansible.builtin.debug:
        msg: "The mksysb image {{ nfs_server }}:{{ nfs_repo }}/{{ mksysb }} was not found. Aborting upgrade."
    - name: Unmount NFS repository
      ibm.power_aix.mount:
        state: umount
        mount_over_dir: /mnt
    - name: End execution
      meta: end_play
Remove old log files
I had a problem when I developed and tested the playbook. The log files grew, and I couldn’t find any useful information anymore. Worse, I had some systems that were upgraded from 2.2 to 3.1. I decided that I didn’t need information about previous upgrades.
- name: Remove old upgrade logs
  ansible.builtin.file:
    path: "{{ item }}"
    state: absent
  loop:
    - /home/ios/logs/viosupg_status.log
    - /home/ios/logs/viosupg_restore.log
Start viosupgrade
Everything is fine. We can start viosupgrade.
- name: Start viosupgrade
  ibm.power_vios.viosupgrade:
    cluster: false
    filename: /tmp/filelist
    post_install_binary: /tmp/postupgrade.sh
    image_file: "/mnt/{{ mksysb }}"
    mksysb_install_disks: "{{ alt_hdisk }}"
    wait_reboot: true
As you can see, we use all the information we collected and checked previously. The only side note here: I don’t use shared storage pools on the VIOS side. That’s why the parameter cluster is set to false. If you use shared storage pools and have information on what to check or do additionally for the upgrade, please write it down in the comments.
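If you do use shared storage pools, a minimal pre-check could at least detect cluster membership before you start. This is only a sketch using the ioscli cluster -list subcommand, not a complete SSP-aware procedure; the exact output handling is an assumption you should verify in your environment:

```yaml
- name: Check for shared storage pool membership (sketch)
  ansible.builtin.command:
    cmd: /usr/ios/cli/ioscli cluster -list
  register: ssp_cluster
  changed_when: false
  failed_when: false

- name: Stop if the VIOS seems to be part of an SSP cluster
  meta: end_play
  when: ssp_cluster.rc == 0 and ssp_cluster.stdout | length > 0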
It is easy to find another system administrator today, but it is difficult to find an engineer who can build robust and scalable automation!
Want to become the most valuable professional in your company? Learn to build robust and scalable automation! Yes, it takes time and money, but be sure—the investment pays off. Read more and sign up for the program!
Wait!
Yes, we must wait till viosupgrade is finished. Of course, you can skip the waiting: you started viosupgrade, and it will finish when it finishes.
My wait procedure consists of two tasks. I first wait for a “connection”:
- name: Wait till VIOS is up
  ansible.builtin.wait_for_connection:
    delay: 300
    sleep: 30
    timeout: 2400
  become: false
  vars:
    ansible_python_interpreter: /usr/bin/python3
Then I check that viosupgrade is successfully completed:
- name: Wait till viosupgrade is finished
  ansible.builtin.wait_for:
    path: /home/ios/logs/viosupg_status.log
    search_regex: '\|COMPLETED'
  become: false
  vars:
    ansible_python_interpreter: /usr/bin/python3
In my experience, if the first task succeeds, the second task almost always succeeds in the same second, too. But not always. That’s why these are two different tasks.
As you may notice, I use become: false and the ansible_python_interpreter variable in these two tasks. I don’t need root privileges here, and I probably wouldn’t have them even if I did. Remember, I use the official image, which doesn’t know anything about my needs.
The Ansible module wait_for_connection doesn’t only wait for an SSH connection to the host; it also checks whether Python is available and usable. That’s why it is important to tell it where to find the Python interpreter. The second task needs a Python interpreter anyway. The official image has the Python interpreter under /usr/bin/python3. If you upgrade VIOS from 3.1 to 4.1, your previous Python interpreter was /opt/freeware/bin/python3, which doesn’t exist on the upgraded system. We must specify the path to the new interpreter.
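After both wait tasks succeed, it doesn’t hurt to double-check that the upgrade really delivered the expected level. A small sketch, reusing the ioslevel check from the beginning of the playbook:

```yaml
- name: Verify the new VIOS level
  ansible.builtin.command:
    cmd: /usr/ios/cli/ioscli ioslevel
  register: new_ioslevel
  changed_when: false
  # fail if the reported level is still below the target level
  failed_when: new_ioslevel.stdout is version(target_level, '<')
  become: false
  vars:
    ansible_python_interpreter: /usr/bin/python3
```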
Your turn!
As I already wrote, I don’t use some VIOS features like shared storage pools. I assume that the system doesn’t have any interim fixes because I remove them earlier, during service pack installation. I don’t remove the remnants of old_rootvg or altinst_rootvg, because I don’t have them when I upgrade VIOS.
But you may have them. That’s why I am interested in what problems you had upgrading VIOS and how you solved them. I hope I am not doing anything wrong here (I didn’t ask for permission) by referring to the playbook that another IBM Champion, Mark Steele, sent to me earlier this week. He does some black magic to determine which files he should preserve during the upgrade and copies the viosbr backup back to his Ansible controller node. You can also learn from his playbook how to check for altinst_rootvg correctly. His notes are very helpful too.
Have fun upgrading VIOS!
Andrey
Hi, I am Andrey Klyachkin, IBM Champion and IBM AIX Community Advocate. This means I don’t work for IBM. Over the last twenty years, I have worked with many different IBM Power customers all over the world, both on-premise and in the cloud. I specialize in automating IBM Power infrastructures, making them even more robust and agile. I co-authored several IBM Redbooks and IBM Power certifications. I am an active Red Hat Certified Engineer and Instructor.
Follow me on LinkedIn, Twitter and YouTube.
You can meet me at events like IBM TechXchange, the Common Europe Congress, and GSE Germany’s IBM Power Working Group sessions.
Nice automation! Personally, we like to upgrade sequentially. The reason is that we cannot afford I/O timeouts, so we first disable the NPIV paths for the partitions. Another reason is that we are a small team, and if something goes wrong, we don’t have enough hands.
But you're right, Andrey, that's my fear.