One of the more nerve-racking tasks as a storage admin is performing upgrades to your storage arrays. Through the years I’ve done a few upgrades to my NetApps, and even when following the directions, I still worry that something isn’t going to go right and I’ll be left restoring a lot of data. I don’t exactly love working in the CLI either, which adds to the nerves.
While working with this Tegile HA2400 we had a software update available (126.96.36.199.140802 to 188.8.131.52.140925) and it seemed like a great time to document this process. Software updates are done through the web interface and take just a few clicks on each controller. With failovers that allow for minimal interruption, I was able to perform this upgrade towards the end of the working day and we never had any application interruptions.
Below are the steps to perform this system upgrade.
1. If running in Active/Passive, log in to the web interface of the passive Zebi node (default credentials are admin/tegile)
2. Verify that it’s the passive node by viewing the available pools. If there are no pools running on this node, you will only see “Zebi System” as the pool name
3. Click on “Settings” then “Administration”
4. On the left side, click “System Upgrade”
5. Click the link for “Check for Upgrades”
6. If there are any available updates, they will appear next to “Update Available”
7. Click the “Upgrade” button, click “Upgrade Local” and then click “OK” to confirm upgrading to the latest version
8. The installation will begin and show the status of the tasks it is performing followed by a notification that the node is rebooting.
9. After the node has rebooted, log back in to the web interface
10. Click on the Node name in the top right corner to verify the new version is running
11. Click the Flag icon in the top right and then the “ACK” button for the upgrade events that are generated.
12. Click on the Node name again and then click “Go to peer node” (this will open a new tab to connect to the other node in the cluster)
13. Click on “Settings” and then “HA”
14. Click “Switch Over All Resources” and click “OK” to confirm
15. Once controller A displays the message confirming the switchover, all resources have been migrated
16. Click on “Settings” then “Administration”
17. Click on “System Upgrade” and then ensure that the “Update Available” version matches the version applied to the partner node
18. Click the “Upgrade” button, then click “Upgrade Local” (note that it recognizes the peer has already been upgraded) and click “OK”
19. As before, the current task status will display and you’ll be notified once the node is rebooting
20. After the reboot, log back in to the web interface and click the node in the top right corner to verify the version
21. Click on “Settings” and then “HA”
22. After the last upgrade, all the resources sitting on Controller B moved back to Controller A, and Controller B now shows as standby
That is all there is to it. The whole process from start to finish took under 15 minutes (probably closer to 10 if I hadn’t been taking screenshots along the way). The steps for an active/active setup would be essentially the same, but you would move all the resources off one controller and onto the other prior to performing the first upgrade. Interestingly, despite not having auto failback enabled (Settings -> HA -> Advanced Options), all the resources that were on controller B moved back to controller A after the upgrade completed. During the next upgrade I will see whether that happens again or was just a fluke this time around. I might even do that upgrade with a heavier load on the box just to see what happens.
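For anyone who prefers to see the order of operations written down as logic, here is a rough Python sketch of the rolling upgrade described above. It is only a model of the sequence, not a Tegile API: the `Controller` class, `switch_over_all`, `rolling_upgrade`, the pool names and the "old"/"new" version strings are all hypothetical names made up for illustration. It assumes a node with no pools is the passive one (step 2), and it covers the active/active case by draining one controller first. It does not model the failback to controller A that I saw after the second upgrade.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical stand-in for one node of an HA pair (not a vendor API)."""
    name: str
    version: str
    pools: list = field(default_factory=list)

    @property
    def is_passive(self) -> bool:
        # Mirrors step 2: a passive node hosts no data pools.
        return not self.pools


def switch_over_all(source: Controller, target: Controller) -> None:
    """Move every pool from source to target (the 'Switch Over All Resources' step)."""
    target.pools.extend(source.pools)
    source.pools.clear()


def rolling_upgrade(a: Controller, b: Controller, new_version: str) -> list:
    """Upgrade both controllers one at a time and return a log of the actions."""
    log = []
    # Upgrade the already-passive node first if there is one.
    first, second = (a, b) if a.is_passive else (b, a)
    if not first.is_passive:
        # Active/active: drain the first node before upgrading it.
        switch_over_all(first, second)
        log.append(f"switch over {first.name} -> {second.name}")
    first.version = new_version
    log.append(f"upgrade {first.name}")
    # Move everything onto the freshly upgraded node, then upgrade the other.
    switch_over_all(second, first)
    log.append(f"switch over {second.name} -> {first.name}")
    second.version = new_version
    log.append(f"upgrade {second.name}")
    return log


# Active/passive example: A holds the pool, B is passive.
node_a = Controller("A", "old", ["pool-1"])
node_b = Controller("B", "old")
for action in rolling_upgrade(node_a, node_b, "new"):
    print(action)
```

The point of the sketch is simply that a controller is never upgraded while it is still serving pools, which is why the upgrade can be done with applications running.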