Bye, bye VMware ESXi: Field report on the migration to Proxmox

Are you tired of being bullied by VMware when it comes to virtualization, or is the discontinuation of VMware ESXi already foreseeable for you? Many users are flirting with Proxmox as an alternative. A blog reader took a closer look and sent me his impressions from his tests. I'm posting the whole thing here on the blog.



Since Broadcom took over VMware and cleaned up its product portfolio, many administrators have been thinking about switching from VMware ESXi to the Proxmox virtualization environment. A few weeks ago, blog reader Jochen mentioned on BlueSky, in response to one of my posts, that he was testing Proxmox. I asked him whether he could send me a few notes once the tests were complete, so that I could prepare a post for the readership. Jochen has now done so and sent me his impressions this week.

The initial situation

Jochen's starting point: he has been running VMware ESXi servers for himself and his customers in the small and medium-sized business sector for many years. Given the size of the installations, however, most recently only the free version of VMware ESXi was used. He and his customers are still running VMware ESXi versions 6.5, 6.7 and 7.0.

The hardware in use is an HPE ML 350 Gen9 server with two Xeon E5-2667 v3 CPUs and 128 GB of RAM. Over the years, the SAS datastores have been migrated to SSDs. An aging Fujitsu 1U rack server with an E3-1270 v3 CPU and 32 GB of RAM was available as an emergency backup server.

What VMs are running?

You might ask what is actually running in 34 VMs. In addition to the systems for his own business, customer monitoring and other things such as home automation, there are also VMs related to his amateur radio hobby and the operation of a free WLAN network (MOEWA-NET), writes Jochen. "So a few VMs accumulate for many small services. And as an IT company, you naturally also have TEST systems at the ready."

Depending on their importance, all VMs were backed up daily or weekly to a QNAP NAS via NFS using the XSIBackup software, and restores to the backup server were tested regularly. The employees of the company Jochen works for implemented a similar setup for customers, in some cases with an additional online backup.



The exit from Broadcom/VMware is imminent

Jochen writes: "We have occasionally looked at other hypervisors, such as Hyper-V, Proxmox, VirtualBox, XEN … However, we didn't see any advantages over VMware ESXi until 2021, so we continued to pursue this route."

After Broadcom's takeover of VMware went through, alarm bells started ringing in the company. After all, Broadcom's approach is well known in the industry, says Jochen – and I have also covered it here on the blog. So the IT service provider had to come up with an ESXi exit strategy for itself and its customers.

In this context, various hypervisor systems were tested again. The old backup server was used for the first tests, as the company was familiar with its performance.

The first question was then: "Do we want to switch from one dependency to another, i.e. switch to Hyper-V?" The answer was clear: "No, we don't want to, especially as Hyper-V is also on the MS hit list and MS's behavior gives us food for thought every day." With this premise, only the free systems remained, of which Jochen and his colleagues took a closer look at Xen and Proxmox.

Stayed with Proxmox 8

Jochen describes the result of the tests as follows: "We stuck with Proxmox VE 8; the hypervisor left the best impression and has clearly improved since version 5, the last version we had tested. The Proxmox forum, which is partly in German, also makes a very good and helpful impression."

The employees of the IT service provider spent three weeks testing Proxmox. The aim was not just to switch from VMware ESXi to Proxmox; they also wanted to make use of new features after the move. As the company had not used VMware vSphere for years, they had to take a fresh look at the failover and HA strategies that are included in Proxmox VE.

Proxmox cluster for testing

The employees of the IT service provider then put together Proxmox clusters from discarded PCs and ran through various scenarios. This really did take some time (it ran alongside the normal business) and led to a few aha moments. In any case, writes Jochen, we gained the following insights, which we have now implemented:

1) Hardware RAID or ZFS software RAID
2) RAM configuration
3) Hard disk/SSD configuration
4) Testing Windows, Linux and other clients such as Mikrotik CHR and OpenWRT
5) USB hardware (dongles, audio cards, etc.) can be passed through just as with ESXi (see the example below)
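As a side note on point 5: in Proxmox, a USB device is typically passed through to a VM either in the GUI or with a one-liner on the shell. A minimal sketch (the VM ID 101 and the vendor:product ID are placeholders, not values from Jochen's setup):

    # pass a USB device through by its vendor:product ID (find it with lsusb)
    qm set 101 -usb0 host=0403:6001
    # alternatively, pass through a whole USB port, e.g. bus 1, port 2
    qm set 101 -usb1 host=1-2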

According to Jochen, with VMware ESXi you had to be extremely careful that the storage controller was properly supported and that hardware RAID could actually be used by ESXi. For that you really needed a proper RAID controller with a BBU (battery backup unit), otherwise performance was poor.

Using Proxmox with ZFS

If you want to use Proxmox with ZFS, then the RAID controller (required under ESXi) has to go again; instead you use a simple HBA controller to pass the disks straight through, according to Jochen's experience. There are controllers that can be switched cleanly to HBA mode, he writes, and some can be reflashed (though that is out of the question for customer systems) – but ZFS does not work properly with the controller in RAID mode.
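
For illustration, a minimal sketch of what setting up a mirrored ZFS pool on disks behind an HBA and registering it as Proxmox storage can look like (pool name, storage ID and disk IDs are placeholders; the Proxmox installer can also create the pool for you):

    # create a mirrored pool from two SSDs attached to the HBA
    zpool create -o ashift=12 tank mirror \
        /dev/disk/by-id/ata-SSD_SERIAL_A /dev/disk/by-id/ata-SSD_SERIAL_B
    # make the pool available to Proxmox as VM storage
    pvesm add zfspool tank-vm --pool tank --content images,rootdir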

Question: Does it have to be ZFS with Proxmox?

Jochen writes about this: Of course you can also use ext4 or LVM storage (on top of hardware RAID), and according to our tests this works well, with decent performance. But at some point you want failover, replication or HA, even if only to try it out – and that is the point at which you need ZFS. You could also use Btrfs, but Jochen and his colleagues have not tested that. And ZFS needs RAM, namely a few gigabytes, to be really fast.
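
A common tuning step in this context (a general Proxmox/ZFS hint, not something from Jochen's report) is to cap the ZFS ARC cache so it does not compete with the VMs for RAM, roughly like this:

    # limit the ZFS ARC to e.g. 8 GiB (value in bytes), takes effect after a reboot
    echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf
    update-initramfs -u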

The migration

"Once we had clarified the hypervisor and the hardware, we started the migration," writes Jochen. There are many descriptions and reports online, including on the Proxmox website itself. According to Jochen's observations, this information has increased significantly in recent weeks, but not all of it matches the current Proxmox 8 version.

The service provider's migration strategy, for its own systems now and for customers in the near future, is as follows. As the IT service provider did not want to buy new server hardware, an intermediate step was needed. Jochen describes it like this:

  • We use an older HP Z240 i7 workstation with 2 × 1 TB SSD for the OS, 2 × 2 TB SSD for data storage and 32 GB RAM.
  • An additional network card was also installed. The system runs Proxmox 8 with ZFS storage.
  • We also used a QNAP with 6 TB RAID 1 and NFS sharing.
  • The Proxmox intermediate server also gets access to the NFS share on the QNAP via a dedicated network port – the same setup we used for the ESXi!
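
For reference, attaching such a QNAP NFS share as Proxmox storage is a one-liner on the shell (server address, export path and storage ID below are placeholders, not Jochen's values):

    # register the QNAP NFS export for backups, ISOs and disk images
    pvesm add nfs qnap-nfs --server 192.168.10.20 --export /Proxmox \
        --content backup,iso,images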

The migration itself then proceeded relatively quickly, as follows. Yes, some steps could be optimized, but two backups (one under ESXi and one under Proxmox) are better than no backup.

1) Shut down the virtual machine.
2) Back up the complete VM to the NFS share on the QNAP (via shell or GUI).
3) Create a new VM in parallel on the intermediate server with the required hardware parameters and discard the disk that was created.
3a) If necessary, reuse the MAC address of the ESXi VM for the network card. This preserves DHCP-assigned addresses.
3b) Windows: Add a CD-ROM drive with the Proxmox VirtIO drivers (virtio-win-0.1.173.iso).
4) Import the ESXi .vmdk disk from the Proxmox shell (see the command sketch after this list).
5a) Windows: Add the disk to the VM as IDE (important!) and set the boot order.
5b) Linux: Add the disk as a VirtIO disk, set the boot order and enable writeback cache here as well.
6) Start the VM; with Linux everything usually works. Install the QEMU guest agent.
7) The following steps apply to Windows:
7a) Log in, uninstall the VMware Tools, reboot.
7b) Log in, install the complete VirtIO driver package including the QEMU guest agent, reboot.
7c) Log in and check in Device Manager whether all hardware has been recognized.
7d) Now add an additional VirtIO disk to this VM while it is running (Windows must have seen such a disk once so that the driver is active).
7e) Shut down the VM, delete the auxiliary disk, then reattach the main disk as a VirtIO device with writeback cache.
8) Start the VM and test; it should now run correctly again.
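
To illustrate steps 4) to 6), here is a rough sketch of the shell commands typically involved (VM ID 101, the storage name local-zfs and the path to the copied .vmdk are placeholders, not Jochen's actual values):

    # 4) import the ESXi disk; it shows up as an "unused disk" of VM 101
    qm importdisk 101 /mnt/pve/qnap-nfs/oldvm/oldvm.vmdk local-zfs
    # 5a) Windows: attach it as IDE first and make it the boot disk
    qm set 101 --ide0 local-zfs:vm-101-disk-0 --boot order=ide0
    # 5b) Linux: attach it as VirtIO with writeback cache instead
    qm set 101 --virtio0 local-zfs:vm-101-disk-0,cache=writeback --boot order=virtio0
    # 6) enable the guest agent once qemu-guest-agent is installed inside the VM
    qm set 101 --agent enabled=1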

Windows loses its activation, and some software (such as Lexware) does too, Jochen has observed. Since you usually have the licenses, reactivation is not a problem. The current VirtIO drivers no longer work with Windows 7, so the employees of the IT service provider use the virtio-win-0.1.137.iso version there.

The employees have now done this with all VMs. Due to the intermediate server's limited main memory, RAM allocations were sometimes reduced and only the important VMs were started. "That worked just fine," they say. Jochen describes the experience of his own migration as follows:

  • After all 34 VMs had been moved in this way, we switched off the HPE ML 350 Gen9 main server and put its SSDs aside (you never know).
    We then switched the controller to HBA mode, installed Proxmox VE 8 and configured it with ZFS. Both Proxmox servers also have access to the QNAP via a dedicated network card and the NFS file system.

Now all that remained was to move all the VMs from the intermediate server back to the main server. According to Jochen, there are two ways to do this:

1) Switch off the VM on the intermediate server, back it up to NFS, then restore it on the main server and restart it.
2) If ZFS is used: form a cluster from the two Proxmox servers and migrate the VM while it is running (see the sketch below).
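
A sketch of what option 2) can look like on the command line (cluster name, node name, IP address and VM ID are placeholders; both nodes are assumed to use local ZFS storage):

    # on the main server: create the cluster
    pvecm create prodcluster
    # on the intermediate server: join the cluster
    pvecm add 192.168.10.11
    # then move a running VM, including its local disks, to the main server
    qm migrate 101 pve-main --online --with-local-disks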

Jochen used the second variant for the important VMs. It takes longer, but there is no additional downtime, writes the reader. Apart from that last step, migrating his own 34 VMs, with a net disk volume of roughly 1.5 TB, took a weekend. Moving them back from the intermediate server to the main server during live operation took about another day.

Jochen writes: "There are certainly optimizations to shorten the downtime, for example by running the ESXi VM on the NFS of the QNAP instead of on internal storage. But without V-Sphere there would also be a downtime, which could be done at night."

Jochen's advice is to think about the CPU type a new Proxmox VM is given when moving it. During testing, the IT staff already saw small but manageable performance differences in their test scenario. Jochen writes about this:

  • "The fastest is the CPU type "host", which passes the CPU directly through, is the experience from the tests.
  • After some tests in advance, however, we opted for x86-64-v2-AES, where the performance loss is not so high.

However, the advantage of x86-64-v2-AES is that you can do HA and failover even if your servers have different CPUs. This was the case internally in the company with the i7-based intermediate server. It also avoids software having to be reactivated every time the CPU type changes, writes Jochen.
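
For reference, the CPU type can be changed per VM in the GUI or on the shell; a minimal sketch (VM ID 101 is a placeholder):

    # portable baseline model that still allows migration between different CPUs
    qm set 101 --cpu x86-64-v2-AES
    # or: pass the host CPU through (fastest, but ties the VM to identical hardware)
    qm set 101 --cpu host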

The conclusion after the migration

Jochen draws the following conclusion: "After 10 days of error-free operation of 27 of the existing VMs, we have no regrets about the changeover. On the contrary, the Proxmox web interface is fast and fun to use, and we are enjoying the new features. We are still struggling with a few ancient VMs, but these are historical 16-bit systems that I had kept for nostalgic reasons.

Of course, we immediately reactivated the backups (and tested them). We are currently using the Proxmox-internal vzdump to the QNAP via NFS. Incidentally, the backup speed is more than double what we saw under ESXi. As mentioned above, the connections between the servers and the QNAP run over dedicated gigabit network links."
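
A sketch of what such a vzdump backup to the NFS storage looks like when run by hand (VM ID and storage name are placeholders; in day-to-day use this is normally configured as a scheduled backup job in the web interface):

    # snapshot-mode backup of VM 101 to the QNAP NFS storage, zstd-compressed
    vzdump 101 --storage qnap-nfs --mode snapshot --compress zstd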

What remains is to tackle HA, replication and failover, which Jochen hasn't done yet either. And then there is the Proxmox Backup Server (PBS) to get to grips with – that would be another step forward in terms of incremental backups and shorter recovery times, says the reader.

"If you move the systems live, you must have dedicated network lines between the servers, otherwise the running systems will slow down significantly. The same is recommended for the NAS connection for backups. In addition, 10 GBits would be nice, that's also on the to-do list." Jochen describes his experiences.

At this point, my thanks to Jochen, who has offered to provide a few more details (system commands) and screenshots relating to the VM migration if that would be helpful. However, I would first like to wait for feedback from the readership – perhaps there will be specific questions to which Jochen can then respond with details and screenshots, where possible for him.

Similar articles:
Broadcom acquires VMware for 61 billion US-$
Broadcom plans to sell VMware end-user computing and carbon black businesses
Contracts for all VMware partners terminated by Broadcom for 2024
VMware OEM portal offline, customers cannot activate VMware licenses
Broadcom ends perpetual licenses for VMware products – End of the free ESXi server?
Statement from Broadcom on issue after Symantec acquisition
Symantec acquisition by Broadcom ends in license/support chaos
After discontinuation: VMware Player, Workstation and Fusion seems to remain
Microsoft survey on virtualization: Migration from VMware
Private equity firm KKR buys VMware end customer business for 4 billion dollars
VMware product portfolio: Licensing internals; and Lenovo has been out since Feb. 27, 2024

