The ability to run a virtual machine with a self-contained computing environment has major advantages. Users can
- Choose the operating system that’s best for the application
- Execute programs that require elevated privileges
- Install any software they need
- Dynamically configure machine attributes like the number of cores to suit the host environment
The Engage VO is beginning to see researchers new to OSG whose default mode of operation is to spin up a VM on EC2. They quickly get used to having complete control of the computing environment.
This capability has been explored for the Open Science Grid before. Clemson built Kestrel, which supports KVM-based virtualization with an XMPP communication architecture, and STAR used it with great success. Clemson now also provides OneCloud, based on OpenNebula.
There’s also been work by Brian Bockelman, Derek Weitzel and others to configure virtual machines running Condor to join the submit host’s pool. Infrastructure background for that work and lots of great information is available at the team’s blog.
Recently, I’ve had new Engage users who are heavy users of virtualization. As mentioned before, they tend to assume control over the environment, and that background can make the need to specially prepare executables for the OSG, by static compilation and other packaging, seem onerous. Many Engage users, it should be added, have input and output file sizes of a few gigabytes and are not familiar with High Throughput Computing or a command line approach to virtualization.
They asked if it was possible to run virtual machines on the OSG so I set out to look for an approach that would allow researchers to
- Create virtual machines on their desktops using simple graphical tools
- Deploy virtual machines onto the OSG
- Transfer input files to and from the virtual machine
- Avoid complex interactions with HTC plumbing like configuring X.509 certs, Condor, etc.
Virtualization on RENCI-Blueberry
Since virtualization is not a currently supported technology on the OSG, step one is to create a small area in which we change that. RENCI’s Blueberry is a new cluster made of older machines that we’ve recently brought online. It’s a ROCKS, CentOS 5.7, Torque/PBS cluster with a small number of virtualization-capable nodes.
Here’s an overview of changes we made to the cluster:
First, we installed the KVM and libvirt virtualization packages on the virtualization-capable nodes.
We configured libvirt to allow the engage user to execute virsh. On the Condor side, jobs target the virtualization-capable nodes with:
- Job requirements: && (CAN_RUN_VIRTUAL_MACHINE == TRUE)
- Job attributes: +RequiresVirtualization=TRUE
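In a job's submit description on the Engage submit host, those two settings would look roughly like this (a sketch; the executable name is a placeholder, and the requirements expression would be appended to whatever requirements the job already has):

```
universe                = vanilla
executable              = run_vm.sh
requirements            = (CAN_RUN_VIRTUAL_MACHINE == TRUE)
+RequiresVirtualization = TRUE
queue
```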
A new resource was added to the GlideinWMS factory with RSL pointing at the virt queue on RENCI-Blueberry.
Creating and Running a VM
Virtual machines were created using virt-manager, the Virtual Machine Manager. It’s a graphical application providing a wizard-like interface for creating and managing VMs.
We used the command line virsh tool (virsh dumpxml) to export an XML description of the running virtual machine. Then the XML description and the disk image (the large file containing all of the VM’s data) were moved to the Engage submit host.
The OSG job was designed to
- Download the virtual machine’s XML description
- Determine the number of CPUs on the machine
- Modify the XML description to specify:
  - The appropriate number of CPUs
  - The correct location for the VM image file
- Download the virtual machine
- Execute the virtual machine
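The XML edits in the steps above can be sketched with the Python standard library; the domain XML layout follows libvirt's format, while the function name, file paths and the sample domain below are hypothetical (and the actual job used era-appropriate tooling, not necessarily Python 3):

```python
import os
import xml.etree.ElementTree as ET

def localize_domain_xml(xml_text, image_path, cpus=None):
    """Adapt a libvirt domain description to the local worker node:
    set the vCPU count and point the disk at the downloaded image."""
    root = ET.fromstring(xml_text)

    # Use every core the node offers unless told otherwise.
    vcpu = root.find("vcpu")
    vcpu.text = str(cpus if cpus is not None else os.cpu_count())

    # Point the first disk's <source> at the image we just downloaded.
    source = root.find("devices/disk/source")
    source.set("file", image_path)

    return ET.tostring(root, encoding="unicode")

# A minimal, made-up domain description of the kind virsh dumpxml emits.
domain = """<domain type='kvm'>
  <name>engage-vm</name>
  <vcpu>1</vcpu>
  <devices>
    <disk type='file' device='disk'>
      <source file='/home/user/vm.img'/>
    </disk>
  </devices>
</domain>"""

print(localize_domain_xml(domain, "/scratch/job123/vm.img", cpus=4))
```

The rewritten description is then handed to virsh create to spawn the guest.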
This works. Jobs configured to run in the GlideinWMS virt group on the Engage submit node map to glideins on RENCI-Blueberry. There, the jobs download the XML config and the image, make the needed edits and spawn the virtual machine.
Getting Work to the Virtual Machine
Now, if you’ve tried to do this kind of thing before, you realize this is where things get tricky.
When the virtual machine launches, it has no idea what to do. This is part of the reason that some previous approaches put Condor on the machine. That way, it can join an existing Condor pool and has all the good things that Condor brings us in terms of file transfer, matching and so on. But getting credentials into the virtual machine securely to allow it to join the Engage pool is tricky. If you know how to do that, please leave a comment.
Alternatives … and OS Versions
Now, in principle, there are two other ways to do this that would work fine. If we can get files onto and off of the virtual machine, it would be OK to transfer them into and out of the worker node the old-fashioned way, with globus-url-copy. So here are two other mechanisms for file exchange between a host and a guest:
Shared Host/Guest Filesystem: More recent versions of Libvirt/QEMU/KVM support sharing filesystems between the host and guest. In this model, the guest’s XML description can specify a directory on the host that should be mounted within the guest. But, as I mentioned, the RENCI-Blueberry cluster runs CentOS 5.7, which supports only a significantly older version of the virtualization stack. We discussed upgrading to a newer version, but that would prevent this solution from being generally reproducible on the OSG.
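For comparison, on a stack new enough to support it, the export is declared in the guest's XML description, inside the devices section; a sketch with made-up directory and tag names:

```
<filesystem type='mount' accessmode='passthrough'>
  <source dir='/scratch/job123/shared'/>
  <target dir='jobshare'/>
</filesystem>
```

The guest then mounts the export with something like mount -t 9p -o trans=virtio jobshare /mnt/shared.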
Guestfish: Next, there’s libguestfs and its associated interactive shell, guestfish. Guestfish lets you mount a disk image in user space; that is, there’s no need for root privileges. It also has convenient wrapper scripts for copying files into and out of an image. But, again, it requires a version of CentOS significantly newer than 5.7.
From this angle, it looks like VMs on OSG could be an everyday occurrence if it weren’t for very old OS versions.
File Sharing REST API
Before giving the approach up for dead, I decided to try something off the beaten path.
Beanstalk: I installed Beanstalk on the submit node. Beanstalk is a very simple queue with a plain-text protocol over TCP. You can put messages on a queue and take them off. You can name queues, which it refers to, weirdly, as tubes. Beanstalk has no notion of authentication, which is not great.
Beanstalkc: This is the Python client for Beanstalk.
Box.net: One of many file sharing sites with a REST API.
A command line authentication script does the token negotiation with Box.net using wget and curl.
Then, Box.net URLs are published into the Engage event queue.
When a virtual machine boots, it installs and runs the boot script.
The script installs the Beanstalk client, reads a single item from the event queue, downloads the file it points to from Box.net and processes it. Queues are named per user and job so that different users and jobs never collide.
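The collision-free naming can be as simple as deriving the tube name from the user and job identifiers; a minimal sketch (the "engage" prefix and the exact format are assumptions for illustration, not something Beanstalk requires):

```python
def tube_name(user, job_id):
    """Build a per-user, per-job Beanstalk tube name so that messages
    for different users and jobs land in separate queues."""
    # Hypothetical scheme: a fixed prefix plus the two identifiers.
    return "engage.{0}.{1}".format(user, job_id)

# Producer and consumer simply agree on the same name for a given job.
print(tube_name("alice", 7))   # engage.alice.7
```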
Finally, it converts the download URL to an upload URL and publishes the results of the run via the file sharing API.
So at the end of my run of three VMs on RENCI-Blueberry, there were three files waiting for me at Box.net.
I invite comments on how others have secured communication to a VM on OSG. In particular, as mentioned above, I’d love to hear how others have gotten X.509 credentials onto a VM in this environment.
Anyone else running VMs on the OSG?