The OSG Compute Element – A Trip Report

In this post I’ll cover a few details, in hindsight, of setting up the Blueridge CE.

There is plenty of documentation on how to set up an OSG compute element. Some of it is referenced from other posts on this blog. However, I’ve gotten a few questions about the experience of setting up the compute element for the RENCI Blueridge cluster.

So this account mixes general CE setup concepts with a description of how they played out on the Blueridge CE.

Big Picture

The general idea of a CE is that there’s a machine which acts as a specialized OSG gateway or portal into a cluster. It lets the cluster receive and execute OSG jobs and report statistics about them. To put it differently, a CE is the basic nuts-and-bolts mechanism that makes a cluster part of the OSG.

This is accomplished by installing and configuring the Virtual Data Toolkit (VDT) software to communicate with the batch scheduling system native to the cluster. Common batch schedulers include Condor, Torque/PBS and SGE. OSG clusters are generally grouped into three tiers, Tier-1 through Tier-3, in order of diminishing expectations of reliability and availability of advanced services.

On Blueridge

But details matter. Here I’ll enumerate a few items that were either not directly addressed in the planning and setup guides I’ve seen or that turned out to have a prominent effect on the process. Naturally, there’s a really good chance none of these will be an issue for your cluster, but some of it might be helpful to someone.

Cluster Management Framework

Blueridge is a ROCKS cluster running CentOS-5. ROCKS is a Linux cluster management suite that simplifies the creation and maintenance of compute clusters. As with most tools, the cluster administrators have developed over time a set of best practices and policies for provisioning compute and storage resources. It’s important to understand how the CE installation process interacts with these standards.

For example, it was decided to use ROCKS to create a virtual machine for the CE gateway. That is, the machine is patterned after a standard Blueridge node so that its operating system and installed software can be centrally managed.

Machine Requirements

The CE virtual machine is set up along these lines (a sketch of the resulting disk layout follows the list). Bear in mind that you want more space than a single installation of the VDT requires. There needs to be room to upgrade, and the reasonable way to do that is to leave the old version completely intact while you install and test the new one. If the new version is unsatisfactory after some probation period, shut it down and start the old one back up. This is only possible if you have space for a couple of installations. There are also log files and other things to consider.

  • OS: CentOS-5
  • VDT install: About 5GB reserved
  • On an NFS share called /osg:
    • The OSG Worker Node Client (WN) install: 3GB
    • The certificate authority (CA) certificates installation: 2GB
    • The OSG_DATA area: 500GB
    • The OSG_APP area: About 300GB
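
For concreteness, here is roughly how that layout might look on disk. Only the /osg mount point and the items it holds come from the setup above; the local install path and the directory names under /osg are hypothetical stand-ins.

    # Local disk on the CE virtual machine (path is hypothetical)
    /opt/osg/              # room for a couple of versioned VDT/CE installs (~5 GB reserved)

    # NFS share mounted at /osg on the CE and on every compute node
    /osg/wn-client/        # OSG Worker Node Client (~3 GB)
    /osg/ca/               # trusted CA certificates and CRLs (~2 GB)
    /osg/data/             # OSG_DATA (~500 GB)
    /osg/app/              # OSG_APP (~300 GB)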

Supported Platform

Among other things, the decision to pattern the CE after a standard Blueridge node selects CentOS-5 as its platform. Fortunately, CentOS-5 is a supported platform of the VDT stack. You’ll want to verify that your platform is supported before venturing too far.

Early Installation Problems

Here’s an overview of some basic problems encountered during and after installation:

Certificates: OSG security is implemented via a certificate-based public key infrastructure. As such, pretty much nothing will work without first getting host and http service certificates as the documentation describes. The host certificate itself worked without a hitch, though for reasons unknown the command-line method of retrieving it failed with an error after repeated attempts. I eventually used the web browser interface and copied the cert from that; this approach worked fine.
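
However you obtain them, it’s worth sanity-checking the resulting certificate and key before going further. The commands below are plain OpenSSL and assume the conventional /etc/grid-security paths; adjust them to match your install.

    # Check the subject and validity period of the host certificate
    openssl x509 -noout -subject -dates -in /etc/grid-security/hostcert.pem

    # Confirm the certificate and private key match (the two digests should be identical)
    openssl x509 -noout -modulus -in /etc/grid-security/hostcert.pem | openssl md5
    openssl rsa  -noout -modulus -in /etc/grid-security/hostkey.pem  | openssl md5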

EDG Mkgridmap vs LDAP: The VOMS method (via edg-mkgridmap) was used in preference to GUMS because, for a Tier-3 site, it is simpler and adequate for our needs. This method uses /etc/grid-security/grid-mapfile as its main database for mapping grid user identities (DNs) to local Unix users. In at least one instance we observed that an existing process on the cluster periodically overwrote this file: the cluster had previously been configured to deliver the file automatically in support of a pre-existing Globus gateway that was for internal use rather than OSG use. Naturally, for the CE to work, this automated overwriting needs to stop on the CE machine.
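
For reference, the grid-mapfile is just a list of quoted DNs mapped to local account names, as in the made-up entry below. It also only takes a moment to check whether something like a cron job is regenerating the file behind your back.

    # /etc/grid-security/grid-mapfile -- hypothetical entry
    "/DC=org/DC=doegrids/OU=People/CN=Jane Doe 12345" osgvo

    # Look for cron jobs (root's and system-wide) that might be rewriting the file
    crontab -l | grep -i grid-mapfile
    grep -ril grid-mapfile /etc/cron*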

Trusted CA Location and Method: While there is a good deal of documentation on installing a Trusted CA, it is a complicated subject, in part because there are so many options. Without delving into each, I chose to install the Trusted CA in a shared location, that is, on an NFS share. This way, when the automated processes installed as part of the VDT update the CA certificates and certificate revocation lists (CRLs), both the worker nodes and the CE machine see the updates without further intervention. Without narrating the exact issues I saw, suffice it to say that you should understand which locations in the VDT ($VDT_LOCATION/globus/shared, for example) and which locations in the worker node client need to point to the Trusted CA so that links can be established appropriately.
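
As a rough sketch, assuming the shared CA installation lives at /osg/ca/certificates (a hypothetical path), the goal is to make every location that expects a trusted-CA directory resolve to that single copy. The standard Globus location is shown below; confirm the VDT and worker node client locations yourself before linking, since they vary by version.

    # Hypothetical shared CA location on the NFS mount
    CA_DIR=/osg/ca/certificates

    # Standard Globus trusted-CA directory used by many grid tools
    ln -sfn "$CA_DIR" /etc/grid-security/certificates

    # Repeat for whatever locations your VDT install ($VDT_LOCATION/globus/..., as
    # noted above) and worker node client expect, once you've confirmed them.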

Worker Node Client: This is software used by grid jobs at the compute nodes. As such, it needs to be installed on or visible to each compute node. Here again, there are many possible configurations. We chose to install it on an NFS mount point visible at the compute nodes. This is well worth discussing with site system administrators since standard cluster policies may or may not readily accommodate this approach.
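
In practice this just means every compute node mounts the same share. An /etc/fstab entry along these lines, with a placeholder server name and export path, is all each node needs:

    # /etc/fstab on each compute node -- server and export path are placeholders
    nas.example.org:/export/osg   /osg   nfs   rw,hard,intr   0 0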

Reporting

Once up, the CE should report its existence to the operations center. A couple of items that escaped my attention on my first reading of the installation guide need to happen before this works correctly:

Create Local Unix Users: Any Unix users required by the supported VOs must be created so that they are visible on the CE and on the worker nodes. Also, while it’s obvious that each user should have read access to its home directory, bear in mind that errors like this can be exceptionally difficult to find when running a grid job, so it’s good to log in as the created users to verify that everything’s in order. A problem with users not being created resulted in errors in the GIP (Generic Information Provider) tool and subsequent problems with CEMon, the CE statistics reporting component.
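
A quick way to catch this class of problem is to create the account and then actually exercise it rather than assuming the home directory is usable. The account name below is hypothetical, and at a ROCKS site you’d normally create users through whatever mechanism the cluster already uses to propagate accounts to the compute nodes.

    # Create a local account for a supported VO (name is hypothetical)
    useradd -m osgvo

    # Log in as that user and make sure the home directory is really usable
    su - osgvo -c 'touch ~/probe && ls -l ~/probe && rm ~/probe'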

Register The CE: Make sure the CE is registered with OIM as prescribed in the planning guide.

Storage

As a Tier-3 site, we opted for NFS-mounted, disk-based storage. That is, we don’t have a storage element (SE).

General Observations

The VDT install can be quite complex. As a result, it’s advisable to follow some general good practices a little more closely than usual, since the chances that something will go wrong along the way are fairly high.

Automation: I automated the installation so that it runs an uninterrupted download, install and configuration of the CE. The scripts work for multiple machines, but they implement the configuration described above, so they are likely to be informative rather than immediately reusable. For what they’re worth, they’re hosted at the RENCI-CI project.

Backups: Create a secure backup of the host and service certs you created. These take some manual intervention to obtain, so losing them can slow the process down quite a bit. I checked the configuration files into a source control system along with the scripts used.
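
A minimal sketch of such a backup, assuming the conventional /etc/grid-security locations for the host and http service credentials, might look like the following. The archive contains private keys, so treat it accordingly.

    # Bundle the host and http service certs and keys (paths are the usual defaults;
    # verify them against your own install) and keep the archive somewhere secure
    tar czf ce-certs-backup.tar.gz \
        /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem \
        /etc/grid-security/http/httpcert.pem /etc/grid-security/http/httpkey.pem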

Versioning: Version the installs of the VDT (main and worker node client) and use symbolic links to point to the active one.
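
Concretely, that just means giving each release its own directory and flipping a symlink to select the active one; the version numbers and paths below are purely illustrative.

    # Each release in its own directory; the symlink selects the active one
    ln -sfn /opt/osg/ce-1.2      /opt/osg/ce
    ln -sfn /osg/wn-client-1.2   /osg/wn-client

    # Rolling back is just repointing the links at the previous versions.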

Feel free to leave questions, comments and corrections.

 
