Virtualizing Engage Central

Well, this is just not going to be an interesting post to most of you. That’s just the way it is sometimes.

The OSG Matchmaker (OSGMM) is still an important part of the OSG job submission infrastructure for many users. One part of the OSGMM ecosystem is a component that runs maintenance and validation jobs. Validation jobs discover the condition of OSG sites. Maintenance jobs manage the applications and data that jobs need on those sites.

We need to move engage-central from its old physical hardware to a new virtual machine. The old machine will remain online until we’re comfortable that everything on the new one is working properly.

Here’s a diagram of the resulting system including the reporting web interface.

Here are the steps involved in (re-)building engage-central.renci.org on its new VM:

  • Create the virtual machine
    • Operating System: CentOS 5.5
    • Architecture: x86_64
    • Memory: 2GB
    • Disk
      • root (/): 8GB
      • /opt: 4GB
  • Copy /etc/grid-security from old to new machine.
  • Install OSG client (v-1.2.13)
    • Get pacman and install the client software:
      • Deploy RCI
      • envinit, osginit
      • mkdir -p $VDT_ROOT
      • osg_install_pacman
      • osg_install_worker_node_client
      • osg_ca_setup
    • Enable fetch-crl
  • Add condor Unix account
  • Install Condor (see config on old machine)
    • rci_require condor
    • cndr_install_all (7.4.2)
    • copy and edit
      • condor_config and
      • engage-central.condor_config
  • Install MyProxy
    • rci_require osg
    • cd $VDT_LOCATION
    • osg_install_myproxy
    • copy myproxy-server.config from engage-central
    • copy /var/myproxy from engage-central
    • /etc/init.d/myproxy start
  • Add osgmm Unix account
  • Install osgmm (see existing config)
    • mkdir $CONDOR_LOCATION/local.engage-central/log
    • mkdir $CONDOR_LOCATION/local.engage-central/spool
    • chown -R condor:condor $CONDOR_LOCATION/local.engage-central
    • chown root:root /osg/condor/current/local.engage-central-vm/pool_password
    • Copy old osgmm.conf to new machine
  • Verify osgmm is running
    • By checking output in Condor MasterLog
    • By looking at /home/osgmm/installs/current/var/log/osgmm.log
  • Configure the firewall to allow Globus and Apache services.
  • Configure services
    • Add Condor, MyProxy and Apache services to /etc/init.d
    • Use chkconfig (or similar) to register the services so they start at boot
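
For the “Verify osgmm is running” step, a small shell helper along these lines can pull the most recent matching lines out of a log. This is only a sketch: recent_log_matches is a made-up name, and the MasterLog path in the usage comment assumes Condor writes it to the log directory created above.

```shell
# Print the last N lines of LOGFILE that match PATTERN (case-insensitive).
# Returns non-zero if the file is unreadable or nothing matches.
# Usage: recent_log_matches LOGFILE PATTERN [N]   (N defaults to 5)
recent_log_matches() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    rlm_out=$(grep -i -- "$2" "$1" | tail -n "${3:-5}")
    [ -n "$rlm_out" ] || return 1
    printf '%s\n' "$rlm_out"
}

# Hypothetical usage (the MasterLog location is an assumption):
# recent_log_matches "$CONDOR_LOCATION/local.engage-central/log/MasterLog" osgmm
# recent_log_matches /home/osgmm/installs/current/var/log/osgmm.log . 10
```

A non-zero return from the first call would suggest condor_master has not logged anything about OSGMM yet.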

FAQ

_______________________________________________________________________________

Q: password authentication failure:

10/26 11:18:38 condor_write(): Socket closed when trying to write 13 bytes to collector at <152.54.1.145:9618>, fd is 6
10/26 11:18:38 Buf::write(): condor_write() failed
10/26 11:18:38 AUTHENTICATE: handshake failed!
10/26 11:18:38 ERROR: AUTHENTICATE:1002:Failure performing handshake|AUTHENTICATE:1004:Failed to authenticate using PASSWORD

A: The pool_password file needs to be owned by Condor’s real uid, the one that ran condor_master (root here):

chown root:root /osg/condor/current/local.engage-central-vm/pool_password

If the ownership is wrong, Condor logs:

10/26 11:50:08 error: SEC_PASSWORD_FILE must be owned by Condor’s real uid
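
A quick way to check the ownership before restarting Condor might look like the following. It is a sketch, not part of Condor: owned_by_uid is a hypothetical helper, stat -c is the GNU coreutils form, and uid 0 assumes condor_master was started by root.

```shell
# Succeed only if FILE is owned by the given numeric uid.
# Usage: owned_by_uid FILE UID
owned_by_uid() {
    # stat -c %u prints the owner's numeric uid (GNU coreutils).
    [ "$(stat -c %u "$1")" = "$2" ]
}

# Hypothetical usage with the path from the chown above:
# owned_by_uid /osg/condor/current/local.engage-central-vm/pool_password 0 \
#     || echo "pool_password has the wrong owner"
```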

_______________________________________________________________________________

Q: Can’t execute osgmm-wrapper

A: Ensure the condor user can see and execute the osgmm-wrapper script.

I set the osgmm user’s home directory’s permissions as follows to fix my issue:

98305 4 drwxr-xr-x 7 osgmm osgmm 4096 Oct 26 17:22 .
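
To find which directory is getting in the way, something like the sketch below walks up from the script and reports the deepest directory that other users cannot traverse. The function name is made up, it relies on GNU stat, and it only looks at the “other” permission bits, which assumes the condor user is neither the owner nor in the group of those directories.

```shell
# Print the deepest directory above FILE that lacks the execute (search)
# bit for "other" users; print nothing if every directory has it.
# Usage: first_untraversable_dir /absolute/path/to/file
first_untraversable_dir() {
    futd_dir=$(dirname "$1")
    while :; do
        # stat -c %A prints a mode like "drwxr-xr-x"; the 10th character
        # is the execute bit for "other" ("t" means sticky plus execute).
        futd_mode=$(stat -c %A "$futd_dir") || return 2
        case "$futd_mode" in
            ?????????[xt]*) ;;                  # others may traverse
            *) echo "$futd_dir"; return 1 ;;    # blocked here
        esac
        # Stop at the filesystem root (or "." for a relative path).
        case "$futd_dir" in
            /|.) return 0 ;;
        esac
        futd_dir=$(dirname "$futd_dir")
    done
}

# Hypothetical usage with the wrapper path from the log output:
# first_untraversable_dir /home/osgmm/installs/current/sbin/osgmm-wrapper
```

In my case this would have pointed at /home/osgmm before I changed its permissions to drwxr-xr-x.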

_______________________________________________________________________________

Q: OSGMM starts then fails immediately.

10/26 17:43:50 The OSGMM (pid 16776) exited with status 1
10/26 17:43:50 restarting /home/osgmm/installs/current/sbin/osgmm-wrapper in 2057 seconds

A: OSGMM has not been configured yet. Configure it via installs/current/etc/osgmm.conf, for example by copying over the old osgmm.conf as in the steps above.

_______________________________________________________________________________

Q: What about the graphs?

Right, engage-central serves lots of charts about engage usage. So how do we get those running…

A: Getting them running takes a fair amount of configuration: a web server with PHP, plus the Gratia and Engage data collectors.

This is intended as a high-level overview of the procedure, not a step-by-step recipe, since some of the scripts for this are highly environment-specific.

  1. Install apache
  2. Install and enable mod_php
  3. Install gratia accounting data collector
  4. Install engage usage statistics data collector
  5. Create symbolic links from web root to the data directories of the gratia and engage stats above.
  6. Copy index.php and the other web files from a pre-existing instance.
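
Step 5 can be sketched roughly as below. The helper name and the data directory paths in the usage comment are placeholders, not the real locations, and /var/www/html is just the stock Apache document root on CentOS.

```shell
# Link DATA_DIR into WEB_ROOT under NAME, replacing any existing link.
# Usage: link_into_webroot DATA_DIR WEB_ROOT NAME
link_into_webroot() {
    [ -d "$1" ] || { echo "no such data directory: $1" >&2; return 1; }
    [ -d "$2" ] || { echo "no such web root: $2" >&2; return 1; }
    # -s symbolic link, -f replace an existing link,
    # -n treat an existing link to a directory as a file, not a target dir
    ln -sfn "$1" "$2/$3"
}

# Hypothetical usage (both data paths are placeholders):
# link_into_webroot /path/to/gratia/data /var/www/html gratia
# link_into_webroot /path/to/engage-stats/data /var/www/html engage
```

Apache will only serve content through such links if FollowSymLinks (or SymLinksIfOwnerMatch) is enabled for the web root.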


