Building Linux Clusters
Cluster Management & Administration Tools
Version 2.0BETA

David HM Spector December 18, 2000

Acknowledgments
Introduction
What's in this Release?
Known Issues and Bugs
Installation Requirements
Obtaining the Cluster Administration System
Installing the Required Software
Installing the Cluster Administration System
The Cluster Build Process
Installing ClusterForge Step-by-Step
Using the System
    Connecting to the Cluster Management System
    Logging in to the Cluster Management System
    Overview of the Cluster Management System
    Adding a Node
    Viewing the Node Database
    Showing a Node's Characteristics
    Adding a New Network Interface
    Other Forms
Generating the DHCP File
SourceForge Menus and Commands
Creating Users
Appendix A: The ClusterForge Configuration File
Appendix B: The ClusterForge Crontab Entries
    Root tasks
    Tasks for clusteradmin




Acknowledgments

SourceForge: The author would like to thank the SourceForge crew, and especially Tim Perdue, for the creation of SourceForge – it's a great system that will prove invaluable for developers of all stripes. It was a fun challenge to write all of the portability code; I hope I can continue to make valuable contributions to such an important project!

O'Reilly: An extra-special thanks to Laurie Petrycki, the executive editor at O'Reilly & Associates, for being so understanding about all the things that can conspire to get in the way of getting stuff done.

Michelle Smith: (My wife) for putting up with all the stress and angst that accompanied parts of this project, especially with a new baby in the house and the unbelievable amount of work that entails.




Introduction

This document details the installation of version 2 of the Building Linux Clusters "Cluster Management System" software. This software is a replacement for the version shipped with the 1st Edition of Building Linux Clusters and adds significantly more functionality.

This new software release is based on the framework provided by SourceForge, an Open Source collaborative development environment produced by VA Linux Systems and published under the GNU General Public License (GPL). SourceForge provides a wide range of collaborative tools for developing software, such as an easy-to-use interface to software version control (CVS), bug tracking, discussion forums, a way to archive "code snippets" (useful code libraries that are not necessarily part of a complete software package), and many other tools – all of which are useful in the clustered environment, where software development projects can quickly become quite complicated due to the nature of parallel systems.

The Cluster Administration System components add a number of internal tables to the base SourceForge[1] system that aid in the management of Linux clusters, including the ability to add, edit and delete nodes in the cluster, the ability to manage batch queues, and so on. The system is designed to be extensible, so that new functionality may be easily added without re-releasing (or forcing you to re-install) the entire package. The additions have been made in such a way that the cluster management database components are separate from the base SourceForge tables, which will make upgrading the SourceForge package itself easier as that package evolves.

Because of problems with the first version of the management software, the author has decided to make this update available via the O'Reilly & Associates web site rather than wait for a new edition of the book. This will enable faster updates, bug fixes and other new information to be delivered to you more quickly. These updates will also be available via the author's home web server, which can be found at http://www.zeitgeist.com/software/clusteradmin/ .

Finally, this document is a work in progress[2] and does not document all of the capabilities of this system – it is meant to help you get your cluster up and running by getting the cluster administration portion of ClusterForge set up. There are many, many more features in this package than are described/documented here. A full description and complete manual, along with an updated replacement for the Cluster Management chapter in the 1st edition of Building Linux Clusters, will be made available via O'Reilly and from the author directly.

[1] Occasionally I will refer to the Cluster Administration & Management system as "ClusterForge." It's a heck of a lot shorter and a homage to its roots in the SourceForge code base, even if the whole "XXXXforge" thing is getting a little bit hackneyed.

[2] It also has not been edited by O'Reilly & Associates, so any typos or other faux pas are the author's alone!




What's in this Release?

The new release of the Building Linux Clusters Cluster Management tool kit is a completely new system for managing your cluster. It is based on SourceForge, the Open Source collaborative development system, and uses this package as the framework into which cluster management tools are inserted. The base SourceForge system provides a number of tools that can aid in software development efforts, including:

• A database-centric, browser-based user environment that offers several levels of user privilege
• A browser-based project description/creation system
• A browser-based interface to CVS, the Concurrent Versions System
• A bug reporting/tracking system that ties bug reports to projects or modules within projects
• A system for categorizing free-standing bits and pieces of code into libraries that aid in code reuse

To this, the cluster administration tools add (at this first release):

• Node definition/management
• Cluster-node interface management
• Cluster user management

To these basic tools, any number of other facilities may be added; among those in the works are modules for:

• Batch Queue Management (January 2001)
• Resource management: NIS groups, disk and CPU quotas (February 2001)
• DNS management for clusters (February 2001)
• Cluster-wide process accounting (March 2001)
• Cluster load monitoring (March/April 2001)

Any ideas and suggestions that you may have for other modules that would be useful in this system would be greatly appreciated. You can send your ideas to: [email protected] Code contributions and bug fixes are also welcome and can be sent to the same address.




Known Issues and Bugs

As with any (every?) software package, ClusterForge has some bugs and other issues. Here's the list of issues for the 2.0Beta version:

1) Display quirks: There are small gray artifacts on some of the ClusterForge screens. This is due to bugs in the theme support in the SourceForge code base. I'll be trying to nail it down, but the theme support in SourceForge is not fully baked.

2) The SourceForge administrative interface is really hard to use. I'll be adding both documentation and better UI elements over time.

3) There is no sanity check of the host IP addresses. At this release the add-node.php3 code does not check to see if the IP address you enter is on the same network as the master node. In order for DHCP (and, in fact, for the cluster as a whole) to work, the primary interface of each node must be on the same network as the master node of the cluster.

4) There is no sanity check for the hardware address. At this release the add-node.php3 code doesn't check to see if the format of the hardware address matches the kind of interface you've specified in the form (i.e., ETHn [Ethernet] type addresses should have 48 bits of address). There is support in the cluster_node_interfaces table for this feature and this will be fixed in a subsequent release.

5) The software manuals included as part of the cluster documentation (e.g., MPI, PVM, PADE, etc.) are vanilla HTML and have not yet been modified to match the look and feel of the rest of ClusterForge. This will be fixed in an upcoming release.

6) The configuration file has a flag that controls whether or not shell accounts are automatically created – in this release this feature is not implemented. Shell accounts are always created when new users are registered with the ClusterForge system.
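Until the sanity checks described in items 3 and 4 are implemented, it is worth verifying by hand that each address you enter is plausible. One simple approach is to check the master node's address and netmask and compare the candidate node address against them (the addresses below are purely illustrative):

{root}# ifconfig eth0 | grep 'inet addr'

With a netmask of 255.255.255.0 and a master address of, say, 192.168.1.1, a node address such as 192.168.1.20 is on the master's network, while 192.168.2.20 is not. Likewise, an Ethernet (ETHn) hardware address should be six colon-separated hex octets (48 bits), e.g. 00:A0:C9:12:34:56.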





Installation Requirements

In order to install and use this version of the Cluster Management & Administration Tools you will need a number of other ancillary software packages. In the next release of the printed book these packages will be included on the CD-ROM, but for this stand-alone release, you will need to FTP these packages yourself[3].

Table 1: Required software packages

| Package | Description | Version | Download Location |
|---------|-------------|---------|-------------------|
| MySQL | SQL database | >= 3.22 | http://www.mysql.com/downloads/ |
| MySQL shared | Supporting libraries for the database | >= 3.22 | (same as above) |
| MySQL client | Client interfaces | >= 3.22 | (same as above) |
| Perl DBI interface | Perl database-independent interface | >= 1.14 | http://www.cpan.org/authors/id/TIMB/ |
| Perl MySQL interface | The MySQL database driver for DBI | >= 1.2 | http://www.cpan.org/authors/id/JWIED/Msql-Mysql-modules-1.2215.tar.gz |
| PHP | The scripting language used by this package | >= 3.02 | http://www.php.net/ |
| GNU Mailman | Mailing list processing software | >= 2.0 | http://www.gnu.org/software/mailman/mailman.html |
| CVS | Version control software | >= 1.10 | http://www.cvshome.org/ or the RedHat/RPMS directory on the Building Linux Clusters CD-ROM |
| CVSWeb | Web interface to CVS repositories | >= 1.9 | http://www.cs.cornell.edu/nogin/RPM/cvsweb.html |
| OpenSSL* | HTTP encryption | >= 0.9.4 | http://rpmfind.net/linux/RPM/openssl.html |
| Apache + mod_ssl OR ModSSL* | Apache SSL module | >= 1.37 or >= 2.4.10 | ftp://ftp.MASTER.pgp.net/pub/crypto/SSL/Apache-SSL/ |

* Only required if you need secure HTTP connections.

[3] We would include them in this online distribution, but it's always best to get software from its source. Good places to find any/all of this software are http://www.freshmeat.net/, http://www.rpmfind.net/, and http://www.sourceforge.net/. The PDF version of this document includes hyperlinks that can be used to download these packages directly; the URLs are also included in case the PDF reader you are using doesn't support hyperlinks or you are reading a hardcopy version of this document.




There are other packages that are very useful in debugging problems or general poking around, but not required in order to use the cluster management system – here are some that I have found useful:

Optional/Useful Software

| Package | Description | Version | Download Location |
|---------|-------------|---------|-------------------|
| phpMyAdmin | Browser interface to MySQL | >= 2.1 | http://phpwizard.net/projects/phpMyAdmin/index.html |
| DDD | Data Display Debugger | >= 3.2 | http://www.gnu.org/software/ddd/ |




Obtaining the Cluster Administration System

The Building Linux Clusters Administration System may be obtained from two sources:

O'Reilly & Associates FTP site: ftp://ftp.ora.com/published/oreilly/linux/clusters

The author's web site: http://www.zeitgeist.com/software/clusteradmin

The Cluster Administration System is distributed as a gzipped tar archive; it will have a file name of the form BLC-ClusterAdmin-X.Yreltype.tar.gz, where "X.Yreltype" is a release number and the kind of release. For example, BLC-ClusterAdmin-2.0beta.tar.gz represents the 2.0 beta release of the software. This distribution contains all of the cluster management software. As new modules are released and/or updated, packages will be released that represent individual modules; these will have names of the form BLC-ClusterAdmin-modulenameX.Yreltype.tar.gz. These will be drop-in replacements for existing modules, or new functionality that will come with their own upgrade or installation instructions. This document covers the installation of the whole package, not any of the individual modules.
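For example, you could fetch the 2.0 beta release from the O'Reilly FTP site with wget (any FTP client will do equally well; the exact directory path and file name are assumptions here and may differ as new releases appear):

% wget ftp://ftp.ora.com/published/oreilly/linux/clusters/BLC-ClusterAdmin-2.0beta.tar.gz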




Installing the Required Software

Before attempting to install the new version of the cluster management software, you should install (and configure) all of the "required" software packages listed in table 1. It is strongly recommended that, wherever possible, you obtain these packages in the form of RPM (RedHat Package Manager) files. The packages should be installed in the following order:

1) MySQL packages
2) Perl packages, starting with the MSQL-MySQL modules, followed by the Perl DBI package, then the MySQL driver for DBI
3) GNU Mailman
4) CVS and CVSWeb. You should make a symbolic link from the root to wherever you put your CVS repository on your system. For example, if your CVS repository is on the device named "/spare" you would make the link by invoking "ln -s /spare /cvsroot" as root. If your CVS repository is on its own device, you could simply mount it as "/cvsroot"
5) OpenSSL/ModSSL, if secure HTTP connections are required at your site

PHP should have been installed already on your master node by the Building Linux Clusters CD-ROM installation software. If, for some reason, it is not installed, the RPM files can be found in the RedHat/RPMS directory on the CD-ROM.
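Where you have RPMs, the installation order above might look something like the following. The package file names here are only illustrative; substitute whatever versions you actually downloaded:

{root}# rpm -Uvh MySQL-3.22.32-1.i386.rpm MySQL-shared-3.22.32-1.i386.rpm MySQL-client-3.22.32-1.i386.rpm
{root}# rpm -Uvh perl-DBI-1.14-1.i386.rpm perl-Msql-Mysql-1.2215-1.i386.rpm
{root}# rpm -Uvh mailman-2.0-1.i386.rpm
{root}# rpm -Uvh cvs-1.10.8-1.i386.rpm cvsweb-1.93-1.i386.rpm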




Installing the Cluster Administration System

Before installing the cluster administration system itself, all of the software listed in the "required software" table must be installed. Each of the packages comes with test scripts and/or instructions that will test the installation of the package. Make sure that all of the packages are successfully installed before attempting installation of the Cluster Administration System itself. If the required packages are not installed, trying to get the rest of the system to work properly will be difficult if not impossible.

Once the required software has been installed, work can begin on the installation of the Cluster Administration System itself. It is recommended that you unpack the distribution into a temporary directory, such as /tmp. The distribution kit will unpack into a directory that has the same name as the package you are unpacking. Inside this directory will be a README/INSTALL file (i.e., this document!) that describes the software, what is required to use it, and how it should be installed. The software can be unpacked with the command:

tar zxvpf filename

where filename is the file you have retrieved from one of the sources listed above. It is important to include the "p" option in the tar command – this preserves the file permissions that were recorded in the tar file. Since the files are for the most part executable scripts, if the permissions are incorrect, the Apache web server will refuse to execute them. This can be time-consuming and difficult to debug.

The tar command will show you all the files it is unpacking as it unpacks them. At the end of the process you will have a new directory called BLC-ClusterAdmin-X.Yreltype. This directory will contain installation instructions (this document) in PDF (Adobe Acrobat) and HTML formats, and four directories: db, etc, html and utils. The db directory contains the SQL schema required to instantiate the Cluster Administration Database. The etc directory contains configuration files that define the operation of the system. The html directory contains the code that makes up the bulk of the system. The utils directory contains perl and other scripts that are run by cron in order to implement various parts of the administration system and the development environment, such as account creation and node management.

The rest of this section details in a step-by-step fashion how to install the software and get it running.




The Cluster Build Process

If you have already built a cluster using the procedure outlined in the 1st edition of Building Linux Clusters, skip ahead to "Installing ClusterForge Step-by-Step" and continue from there. You are already far enough along that you don't need to use ClusterForge for preliminary setup of your slave/compute nodes.

If you have not yet set up your cluster, you should follow the process outlined in the book, but stop once you have gotten the master node set up and have made the boot floppies for the slave/compute nodes. When you are ready to bootstrap the slave/compute nodes, pick up here in the ClusterForge installation section (skip chapter 6, Cluster Management – this new software replaces the version that comes with the book). Using the new software will let you set up all your slave nodes without having to re-enter any information once the slave nodes are set up. Once you have completed the software installation described in this document and told the cluster administration database about your compute nodes, you can pick up where you left off in Chapter 5, Software Installation and Configuration.




Installing ClusterForge Step-by-Step

Step 1:

Log in as root

In order to install and set up this software, you will need to log in to the master node of your system as root. All of the installation and setup work will be done as superuser.

Step 2:

Backing up existing materials

If you have installed a previous version of the cluster management tools (for example, from the first release of the Building Linux Clusters CD-ROM), you should back up that release to a safe place. This can be done by copying the /home/httpd/html directory to another device and then recursively removing the html directory, or simply by renaming the directory. For example:

{root}# cp -Rp /home/httpd/html /anotherdev/html-SAVE; \
        rm -rf /home/httpd/html

will copy, and then remove, the old directory, or

{root}# mv /home/httpd/html /home/httpd/html-SAVE

will rename the old directory. If you are at all short on disk space on the device where /home resides, it is better to take the first course rather than just renaming the directory.

Step 3:

Deactivating PostgreSQL

The previous version of the software that shipped with the 1st edition of Building Linux Clusters made use of the PostgreSQL database. The new version uses a different database that was used in the development of SourceForge, upon which the new version of these tools is based. PostgreSQL is therefore no longer necessary (unless you are using it for some other purpose on your system, in which case it should be left alone). PostgreSQL can be shut down with the following command: {root}# /etc/rc.d/init.d/postgresql stop Next, it would be a good idea to stop PostgreSQL from starting up at system boot-time so that a process that isn’t being used by anything does not consume CPU time and other resources. This can be done by deactivating postgresql’s init script with the command: {root}# /sbin/chkconfig --level 345 postgresql off
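If you want to confirm that PostgreSQL is now disabled, chkconfig can list its settings for each run level; this is purely a verification step, not required. Run levels 3, 4 and 5 should show "off":

{root}# /sbin/chkconfig --list postgresql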

Step 4:

Installing the new HTML directory

The html directory included with the new distribution contains a complete version of the administration system and all the manuals described in Building Linux Clusters. To install this in its proper place, cd to the distribution and then recursively copy the html directory to /home/httpd/, as in: % cd /tmp/BLC-ClusterAdmin-X.Yreltype




% cp -Rp html /home/httpd/

These commands copy the entire directory structure to the destination, preserving the file permissions that were in the source.

Step 5:

Instantiating the Cluster Administration Database

In order to function, the scripts and HTML files installed in the previous step work by accessing a database where information about your cluster is stored. With MySQL, a new database is created using the mysqladmin program. You can name the database anything you like; the configuration files supplied with this distribution use the name "alexandria" (the same name as SourceForge uses) – if you decide on another name, you will have to update the configuration file, which is covered in Step 6. To create the database, execute the following command:

{root}# mysqladmin create alexandria

Mysqladmin will respond with a message indicating the database has been created. If you want to password-protect access to the database server (a good idea in multi-user environments), you can set a password with the following command:

{root}# mysqladmin password "somepassword"

where somepassword should be replaced by the password you wish to use. Make sure you remember what this is – there's no way of recovering it if you lose it! Without this password, anyone who can make a connection to your machine can connect to your database and execute any command whatsoever (i.e., they could delete everything without a trace!).
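You can verify that the new database exists with mysqlshow, which lists the databases the server knows about (add the -p option if you set a password):

{root}# mysqlshow

The listing should include alexandria (or whatever name you chose) alongside the server's built-in mysql and test databases.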

Step 6:

Loading the Cluster Administration Database

Once the database has been instantiated, we need to load in its schema and its default values. The database SQL file is in the "db" directory of the distribution and is named clusterforge.sql. To load the database tables, execute the command:

{root}# mysql -psomepassword dbname < clusterforge.sql

where dbname is the name of the database you created with the mysqladmin command and somepassword is the password that controls access to the server (note that there is no space between -p and the password). The mysql command will digest the contents of this file silently. If you wish to see what the database looks like, you can execute the following command:

{root}# mysql -e "show tables" -psomepassword dbname

MySQL will show you a very long listing of tables that exist in the database that looks like this:




+-------------------------------+
| Tables in alexandria          |
+-------------------------------+
| activity_log                  |
| activity_log_old              |
| activity_log_old_old          |
| bug                           |
| bug_bug_dependencies          |
| bug_canned_responses          |
| bug_category                  |
| bug_filter                    |
| bug_group                     |
| bug_history                   |
| bug_resolution                |
| bug_status                    |
| bug_task_dependencies         |
| cluster_cpu_types             |
| cluster_info                  |
| cluster_interface_names       |
| cluster_node                  |
| cluster_node_accounting       |
| cluster_node_interfaces       |
| cluster_node_packages         |
| cluster_project               |
| cluster_queue_info            |
| cluster_queue_node_membership |
:

This is an abbreviated listing of some of the names of tables in the database. Also loaded are the default values needed to set up the system; they are not shown here, but they are stored in the database tables. Once the tables have been created and the default values loaded, we can proceed to configuring ClusterForge for use.

Step 7:

Installing the ClusterForge Configuration File

ClusterForge, like SourceForge upon which it is based, uses an extensive configuration file to tell the system where to find the resources it needs to operate. In the ClusterForge distribution directory, in the "etc" subdirectory, you will find a file named local.inc. This file needs to be installed in the master node's /etc/ directory and made readable only by root and root's group. For example:

{root}# cd /tmp/BLC-ClusterAdmin-X.Yreltype
{root}# cp /tmp/BLC-ClusterAdmin-X.Yreltype/etc/local.inc /etc/
{root}# chmod 660 /etc/local.inc

This will ensure that unauthorized processes and users cannot read the configuration file.

Step 8:

Editing the Configuration File

Once the configuration file is in the right place, we’ll need to set up all of the parameters correctly for your master node’s configuration. The file is extensively commented, and included in its entirety as Appendix A of this document. In this section we’ll cover exactly what you need




to change in order to get your system running. It is strongly recommended that you do not change any other flags or settings in this file – doing so can lead to disappointing results.

The local.inc file is divided into four sections that control various parts of ClusterForge's operations. These are:

• Preliminaries – SSL Security
• Part I – SourceForge Hostnames
• Part II – Databases, html/php paths
• Part III – GUI Modifications

Since the Cluster Administration System is based on SourceForge you will see that name used throughout the configuration file – just pretend it says "ClusterForge." :-) We are only concerned with Parts I & II, where we will specify hostnames and paths used by the system, specifically:

From Part I:

$sys_default_domain = "master.cluster.ny.zeitgeist.com";
$sys_cvs_host = "master.cluster.ny.zeitgeist.com";
$sys_download_host = "master.cluster.ny.zeitgeist.com";
$sys_shell_host = "master.cluster.ny.zeitgeist.com";
$sys_users_host = "master.cluster.ny.zeitgeist.com";
$sys_docs_host = "master.cluster.ny.zeitgeist.com";
$sys_lists_host = "master.cluster.ny.zeitgeist.com";

This collection of host names specifies which hosts perform what functions in the cluster administration system. All of these variables should be set to the hostname of the master node. As you can see from the examples here, the name of the master node of my cluster is "master.cluster.ny.zeitgeist.com" – although that's pretty obvious from the text, I am being purposefully explicit here because it is very important that all host names used in the configuration of this system be fully-qualified domain names (FQDNs). This is so that there is no ambiguity about what host you are naming, and so that someone accessing the administration system from a machine with a slightly broken domain name resolver doesn't get confused and fail to find a required host.

Another reason to use FQDNs is that, if necessary, the parts of the administration system and many of the SourceForge functions can be run over several hosts. For example, if your corporate source code repository lives on some big server with a large disk farm, you could put the name of that server here instead of the cluster's master node. Just make sure that CVS is installed properly there, otherwise the ClusterForge source code repository will not function correctly.

The next important host names involve the domain name servers that will be used by the system:

$sys_dns1_host = "master.cluster.ny.zeitgeist.com";
$sys_dns2_host = "master.cluster.ny.zeitgeist.com";

These hosts specify the DNS servers that know about the cluster. It is usually the master node itself that is configured as a DNS server for the cluster, so as not to clutter up the




organizational name space with all of the cluster nodes when all the main network needs to know about is the master node. Conversely, the cluster administration system needs to know about all of the nodes of the cluster if it's going to be able to administer them. The second DNS host can be set to something that is "up a level" that knows about the larger organizational network, if any. This would allow for faster name resolution of objects like the hypothetical corporate CVS server mentioned previously.

From Part II:

The next section deals with the database and the places files are accessed or stored.

$sys_dbhost="localhost";
$sys_dbname="alexandria";
$sys_dbuser="root";
$sys_dbpasswd="";
$sys_server="mysql";

Here in $sys_dbname you need to specify the database name you passed to mysqladmin when you created the database; if you gave the database server a password, you should specify that in $sys_dbpasswd, or the ClusterForge code won't get access to the database. None of the other variables should be changed.

Next we'll specify where users can upload files to be added to the CVS repository. This must be an absolute path name, for example:

$FTPINCOMING_DIR = "/nfs/ftp/incoming";

The next variable specifies the location where files can be downloaded. This is the root of the FTP tree for projects stored under ClusterForge.

$FTPFILES_DIR = "/nfs/ftp/sourceforge";

These next two variables tell the ClusterForge code all about accessing itself. If these are not set properly, ClusterForge will fail in various mysterious ways that are almost impossible to debug (it's a PHP problem that is probably fixed in PHP 4.x with its debugging hooks).

$sys_urlroot="/home/httpd/html/";

This is the absolute pathname of the location where you copied the html directory back in Step 4. Make sure you leave the trailing slash ("/"); it is used extensively by the code in forming URLs.

$sf_cache_dir = "/tmp/sfcache";

This is an absolute pathname to a directory that is used by ClusterForge to write a set of cache files that store various chunks of HTML that don't often change. This is done because ClusterForge (and SourceForge) is actually a very large server-side PHP application, and every time a user accesses the ClusterForge home page, or their personal home page (more on this later!), large amounts of HTML would otherwise have to be rebuilt on the fly. For large numbers of users this would have a very negative impact on the performance of the web server. Set this variable to a device that has free space and is world-readable – the /tmp file system is a good place.




Finally, save the changes to the file.

Step 9:

Creating the Cache Directory

As mentioned in Step 8, ClusterForge needs a directory in which it can place cache files; you will need to make this directory and set its access permissions so that it can be read by the world, but written only by root and the web server, for example:

{root}# mkdir /tmp/sfcache; chmod 775 /tmp/sfcache
{root}# chgrp nobody /tmp/sfcache

(Note that the directory name must match the $sf_cache_dir setting in /etc/local.inc exactly, including case.)

Step 10:

Setting the PHP Path

In the Apache configuration file, the PHP include path needs to be set so that ClusterForge can find all of its libraries and utility code; this is done by adding a line to the default directory directive in the Apache configuration file. Edit the file /etc/httpd/conf/httpd.conf and look for the default directory directive, which starts with a line of the form <Directory "/home/httpd/html">. Scroll down the file until you find the closing tag for this directive, the line </Directory>, and insert the following line before the closing tag:

php3_include_path /home/httpd/html/include:.
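For orientation, the relevant region of httpd.conf should end up looking roughly like this. The options inside the directive will vary with your Apache configuration (the ones shown are only placeholders); only the php3_include_path line is being added:

<Directory "/home/httpd/html">
    # ... existing options for the document root ...
    php3_include_path /home/httpd/html/include:.
</Directory>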




Save the addition and restart the web server with the command:

{root}# /etc/rc.d/init.d/httpd restart

This will restart the web server and make the web portion of the ClusterForge system ready for use.

Step 11:

Creating the clusteradmin User

If you have not already done so as part of the cluster installation process specified in Chapter 6 of Building Linux Clusters, you should create a clusteradmin user. This can be done with the command:

{root}# useradd -G clusteradmin -p password clusteradmin

This will create the user "clusteradmin," along with a new group of the same name, with the password specified. Once the clusteradmin user has been created, find out what Unix UID was assigned to it during the creation process. This can be done by using the grep command, as follows:

{root}# grep clusteradmin /etc/passwd

The result should come back looking very much like this:

clusteradmin:$1$0RTdMI8d$jq0O7SX6WDxX2tZ6eVhM4.:501:502::/home/clusteradmin:/bin/bash

The "501:502" in the sample output above are the User ID and the Group ID of the clusteradmin user – the values on your system will probably be different. Make note of the User ID; it will be needed when we set up the background jobs.
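Alternatively, the id command reports the same numbers directly, without the password hash:

{root}# id clusteradmin
uid=501(clusteradmin) gid=502(clusteradmin) groups=502(clusteradmin)

(The 501 and 502 shown here are just examples; use the values printed on your system.)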

Step 12:

Copying the clusteradmin Utilities

Beyond the visible portion of the cluster management system there are a large number of background processes that run at various times to do the work specified by the cluster manager, and keep the various project-management and collaboration subsystems in proper working order. From the ClusterForge distribution directory, recursively copy the utils sub-directory into clusteradmin's home directory. For example:

{root}# cp -Rp /tmp/BLC-ClusterAdmin-X.Yreltype/utils ~clusteradmin/

Step 13:

Installing the Batch Jobs

Now that the utility files are in their final destination, there are just a couple of tasks to complete before the cluster management system can be brought on-line, all of which need to be done from the clusteradmin user's home directory. Go there with:

{root}# cd ~clusteradmin/

In the clusteradmin directory, a subdirectory needs to be created to hold data extracted from the cluster administration database. Create this directory as follows:

{root}# mkdir dumps




Make sure that the clusteradmin user owns this directory with the command:

{root}# chown clusteradmin dumps

Next, modifications need to be made to an include file that is common to all of the background processes; this file is named include.pl in the utils directory that was just copied into the clusteradmin user's home directory. Edit this file and look for the line that starts:

$dummy_uid = "503";

The "503" here must be changed to the Unix User ID that you recorded for the clusteradmin user. This setting tells the background processes what user owns various work files – without it the background processes cannot run. Save your changes to this file (or make the change from the command line, as sketched below).
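If you prefer to edit the file non-interactively, a Perl one-liner like this should work; the "501" here is only an example, so substitute the UID you actually recorded:

{root}# perl -pi -e 's/^\$dummy_uid = "\d+";/\$dummy_uid = "501";/' ~clusteradmin/utils/include.pl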

Finally, the commands that run the batch jobs themselves are stored in two different files: one file for jobs that need to run as the clusteradmin user, and the other for jobs that need to run as root, with full superuser privileges. These batch jobs are represented by the files crontab-clusteradmin.sf and crontab-root.sf. Install the cron jobs with the following commands:

{root}# crontab -u clusteradmin crontab-clusteradmin.sf
{root}# crontab crontab-root.sf

For the most part these cron jobs will never need to be changed, except in two instances. First, the interval for account creation: the default is to scan the database every 6 hours for new accounts to create. If your installation is very busy, or you think that 6 hours is too long to wait for user creation, the timing can be changed as per the comments in the batch files. The second instance is the interval for new-node activation. The cron file contains a batch job (dump-dhcp.pl) that scans the cluster administration database every 6 hours looking for nodes that are either new or have had their primary network interfaces updated. When this job finds such hosts in the database, it writes a new /etc/dhcpd.conf file and restarts the DHCP daemon, which allows these new/updated hosts to be reconfigured when they are booted. If you have added the information for a new node into ClusterForge and the node fails to bootstrap itself when you power it on, you will most likely want to run this job by hand.

Once these cron jobs have been installed, the Cluster Administration system should be operational. You should proceed to the next section and log in to start using the system to set up your cluster.




Using the System

The best way to see if everything is successfully configured is to log in to the system and try it out. This section will get you started using the system by showing you the basics of adding nodes and interfaces to your cluster. Once we have these basic tasks out of the way, the rest of this document will cover very specific tasks involved in the other components of ClusterForge, those that are part of the SourceForge code base.

Connecting to the Cluster Management System

Connecting to the management system is as easy as pointing a browser at the master node. Start a browser and connect to the master node of your cluster. If the web server is set up correctly, you should see the web page shown in figure 1:

Figure 1




Some details, such as the number of cluster hosts shown, will differ (because no nodes have yet been entered into your copy of the cluster management database, or because your cluster may contain a different number of nodes).

Logging in to the Cluster Management System

Logging in to the system is quite simple; an account for the cluster administrator has been included in the initial data that was put into the database when you loaded the tables and other default information. To start the login process, click the "login" label, which will bring up the login panel shown in figure 2.

Figure 2

In the "login name" area, enter the account name of the cluster administrator, in this case "admin" – the password is the same default password that is set up at cluster build-time: "mr.linux". Enter the password and press the "Login" button.




Overview of the Cluster Management System

Once logged in, there are a number of options available, as shown in figure 3. As you can see, the browser window is divided into two areas: the left side is dedicated to action-oriented items such as commands and menus, while the right side is used for informational elements, such as information about tasks that may be assigned to you based on projects you are a member of, and so on. In this first release, almost all of this information is the same information one would see in SourceForge – subsequent releases will add a large number of cluster monitoring tools and other information elements useful in the clustered environment.

The items of greatest interest to the cluster administrator are the SourceForge Management and the Cluster Management menus. We'll focus first on the cluster management portion of the system since that is where information about your cluster will reside.

Figure 3

Select the Cluster Management menu to view the options in the cluster management section. This will bring up the cluster administration menus, shown in figure 4.




Figure 4

The cluster management screen is where all of the functions related to the management of cluster nodes can be found. At the moment, the modules supplied with ClusterForge handle the creation and management of cluster nodes only. Upcoming modules will add to this functionality. From this screen it is possible to list all of the nodes currently defined in the cluster, or add a new node. Since listing all the nodes of an empty cluster will not get us very far, we'll start by adding a new node to the cluster.

Adding a Node

Adding a node to the cluster administration database is quite simple, and requires just a few pieces of information, which you should have ready since you have either built your slave/compute nodes or are ready to start that process. You will need:

• The names of the slave/compute nodes
• The IP address you wish to assign each node
• The name of the primary Ethernet card (e.g., "eth0")
• The hardware address of the primary Ethernet interface of each node




Optionally, you might wish to have the following extra information available:

• System manufacturer name
• System serial number
• Number, kind and speed of CPUs
• Memory configuration
• Amount of available disk space
• Name, hardware address and IP address of secondary network cards

If you have all the information on the slave nodes you wish to install, click on the “Add a node” link. This will bring up the display shown in figure 5.

Figure 5




Fill in the required information listed above; the optional information, though not required, will be used by other reporting modules that will be available in upcoming releases. The form displayed will look very much like the one displayed here, but may differ slightly depending upon the fonts installed on your system; the domain name shown in the form will, of course, be the domain name you set up when you configured your master node.

Once you are satisfied with the information you have entered, press the "Insert" button and the node will be entered into the cluster administration database. If you want to enter more compute nodes into the database, you can press the "BACK" button in your browser, modify the form and insert more nodes. Once you have entered a node or two, we can look at the listing of the cluster database to see what's been recorded. Click on the link labeled "Cluster Management Home" to return to the top level of the cluster management area.

Viewing the Node Database

To see what is in the database, click on the link labeled "List cluster nodes…" and a listing very much like figure 6 will be displayed:

Figure 6

This listing gives a capsule summary of the nodes in the cluster, the kinds of processing power they have, and the configuration of their primary network interfaces. On the right-hand side of the screen there are two additional links next to each node: one brings up a form to edit the node's information and the other allows you to delete the node from the database. These should be used with care as they can seriously affect the operation of your cluster. The delete link will prompt you before taking any action, so there is no possibility that you will accidentally remove a node.




Showing a Node's Characteristics

To see a more complete listing of a node's facilities (in fact, all the information you entered in the "Add Node" form), you can click on any of the node names in this list. This will bring up a display similar to the one shown in figure 7:

Figure 7

Of course, your nodes will have different characteristics (and probably real IP addresses, unlike the dummies used here!). A useful feature of the "show node" display is that you can see the other network interfaces that are installed on the node, edit them, add new interfaces, or even delete interfaces. Of course, editing or deleting interfaces should be done with care, as such actions can disrupt the cluster (and probably your users, too).

Adding a New Network Interface

Adding a new interface is started by clicking on the "Add interface to this node…" link in the "ShowNode" display. This will bring up a display like the one shown in figure 8:




Figure 8

Editing an interface, unlike adding a new node, requires only three pieces of information:

• The network device name
• The IP address
• The hardware address

The network device name is selected here via a pop-up menu that lists 17 different kinds of network interface types; the other requisite information is entered directly in the text boxes provided. On non-primary interfaces only the IP address and device name are critical, since you cannot normally boot off one of these interfaces. However, the system will not allow you to enter a bogus IP address, or use the name of a device that already exists on the node you are editing. Even though you can't boot from the device, DHCP information is generated so that the network device can be automatically configured at system boot-time.

Other Forms

There are several other forms in the cluster administration section, but they are all variants of the forms that have been shown here and are used for editing already-entered information. Rather than go over these forms in detail, we will move on to the process of getting the DHCP information extracted from the database so you can get your slave/compute nodes up and running.




Generating the DHCP File

The /etc/dhcpd.conf file contains the information needed by the master node in order to bootstrap the slave/compute nodes and finish the installation of your cluster. The information that you have entered into the cluster administration database is not very useful if it can't be used to get cluster nodes configured. The way this information is translated from the database into the DHCP system is by way of a small Perl program that runs as part of the larger set of utility scripts that you installed when you installed ClusterForge. This Perl script looks at the existing DHCP configuration file installed on the master node and makes note of its last modification time. The script then looks at all of the network interfaces described in the interfaces portion of the cluster database, and if it finds any interfaces that are newer than the DHCP file it writes out a new version of the file and restarts the DHCP daemon so that nodes can be initialized.

For your purposes at the moment, if you have just added a new node to the database, there is one problem with this scenario: the process that checks the database only wakes up every six hours. In the normal, day-to-day operation of your cluster it is (hopefully) unlikely that you will be adding new nodes or interfaces several times a day, but the process is there as a utility to catch new nodes or interfaces as they are added. This 6-hour wait does little for you in testing your cluster, however, so you can simply run the script by hand, as shown below.
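Assuming the utilities were installed into the clusteradmin user's home directory as described in Step 12, running the job by hand (as root) looks like this; it is the same command the root crontab in Appendix B runs every six hours:

{root}# cd /home/clusteradmin/utils/underworld-root
{root}# ./dump-dhcp.pl

If the script finds new or updated interfaces, it writes a fresh /etc/dhcpd.conf and restarts the DHCP daemon; the new nodes can then be powered on and bootstrapped.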




SourceForge Menus and Commands

All of the non-cluster-specific components of ClusterForge are part of the SourceForge tool-set that forms the framework for the clustering tools that share their space. These tools allow a wide variety of activities, from discussion forums, to source code and release control, through bug tracking and the creation of code libraries that aid in code re-use. From the perspective of setting up your cluster there are really only two aspects of the SourceForge tools that are important: the first is the SourceForge Administration menu, shown in figure 9, and the second is the new-user form, shown in the next section in figure 10. The SourceForge administration menus allow you to examine the state of user accounts and manage development groups.

Figure 9




Creating Users

The most important capability, next to being able to get the cluster running, is the ability to get user accounts set up. Traditionally this required a lot of work on the part of the system admin to create accounts, set up directories, etc. Following the SourceForge model, ClusterForge allows users to request and set up accounts themselves by connecting to the system and selecting the "register as a new user" link on the home page. Filling in the new account registration screen will generate a new user account within about 6 hours, which is the interval at which the batch job that monitors accounts is run. This delay exists so that the password files and NIS maps are not constantly being updated, which would disrupt general use of the cluster.

Figure 10




Once the batch process has created a user account, a user may log in and begin using the cluster. No further action is required for an individual's use of the cluster.

There is a lot more that can be done with ClusterForge and the SourceForge-based project management and collaboration tools, but this should get you started with the cluster administration components. A more complete manual for ClusterForge and all its components will be published in coming months, along with more modules that implement new pieces of the cluster administration functionality. These will be made available via the O'Reilly & Associates web site and directly from the author, as mentioned in the introduction to this document.
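If you need an account to become active right away (while testing the system, for example), the account-creation batch jobs can be run by hand rather than waiting for cron; these are the same commands the crontab files in Appendix B run, executed in the same order:

{root}# su - clusteradmin -c 'cd ~/utils/underworld-dummy && ./dump_database.pl'
{root}# cd /home/clusteradmin/utils && ./new_parse.pl
{root}# cd /var/yp && make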




Appendix A : The ClusterForge Configuration File

<?php
//
// Really Important Safety Tip: --> DO NOT LEAVE ANY WHITE
// SPACE AFTER THE CLOSING PHP TAG AT THE END OF THIS FILE!
//
// Doing so will really confuse the software and cause
// 1) cookies to fail and 2) HTML page headers to fail
// which will give you some really hard-to-debug problems.
// Why? PHP is a *pre-processor* -- anything that's not PHP gets
// emitted as part of the HTML stream and processed by the browser,
// so white space is meaningful!
//
// Preliminaries: Security
//
// It would be a "good thing" if your web server had an SSL certificate so that
// users' connections to the cluster administration system were encrypted.
// However, sensible people realize that not everyone wants to spend a few
// hundred dollars every year on a new certificate and many clusters are on
// private networks where this isn't an issue. If you have a cert, set this
// variable to "1", otherwise leave it at the default of "0" (zero).
// If you turn this on and you have no ssl-enabled http server running you
// won't be able to log in to the cluster administration system and you'll
// get very frustrated. You have been warned!
//
$sys_use_ssl = 0;
//
// This flag controls whether or not users can create their own SF accounts,
// or if account creation MUST be done by an administrator.
//
$sys_user_created_accounts = 1;
//
// This flag controls whether the background cron jobs use the "/etc/passwd"
// and shadow password files, or if they put user name entries into
// "/var/yp/src/passwd"
// The default is to use the YP files.
// **DON'T CHANGE THIS UNLESS YOU KNOW WHAT YOU'RE DOING (If this is
// changed, users will be created in /etc/passwd which will stop them from
// using the cluster, which depends on NIS)
$sys_use_yppasswd = 1;
//
// PART I - SourceForge hostnames
//
// Hostnames should be fully qualified domain names (FQDNs); using short names
// would be prettier but would stop you from distributing your SourceForge
// implementation across multiple domains.
//
// Of course, if you have a lot of machines serving a particular purpose
// such as FTP or for shell accounts, the "hostname" here might be in
// reality an addr_list of machines that is serviced by a round-robin
// mechanism or something fancy like a local-director.
//
// The default SourceForge domain
// this is used wherever the "naked" form of the SourceForge domain
// might be used. E.g., "mailto:[email protected]"
$sys_default_domain = "master.cluster.ny.zeitgeist.com";

// Machine that hosts CVS
$sys_cvs_host = "master.cluster.ny.zeitgeist.com";

// Machine used for downloading sources/packages
$sys_download_host = "master.cluster.ny.zeitgeist.com";

// Machine(s) that host users' shell accounts
// N.B. to the SourceForge Crew: What's the difference between the user
// host and the shell host? They are clearly two different hostnames
// in the source code, but they seem to serve the same purpose..?
$sys_shell_host = "master.cluster.ny.zeitgeist.com";
$sys_users_host = "master.cluster.ny.zeitgeist.com";

// Machine that hosts docs (such as the FAQs and the various software
// licenses (*BSD, [L]GPL, etc.). You REALLY want this to be the same
// machine that the SourceForge code is running on because all of the
// PHP makes reference to these documents in terms of relative paths that
// are part of the SourceForge code tree.
$sys_docs_host = "master.cluster.ny.zeitgeist.com";

// Machine that hosts the SourceForge mailing lists (This could also be
// the mail host if you have enough horsepower & bandwidth)
$sys_lists_host = "master.cluster.ny.zeitgeist.com";

// Domain Name Servers
// N.B.: Use terminated FQDNs here (with the final ".") so the resolver
// doesn't attempt to recurse in the case of a slightly broken DNS
// configuration
$sys_dns1_host = "master.cluster.ny.zeitgeist.com";
$sys_dns2_host = "master.cluster.ny.zeitgeist.com";

// Part II - Databases, html/php/other paths
$sys_dbhost="localhost";
$sys_dbname="alexandria";
$sys_dbuser="root";
$sys_dbpasswd="";
$sys_server="mysql";

// Where files are placed when uploaded
$FTPINCOMING_DIR = "/nfs/remission/u7/ftp/incoming";

// Where the released files are located
$FTPFILES_DIR = "/nfs/ftp/sourceforge";

// Where the SourceForge files are placed
// *** IMPORTANT: sys_urlroot *MUST* be an ABSOLUTE FILESYSTEM PATH NAME
//     that points to the www directory of the SourceForge
//     installation. If you use ANY form of relative path
//     you will break the html_image function in include/html.php
//
$sys_urlroot="/home/httpd/html/";

// Cache location -- this is needed by include/cache.php
// This directory must be world readable, but writable only by the web-server
$sf_cache_dir = "/tmp/sfcache";

// Name of the system as a whole (needed by various utils and titles)
$sys_name = "Building Linux Clusters";
$sys_shortname = "BLC";

// Part III - GUI modifications (menu colors, etc.)
// See the top of the file include/html.php, this is where the menu colors
// and colors used throughout SourceForge are defined.

// Theming related vars... Some of this needs to change in the session stuff
// The theme base directory, everything else is handled by theme_sysinit()
$sys_themeroot=$sys_urlroot."themes/";

// End of customizations -- place nothing after the closing PHP tag!
?>




Appendix B : The ClusterForge Crontab Entries

Root tasks:

# periodic clusterforge tasks for root
# NB: If you have more crontab files for root add this file to
# the worklist (i.e., a master crontab) for root, otherwise you
# will REPLACE the existing crontab with this one -- probably
# not what you want to do...
#
# $Id: crontab-root.sf,v 1.6 2000/12/20 05:17:05 spector Exp $
#
# crontab times are written as follows [see also crontab(5)]
#
# field          allowed values
# -----          --------------
# minute         0-59
# hour           0-23
# day of month   1-31
# month          1-12 (or names, see below)
# day of week    0-7 (0 or 7 is Sun, or use names)
#
# this first entry is really important -- it's what makes the users that
# are created inside clusterforge.
#
10 */6 * * * cd /home/clusteradmin/utils && ./new_parse.pl ; cd /var/yp && make >/dev/null
#
# Look for new or updated nodes in the clusteradmin database and write them
# out into the /etc/dhcpd.conf file. Really this should only be done when
# new nodes are added, but it costs very little to run this job since it only
# runs if there is work to do...
0 */6 * * * cd /home/clusteradmin/utils/underworld-root && ./dump-dhcp.pl
#
# These are the nightly jobs that take care of updating the counts for items
# in the code snippets (called "the trove" in the code) and check on the
# status of jobs in various project "to do" lists
0 * * * * cd /home/clusteradmin/utils/underworld-root && ./db_trove_treesums.pl
0 2 * * * cd /home/clusteradmin/utils/underworld-root && ./stats_nightly.sh >/dev/null
0 2 * * * cd /home/clusteradmin/utils/underworld-root && ./db_jobs_close.pl

# These are the daily jobs that take care of all of the various project stats
0 4 * * * cd /home/clusteradmin/utils/underworld-root && (./db_project_metric.pl ; ./db_project_cleanup.pl)
0 4 * * * cd /home/clusteradmin/utils/underworld-root && ./db_project_weekly_metric.pl >/dev/null
0 4 * * * cd /home/clusteradmin/utils/underworld-root && ./db_rating_stats.pl
0 4 * * * cd /home/clusteradmin/utils/underworld-root && ./db_top_groups_calc.pl >/dev/null
0 4 * * * cd /home/clusteradmin/utils/underworld-root && ./db_site_stats.pl




Tasks for clusteradmin:

# periodic tasks for the clusteradmin user
#
# $Id: crontab-clusteradmin.sf,v 1.6 2000/12/10 22:38:45 spector Exp $
#
# crontab times are written as follows [see also crontab(5)]
#
# field          allowed values
# -----          --------------
# minute         0-59
# hour           0-23
# day of month   1-31
# month          1-12 (or names, see below)
# day of week    0-7 (0 or 7 is Sun, or use names)
#
# This needs to run in concert with the root batch jobs that actually
# create user accounts. This job runs first, then 10 minutes later the
# new_parse.pl script runs, which actually creates the accounts.
0 */6 * * * cd /home/clusteradmin/utils/underworld-dummy && ./dump_database.pl



