Local system administration of Facilitized hosts

There are well over 1000 Unix/Linux workstations within the CMU School of Computer Science. To better manage an environment of this size, SCS Facilities has made several modifications and additions to the "standard" vendor software. This document describes the major changes we have made to OS-level software and how to make local modifications to this software on Facilities-supported Unix/Linux workstations. It also describes some Facilities policies for local system administration. People making configuration changes to their hosts or installing network-aware software should also be aware of the SCS network use policies.

Introduction

DANGER!!! DANGER!!! DANGER!!!

Some of the procedures described in this document involve becoming root on your machine and modifying system configuration files. You can seriously damage your machine's functionality if you make a mistake or are unfamiliar with the procedure.

If you aren't sure of how to perform some administrative operation, you may contact SCS Facilities by sending e-mail to help@cs.cmu.edu or by calling the Help Center (x8-4231; 9-5, M-F) and we will either do it for you or assist you in performing it yourself.

Getting Root Access

If you need root access to your machine, contact SCS Facilities. The usual way we provide root access is to add an entry of the form
   username.root@CS.CMU.EDU
in the file .klogin.local in root's home directory (note: this is not necessarily the same as /). This entry will allow you to su using your username.root Kerberos instance password, and will also allow you to log in as username.root. If you do not have a .root Kerberos instance, you can create one for yourself using the Instance Manager. If you forget your .root instance password, you will have to contact SCS Facilities to have it changed. Because the above method of root access depends on Kerberos, and thus on the network working, you may want to have a local root password. You can set a local root password by becoming root and using the command:
   passwd root
Facilities generally does not make use of local root passwords, so feel free to set this password to whatever you wish (though please pick a secure password). You can bypass Kerberos authentication on login entirely by logging in as root:local. This also works for other usernames. You can also log in as username -f to avoid running username's rc scripts upon login. Please do not change root's .klogin file, since it contains the entries needed for Facilities staff members to access the machine. Also, please do not change root's home directory, since .klogin needs to be in root's home directory in order to enable remote access.

If you have a Linux host, you can boot it into single user mode by appending the word:

   single

to the kernel line in the GRUB menu. You may have to type

   e

to edit the GRUB commands. If your machine asks for a local root password during boot (for example, because it needs fscking) and you don't know (or haven't set) a local root password, you may be able to boot directly to a shell by typing:

   linux init=/bin/sh

at the LILO prompt (on older hosts that still use LILO).

Creating Accounts

If you are at all uncertain about the procedures involved in creating accounts on your machine, please contact SCS Facilities and we will do it for you. Facilities attempts to keep usernames and user IDs (i.e., the numbers that correspond to usernames) synchronized across machines. We do not do any synchronization of group IDs. If you are creating a local account for somebody who already has an SCS identity, you can use the following command to add their account:

   /usr/adm/fac/bin/scsuseradd username

You should use the correct username and UID; you can look them up using the following methods. If a person has an AFS identity in the CS cell, you can find out their UID by typing
   pts examine <username>
A master list of all SCS UIDs (including those that don't have AFS IDs for one reason or another) can be found in the file /afs/cs/data/admin/Names (UIDs are in the second field of lines in this file). When you create an account for an existing SCS user, also be sure to use the correct full name (e.g., Harry Bovik) as it appears in the SCS white pages (you can find that out by doing a lookup username+); otherwise the mail system could become confused if mail is sent directly to that user at the local machine. The command:
   /usr/adm/fac/bin/scsuseradd username

can take some flags and arguments. Primitive help can be seen by adding the -h flag. There is no man page, but running

   perldoc /usr/adm/fac/bin/scsuseradd

prints some documentation. Note that this command cannot be used to create local accounts.
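
For example, the workflow for adding an existing SCS user might look like the following sketch (the username bovik is hypothetical; check the perldoc output for any flags you may need):

     # look up the user's SCS UID
     pts examine bovik
     grep bovik /afs/cs/data/admin/Names

     # create the local account with the matching username and UID
     /usr/adm/fac/bin/scsuseradd bovik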

The password entry for existing SCS user accounts should be a "*". This will allow Kerberos logins to the account, and will help head off some of the security problems that local passwords can cause. If you want to set a local password, the password entry must be an "x" and you must create an initial entry in the /etc/shadow file:

     username:NP:6445::::::
Note that you may also need to modify other system files besides /etc/passwd and /etc/group to create an account (in particular, /etc/shadow on Solaris and recent versions of Red Hat/Fedora). If you have questions on how to create an account, please contact SCS Facilities.
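
For illustration, a /etc/passwd entry for an existing SCS user might look something like this (the UID, group, home directory, and shell are made-up examples; use the user's real UID and full name as described above):

     bovik:*:9999:10:Harry Bovik:/usr0/bovik:/bin/tcsh

Use "*" in the password field for Kerberos-only logins, or "x" together with an /etc/shadow entry like the one shown above if a local password is needed.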

If you are creating a local account for somebody who does not have an official SCS identity (such as a spouse or friend), you can use one of the UIDs that we've set aside for this purpose: 1515, 1516, 1518, 1519, 1520. It is suggested that you choose a username for such accounts that is different from any existing SCS username, since otherwise there may be login problems caused by, for example, having the same username as somebody whose Kerberos account has expired (this problem may be avoided by logging in as username:local). Note that the mail system will deliver mail addressed to username@machine to the username in the global white pages, if it exists, not the local user. As a result, if the username you picked gets allocated to a real SCS user, mail sent to that local account will no longer go there.

If an account you create for a guest or friend (i.e., somebody who isn't already an SCS user) is abused in any way, you are responsible. SCS Facilities can provide little or no assistance to people who do not have valid SCS Kerberos identities, or in creating accounts for such people.

Security and Restricting Access

You should assume that our network is hostile, and that any traffic on it may be monitored by potential intruders. For this reason, you should use ssh when connecting to machines. You should also assume that if somebody can log in to a machine, they can become root on it by exploiting some vulnerability (we install patches for all known remote exploits, but we may not install patches for all local-only exploits, and patch availability may lag behind the most recent known exploits). For this reason, be aware that there is a risk in typing passwords at any machine on which other people have accounts. If you are an administrator for a group of machines, it is suggested that you give yourself root access on those machines by adding your .root instance to root's .klogin.local file. By su-ing to root on your local machine and then doing:
   ssh remote-host
you will be able to ssh in as root to machines you administer without typing a password on them.

There are several methods by which you can control access to services on a machine. All facilitized Unix machines run tcp wrappers (tcpd). The files /etc/hosts.allow and /etc/hosts.deny control the addresses that are allowed to connect to various services. See the man pages for tcpd(8) and hosts_access(5) for details on how to configure tcpd. Facilities does not deploy hosts with any default entries in hosts.deny or hosts.allow.
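
As a rough illustration (the host patterns and services here are hypothetical; see hosts_access(5) for the full syntax), entries in these files take the form "daemon: client-list":

     # /etc/hosts.allow -- allow ssh connections from SCS hosts
     sshd: .cs.cmu.edu

     # /etc/hosts.deny -- refuse telnet connections from everywhere else
     in.telnetd: ALL

If you add deny rules, make sure they still permit access by Facilities hosts and services (see below).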

You can also use the program iptables to set up a firewall on your machine. Note that the default behavior when setting up iptables is to deny access to all machines, so the configuration must be adjusted to allow, at a minimum, CMU SCS Facilities machines and users access to your machine. See the iptables(8) man page for details.
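
A minimal sketch of such rules follows; the address range and port are placeholders, not the actual list of networks and services that must remain reachable, so check with Facilities before deploying anything like this:

     # keep established connections working
     iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
     # allow ssh from a campus subnet (placeholder address range)
     iptables -A INPUT -p tcp -s 128.2.0.0/16 --dport 22 -j ACCEPT
     # drop everything else by default
     iptables -P INPUT DROP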

Please do not configure tcp wrappers or iptables in a way that denies access to the machine by Facilities users or services. SCS Facilities cannot provide assistance for any machine to which our access has been blocked.

If you suspect that your machine has been broken into, contact SCS Facilities at once, so that we can examine the machine and see if there are any logs (such as sniffer logs) that would help detect other compromised machines. If you are looking for signs of a break-in, be aware that it is common for intruders to replace system binaries such as ls, ps, netstat, etc., so you should not trust the output of such programs.

Daemons & Services

We have made many modifications to the standard set of services that system vendors ship by default:

  • All machines run sshd. The ssh client and server are not vendor versions but are locally modified to support various features in the facilitized SCS environment.
  • telnetd has been replaced with a Kerberized version.
  • finger is a local version that uses the same LDAP information that the mail system does. It is not possible to separate finger lookups from your global mail forwarding information.
  • named has been replaced by a local caching named.
  • All machines run an NNTP server (which uses DNNTP to contact the real news servers). The file /etc/dnntp_access contains the list of hosts (one per line) that are allowed to access the NNTP server on that machine. You should use your local workstation in preference to dnntp.srv if you need NNTP access for a particular machine. If that machine is not a .cs, .ri, or .edrc host, then it will also need to be added to an access list on one of our main news servers. Contact SCS Facilities to have that done.
  • Sendmail is the supported mail transport agent. See the section below on mail for more details.
  • We have removed many of the other (often insecure) services that are sometimes found in the inetd.conf that is shipped by the OS vendor.
  • lcladmd and terad are services that we have added (depending on system type) for remote account administration and the backup system. Please do not remove these.

If you wish to make local changes to services run under inetd, do not directly edit /etc/inetd.conf; that file is automatically maintained. Instead, modify the file /etc/inetd.conf.local (or /etc/inet/inetd.conf.local on Solaris hosts) and then run /usr/adm/fac/bin/newinetd.conf to regenerate /etc/inetd.conf.

Services that you add to /etc/inetd.conf.local will be added to /etc/inetd.conf. Services that you preface with a "!" in /etc/inetd.conf.local will be removed from /etc/inetd.conf (see the example below). If you wish to make local changes to /etc/services, you can add these changes to the file /etc/services.local and run

     /usr/adm/fac/bin/newservices

to merge the files. If you believe that a new entry in /etc/services would be of general use, then send mail to help@cs.cmu.edu and we will add it to the global services file.
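
As a hedged example (the service name, port, and daemon path are hypothetical, and the exact inetd.conf line format depends on your OS), /etc/inetd.conf.local might contain:

     # add a local service
     myservice  stream  tcp  nowait  root  /usr/local/sbin/myserviced  myserviced
     # remove the vendor-supplied finger entry
     !finger

with a matching line in /etc/services.local:

     myservice  7777/tcp

followed by running the newinetd.conf and newservices commands shown above to regenerate the real files.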

Many services that are not run by inetd run under nanny. Nanny oversees a list of servers, and restarts them if they stop running for some reason. See the nanny man page for more information. Files specifying the servers that are run by nanny are located in /usr/local/lib/nannyconfig/.

Crontabs on facilitized systems are generated by a script in a similar manner to /etc/inetd.conf and services. If you wish to make local additions to root's crontab, you can add them to the file /usr/lib/crontab.local and then run:

     /usr/adm/fac/bin/newcrontab

to have the changes take effect.
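
For example, a local addition to root's crontab might look like this (the script path and schedule are hypothetical; compare with the entries in the generated crontab for the exact field layout on your OS):

     # run a local cleanup script every night at 3:15am
     15 3 * * * /usr0/local/sbin/nightly-cleanup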

The Mail System

Sendmail is the supported mail transport agent on facilitized Linux machines. For the most part, our Sendmail implementation is standard, but you should note the following:
  • Mail spool files are stored in /var/spool/mqueue and log files are stored in /var/adm/syslog.
  • We strongly encourage you NOT to deliver mail to your local Linux workstation. Mail delivered locally may not get backed up regularly, and may not be recoverable if the machine breaks or gets reloaded.
  • By default, your workstation Sendmail daemon is configured to not accept mail from other machines. If you need (or think you need) to accept mail, please contact Facilities for instruction on how to change your default Sendmail configuration.

Updating Local Software: Depot & SUP

Software on facilitized machines is automatically updated by two methods: depot and SUP. These programs are run nightly by /usr/local/bin/dosupdepot. Occasionally, a machine will fail to depot/SUP because of some problem (such as a full disk, or AFS problems). If you suspect that your machine is not getting software updates, you can look at /usr/adm/fac/log/depot.log to see when it last successfully ran dosupdepot. At any time, you can run dosupdepot by hand to, for example, force an update after you've subscribed a machine to a different release of a software collection.

Automatic dosupdepot upgrades can be disabled entirely by creating a file called /etc/disableupgrade. (Note: don't do this; you will not get security fixes and other important patches and software upgrades if you do.) You can force a depot run even if /etc/disableupgrade exists by running dosupdepot with the -force option.

Every machine has a particular release level of software that it is subscribed to by default (individual collection release levels may be overridden by entries in /usr/local/depot/depot.pref.local). The machine-wide software release level is controlled by the file /etc/releaselevel. By default (if that file does not exist), the release level is omega. You can subscribe a machine to alpha or beta release levels of software by putting a single line reading alpha or beta in /etc/releaselevel.
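
For example, to subscribe a machine to the beta release level and pick up the new software immediately, you could run something like the following as root (logging the output as in the depot example later in this document):

     echo beta > /etc/releaselevel
     /usr/local/bin/dosupdepot >> /tmp/depot.log 2>&1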

Depot controls the contents of /usr/local/, and does not modify any files outside of that directory. If you make any local modifications to the contents of /usr/local, you must modify local depot configuration files or your modifications may be lost. Information on how to make local changes to depot configuration files can be found in the on-line depot documentation.

Note that by default depot will remove any files from /usr/local that are not referenced by some depot configuration. Local modifications to /usr/local must be noted in the /usr/local/depot/depot.pref.local file, or the next time depot runs the locally installed files will be deleted.

To install software onto a facilitized Linux machine:

  • Install your software into /usr0/local/{bin, lib, etc....}
  • In your /usr/local/depot/depot.pref.local file add the following line:
               path  mylocal  /usr0/local
     
    This will tell depot to look in /usr0/local for a collection called 'mylocal' and link all of the files found there into the /usr/local hierarchy.
  • Run, as root, the dosupdepot command. The following command will save the output in a file called /tmp/depot.log:
               dosupdepot >> /tmp/depot.log 2>&1 
    
  • If there are conflict errors in the log file and depot did not successfully complete, then you must add an 'override' construct to your depot.pref.local file. In this example, assume that the 'java' collection is the one that is in conflict with your software install:
               override  mylocal  java
    
    That line tells depot that if there are conflicts (in this case, the 'mylocal' and 'java' collections want to install the same named file in the same place), the 'mylocal' collection will win. Typically, if depot finds conflicts it will just give up with an error and make the administrator take some action. If there continue to be conflicts, you'll have to add the additional collections to the override list. For example, if there is another conflict with a collection named 'foo' as well as 'java', the line in /usr/local/depot/depot.pref.local would become:
               override mylocal java,foo
    
    This will have to be done iteratively (i.e. run 'dosupdepot'; then examine the log files for conflict errors) until all conflicts are found.

For most non-essential files, depot does not copy the file to your host; rather, it makes symlinks into AFS. See the depot documentation for instructions on how to have depot copy files onto your local disk. Note that some recently deployed hosts have been configured to have some popular non-essential collections copied instead of linked. If you find yourself short of space on /usr, you may wish to check the contents of /usr/local/depot/depot.pref.local and perhaps free up (by making them non-local) some of the space that is being taken up by these local collections.

SUP is responsible for updating various system configuration files that are maintained by Facilities. A list of the facilities-provided files that have been upgraded by SUP can be found in the file named last in a subdirectory of /usr/adm/fac/sup/. The exact path will depend on the OS you are running. For example, on Linux FC5 the full path is /usr/adm/fac/sup/fac.i386_fc5/last. If you do not want SUP to upgrade a particular facilities-provided file, you can create a file called refuse in the SUP directory (/usr/adm/fac/sup/fac.i386_fc5/ in the case of Linux FC5) and list in it the files and directories (one per line, without the leading /) that should not be upgraded.
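
For example, on a Linux FC5 host you might create /usr/adm/fac/sup/fac.i386_fc5/refuse containing lines like the following (the refused entries themselves are hypothetical):

     etc/ntp.conf
     usr/local/lib/somepackage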

Important note: if you put down a refuse file, you will not get bug/security fixes for those files, and things may break. For example, if a new version of program A needs a new version of file B to work, and you refuse upgrades for B and not for A, then A will stop working. We recommend not refusing upgrades unless there is an overwhelming reason to do so. Facilities staff will not debug problems caused by refusing some portion of (or all) Facilities software updates.

AFS Issues and Problems

All fully-facilitized machines run the Andrew File System. AFS provides a wide-scale, shared file system with reasonable security features (unlike NFS). Most of the software that is run out of /usr/local is actually symlinked to AFS. The basic AFS configuration files are usually located in /usr/vice/etc/. The only file there that you may want to modify is /usr/vice/etc/cacheinfo. In particular, this file controls the default AFS cache size. You may wish to increase the size of your AFS cache if you will frequently access large files out of AFS. You should never make your AFS cache size so large that there is a risk that the partition it is on will run out of space. If that happens, unpredictable behavior may result.
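
The cacheinfo file is a single line of three colon-separated fields: the AFS mount point, the cache directory, and the cache size in 1KB blocks. A hypothetical example (the size and cache path shown are illustrative, not the Facilities defaults):

     /afs:/usr/vice/cache:100000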

The cacheinfo file also specifies the default location of the AFS cache, but on many machines the location it specifies is often a symlink to the real cache directory. Sometimes you may wish to move your vice cache to another partition. You should be sure to ascertain exactly where your vice cache is really located before changing its location (follow the symlinks). Do not remove currently-used cache files, as this can panic your machine. Instead, create the new cache directory, change the default cache location (either by changing the entry in cacheinfo or by having the location specified in cacheinfo be a symlink to the new location of the cache), and reboot your machine. Then remove the old cache directory.
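
A sketch of that procedure, assuming hypothetical partition and directory names:

     # create the new cache directory on the larger partition
     mkdir /usr1/vicecache
     chmod 700 /usr1/vicecache
     # edit /usr/vice/etc/cacheinfo (or re-point the existing symlink) so it
     # names /usr1/vicecache, then reboot; only afterwards remove the old
     # cache directory, wherever the symlinks say it really is, e.g.:
     rm -rf /usr/vice/cache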

Occasionally, the AFS cache on a machine may become corrupted. Symptoms of this problem include an inability to access certain files or directories in AFS, or being unable to run a binary located on AFS without an immediate core dump. If these symptoms occur, you should verify that it is a local problem (as opposed to an AFS server problem) by seeing if it occurs on other machines of that type, or comparing checksums for the binaries between the machine having the problem and other machines. You can use the command

     /usr/local/bin/fs checkservers

to check on the status of AFS servers that your local machine's cache manager has recently contacted. The command

     fs checkvolumes

will check the status of alternate locations of volumes for replicated collections, and may fix some problems. If the problem is local to a particular machine, then it is very possibly caused by AFS cache corruption. There are a few things to try to fix the problem. If it is a single file or volume, you can run

     fs flush path-to-file

or

     fs flushv path-to-volume

Alternatively, you can try interactively reducing the cache size, and then increasing it back to the default. To do so, run

     fs setcachesize 20  (or some other small number)
wait until that command completes, and then run
     fs setcachesize -reset
to reset it to its original size. Sometimes, in cases of severe corruption, the above procedures may not fix the problem. In order to completely clear the cache, remove the CacheItems file in the cache directory and reboot (instead of rebooting, you could try manually stopping and starting AFS using the appropriate rc script and arguments, but that does not always work).
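
For example, assuming the cache lives in /usr/vice/cache (which may not be the case on your host; check cacheinfo and follow any symlinks):

     rm /usr/vice/cache/CacheItems
     reboot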

With very few exceptions, all SCS hosts should be members of the system:friendlyhost AFS group. If you have trouble accessing files from your machine that should be accessible by system:friendlyhost, contact Facilities and we will add that machine to the group. One consequence of being a friendlyhost is that, if you are running a webserver on your machine, you should not allow the server to access files in /afs/cs, as that would circumvent the purpose of the system:friendlyhost access controls.
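
To check whether a directory's ACL grants rights to the group, you can run (the path here is purely illustrative):

     fs listacl /afs/cs/project/example

which prints the directory's access list; a directory readable by friendly hosts will show an entry such as:

     system:friendlyhost rl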

Non-Facilitized Machines

The information above applies to machines running the SCS Facilities environment. SCS Facilities can provide little, if any, support for machines that do not run our environment. If you or your project have such a machine, then you or your project is responsible for taking care of it. In particular, you are responsible for providing security patches and upgrades, and for ensuring that it does not become a problem (e.g., running a password sniffer or being used for denial of service attacks) for the rest of the facility.