[Return to Library] [Contents] [Previous Chapter] [Next Section] [Next Chapter] [Index] [Help]


8    RIS Troubleshooting

This chapter contains information to help you troubleshoot problems with your RIS system.




8.1    Problems with the ris Utility Lock Files

To prevent multiple users from performing operations on RIS areas simultaneously, the ris utility creates two lock files in the /tmp directory, rislock and ris.tty.lock, while a user is installing software into or deleting software from a RIS area. If the ris utility is run by another user, or by the same user on a different terminal, selecting the add or delete software option generates a message similar to the following:

The ris utility is currently locked while j_smith on /dev/ttyp3
is installing software.  Try again later.

If the ris utility is stopped prematurely, these lock files may not be removed. If the lock files are not removed, the message displays even though no other user is using RIS.

If this occurs, you must delete the lock files from the /tmp directory.

Caution

Before deleting the lock files, ensure that no other user is using the ris utility.
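
The check and cleanup described above can be sketched as a small shell function. This is an illustration only: the function name is hypothetical, the lock file names are the ones given in this section, and it assumes a ps command that accepts the POSIX -e and -o comm= options. It cannot detect a ris session on another system, so the caution above still applies.

```shell
# Sketch: remove stale ris lock files, but only if no ris process is found.
# The optional argument is the lock directory; it defaults to /tmp.
remove_stale_ris_locks() {
    dir="${1:-/tmp}"

    # Look for a process whose command name is "ris" (with or without a path).
    if ps -e -o comm= | grep -Eq '(^|/)ris$'; then
        echo "ris appears to be running; lock files left in place" >&2
        return 1
    fi

    # No ris process found: remove both lock files named in this section.
    rm -f "$dir/rislock" "$dir/ris.tty.lock"
}
```

Calling remove_stale_ris_locks with no argument operates on /tmp; the function returns a nonzero status and leaves the files alone when a ris process is detected.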




8.2    Problems with Client Registration

The server requires a client's hardware address in order to boot the client over the network. The ris utility prompts you for the client's address during the registration process. If it does not, check the following:

Note

The client can use the setld utility to load optional subsets or layered product subsets over the network. See the System Administration guide for more information about loading subsets with the setld utility.




8.3    Problems with Cloned Client Registration

A configuration description file (CDF) is created as a result of a RIS installation. To use the CDF for installation cloning, the hardware configuration (to some degree) and the software subsets to load must be the same. Before allowing a CDF to be specified for installation cloning of a client, RIS attempts to verify that the subsets specified in the CDF exist in the RIS area that the user has selected. If they do not match, the CDF is rejected. This error can occur if the version numbers of the subsets do not match (for example, OSFBASE350 and OSFBASE400).

A CDF might be used for installation cloning of a system that is registered to a different RIS area, and the subsets contained in the two RIS areas can differ. The version of Digital UNIX served by the RIS area can also differ from the version specified in the CDF. In that case, many subsets are reported missing because none of the subsets specified in the CDF is present in the RIS area.

If you specify a CDF that names a software subset that is not present in the selected RIS area, the following is displayed:

Enter a CDF name or press <Return> to exit CDF selection: rz26.cdf

 
The selected CDF, rz26.cdf, specifies software subsets that are not present in the selected RIS environment. The missing software subsets are: OSFSERPC400
 
Please select a different CDF.
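
The comparison that RIS performs can be approximated by hand when you want to see in advance which subsets would be reported missing. The following sketch assumes two plain-text lists with one subset name per line; the function name and the list files are hypothetical, and how to extract subset names from a CDF or a RIS area is not covered here.

```shell
# Sketch: print the subset names that appear in the first list (from the CDF)
# but not in the second list (from the RIS area).
# Exit status 0 means at least one subset is missing; 1 means none are.
missing_subsets() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from a file
    grep -Fxv -f "$2" "$1"
}
```

Because the match is on the whole name, OSFBASE350 and OSFBASE400 are treated as different subsets, which mirrors the version mismatch described above.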

If you attempt to use a CDF that was not created as part of a RIS installation, it is not compatible with installation cloning. The following is displayed:


 
Enter a CDF name or press <Return> to exit CDF selection: rz26.cdf
 
The selected CDF, rz26.cdf, was not created during a RIS installation. Therefore, it cannot be used for Installation Cloning. Please select a different CDF.




8.4    Problems with Client Not in RIS Database

If, during a RIS installation, a message on the client's console states that the client is not in the RIS database, check the following on the server:




8.5    Problems with RIS Server Response

Booting failures often occur because the information possessed by the server is invalid. The following two server files are involved in handling RIS clients. You should check them in the order listed:




8.5.1    Diagnosing Response Failures on Servers Using bootp Daemon

Digital UNIX servers respond to bootp requests from Digital UNIX clients. If the Digital UNIX server's information is correct for the client but the server still fails to respond, enable logging of bootp messages on the server by editing the server's /etc/inetd.conf file and by modifying the line for bootps to include the -d option as a bootpd command argument. For example:

bootps  dgram   udp   wait   root   /usr/sbin/bootpd   bootpd -d

Then, find the process IDs for the Internet daemons. Send a HUP signal to the inetd daemon so it will reread the /etc/inetd.conf configuration file, and kill the bootpd daemon. For example:

ps x | egrep "inetd|bootpd"

  228 ??  I      0:00.93 /usr/sbin/inetd
  243 ??  I      0:00.91 /usr/sbin/bootpd
 9134 p2  S      0:00.23 egrep inetd|bootpd
kill -HUP 228
kill -KILL 243

Caution

You must send the HUP signal to the inetd daemon before killing the bootpd daemon.

It is not necessary to restart the bootpd daemon manually; the inetd daemon starts it automatically.
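
Reading the process IDs from the ps output can be scripted. The helper below is a sketch with a hypothetical name; it assumes the ps x output layout shown above, with the process ID in the first column and the daemon listed by its full path.

```shell
# Sketch: read "ps x" output on standard input and print the PID of the first
# process whose command is a path ending in /NAME (for example /usr/sbin/inetd).
pid_of() {
    awk -v name="$1" '{
        # Scan the fields from the TIME column onward for the command path.
        for (i = 4; i <= NF; i++)
            if ($i ~ ("/" name "$")) { print $1; exit }
    }'
}
```

With this helper, the sequence above becomes kill -HUP "$(ps x | pid_of inetd)" followed by kill -KILL "$(ps x | pid_of bootpd)", in that order. The egrep line in the sample output is not matched because its arguments are not paths.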

To track boot requests as they occur, run the tail -f command on the /var/adm/syslog.dated/today's-date/daemon.log file and boot the client. Many daemons other than the bootpd daemon log information to the daemon.log file; however, the bootpd entries for the client show a hardware address that matches the client's address in the /etc/bootptab file.

If the client's boot requests are not logged, you can enable additional logging by editing the /etc/inetd.conf file and adding a second -d option to the bootpd command. Each additional instance of the -d option (up to three) increases reporting; the second instance enables the server to report all boot requests, even those from client systems it does not recognize. This level of reporting should help you determine where in the system the request is being lost.
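
As an illustration, the edit can be previewed with a filter that appends one more -d option to the bootps line. This is a sketch, not a required procedure: the function name is hypothetical, it assumes a POSIX sed, and it writes the result to standard output so that you can review it before changing /etc/inetd.conf.

```shell
# Sketch: read inetd.conf text on standard input and append " -d" to the
# line that begins with "bootps"; every other line passes through unchanged.
add_bootpd_debug_flag() {
    sed '/^bootps[[:space:]]/ s/$/ -d/'
}
```

For example, add_bootpd_debug_flag < /etc/inetd.conf prints the file with the bootpd command ending in -d -d when the line already carried one -d option.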

If you modify the /etc/inetd.conf file, restart the inetd daemon by sending it a HUP signal. Example 8-1 shows a section of a daemon.log file. It shows the data logged by various system daemons, including the bootpd daemon when run with two -d flags set.

Example 8-1: Sample daemon.log File

Jul 28 14:56:36 ludwig mountd[191]: startup
Jul 28 14:56:38 ludwig xntpd[235]: xntpd version 1.3                   [1]
Jul 28 14:56:43 ludwig mold[269]: mold (V1.10) initialization complete
Jul 28 14:56:44 ludwig evd[272]: E003-evd (V1.10) initialization complete
Jul 28 14:56:45 ludwig internet_mom[275]: internet_mom - Initialization
                complete...
Jul 28 14:56:45 ludwig snmp_pe[278]: M004 - snmp_pe (V1.10) initialization
                complete
Jul 28 16:34:55 ludwig inetd[282]: /usr/sbin/bootpd: exit status 0x9   [2]
Jul 28 16:35:47 ludwig bootpd[1228]: bootpd 2.1a #0: \                 [3]
                Fri Feb 05 00:32:28 EST 1993
Jul 28 16:35:47 ludwig bootpd[1228]: reading "/etc/bootptab"
Jul 28 16:35:47 ludwig bootpd[1228]: read 3 entries from "/etc/bootptab"
Jul 28 16:35:47 ludwig bootpd[1228]: request from hardware address \   [4]
                08002B2C9C6F
Jul 28 16:36:08 ludwig bootpd[1228]: request from hardware address \   [5]
                08002B309668
Jul 28 16:36:08 ludwig bootpd[1228]: found: host1.dec.com (08002B309668)
                at (16.69.224.83)
Jul 28 16:36:08 ludwig bootpd[1228]: file /var/adm/ris/ris0.alpha/\
                vmunix.host1.dec.com
Jul 28 16:36:08 ludwig bootpd[1228]: vendor magic field is 0.0.0.0
Jul 28 16:36:08 ludwig bootpd[1228]: sending RFC1048-style reply

  1. Many daemons log information to this file.

  2. Result of sending a HUP signal to the inetd daemon and killing the bootpd daemon.

  3. A new bootpd daemon starts up in response to a boot request. The bootpd daemon reads the /etc/bootptab file as a part of its startup.

  4. A bootpd request by a system with hardware address 08002B2C9C6F. Because the system is not a client of this RIS server, its hardware address is not in the server's /etc/bootptab file.

  5. A bootpd request by a system with hardware address 08002B309668. The system is a client of this RIS server.
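
When the log is long, the requests that the server did not recognize can be picked out mechanically. The following sketch uses a hypothetical function name and assumes that each log entry is on a single line (the backslash continuations in Example 8-1 are only for display) and that the entries use the "request from hardware address" and "found:" wording shown there.

```shell
# Sketch: list hardware addresses that sent a boot request to bootpd but
# never produced a "found:" entry. Arguments are daemon.log files.
unanswered_requests() {
    awk '
        /bootpd\[[0-9]+\]: request from hardware address/ { seen[$NF] = 1 }
        /bootpd\[[0-9]+\]: found:/ {
            # The address is the first parenthesised hex string on the line.
            if (match($0, /\([0-9A-Fa-f]+\)/))
                found[substr($0, RSTART + 1, RLENGTH - 2)] = 1
        }
        END { for (a in seen) if (!(a in found)) print a }
    ' "$@"
}
```

Run against the entries in Example 8-1, this reports 08002B2C9C6F, the system that is not a client of the RIS server.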




8.5.2    Diagnosing Response Failures on Servers Using the joind Daemon

To serve bootp requests from clients, a server running Digital UNIX Version 4.0 should be running the joind daemon, which also services Dynamic Host Configuration Protocol (DHCP) requests. DHCP enables the automatic assignment of IP addresses to clients on networks from a pool of addresses. The IP address assignment and configuration occur automatically whenever appropriate client systems (workstations and portable computers) attach to a network. The Digital UNIX implementation of DHCP is based on the JOIN product by Competitive Automation.

Ensure that the server's information about the client is correct, namely the information contained in the server's bootptab file, as shown in Section 7.1.3. If the server still fails to respond, enable logging of bootp messages on the server by using the following procedure:

  1. Check that the joind daemon is servicing your bootp request. This can be done by issuing the following command:

    ps -x | grep -E "joind"

    393 ??       I        0:05.82 /usr/sbin/joind
    26446 ttyp0     S +     0:00.01 grep -E joind
    

  2. Determine the current setting of JOIND_FLAGS by issuing the following:

    rcmgr get JOIND_FLAGS

  3. Stop the joind daemon by issuing the following command:

    /sbin/init.d/dhcp stop

  4. Restart the daemon with debugging turned on. First, set JOIND_FLAGS to include the debugging option:

    rcmgr set JOIND_FLAGS y -dx

       Where x is the level of debugging. A value from 0 to 9 is valid.
       Where y is the previously determined setting of the JOIND_FLAGS.

     Then start the daemon:

    /sbin/init.d/dhcp start -dx

    Example 8-1 shows a section of a daemon.log file. It shows the data logged by various system daemons, including the joind daemon.

  5. To turn off debugging, do the following:

    /sbin/init.d/dhcp stop
    rcmgr set JOIND_FLAGS y

        Where y is the previously determined setting of the JOIND_FLAGS.

    /sbin/init.d/dhcp start
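
Steps 2 through 4 can be collected into one shell function, sketched below. The function name and the DHCP_INIT variable are hypothetical (DHCP_INIT exists only so that the sketch can be tried without touching the real startup script); rcmgr, JOIND_FLAGS, and /sbin/init.d/dhcp are the names used in the procedure. The function prints the previous setting so that you can restore it in step 5.

```shell
# Sketch: save JOIND_FLAGS, stop joind, add -dLEVEL to the flags, restart.
enable_joind_debug() {
    level="$1"                                   # debugging level, 0 to 9
    init="${DHCP_INIT:-/sbin/init.d/dhcp}"       # DHCP_INIT is a hypothetical override

    case "$level" in
        [0-9]) ;;
        *) echo "usage: enable_joind_debug LEVEL (0-9)" >&2; return 2 ;;
    esac

    saved=$(rcmgr get JOIND_FLAGS)               # previous setting (y in the text)
    "$init" stop
    # Left unquoted on purpose, to mirror "rcmgr set JOIND_FLAGS y -dx" above.
    rcmgr set JOIND_FLAGS $saved -d"$level"
    "$init" start
    echo "$saved"                                # keep this value for step 5
}
```

For example, enable_joind_debug 3 restarts the daemon with -d3 added to its flags and prints the original flags.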




8.5.3    Restrictions on Running bootpd and joind

A RIS server should run either the bootpd daemon or the joind daemon, but not both. Running both daemons on a RIS server is not supported, and the results are unpredictable.




8.5.4    Problems with Booting the RIS Client

If the system does not boot, or boots but cannot mount the root file system, check that the RIS client is not registered for bootp service on multiple RIS or Dataless servers. For the bootp protocol to work properly, the client must be registered for bootp service on only one server. A client is registered for bootp service when it is registered for a Digital UNIX operating system base product or when it is registered as a Dataless client.

A RIS client can be registered on two RIS servers at the same time, provided that it is not registered for the Digital UNIX operating system base product on both servers and does not attempt to boot using bootp from both.




8.6    Problems with System Panics on Boot Due to the Inability to Mount the Root File System

Starting with Digital UNIX Version 4.0, the installation media is mounted as the root file system for installation. This applies to both CD-ROM and RIS installations. As a result, the installation media must be mounted locally on the server. Due to limitations imposed by NFS, RIS cannot provide client access to files that it has remotely mounted from another system. The distribution media or extracted RIS area must be available through a local mount point on the RIS server.




8.7    Problems with Loading the Correct Kernel File

If the Digital UNIX server responds but an incorrect kernel (vmunix) is loaded, it is possible that the server's RIS area is configured incorrectly. You can observe the loading process by editing the /etc/inetd.conf file and restarting the Internet daemon as described in Section 8.5.1. In this case, you add the -d option to the line containing the tftpd command, as follows:

tftp    dgram   udp   wait   root   /usr/sbin/tftpd    \
                                tftpd -d /tmp /var/adm/ris

Logging the server's tftp traffic shows you what file is being transferred and what time the transfer is started and finished. Ensure that the proper vmunix file is being loaded and that the loading operations are completed correctly.




8.8    Problems with Getname Failing on Client

If the RIS server is using C2 security and the RIS password has not been set so that it never expires, RIS clients can be denied service. In that case, the RIS client receives a message similar to the following:

Cannot find the name for client using bin/getname. Check with the system
manager of you RIS server

If this message appears, the RIS password on the server has probably expired. To fix this problem, refer to Section 3.2.