
- 29 -

Administering Your Network

by Phillip T. Rawles

Network administration is a multifaceted area. Unlike other network functions, network administration focuses primarily on what happens after the network is installed. While it is true that administration tasks and issues must be considered in the design and implementation phases of a network installation, administration is primarily an operations issue.

This chapter focuses on the basic network administration tasks and functions common to all networked systems. Although the functional implementation of administrative tools varies among network operating systems (NOSs), each NOS provides a means of accomplishing these basic administrative functions. In this chapter, tasks are arranged in the order that an administrator installing a new network operating system would typically need to address them.

Network Administration Functions

Network administration tasks include installing and configuring network workstations, creating and maintaining user accounts, backing up the system, distributing software across the networked workstations, and providing end user support. Each of these tasks serves a specific purpose in the administration of a networked system. However, one task can have an impact on several other tasks. In order to gain insight into how these tasks are interrelated, they are commonly grouped into functional areas. The International Organization for Standardization (ISO) network management model divides network administration and management tasks into five basic functional areas, often abbreviated FCAPS: Fault Management, Configuration Management, Accounting Management, Performance Management, and Security Management.

Fault management is the process of locating and correcting network problems. A fault is defined as any anomaly that adversely affects network operations. Common faults include broken network wiring, disk errors, and memory failures. The fault management process begins when a suspected fault is reported. An analysis of the reported symptoms is performed to identify the problem. After the problem is identified and defined, the root cause of the fault situation must be isolated to ensure that proper corrective action can be taken.

Fault management is an exercise in troubleshooting. Some problems are easily identifiable, but others may be caused by the interaction of several factors. Even for those problems that have a single cause, the troubleshooting process can be difficult if the problem is intermittent. Once the root cause of the problem is identified, the administrator can take corrective action.

Fault management is the most important network administration task. Without proper fault management procedures, network administrators can find themselves spending all of their time trying to keep the network functional. By implementing good fault management procedures, network administrators can reduce the amount of time spent "firefighting," freeing them to focus their efforts on improving network operation and performance.

Configuration management is the process of gathering information from the network and using that data to manage and optimize network devices. Common configuration management tasks include the assignment of network addresses to network devices and maintaining an up-to-date inventory of equipment installed on the network. Configuration management is a prerequisite function to performance management.

Accounting management is the process of managing the use of network resources. Basic accounting management tasks include the creation and maintenance of user accounts and groups and the assignment of access rights to users and groups. Ensuring that adequate network resources are available to network users and documenting the usage of such resources is also an accounting management function.

The activities associated with maintaining and improving network speed, responsiveness, flexibility, and adaptability are collectively known as performance management. Typical performance management tasks include system tuning, overall capacity planning, and performance troubleshooting.

Networked systems allow for information to be distributed rapidly throughout an organization. Although networking allows productivity gains and quicker information access, it creates the possibility of sensitive information being destroyed or accessed by unauthorized personnel. The process of protecting against such events is known as security management.

Network Address Management

One of the first tasks that you must perform when you install a new network is the assignment of network addresses to the stations on the network. Although this is technically a network installation task, maintaining a list of addresses and making changes in station configurations over time is a network administration function. Before considering the various methods available for network address assignment, let's consider the role of network addresses.

Each station on a network requires an address that uniquely identifies it to the other stations on the network so that messages can be addressed to it. Fortunately, such an address is automatically etched on each network interface card (NIC) by the manufacturer. This address, called the media access control (MAC) address or physical address, consists of 12 hexadecimal digits (48 bits). The first six digits identify the manufacturer of the NIC, and the last six digits are assigned in succession, much like a serial number. A network may contain network interface cards from various vendors that were manufactured at various times, thus yielding a near random list of MAC addresses. Although each address is unique, you have no logical means of determining which station is at which address or how to get to that address.
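
The structure of a MAC address can be sketched in a few lines of Python; the helper function is my own illustration, and the sample address is hypothetical:

# Split a MAC address into its manufacturer and serial portions.
# The sample address below is hypothetical.
def split_mac(mac):
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12:
        raise ValueError("a MAC address contains exactly 12 hex digits")
    return digits[:6], digits[6:]

oui, serial = split_mac("00:A0:C9:14:C8:29")
print(oui)     # 00a0c9 - identifies the NIC manufacturer
print(serial)  # 14c829 - assigned in succession, like a serial number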

To illustrate the problem, consider a small town of 50 houses. Each house has a unique house number consisting of random digits assigned to it, such as 579432467. The letter carrier for this town has several letters to deliver. However, all they have to identify the destination is the house number. How can they locate the correct house without driving down every street until they happen upon it?

In order to make the houses easier to identify, it would be helpful to assign each house a logical address that provides more information to the letter carrier. Logical location numbers arranged in ascending numerical order combined with street names (such as 121 Wood Street) would make it significantly easier for the letter carrier to find the correct house. After finding the house by this logical address, the letter carrier could verify the physical house number with the address on the message to ensure that the letter is being delivered to the correct location.

Just like the letter carrier in this example, networks use logical addresses to identify stations and facilitate the delivery of messages between networks. These logical addresses are commonly referred to as network addresses. Just as a street address consists of a street name and a house number, a network address consists of a network number and a station number. The network number uniquely identifies the network segment the station is on, and the station number uniquely identifies the station on that segment.

Each network protocol uses its own logical network addressing scheme. In the case of IPX, the network number and station number are handled separately. For TCP/IP, the network number and station number are run together into a single IP address. The IP address is parsed by looking at a second parameter called the subnet mask.

IPX

Network address assignment for networks based on the Internet Packet Exchange (IPX) protocol typically requires little administrator intervention. Each network segment is assigned a network number at the server. Most network operating systems suggest a randomly chosen network number, although the network administrator always has the option of assigning a specific number to a network. The station number is assigned automatically at boot time; the network operating system simply uses the MAC address of the station's network interface card.

IP

Network address management for networks based on the Internet Protocol (IP) is significantly more involved. Rather than having separate network numbers and station numbers, the IP protocol combines both into a single composite IP address. The network address and station address portions of the IP address are determined by consulting a key, the subnet mask, that indicates which part of the composite address represents the network address and which part represents the station address. Both an IP address and a subnet mask are therefore required for two stations to communicate through the IP protocol.

If the two stations are located on separate network segments, a third parameter is required to provide a means of internetwork communication. This third parameter is the default gateway, sometimes referred to as the default router or gateway router. The default router represents the door from one network to other networks. The default gateway is similar to the out-of-town slot at the post office. If a station wants to send a message to a station on another network, it sends it to the default router.
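
How these three parameters work together can be sketched with Python's standard ipaddress module; the addresses used are hypothetical examples:

import ipaddress

# The three parameters a station needs: its address, the subnet mask
# that splits the address into network and station portions, and the
# default gateway used for destinations on other networks.
station = ipaddress.ip_interface("192.168.10.25/255.255.255.0")
default_gateway = ipaddress.ip_address("192.168.10.1")

print(station.network)   # 192.168.10.0/24 - the network portion
print(station.ip)        # 192.168.10.25  - the full station address

def next_hop(destination):
    # Deliver directly if the destination is on the local segment;
    # otherwise hand the packet to the default gateway.
    dest = ipaddress.ip_address(destination)
    return "direct" if dest in station.network else f"via {default_gateway}"

print(next_hop("192.168.10.77"))   # direct
print(next_hop("10.0.0.5"))        # via 192.168.10.1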

Unlike IPX, the IP protocol does not automatically assign any portion of the composite IP address. All configuration parameters must be assigned by the network administrator. You can assign configuration parameters manually or through one of two automated methods: the bootstrap protocol and the dynamic host configuration protocol.

The bootstrap protocol (BootP) was the first method developed to automate the IP configuration process. The basic procedure used by BootP is as follows: At boot time, the station broadcasts a message to the network asking if anyone has IP configuration information for it. A BootP server located on the same network segment examines the MAC address of the requesting station and consults a table of known MAC addresses. If the MAC address is in the server's table, the server returns the IP configuration parameters associated with that MAC address to the station. If no server answers the station's request, the BootP process fails and the station is unusable until an alternative address assignment method succeeds.

The bootstrap protocol is a static addressing protocol. Each station on the network always gets the same IP configuration parameters, barring action by the network administrator. The static nature of BootP requires that an IP address be reserved for the exclusive use of every network device that might connect to the network. There is also no automatic check performed by the BootP server to ensure that the administrator has assigned functionally correct IP configuration parameters.


NOTE: The static nature of BootP can be limiting in modern computing environments. Laptop computers that are only occasionally connected to the network require a network address to be reserved for them at all times. Because there are a limited number of network addresses available on a network segment, this can artificially limit the number of stations that can be connected.

Another shortcoming of BootP is the administrative workload required. Whenever a new network device that requires configuration through BootP is placed into service or a device is moved to a new network, the administrator must edit the address records on the BootP server containing the MAC address of the affected device. Because the BootP protocol uses a station's MAC address for identification, replacing a network interface card in a station requires the BootP server to be updated to reflect the new MAC address associated with the station.

In order to solve these problems with BootP, a new automatic IP address assignment strategy was developed. The dynamic host configuration protocol (DHCP) provides a means of dynamically assigning IP configuration information. The operating premise of DHCP is that a station will once again send a broadcast message to the network asking for configuration information. Upon receipt of the request, the DHCP server checks to see if it has a static address defined for the requesting MAC address. If it does, it returns the predefined IP configuration parameters for the requesting MAC address. If there is no static address defined, the DHCP server will select an address from a list of available addresses and return it to the requesting station.
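
The selection logic just described can be sketched as follows. This is a simplified illustration only; the real protocol involves a discover/offer/request/acknowledge exchange between station and server, and all table contents here are hypothetical:

# Simplified sketch of DHCP-style address selection.
static_table = {"00a0c914c829": "192.168.10.50"}     # MAC -> reserved address
address_pool = ["192.168.10.100", "192.168.10.101"]  # available dynamic leases
LEASE_SECONDS = 72 * 3600                            # common 72-hour lease

def assign_address(mac):
    # Static reservations take precedence, exactly as with BootP.
    if mac in static_table:
        return static_table[mac], None
    # Otherwise lease the next free address from the dynamic pool.
    if not address_pool:
        raise RuntimeError("address pool exhausted")
    return address_pool.pop(0), LEASE_SECONDS

print(assign_address("00a0c914c829"))  # ('192.168.10.50', None)
print(assign_address("00a0c9aabbcc"))  # ('192.168.10.100', 259200)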

The dynamic address returned by the DHCP server is called a lease address because it has a time limit associated with it. The requesting station can only use the leased address for a certain period of time (commonly 72 hours). At the end of the lease, the station must contact the DHCP server and request an extension to the lease. By using dynamic address assignments, the number of IP addresses required is limited to the number of available physical network connections rather than being artificially limited to the maximum number of devices that could require connection to the network. The amount of administrator overhead required to manage the server is also reduced because new stations no longer require modification to the server.


NOTE: Even if a DHCP server is used for IP address assignment, there are certain types of stations that should be mapped to static addresses. In general, all servers should be assigned static addresses. Network operating system file and print servers, database servers, and any Internet servers such as FTP servers or HTTP (World Wide Web) servers all fit into this category. Dynamic addresses should be reserved for use with client stations. However, because most stations on a typical network are clients, there is a considerable benefit to the use of dynamic addressing.

IP Name Services

The IP address is the address used by the network to route messages from one station to another, but most stations also have alphanumeric names that users employ to identify them. These names (commonly referred to as host names or domain names) are easier for humans to remember and are resolved to their IP addresses by the local workstation before messages are sent across the network.

Just as in IP address assignment, there are static and dynamic methods for resolving host names to IP addresses. The standard method for host name resolution is the Domain Name Service (DNS). The Domain Name Service is a static mapping of host names to IP addresses. When a user enters a destination host name, the station asks the DNS server for the IP address associated with the host name. Upon receipt of the destination IP address, the station sends the message to the destination station. Due to the static nature of DNS, it can only be used when network stations have static IP addresses obtained through manual configuration, BootP, or DHCP in static mode.
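
From an application's point of view, name resolution happens through the workstation's resolver rather than by contacting the DNS server directly. A minimal Python example (the host name shown is hypothetical):

import socket

# The workstation's resolver turns a host name into the IP address
# the network actually uses to route the message.
hostname = "fileserver.example.com"   # hypothetical host name
try:
    address = socket.gethostbyname(hostname)
    print(f"{hostname} resolves to {address}")
except socket.gaierror:
    print(f"no address record found for {hostname}")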

At this time, there is no open standard for handling host name resolution for dynamically assigned IP addresses. In order to facilitate the use of dynamic IP addresses in Windows networks, Microsoft developed a proprietary protocol called the Windows Internet Name Service (WINS). In a WINS environment, each station contacts the server at boot time and gives the server its IP address and host name. The server then consults this dynamic table when asked for the IP address for a specific host name. Due to its dynamic nature, WINS can be used with any IP address assignment method. Stations with statically assigned IP addresses will always provide the same information to the WINS server.

The Internet Engineering Task Force (IETF) is currently working on an open standard for dynamic IP address host name resolution. The project, tentatively called Dynamic DNS, aims to add support for dynamically assigned IP addresses to the existing DNS standard. Microsoft has pledged to support Dynamic DNS in place of WINS upon completion of the new standard.

User Accounts and Authentication

The first computers were only capable of running one operation at a time. In order to use the system, you had to be sitting at the computer's console. From the system's perspective, everyone at the console was the same user.

This single-user approach contradicts the basic fact that different people have different computing needs. They need access to different files and applications and varying rights to the system's configuration. Without the ability to distinguish between users, the system cannot provide for these varying needs, which raises basic concerns about both data security and ease of use.

If the payroll system runs on the same physical computer system as the maintenance system, there needs to be a way to ensure that maintenance personnel do not have access to the payroll system and that payroll personnel do not have access to the maintenance system. It is equally important to ensure that both the payroll and maintenance personnel do not have access to the system's configuration information unless they happen to be the system administrator.

In order to meet the varying needs of the people using the system, the concept of user accounts was developed. Accounts refer to the capability of a computing system to differentiate between various people using it and to customize the information and the way information is presented to the user. Although the system may still only support one user at a time, it can grant access to files, applications, and configuration information based on profiles associated with each user account. This section details the issues associated with implementing and maintaining user accounts.

User Identification

In order for a system to distinguish among users, there must be a unique means of identification for each user. This identification method serves as the computer system's means of knowing who is accessing the system and what system resources they are able to access. There are many possible means of identifying users, but the most common technique is the use of a string of characters known as the User Identification Code (UID).

User Identification Code requirements vary from system to system, but they usually consist of a single string of characters with a maximum length. The longer the UID becomes, the harder it is to remember and the easier it is to mistype. For this reason, it is good administrative practice to set a maximum UID length even if there is no limitation from the network operating system. A common maximum length for user identification codes is eight characters.

There are several approaches to assigning user identification codes. For large systems or accounts that are used only for a finite period of time, such as class accounts in an educational environment, administrative overhead can be greatly reduced by assigning arbitrary codes such as "abc123." Although codes of this type are easy to create and manage, they are not descriptive.

For long-term user accounts, a code based at least in part on the user's name has the advantage of being easy for both the user and system administrator to remember. This is especially important in systems where UIDs are used for electronic mail because users must know each other's code to send mail. For a small departmental system, the user's first name can be used. However, problems can arise when there is more than one person requiring access to the system with the same first name. What if there are two Freds in the department? For this reason, first-name-based UIDs are too ambiguous for most systems. A more formal method of user identification must be developed.

There are two commonly used user identification strategies based on combinations of the user's first and last names. The first strategy is to take the user's last name and append their first or first and middle initials to it, yielding a user identification code that is easily associated with its owner. This approach has the additional benefit of providing easy alphabetical ordering by last name. However, problems can arise if last names are longer than the maximum number of characters allowed in the user identification code. For instance, John Allen Smith might easily resolve to smithja, but Herbert James Gaither-Weiss creates problems. Should Herb's UID be gaitherh or gaithewh? Another problem with this method is that it is difficult to determine where the last name stops and the first initial begins. For instance, the UID fells, which belongs to Sheri Fell, might be thought to belong to Sheri Fells.

Another strategy is to use the first and middle initials of the user's name combined with however many characters of the last name will fit within the user identification code length guidelines. Assuming an eight-character UID limit, John Smith would become jasmith and Herb Gaither-Weiss would resolve to hjgaithe. This approach is easier to remember, although the resulting user identification codes are no longer alphabetized by last name.

Regardless of the algorithm you use to develop user identification codes, it is imperative to use it consistently. This is especially important if the resulting user identification codes are going to be used for electronic mail. By providing a consistent strategy, both users and administrators will be able to easily determine UIDs of other users.
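
As an illustration, the second strategy described above can be reduced to a short function; the helper name is my own, and the eight-character limit follows the guideline in this section:

# Generate a UID from first and middle initials plus as much of the
# last name as fits in an eight-character code (hypothetical helper).
MAX_UID_LENGTH = 8

def make_uid(first, middle, last):
    initials = (first[:1] + middle[:1]).lower()
    surname = "".join(c for c in last.lower() if c.isalpha())
    return (initials + surname)[:MAX_UID_LENGTH]

print(make_uid("John", "Allen", "Smith"))            # jasmith
print(make_uid("Herbert", "James", "Gaither-Weiss")) # hjgaithe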

Passwords

The second part of user authentication is proving to the system that the user is who he or she claims to be. The most common means of providing this assurance is by using a password. If good password rules are implemented and steps are taken to ensure users follow these rules diligently, passwords can be a very effective means of user authentication.

The main problem with implementing effective password security is combating human nature. People typically use a password only a few times daily. This may not be often enough to readily memorize the password. Because of this, people have a tendency to choose passwords that are easy for them to remember. Unfortunately, passwords that are easy for them to remember are also easy for someone else to guess. Commonly used passwords that provide low levels of security include names of family members or pets, birthdates, and special interests such as a favorite sports team.

The key to implementing effective passwords is to use rules that combat this predisposition and require the passwords that are chosen to be secure. Password rules should reflect the environment and the importance of the resources being protected. Rules protecting a bulletin board system can and should be less stringent than rules protecting a company's research and design information.

Many strategies to increase password effectiveness are included in most network operating systems. The most effective rule for increasing the effectiveness of a password is to institute a minimum password length greater than six characters. In general, the longer the password, the lower the chances of its being guessed or of someone looking over a shoulder and catching it as it is typed into the system. Most network operating systems can define a minimum password length.

Similarly, the longer a password is used, the higher the probability that it may be compromised. Forcing periodic password changes can reduce the risk of a password being stolen or restore security to an account that has had its password compromised. Unfortunately, if password changes are required too frequently, users will have a hard time remembering their new passwords. A reasonable balance between security and user needs is to require password changes no more frequently than once every 30 days; once every 90 days is a good rule of thumb for most systems. Most network operating systems can force periodic password changes.


TIP: If periodic password changes are implemented, the system should be capable of remembering the last few passwords used and disallowing their reuse. This prevents a user from simply alternating between two passwords, effectively negating the benefit of periodic changes. Although this feature is not common in network operating systems, some UNIX systems and database systems provide this capability.

Another means of increasing password effectiveness is to require mixed case or the use of nonalphabetic characters in the password. By using capitalization, you can make a fairly short password more secure, especially if the capitalized letter comes in the middle of the password rather than at the beginning. Another highly effective password rule is requiring a punctuation mark to be included in all passwords.

A good password construction strategy is to combine two short words separated by a punctuation mark. You can enhance the security of this technique by including mixed case, yielding a password such as baT$boot. This is especially effective if the words bear no particular relationship to each other. The capability to force mixed case and special characters in passwords varies among network operating systems, although most will provide at least one of these options.
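
Rules like these are straightforward to enforce programmatically. A minimal sketch, with thresholds taken from the guidelines above:

import string

MIN_LENGTH = 7   # "greater than six characters"

def password_ok(password):
    # Enforce the rules discussed above: minimum length, mixed case,
    # and at least one punctuation mark.
    if len(password) < MIN_LENGTH:
        return False, "too short"
    if password.islower() or password.isupper():
        return False, "must mix upper- and lowercase"
    if not any(c in string.punctuation for c in password):
        return False, "must include a punctuation mark"
    return True, "acceptable"

print(password_ok("baT$boot"))   # (True, 'acceptable')
print(password_ok("fluffy"))     # (False, 'too short')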

Although these strategies greatly increase password effectiveness, the best password rules will not prevent unauthorized access if the user does not take an active role in securing and valuing her password and the information it protects. Common sense items such as not writing passwords down and not giving passwords to others are all too frequently ignored. In the long-term analysis, user education is the most important component determining overall password effectiveness.

Groups and Access Control

Many users share common requirements in terms of applications and data access. One of the most effective administrative strategies is to create groups of people that share common needs and provide access based on the group rather than the users in the group. You can create a group of users that contains all members of the accounting department who need access to the payroll system. As each user in the group inherits the rights of the group, you can grant access to the system to the entire group at one time.

This approach also increases overall system security by providing consistency for all members of the group. If the access needs of the group change, the administrator can make the change in one place rather than changing each user on an individual basis. New users requiring access to the system gain all the appropriate rights by simply being added to the group, and users who no longer need access can simply be removed from the group.
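
The inheritance idea can be sketched in a few lines; the group and user names are hypothetical:

# Users inherit the rights of every group they belong to, so access
# changes are made once at the group level.
group_rights = {"accounting": {"payroll_system"}}
group_members = {"accounting": {"fred.a", "sheri.f"}}

def user_rights(uid):
    rights = set()
    for group, members in group_members.items():
        if uid in members:
            rights |= group_rights.get(group, set())
    return rights

print(user_rights("fred.a"))               # {'payroll_system'}
group_members["accounting"].add("john.s")  # new hire: one change grants access
print(user_rights("john.s"))               # {'payroll_system'}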

Although the use of groups eases administrative overhead, there is at least one case when it is not appropriate. Each user should have a home directory to which he has exclusive access. Some network operating systems provide the capability to prevent the system administrator from gaining casual access to users' home directories. Administrators can still gain access by changing the security profile, although such action leaves an audit trail visible to the owner of the directory. This capability helps prevent network administrators from unknowingly accessing sensitive information stored in personal directories.

Disaster Prevention and Recovery

No one wants to think about network disasters. However, it is a network administrator's responsibility to take action to ensure that the organization can continue to function in the event of a disaster. Creating an effective disaster prevention and recovery plan is complicated by the sheer number of potential disasters. The catastrophic failure of a network component, earthquakes, tornadoes, and acts of terrorism such as the World Trade Center bombing are all examples of network disasters.

Regardless of the nature of the disaster, the primary objective of disaster prevention and recovery is to minimize the period of time that information services are not available to the organization. Taken to its logical extreme, the best option is for the users to not realize that a disaster has taken place.

Uninterruptible Power Supplies

The most common cause of system failure is damage caused by electrical power fluctuations. Electrical power fluctuations have many causes, ranging from surges to lightning strikes to brownouts caused by overburdened circuits during the air conditioning season.

Regardless of the cause of power fluctuations, they pose a serious risk to network equipment. The simplest method of protecting against damage from high-voltage electrical spikes such as lightning strikes is to install surge protectors. However, surge protectors cannot supply power to devices when electrical power is lost, so a blackout still results in an improper system shutdown. Failure to properly shut down network servers and equipment can have disastrous repercussions on system functionality, especially if it occurs during system upgrades or backup procedures.

In order to prevent system damage from line surges, brownouts, and blackouts, the use of uninterruptible power supplies (UPSs) is required on all servers. The most basic type of UPS, sometimes called a standby or offline UPS, passes the incoming power directly to the attached devices while monitoring for power failures. In the event of a power failure, the device is switched to battery-supplied power. This type of UPS is best used for workstations or network communications equipment.

Advanced units, often called online UPSs, constantly supply power to the devices from the batteries, which are in turn constantly charged by the incoming power. This type of UPS is preferable for server installations because it provides better protection from power fluctuations while effectively eliminating the time lag and voltage fluctuation that can result from the change from line power to battery power. Regardless of the type of UPS selected, it must be carefully sized to ensure it is capable of providing adequate power to the attached devices. The use of undersized units will result in lower than expected runtime and potentially "dirty" power.

Power protection systems that protect servers and other complex systems should also provide a means of communicating to the device they protect. This communications capability allows the protected device to realize that it is running off battery power and to shut itself down properly prior to exhaustion of the battery. Most UPS devices are equipped with an RS-232 serial communication port that provides this capability. The device being protected must have a compatible communication port and must be capable of running a service that constantly monitors the communications port and notifies the system of updated conditions in the UPS system. All modern network operating systems support such services either from the operating system itself or from third-party drivers that commonly ship with the UPS system. Some advanced systems also provide a means for the system connected to the UPS communications port to inform other devices that might be powered through the UPS of its condition.
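
The monitoring service amounts to a loop that polls the UPS and triggers a clean shutdown when the battery runs low. A sketch follows; it assumes the third-party pyserial package, and the "STATUS?" query and replies are entirely hypothetical, because each UPS vendor defines its own serial protocol:

import time
import subprocess
import serial   # third-party pyserial package

# Sketch of a UPS monitoring service polling an RS-232 port.
port = serial.Serial("/dev/ttyS0", 2400, timeout=2)

while True:
    port.write(b"STATUS?\n")         # hypothetical status query
    reply = port.readline().strip()
    if reply == b"ON_BATTERY_LOW":
        # Battery nearly exhausted: shut the system down cleanly
        # before power is lost entirely.
        subprocess.run(["shutdown", "-h", "now"])
        break
    time.sleep(30)                   # poll every 30 seconds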

System Backup

The data on most networked systems is far more valuable and difficult to replace than the systems themselves. It is this inherent value of data that makes modern information systems economically feasible. The most important part of disaster prevention is providing a means of protecting this information through system backup. In order to recover from a failure, you must have a good system backup from which to restore. The first step in implementing an effective backup program is to determine what data you need to back up and how often it needs to be backed up. Some data is fairly static and might only need to be backed up on a monthly basis, whereas other data is changed on a minute-to-minute basis and may need to be backed up every few hours to ensure system stability upon restoration.

Some common examples of data types and their typical backup schedules are as follows: System configuration files and applications typically are updated very rarely and only need to be backed up when they are altered through administrative action. User files are changed more often and should be backed up at least once every working day.

The best time to back up these files is during night hours when the system can be secured to ensure that there are no open files that would not be saved on the backup. A good time to schedule daily backups is 2:00 a.m. This is late enough that most users who might be working overtime have completed their work and early enough to complete the backup process before users who come in early arrive.

Database files containing vital operating information are constantly changing and should be backed up every few hours. Database applications pose another issue in terms of system backups. The files associated with the database application are always open while the application is in service. Therefore, a file-based backup system will not be able to save these files as they are already open by another application. In order to solve this problem, a specialized piece of software, known as a backup agent, is used to interface with the database management system and provide access to the data.

There are three basic types of backups: full, differential, and incremental. As its name suggests, a full backup is a backup of all files regardless of whether they have been changed since the last backup. A differential backup saves only those files that have been changed since the last full backup, whereas an incremental backup saves only those files that have been changed since the last backup of any type.

Each backup type presents different issues in terms of restoration. If the most recent backup is a full backup, all that needs to be done to return the system to operation is to restore the full backup. If differential backups are used, the last full backup must be restored followed by the last differential backup. In the case of incremental backups, the last full backup must be restored followed by each incremental backup taken since the full backup in the same order they were saved.
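
The restoration rules can be made concrete with a short sketch. It assumes that a given schedule uses either incremental or differential backups after each full backup, and the backup history shown is hypothetical:

# Compute which backup sets must be restored, in order, given a
# history of (date, type) entries sorted oldest to newest.
history = [
    ("Sun", "full"),
    ("Mon", "incremental"),
    ("Tue", "incremental"),
]

def restore_chain(history):
    # Find the most recent full backup; restoration always starts there.
    last_full = max(i for i, (_, kind) in enumerate(history) if kind == "full")
    tail = history[last_full + 1:]
    if tail and tail[-1][1] == "differential":
        # Differential schedule: the last differential captures
        # everything changed since the full backup.
        return [history[last_full], tail[-1]]
    # Incremental schedule: apply every incremental in the order taken.
    return [history[last_full]] + [e for e in tail if e[1] == "incremental"]

print(restore_chain(history))
# [('Sun', 'full'), ('Mon', 'incremental'), ('Tue', 'incremental')]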

Several different media options are available for system backup including floppy disks, Zip drives, and writable CD-ROMs. However, for network backup purposes, the best solution is magnetic tapes. Available in several different formats, these tapes can hold in excess of 10 gigabytes at a cost of a few cents per megabyte of capacity.

Common tape formats include quarter-inch tape (QIC) and digital tape. Quarter-inch tape equipment is typically designed to work on the computer's floppy diskette bus, yielding performance that is relatively slow compared to the digital tape alternatives. For this reason, quarter-inch tape should only be considered for use in workstation or peer-to-peer network backup. The actual QIC tape media is also significantly more expensive than that of digital tape, yielding a higher operating cost.

Digital tape systems differ from quarter-inch tape in several important areas: The tape drive hardware is typically designed to work on a small computer system interface (SCSI) bus, yielding significantly faster data transfer rates than quarter-inch tape. Digital tape formats also offer higher capacity than QIC tapes at a lower per tape cost.

There are two basic digital tape formats: 4mm and 8mm. Four millimeter tape systems are based on Digital Audio Tape (DAT) standards, whereas 8mm tapes are based on 8mm video tape standards. At the current time, 4mm tape is the more common medium, with several manufacturers producing tape drives compliant with the standard. Exabyte Corporation is the only current manufacturer of 8mm tape systems. Regardless of the type of digital tape used, data grade media should always be used, and tape drives should be cleaned on a regular basis to ensure data reliability.


TIP: Regardless of the type used, magnetic tape is not error-proof. Just as floppy disk drives and hard drives can develop errors, backup tapes are susceptible to data loss. In order to guard against such data loss, the contents of a backup should always be compared against the actual data immediately after the tape is written. The network administrator should check the backup logs on a daily basis for errors or other anomalies that could indicate that the backup set is not complete.

It is also important to rotate tapes to ensure that if a single tape fails, there is another backup set to restore. The most basic tape rotation technique is to simply use each tape once and archive it. Although this has the advantage of ensuring that all data ever backed up is available, it requires a significant investment in tapes and greatly increases media storage requirements.

Another technique primarily used for single servers or small departmental networks is to rotate at least two tapes on a weekly basis. Each tape begins with a full backup followed by daily incremental or differential backups for the remainder of the week. Although this technique eases restoration concerns because all backup sets are on the same tape in the order they were taken, up to a full week's data is at risk in the event of tape failure. If this technique is used, it is recommended that the last tape each month be kept for at least a year for archiving purposes.

A more comprehensive technique commonly used in larger networks is known as the Grandfather, Father, Son tape rotation method. In this technique, a different daily tape is used for each day of the week on which a weekly backup is not performed. For an organization that backs up every night, there would be six daily tapes labeled by day of the week. There would also be three tapes, labeled one, two, and three, that are rotated each week on the seventh day of the week. On the last weekly backup of each month, one of twelve monthly tapes, labeled by month, is used instead of the weekly tape. This technique provides an automatic archiving method that saves data for up to a year as part of the standard tape rotation procedure.
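
The rotation logic can be expressed as a small date calculation. In this sketch the weekly backup is assumed to fall on Friday, an arbitrary choice for illustration:

import datetime

def gfs_tape(date, weekly_day=4):   # weekday 4 = Friday (arbitrary choice)
    # Grandfather/Father/Son: daily tapes six days a week, one of three
    # weekly tapes on the seventh day, and a monthly tape when that
    # weekly backup is the last of the month.
    if date.weekday() != weekly_day:
        return "daily-" + date.strftime("%A")
    if (date + datetime.timedelta(days=7)).month != date.month:
        return "monthly-" + date.strftime("%B")      # grandfather
    week_of_month = (date.day - 1) // 7 + 1
    return f"weekly-{(week_of_month - 1) % 3 + 1}"   # father: tapes 1-3

print(gfs_tape(datetime.date(2024, 5, 6)))    # Monday      -> daily-Monday
print(gfs_tape(datetime.date(2024, 5, 31)))   # last Friday -> monthly-May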


TIP: Regardless of the tape rotation technique used, it is important that proper maintenance be performed on the tape system. Tape drives should be cleaned on a periodic basis per manufacturer's specifications. Tapes should be periodically replaced as they approach the upper limit of their expected service life as defined by the tape manufacturer.

Another area of concern is the location of the backup system. The most common approach on small departmental networks is to locate the tape drive in the primary file server. By installing the tape drive in the server, you can copy data to the tape drive at system bus speeds. Besides being much faster than copying the data across the network, this technique has the advantage of not impacting available network bandwidth.

However, placing the tape drive in the file server creates a greater demand on the server's processing and data bus capacity. In addition to serving files to the network clients, the server must also run the backup software. Another potential downside to this approach is the risk of a total system failure in the event of a failure in the backup system. If a tape drive fails, it could bring down the entire system with it, effectively creating the type of catastrophic system failure against which it was meant to protect.

A second approach is to place the backup subsystem in a client workstation. This approach isolates the backup devices from the file servers, effectively eliminating the chances of crashing the servers during the backup process. A backup agent may be required on the network server to allow the client-based backup system to read configuration information and other special files. Another issue with this approach is that it places a high demand on network bandwidth. If you use this approach, you should schedule backups for a time when there are few users on the system.

A third approach is the implementation of specialized backup hardware. Designed for the express purpose of backing up enterprise networks, these systems provide fast, reliable system backup capabilities for all devices on the network. However, these solutions are usually quite expensive and are best implemented when a centralized backup point for multiple servers is required.

Dealing with Faults

Regardless of the amount of care you take to ensure that the system is running properly, faults will always occur. Components such as hard drives, memory, and power supplies are the most common sources of system faults. The key is to minimize the service interruption caused by the failure of these components. The best-case scenario is if the failure of a component can be dealt with by the system automatically in a manner such that the user does not notice that a fault occurred. Systems that provide such capabilities are referred to as being fault tolerant.

The most basic level of fault tolerance is to implement redundant systems (also known as hot backups). Redundant systems consist of at least two devices, a primary device to routinely perform the required function and at least one secondary device to monitor the primary device and take over if a fault occurs. Examples of redundant systems are backup routers and duplicate network lines.

Redundant systems typically lend themselves to devices that are static in configuration. Another obstacle to redundant systems is cost. Implementing redundant systems typically more than doubles the cost of implementing a single system due to the instantaneous monitoring requirements.

For those devices that are mission critical, but difficult to duplicate, such as file servers, subsystems that may be prone to failure should be identified and duplicated. This practice is referred to as redundant subsystems. A common implementation of redundant subsystems in network servers is in hard disk arrays.

Redundant Arrays of Inexpensive Disks (RAID) is a system that clusters several physical hard drives to present a single logical drive to the network operating system. The data stored on the individual drives is mirrored on, or protected by parity information stored on, the other drives so that in the event of the failure of an individual drive, the data remains accessible. Most RAID subsystems include the capability to remove and replace drives while the system remains online, commonly called hot swapping. When a drive fails, it can be replaced, and the system automatically rebuilds the data that was on the failed drive from the information on the other drives.
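
The principle behind parity-based RAID levels can be demonstrated in a few lines; this illustrates the arithmetic only, not a real disk subsystem:

# Parity-based redundancy in miniature: the parity block is the XOR of
# the data blocks, so any single lost block can be recomputed from the rest.
drive1 = bytes([0x10, 0x22, 0x35])
drive2 = bytes([0x4F, 0x01, 0x9A])
parity = bytes(a ^ b for a, b in zip(drive1, drive2))

# Simulate the failure of drive2 and rebuild it from drive1 and parity.
rebuilt = bytes(a ^ p for a, p in zip(drive1, parity))
assert rebuilt == drive2
print("drive2 rebuilt:", rebuilt.hex())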

Other examples of fault-tolerant subsystems include multiple power supplies and error-correcting code (ECC) memory systems. These fault-tolerant subsystems allow the overall system to remain highly available even if a fault occurs. However, it is imperative that the system administrator consistently monitor the systems to ensure that they are working properly. If a power supply fails and the system switches to a backup unit, the failed unit needs to be replaced as soon as possible for the system to remain fault tolerant.

As you might expect, fault-tolerant systems are expensive and should only be used when the opportunity cost of the system being out of service long enough for repair exceeds the cost of the fault tolerant system. Fault tolerance is usually reserved for mission-critical servers and network hardware such as routers and bridges. Individual workstations are usually not candidates for the application of fault-tolerant equipment.

There are usually several devices that, although mission critical, are not candidates for redundancy for one of the reasons already listed. These devices may be candidates for warm backups (also known as warm swaps). Warm backup refers to the practice of keeping a backup device on hand. If a device fails, a technician replaces the failed device with the backup device to restore service as quickly as possible. Examples of devices that are good candidates for warm backup consideration include network concentrators and bridges. Warm backups are especially cost-effective because a single warm backup device can back up several production devices.

Regardless of whether a device is fault tolerant, when a fault occurs, the affected device must be repaired. There are two repair options: on-site and off-site. On-site repair can either be performed by personnel from within the organization or by a third-party service provider under service contract. When looking for service contract providers, be sure to analyze response time to ensure that the service provider can accommodate your needs. Off-site repair is most commonly used for specialized devices such as concentrators and low volume printers. If an off-site repair strategy is selected for a device, ensure that adequate backup hardware is available to allow time for shipping in addition to the time required to repair the device.

Software Administration

One of the most time-consuming activities associated with network system administration is the management of application software. Whereas the host-based systems that preceded them used a central application program, today's personal computer network-based systems distribute software throughout the stations on a network. Maintaining some degree of consistency in such a distributed system poses a serious network administration challenge.

Software Metering

Software license fees represent a large investment for an organization. Although volume purchases and site license agreements can reduce costs, license fees can still exceed $1,000 per workstation. Fortunately, most license agreements limit the number of copies that can be run concurrently rather than the number of workstations capable of running the software.

It is highly unlikely that every workstation on a network will be running the same application at any given time. Therefore, minimizing software license fees becomes an issue of determining the maximum number of concurrent licenses required to meet the organization's needs and limiting the number of workstations that can run the application concurrently to the number of licenses available.

Software metering systems provide these capabilities. Metering packages specifically include the capability to determine the number of workstations accessing an application and a means of preventing workstations from accessing an application when the number of available licenses is exceeded. All metering packages work on server-based applications, and some can be configured to meter client-based software as well.

Software metering packages are available from a number of vendors for most network operating systems. Implementation methods vary, although solutions that are wholly server-based are preferable to solutions requiring an agent on each client due to the added complexity and management requirements of an agent-based system.
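
At its core, a metering package is a counter consulted at application launch. A minimal sketch, with hypothetical application names and license counts:

# Minimal concurrent-license meter: grant a launch only while seats
# remain, and free a seat when the application exits.
LICENSES = {"wordproc": 25}   # hypothetical license counts
in_use = {"wordproc": 0}

def launch(app):
    # Deny the launch once every purchased license is in use.
    if in_use[app] >= LICENSES[app]:
        return False
    in_use[app] += 1
    return True

def release(app):
    # Free a license when the application exits.
    in_use[app] = max(0, in_use[app] - 1)

print(launch("wordproc"))   # True while licenses remain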

Software Distribution

The key to easing management of a personal computer software base is to centralize the software. Whenever possible, install software to a file server. This provides a greater measure of consistency across the network and a single point for software maintenance. Installing and configuring software to run from a file server was once an art form. Fortunately, most software developers now provide applications that are capable of running from a file server or at least are network-aware.

When installing a server-based application across a network, use a two-part process. First, you must install the application files to the file server. For small applications, you can accomplish this by simply copying them to a directory on the server and setting appropriate file permissions to allow execution on the client workstations. Larger applications may include a specialized server installation mode that automatically configures the software for a server-based environment.

Second, after you install the software on the server, the client workstations must be configured to run the server-based software. For simple programs, this can be as straightforward as creating a shortcut to the application's executable file on the server. More complex applications may require specific files, such as dynamic link libraries (DLL), to be housed locally on the workstation. In that case, the appropriate files must be marked and copied. Some network-aware applications provide a workstation installation program that automatically configures the workstation to run the application from the server. Always check the application's installation documentation to see if such a mode is supported before beginning installation.

Regardless of whether a client installation program is available, manually configuring each workstation can be a time-consuming task. Although you can complete the installation of an application across a departmental network containing a few workstations in a relatively short period of time, the same cannot be said of an enterprise network containing several hundred or even thousands of workstations. In addition to being time-consuming, manual configuration is also error-prone, especially if there are several people configuring workstations.

In order to reduce the time required and propensity for errors associated with manual client installation, a means for automating client installation is desirable. The first means of easing the burden of software installation is creating scripts to automate manual tasks. Some client software installation programs can also be configured to take input from a script file, thus removing the burden of entering configuration information from the administrator installing the software.
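
Such a script might look like the following sketch; every path and file name here is hypothetical, and a real installation should follow the application vendor's documented procedure:

import shutil
from pathlib import Path

# Automate the manual steps of a client installation: copy the locally
# required files and record the server path a shortcut should point to.
SERVER_APP = Path(r"\\fileserv\apps\wordproc")   # hypothetical share
LOCAL_DIR = Path(r"C:\wordproc")

LOCAL_DIR.mkdir(exist_ok=True)
for dll in SERVER_APP.glob("*.dll"):             # files the app needs locally
    shutil.copy(dll, LOCAL_DIR)

# Record where the workstation shortcut should point; creating the
# actual shortcut is left to a platform-specific tool.
(LOCAL_DIR / "launch_path.txt").write_text(str(SERVER_APP / "wordproc.exe"))
print("client configuration complete")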

Taking advantage of such scripting techniques can reduce the time and likelihood of error, but an administrator must still physically visit each workstation, execute the script, and test the installation. This represents a serious investment of resources, time, and money in large organizations. Imagine having to send someone to install software on 5,000 workstations located in 12 buildings in 10 different cities. Later in this chapter, in the section titled "Network Management Platforms," I discuss solutions that will assist in further automating software management.

Software Audits

One of the least appealing responsibilities of a network administrator is auditing the software contained on client workstations. Due to the distributed nature of network workstations, it is difficult to determine what versions of which software packages are installed. This presents a significant issue when new software is required. Is the other software on the workstations compatible with the new software? Can upgrade versions of the software be installed, or must completely new licenses be purchased? A network administrator needs quick, accurate answers to these types of questions.

The difficulty of answering these questions is compounded by employees who have installed personal software on their workstations. Most personal software is designed to run on standalone computers rather than on networked workstations. Installation into a networked environment can create software incompatibilities and render the workstation unusable until a network administrator can repair the damage. Even if the software presents no compatibility issue, it consumes system resources such as hard drive space.

Although the potential of personal software to degrade workstation performance is of considerable importance, there is a larger issue involved with the installation of personal software. By installing software on a corporate workstation without a license for the software, the user is placing the company into a software license noncompliance situation. This problem is magnified by the human desire to share. If a single user in the accounting department gets a new screen saver for Christmas, half the accounting department could be running illegal software by New Year's Day.

Software license noncompliance, or software piracy, is a serious issue. The Software Publishers Association (SPA) investigates corporate software license compliance with relentless efficiency. If an organization is found with illegal software, the minimum penalty is a fine equal to the purchase price of the illegal software coupled with an order to remove the offending software. If the software in question is required by the business, the organization must then purchase legal copies. Many network administrators have lost their jobs over software license noncompliance issues.

To ensure that these issues are properly addressed, a network administrator needs a comprehensive audit of software installed on all workstations on the network. In order to provide such a comprehensive listing of software on a large network in a practical manner, some means of automatically querying the workstations is required. Such capabilities are included in most network management platforms.

Network Management Platforms

To assist in the management of distributed networks, a new class of network management tool has been developed. These platforms take a comprehensive approach to maintaining distributed network systems. Designed to work in a distributed environment, they utilize a manager/agent approach: a central manager coordinates the activities of agent software installed on each client workstation.

Network management platforms provide a wide range of functions. As already mentioned, they provide a means of installing software rapidly onto distributed workstations. In order to gain the maximum benefit of this capability, the structure of each workstation's hard drive should be identical. The agent must know where to find and place each file associated with the software being installed. Without such consistency, the effectiveness of a network management platform is seriously compromised.

Another major functional area of management platforms is the capability to gather and store configuration information from the network workstations. In addition to automatically performing application software audits, network management platforms can report a wealth of information about a workstation's hardware, such as the processor type and speed, the amount of installed memory, and the available hard drive space.

By collecting and analyzing this information, you can make software upgrade decisions from a more informed perspective. A network administrator preparing for a major upgrade could easily identify which workstations are not adequately configured to support the new software. By using this information, you can make accurate projections of the amount of funding required to upgrade the hardware to support the software.
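
A sketch of how inventory data supports such an upgrade decision; the inventory records and requirements shown are hypothetical:

# Identify workstations that do not meet a new application's
# hardware requirements.
inventory = {
    "ws-101": {"ram_mb": 16, "free_disk_mb": 120},
    "ws-102": {"ram_mb": 64, "free_disk_mb": 900},
}
requirements = {"ram_mb": 32, "free_disk_mb": 200}

needs_upgrade = [
    name for name, hw in inventory.items()
    if any(hw[key] < minimum for key, minimum in requirements.items())
]
print(needs_upgrade)   # ['ws-101'] - budget these machines for upgrades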

Management platforms also include several functions to improve the efficiency and quality of user support. Real-time communications sessions (commonly called chat sessions) can be launched between administrators and users to discuss user problems. In a typical chat session, the screen is divided into two sections: one for the user's input and one for the administrator's input. This feature is particularly valuable to organizations that have workstations located in branch offices that would otherwise incur a long-distance telephone fee to contact.

Most network management platforms also include the capability to interact with the remote workstation through a command prompt. Administrators can check configurations, remove or add files, or launch programs directly over the network, without traveling to the location of the workstation. It is even possible to remotely reboot workstations to activate changes made in configuration files.

Taking the concept of remote access one step further, some network management platforms include the capability to control the remote workstation. Remote control differs from a remote command prompt in that in a remote control scenario, the network administrator sees the exact same screen as the local user. Both the user and administrator can enter characters into the system or use the mouse in graphical user environments. This capability allows the user to show the administrator the exact symptoms of the problem. The administrator can then perform problem resolution activities remotely and show the user how to operate the system properly.


NOTE: Remote control can present a serious security concern. You must take care to put procedures into place that prevent network administration personnel from connecting to remote workstations covertly and "eavesdropping" on users as they work. Ideally, a remote control program would announce to the user on the remote workstation that someone wants to share control of their workstation and give the user the ability to accept or deny the request. However, some commonly available packages do not include this capability, so you should be careful when selecting a package.

Network management platforms usually include capabilities to manage network devices such as hubs, switches, and routers. Although this capability is beyond the scope of this chapter, it is an important feature in an internetworked environment.

Network Security

The process of protecting data and equipment from unauthorized access is collectively known as network security. The importance of implementing good network security procedures is highlighted when you consider the ramifications of not taking such precautions: data can be accidentally or intentionally erased from the system; a competitor can gain an unfair advantage by accessing confidential data; and the use of network resources can be lost, yielding a corresponding loss of productivity.

It is the role of network administration to take preventive action to ensure that the risk of such losses is minimized. However, care must be taken to balance the reduction of security risks against the ensuing loss in ease of use and availability of the networked systems. Security procedures and system flexibility are diametrically opposed concepts. Every step taken by a network administrator to prevent unauthorized access creates another step that an authorized user must take to gain access to the data. It is important to analyze each system on a network and place appropriate security restrictions on an individual basis.

Security Threats

The first step in evaluating security risks is to determine the threats to system security. Although network security is commonly thought of as protecting data and system resources from infiltration by outside invaders, most security breaches are initiated by personnel inside the organization. Organizations will spend hundreds of thousands of dollars on securing sensitive data from outside attack while taking little or no action to prevent access to the same data by unauthorized personnel within the organization.

Although the media commonly present exotic tales of espionage through computer systems, most information theft occurs through traditional means. Even when the source of the threat is an outside force, the instrument is usually an inside operative. From the perspective of an outside entity wanting to gain access to sensitive information, it is usually much easier to find an employee who is willing to help than to attempt to break into the target company's systems. The compromised employee can usually gain access to the data more rapidly than an outside attack could, and the theft is considerably more difficult to trace.

The most common source of information theft is disgruntled employees or employees who are leaving the organization. Whenever an employee resigns or is released, immediate action should be taken to protect the information assets to which the employee might have had access. The employee's computer accounts should be disabled. If it is not possible to disable the accounts, security rights should be re-examined to ensure that the risk of a security breach is minimized. Passwords to any administrative accounts the employee might have had access to should be changed immediately. Many ex-employees report being able to access data at their old firms for years through accounts that were never secured after their departure.

Systems administrators and database administrators present an even larger security risk. By necessity, these employees have greater access to data and systems than any other user. For this reason, it is best if systems administration is performed by at least two people rather than a single administrator. Whenever there are multiple administrators, the likelihood of detecting improper administrative action is increased greatly.

It is also a good practice to create users and grant them administrator rights for daily systems administration rather than using the administrator account itself. The password for the administrator account should then be sealed and placed in a safe location. If all other administrator access level accounts are disabled for some reason, the administrator account will still be available and can be used to restore access to the system.

Although internal threats represent the majority of security threats, a significant number of threats originate from outside the organization. Hackers, competitors, vendors, and government agencies have all been known to attempt unauthorized access to computer systems.

The threat from hackers has been largely overstated. Individuals who fit into this group have more of a Robin Hood mentality than a destructive one. Most hackers (or crackers, the term the computing community prefers for those who break into systems) are more interested in the thrill of breaking into a system than in causing damage once they succeed in gaining access. Unfortunately, there is an increasing trend for hackers to be employed by other entities as instruments to gain access to systems.

As the amount of critical data stored on networked systems has increased, so has the appeal of gaining access to competitors' systems. In highly competitive industry segments, an entire underground market exists for buying and trading product and sales data. By gaining access to a competitor's research and development information, a company can avoid millions of dollars in costs and years of research effort.

Another external threat is that of government intrusion, both from the domestic government and from foreign governments. Agencies such as the Federal Bureau of Investigation and the Internal Revenue Service can have vested interests in gaining access to critical tax and related information. Foreign governments are especially interested in information that could represent an economic or national defense advantage.

Network Operating System Security

Regardless of the source of the threat, there are several measures you can take to increase the security of a network operating system. The first step is to physically secure all network equipment. If you can lock it up, lock it up. Servers, hubs, and switches should always be installed in a location that provides physical security to prevent tampering and unauthorized access to the network.

It is equally important to educate users on the importance of maintaining security and basic security techniques. Concepts as simple as not leaving workstations that are logged into the network unattended can be overlooked if the employee does not think about the ramifications of his actions. Anyone who has physical access to these unsecured systems by extension has access to the data stored on the host system. This problem is exacerbated if the workstation in question is located in an open office or cubicle-based environment.

Educating network users to log off the system or to lock their workstations when not at their desk can greatly reduce the security risk. Some operating systems allow the use of screen savers that automatically lock the workstation after a few minutes of inactivity. This can be a powerful tool to force users to adhere to good security practices.


NOTE: An organization that spends several hundred thousand dollars on the latest firewall technology may not even think about protecting against data being copied onto floppy diskettes or laptop computers. This is the easiest method for transporting stolen data. To reduce this risk, many organizations are now removing floppy drives from workstations or installing drive locks that prevent the floppy drive from being accessed by the user. This approach has the added benefit of reducing the risk of a workstation contracting a virus from a contaminated diskette.

Firewalls

Although the majority of network security threats come from within the organization, there is a growing threat of outside infiltration. This threat is greatly increased by connection of private networks to public networks such as the Internet. In order to reduce the risk of outside parties gaining access to a private network, you can install a firewall.

As the name suggests, a firewall is a network security device that blocks certain data from coming through. There are three basic types of firewalls: packet filters, circuit-level gateways, and application gateways. Packet filters, also known as router firewalls, limit the types of packets that can pass through the firewall. In this manner, communication through the firewall can be limited to specific session types. For instance, a packet filter can allow e-mail (SMTP) packets to pass through but prevent file transfers (FTP) from occurring.
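
To make the packet-filtering concept concrete, here is a minimal Python sketch of default-deny, port-based filtering in which mail traffic passes and file transfers are refused. The packet representation is a simplified stand-in; a real packet filter inspects actual protocol headers.

# Sketch of packet-filter logic: permit SMTP, refuse everything else.
# Well-known port numbers: 25 = SMTP, 20/21 = FTP data and control.

ALLOWED_PORTS = {25}

def filter_packet(packet):
    """Return True if the packet may pass through the firewall."""
    # Default-deny: anything not explicitly permitted is dropped.
    return packet["dst_port"] in ALLOWED_PORTS

print(filter_packet({"src": "10.0.0.5", "dst_port": 25}))  # True  (SMTP)
print(filter_packet({"src": "10.0.0.5", "dst_port": 21}))  # False (FTP)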

Circuit-level gateways provide greater flexibility than packet filters. Like packet filters, a circuit-level gateway can limit the types of packets that pass through the firewall. However, a circuit-level gateway also rewrites the network-layer addressing of each packet, replacing it with addressing appropriate for the other side of the firewall. In this way, a circuit-level gateway can connect private IP networks to the Internet without re-addressing the private network to Internet standards.
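
The following Python sketch loosely models the address rewriting just described: outbound packets leave bearing the gateway's public address, and a connection table lets replies find their way back to the originating private station. The addresses and the packet structure are illustrative assumptions.

# Sketch of the address rewriting performed at the gateway.

GATEWAY_PUBLIC_IP = "192.0.2.1"  # example public address

connection_table = {}    # public port -> (private ip, private port)
next_public_port = 40000

def rewrite_outbound(packet):
    """Replace the private source address with the gateway's public one."""
    global next_public_port
    public_port = next_public_port
    next_public_port += 1
    connection_table[public_port] = (packet["src_ip"], packet["src_port"])
    return dict(packet, src_ip=GATEWAY_PUBLIC_IP, src_port=public_port)

def rewrite_inbound(packet):
    """Restore the original private destination from the connection table."""
    private_ip, private_port = connection_table[packet["dst_port"]]
    return dict(packet, dst_ip=private_ip, dst_port=private_port)

out = rewrite_outbound({"src_ip": "10.1.1.7", "src_port": 1025,
                        "dst_ip": "198.51.100.9", "dst_port": 80})
print(out)  # source now appears to be the gateway
reply = {"src_ip": "198.51.100.9", "src_port": 80,
         "dst_ip": GATEWAY_PUBLIC_IP, "dst_port": out["src_port"]}
print(rewrite_inbound(reply))  # delivered back to the private station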

Application gateways provide the best security because stations inside the firewall and those outside the firewall never actually communicate with each other. The outside station sends a message to the application gateway, which reformats and retransmits the message to the destination inside the firewall. Unlike the situation with packet filters or circuit-level gateways, both workstations view the firewall itself as the final destination of the network traffic.

Regardless of the firewall technology implemented, one of the most common misconceptions in network security is that a firewall solves all network security concerns. A firewall merely represents a first line of defense. Just as warring factions create multiple lines of defense, a network administrator should ensure that proper security measures are taken inside the firewall.

Remote Access Security Considerations

One of the fastest-growing areas of network computing is that of remote access. Sales representatives, technical support personnel, and telecommuters all require access to centralized network resources from remote locations. These communication links are usually made through the public telephone network. Unfortunately, extending the corporate network to remote locations through the public telephone system represents a large security risk.

To reduce the security risk of remote access, several safeguards should be implemented. Use a communications protocol that protects user identification codes and passwords in transit. For example, the Point-to-Point Protocol (PPP), which supports encrypted authentication, should be used whenever possible rather than the Serial Line Internet Protocol (SLIP), which offers no such protection.

To maintain system security, passwords should never be stored in local password cache files, especially on portable computers. Although password caches reduce the time required for a user to access data on the network, they effectively bypass all security if the computer falls into the wrong hands. A thief who steals a laptop in an airport could simply plug the computer into a telephone line, dial the preconfigured number, pass the stored password to the server, and gain full access to the system.

Another means of providing additional security is to implement call-back rather than simple dial-in. In a call-back environment, the user calls the server and instructs the server to place a call to the workstation to initiate the remote access session. The server can be configured either to always return the call to a specific phone number or to have the user enter the number for the return call.

For remote users who attach to the network from a consistent location, such as telecommuters, the remote access server should be configured to place the return call to a predefined telephone number. For mobile users, call-back offers a means to create a log of locations and phone numbers from which the user is accessing the system. This log can then be scanned in real time for any nonstandard locations. Another advantage to a call-back implementation is the potential to reduce toll charges by originating the call at the central site rather than in the field.
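
A minimal Python sketch of the call-back decision logic follows; the user directory, telephone numbers, and log format are hypothetical.

# Sketch of call-back logic for a remote access server.

registered_numbers = {
    "jsmith": "555-0101",  # telecommuter: always called back at this number
}

callback_log = []  # (user, number) pairs for later review

def handle_dial_in(user, requested_number=None):
    """Choose the number the server should call back, and log the session."""
    if user in registered_numbers:
        number = registered_numbers[user]  # fixed-location user
    elif requested_number is not None:
        number = requested_number          # mobile user supplies a number
    else:
        raise ValueError("no call-back number available for " + user)
    callback_log.append((user, number))    # audit trail of access locations
    return number

print(handle_dial_in("jsmith"))                  # 555-0101
print(handle_dial_in("mobile_rep", "555-0199"))  # logged for later scanning
print(callback_log)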

Viruses and Other Computer Parasites

Viruses and other computer parasites represent a huge productivity threat to organizations regardless of size. Experts estimate that the loss of productivity resulting from computer virus infections is measured in billions of dollars annually.

A computer parasite is any program or task that destroys data or prevents a computer system from being used as intended. Computer parasites range from the benign, such as the Stoned virus that displays a message to the user of an infected computer about legalizing marijuana, to the extreme, such as the Hari virus that destroys all data stored on a system.

There are three major types of viruses. Boot sector viruses attach themselves to the boot code of the infected system's startup disk. The first type of virus to be developed, boot sector viruses always reside in the infected system's memory and are fairly easy to detect and remove from a system. Examples of boot sector viruses include Stoned and Michelangelo.

File infectors attach to executable files rather than the system's boot sector. Therefore, file infectors affect system performance only when an infected file is resident in memory. This can make detecting and removing file infectors more difficult than detecting and removing boot sector viruses. Examples of file infectors are Vienna, Jerusalem, Dark Avenger, and Frodo.

The latest virus threat comes from macro viruses. Unlike other virus types, which are transmitted through infected executable programs, macro viruses are hosted in documents. Although the manner of infection differs, the result of a macro virus infection can be just as destructive. In some respects, macro viruses represent a larger threat because they attack data files rather than executables. Executable files can easily be replaced by reinstalling the affected software, but documents are irreplaceable unless they have been backed up onto other media.

As use of the Internet expands, another virus threat is emerging. World Wide Web browsers are evolving to include new technologies designed to increase the capability of browsers to perform application tasks. These technologies provide an automatic means for executable software, commonly referred to as applets, residing on a server to be downloaded to the local workstation and executed automatically when a link is clicked. Many times, the user is not even aware that an applet is being installed on the workstation. As these technologies continue to mature, ensuring that their capabilities are not used for destructive purposes is an industry-wide concern.

Virus hoaxes, although not technically a computer parasite, can represent as much of a threat to productivity as real viruses. The best known virus hoax is the "Good Times" virus. The hoax consists of an electronic mail message stating that a new computer virus has been developed that can destroy the contents of a hard drive simply by opening an electronic mail message with the subject line Good Times. In the spirit of helpfulness, users forward the "warning" to all of their friends and coworkers, thus overwhelming network resources in the process of perpetuating the hoax.

In addition to viruses, the field of computer parasites includes Trojan horses and worms. Trojan horses are viruses or other destructive programs masquerading as application software. The user launches what she thinks is an application, and the parasite is unleashed. Worms are self-replicating programs that attack memory and processing resources rather than files. A typical worm infection results in tasks being launched until the system becomes so overloaded that it ceases to function effectively. Fortunately, worm attacks can usually be cured by shutting down and restarting the affected system.

The best defense for computer parasite attacks is a good offense. Rather than waiting for an infection to occur and then taking steps to remove it, you should take preventive action to ensure that the risk of infection is minimized. The most important preventive measure is the implementation of sound system and user policies, such as obtaining software only from reputable sources, scanning all incoming diskettes and downloaded files before use, restricting users from installing unauthorized software, and maintaining current backups of critical data.

In addition to the implementation of sound system and user policies, virus-scanning software should be installed on all workstations and servers within the enterprise. Many capable virus-scanning packages are available on the market. Key features to look for when evaluating a virus-scanning solution include cost, cross-platform compatibility, and the availability of timely updates, because new viruses and their derivatives are discovered every day.
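
The following Python sketch illustrates the basic idea behind signature-based scanning: search a file's contents for byte patterns known to belong to specific viruses. The signatures shown are made-up placeholders, and production scanners use far more sophisticated detection techniques.

# Sketch of signature-based scanning. Signatures are placeholders only.

SIGNATURES = {
    b"FAKE-SIG-ALPHA": "placeholder virus A",
    b"FAKE-SIG-BETA":  "placeholder virus B",
}

def scan_file(path):
    """Return the names of any known signatures found in the file."""
    with open(path, "rb") as f:
        data = f.read()
    return [name for sig, name in SIGNATURES.items() if sig in data]

# Demonstration against a harmless test file:
with open("suspect.bin", "wb") as f:
    f.write(b"...FAKE-SIG-ALPHA...")
print(scan_file("suspect.bin"))  # ['placeholder virus A']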

Help Desk and Trouble Ticket Systems

The most visible and often time-consuming responsibility of network administration is supporting end users throughout the organization. Network administrators often feel more like firefighters running from one user problem to the next than administrators. In many cases, administrators can be so busy responding to end user problems that they cannot take action to prevent such problems from occurring in the first place.

This problem is made more frustrating by the fact that most problems users experience do not require systems administrative action to resolve. Printers that do not work, application programs that do not work as expected, and computers that will not boot can often be traced to simple operational issues.

The establishment of a help desk to serve as the initial point of contact for end users improves this situation from both the end user and administrator's perspective: The end user has a single point of contact for all problems associated with the use of the network, and the network administrator is freed to concentrate on those problems that represent operational issues.

A help desk is a single point of contact for end users to report any computing problems they might be experiencing regardless of their nature. Personnel at the help desk take the call and make an assessment as to which of four main categories the user's problem falls into: application questions, workstation installation issues, network administration issues, or system faults.

The majority of calls to a typical help desk will be application questions. Questions such as "How can I add a chart to a word processing document?" and "How do I print to the color printer?" fall into this category. The help desk personnel are usually capable of handling these questions internally either over the phone or by a visit to the user's workstation. Through the use of remote control software as detailed in the systems management platform section of this chapter, it is even possible for most problems to be taken care of directly from the help desk.

Workstation installation issues include workstations that are not properly configured to meet the user's needs. Examples of this type of problem would include a user whose spreadsheet program does not have the statistical analysis package installed or a workstation that does not have appropriate printer drivers. These problems can also be solved directly by help desk personnel.

Network administration issues are those problems whose root cause lies in the management of network resources. Examples include users who do not have adequate access rights to network resources or requests for new users or new groups. The help desk personnel usually forward network administration issues to the network administrators for appropriate action.

If the problem appears to be caused by a system fault, the problem is immediately escalated to the network administration personnel. Problems of this type are usually identified quickly because the problem often affects multiple users in the same way at the same time, resulting in a rash of calls to the help desk. By taking care to note who is affected by the problem, help desk personnel can greatly aid systems administration personnel in solving the problem quickly and efficiently.
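
The four-way triage just described can be summarized in a small Python sketch; the keyword matching below is a deliberate simplification of the judgment a help desk technician actually applies.

# Sketch of help desk triage into the four main categories.

def triage(description):
    """Assign an incoming call to one of the four main categories."""
    text = description.lower()
    if "down" in text or "outage" in text:
        return "system fault"              # escalate immediately
    if "access" in text or "new user" in text:
        return "network administration"    # forward to administrators
    if "install" in text or "driver" in text:
        return "workstation installation"  # handled by the help desk
    return "application question"          # handled by the help desk

print(triage("How do I print to the color printer?"))  # application question
print(triage("The whole accounting office is down"))   # system fault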

In order to ensure that each problem reported to the help desk is resolved, a trouble ticket system can be implemented. Trouble ticket systems are a formalized methodology for tracking a help call through the problem resolution process from the initial report to final resolution. Trouble ticket systems provide many benefits: no reported problem can be forgotten or lost, recurring problems can be identified from the accumulated problem history, and workload statistics can be generated to support staffing and planning decisions.

Although manual trouble ticket systems are adequate for smaller networks, several automated trouble ticket systems are available for larger systems. These automated systems provide many benefits, including the capability to automatically search for similar problems from previous trouble tickets, the capability to interface directly into electronic mail systems, comprehensive search capabilities, and automated reporting functions.

Regardless of whether a trouble ticket system is manual or automated, it can be effective only if it is kept current. Each problem should be logged into the system, no matter how insignificant it may seem. Often these small problems, when viewed at a macro level, are indicative of other, more serious problems in the network system.
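
As a rough illustration of what a trouble ticket tracks from initial report to resolution, consider the following Python sketch; the field names and status values are illustrative assumptions rather than those of any particular product.

# Sketch of a trouble ticket record and its life cycle.

import itertools
from datetime import datetime

_ticket_ids = itertools.count(1)

class TroubleTicket:
    def __init__(self, reporter, description, category):
        self.ticket_id = next(_ticket_ids)
        self.reporter = reporter
        self.description = description
        self.category = category
        self.status = "open"
        self.history = [("opened", datetime.now())]  # full resolution trail

    def escalate(self, to):
        self.status = "escalated to " + to
        self.history.append((self.status, datetime.now()))

    def resolve(self, note):
        self.status = "resolved"
        self.history.append(("resolved: " + note, datetime.now()))

ticket = TroubleTicket("jsmith", "cannot print to color printer",
                       "application question")
ticket.resolve("installed correct printer driver")
print(ticket.ticket_id, ticket.status, len(ticket.history))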

Documentation

One of the most important tools for supporting and maintaining a network system is proper documentation. Unfortunately, most network administrators would rather concentrate on the next project than carefully document the work just completed. Good documentation should be considered a vital part of each network administrative task, not a separate chore to be taken up when the "real work" is finished.

The importance of good documentation is emphasized by considering the results of poor documentation. Suppose a system that has been installed for a long period of time needs to be upgraded to a newer version of its operating system. Before the upgrade process can start, the network administrator needs answers to several questions about the system and its configuration: What are the current settings on the device? What version of the operating system or firmware is installed? These questions increase in importance if the administrator who initially installed the device has since left the company.

The network documentation process begins with the management of documentation from software and hardware vendors. Instructions that detail jumper settings, installation procedures, system requirements, and copies of any software that may be required for configuration should be carefully stored. This information is invaluable when a change needs to be made to a system. A copy of all user documentation should also be kept at the help desk for users to use as reference material.

There are three basic areas that should be considered when documenting network systems: hardware configurations, software configurations, and network wiring diagrams. Hardware configuration is one of the most important documentation areas. In the event of a hardware failure or a proposed upgrade, you must analyze the current configurations and read the manufacturer's documentation to ensure that replacement equipment is compatible with installed equipment. Examples of hardware configuration information included in the systems documentation are interrupt and memory address settings, jumper and switch settings, installed firmware versions, and equipment serial numbers.

Software configuration information should also be carefully documented. The client/server environments that are becoming commonplace require considerable configuration at both the client and server ends. In order to make the system as self-documenting as possible, a consistent strategy should be used for the assignment of software configuration information such as station names. Software configuration information includes operating system versions and installed patches, application versions, driver versions, station names, and network addresses.

Network wiring and cables also require careful documentation. Due to the dynamic nature of networked systems, the destination of a cable can quickly become lost in a "rat's nest" of wiring. Each end of all cables should be descriptively labeled and color-coded. Network maps detailing the locations of servers, workstations, network communications equipment, and the wiring that connects them should be kept current. Some network management platforms include the capability to create and update logical network maps in real time. Although these products do not explicitly determine the location of the device, they do show how the equipment is interconnected. The issue of physical location can be addressed if the network devices are named in a manner that represents their location, such as smith_101 for a workstation located in room 101 of the Smith building.
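
The location-bearing naming convention mentioned above lends itself to simple tooling. The following Python sketch, which assumes names of the form building_room, recovers a device's physical location from its name:

# Sketch: derive physical location from a convention-following device name.

def parse_location(device_name):
    """Split a name like 'smith_101' into (building, room)."""
    building, _, room = device_name.partition("_")
    if not room:
        raise ValueError("name does not follow convention: " + device_name)
    return building.capitalize(), room

print(parse_location("smith_101"))  # ('Smith', '101')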

Summary

Network administration is a never-ending operations task. There are five basic network administration functions: fault management, configuration management, accounting management, performance management, and security management. By addressing each of these functional areas, a network administrator can ensure that the data and applications stored on the network are secure and available for use.

By being diligent in the creation and implementation of network policies, a network administrator can greatly reduce the time spent "firefighting" and increase the time spent planning for the future needs of the network system. In short, the more time spent developing a comprehensive plan to administer your network, the less time spent actually performing network administration.

