By Chris Byers
As graphics machines and number crunchers, Sun workstations and servers have enjoyed a nice little niche as top-of-the-line machines. They've earned this reputation from their proprietary bus architecture and flawless graphics adapters and monitors, which bring their performance to a very high level.
Along with being a bit on the pricey side, Sun workstations can be a bit of a pain to administer, however. With any luck, you should be able to find at least a few answers to problems that may pop up in the Sun environment.
The following section deals mainly with general topics and concepts of the Solaris Environment.
Solaris actually encompasses the entire user environment, from the UNIX operating system to the X11-based windowing system, as well as many other features.
The two major releases of Solaris are:
There are several good reasons. First and foremost, it is much more compatible with the rest of the UNIX industry. The big players, such as IBM, HP, SGI, and SCO are based on System V, instead of BSD.
In the PC world, all major vendors have System V-based UNIXes, except for BSDI, Inc. As the name implies, BSDI's operating system is based on Berkeley's BSD.
In addition, for some time now, Sun has only been doing development work for Solaris 2, and it's not likely to change.
The standard X11R5 release of X Windows is bundled with Solaris 2.3. Although it is still called OpenWindows, it is really X11R5 with the addition of Adobe DPS.
Version 2.0 of Solaris only ran on desktop SPARCstations, as well as a few other Sun machines.
There are actually two flavors of Solaris 2.0: the SPARC flavor and the "x86" flavor.
The SPARCstations and their clones will all run the Solaris 2.1 (and above) SPARC flavor. It will also run on all models of the Sun-4 family. For the 4/110 and the 260/280, the FPU is not supported. This means that floating point operations will work, but very slowly.
Starting with 2.5, support for machines with kernel architecture sun4 is dropped. This means the machines on which uname -m and arch -k return sun4, not the machines on which those commands return sun4c, sun4m, sun4u or sun4d. The unsupported machines include the sun4/110 (not to be confused with the SS4 @110MHz), sun4/2xx, sun4/3xx and sun4/4xx. These are all VME-based deskside/server configurations.
All versions of the SPARC PROMs should work under Solaris 2.x, but you can run into the following problems:
2. If booting diskless, you need a link in the /tftpboot directory, tftpboot -> .. Admintool will make that link automatically.
A Solaris port for the PowerPC has been completed.
Solaris 2.1 and 2.4 for x86 have been released to end users. It runs on a wide range of high-end PC-architecture machines. High-end means 16 MB of RAM and an 80486 (or 33 MHz or faster 80386DX). It will not run on your 4 MB, 16 MHz 386SX, so don't bother trying! Also, floating point hardware (80387-style) is absolutely required in 2.1. Starting with Solaris 2.4 for x86, an fp coprocessor is no longer required, though still recommended.
All three buses are supported: ISA, EISA, and MCA. Some PCI devices are supported, though full bus nexus support for PCI is not there. See also 3.36.
To summarize all this, Jim Prescott provided the following chart, which I've updated:
Solaris | SunOS | OpenWin | Comments |
1.0 | 4.1.1B | 2.0 | |
 | 4.1.1_U1 | 2.0 | sun3 EOL release (not named Solaris) |
1.0.1 | 4.1.2 | 2.0 | 6[379]0-1[24]0 MP |
1.1 | 4.1.3 | 3.0 | SP Viking support |
1.1C | 4.1.3C | 3.0 | Classic/LX |
1.1.1 | 4.1.3_U1 | 3.0_U1 | 4.1.3 + fixes + Classic/LX support |
1.1.1 B | 4.1.3_U1B | 3.0_U1 | 1.1.1B + SS5/SS20 support |
1.1.2 | 4.1.4 | 3_414 | The "final" 4.x release (SS20 HS11) |
2.0 | 5.0 | 3.0.1 | sun4c only |
2.1 SPARC | 5.1 | 3.1 | Dec '92 |
2.1 x86 | 5.1 | 3.1 | May '93 |
2.2 SPARC | 5.2 | 3.2 | May '93 |
2.3 SPARC | 5.3 | 3.3 | Nov '93. OpenWin 3.3 is X11R5 based: Display PostScript instead of NeWS, no SunView. It is still primarily OPEN LOOK. The Spring 1995 OpenWin will be Motif and COSE-based. Statically linked BCP support. |
2.3 edition II SPARC | | | Special Solaris 2.3 distribution for Voyager and SPARCstation 5 |
2.3 hardware 5/94 SPARC | | | ?? |
2.3 hardware 8/94 SPARC | | | Supports S24 (24-bit color for SS5), POSIX 1003.2, Energy Star power management and SunFastEthernet + patches. |
2.4 | 5.4 | 3.4 | From this moment on, the SPARC and x86 releases are in sync. Q3 '94. Adds Motif runtime and headers (not mwm). |
2.4 hardware 11/94 | | | First SMCC release of 2.4 |
2.4 hardware 3/95 | | | Second SMCC release of 2.4 (includes support for booting from SSA) |
2.5 | 5.5 | 3.5 | UltraSPARC support, PCI support. NFS V3, NFS/TCP, ACLs, CDE, Sendmail V8, name service cache, dynamic PPP, POSIX threads, doors (new IPC mechanism), many "BSD" type functions back in libc, many "BSD" programs back in /usr/bin. Mixed mode BCP support (e.g., apps only dynamically linked against libdl.so) |
2.5 hardware 1/96 | | | Creator3D support (Creator3D/FFB+ is not supported in 2.5 11/95, though the files are present, but of unsupported, "mostly works", beta quality) |
2.5.1 | | | Ultra-2 support, Ultra-Enterprise server support. Large (32-bit) UID support. 64-bit KAIO (aioread64/aiowrite64), 3.75 GB of virtual memory. Pentium/Pentium Pro optimizations (up to 25% for certain database apps). Ultra ZX support. Initial PowerPC desktop release. |
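The kernel-architecture cutoff for Solaris 2.5 described before the chart can be checked mechanically. The following is only a minimal sketch, not a Sun-supplied tool: the function name supported_by_25 is made up for illustration, and it knows only the architecture names mentioned in the text.

```shell
#!/bin/sh
# supported_by_25 ARCH   (hypothetical helper, not part of Solaris)
# ARCH is the string printed by "uname -m" (or "arch -k").
# Returns 0 for the kernel architectures the text lists as still
# supported in Solaris 2.5, 1 for plain sun4, 2 for anything else.
supported_by_25() {
    case "$1" in
        sun4c|sun4m|sun4d|sun4u) return 0 ;;
        sun4)                    return 1 ;;
        *)                       return 2 ;;
    esac
}

# Typical use on the machine itself:
#   if supported_by_25 "`uname -m`"; then echo "OK for 2.5"; fi
```

A return value of 2 simply means the sketch does not recognize the name; it says nothing about whether the machine is supported.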
SunOS 5.x contains an emulation mode called Binary Compatibility (BCP) for running 4.1.x binaries. Some overhead is involved, though, because this works by dynamically linking the 4.1.x binaries with a shared library that emulates the 4.1.x binary interface on top of 5.x. Up to and including Solaris 2.2, the programs needed to be fully dynamically linked.
Solaris 2.3, 2.4 and beyond support fully statically linked programs. There is an exception: the programs won't obey nsswitch.conf. They will instead use the standard "use NIS if present, fall back to files" approach of SunOS 4.x.
In this case, the programs may require a passwd:compat line and will only talk to NIS, or NIS+ in emulation mode, or they will read from files.
With the release of Solaris 2.5, mixed mode executables are supported. Mixed mode executables are partly static and partly dynamic in nature. These programs can use /etc/nsswitch.conf, depending on precisely how much was dynamically linked.
The word is that Sun will drop binary compatibility at some point in the future, so it is best to get as much software as possible moved to native Solaris 2.x.
This section should give you some direction on where to look for certain questions you may have, or where to find software and solutions.
Sun still makes printed manuals, but they don't automatically distribute them as they used to. You can still use the man pages for many commands, and there is a CD-ROM called the "AnswerBook," which contains all the printed documents in PostScript form, with hypertext capabilities and a keyword search engine. You should be able to answer most of the questions you have with this CD.
With the Solaris 2.5.1 release, the following CDs are available:
Solaris 2.x CD: | Solaris 2.x User AnswerBook |
Solaris Desktop 1.x | Wabi 2.x Answerbook Solaris Common Desktop Environment AnswerBook 1.0.x |
Updates for Solaris Operating Environment 2.x | Solaris 2.x on Sun Hardware Answerbook |
Server Supplement | NSKit 1.2 answerbook Solaris 2.x System Administrator AnswerBook (Solaris 2.5.1 Supplemental System Admin AnswerBook) Solaris 2.x Reference Manual AnswerBook |
Solstice AutoClient & AdminSuite | Solstice AutoClient 2.0 AnswerBook Solstice AdminSuite 2.2 AnswerBook |
Solstice Online Disksuite | DiskSuite 4.0 AnswerBook |
Solstice Backup | Solstice Backup 4.2 AnswerBook |
Solaris 2.x Software Developer Kit | All programming manuals. |
Solaris 2.x Driver Developer Kit | Device driver developer manuals. |
Only the first two CDs ship with the desktop edition; the third is SPARC-specific. The last two CDs are part of two separate products: SDK and DDK. The rest are server only, though the reference manuals are available in nroff source form.
There is some overlap between CDs. As distributed with 2.1 and 2.2, the Answerbook search engine runs only with the OpenWindows (xnews) server, not with MIT X11. This changed in 2.3. If you are using the MIT server instead of what Sun provides, you'll have to use one of several "answerbook workaround" scripts that are in circulation. The AnswerBook distributed with 2.3 and later runs with the OW3.3 X11R5+DPS server, so it should display on any X11+DPS server, such as on DEC, IBM, and SGI workstations.
You should buy (or print from within Answerbook) at least the reference manual and the System and Network Administration books, because if your system becomes disabled you won't be able to run the Answerbook to find out how to fix it. Catch 22.
Instead of the standard whatis file, Solaris uses a manual page index file called windex. You must build this index with catman -w -M <man-page-directory>.
Unfortunately, in Solaris 2.1 this will result in a lot of "line too long" messages, plus a bogus windex file in /usr/share/man, as well as a core dump in /usr/openwin/man. There is a similar problem in Solaris 2.2, where catman works in /usr/share/man but the "line too long" errors appear in /usr/openwin/man. Furthermore, man usually doesn't work if it can't find the windex entry, even if the man page exists.
A script that works better than catman is makewhatis in /usr/openwin/man. Unfortunately, by default, it searches files in /usr/man, not in openwin, and it only looks in some predefined man subdirectories. To avoid this problem, you can change its for ... command to for I in man*, then use it like this:
cd /usr/share/man; /usr/openwin/man/makewhatis .
cd /usr/openwin/man; /usr/openwin/man/makewhatis .
This will create /usr/share/man/windex and /usr/openwin/man/windex.
You then have to alias man to "man -F" to force it to look for the correct files. In addition to this problem, the switch to look up different sections has changed. For example, instead of typing "man 2 read", you now have to type "man -s 2 read".
The following is a script to get around that little extra switch:
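#!/bin/sh
# Accept the old "man 2 read" syntax: if there are several arguments and
# the first one is a number, pass it to man as the -s section switch.
if [ $# -gt 1 ] && [ "$1" -gt 0 ] 2>/dev/null; then
    /bin/man -F -s "$@"
else
    /bin/man -F ${1+"$@"}
fi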
Here are most of the important sites:
These are some of the more important FAQ's and their locations:
5. See also the "Solaris SW list. Monthly Post" above and the whatlist file.
6. The Sun Security Bulletin announcement mailing list. Low volume, announcement only list. Subscribe by mailing security-alert@sun.com with subject SUBSCRIBE cws user@some.host.
It is supposed to take up 164 MB of space, but you also have to add swap space to that number. Here are some suggested partition sizes:
/ | 10 MB |
/usr | 78 MB |
/var | 10 MB |
/usr/openwin | 83 MB |
/opt | 48 MB (for full installation) |
You must have at least as much swap as you have memory. I would suggest using twice as much swap as memory for improved performance and fewer error messages.
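The swap rule of thumb above is simple arithmetic. As a small sketch (the function name swap_range_mb is made up for illustration, and it uses expr so that it also runs in an old Bourne shell):

```shell
#!/bin/sh
# swap_range_mb MEM_MB   (hypothetical helper)
# Prints the minimum swap (equal to memory) and the suggested swap
# (twice memory) in MB, following the rule of thumb in the text.
swap_range_mb() {
    mem=$1
    suggested=`expr "$mem" \* 2`
    echo "minimum=${mem}MB suggested=${suggested}MB"
}

swap_range_mb 32
```

For a 32 MB machine this prints minimum=32MB suggested=64MB.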
In addition to the system files, the answerbook will take up 164 MB of disk space. It can be used from the CD-ROM, but for better speed in looking up answers you should install it if you have enough space to spare.
A package is the SVR4 mechanism for "standardizing" software installation. Sun is using this as the default format for distributing add-on software for Solaris 2.x.
Packages can be installed or de-installed with the commands pkgadd or pkgrm, which are the standard SVR4 commands. In addition, Sun has the swm utility, which is a text-based facility, and swmtool, which is the GUI version.
Be careful with space, as lots of files fill up the /var/sadm/install/ directory.
The following is a summary of pkg* commands:
Starting with Solaris 2.2, Sun introduced a new scheme for automatically mounting removable media. Sun provided two programs for management: vold and rmmount. The vold (volume daemon) polls the devices to see if anything is present, and the rmmount program (removable media mounter) mounts the disk.
NOTE: On most SPARCstations, you have to run the volcheck command after inserting the floppy. This is necessary because the system does not poll the floppy drive; if it did, the drive would wear out rather quickly.
An advantage of this scheme is that any user can mount and unmount floppies at will (you don't have to be the root user). Also, you can do some neat things like starting the Audio CD player when an audio CD is inserted. It is also extensible, giving developers the ability to write their own actions.
There are some minor drawbacks as well. For one, you can't just access /dev/rfd0 to use a floppy. Now you have to use longer names, like /vol/dev/rdsk/floppy0. You must mount CDs with the format /cdrom/VOLNAME/SLICE.
When you read or write to a nonsystem disk with tar, cpio, and so on, you can put in a disk and run the volcheck command. Then you can use the tar command on the device /vol/dev/rfd0/unlabeled.
On Solaris 2.3 and later, the device can be defined as either /vol/dev/rdiskette0/unlabeled or /vol/dev/aliases/floppy0.
To use the old scheme, go to the /etc/rc2.d/ directory and remove the S*volmgt link. Voila, you should be able to access the disk.
Solaris 2 came with some significant security enhancements over Solaris 1. One of these enhancements is that there is no + in hosts.equiv. Root logins are not allowed anywhere but at the console, and all accounts require a password.
You can enable root logins over the net, but you must edit the /etc/default/login file. Here you must comment out the line CONSOLE=line. You can still use the /etc/hosts.equiv file, but there is no default that comes with it.
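Commenting out the CONSOLE line can be done with any editor. As a hedged sketch of the same edit done with sed, the helper below (allow_remote_root is a made-up name) writes the changed file to standard output and leaves the original untouched:

```shell
#!/bin/sh
# allow_remote_root FILE   (hypothetical helper)
# Writes FILE to standard output with any active CONSOLE= line
# commented out. The original file is not modified.
allow_remote_root() {
    sed 's/^CONSOLE=/#CONSOLE=/' "$1"
}

# Usage sketch (run as root, and keep a backup):
#   cp /etc/default/login /etc/default/login.orig
#   allow_remote_root /etc/default/login.orig > /etc/default/login
```

Lines that are already commented out do not match the pattern, so running it twice is harmless.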
The console line can look like the following:
For a more in-depth look at this subject, you can get the file ftp.anon from the ftp site ftp://ftp.math.fsu.edu/pub/solaris/.
The ftpd server that comes packaged with Solaris 2.3 is nearly complete for anonymous ftp. The only piece missing is /etc/nsswitch.conf, which needs to be set up.
In addition, you must make sure that the filesystem on which ftp resides is not mounted with the nosuid option set.
NOTE: It is very important that there are no files under ~ftp that are owned by ftp. This would cause a serious security breach in your system!
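One way to check the ownership rule in the note is with find. This is only a sketch: owned_by is a made-up helper name, and the /export/ftp path in the usage comment is an assumed location for ~ftp, so substitute your own.

```shell
#!/bin/sh
# owned_by DIR USER   (hypothetical helper)
# Lists everything under DIR that is owned by USER (name or numeric uid).
owned_by() {
    find "$1" -user "$2" -print
}

# Usage sketch; /export/ftp stands in for the real ~ftp directory:
#   owned_by /export/ftp ftp
# Any output is a file or directory that needs a different owner.
```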
In reality, the real question here is how to print from an SVR4 system to a BSD system. The easy way of doing this is through the Admintool GUI, which should take care of these issues.
I have run into cases where I needed to use the command-line lp* tools to set up Axis boxes for remote printing, since they use scripts to redirect printing to IP addresses.
The following is a short guide to setting up a printer on a remote BSD system. I will call the Solaris 2 workstation sol and the 4.1.x server bertha, and the printer name will be printer:
lpsystem -t bsd bertha | # says bertha is a bsd system. |
lpadmin -p printer -s bertha | # creates printer on sol # to be printed on bertha. |
accept printer | # allow queuing. |
enable printer | # allow printing. |
lpstat -t | # check the status. |
If you want to make this printer your default printer, type the following:
lpadmin -d printer
By default, root's shell is /sbin/sh, which is statically linked. You can't just change the name of the shell it uses (for instance, to /sbin/csh), because that doesn't exist. The location of csh is /usr/bin/csh, which is dynamically linked. You won't be able to access this on startup, since the startup script calls this shell before the /usr filesystem is mounted. Certain critical files may also be in /usr/lib, instead of /etc/lib.
There are a couple of ways to go about this. One is to create an alternate root account, such as rootcsh, with a uid of 0 and /bin/csh as its shell. The only drawback is that you now have to remember to change all of root's passwords at the same time.
The other thing you can do is write a script in root's .profile to first check to see whether /usr is mounted. If it is, tell it to initialize another shell, exec /bin/ksh.
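A .profile fragment along those lines might look like the sketch below. It checks that the shell binary itself is reachable rather than inspecting the mount table, which is an assumption of this example and not the only way to do it; usable_shell is a made-up name. The exec is left commented out so that nothing changes until you enable it.

```shell
# Fragment for the end of root's .profile, written in plain Bourne
# shell syntax so that /sbin/sh can read it.

# usable_shell PATH   (hypothetical helper)
# Succeeds if PATH exists and is executable. While /usr is not yet
# mounted, a shell that lives under /usr fails this check.
usable_shell() {
    [ -x "$1" ]
}

# Uncomment to switch shells once /usr is available:
#   if usable_shell /bin/ksh; then
#       exec /bin/ksh
#   fi
```

If the check fails, root simply stays in /sbin/sh, which is what you want during early startup.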
This would indicate that the NFS server is running 4.1.x and needs a patch from Sun to update its network lock daemon (lockd). Without the patch, file locking will not work on files mounted from the NFS client.
To patch this you need to obtain the lockd jumbo patch. It will fix a number of other lock manager problems as well, so this is highly recommended.
The lockd patches are:
100988 (for 4.1.3)
101817 (4.1-4.1.2)
101784 (4.1.3_U1)
102264 (4.1.4)
100518 (Online: Disksuite)
First, add the following line to /etc/system and reboot the system:
set scsi_options & ~0x80
This turns off command queuing, which does not work well with some SCSI disks. In Solaris 2.4 and later, these options can be set for each SCSI bus separately.
On certain disks, all you need to do is decrease the maximum number of queued commands in /etc/system:
forceload: drv/esp
set sd:sd_max_throttle=10
You don't have to load any patches at all. But if you like a running system...
In reality, the only patches that you must have (aside from fixing problems that pop up due to the lack thereof) are patches for security related problems. All other patches should meet one of two problem conditions: Do I now have the problem that this patch fixes; or will I run across this problem in the future if I don't install the patch now?
If the answer to both questions is no, you might not want to install the patch. The reason for this is that in some cases patches can actually cause other bugs in your system, making it worse than before you applied the patch.
The exception to this rule of thumb is the set of patches that come with the Solaris 2.x CDs. According to Sun, they have been tested together and supplement the base OS to the supported level. As a matter of fact, on some systems you may not even be able to boot the machine until the patches are installed.
The following are ftp and WWW sites where patches are available for download:
Sites not sponsored by Sun, accessible for all:
ugle.unit.no:/pub/unix/sun-fixes
ftp.luth.se:/pub/unix/sun/all_patches
SunSites (carry recommended and security patches):
sunsite.unc.edu:/pub/sun-info/sun-patches
sunsite.sut.ac.jp:/pub/sun-info/sun-us/sun-patches
sunsite.doc.ic.ac.uk:/sun/sunsite-sun-info/sun-patches
Sunsolve:
sunsolve1.sun.com:/pub/patches
These are Sun's own sites. They have the recommended patches available for anonymous ftp, packaged as one huge 2.x_Recommended.tar.Z file and as individual patches.
Starting with SunSolve CD 2.1.2 all Sun patches are shipped on the SunSolve CD. Contract customers can get all patches by ftp from Sunsolve or via e-mail and query one of the online Sunsolve databases on the Internet.
48 pseudo-tty's is the default limitation. This can be changed by editing the /etc/system and adding the following line:
set pt_cnt=<num>
After you save the file, you must halt the system and reboot with the -r (reconfigure) option:
<Stop-A>; boot -r
Although you can set a limit at any number you like, you will probably run into a system-specific limitation somewhere. Solaris 2.x supports more than 3000 pseudo-tty's.
Once you install Solaris, you should run the command /usr/sbin/rtc -z $TZ, where $TZ is your timezone. The default root crontab runs /usr/sbin/rtc -c once each day.
Once this is implemented, your system clock will give the proper time, whether you are booted into DOS or Solaris.
There is a way to do this. You can share all the partitions other than the system partitions (such as /, /usr, /var and /opt) between the two OSes. Also, all partitions, including the system partitions, can be mounted and accessed by either OS.
The easiest way to set this up is to do separate Sun installs on two different disks. Then just choose the appropriate disk at boot time with the PROM's boot command.
Setting up both OSes on one disk is a little harder, but not much. You need to partition the disk to allow for both OSes. Almost any partition layout is possible, but one common setup might be:
a:/ for Solaris 2
b:swap (shared)
c: The usual (whole disk)
d:/ for Solaris 1
e:/usr for Solaris 1
g:/usr for Solaris 2
Again, it's most reliable to use suninstall to do the installations. If, for some reason, you choose not to use suninstall, make sure you run installboot for both bootable partitions.
With this setup, you choose between the two OSes in the PROM's boot command as follows:
To boot Solaris 2: | boot |
To boot Solaris 1: | boot disk:d |
NOTE: In boot PROM versions 2.5 or before, the disk:d syntax is not supported, and the PROM cannot boot from root partitions that begin or end beyond 1 GB.
First, save a copy of /etc/nsswitch.conf to another file so that your only copy won't get destroyed.
You can run this command to change the hostname:
/usr/sbin/sys-unconfig
This causes the system to halt and reboot. When it reboots, the system will ask for its name, as well as networking parameters.
Yes. All daemons inherit the umask 0 from init by default. This is a real problem for services such as ftp, which in the standard configuration creates all uploaded files with mode 666.
In order to change the default umask used by daemons, you must execute the following commands in /bin/sh and reboot the machine:
umask 022 # make sure umask.sh gets created with the proper mode
echo "umask 022" > /etc/init.d/umask.sh
for d in /etc/rc?.d
do
    ln /etc/init.d/umask.sh $d/S00umask.sh
done
The trailing .sh of the scriptname is important; if you don't specify it, the script will be executed in a subshell, not in the main shell that executes all other scripts.
This section concentrates on networking issues which frequently pop up.
It used to be nearly impossible. In SunOS 4.1, it was impossible to run DNS name resolution without either NIS or a very kludgey fix.
With Solaris 2.1, however, it became incredibly simple, although the manual for SunOS 5.1 was incorrect. All you have to do is change a line in /etc/nsswitch.conf:
hosts: files dns
This tells the system to look in /etc/hosts first. If the host is not found there, try the DNS. If that doesn't work, give up. Then you have to edit the /etc/resolv.conf file to tell the resolver routines how to contact the DNS nameserver.
You must have the names of machines that are somehow contacted during boot in the files in /etc, and files must appear first in the hosts: line; otherwise, the machine will hang during boot.
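Because a wrong order on the hosts: line can hang the boot, it is worth checking before you reboot. The sketch below reads an nsswitch.conf-style file and reports the first source listed for hosts. The function names are made up for illustration, and the parsing assumes the simple "hosts: source source" layout shown above, without bracketed status actions before the first source.

```shell
#!/bin/sh
# first_hosts_source FILE   (hypothetical helper)
# Prints the first source on the active "hosts:" line of FILE,
# for example "files" for the line "hosts: files dns".
first_hosts_source() {
    awk '$1 == "hosts:" { print $2; exit }' "$1"
}

# files_first FILE   (hypothetical helper)
# Succeeds only if "files" is the first source listed for hosts.
files_first() {
    src=`first_hosts_source "$1"`
    [ "$src" = "files" ]
}

# Usage sketch:
#   files_first /etc/nsswitch.conf || echo "warning: files is not first"
```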
This service controls which of the resolver services are read from NIS, which of them are read from NIS+, which are read from the files in /etc and which are from DNS.
A common example would be:
hosts: nis files
which means ask NIS for host info and, if it's not found, try the local machine's host table as a fallback.
Advice: If you're not using NIS or DNS, suninstall probably put the right version in. If you are, ensure that hosts and passwd come from the network. However, many of the other services seldom if ever change. When was the last time you added a line to /etc/protocols?
If your workstation has a local disk, it may be better to have programs on your machine look up these services locally, so use these files.
Terminology: Sun worried over the term resolver, which technically means any get info routine (getpwent(3), gethostbyname(3), and so on), but is also specifically attached to the DNS resolver. Therefore, they used the term source to mean the things after the colon (files/DNS/NIS/NIS+) and database to mean the thing before the colon (passwd/group/hosts/services/netgroup).
A complete discussion can be found in nsswitch.conf(4). To see this man page, just type:
man -s 4 nsswitch.conf
In order to run the NIS server, you must get the Solaris network transition kit from Sun. Here are the versions that have been released:
NSkit 1.2 is available for SPARC and x86.
With NIS+ clients, each client doesn't hard bind to the servers in the way that NIS clients do. The clients have a list of NIS+ servers within the cold-start file. When the clients need to do a lookup, they do a type of broadcast called a manycast and talk to the first server that responds. In this way, each client can be sure to use the lightest loaded server for the request.
To request NIS+ server services, you must start rpc.nisd with the -B switch. In order to start this, you can edit the server's /etc/init.d/rpc file and change the line
EMULYP="-Y"
to read
EMULYP="-Y -B"
Once you restart the rpc.nisd server, you should be good to go.
Yes. In Solaris 2.x, you have an extra feature in ifconfig that allows having more than one IP address per interface.
The syntax for ifconfig is ifconfig IF:N (ip address) up, where IF is an interface, such as le0, and N is a reference number from 1 to 255.
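To make the syntax concrete, here is a small dry-run sketch. It only prints the ifconfig command and does not run it; alias_cmd is a made-up name, and 192.0.2.10 is a documentation-only address used as a placeholder.

```shell
#!/bin/sh
# alias_cmd IF N ADDRESS   (hypothetical helper)
# Prints (does not run) the ifconfig command that would bring up the
# logical interface IF:N with ADDRESS. Rejects N outside 1..255.
alias_cmd() {
    if [ "$2" -ge 1 ] 2>/dev/null && [ "$2" -le 255 ]; then
        echo "ifconfig $1:$2 $3 up"
    else
        echo "alias_cmd: N must be between 1 and 255" >&2
        return 1
    fi
}

alias_cmd le0 1 192.0.2.10
```

This prints ifconfig le0:1 192.0.2.10 up, which you can then run as root once you have substituted a real address.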
The first thing you should do is run the truss command as follows:
truss -f -o (file) (command) (arguments)
This will put a trace of all system calls in file. truss is a good place to start troubleshooting many failures, such as insufficient permissions on files and others.
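A truss trace can get long, so it helps to pull out just the failed calls. truss marks a failed system call with Err# followed by the error number and name, so a simple grep is enough. In this sketch, failed_calls is a made-up helper name and myprog is a placeholder command.

```shell
#!/bin/sh
# failed_calls TRACEFILE   (hypothetical helper)
# Prints the lines of a truss output file that record a failed system
# call. truss marks those with "Err#<number>" and the errno name.
failed_calls() {
    grep 'Err#' "$1"
}

# Usage sketch ("myprog" is a placeholder):
#   truss -f -o /tmp/trace.out myprog
#   failed_calls /tmp/trace.out
```

Not every failed call is a problem, since programs routinely probe for files that may not exist, so read the result with the program's real symptom in mind.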
You must have support for the DPS extension in the X server to display Answerbook. Most common UNIX workstations support the DPS extension, but most PC X packages don't come with DPS support, and you may have to buy the support separately for your implementation of X terminals.
You can use ghostview as a replacement for the answerbook viewer. Unfortunately, the hypertext links won't work with ghostview.
Your best bet (if you can't get DPS extension support) is to install a client side Display PostScript extension. You can get Adobe's DPS-NX from Bluestone; their WWW link is http://www.bluestone.com.
The PPP that was shipped with Solaris 2.3 will not inter-operate with other PPP implementations. You need to get patch #101425 to fix this.
You're using gcc without properly installing the gcc fixed include files. Or you ran fixincludes after installing gcc without moving the gcc supplied varargs.h and stdarg.h files out of the way and moving them back again later. This often happens when people install gcc from a binary distribution. If there's a tmp directory in gcc's include directory, fixincludes didn't complete. You should have run just-fixinc instead.
Another possible cause is using gcc -I/usr/include.
On bootup, the first invocation of ps will try to recreate /tmp/ps_data. When it does this, ps scans the /dev tree. You may have a loop in /dev, causing ps to run forever.
In most cases, the loop is caused by the symbolic link /dev/bd.off. This link should be pointing to /dev/term/b, but it sometimes gets truncated and points to /dev instead.
To fix this, just recreate the link like so:
rm -f /dev/bd.off
ln -s /dev/term/b /dev/bd.off
You may want to use the truss utility to determine whether or not this is the true cause of the problem.
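Before recreating the link, you can also look at where it currently points. The sketch below parses ls -l output, so it does not depend on a readlink command being installed; link_target is a made-up helper name.

```shell
#!/bin/sh
# link_target PATH   (hypothetical helper)
# Prints what the symbolic link PATH points to, by cutting everything
# up to and including " -> " from the "ls -l" output.
link_target() {
    ls -l "$1" | sed 's/.* -> //'
}

# Usage sketch:
#   t=`link_target /dev/bd.off`
#   [ "$t" = "/dev/term/b" ] || echo "bd.off points to $t"
```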
First off, make sure you have /usr/ccs/bin/m4 installed. It came packaged with SUNWbtool.
This also may be related to bugs in Solaris 2.3 and various revisions of patches. syslogd is broken in all 101318 patches between level 42 and 50, but it works in 101318-54.
If you are using Solaris 2.4, you might need patches 102534 and 102697.
Some vendors still ship a version of RPC/NFS that allows, at most, 8 groups in the client credentials. Root on Solaris is, by default, in 10 groups. As a result, the Solaris 2.x mount command will send AUTH_UNIX credentials that are too big for the remote mount daemon to cope with, resulting in the "Invalid client credential" error. Workaround: Put root and all your users in 8 or fewer groups.
You must log out and log in again for changes in the number of groups to take effect (or exit root's shell and re-su).
Starting with Solaris 2.4, a kernel workaround to limit NFS readdir requests to 1024 bytes was disabled by default. This breaks interoperability with buggy old NFS implementations (such as SunOS 3.2, Ultrix, and NeXT).
There are two workarounds. The first is to mount all filesystems from such servers with rsize=1024.
The second requires a patch for bugid #1193696 (101945-29 or later for SPARC, 101946-24 or later for x86). With the patch installed, edit /etc/system and add:
set nfs:nfs_shrinkreaddir = 1
and reboot.
Once again, the cause is a buggy patch. Patch 101945-17 actually introduced a bug in the NFS client code that causes programs using NFS locking to go into an uninterruptible read.
If you use truss, it will show the program sleeping in the read statement, while top shows it using all the CPU.
The fix for this is to install a patch for bug-id #1198278 on all of your NFS clients. If you have SPARC, you will need patch #101945-29 or later, and if you are using x86 you will need patch #101946-24.
You can implement a workaround until you get the patch. Just mount NFS filesystems with noac. Unfortunately, you will get a significant performance drop by doing this.
You should be able to boot with the -as switch. When the system does boot, it will ask you a number of questions, including the name of the system file. At this point, you can either use the previous /etc/system file (if you remembered to make a copy of it) or you can specify /dev/null.
With the introduction of sendmail V8 for Solaris 2.x in patch form and in Solaris 2.5, a bug in sendmail.cf has suddenly started to appear. The end-of-line character is not defined for the ethernet mailer, causing sendmail to send bare newlines in violation of the SMTP protocol, which requires CR-NL. To fix, find the following line in sendmail.cf:
Mether, P=[TCP], F=msDFMuCX, S=11, R=21, A=TCP $h
and change it to:
Mether, P=[TCP], F=msDFMuCX, S=11, R=21, A=TCP $h, E=\r\n
To be on the safe side, check all lines starting with M that contain P=[TCP] or P=[IPC]. They all should use E=\r\n. This bug is also fixed in the latest Solaris 2.x sendmail patches.
Okay, let's try something else. Solaris 2.x will still send large packets over such links but without the "don't fragment" bit set. On several occasions, links have been reported that don't properly handle such packets. They're not fragmented; instead they're silently dropped.
So if the previous fix doesn't work, you can resort to the following drastic measure, which negatively impacts network performance:
/usr/sbin/ndd -set /dev/tcp tcp_mss_max 536
536 is the standard packet size that is guaranteed to work by virtue of the fact that most systems will communicate outside the local net with packets that big. If the connection then starts to work, it's time to find the largest value that works.
It's also worth mentioning that the ip_path_mtu_discovery needs to be applied at both sides of a connection to fully work; applied at one side, it will only affect outgoing large packets. (Downloads from the site will succeed, but uploads from another Solaris 2.x machine without the workaround applied may still fail.) The tcp_mss_max workaround need only be applied at one side.
If you need the tcp_mss_max workaround for some sites, there is a problem on the link between you and those sites. Get it fixed. traceroute will tell you where the problem lies. Try traceroute host size, for varying sizes. If traceroute without a size parameter works, but traceroute with a size parameter of 1460 fails at some hop, the connection between that hop and the next is broken.
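Finding the largest value that works is a search problem, and a binary search saves a lot of trial and error. The sketch below is an illustration under stated assumptions: find_max_size is a made-up name, the probe is whatever command you supply that succeeds when a given size works, the low value is assumed to work, and every size below a working size is assumed to work too.

```shell
#!/bin/sh
# find_max_size PROBE LOW HIGH   (hypothetical helper)
# PROBE is a command or function that takes one size argument and
# succeeds if that size works. LOW is assumed to work. Prints the
# largest working size between LOW and HIGH, found by binary search.
# Note: uses the global variables probe, low, high and mid.
find_max_size() {
    probe=$1
    low=$2
    high=$3
    while [ "$low" -lt "$high" ]; do
        mid=`expr \( "$low" + "$high" + 1 \) / 2`
        if "$probe" "$mid"; then
            low=$mid            # mid works, search above it
        else
            high=`expr "$mid" - 1`   # mid fails, search below it
        fi
    done
    echo "$low"
}

# Usage sketch: write your own probe, for example one that sets
# tcp_mss_max with ndd to the given size and then tries a transfer,
# and call:
#   find_max_size my_probe 536 1460
```

Each probe roughly halves the remaining range, so going from 536 to 1460 takes about ten attempts.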
If you have remote, but unshared, filesystems, such as /, /var, /var/adm, and so on, they must be mounted with the llock option. This is implemented on root for Solaris 2.x, but not for remote /var or /var/adm.
The system will hang when it tries to work with the tmp files if you don't specify llock, and the hang happens early in the boot process. Also, lpsched may fail if it can't lock /var/spool/lp/SCHEDLOCK.
To get around this problem, you need to add the llock option to the mount options for /var and/or /var/adm. It should be fixed in /etc/rcS.d/S70buildmnttab.sh.
There were a number of changes with the sd driver between 2.3 and 2.4. In particular, the code that resets the drive to the 512 block size is no longer called in the case of data overrun. Therefore, it is no longer possible to install 2.4 from a local non-Sun CDROM drive.
Really, the only workarounds to this problem are either to borrow a SunCD or to mount the CD on a remote machine and do a network installation.
This is a problem specific to SPARC versions of Solaris 2.x. However, if you do have a CD-ROM drive that has been modified to use a 512-byte block size as the default, it should work just fine.
When you compile, the location of the library is not being recorded, so the library can't be found when the program runs. Therefore, you have to specify where the library is with the -R switch:
cc -L/usr/dt/lib -L/usr/openwin/lib -R/usr/dt/lib -R/usr/openwin/lib \
    xprog.c -lXm -lXt -lX11
There are two possible causes for this kernel memory leak.
There's a bug in the volume management device driver that causes it to leak memory when it is unloaded. Fix it with patch 101907-05 (sparc) or 101908-07 (x86). This bug especially affects systems that are not running vold, as it is triggered when the kernel decides to unload unused device drivers.
Also, the NFS client cache will cache too much. A simple workaround is to add set nrnode = 1000 to /etc/system and reboot. You may want to make this larger or smaller, depending on how much memory you have. A good rule of thumb is about 20 to 30 rnodes per megabyte of memory.
Another possible candidate is an overflow in /tmp or other swap based tmpfs filesystems. Check this possibility out with df/du.
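To make the rnode rule of thumb concrete, here is a small sketch of the arithmetic. The function names are made up for illustration; only the figure of 20 to 30 rnodes per megabyte and the set nrnode line come from the text above.

```python
from typing import Tuple


def nrnode_range(memory_mb: int) -> Tuple[int, int]:
    """Return the (low, high) nrnode range for a machine with memory_mb
    megabytes of RAM, using 20 to 30 rnodes per megabyte."""
    return 20 * memory_mb, 30 * memory_mb


def etc_system_line(nrnode: int) -> str:
    """Format the line to add to /etc/system (illustrative helper)."""
    return "set nrnode = %d" % nrnode
```

For a 64 MB machine, nrnode_range(64) gives a range of 1280 to 1920, so the suggested value of 1000 is on the conservative side there.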
For some reason, the nfs service has disappeared from your /etc/services file, NIS map, or NIS+ table. You need to have an entry like the following:
nfsd 2049/udp nfs # NFS server daemon (clts)
nfsd 2049/tcp nfs # NFS server daemon (cots)
If you use NIS+, you must make sure that the NIS+ entry is readable for the machine executing nfsd.
If you used your SunOS 4.x services file, that would explain it: SunOS 4.x doesn't have an entry for nfsd in /etc/services, but Solaris 2.x requires one.
This will usually not happen until you upgrade to Solaris 2.4 or 2.5. Solaris 2.3 and earlier would always consult /etc/services, regardless of what nsswitch.conf said. /etc/services does contain the right NFS entries. Solaris 2.4 and earlier don't have an entry for NFS over tcp.
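If you want to check a services file for the entries shown above, a sketch along the following lines would do. It is only an illustration: the function name is made up, it looks at text in /etc/services format only, and it knows nothing about NIS maps or NIS+ tables.

```python
from typing import List


def missing_nfsd_entries(services_text: str) -> List[str]:
    """Return the nfsd port/protocol pairs (2049/udp, 2049/tcp) that
    are absent from text in /etc/services format."""
    wanted = {"2049/udp", "2049/tcp"}
    found = set()
    for line in services_text.splitlines():
        # Drop comments, then split into name, port/protocol, aliases.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nfsd" and fields[1] in wanted:
            found.add(fields[1])
    return sorted(wanted - found)
```

An empty result means both entries are present. Keep in mind that a file from Solaris 2.4 or earlier is expected to lack the tcp entry.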
No, nothing that drastic. However, if root can't find or use a shell, you have to boot into single user mode from the CD.
Once you've done that, mount the root file system and change the designated shell that root uses in /etc/passwd back to /bin/sh.
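Editing the file by hand is usually simplest, but the change itself is small enough to sketch. The function below is hypothetical and works on the text of a passwd file, wherever you have mounted the root file system. It replaces only the last field of the root entry and leaves every other line alone.

```python
def reset_root_shell(passwd_text: str, shell: str = "/bin/sh") -> str:
    """Return passwd_text with root's login shell set to shell."""
    out = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # A passwd entry has seven colon-separated fields; the last
        # one is the login shell.
        if len(fields) == 7 and fields[0] == "root":
            fields[6] = shell
            line = ":".join(fields)
        out.append(line)
    # Preserve the trailing newline if the input had one.
    trailer = "\n" if passwd_text.endswith("\n") else ""
    return "\n".join(out) + trailer
```

Whatever method you use, keep a copy of the original file until root can log in again.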
In Solaris 2.4, there is a combination of problems that makes running with quotas or with near-full disks almost impossible. One problem is related to writing messages to /dev/console, which requires interrupt switching and makes the machine appear dead. Another is that clients cache up to 2 MB of failed writes and keep retrying them, which just generally beats the server to death.
The fix to this problem is (of course) a patch. You need to get kernel patch 101945-32 for SPARC, or patch 101946-29 for x86. Apply these patches not only to the server, but to the clients as well.
Solaris 2.5 and Solaris 2.4 kernel patch 101945-34 and later have a bug in their TCP retransmission algorithm that causes excessive re-transmissions over slow links; Sun's bug ID is #1233827.
A workaround for this bug is to run the following commands at system boot, for example, by adding them to /etc/init.d/inetinit (values are in milliseconds):
/usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_min 3000
/usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
Someone else suggested different changes, because with the above, each retransmit due to a lost packet will take a long time. The following uses a smaller value for the minimal retransmit interval but also limits the outgoing packet size to 536 bytes, so retransmitted packets are smaller.
/usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_min 1000
/usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
/usr/sbin/ndd -set /dev/tcp tcp_mss_max 536
Patches for this bug have been released, as listed below. You should not combine the patches with the tcp_rexmit_interval settings listed here.
101945-42: SunOS 5.4: patch for kernel
103169-06: SunOS 5.5: ip driver and ifconfig fixes
103447-03: SunOS 5.5: tcp patch
103448-03: SunOS 5.5_x86: tcp patch
103170-06: SunOS 5.5_x86: ip driver and ifconfig fixes
103582-01: SunOS 5.5.1: /kernel/drv/tcp patch
103630-01: SunOS 5.5.1: ip and ifconfig patch
103631-01: SunOS 5.5.1_x86: ip and ifconfig patch
103581-01: SunOS 5.5.1_x86: /kernel/drv/tcp patch
103632-01: SunOS 5.5.1_ppc: ip and ifconfig patch
103583-01: SunOS 5.5.1_ppc: /kernel/drv/tcp patch
In the first release of Solaris 2.5, NFSv3 has a bug that manifests itself when it tries to calculate the block allocations returned by the stat function. The server will actually report a value that is 16 times the correct value. In turn, the client returns a value 16 times smaller to the stat function.
The two errors cancel each other out, so an unpatched Solaris 2.5 server and an unpatched Solaris 2.5 client appear to work together without any problems.
However, on the clients with the bug, files on servers returning the right value will have a block count 16 times too small. This will fragment NFSv3 swap files in Solaris 2.5, since they will appear to have holes in them and swap will reject them. Should you run across this problem, you need to patch your server.
If you have a situation where the clients are correct and the servers have the bug, files will appear to have 16 times as many blocks allocated as they should have. When this happens, a lot more damage can be done than just a wacky du output.
There are two ways to fix this problem. One is to upgrade to Solaris 2.5.1, and the other is to install the 2.5 NFS patch: 103226 for the SPARC version, or 103227 for the x86 version. You should only use revision 04 or later of these patches, however.
Make sure, if you do install patches, that you install them on both the clients and the servers. This is especially important for 2.5 clients using NFS swap files.
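The way the two halves of this bug interact is easier to see with numbers. The sketch below simply models the description above: a buggy server multiplies the block count by 16, and a buggy client divides what it receives by 16. The function name and the sample figure of 160 blocks are made up for illustration.

```python
def reported_blocks(actual: int, server_buggy: bool, client_buggy: bool) -> int:
    """Block count that stat would return on the client for a file
    that really occupies actual blocks."""
    # An unpatched server reports 16 times the correct value.
    on_wire = actual * 16 if server_buggy else actual
    # An unpatched client divides whatever it receives by 16.
    return on_wire // 16 if client_buggy else on_wire
```

For a file of 160 blocks, both sides unpatched or both sides patched gives 160. A patched server with an unpatched client gives 10, which is the case that makes swap files appear to have holes. An unpatched server with a patched client gives 2560.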
This section covers some common questions in the arena of software development.
Sun has dropped their old K&R C compiler, supposedly to create a market for multiple compiler suppliers to provide better performance and features. Here are some of the contenders:
When you install gcc, don't make the mistake of installing GNU binutils or GNU libc; they are not as capable as their counterparts that come with Solaris 2.x.
You'll need to get several patch kits for X11R5 if you are running Solaris 2.1. The majority of them require gcc 2.3.3 or later, and you must have run fixincludes when you installed the gcc software.
The recommended patch kit is identified as R5.SunOS5.patch.tar.Z. You can download this from the ftp site ftp.x.org:/R5contrib. This will work fine with gcc 2.3.3 or later, as well as the SunPRO C compiler.
With Solaris 2.3, X11R6 will compile straight out of the box.
The libraries are now split between two directories:
/usr/lib
/usr/ccs/lib
The libraries of importance are as follows:
/usr/lib:
libsocket--socket functions
libnsl--network services library
/usr/ccs/lib:
libgen--regular expression functions
libcurses--the SVR4 curses/terminfo library
With any luck, in this chapter, you have found the answer to a question that was stumping you. Though Sun's documentation is usually fairly thorough, there is always something that slipped through that was not discussed sufficiently.
I hope you found this information useful and applicable to your particular system. If nothing else, I hope that you at least found out what your particular problem wasn't.
©Copyright,
Macmillan Computer Publishing. All rights reserved.