Because, in a word, it's useful.
Many standard utilities (rdist, make, cron, etc.) allow you to specify a command to run at a certain time. Usually, this command is simply passed to the Bourne shell, which means that you can execute whole scripts, should you choose to do so.
Furthermore, and this is what this tutorial is all about, you can put commands in a file and execute them all at once. This is known as a script.

Lastly, Unix runs Bourne shell scripts when it boots. If you want to modify the boot-time behavior of a system, you need to learn to write and modify Bourne shell scripts.

Here's a simple script:
#!/bin/sh
# Rotate procmail log files
cd /homes/arensb/Mail
rm procmail.log.6		# This is redundant
mv procmail.log.5 procmail.log.6
mv procmail.log.4 procmail.log.5
mv procmail.log.3 procmail.log.4
mv procmail.log.2 procmail.log.3
mv procmail.log.1 procmail.log.2
mv procmail.log.0 procmail.log.1
mv procmail.log procmail.log.0
There are several things to note here: first of all, comments begin with a hash (#) and continue to the end of the line (the first line is special, and we'll cover that in just a moment).
Secondly, the script itself is just a series of commands. I use this script to rotate log files, as it says. I could just as easily have typed these commands in by hand, but I'm lazy, and I don't feel like it. Plus, if I did, I might make a typo at the wrong moment and really make a mess.
Some versions of Unix allow whitespace between #! and the name of the interpreter. Others don't. Hence, if you want your script to be portable, leave out the blank.
A script, like any file that can be run as a command, needs to be executable: save this script as rotatelog and run

chmod +x rotatelog

to make it executable. You can now run it by running

./rotatelog
Unlike some other operating systems, Unix allows any program to be used as a script interpreter. This is why people talk about ``a Bourne shell script'' or ``an awk script.'' One might even write a more script, or an ls script (though the latter wouldn't be terribly useful). Hence, it is important to let Unix know which program will be interpreting the script.
When Unix tries to execute the script, it sees the first two characters (#!) and knows that it is a script. It then reads the rest of the line to find out which program is to execute the script. For a Bourne shell script, this will be /bin/sh. Hence, the first line of our script must be
#!/bin/sh
After the command interpreter, you can have one, and sometimes more, options. Some flavors of Unix only allow one, though, so don't assume that you can have more.
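For instance, sh itself accepts -x (trace mode) as an option on the interpreter line. A small sketch (the file name /tmp/traceme-demo.sh is made up for the example):

```shell
# Sketch: an interpreter line that passes one option to sh.
# The -x option makes sh print each command (to standard error)
# before running it.
cat > /tmp/traceme-demo.sh <<'EOF'
#!/bin/sh -x
echo hello
EOF
chmod +x /tmp/traceme-demo.sh
/tmp/traceme-demo.sh
```

Running the script prints the trace line (``+ echo hello'') on standard error, followed by the normal output on standard output.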
To assign a value to a variable, use

VAR=value

and to use the value of the variable later, use

$VAR

or

${VAR}

The latter syntax is useful if the variable name is immediately followed by other text:
#!/bin/sh
COLOR=yellow
echo This looks $COLORish
echo This seems ${COLOR}ish

prints

This looks
This seems yellowish

There is only one type of variable in sh: strings. This is somewhat limited, but is sufficient for most purposes.
Environment variables are passed to subprocesses. Local variables are not.
By default, variables are local. To turn a local variable into an environment variable, use
export VAR
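A quick sketch of the difference (the variable names here are invented for the demonstration):

```shell
#!/bin/sh
LOCAL_VAR=hello		# local: subprocesses will not see it
ENV_VAR=world
export ENV_VAR		# now part of the environment

# A child shell sees only the exported variable:
sh -c 'echo "local=[$LOCAL_VAR] env=[$ENV_VAR]"'
```

The child prints local=[] env=[world]: the local variable vanished on the way down, while the environment variable survived.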
Here's a simple wrapper for a program:

#!/bin/sh
NETSCAPE_HOME=/usr/imports/libdata
CLASSPATH=$NETSCAPE_HOME/classes
export CLASSPATH
$NETSCAPE_HOME/bin/netscape.bin

Here, NETSCAPE_HOME is a local variable; CLASSPATH is an environment variable. CLASSPATH will be passed to netscape.bin (netscape.bin uses the value of this variable to find Java class files); NETSCAPE_HOME is a convenience variable that is only used by the wrapper script; netscape.bin doesn't need to know about it, so it is kept local.
The only way to unexport a variable is to unset it:

unset VAR

This removes the variable from the shell's symbol table, effectively making it as if the variable had never existed; as a side effect, the variable is also unexported.
Also, if you have a function with the same name as the variable, unset will delete that function as well.
Also, note that if a variable was passed in as part of the environment, it is already an environment variable when your script starts running. If there is a variable that you really don't want to pass to any subprocesses, you should unset it near the top of your script. This is rare, but it might conceivably happen.
If you refer to a variable that hasn't been defined, sh substitutes the empty string. For example,

#!/bin/sh
echo aaa $FOO bbb
echo xxx${FOO}yyy

prints

aaa bbb
xxxyyy
If you have more than nine command-line arguments, you can use the shift command: this discards the first command-line argument, and bumps the remaining ones up by one position: $2 becomes $1, $8 becomes $7, and so forth.
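A sketch of shift in action, using set (described below) to fake a set of command-line arguments:

```shell
#!/bin/sh
# Walk through all the arguments, however many there are.
set foo bar baz quux		# pretend these are the command-line arguments

count=0
while [ $# -gt 0 ]; do
	count=`expr $count + 1`
	echo "argument $count is: $1"
	shift			# discard $1; $2 becomes $1, and so on
done
echo "processed $count arguments"
```

After the loop, $# is zero: every argument has been consumed.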
The variable $0 (zero) contains the name of the script (argv[0] in C programs).
Often, it is useful to just list all of the command-line arguments. For this, sh provides the variables $* (star) and $@ (at). Each of these expands to a string containing all of the command-line arguments, as if you had used $1 $2 $3...
The difference between $* and $@ lies in the way they behave when they occur inside double quotes: $* behaves in the normal way, whereas $@ creates a separate double-quoted string for each command-line argument. That is, "$*" behaves as if you had written "$1 $2 $3", whereas "$@" behaves as if you had written "$1" "$2" "$3".
Finally, $# contains the number of command-line arguments that were given.
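A sketch of the "$*" vs. "$@" difference. It uses a small helper function, count, to report how many words each form expands to (functions are covered later on):

```shell
#!/bin/sh
# count echoes the number of arguments it was given.
count () {
	echo $#
}

set "one two" three		# two arguments; the first contains a space

star=`count "$*"`		# one word:  "one two three"
at=`count "$@"`			# two words: "one two" and "three"
echo "\"\$*\" gave $star word(s), \"\$@\" gave $at word(s)"
```

"$*" collapses everything into a single word, while "$@" preserves the original argument boundaries, which is almost always what you want when passing arguments along.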
Patterns like ${VAR:-default} (substitute a default value) and ${VAR:=default} (assign a default value) test whether VAR is set and non-null. Without the colon (${VAR-default}, ${VAR=default}), they only test whether VAR is set.
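A quick demonstration of the colon's effect, using the ${VAR-default} family (the variable names are invented):

```shell
#!/bin/sh
EMPTY=""			# set, but null
unset NOTSET			# not set at all

echo "with colon:    [${EMPTY:-dflt}] [${NOTSET:-dflt}]"
echo "without colon: [${EMPTY-dflt}] [${NOTSET-dflt}]"
```

The first line prints [dflt] [dflt]; the second prints [] [dflt], because without the colon a set-but-null variable counts as "set" and suppresses the default.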
As a special case, when a glob begins with * or ?, it does not match files that begin with a dot. To match these, you need to specify the dot explicitly (e.g., .*, /tmp/.*).
Note to MS-DOS users: under MS-DOS, the pattern *.* matches every file. In sh, it matches every file that contains a dot.
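A demonstration of the dot-file rule (the directory name is made up; the script creates and removes it itself):

```shell
#!/bin/sh
# Sketch: * skips dot files; .* matches them (along with . and ..).
dir=/tmp/globdemo.$$
mkdir $dir
touch $dir/visible $dir/.hidden

here=`pwd`
cd $dir
star=`echo *`			# matches only: visible
dots=`echo .*`			# matches: . .. .hidden
cd $here
rm -r $dir

echo "* matched:  $star"
echo ".* matched: $dots"
```

Note that .* also matches ``.'' and ``..'', which is a common source of accidents with commands like rm.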
If you run

echo * MAKE $$$ FAST *

it won't do what you want: first of all, sh will expand the *s and replace them with a list of all the files in the current directory. Then, since any number of tabs or blanks can separate words, it will compress the three spaces into one. Finally, it will replace the first instance of $$ with the PID of the shell. This is where quoting comes in.
sh supports several types of quotes. Which one you use depends on what you want to do.
The backslash is itself special, so to escape it, just double it: \\.
Single quotes

'foo'

work pretty much the way you'd expect: anything inside them (except a single quote) is quoted. You can say

echo '* MAKE $$$ FAST *'

and it'll come out the way you want it to.
Note that a backslash inside single quotes also loses its special meaning, so you don't need to double it. There is no way to have a single quote inside single quotes.
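You can, however, fake it: end the single-quoted string, add a backslash-escaped quote, and start a new single-quoted string. The pieces run together into one word:

```shell
#!/bin/sh
# 'don'\''t' is really three pieces run together: 'don' then \' then 't'
echo 'don'\''t'
```

This prints don't.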
Double quotes

"foo"

preserve spaces and most special characters. However, variables and backquoted expressions are expanded and replaced with their value.
Backquotes are different:

`cmd`

Here, the expression is evaluated as a command, and replaced with whatever the expression prints to its standard output. Thus,

echo You are `whoami`

prints

You are arensb

(if you happen to be me, which I do).
The { commands; } variant is somewhat more efficient, since it doesn't spawn a true subshell. This also means that if you set variables inside of it, the changes will be visible in the rest of the script.
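A sketch of the difference (the variable name is invented):

```shell
#!/bin/sh
X=before

( X=subshell )		# runs in a separate subshell; the change is lost
echo "after ( ): X=$X"

{ X=braces; }		# runs in the current shell; the change sticks
echo "after { }: X=$X"
```

The first echo prints X=before; the second prints X=braces.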
The : (colon) command does nothing and ignores its arguments, which makes it useful for triggering variable expansion purely for its side effects:

: ${VAR:=default}

sets VAR to default if it is unset or null, and otherwise leaves it alone.
set -x turns on the x option to sh; set +x turns it off.
set args... sets the command-line arguments to args.
if condition ; then
	commands
[elif condition ; then
	commands]...
[else
	commands]
fi

That is, an if-block, optionally followed by one or more elif-blocks (elif is short for ``else if''), optionally followed by an else-block, and terminated by fi.
The if statement pretty much does what you'd expect: if condition is true, it executes the if-block. Otherwise, it executes the else-block, if there is one. The elif construct is just syntactic sugar, to let you avoid nesting multiple if statements.
#!/bin/sh
myname=`whoami`
if [ $myname = root ]; then
	echo "Welcome to FooSoft 3.0"
else
	echo "You must be root to run this script"
	exit 1
fi

The more observant among you (or those who are math majors) are thinking, ``Hey! You forgot to include the square brackets in the syntax definition!''

Actually, I didn't: [ is actually a command, /bin/[, and is another name for the test command.
This is why you shouldn't call a test program test: if you have ``.'' at the end of your path, as you should, executing test will run /bin/test.
The condition can actually be any command. If it returns a zero exit status, the condition is true; otherwise, it is false. Thus, you can write things like
#!/bin/sh
user=arnie
if grep $user /etc/passwd; then
	echo "$user has an account"
else
	echo "$user doesn't have an account"
fi
while condition; do
commands
done
As you might expect, the while loop executes commands as long as condition is true. Again, condition can be any command, and is true if the command exits with a zero exit status.
A while loop may contain two special commands: break and continue.
break exits the while loop immediately, jumping to the next statement after the done.
continue skips the rest of the body of the loop, and jumps back to the top, to where condition is evaluated.
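For example, this small sketch prints 1, 2, 4, and 5: continue skips 3, and break ends the loop once the counter passes 5:

```shell
#!/bin/sh
i=0
printed=""
while [ $i -lt 100 ]; do
	i=`expr $i + 1`
	if [ $i -eq 3 ]; then
		continue	# skip 3; jump back to the condition
	fi
	if [ $i -gt 5 ]; then
		break		# leave the loop for good
	fi
	echo $i
	printed="$printed$i"
done
```

Without the break, the loop would have run all the way to 100.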
for var in list; do
commands
done
list is zero or more words. The for construct will assign the variable var to each word in turn, then execute commands. For example:
#!/bin/sh
for i in foo bar baz "do be do"; do
	echo "$i"
done

will print

foo
bar
baz
do be do

A for loop may also contain break and continue statements. They work the same way as in the while loop.
case expression in
pattern)
	commands
	;;
...
esac

expression is a string; this is generally either a variable or a backquoted command.
pattern is a glob pattern (see globbing).
The patterns are evaluated in the order in which they are seen, and only the commands associated with the first pattern that matches will be executed. Often, you'll want to include a ``none of the above'' clause; to do this, use * as your last pattern.
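For example (the filename here is invented):

```shell
#!/bin/sh
filename=report.txt

case "$filename" in
*.c)
	kind="C source"
	;;
*.txt)
	kind="text"
	;;
*)
	kind="something else"	# none of the above
	;;
esac
echo "$filename looks like $kind"
```

Since *.txt is the first pattern that matches, kind is set to ``text'' and the remaining patterns are never tried.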
However, one can do many interesting things by redirecting one or more file descriptors.
(Exercise: why doesn't cat * > zzzzzzz work the way you'd expect?)
This can be used as a mini-file within a script, e.g.,
cat > foo.c <<EOT
#include <stdio.h>
main()
{
	printf("Hello, world!\n");
}
EOT
It is also useful for printing multiline messages, e.g.:
line=13
cat <<EOT
An error occurred on line $line.
See page 98 of the manual for details.
EOT
As this example shows, by default, << acts like double quotes (i.e., variables are expanded). If, however, word is quoted, then << acts like single quotes.
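A sketch of both behaviors side by side (the temporary file names are made up; the script cleans them up):

```shell
#!/bin/sh
line=13
tmp=/tmp/heredemo.$$

# Unquoted word: $line is expanded.
cat <<EOT > $tmp.1
error on line $line
EOT

# Quoted word: $line is taken literally.
cat <<'EOT' > $tmp.2
error on line $line
EOT

expanded=`cat $tmp.1`
literal=`cat $tmp.2`
rm -f $tmp.1 $tmp.2

echo "unquoted gave: $expanded"
echo "quoted gave:   $literal"
```

The unquoted form produces ``error on line 13''; the quoted form leaves the text as ``error on line $line''.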
command1 | command2

is roughly equivalent to

command1 > /tmp/foo
command2 < /tmp/foo

except that no temporary file is created, and both commands can run at the same time.
There is a proverb that says, ``A temporary file is just a pipe with an attitude and a will to live.''
Any number of commands can be pipelined together.
command > filename 2>&1

redirects standard output to filename, then associates file descriptor 2 (standard error) with the same file as file descriptor 1 (standard output); hence both are sent to filename.
This is also useful for printing error messages:
echo "Danger! Danger Will Robinson!" 1>&2
Note that I/O redirections are parsed in the order they are encountered, from left to right. This allows you to do fairly tricky things, including throwing out standard output, and piping standard output to a command.
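For example, these two command lines differ only in the order of their redirections, and behave quite differently (a sketch, with sh -c standing in for any command that writes to both streams):

```shell
#!/bin/sh
# A command that writes "out" to stdout and "err" to stderr:
noisy='echo out; echo err 1>&2'

# Redirect stdout first, THEN point stderr at it:
# both streams end up in /dev/null.
sh -c "$noisy" > /dev/null 2>&1

# Point stderr at wherever stdout is pointing NOW, THEN throw
# stdout away: only "err" survives, and it comes out on stdout.
sh -c "$noisy" 2>&1 > /dev/null
```

The second form is exactly the trick you need to pipe a command's standard error (and only its standard error) into another command.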
A function is defined using
name () {
	commands
}

and is invoked like any other command:

name args...
You can redirect a function's I/O, embed it in backquotes, etc., just like any other command.
One way in which functions differ from external scripts is that the shell does not spawn a subshell to execute them. This means that if you set a variable inside a function, the new value will be visible outside of the function.
A function can use return n to terminate with an exit status of n. Obviously, it can also exit n, but that would terminate the entire script.
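For example (is_yes is an invented helper):

```shell
#!/bin/sh
# Returns 0 (success) if its argument looks like a "yes" answer.
is_yes () {
	case "$1" in
	[Yy]*)	return 0 ;;
	*)	return 1 ;;
	esac
}

if is_yes yes; then
	echo "that was a yes"
fi
if is_yes nope; then
	echo "this line is never printed"
fi
```

Because the function communicates through its exit status, it can be used anywhere a command can: in an if condition, a while condition, or with && and ||.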
basename prints the last component of a pathname:

basename /foo/bar/baz

prints

baz

dirname prints everything but the last component:

dirname /foo/bar/baz

prints

/foo/bar
If test is invoked as [, then it requires a closing bracket ] as its last argument. Otherwise, there must be no closing bracket.
test understands the following expressions, among others:
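A few of the more common ones, sketched (the temporary file name is invented; the script creates and removes it):

```shell
#!/bin/sh
tmp=/tmp/testdemo.$$
echo hello > $tmp

fileok=no; [ -f $tmp ] && fileok=yes		# -f: regular file exists
strok=no;  [ -n "hello" ] && strok=yes		# -n: string is non-empty
numok=no;  [ 3 -lt 5 ] && numok=yes		# -lt: numeric less-than
eqok=no;   [ "abc" = "abc" ] && eqok=yes	# =: string equality

rm -f $tmp
echo "file=$fileok string=$strok number=$numok equal=$eqok"
```

Note that = compares strings while -eq, -lt, and friends compare numbers; test "5" = " 5" fails, but [ 5 -eq 5 ] succeeds.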
echo simply prints its arguments to standard output. It can also be told not to append a newline at the end: under BSD-like flavors of Unix, use

echo -n "string"

Under SystemV-ish flavors of Unix, use

echo "string\c"
For example, this awk command prints the first and third fields (the login name and UID) of each line of /etc/passwd:

awk -F : '{print $1, $3 }' /etc/passwd
The -F : option says that the input records are separated by colons. By default, awk uses whitespace as the field separator.
For example, this sed command substitutes ``bar'' for ``foo'' in its input:

sed -e 's/foo/bar/g'
The trailing g says to replace all instances of ``foo'' with ``bar'' on a line. Without it, only the first instance would be replaced.
By default, tee empties filename before it begins. With the -a option, it appends to filename.
The -n option causes sh to read the script but not execute any commands. This is useful for checking syntax.
The -x option causes sh to print each command to standard error before executing it. Since this can generate a lot of output, you may want to turn tracing on just before the section that you want to trace, and turn it off immediately afterward:
set -x
# XXX - What's wrong with this code?
grep $user /etc/passwd 1>&2 > /dev/null
set +x
Therefore, it is best to keep things simple and linear: do A, then do B, then do C, and exit. If you find yourself writing many nested loops, or building awk scripts on the fly, you're probably better off rewriting it in Perl or C.
For instance, say your script runs gmake. You could write

#!/bin/sh

	...300 lines further down...

/usr/local/bin/gmake foo

However, someone else might have gmake installed somewhere else, so it is better to write

#!/bin/sh
GMAKE=/usr/local/bin/gmake

	...300 lines further down...

$GMAKE foo
Say your script needs to run an editor. You might put

vi $filename

in your script. But let's say that the user prefers to use Emacs as his editor. In this case, he can set $VISUAL to indicate his preference. However,

$VISUAL $filename

is no good either, because $VISUAL might not be set. So use

: ${VISUAL:=vi}
$VISUAL $filename

to set $VISUAL to a reasonable default, if the user hasn't set it.
As we saw above, when Unix executes a script, it first opens the file to find out which program will be the file's interpreter. It then invokes the interpreter, passing it the script's pathname as a command-line argument. The interpreter then opens the file, reads it, and executes it.

From the above, you can see that there is a delay between the time when the OS opens the script and the time when the interpreter opens it. This means that there is a race condition that an attacker can exploit: create a symlink that points to the setuid script; then, after the OS has determined the interpreter, but before the interpreter opens the file, replace that symlink with some other script of your choice. Presto! Instant root shell!
This problem is inherent to the way scripts are processed, and therefore cannot easily be fixed.
Compiled programs do not suffer from this problem, since a.out (compiled executable) files are not closed then reopened, but directly loaded into memory. Hence, if you have an application that needs to be setuid, but is most easily written as a script, you can write a wrapper in C that simply exec()s the script. You still need to watch out for the usual problems that involve writing setuid programs, and you have to be paranoid when writing your script, but all of these problems are surmountable. The double-open problem is not.
Near the top of your script, it is a good idea to put the line

unset IFS

which resets the input field separator to its default value. Otherwise, you inherit $IFS from the user, who may have set it to some bizarre value in order to make sh parse strings differently from the way you expect, and induce weird behavior.
In particular, the user might have ``.'' (dot) as the first element of his path, and put a program called ls or grep in the current directory, with disastrous results.
In general, never put ``.'' or any other relative directory on your path.
I like to begin by putting the line

PATH=

at the top of a new script, then add directories to it as necessary (and only add those directories that are necessary).
I once had a script fail because a user had put a square bracket in his GCOS field in /etc/passwd. You're best off just quoting everything, unless you know for sure that you shouldn't.
For instance, you might write

if [ $answer = yes ]; then
However, $answer might be set to the empty string, so sh would see if [ = yes ]; then, which would cause an error. Better to write
if [ "$answer" = yes ]; then
The danger here is that $answer might be set to -f, so sh would see if [ -f = yes ]; then, which would also cause an error.
Therefore, write

if [ x"$answer" = xyes ]; then

which avoids both of these problems.