ITIC:Introduction to Bash scripting

From Juneday education

What is a script

Remember that Bash is a command line interpreter (a shell). The word line is central here. Bash only processes complete commands if they end with a newline character (typically you press Enter to issue the command for processing by Bash).

Bash can work interactively like this. You enter commands to be processed when you press Enter. But Bash can read files with commands too, and it can read from standard in. Here are some examples of bash working interactively, reading from standard in (using a pipe) and reading from a file called file-with-echo:

$ cat file-with-echo
echo hej
$ echo "echo hej" | bash
hej
$ bash < file-with-echo 
hej
$
$ cat file-with-echo | bash
hej
$

In the example above, we started by showing you the contents of the file file-with-echo using cat. The text inside the file was echo hej. But, hey, that's a valid command line! So if we told bash to read that file from standard in, using redirection and then also a pipe, wouldn't it execute the command line in the file? Yes it would and it did. We can even use echo and a pipe to bash to have bash execute the command from echo, as you saw in the example: echo "echo hej" | bash. Bash read echo hej from standard in, and since it ended with a newline (default behavior from echo), it just executed it. So we created a new instance of Bash, which executed the command echo hej and then exited.

You should be aware that in all of the examples above, we are running bash interactively but start a new instance of bash which reads commands and executes them, then dies (so we are back in interactive mode).

But Bash can also execute a file that contains a special first line called a shebang and that has execute permissions for the user. Let's look at such a file:

$ cat say_hej.sh
#!/bin/bash

echo hej

The first line is the shebang, and it tells the system which program should execute this file. In this case, the shebang line says that Bash should be used to execute it. The rest of the file is just one single command line with echo hej. By the way, "hej" means "hi" in Swedish.

We named the file say_hej.sh using the .sh suffix. Not because we had to, but because it is a shell script, and it's a convention to name such files with the suffix .sh so that we later can expect the file to be a script, just by looking at its name. As far as Bash is concerned, the file could be called anything, like say_hej.bananas or even say_hej.pdf but that makes less sense to us humans. Some operating systems (looking at you, Windows) have put some magic into filenames, letting the suffix decide what file type they are. We, the authors, have never understood how a name can change the contents of a file (it can't, by the way). Just as if Rikard changed his name to Rebecca (which happens to be the name of his sister), it wouldn't make him a woman physically. Or if Henrik changed his name to Rikard, it wouldn't make them the same person. We think that the contents of the file should decide its file type, and so do most reasonable operating systems as well as Bash.

If we had an MP3 file called say_hej.sh, Bash would not be able to "run" it, because Bash only understands plain text and command lines. Bash would still be able to tell an MP3 player application like mpg321 to play the MP3 file, and mpg321 would still be able to play it (if it really is an MP3 file), regardless of its name ending with .mp3 or not.

Anyway, files with execute permissions, a shebang and some command lines are what we call scripts (or Bash scripts if the shebang is exactly #!/bin/bash).

So let's add the execute permission to the file and execute it:

$ chmod u+x say_hej.sh
$ ./say_hej.sh 
hej
$

To add execute permission for the user that owns the file, we use chmod u+x and the file as argument. To execute the file, we need to use a relative path. If we are in the same directory, the relative path will be ./filename, in our case ./say_hej.sh . This has to do with the fact that, if we want to use our script as a command, bash must know where to find the command. Bash uses the environment variable PATH to decide where to look for commands. This is actually a good thing. If we happened to have five programs called ls in different places on our computer, how would Bash know which one to use? We need the list of directories in the PATH variable to find the correct one. If you want to execute your own ls program you will have to be explicit about it and give Bash the relative or absolute path to your command. You cannot expect Bash to read your mind and understand that you meant another version of ls from the standard one in /bin/ls .

So, if our script is not in one of the directories listed in our PATH variable, we need to use a relative or absolute path to the script to help Bash find it.
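To see the difference between using a path and relying on PATH, here is a small sketch you can try. It is only an illustration: the scratch directory and the script name say_hej_demo.sh are made up for this example, and it assumes that mktemp is available and that no command with that name already exists on your system.

```shell
#!/bin/bash

# Create a scratch directory with a tiny script in it.
dir=$(mktemp -d)
echo '#!/bin/bash' > "$dir/say_hej_demo.sh"
echo 'echo hej'   >> "$dir/say_hej_demo.sh"
chmod u+x "$dir/say_hej_demo.sh"

# 1. With a path, Bash knows exactly which file we mean.
"$dir/say_hej_demo.sh"

# 2. Without a path, Bash searches the directories in PATH.
#    command -v prints where a command was found, or fails.
command -v say_hej_demo.sh || echo "say_hej_demo.sh is not in PATH"

# 3. After adding the directory to PATH (in this shell only),
#    the bare name works too.
PATH="$PATH:$dir"
result=$(say_hej_demo.sh)
echo "$result"

# Clean up the scratch directory.
rm -r "$dir"
```

The change to PATH only lasts as long as the shell that made it is running, so nothing is permanently altered by trying this.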

In conclusion:

  • Bash scripts are just plain text files with command lines
  • They can be executed as commands if
    • they have the execute permission
    • they start with #!/bin/bash
    • we help Bash to find the script using a path to it

Our first script

Let's create a script together. The script should:

  • Print a welcome message with your user name and the current time and date
  • Print your computer's name and IP address

Now, a script is just a bunch of command lines, right? So we can actually try the command lines in the shell interactively, before we make it into a script. If they work in the shell interactively, they will work in the script.

Before we start, we'll have to tell you a few useful techniques:

  • You can tell echo not to make a newline by using the flag -n
  • You can create text (e.g. for echo to print) by using command substitution:
    • echo "Today is $(date +%A)" - prints Today is Monday if it is Monday
    • $(date +%A) will be expanded to the result of the command
  • Always use quotes around text (will spare you some headache)
  • Using a semicolon allows you to issue two commands on the same command line

When trying the commands for our script in the interactive shell, we'll use the following commands:

  • echo - to print stuff
  • hostname -I - to get the IP address
  • uname -n - to get the computer name

And the environment variable $USER contains your username.

$ echo "Welcome $USER"
Welcome rikard
$ date
tis  6 aug 2019 11:30:15 CEST
$ echo -n "Your IP address is ";hostname -I
Your IP address is 10.0.116.35 192.168.56.1 2001:6b0:2:2801:36bd:fb4f:417b:b503 
$ uname -n
newdelli
$

Feel free to try the same commands yourself. You should get different results, of course.

Now, let's make a script out of those command lines. If you want to follow what we do here, create a directory first and cd to that directory. You don't want to pollute your home directory with a lot of scripts and unrelated files.

Use an editor to create a file called welcome.sh and put the shebang on the first line of the file. Put the commands in the same order as we tested them in the file and save it. Set the execute permission on the file and execute it with ./welcome.sh. If this is the only file in the directory, you may use tab completion when executing it. Type ./w and press TAB, then enter.

This is what the file looks like when edited using nano:

[Screenshot: the nano editor and the script]

Adding execute permissions: $ chmod u+x welcome.sh and then running it:

$ ./welcome.sh 
Welcome rikard
Your IP address is 10.0.116.35 192.168.56.1 2001:6b0:2:2801:36bd:fb4f:417b:b503 
newdelli

Oops! We forgot the date! Edit the file again and add the date line. This is what the script could look like:

$ cat ./welcome.sh
#!/bin/bash
echo "Welcome $USER"
echo -n "Time and date is "
date
echo -n "Your IP address is "
hostname -I
uname -n

$ ./welcome.sh 
Welcome rikard
Time and date is tis  6 aug 2019 11:57:24 CEST
Your IP address is 10.0.116.35 192.168.56.1 2001:6b0:2:2801:36bd:fb4f:417b:b503 
newdelli

A few notes. We don't need to use semicolons in the script, since we can put commands on separate lines to make it easier to read. So we split up the echo -n "Time and date is " and date commands on separate lines. You can keep the semicolon and have them on the same line if you prefer.

A few alternatives are:

$ echo -n "Time and date is ";date
Time and date is tis  6 aug 2019 12:01:53 CEST

$ echo "Time and date is $(date)"
Time and date is tis  6 aug 2019 12:02:25 CEST

We think that's a matter of style and you can choose whichever feels more natural to you. Or try all three styles:

  • echo -n "some text ";some_command
  • echo -n "some text "
    some_command
  • echo "some text $(some_command)"

Evolving the script - using variables

As you may recall from the module on Bash introduction, you can create variables in Bash. This is very common in scripts, so let's evolve our script to use some variables.

Using variables has many advantages. Here are a few:

  • You can initialize a variable to a value in one place and use it in many places in your script
  • If you need to change the value, you now only have to change it in one place
  • Variables make printing easier, in particular when you want to print something generic but a part of the text may vary

This is how to make a few variables for our script:

GREETING="Welcome $USER"
TIME=$(date)
IP=$(hostname -I)
HOST=$(uname -n)
echo "$GREETING"
# ...etc

Above, we had the environment variable $USER as part of the new GREETING variable. When the value of a variable is used, you put a dollar sign before the variable name. Variables can be nested like the above (the value of a variable can contain the value of another variable etc). Then we created three additional new variables, TIME, IP and HOST. (We call the last one HOST rather than HOSTNAME, since Bash already sets a variable named HOSTNAME itself.)

The naming of variables is kind of flexible, but constants and environment variables usually have all caps (uppercase). A constant is a variable that you give a value once, and then only use it, never change it.

Normal variables are such that we give them new values often. Such variables usually have all lowercase names.

# i is a variable whose value changes
$ for i in 1 2 3 4; do echo $i; done
1
2
3
4
# BACKUP_DIR is a constant, never to be changed in the script
BACKUP_DIR="/home/rikard/backups"
# when used:
# note: this simple loop assumes filenames without spaces
for file in $(find music/ -name '*.mp3')
do
 mv "$file" "$BACKUP_DIR"
done
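The all-caps name is only a convention, and nothing stops a script from changing BACKUP_DIR by mistake. As a possible complement (not something the rest of this page relies on), Bash has the built-in readonly, which makes the shell refuse new values for a variable. A minimal sketch:

```shell
#!/bin/bash

# readonly marks BACKUP_DIR as a constant for the rest of the script.
readonly BACKUP_DIR="/home/rikard/backups"

# Trying to change it fails. We try it inside parentheses (a subshell)
# and throw away the error message, so that the example keeps running.
if ( BACKUP_DIR="/tmp" ) 2>/dev/null
then
  echo "BACKUP_DIR was changed"
else
  echo "BACKUP_DIR could not be changed"
fi

echo "BACKUP_DIR is still $BACKUP_DIR"
```

Whether you use readonly or just the naming convention is a matter of taste. The convention is what you will see most often in scripts.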

This is what the script could look like if we used variables instead:

$ cat welcome.sh
#!/bin/bash

GREETING="Welcome $USER"
TIME=$(date)
IP=$(hostname -I)
HOST=$(uname -n)

# the action begins here
echo "$GREETING"
echo "Time and date is $TIME"
echo "Your IP address is $IP and hostname is $HOST"

You should note that you need to declare and initialize a variable before its value is used. With "before", we simply mean on a command line above the variable's use. First you use the following command line to declare and initialize the variable GREETING:

GREETING="Welcome $USER"

After that, on a few lines below, you use the variable in another command line:

echo "$GREETING"
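What happens if the order is wrong? Bash does not complain by default. A variable that has not been given a value simply expands to an empty text. The following small sketch shows this (we use unset first, only to be sure the example starts without the variable):

```shell
#!/bin/bash

unset GREETING                 # make sure we start without the variable

echo "Before: [$GREETING]"     # nothing between the brackets yet
GREETING="Welcome $USER"
echo "After: [$GREETING]"      # now the value is there
```

The square brackets are just there to make the empty value visible in the output.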

Declaring and initializing a variable is a command line like any other. You can do it in the shell interactively too. You can "forget" a variable using unset:

$ WELCOME="welcome.sh"
$ ./$WELCOME
Welcome rikard
Time and date is tis  6 aug 2019 12:24:54 CEST
Your IP address is 10.0.116.35 192.168.56.1 2001:6b0:2:2801:36bd:fb4f:417b:b503  and hostname is newdelli
$ unset WELCOME
$ echo "$WELCOME"

$

Using today's date as part of a filename

When making backups, it is useful to give the backup file a name that contains the date it was being backed up. How can we do that? We'll use command substitution (command expansion)!

Let's pretend we are writing a small script which backs up some files. We want the backup copies to contain the backup date as part of their names, so that we can distinguish between backups and get a file version from a certain date if we need.

Before we begin, we need to learn a little about formatting the output from the date command. If we can make it output the date in a nice format (without spaces in particular, because spaces in a filename is like asking for trouble), then we can use command expansion to create a filename with today's date as part of the name.

To format the output from date we use a date format string. The format string for four digit year is %Y, for two digit month is %m and for two digit day is %d. We can add any text or characters between the parts of the date format symbols. So if we want a date format for e.g. 2019-08-06, we'd use the following date format string: %Y-%m-%d. Now, you can instruct date to use such a format string, by giving it as an argument starting with a plus directly followed by the format string:

$ date "+%Y-%m-%d"
2019-08-06

$ date "+%H:%M:%S"   # means: "hour:minute:second"
13:18:28

OK, great. So we could now create a backup file name that incorporates today's date as part of the filename:

$ cp welcome.sh welcome.sh.bak.$(date +%Y-%m-%d)
$ ls -1
file-with-echo
say_hej.sh
welcome.sh
welcome.sh.bak.2019-08-06
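If you might make more than one backup on the same day, the two format strings above can be combined so that the time becomes part of the name too. This is just a sketch of one possible naming scheme. The variable names and the underscore between date and time are our own choices, and we use hyphens instead of colons in the time, since colons in filenames can cause trouble with some tools and file systems.

```shell
#!/bin/bash

FILE="welcome.sh"
STAMP=$(date +%Y-%m-%d_%H-%M-%S)   # e.g. 2019-08-06_13-18-28
BACKUP="$FILE.bak.$STAMP"

echo "The backup would be called $BACKUP"
```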

If the file to back up is a text file, we can also compress the backup using gzip -9 (to make a gzip-compressed file with maximum compression).

We could make this into a script for backing up the welcome.sh file:

$ cat do_backup.sh 
#!/bin/bash

FILE="welcome.sh"
DATE=$(date +%Y-%m-%d)
BACKUP="$FILE.bak.$DATE"
cp "$FILE" "$BACKUP"
gzip -9 "$BACKUP"

$ ./do_backup.sh
$ ls -1
do_backup.sh
file-with-echo
say_hej.sh
welcome.sh
welcome.sh.bak.2019-08-06.gz

Did the compression save us any bytes?

$ ls -l
total 20
-rwxrw-r-- 1 rikard rikard 118 aug  6 13:25 do_backup.sh
-rw-rw-r-- 1 rikard rikard   9 aug  6 10:49 file-with-echo
-rwxrw-r-- 1 rikard rikard  22 aug  6 10:59 say_hej.sh
-rwxrw-r-- 1 rikard rikard 211 aug  6 12:19 welcome.sh
-rwxrw-r-- 1 rikard rikard 199 aug  6 13:26 welcome.sh.bak.2019-08-06.gz

Wow! The backup is a whole 12 bytes smaller. Just kidding, but large text files get a lot smaller when compressed. Part of the compressed file is overhead from metadata, so for small files we usually save nothing (really small files can even get bigger when compressed).
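If you want to see the effect of file size on compression yourself, here is a small experiment. It is a sketch under a few assumptions: that mktemp, seq, wc and gzip are installed, and that a file with one line repeated 1000 times is a fair stand-in for a "large" text file (real text compresses less well than that). The variable names are made up for the example.

```shell
#!/bin/bash

small=$(mktemp)
large=$(mktemp)

echo "echo hej" > "$small"          # 9 bytes, like file-with-echo
for i in $(seq 1 1000)              # 1000 identical lines
do
  echo "This line is repeated many times"
done > "$large"

# wc -c counts bytes. gzip -9 -c writes the compressed data to
# standard out, so no .gz file is created here.
small_before=$(wc -c < "$small")
small_after=$(gzip -9 -c "$small" | wc -c)
large_before=$(wc -c < "$large")
large_after=$(gzip -9 -c "$large" | wc -c)

echo "Small file: $small_before bytes, compressed: $small_after bytes"
echo "Large file: $large_before bytes, compressed: $large_after bytes"

rm "$small" "$large"
```

The small file should come out bigger after compression, and the large one a lot smaller.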

Evolving the backup script - using arguments

A really crappy thing about our backup script is that it only works for backing up exactly the file welcome.sh and nothing else. If we want to back up another file, we need another script. Not good. Imagine if the ls command worked like this: one command for listing the files in the home directory, one command for listing the files in the Documents directory etc. That would be really crappy. Of course, we want ls to be capable of listing any directory or file we give to it as an argument!

So, let's create a new script, do_backups.sh . Note the new name, backups in plural. Sounds better?

But how would we go about writing a script that can back up more than one file, and in particular, the files we ask it to back up? The answer is to use arguments.

Just like most commands, scripts can take arguments too. Using arguments for your script makes it more generic. The user provides the information needed, and the script uses the argument to know what exactly to do.

Arguments tell a script or command what to do, like what file(s) to back up. The usage for our new script could be ./do_backups.sh file1 [file2 file3 ... fileN]. This way of writing the usage usually means that the script takes at least one argument (file1) and optionally more arguments (file2 etc). The optional arguments are often written in documentation and manuals between square brackets like the above. The actual arguments when you use the script have no square brackets. Those are just for humans when we describe the usage of the script. So, we want the script to accept one or more arguments for the files to back up.

The way we are going to achieve this behavior is to loop through every argument, and make a backup of the file represented by the argument.

If we know the exact number of arguments, they can be referenced in a script by $1 for the first argument, $2 for the second argument etc. But for our script, we don't know if there will be one, two or any other number of arguments. So we need a way to loop over each argument in turn. We'll show you one way of doing that.

You can get a list of all arguments to the script by using the variable "$@". Using the for loop, we can iterate over each argument in the list.

The for loop has the following syntax:

for variable in list; do command(s); done

In a script you can afford to put some indentation and get rid of the semicolons, to make the script easier to read for humans:

for variable in list
do
  command(s)
done

Since we know how to get a list of all arguments, we can loop over each one of them. The following small script simply shows you how to loop over all arguments and print them one-by-one:

$ cat arguments.sh
#!/bin/bash

for arg in "$@"
do
  echo "$arg"
done
$ ./arguments.sh a b c d
a
b
c
d

When the script in arguments.sh is executed with the arguments a b c d, the for loop puts each argument in turn in the variable arg, which is used by echo in the loop body. Each argument was printed. This means that we could use this strategy for our backup script.
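Here is a small sketch that shows $1 together with $# (the number of arguments) and also why the quotes around "$@" matter. To make it possible to run without typing any arguments, we use set -- to pretend the script was called with two arguments. That line is only there for the experiment and the filenames are made up.

```shell
#!/bin/bash

# Pretend the script was called like this:
#   ./arguments.sh "my song.mp3" notes.txt
set -- "my song.mp3" notes.txt

COUNT=$#
FIRST=$1
echo "Number of arguments: $COUNT"
echo "First argument: $FIRST"

echo 'Looping over "$@" (with quotes):'
for arg in "$@"
do
  echo "  [$arg]"
done

echo 'Looping over $@ (without quotes):'
for arg in $@
do
  echo "  [$arg]"
done
```

With the quotes, the loop runs twice, once for each argument. Without them, the filename with a space in it is split in two, so the loop runs three times. That is one more reason to always use quotes.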

Now that we know also how to use variables, we'll use a variable called SUFFIX in the script, so that we can put the file suffix of the backup files in that variable. The suffix will be ".bak.YYYY-mm-dd" where YYYY is the year and mm is the month and dd is the day, e.g. ".bak.2019-08-06" .

The script will for each argument (being a file to be backed up) create a copy with the old name followed by the suffix, and then compress the copy with gzip -9.

When dealing with files and arguments, there are a few things you should look out for:

  • Does the script handle the case where there are no arguments?
    • How? What should the backup script do if you call it with no arguments?
  • Does the script handle arguments that actually are not files (like misspelled filenames)?
    • How? What should the backup script do if some or all of its arguments are not files that can be found?

A trick to check if a filename exists as an actual file is:

if [[ -e "$FILE" ]]; then command(s); else command(s); fi

Or with indentation in a script to make it easier to read:

if [[ -e "$FILE" ]]
then
  command(s)
else
  command(s)
fi
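Note that -e is true for anything that exists, including directories. Bash has more tests of the same kind, for instance -f (is it a regular file?) and -d (is it a directory?). The following sketch tries them on three names. It assumes a typical Linux system, where /etc/passwd is a regular file and /tmp is a directory, and that there is no file called no_such_file in the current directory.

```shell
#!/bin/bash

for name in /etc/passwd /tmp no_such_file
do
  if [[ -f "$name" ]]
  then
    kind="a regular file"
  elif [[ -d "$name" ]]
  then
    kind="a directory"
  elif [[ -e "$name" ]]
  then
    kind="something else that exists"
  else
    kind="missing"
  fi
  echo "$name: $kind"
done
```

A script that should only deal with ordinary files might prefer -f over -e. Which one is right depends on what you want the script to accept.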

Try to write the script yourself, before looking at our solution.

Here's one way of writing the script. We'll explain it line-by-line later:

$ cat do_backups.sh
#!/bin/bash

DATE=$(date +%Y-%m-%d)
SUFFIX=".bak.$DATE"
bad_files=""

for file in "$@"
do
 if [[ -e "$file" ]]
 then
   cp "$file" "$file$SUFFIX" && gzip -9 "$file$SUFFIX"
 else
   bad_files="$bad_files $file"
 fi
done

if [[ -z "$bad_files" ]]
then
 exit 0
else
 echo "These files were not found: $bad_files" >&2  # error message to std err
 exit 1
fi

We are first setting up the variables. DATE contains the formatted date and is used as part of the SUFFIX constant. The bad_files variable will contain any files that could not be found, to be used as part of an error message, giving feedback to the user that some files were not found.

Then we use a for loop to loop over the arguments, using the variable file to contain each filename in turn in the loop body.

The loop body starts with an if-statement to see if the file exists. If it does, then the file is copied to the backup name with the suffix. We use && between the copying and compressing, because there's no point in trying to compress the copy if the copy wasn't successful. Otherwise, if the file doesn't exist, we add the bad filename to the variable bad_files to be used in an error message later.

After the loop over each file, we check if the bad_files variable is empty. If it is, we exit the script with an exit status of zero (meaning success). Otherwise, we write an error message containing the names of the missing files to standard error, and exit with a status of 1 (meaning failure).

It is a good practice to always write error messages to standard error, using for instance echo "Error" >&2. The redirection >&2 redirects standard out to standard error.

Another good practice is to exit your script with a non-zero value if there were any errors in the script.
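The script above does not say anything if it is called without any arguments at all. One possible way to handle that case, following the two practices above, is sketched below. The name check_arguments is made up, and we have wrapped the check in a function (functions are explained further down on this page) so that it can be tried without ending your shell.

```shell
#!/bin/bash

# Succeeds if there is at least one argument. Otherwise it prints
# a usage message to standard error and fails with status 1.
check_arguments() {
  if (( $# == 0 ))
  then
    echo "Usage: ./do_backups.sh file1 [file2 file3 ... fileN]" >&2
    return 1
  fi
}

# No arguments: usage message on standard error, status 1
check_arguments || echo "Failed with status $?"

# One argument: no message, status 0
check_arguments welcome.sh && echo "Succeeded with status $?"
```

In a script, the check could be used near the top, before the loop, for instance as check_arguments "$@" || exit 1.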

Here are a few test runs:

$ ls -l
total 24
-rwxrw-r-- 1 rikard rikard 118 aug  6 13:25 do_backup.sh
-rwxrw-r-- 1 rikard rikard 378 aug  6 14:12 do_backups.sh
-rw-rw-r-- 1 rikard rikard   9 aug  6 10:49 file-with-echo
-rwxrw-r-- 1 rikard rikard  22 aug  6 10:59 say_hej.sh
-rwxrw-r-- 1 rikard rikard 211 aug  6 12:19 welcome.sh
-rwxrw-r-- 1 rikard rikard 199 aug  6 13:26 welcome.sh.bak.2019-08-06.gz

$ ./do_backups.sh do_backup.sh do_backups.sh say_hej.sh bad_filename badder_filename
These files were not found:  bad_filename badder_filename

$ echo $?
1

$ ls -l
total 36
-rwxrw-r-- 1 rikard rikard 118 aug  6 13:25 do_backup.sh
-rwxrw-r-- 1 rikard rikard 143 aug  6 14:13 do_backup.sh.bak.2019-08-06.gz
-rwxrw-r-- 1 rikard rikard 378 aug  6 14:12 do_backups.sh
-rwxrw-r-- 1 rikard rikard 271 aug  6 14:13 do_backups.sh.bak.2019-08-06.gz
-rw-rw-r-- 1 rikard rikard   9 aug  6 10:49 file-with-echo
-rwxrw-r-- 1 rikard rikard  22 aug  6 10:59 say_hej.sh
-rwxrw-r-- 1 rikard rikard  68 aug  6 14:13 say_hej.sh.bak.2019-08-06.gz
-rwxrw-r-- 1 rikard rikard 211 aug  6 12:19 welcome.sh
-rwxrw-r-- 1 rikard rikard 199 aug  6 13:26 welcome.sh.bak.2019-08-06.gz

$

As you see, the correct arguments worked and those files were backed up. The bad arguments were reported and the exit status of the script was 1.

Bash functions

We will also mention a few words about functions. We'll show you some examples of functions to show you the syntax of declaring a function (which must be done before it is used) as well as how to call a function and how a function can use arguments too.

$ cat functions.sh
#!/bin/bash

get_date() {
 date +%Y-%m-%d
}

add() {
 echo "$(($1 + $2))" # arguments are called $1, $2 etc in functions too
}

# The action starts here
echo "This is done first"
get_date  # call get_date without arguments
add 10 30 # call add with two arguments

Let's run the sample script:

$ ./functions.sh
This is done first
2019-08-02
40

The first function, get_date() simply runs a command for creating a formatted date. When called, the command date is called from the function.

The second function is perhaps more interesting, since it takes arguments. The arguments are used in an arithmetic expression (an addition), which is echoed to standard out. When calling a function with arguments, you just type the function name and the arguments: add 10 30 .

The declaration of a function must occur before it is called. If it isn't called, nothing happens. You can think of functions as dormant code blocks, waiting to be called later in the script. Arguments to a function will also be called $1, $2 etc.

You can declare a function in the interactive shell too. It will exist as long as the shell is running. If you want persistent functions in your shell, you should put them in some initialization file e.g. ~/.bashrc since those files are read at startup by the shell.

Usually, only standard out comes back from a function. You can return a value too, but that is limited and you should actually only use that for exit codes where zero means success and non-zero values mean failure.
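The two ways of getting something back from a function can be sketched like this. The function add is the one from functions.sh, while is_positive is a made-up function for this example. Text printed by a function is captured with command substitution, and the return status is used directly by if.

```shell
#!/bin/bash

# add prints its result to standard out.
add() {
  echo "$(($1 + $2))"
}

# is_positive only reports success (0) or failure (1)
# through its return status, and prints nothing.
is_positive() {
  if (( $1 > 0 ))
  then
    return 0
  else
    return 1
  fi
}

SUM=$(add 10 30)          # capture what add printed
echo "The sum is $SUM"

if is_positive "$SUM"     # if looks at the return status
then
  echo "$SUM is positive"
else
  echo "$SUM is not positive"
fi
```

Running this prints The sum is 40 followed by 40 is positive.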

We should probably say a few words on the arithmetic expression $(($1 + $2)) too. Using a dollar sign and double parentheses allows you to expand arithmetic expressions in Bash (for integers). Try this:

$ echo "One plus one is $((1 + 1))"
One plus one is 2

$ echo "5/2 is $((5 / 2))"
5/2 is 2

$ echo "10 * 10 is $((10 * 10))"
10 * 10 is 100

$ echo "10^10 is $((10 ** 10))"
10^10 is 10000000000
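As the 5/2 example shows, division of integers throws away the decimals. The part that is left over can be calculated with the remainder operator %. A small sketch (the variable names are just for this example):

```shell
#!/bin/bash

echo "5/2 is $((5 / 2)) with remainder $((5 % 2))"

TOTAL_MINUTES=135
HOURS=$((TOTAL_MINUTES / 60))
MINUTES=$((TOTAL_MINUTES % 60))
echo "$TOTAL_MINUTES minutes is $HOURS hours and $MINUTES minutes"
```

This prints 5/2 is 2 with remainder 1 and then 135 minutes is 2 hours and 15 minutes.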

You can even use arithmetic expressions for something interesting, such as calculating the number of days to your birthday:

$ TODAY=$(date +%-j)   # %-j: day of the year without leading zeros (GNU date)
$ BIRTHDAY=$(date -d 2019-11-24 +%-j)
$ DAYS_TO_BIRTHDAY=$((BIRTHDAY - TODAY))

$ if (( $DAYS_TO_BIRTHDAY > 0 )); then echo "Birthday is in $DAYS_TO_BIRTHDAY days";elif (( $DAYS_TO_BIRTHDAY < 0 ));then echo "Birthday has already been"; else echo "Birthday is today";fi
Birthday is in 110 days

$ BIRTHDAY=$(date -d 2019-01-24 +%-j)
$ DAYS_TO_BIRTHDAY=$((BIRTHDAY - TODAY))

$ if (( $DAYS_TO_BIRTHDAY > 0 )); then echo "Birthday is in $DAYS_TO_BIRTHDAY days";elif (( $DAYS_TO_BIRTHDAY < 0 ));then echo "Birthday has already been"; else echo "Birthday is today";fi
Birthday has already been

$ BIRTHDAY=$(date -d 2019-08-06 +%-j)
$ DAYS_TO_BIRTHDAY=$((BIRTHDAY - TODAY))
$ if (( $DAYS_TO_BIRTHDAY > 0 )); then echo "Birthday is in $DAYS_TO_BIRTHDAY days";elif (( $DAYS_TO_BIRTHDAY < 0 ));then echo "Birthday has already been"; else echo "Birthday is today";fi
Birthday is today

Why not make a Bash function in the shell, called days_to_birthday?

$ days_to_birthday() {
> BIRTHDAY=$(date -d "$1" +%-j)
> DAYS_TO_BIRTHDAY=$((BIRTHDAY - TODAY))
> if (( $DAYS_TO_BIRTHDAY > 0 ))
> then
>   echo "Birthday is in $DAYS_TO_BIRTHDAY days"
> elif (( $DAYS_TO_BIRTHDAY < 0 ))
> then
>   echo "Birthday has already been"
> else
>   echo "Birthday is today!"
> fi
> }

$ days_to_birthday 2019-12-01
Birthday is in 117 days

$ days_to_birthday 2019-01-31
Birthday has already been

$ days_to_birthday 2019-08-06
Birthday is today!

The function was called on 2019-08-06, and it relies on the variable TODAY that we set earlier in the same shell. Note the secondary prompt we got in Bash, since we pressed Enter before the function was finished (Bash waited for the closing curly brace).
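A detail worth knowing when doing arithmetic on dates: the plain format %j pads the day number with zeros (31 January becomes 031), and Bash reads a number with a leading zero as octal (base eight). One way around that is to write 10# before the value, which tells Bash to read it as a base ten number. The sketch below first shows the difference, and then a variant of the function that does not depend on a TODAY variable being set beforehand. The name days_until is made up, local keeps the variables inside the function, and date -d assumes GNU date. Like the version above, it only makes sense for dates within the current year.

```shell
#!/bin/bash

DAY="031"                             # what date +%j prints for 31 January
echo "Read as octal:   $((DAY))"      # 25
echo "Read as decimal: $((10#$DAY))"  # 31

# Prints the number of days from today to the date given as $1.
days_until() {
  local today target
  today=$(date +%j)
  target=$(date -d "$1" +%j)
  echo "$((10#$target - 10#$today))"
}

days_until "$(date +%Y-%m-%d)"        # today's date, so this prints 0
```

Since days_until only prints a number, the caller can decide what to do with it, for instance capture it with command substitution and compare it to zero as in the if-statements above.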

There are some more examples and some slides about functions on this page on this wiki. Open that link in a new tab if you want to sneak preview it, and then close the tab to get back here.

Where to go next

The next page is ITIC:Introduction_to_Bash_scripting_-_Exercises with exercises on writing scripts.
