Bash idioms

From Juneday education

This is a scratch pad for Rikard where he documents bash idioms pro memoriam.

Arguments

Argument variables

Positional arguments are named $1, $2, and so on. The number of arguments is stored in $#. When quoted, "$*" expands to all arguments joined into one long string, while "$@" expands to all arguments as a list of separate strings.

Example:

# given the following snippet in args.sh:
for i in "$*"
# Will only loop once (if you put it between quotes)
do
 echo $i
done

Will produce:

$ ./args.sh a b c d
a b c d
$

And respectively, this snippet:

# will loop over all arguments one by one; quote "$@" so that
# arguments containing spaces are not split into several words
for i in "$@"
do
 echo "$i"
done

will produce the following:

$ ./args.sh a b c d
a
b
c
d
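
The difference between the quoted and unquoted forms only becomes visible once an argument contains a space. The following is a minimal sketch of that case. The helper count_args and the sample arguments are made up for illustration, and the default IFS is assumed:

```shell
#!/bin/bash
# count_args is a hypothetical helper: it reports how many arguments it received
count_args() { echo "$#"; }

# simulate a script called with two arguments, the first containing a space
set -- "a b" c

unquoted_at=$(count_args $@)    # word splitting applies: a, b, c
quoted_at=$(count_args "$@")    # each argument kept intact: "a b", c
quoted_star=$(count_args "$*")  # everything joined into one string: "a b c"

echo "$unquoted_at $quoted_at $quoted_star"   # prints: 3 2 1
```

In most scripts "$@" is the form you want when passing arguments on to another command.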

Checking the number of arguments provided to script

#!/bin/bash

if [ "$#" -ne 0 ]   # -ne compares numerically; != compares as strings
then
 echo $# args
else
 echo no args
fi

Sample test runs:

~$ argstest these are four arguments
4 args

Require at least a certain number of arguments:

minargs=3

# Check that args are at least 3
if [ $# -lt $minargs ]
then
 usage
fi

(usage is a function defined earlier in the script, that outputs the "Usage: " instructions and does an exit 1 (or so))
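
A usage function of the kind described above might look like the following sketch. The names usage and check_args, and the wording of the usage message, are assumptions made for this example. The checks run in sub-shells here only so that the exit inside usage does not end the demonstration itself:

```shell
#!/bin/bash
# usage and check_args are hypothetical names for this sketch
usage() {
  echo "Usage: ${0##*/} ARG1 ARG2 ARG3 [ARG...]" >&2
  exit 1
}

check_args() {
  local minargs=3
  if [ "$#" -lt "$minargs" ]
  then
    usage
  fi
  echo "got $# args"
}

# run the checks in sub-shells so that the exit in usage
# does not end this demonstration
too_few_status=0
( check_args a b ) 2>/dev/null || too_few_status=$?
enough_output=$(check_args a b c)

echo "status with two args: $too_few_status"   # 1
echo "output with three args: $enough_output"  # got 3 args
```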

Loops

Regular for loop over words in a text file

The for loop in bash treats files (or streams of text) as a long list of words. If you want to process a file, word-by-word, a for loop lets you do that very swiftly:

$ cat textfile.txt
This is a file of text
on several lines.
Let bash count the number of words in it.
$
$ wordcount=0;for i in $(cat textfile.txt);do ((++wordcount));done;echo $wordcount
18

# Prettier on several lines:
$ wordcount=0
$ for i in $(cat textfile.txt)
> do
>  ((++wordcount))
> done
$ echo $wordcount
18
$ 

The ">" is bash's secondary prompt, waiting for the for loop to be closed, which lets you interactively finish the loop on several lines.
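
As a variation, the same count can be done without starting cat, using the $(< file) form of command substitution. This is only a sketch: it writes the sample text to a temporary file (created with mktemp) so that it can be run anywhere. Note that the unquoted expansion also performs filename globbing, so a word such as * in the file would be expanded.

```shell
#!/bin/bash
# create a throwaway copy of the sample text
textfile=$(mktemp)
cat > "$textfile" <<'EOF'
This is a file of text
on several lines.
Let bash count the number of words in it.
EOF

wordcount=0
# $(< file) expands to the file's contents without starting cat
for word in $(< "$textfile")
do
  wordcount=$((wordcount + 1))
done
echo "$wordcount"   # 18

rm -f "$textfile"
```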

One useful use of this is for files of numbers. You can loop over each number (which bash will treat as a word each) and actually process the numbers. This example shows how to sum up all numbers in a text file and write out the total sum and average:

$ cat salaries.txt
17000
19000
25000
31000
33000
$ cat numbercrunching.sh
#!/bin/bash

count=0
sum=0
avg=0
for i in $(cat salaries.txt)
do
 ((++count))
 ((sum += i))
done
echo Stats:
echo $count salaries found. Total sum: $sum
echo $(echo "scale=2;$sum/$count"|bc) was the average salary.
$ ./numbercrunching.sh
Stats:
5 salaries found. Total sum: 125000
25000.00 was the average salary.

Here, the external calculator bc is used, and scale=2; is sent to it to get the result as a decimal number with two decimals. Together with command substitution, it can be pretty handy (num=$(echo "expression"|bc)).

Looping over whole line

Sometimes you want to iterate over each line in a file, saving or processing each complete line. By default, looping over a text file will iterate over each word separated by whitespace. If you want to read a whole line into a variable at a time, this is one way of doing that:

line_number=0
cat my_file.txt|while read LINE  #sub-shell starts here
do
 ((++line_number))
 echo Line number $line_number: $LINE 
done                             #sub-shell ends here

Sample test runs:

$ cat my_file.txt
This is a test file.
With only three lines of text.
Ends here.

$ ./readlines.sh my_file.txt 
Line number 1: This is a test file.
Line number 2: With only three lines of text.
Line number 3: Ends here.

Note that after the while loop, line_number is still 0, because the pipe starts a new sub-shell.

If you want to do the same in one shell, and have line_number affected and usable outside the while loop, you can use the following syntax:

$ line_number=0;while read LINE; do echo "Line number $((++line_number)): $LINE";done < <(cat my_file.txt)
$ echo $line_number
3
$

The same as a script:

#!/bin/bash
line_number=0
while read LINE
  do echo "Line number $((++line_number)): $LINE"
done < <(cat my_file.txt)
echo $line_number # 3
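
The two variants can be compared side by side. The sketch below assumes bash's default settings (the lastpipe option is off) and writes the three sample lines to a temporary file first. It redirects straight from the file, which has the same effect on the counter as the process substitution above:

```shell
#!/bin/bash
tmpfile=$(mktemp)
printf '%s\n' "This is a test file." "With only three lines of text." "Ends here." > "$tmpfile"

piped_count=0
cat "$tmpfile" | while read -r line
do
  piped_count=$((piped_count + 1))   # changes a copy inside the sub-shell
done

redirected_count=0
while read -r line
do
  redirected_count=$((redirected_count + 1))   # changes the variable in this shell
done < "$tmpfile"

echo "piped: $piped_count, redirected: $redirected_count"   # piped: 0, redirected: 3
rm -f "$tmpfile"
```

The -r flag to read keeps backslashes in the input lines as they are.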

Special bash variables

A useful variable is $!, which holds the PID of the most recent command started in the background. This makes it possible to kill that command if necessary. The script itself has its PID stored in $$.

The exit status of the last function or command called is in $?
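
A small sketch of how $$, $! and $? can be used together. The background command (sleep 30) and the variable names are only examples:

```shell
#!/bin/bash
sleep 30 &            # start a long-running command in the background
bg_pid=$!             # PID of that background command

kill "$bg_pid"        # send SIGTERM to it
wait_status=0
wait "$bg_pid" 2>/dev/null || wait_status=$?   # 128 + 15 (SIGTERM) = 143

false_status=0
false || false_status=$?    # false always exits with status 1

echo "this shell: $$, background job: $bg_pid"
echo "wait status: $wait_status, false status: $false_status"
```

Since $? is overwritten by every command, it is copied into a variable right away here.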

The last command's last argument is stored in $_. This shows how to store the last word of a line of text in a variable using $_:

$ last=$( (echo one two three;echo $_)|tail -1);echo $last
three
$ last=$( (echo one two three four five;echo $_)|tail -1);echo $last
five

Alternatively:

$ last=$(cat <(echo one two three four five;echo $_)|tail -1);echo $last
five

For a whole file:

$ cat my_file.txt 
This is a test file.
With only three lines of text.
Ends here
$ while read LINE; do last=$( (echo $LINE;echo $_)|tail -1);echo $last;done< <(cat my_file.txt)
file.
text.
here
$

C-style bash

Arithmetic operations

You can use C-style increments and decrements of variables (both postfix and prefix variants):

a=1
echo $(( a++ )) # 1
echo $a         # 2
echo $(( --a )) # 1

This allows for a smooth loop counter variable:

count=0
for i in $(cat file.txt)
do
 (( ++count ))
done
echo $count words

Ternary ? operator

Like in C (and Java etc) you can use the ? operator:

# One apple is two euros. If you buy two or more, they are one euro each.
p=0 # price
n=3 # number of apples
(( p = n==1?2:n*1 ))
echo Price to pay: $p #3

The syntax is (( var = predicate?value if true:value if false ))
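
Another small sketch of the same operator, picking the larger of two numbers and then running the apple pricing for a few quantities. The variable names and values are only examples:

```shell
#!/bin/bash
a=7
b=12
(( max = a > b ? a : b ))
echo "$max"   # 12

# the apple pricing from above, for a few quantities
for n in 1 2 3
do
  (( p = n == 1 ? 2 : n * 1 ))
  echo "$n apple(s): $p"   # 2, 2 and 3 euros respectively
done
```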

Loops

You can use C-style loops:

for ((i=0; i < 10 ; i++))
do
 #something ten times
done 

i=0 # initialize i before loop ;-)
while (( i < 10 ))
do
 #something ten times
 (( i++ ))
done

Rounding numbers using printf

The builtin printf command is useful for formatting numbers. It even rounds them off:

a=1.5555
printf "a with one decimal: %1.1f" $a
echo
printf "a with two decimals: %1.2f" $a
echo
printf "a with three decimals: %1.3f" $a
echo

Example output:

$ ./round.sh
a with one decimal: 1.6
a with two decimals: 1.56
a with three decimals: 1.555
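
If you want the rounded number in a variable rather than on the screen, bash's printf has a -v option that stores the result. A minimal sketch, assuming a locale where the decimal separator is "." (forced here with LC_ALL=C); the variable names are made up:

```shell
#!/bin/bash
LC_ALL=C   # assumption: force "." as the decimal separator
pi=3.14159
printf -v one_decimal '%.1f' "$pi"
printf -v two_decimals '%.2f' "$pi"
printf -v three_decimals '%.3f' "$pi"
echo "$one_decimal $two_decimals $three_decimals"   # 3.1 3.14 3.142
```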

Simulating call by reference

In bash, all functions are called by value. And return values are only integers. If you want to have a side-effect, in most cases you'll need to use global variables which can be ugly and hard to maintain. However, there is a way to simulate call by reference, which uses the rather odd concept of de-referencing variables.

Consider a function for turning a string into all lower case letters:

function toLower()
{
 # use tr 'A-Z' 'a-z' in some way
 # but how do we return the value?
 # return ...? (can only return an integer)
 : # no-op placeholder, since bash does not allow an empty function body
}

The solution (if you really want a call by reference function that manipulates the variable you call it with) is to use dereferencing. The concept makes use of a string that holds the name of an existing variable. If you have a variable called name, you normally get its value by prepending a $ to it: $name. The surprising thing is that bash also provides a way to take a string value, treat it as a variable name and obtain that variable's value, using ${!variable}:

 name="Bill"
 temp="name"
 s=${!temp} # s is now "Bill"
 eval "name=\"$(echo $s Gates)\""

That might seem like a tedious way to change the value of a variable. But it can be used inside a function using the argument $1. This allows for passing any string corresponding to a variable name outside the function, and change that variable as a side effect, thus simulating a call by reference:

function toLower()
{
 local s
 s=${!1}
 #s will have the value of the variable, whose name was passed to the
 # function! eval will change that variable:
 eval "$1=\"$(echo "$s"|tr 'A-Z' 'a-z')\""
}

The syntax is awful because of the escapes needed for eval to work (you need quotes to expand the right-hand side). But it works:

function toLower()
{
 local s
 s=${!1}
 eval "$1=\"$(echo "$s"|tr 'A-Z' 'a-z')\""
}

URL="HTTP://WWW.FSF.ORG"
toLower URL
echo $URL

Example run:

$ ./tolower.sh 
http://www.fsf.org
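
As an alternative sketch (not the method used above), the assignment can be done without eval by letting printf -v write to the variable whose name is in $1. The function name to_lower and the local variable name value are choices made for this example. One caveat: a caller's variable that is itself called value would be shadowed by the local one.

```shell
#!/bin/bash
# to_lower NAME: lower-cases the variable whose name is passed in
to_lower() {
  local value=${!1}    # indirect expansion: value of the variable named by $1
  # printf -v assigns to the variable named by $1, no eval needed
  printf -v "$1" '%s' "$(printf '%s' "$value" | tr 'A-Z' 'a-z')"
}

URL="HTTP://WWW.FSF.ORG"
to_lower URL
echo "$URL"   # http://www.fsf.org
```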

String manipulation

Substring substitution

Sometimes you want to replace a substring in a text. There are many ways of doing that with bash. Here are some of them.

${string//substring/replacement}

Replace first occurrence of a substring with another string:

$ string="abcdefg"
$ echo ${string/de/DE}
abcDEfg

Replace all occurrences of a substring with another string:

$ string="MSFT up 20%. MSFT considered good buy"
$ echo ${string//MSFT/RHAT}
RHAT up 20%. RHAT considered good buy

Replace something from front of string:

$ file="ogg_free_software_song.ogg"
$ echo ${file/#ogg/OggVorbis}
OggVorbis_free_software_song.ogg

Replace something from end of string:

$ file="OGG_free_software_song.OGG"
$ echo ${file/%OGG/ogg}
OGG_free_software_song.ogg

Substring removal

Deleting a substring from a string can be done using a syntax similar to the above, leaving out the replacement string:

$ s="This is a long long sentence"
$ echo ${s/long } #first occurrence
This is a long sentence
$ echo ${s//long } #all occurrences
This is a sentence

There is also a smooth shorthand for deleting substrings using some special characters, namely #, ##, % and %%. Remove shortest and longest match from beginning of string:

$ logfile="/var/log/apache2/access.log"
$ #        |---|             shortest
$ #        |---------------| longest

$ echo ${logfile#/*/} #      shortest
log/apache2/access.log
$ echo ${logfile##/*/} #     longest
access.log

Remove shortest and longest match from end of string:

$ archive="directory.tar.gz"
#                       |-|    shortest
#                   |-----|    longest
# Strip .gz:
$ echo ${archive%.*gz}
directory.tar
# Strip every file suffix:
$ echo ${archive%%.*gz}
directory
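
A common use of these four operators is splitting a path into its parts. The following sketch reuses the log file path from above; the variable names are made up for the example:

```shell
#!/bin/bash
path="/var/log/apache2/access.log"

filename=${path##*/}     # remove longest match of */ from the front
directory=${path%/*}     # remove shortest match of /* from the end
extension=${filename##*.}  # remove everything up to the last dot
stem=${filename%.*}      # remove the last dot and what follows it

echo "$filename"    # access.log
echo "$directory"   # /var/log/apache2
echo "$extension"   # log
echo "$stem"        # access
```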

ur1.ca string manipulation example

#!/bin/bash

long=$1
snippet=$(curl -s --url http://ur1.ca/ -d longurl="$long"|grep "Your ur1 is")
link1=${snippet##*href=\"}
ur1=${link1%%\"*}
echo $ur1

Usage example:

$ ur1.sh http://thepublicdomain.org/thepublicdomain1.pdf
http://ur1.ca/7xzrz
$

The above sends the HTTP variable "longurl" with the value of the argument supplied (a long url) to ur1.ca and the result is a webpage that needs to be parsed to get the resulting short URL.

grep finds the line in the resulting HTML page that contains "Your ur1 is". The first parameter expansion then throws away everything up to and including href=". Next, everything from the next double quote onward is trimmed off. For instance:

grep finds:
		<p class="success">Your ur1 is: <a href="http://ur1.ca/a65kw">http://ur1.ca/a65kw</a></p>
Then it is stripped to:
http://ur1.ca/a65kw">http://ur1.ca/a65kw</a></p>
( using  link1=${snippet##*href=\"}    )
And then to:
http://ur1.ca/a65kw
( using ur1=${link1%%\"*}              )
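
The two parsing steps can be tried offline, without contacting ur1.ca, by putting the sample line from above in a variable:

```shell
#!/bin/bash
# the sample line from above, stored as if grep had just found it
snippet='<p class="success">Your ur1 is: <a href="http://ur1.ca/a65kw">http://ur1.ca/a65kw</a></p>'

link1=${snippet##*href=\"}   # drop everything up to and including href="
ur1=${link1%%\"*}            # drop everything from the first remaining quote

echo "$link1"   # http://ur1.ca/a65kw">http://ur1.ca/a65kw</a></p>
echo "$ur1"     # http://ur1.ca/a65kw
```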

Some hints on external commands

tree

The tree command is neat when you want an ASCII view of the directory layout and all the files:

$ tree
.
|-- ogg
|   |-- a.ogg
|   |-- b.ogg
|   `-- c.ogg
|-- pdf
|   |-- a.pdf
|   |-- b.pdf
|   `-- c.pdf
`-- txt
    |-- a.txt
    |-- b.txt
    `-- c.txt

tree is not a standard command and you'll probably need to install it yourself.

tree and special characters in file names

Sometimes you have files with names that contain foreign or special characters. By default, tree prints such characters as escaped octal numbers:

$ ls *
den:
danish_chars_are_æøå.txt

swe:
swedish_chars_are_åäö.txt

$ tree
.
|-- den
|   `-- danish_chars_are_\303\246\303\270\303\245.txt
`-- swe
    `-- swedish_chars_are_\303\245\303\244\303\266.txt

2 directories, 2 files

The simplest way to get a nicer output is to use the -N flag:

$ tree -N
.
|-- den
|   `-- danish_chars_are_æøå.txt
`-- swe
    `-- swedish_chars_are_åäö.txt

2 directories, 2 files

Some silly scripts

A script which kills an already running instance of itself

Here's a script which sleeps 30 seconds. If we start another instance of the script while the first one is running, the second script kills the first:

#!/bin/bash

[[ -e /tmp/slow.pid ]] && ps -p `cat /tmp/slow.pid` &>/dev/null && kill -9 `cat /tmp/slow.pid`

echo $$ > /tmp/slow.pid
sleep 30
rm /tmp/slow.pid

Here's how it works:

The script starts off by checking if there's a file in /tmp called slow.pid. If it is there, it checks if the PID found in the file is running currently. If it is, it kills that PID.

Next, the script creates the file /tmp/slow.pid with its own PID as the contents.

Then it's time to sleep. After the sleep is done, the PID file is removed.

If another instance of the script is started while the first one is running, the second instance will find the PID file and kill whatever PID is inside that file.
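
The check-and-kill step can be tried in isolation. The sketch below differs from the script above in a few ways that are choices made for this example: it uses a temporary PID file from mktemp instead of /tmp/slow.pid, a background sleep 30 stands in for the first instance, kill -0 (which only tests whether the PID exists) replaces ps -p, and a plain kill (SIGTERM) replaces kill -9.

```shell
#!/bin/bash
pidfile=$(mktemp)

# stand-in for the first instance: a background sleep that records its PID
sleep 30 &
echo $! > "$pidfile"

# what the second instance does: read the PID and kill it if it is alive
old_pid=$(cat "$pidfile")
killed=no
if kill -0 "$old_pid" 2>/dev/null
then
  kill "$old_pid"
  killed=yes
fi
wait "$old_pid" 2>/dev/null || true   # reap the killed process

if kill -0 "$old_pid" 2>/dev/null
then
  still_running=yes
else
  still_running=no
fi
echo "killed: $killed, still running: $still_running"   # killed: yes, still running: no
rm -f "$pidfile"
```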

A script which gives a PID a certain number of seconds to finish before killing it

#!/bin/bash

TTL=$1
PID=$2

function die(){
 echo $1
 exit 1
}
ps -p $PID &> /dev/null || die "No process with PID $PID is running. Aborting timer."

PROCESS=`ps -p $PID|tail -1|awk '{print $4;}'`
echo "Starting a TimeToLive timer of $TTL seconds for $PROCESS (PID $PID)"
sleep $TTL
echo Time to live of $TTL seconds reached. Checking for PID $PID
ps -p $PID &> /dev/null
if [[ $? -eq 0 ]]
then
 echo "Killing PID $PID ("`ps -p $PID|tail -1|awk '{print $4;}'`")"
 ps -p $PID &> /dev/null && kill -9 $PID
else
 echo No process with PID $PID is running. $PROCESS must have died or completed.
fi

Here's how it works:

The script takes two arguments, the time-to-live and the pid-to-kill.

First, the script makes sure that the PID is running.

Second, the script finds out the name of the program to kill.

Third, it sleeps the time-to-live seconds.

Fourth, after sleeping, it checks if the PID is still running.

If it is, it kills it. If it isn't, it prints a message and completes.
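
A variation on the same idea, as a sketch only: instead of sleeping for the whole time-to-live, the hypothetical function wait_or_kill below checks once per second whether the process is still there, so it can return early when the process finishes by itself. It uses kill -0 instead of ps -p and sends SIGTERM rather than SIGKILL. Those are choices made for this example.

```shell
#!/bin/bash
# wait_or_kill is a hypothetical helper: wait_or_kill TTL PID
# returns 0 if the process ended by itself, 1 if it had to be killed
wait_or_kill() {
  local ttl=$1 pid=$2 elapsed=0
  while kill -0 "$pid" 2>/dev/null   # signal 0 only tests whether the PID exists
  do
    if (( elapsed >= ttl ))
    then
      kill "$pid"
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

sleep 30 &
target=$!
result=0
wait_or_kill 2 "$target" || result=$?
wait "$target" 2>/dev/null || true   # reap the killed process
echo "result: $result"   # 1, since sleep 30 outlives the 2 second limit
```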

Some uses of the date command

Looping from a date and increasing it one day at a time

$ day=$(date +%Y%m%d) # init day to today's date
$ for i in $(seq 1 30) # loop 30 times ("days")
> do
>   echo "Date: $day" # print date
>   day=$(date +%Y%m%d -d "$day + 1 day") # increase day by one day
> done
Date: 20180107
Date: 20180108
Date: 20180109
Date: 20180110
Date: 20180111
Date: 20180112
Date: 20180113
Date: 20180114
Date: 20180115
Date: 20180116
Date: 20180117
Date: 20180118
Date: 20180119
Date: 20180120
Date: 20180121
Date: 20180122
Date: 20180123
Date: 20180124
Date: 20180125
Date: 20180126
Date: 20180127
Date: 20180128
Date: 20180129
Date: 20180130
Date: 20180131 #wraps to next month
Date: 20180201
Date: 20180202
Date: 20180203
Date: 20180204
Date: 20180205
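
The same can be done by computing each date as an offset from a fixed start date, here combined with a C-style loop. This sketch assumes GNU date (the -d option is not available in every date implementation). The start date and TZ=UTC are picked only to make the result repeatable:

```shell
#!/bin/bash
# requires GNU date for the -d option
start=20180130
days=()
for ((i = 0; i < 3; i++))
do
  # TZ=UTC avoids surprises around daylight saving time changes
  days+=("$(TZ=UTC date +%Y%m%d -d "$start + $i day")")
done
printf 'Date: %s\n' "${days[@]}"
```

With the start date above, the third date is 20180201, so the month boundary is handled by date itself.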