Bash: Standard streams

From Juneday education


Meta information about this chapter



Introduction

This chapter introduces the standard streams for input and output (standard out, standard err, standard in) and pipes.

Purpose

The purpose of this chapter is to give the students insight into streams. It is important to understand that output can go to different destinations: normal output goes to standard out, while error messages and diagnostics go to standard err. This understanding helps students filter output from programs, and also write programs which behave in a standard way (sending error messages to std err, other output to std out).

It is also important for students to understand the concept of standard input. Standard input is one of the fundamentals of flexible (command line based) applications, since it allows the users to choose between interactive and automated use of a program. Applications reading from standard in can also be used with pipes. This chapter also mentions redirection with the append syntax, >>, mostly for completeness. It is not hard to understand or use, but it should be in the student's toolbox.

Pipes are included in this chapter because they are one of the things that make shell commands so powerful. Being able to combine output from commands with other commands, like a chain of filters, allows the students to express pretty complicated tasks on a single command line, which could eliminate the need for writing a script or program. We also believe that understanding how to chain commands could help programming students understand complex function calls (or even lambda expressions), where a function call acts as the argument to another function call.

Using pipes can be done interactively, which also might help in understanding how to "divide and conquer" a problem: solve a simple task first and then add complexity gradually until we have reached a full solution to a complex problem. An example could be to use pipes to parse a log file, where the task is to filter out some field from certain lines. Here, one approach could be to first use grep to get the correct lines, then pipe the result of grep to a filter which prints only the field we are interested in. This output could in turn be piped to a filter which sorts the input in some way.

We think that this approach could foster students to work in small increments, rather than trying to solve the whole problem at once.

Pseudo code:

Stepwise refinement:

grep 404 logfile ----- did we get all lines containing "404"?
grep 404 logfile | awk '{print $4;}' ------- did we get the correct column/field from those lines?
grep 404 logfile | awk '{print $4;}' | sort -r ---------- sort the result in reverse
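As a concrete illustration of the stepwise refinement above, here is a runnable version (the log file and its three-field format are made up for this example; the interesting field is the second one here, rather than the fourth):

```shell
# Create a small fake access log (three space-separated fields per line)
printf '%s\n' \
  'GET /index.html 200' \
  'GET /missing.png 404' \
  'GET /style.css 200' \
  'GET /old-page.html 404' > access.log

# Step 1: did we get all lines containing "404"?
grep 404 access.log

# Step 2: did we get the correct field (the requested path) from those lines?
grep 404 access.log | awk '{print $2;}'

# Step 3: sort the result in reverse
grep 404 access.log | awk '{print $2;}' | sort -r
```

Each step can be verified interactively before the next pipe stage is added, which is exactly the working style we want to foster.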

Goal

The goal of this chapter is that the students will understand the concepts and uses of the standard streams for output, errors and input, as well as the concept of and uses for pipes. An additional goal is that the students will realize the power the shell gains from combining commands using pipes.

Instructions to the teacher

Common problems

The syntax for redirecting the error stream is not very intuitive or easy. It is OK to encourage the students to use some kind of reference card (or similar, like a cheat sheet) until they are confident in using the syntax.

Another difficult topic is commands or programs which can be used either interactively or through standard input. The bc calculator is, in our belief, a helpful example for understanding this. The point to make about when to use it interactively and when to send data to its input stream is this: if we want to automate a task, we don't want to hire staff to enter a lot of expressions manually and interactively into bc. We'd prefer to accumulate the expressions in a file and then send the complete file as a batch to bc, which could be scheduled and run automatically, even in the middle of the night. The output with the results could be sent via email, or saved in a file so that we can use it when we have the time.
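A minimal sketch of such a batch run (the file names expressions.txt and results.txt are made up for the example):

```shell
# Accumulate expressions in a file...
printf '%s\n' '2+3' '10*10' 'sqrt(81)' > expressions.txt

# ...then send the whole file to bc in one batch,
# saving the results for later
bc < expressions.txt > results.txt

cat results.txt
# 5
# 100
# 9
```

Exactly this kind of command line could then be put in a scheduled job, with no human sitting at the keyboard.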

For database students, it is important to understand streams and interactive mode. Most database management systems come with an interactive shell which starts when the dbms is invoked, but which doesn't start interactively if there is data on the standard input stream. This way, the same command can be used to enter the interactive dbms shell, as well as to run SQL commands from a file or from standard input via a pipe. Therefore, it helps the students greatly if the teacher and tutors patiently manage to push them over the threshold for understanding this. This scenario, with batch processing of SQL commands, can be used as an example to motivate the usefulness of a program being capable of both interactive mode and processing mode. You can give database export and import as an example. It is easy for the students to realize that entering a full SQL export of a large database by hand interactively is not preferable in comparison to sending it to the dbms using redirection or pipes!
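The export/import scenario can be sketched like this (sqlite3 is used since it appears later in this chapter; the database and file names are invented for the example):

```shell
# Build a small database and export it as SQL text
sqlite3 olddb.db 'CREATE TABLE users(id INTEGER, name TEXT);'
sqlite3 olddb.db "INSERT INTO users VALUES (1, 'Ada');"
sqlite3 olddb.db .dump > export.sql

# Replay the export into a new database - no interactive typing needed
sqlite3 newdb.db < export.sql
sqlite3 newdb.db 'SELECT name FROM users;'
# prints: Ada
```

The same sqlite3 command either gives an interactive prompt or processes the redirected SQL, depending only on whether data is available on standard in.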

See below for individual links to the videos.

Description

Standard streams

There are three standard streams in bash. First, we have the standard out stream, which is the default or normal stream for communicating with us. Second, there is the standard error stream (sometimes called standard err), which is the default stream for showing us information when stuff goes wrong. Third, we have the standard input stream (sometimes called standard in). That is the standard way for us, the users, to communicate with bash (or with programs running inside bash). So far, you have only used standard out and standard in (for instance using redirection).

Below, we give a brief description of each of the three streams.

Lecture videos and slides

Lecture - Streams introduction

Lecture - Using streams

Redirecting streams (whiteboard and live videos)

Introduction to using pipes (whiteboard and live videos)

English live coding videos

Standard out

You have so far only used standard out, and redirected it to files. Standard out is the default stream for messages and other output and is connected to the terminal running your shell. The simplest example is the echo command. If we ask echo to print "Hello" to its default output stream (the so-called standard out) like this: echo "Hello", the resulting text will be printed to the same terminal where you wrote the command. Sometimes people call this "printing to the screen", which is somewhat incorrect, since the screen is a physical device which may display the terminal and other graphical applications. It would be more correct to say that this is writing or printing to the standard output stream (which happens to be connected to the terminal where you typed the command).

Now, as you saw in previous chapters, the standard out stream can be redirected to some destination other than the current terminal window, like a file: echo "Hello" > hello.txt. The stuff that is printed to standard out should be the expected output from a program. Running echo, we expect output based on the argument to echo. Running ls to list files in the current directory, we expect a list of files. So the "standard" part of "standard out" could be thought of as normal, expected or requested output. It is the default channel for programs to interact with us or the outside world.


Standard err

The standard error stream is primarily meant for writing error messages and diagnostic messages (stuff which can be interesting to the user, but is not part of the expected output). The reason for having a dedicated stream for this is so that we can separate expected output from errors and diagnostics. As we will see in this chapter, we can actually redirect standard err to some file so we don't have to bother with it until we have time. We can even redirect standard out to one file and standard err to another file at the same time. This could be useful, for instance, if you are running some lengthy process (like a server) which outputs a lot of logging information for archival purposes, and you don't want to mix normal (expected) output with possible error messages. Using the two different streams, you can have one log file for the normal logging and an "error log file" for documenting errors that occurred (for instance for debugging the server).
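A tiny sketch of the two-file idea (here we provoke an error by asking ls about a file that doesn't exist; the file names are made up):

```shell
# ls finds one file and fails on the other: the listing goes to
# out.log, while the error message goes to err.log
touch exists.txt
ls exists.txt no-such-file.txt > out.log 2> err.log

cat out.log   # contains the expected output: exists.txt
cat err.log   # contains the error message about no-such-file.txt
```

Because the two streams are separate, neither file "pollutes" the other.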

Both streams have the prefix "standard" which tells us that this is normal behavior for programs. That's the reason we bring it up in this book, so that you are prepared for standard behavior of the commands and programs you run inside the shell. If you are a programmer (or aspire to become one), it is good to know about this practice and follow the standard. It is good practice to write error messages to standard err and other (more expected output) to standard out.


Standard in

Programs which process input or interact with the user via the command line (bash, for instance, does this - you enter commands interactively on the command line) read their input from the standard input stream. This is a standard which allows us to write (and use) flexible programs which can act both interactively and "automatically". We will see in the exercises the power of letting commands operate on input from files, or even on input from other commands.

Programs which can operate on input both interactively and from the output of another program are much more useful than programs that depend on static data, such as data in a named file. This feature is typical for programs which do some kind of processing, such as bc (the calculator), grep (which searches for text and prints matching lines to standard out), database management systems (which process database queries expressed in SQL, for instance) and wc (which counts lines, words and characters in text). It is not typical for more static commands such as ls (which simply lists files in some directory). If we think about ls, it does not process any kind of input, but rather tells us information about the contents of the file system. The data it uses is rather static in nature: "What is the current contents of the file system at some location?". It's hard to imagine what data we could send to ls for processing. The calculator bc, on the other hand, needs input from us in order to do some processing. We send it the mathematical expressions we need it to evaluate for us, and it can be different expressions each time. The ls command rather goes to the file system, investigates it and gives us a report.

You could compare ls to a TV-guide magazine. We consult the TV-guide by looking at the pages for today's TV shows, for instance. The TV-guide contains the information but we have to look up the page for today's schedule. We use the TV guide to get some specific information. That's a little like asking ls to list the contents of some directory.

The grep command could be compared to a dry cleaner. The dry cleaner cleans clothes. We bring or send clothes to the dry cleaner, and the dry cleaner will work on what we provide. The dry cleaner knows how to clean clothes. We provide the clothes, and when the dry cleaner is done, they are clean. The grep command works in a similar way. We provide some text and a search phrase (the clothes), and grep gives us the matching lines back (the clothes without the dirt and stains).

In an interactive mode, we'd have to go to the dry cleaner, hand over one piece of clothing at a time and wait for it to be cleaned. This could be OK in some situations (such as when we didn't bring an extra shirt on a trip and spilled coffee on the one we wore). But it is really good to also have the possibility to give the dry cleaner a bunch of clothes which will be processed in sequence.

The standard input stream for a program gives us the possibility to send a stream of tasks to the program, which it reads and processes one line at a time, so that we don't have to type in all those lines manually.


Pipes

Pipes are about connecting the standard out stream of one program with the standard in stream of another program. In this way, we don't need to save intermediate results in a file, in order to process it further with another program. We can run one program and directly connect its output to the input stream of another program, using a pipe.
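A first taste of the difference (counting how many entries ls lists; the exact number will of course depend on the directory's contents):

```shell
# Without a pipe: save the listing to a file, then count its lines
ls > listing.txt
wc -l < listing.txt

# With a pipe: the output of ls becomes the input of wc directly,
# and no intermediate file is needed
ls | wc -l
```

Note that the piped version leaves no listing.txt behind: the data flows straight from one program to the other.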


Interactive or non-interactive

Many applications work both in interactive mode (prompting you for a command and executing them one by one) and in non-interactive mode (like performing a task on every line available on standard in).

Some examples include:

  • bc
## Non-interactive using a pipe ##
$ echo -e '1+1\n2+2\n3+3'|bc
2
4
6
## Interactive - absence of a pipe ##
$ bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
1+1
2
2+2
4
3+3
6
## Ctrl-D was pressed on the last line ##
  • cat
## Non-interactive - presence of a pipe to standard in ##
$ echo hej | cat
hej
$
## Interactive - absence of a pipe to standard in (and absence of an argument) ##
$ cat
hej  # entered by the user
hej  # echoed back by cat
$
## User pressed Ctrl-D on the last line ##
  • sqlite3
## Non-interactive - presence of a pipe to standard in ##
$ echo "CREATE TABLE dummy(id integer);"|sqlite3
## Interactive - absence of data on standard in - no pipe or redirection! ##
$ sqlite3
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .quit
$
  • grep
## Non-interactive - presence of data on standard in via a pipe ##
$ cat file.txt | grep as
asdf
asdf
asdf
asdf
## Interactive - absence of data on standard in (and no argument of file name) ##
$ grep as
is
es
os
ös
as
as ## This is echoed back by grep, since it is a match for as typed on the previous line
## User pressed Ctrl-D on the last line to exit interactive mode ##

Here's a script which works differently depending on whether data is available on standard in (through a pipe or redirection) or if it is invoked in a terminal without available data.

#!/bin/bash

# Is a pipe sent to this script?
if [ -p /dev/stdin ]
then
    while read -r line
    do
        echo "Got: $line"
    done
    exit 0
fi
# Is a redirect used for this script
# so it should read from file?
if [[ ! -t 0 ]]
then
    while read -r line
    do
        echo "Got: $line"
    done
    exit 0
fi
# No pipe, and no redirect - so run it
# interactively with a prompt!
echo -n "pipable> "
while read -r line
do
    echo "Got: $line"
    echo -n "pipable> "
done

Example run:

$ echo -e 'Hello\nand\ngood bye'|./pipeable.sh
Got: Hello
Got: and
Got: good bye

### And now using a redirect ###
$ echo -e 'Hello\nand\ngood bye' > hello.txt
# Run the script with a redirect from hello.txt:
$ ./pipeable.sh < hello.txt 
Got: Hello
Got: and
Got: good bye

### And now without a pipe - so interactive mode with a prompt and all ###
$ ./pipeable.sh
pipable> Hello
Got: Hello
pipable> and
Got: and
pipable> good bye
Got: good bye
pipable>
### Ctrl-D was pressed on the last line ###

Here's a Java program demonstrating the same behavior (doesn't work on Cygwin):

import java.util.Scanner;

public class Interactive {

  public static void main(String[] args)throws Exception {
    Scanner sc = new Scanner(System.in);
    if (System.console() == null) {
      while (sc.hasNextLine()) {
        System.out.println(sc.nextLine());
      }
    } else {
      System.out.print("Interactive >");
      while (sc.hasNextLine()) {
        System.out.println(sc.nextLine());
        System.out.print("Interactive >");        
      }
    }
  }
}
/*
$ javac Interactive.java && java Interactive
Interactive >asdf
asdf
Interactive >asdf
asdf
Interactive >
$ cat file.txt | java Interactive
asdf
asdf
asdf
asdf
$ java Interactive < file.txt 
asdf
asdf
asdf
asdf
*/

Here's a C program showing the same principle:

#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int
main(void)
{
  char *line = NULL;
  size_t len = 0;
  if (isatty(0))
    {
      printf("interactive> ");
      while(getline(&line, &len, stdin) != EOF)
        {
          printf("%s", line);
          printf("interactive> ");
        }
    }
  else
    {
      while(getline(&line, &len, stdin) != EOF)
        {
          printf("%s", line);
        }
    }
  free(line);
  return 0;
}
/*
$ ./interactive 
interactive> asf
asf
interactive> rwerg
rwerg
interactive> dfgsdf
dfgsdf
interactive> 
$ cat file.txt | ./interactive 
asdf
asdf
asdf
asdf
$ ./interactive < file.txt 
asdf
asdf
asdf
asdf
*/

That's right, we forgot to mention that bash itself behaves exactly like the above examples:

## Non-interactive - data available on stdin via pipe ##
$ echo "echo hej" | bash
hej
$
## Interactive - no data available on stdin via pipe or redirect ##
$ bash
$ echo hej
hej
$ exit
exit
$
## bash - starts a new interactive shell ##
## exit - leave the new interactive shell ##

Exercises and examples

Ex 01 - Redirecting standard out and standard err

Let’s produce some error messages! Do the following:

$ echo "9/0" > div.txt

We’ve just created a text file with the content "9/0". What if we ask bc to calculate the value of that? We know how to redirect input to come from a file, rather than the keyboard:

$ bc < div.txt
Runtime error (func=(main), adr=5): Divide by zero

That was kind of expected (we hope). But can we redirect that output to a file? Let’s try that!

$ bc < div.txt > result.txt
Runtime error (func=(main), adr=5): Divide by zero

Wait, what now? We still saw the result in the terminal? What ended up in result.txt, then?

$ cat result.txt

Nothing? This is because the error message was not printed on the standard output stream. It was actually written to the standard error stream, which we didn’t try to redirect. Could we also redirect the standard error stream, you ask? Of course we can. But we have to know that each stream has a number in the shell, and the number for stderr is 2. In order to redirect stderr, we have to say:

command 2> errors.txt

The syntax means: run "command" (could be any command) and redirect stream number 2 to the file errors.txt. And now that we know that stream number 2 is the stderr stream, we also know that the syntax could be read as “run command but redirect stderr to the file errors.txt”.

To see if we are kidding you, let’s try it:

$ bc < div.txt 2> result.txt
$ cat result.txt
Runtime error (func=(main), adr=5): Divide by zero

What do you know, it actually worked. We silenced the errors from bc by redirecting them to the file result.txt, and when we looked in it, it contained the error message.

But it would have been nicer (we think) if any good result always went to result.txt and any errors instead went to errors.txt (a file that didn’t exist yet, but would be created on error). That would look like this:

$ bc < div.txt > result.txt 2> errors.txt
$ cat result.txt
$ cat errors.txt
Runtime error (func=(main), adr=5): Divide by zero

It actually worked as we planned! Nothing was written to result.txt (which is now completely empty), and the errors went to errors.txt. Nice?

You have now learned how to redirect standard error to a file, and also how to redirect standard out to one file, and standard error to another file, at the same time on the same command line!
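A related variation you may run into (not needed for the exercises, but common in practice) is sending both streams to the same file with the 2>&1 syntax, which means "make stream 2 go to wherever stream 1 currently goes":

```shell
echo "9/0" > div.txt

# Both any result and any error message end up in all.txt
bc < div.txt > all.txt 2>&1
cat all.txt
```

Note that the order matters: the redirection of stream 1 must come before 2>&1 for both streams to land in the same file.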

Ex 02 - Appending standard out to a file

Next, let's look at how we can write lines to the end of a file, so that the file gets longer and longer. This is done using the "append" operator >>:

$ echo First Line > textfile
$ echo Second Line >> textfile
$ echo Third Line >> textfile
$ cat textfile
First Line
Second Line
Third Line
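Appending is especially handy inside a loop, where each iteration adds a line to the same file:

```shell
# Build a small file incrementally, one appended line per iteration
rm -f loop.txt
for i in 1 2 3
do
    echo "Line number $i" >> loop.txt
done
cat loop.txt
# Line number 1
# Line number 2
# Line number 3
```

Had we used > instead of >> inside the loop, each iteration would have overwritten the file and only the last line would remain.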

Ex 03 - Building up log files

Let’s create some input files for bc, and run them all as input to bc, while appending the results to result.txt and any errors to errors.txt. We'll start with the input files:

$ echo "9*9" > product.txt
$ echo "8*8" > product2.txt
$ echo "6*8" > product3.txt
$ echo "9/0" > div.txt
$ echo "6/8" > div2.txt
$ echo "6/0" > div3.txt
$ echo "0/8" > div4.txt

Now, we'll remove the old files from the previous examples:

$ rm result.txt errors.txt

Now, we're ready to put bc to work and separate the good output from any errors:

$ bc < product.txt >>result.txt 2>> errors.txt
$ bc < product2.txt >>result.txt 2>> errors.txt
$ bc < product3.txt >>result.txt 2>> errors.txt
$ bc < div.txt >>result.txt 2>> errors.txt
$ bc < div2.txt >>result.txt 2>> errors.txt
$ bc < div3.txt >>result.txt 2>> errors.txt
$ bc < div4.txt >>result.txt 2>> errors.txt

Let's investigate the resulting files:

$ cat result.txt errors.txt # type both files to the screen
81
64
48
0
0
Runtime error (func=(main), adr=5): Divide by zero
Runtime error (func=(main), adr=5): Divide by zero

As you can see, first came the successful calculations, from the result.txt file. Then came the errors, from the errors.txt file.

Can you identify which files produced which successful result? And which files produced the errors?

Ex 04 - Doing some plumbing, using pipes

We suggest watching the Pipes video (eng) before doing this exercise.

While it is nice to be able to redirect streams, the real power of bash lies in something called pipes. We’ll introduce a lot of uses for pipes in later exercises, but we’ll show you the basic syntax already now. A pipe is like a tube you put between two commands, making the standard output of the first command the standard input of the second command. It’s kind of like storing the output of the first command in a file, and then making that file the input stream for the next command, except that we don’t need the file at all!

It’s easier to understand this by looking at some concrete examples. Remember this from the sections above?

$ echo "9*9" > multi.txt
$ bc < multi.txt
81

With pipes we could do the exact same thing but skipping the part of storing the expression in a file:

$ echo "9*9" | bc
81

And do you remember this?

$ ls -1 file* > the_file_files
$ grep '[13579]' the_file_files

With pipes, again, we don’t need the file, since we can feed the output of ls directly to grep using a pipe:

$ ls -1 file* | grep '[13579]'
file1
file3
file5
file7
file9

This was a short introduction to piping commands together. We hope you enjoyed it and want to learn more. There are lots of examples available online, and we hope we have inspired you to read more and try this yourself!

Solutions


Solution to Ex01

There's not much more to add here! We hope you see the usefulness of being capable of sending normal output to one file and errors to another.

Solution to Ex02

There's not much to say about appending. It is as simple as redirection but writes to the end of the file instead of overwriting the file. Useful when building a file incrementally.

Solution to Ex03

The successful calculations come from:

81      product.txt  (9*9)
64      product2.txt (8*8)
48      product3.txt (6*8)
0       div2.txt     (6/8) -- integer division gives 0
0       div4.txt     (0/8)

The errors came from:

div.txt   (9/0)
div3.txt  (6/0)

Solution to Ex04

Feel free to revisit previous chapters and see if you can do any of the exercises or examples using pipes instead of files. When using a pipe between program1 and program2, like this: program1 | program2, you can read it out loud as: "take the output from program1 and use it as input for program2".

You can add as many pipes as you want. Consider for instance:

$ cat big_list_of_emails.txt | grep gmail | sort | head -10

If big_list_of_emails.txt contains a big list of email addresses with one address on each line, the command line above would keep all email addresses containing the string "gmail", then sort those addresses alphabetically, and then keep the first 10 addresses for output to standard out. You can try this yourself by creating a text file with a lot of fake email addresses, one on each line. Make sure the file is not sorted and contains at least 10 gmail addresses.
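Here is one way to create such a test file and try the pipeline (the addresses are invented for the example, and a smaller list is used to keep the output short):

```shell
# A small unsorted list of made-up addresses
printf '%s\n' \
  'zoe@gmail.com' \
  'bob@example.org' \
  'adam@gmail.com' \
  'carol@gmail.com' > big_list_of_emails.txt

cat big_list_of_emails.txt | grep gmail | sort | head -10
# adam@gmail.com
# carol@gmail.com
# zoe@gmail.com
```

The non-gmail address is filtered out by grep, the remaining three are sorted, and head passes them all through since there are fewer than 10.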
