Difference between revisions of "Chapter:Computer introduction"

From Juneday education

Latest revision as of 18:57, 10 September 2019


Computer introduction

Here's the information and video lectures on the Computer Introduction chapter. The next chapter has exercises/questions covering this chapter. When you have read this chapter and seen the video lectures, you know what to do! Head on to the next chapter and do the exercises/questions to check your progress.

Meta information about this chapter


Books this chapter belongs to

Introduction

The concepts here will give you a very basic introduction to the computer and the hardware components it consists of.

Memory is used in all programs, no matter what programming language. This lecture shows the similarities between how we humans think and how computers “think”. This presentation also gives the needed background for understanding variables, which is an important concept in languages such as Java and C.

You will get a basic introduction to the OS (operating system) of a computer.

When developing software, files and directories are used all the time: software code is stored in a file, which in turn is located in a directory.

Understanding the file system at a basic level is important to understand the build process of programming languages, especially when developing C programs.

Knowing what a program is and what file types can be executed on an OS and on a computer with no OS is important to understand how a program can be developed, built and executed. It is not possible (right now) to execute a Java program on an Arduino even if we can compile it to a binary (as opposed to the normal Java byte code). Why is that?

Purpose

To get a good understanding of variables and computer programs in general, a brief understanding of the computer’s memory is needed.

The concepts are important to understand when programming since programs use the OS (operating system) to manage the underlying hardware. Getting a basic understanding of the services offered to the programmer by the OS is a foundation for understanding the role and surrounding environment of a program, as well as giving an understanding of the role of the programming language C.

Give the student knowledge about the file system.

Programs, audio, photos, ... everything is stored in files. So an understanding of files and file systems is crucial when programming.

These lectures give the student a clearer picture of how programs, processes and source code relate to each other.

Goal

Provide a basic understanding of a computer’s hardware and a basic understanding of how the computer’s hardware is abstracted and offered by the OS to the programmer via a programming language. This will serve as a foundation for the coming lectures.

The student shall:

  • be able to explain what computer memory is and how it is used/addressed.
  • be familiar with and be able to create directories
  • be familiar with and be able to find files in a file system
  • understand what a file system is
  • have an understanding of how programs and processes relate to source code, OS and files.
  • understand the role of and the services offered by an OS

Updating the wiki

If you find bugs/errors or would like to make changes to the texts on this wiki, please contact us.

Exercises and solutions in Git

Source code (exercises, examples and solutions) are stored in a Git repository at: https://github.com/progund/computer-introduction.

Instructions to the teacher

Common problems

Files and directories seem both easy and neglected in software development literature. Based on our experience we put emphasis on these concepts since they make it easier to explain the build process and serve as a foundation for understanding packages (in Java).

The concepts of paths (relative and absolute) and directories become especially important when introducing packages and the class path (in Java).

The basic concept of a computer computing values, following instructions, and storing values in memory has proven to be important for understanding the use and the concept of variables.

This chapter also includes an introduction to the concept of programs and processes. In the authors' opinion, books and courses often forget to stress the fundamental difference between programming at compile time and executing a program at runtime. The authors feel it is our obligation to lay a foundation of these concepts early on, and keep returning to it in chapters further on in the book.

All videos for this chapter

All English videos in this chapter

All Swedish videos in this chapter

See below for individual links to the videos and slides.

Hardware

Description

A computer, be it your laptop or your mobile phone, is physical in the sense that you can hold it in your hand. Contrast this to the apps or programs you've installed on your computer - it's hard to grab a hold of them. All the computer's physical components, such as the screen, make up the computer's hardware. We will look a bit closer at some of the hardware components.

Software, such as apps or programs, will be dealt with later on in this chapter.

CPU

Imagine someone asking you to add 200 and 317. Assuming you don't actually know the result without counting, some part of your brain performs the calculation and comes up with the answer 517. The part of a computer that performs the corresponding calculation is called the CPU (central processing unit). CPUs come in many different architectures and versions, but let's not dive too deep into this. A CPU is responsible for performing instructions such as adding, subtracting, logical operations (AND, OR, ...) and input/output (screen, keyboard, ...). The instructions to perform these tasks differ a lot between the different architectures of CPUs.
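As a small illustration of the addition above, here is a sketch in Java (the language of this course). The class name AddDemo and the method name add are our own choices for this example. When the program runs, it is ultimately the CPU that carries out the addition.

```java
class AddDemo {
    // Adds two whole numbers. When the program runs, the CPU performs the addition.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        int sum = add(200, 317);
        System.out.println(sum); // prints 517
    }
}
```

You do not need to understand every word of the code yet. The point is that a single line such as a + b ends up as one or a few CPU instructions.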

Memory

Imagine someone asking you for your name. Directly from memory you'd say your name out loud. Perhaps, every now and then, it may take a while, but it's still quick, we hope. The computer has a similar fast memory, called RAM (Random Access Memory). A program is stored in RAM when executed - think of it as you storing the instructions of a recipe for bread in your memory while baking. When the power goes off, the computer loses all the data/information in RAM.

On the computer's CPU there are hardware components we could call memory as well, but we will skip this and focus on the stuff that gets us going in this course.

Hard Disk Drive

Let's go back to the recipe for the bread above. You've probably experienced not remembering things and having to look them up. This happens because you, in some way, prioritized remembering other things instead, since there is a limit to the amount of things you can store in your brain. The same goes for your computer's memory, RAM, which also has a limit. RAM is quick but you cannot store every piece of information there. Humans have traditionally solved this by using pen and paper, ledgers, notebooks or similar tools and looked things up using these tools. The same goes for a computer, but the pen and paper are replaced by a hard disk. A hard disk is a device used for storing data. There are tons of different kinds of hard disks out there, but in this course we will use the general term hard disk and be happy knowing that a hard disk can store data.

If you write a document, shoot a movie or take a picture with a digital camera, you can store these on the hard disk. It is obvious that information stored on paper somewhere will take you longer to look up than if you'd stored it in your memory. The same goes for hard disks - it takes more time to retrieve information from a hard disk than from RAM. A good thing with the hard disk is that it can store a lot more information than RAM and still be cheap. Another good thing is that the data/information stored on the hard disk remains even if the power goes off.
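The difference between RAM and the hard disk can be sketched in Java. In the hypothetical example below (the class name PersistDemo and the method saveAndLoad are made up for this illustration), a text first lives in a variable, that is in RAM, then gets written to a file on disk and is finally read back from that file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PersistDemo {
    // Writes the text to a temporary file on disk, reads it back and returns it.
    static String saveAndLoad(String text) {
        try {
            Path file = Files.createTempFile("recipe", ".txt");
            // From RAM (the variable text) to the disk (the file).
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            // From the disk back into RAM (the variable stored).
            byte[] stored = Files.readAllBytes(file);
            // Clean up; without this line the file would stay on the disk.
            Files.delete(file);
            return new String(stored, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String inMemory = "flour, water, yeast, salt";
        System.out.println(saveAndLoad(inMemory));
    }
}
```

If the call to Files.delete were removed, the file would still be there after the program has ended and even after a restart of the computer, while the variable inMemory disappears as soon as the program ends.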

Screen/Keyboard/Mouse

If we look at a typical laptop of today we can quickly identify a couple of hardware components:

  • screen
  • keyboard (well, some computers come with only a "virtual" keyboard)
  • touchpad/mouse

These are used for communication between the computer and the user. Typically you type a command or click a button, which is your way of sending commands to the computer. The computer and the programs executing on it, on the other hand, give feedback to you using the screen, the loudspeaker/headphone jack or some other hardware.

Network

When you fetch information, be it a movie, picture or something else, over the internet you're getting data from someone else's computer. Your computer is (most likely) connected to the other computer over the internet using network cards, typically wireless, Ethernet or Bluetooth. This is hardware designed to transfer information over a network of computers.

We have separate course material on network and web which you can look into if you want to go further.

Videos (Hardware)

  1. Hardware (eng) (sv) (download presentation)

Section links (Hardware)

File, file format, and other files

File

A file is a construction in a computer (actually in the operating system, see below) which is used to store and name a piece of information/data you wish to keep. Let's say you want to store information about this course (don't we all dream about doing that?). Then you can create a file using a word processor or a text editor and give it a name like prog-java.txt. If you like, you can store a digital image in a file and name it Bolzano-bushery.jpg.

File format

You have most likely opened a file on your computer with a program. It could be a document which you open with a word processor, such as LibreOffice or Office, a video file which you open with VLC or an image which you open and edit using GIMP or Photoshop. We hope that it will not come as a surprise to you that files are totally different when it comes to their structure and content. Compare this to a physical book, a tape with a movie or a printed image. They are similar in that they contain information but differ greatly when it comes to what they store and how the data is stored. Storing a movie in a book might not give the expected user experience. The formats of the book, image and movie tape are different. It is the same thing with how files are stored on the hard disk. An image is stored in an image format. A movie is stored in a video format. A document is stored in yet another format.

These formats of the files, i.e. the different kinds of files, are called file types (or filetypes). If we continue our example with an image stored in a file on a computer, there are several different kinds of file formats for images such as PNG, JPG, GIF and BMP. Let's say we have a photo of some stairs in a hotel in Bolzano. This photo can be stored in any of the formats above and be viewed by a user using some program. Why have many different formats? Well, one answer would be because we can. Another answer is that someone may find a format useless since it lacks a feature (e.g. animated pictures) and the solution is to create a new file type (format) with this feature. How many different video file types do you know? We know a few: AVI, MOV, MPG, OGV and so on. A movie stored in MOV can only be read and correctly interpreted by a program knowing the same format. Think of it like a human language - if you speak to me in English I need to understand English to understand what you're saying.

Let's give an example of the above. Here's a picture of a pdf (as displayed by the program evince): Evince-view.png. If we were to look at the actual content of the file message.pdf in a program used to write text only we would see something like this:

%PDF-1.4
%����
2 0 obj
<< /Linearized 1 /L 18807 /H [ 651 128 ] /O 5 /E 18582 /N 1 /T 18649 >>
endobj

xref
2 11
0000000015 00000 n 
0000000602 00000 n 
0000000651 00000 n 
0000000779 00000 n 
0000000983 00000 n 
0000001078 00000 n 

The pdf document contains information about how the text (and possibly pictures) should be presented (fonts, size, location). This information is what we can see above.

Before we end the section about file types we would like to mention a few things about file suffixes. If you have a file with the name self-portrait.png you probably assume two things: 1) it is a self portrait and 2) the content of the file is an image in PNG format. If a book is called Object Oriented Analysis & Design you'd probably think it is a book dealing with the subject Object Oriented Analysis & Design. But we can't know for sure. The same goes for the image - is it really a self portrait? Let's move on to the second assumption, about the content of the file. Yes, we can assume it is a PNG file. But in no OS, at least no OS known to the authors of the material you're reading, is there a stringent (100% sure) association between the file suffix and the actual content. OK, in Windows the file explorer tells you that a file is in PNG format if the file has the suffix .png, but let's leave that odd fact for now and for ever - we don't want to bash an operating system right here and right now. So the file self-portrait.png really could be a PDF, an AVI or whatever. It is good and recommended to use the suffix to tell what the actual content is, but there are no guarantees.

If you have a file stored in the PNG format called me.png and you change the name to liverpool.avi, do you think that the content of the file all of a sudden will be a video of Liverpool and not an image anymore? It would be like saying that a car is a tree and expecting the car to be transformed into a tree. Suffixes are a common way to hint to users something about the content, but we can't be sure until we really check.
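What could "really check" look like? One way, sketched below in Java, is to look at the first bytes of the file instead of its name. Every PNG file starts with the same eight bytes, a so-called signature, in the same way as the PDF above starts with %PDF. The class name PngCheck and the method looksLikePng are our own names for this illustration, and the sketch only inspects the signature, not whether the rest of the file is a valid image.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

class PngCheck {
    // The eight bytes that every PNG file starts with.
    static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };

    // Returns true if the content starts with the PNG signature.
    static boolean looksLikePng(byte[] content) {
        if (content.length < PNG_SIGNATURE.length) {
            return false;
        }
        byte[] start = Arrays.copyOf(content, PNG_SIGNATURE.length);
        return Arrays.equals(start, PNG_SIGNATURE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java PngCheck <file>");
            return;
        }
        // The file name, including its suffix, plays no part in the check.
        byte[] content = Files.readAllBytes(Paths.get(args[0]));
        String verdict = looksLikePng(content) ? "starts with" : "does not start with";
        System.out.println(args[0] + " " + verdict + " the PNG signature");
    }
}
```

Running this on me.png and then on the same file renamed to liverpool.avi should give the same verdict both times, since renaming a file does not change its content.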

Other files

Apart from files there's a thing called directories, sometimes called folders or catalogues. A directory can contain files and other directories. A directory is a bit like a book shelf in which you can put books. Just as in a public library where you can put a label (name) on a book shelf you can name a directory. This is used to group files (and other directories) that you consider belong together. We will, in the exercises and assignments in this course, strongly encourage you to create a directory structure for the files you write in this course.

Here's a directory (and file) structure which we use for a coming chapter:

|-- net
|   `-- supermegacorp
|       `-- orgmanager
|           |-- Member.java
|           `-- test
|               `-- MemberTest.java
`-- org
    `-- police
        `-- passportsystem
            |-- Passport.class
            |-- Passport.java
            `-- test
                |-- PassportTest.class
                `-- PassportTest.java

There are two directories, net and org, which both contain sub directories. Let's skip org and look into net. This directory contains no files but one sub directory called supermegacorp, which in turn contains a directory called orgmanager. This directory (orgmanager) contains a file Member.java and yet another sub directory test, which in turn contains MemberTest.java.
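Directories are usually created with a file manager or a command in a terminal, but a program can create them too. The following Java sketch creates the net part of the structure above below some base directory. The class name DirDemo and the method createStructure are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class DirDemo {
    // Creates net/supermegacorp/orgmanager/test below base and returns that path.
    static Path createStructure(Path base) {
        Path target = base.resolve("net")
                          .resolve("supermegacorp")
                          .resolve("orgmanager")
                          .resolve("test");
        try {
            // Also creates every missing directory on the way down to target.
            return Files.createDirectories(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory is used so the example does not clutter your own files.
        Path base = Files.createTempDirectory("structure-demo");
        System.out.println("Created " + createStructure(base));
    }
}
```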

There is another file type called link (both hard and soft) which we will skip in this course.

Videos (File, file format, and other files)

Section links (File type)

File system

Description

A hard disk can be used to store files. But how do you know where on the hard disk you stored file assignment.odt? How do you know if this or that piece of memory in the hard disk is free? Where to store the next file? These questions are solved by a file system which is provided to you by the operating system you're using. More on operating systems soon - right now we are going to look at file systems. A file system is basically a structured way of storing many files and keeping track of them all in a centralized way. A bit like a library where you can find tons of books all arranged by the staff and easy to find since there is a structure and a record of the books (and book shelves).

Videos (File system)

  1. File system (eng) (sv) (download presentation)

Section links (File system)

OS

Description

A typical laptop of today has 1-4 CPUs. Still you can run way more than 4 programs at the same time. How is this possible? The CPUs are shared between the programs so that each program frequently gets to run for a small amount of time, giving you (the user) a feeling of the program running constantly and in parallel with the other programs. When storing a file using a word processor, how can the word processor know what hard disk you have, and does it have instructions for each and every hard disk out there? The word processor communicates with a file system provided by the OS. The OS takes care of the details of where on the hard disk the file is. All you need to know is the name of the file and the directory where the file is located. These kinds of services for sharing and abstracting hardware components are included in an OS (Operating System). The OS shares resources such as hard disks, screen, mouse, keyboard, network card and the CPU between the programs. Today an OS is expected to come with a lot more features and it is hard to come up with one definition of an OS. In this course it is enough that you're familiar with the definition above. An operating system is a piece of software, not hardware.
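The sharing of CPUs can be observed from within a program. The Java sketch below asks how many processors are available and then starts four times as many small tasks at once. Note the assumptions: the tasks are threads inside one program, used here as a stand-in for separate programs, and the names SharingDemo and runTasks are made up for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharingDemo {
    // Starts the given number of small tasks at once and returns how many finished.
    static int runTasks(int tasks) {
        AtomicInteger finished = new AtomicInteger();
        Thread[] threads = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            // Each task only counts itself as finished.
            threads[i] = new Thread(finished::incrementAndGet);
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait for the task to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int tasks = cpus * 4;
        System.out.println("Processors available: " + cpus);
        System.out.println("Tasks finished: " + runTasks(tasks) + " of " + tasks);
    }
}
```

All tasks get to finish even though there are more tasks than processors, because the OS hands out small slices of CPU time to each of them in turn.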

Examples of OS:

  • Apple OS X
  • BSD (FreeBSD, OpenBSD, NetBSD)
  • GNU/Linux
  • Microsoft Windows
  • Android
  • iOS

Videos (OS)

  1. OS (eng) (sv) (download presentation)

Section links (OS)

OS (wikipedia)

Program

Description

We hope you remember that the CPU executes instructions. A sequence of such instructions stored in a file is called a program. Examples of programs are Firefox (a web browser), LibreOffice (a word processor) and VLC (a media player). We will show two text printouts of a simple C program on a GNU/Linux system. First, a binary printout (using xxd -b):

00002160: 01001000 00000110 00000000 00000000 00000000 00000000  H.....
00002166: 00000000 00000000 00011110 00000000 00000000 00000000  ......
0000216c: 00101111 00000000 00000000 00000000 00001000 00000000  /.....
00002172: 00000000 00000000 00000000 00000000 00000000 00000000  ......
00002178: 00011000 00000000 00000000 00000000 00000000 00000000  ......
0000217e: 00000000 00000000 00001001 00000000 00000000 00000000  ......
00002184: 00000011 00000000 00000000 00000000 00000000 00000000  ......
0000218a: 00000000 00000000 00000000 00000000 00000000 00000000  ......
00002190: 00000000 00000000 00000000 00000000 00000000 00000000  ......

or a printout from a so-called disassembled version of the same program (objdump --disassemble):

  400527:	48 89 e5             	mov    %rsp,%rbp
  40052a:	48 83 ec 10          	sub    $0x10,%rsp
  40052e:	8b 45 fc             	mov    -0x4(%rbp),%eax
  400531:	89 c6                	mov    %eax,%esi
  400533:	bf f0 05 40 00       	mov    $0x4005f0,%edi
  400538:	b8 00 00 00 00       	mov    $0x0,%eax
  40053d:	e8 be fe ff ff       	callq  400400 <printf@plt>
  400542:	48 8d 45 fc          	lea    -0x4(%rbp),%rax
  400546:	48 89 c6             	mov    %rax,%rsi
  400549:	bf 01 06 40 00       	mov    $0x400601,%edi
  40054e:	b8 00 00 00 00       	mov    $0x0,%eax
  400553:	e8 a8 fe ff ff       	callq  400400 <printf@plt>

The instructions in the program are, as we think you will agree, not easy to read or write by hand. It is possible but not in any way part of this course. We showed the above to give you an idea of what a CPU and a program are.

Note: it is easy to write a program in a programming language and compile this to a program which has a content like the above.
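To make the binary printout a bit less mysterious: each group of eight digits is one byte, and the column to the right shows the same bytes as characters where possible. The first group, 01001000, is the number 72, which is the code for the letter H. The Java sketch below (with names of our own choosing, BinaryDemo and toBits) formats a value the same way.

```java
class BinaryDemo {
    // Formats the lowest eight bits of a value as eight binary digits.
    static String toBits(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        // Pad with leading zeros so that the result is always eight digits long.
        return "00000000".substring(bits.length()) + bits;
    }

    public static void main(String[] args) {
        System.out.println(toBits('H'));  // prints 01001000
        System.out.println(toBits('/'));  // prints 00101111
    }
}
```

Both results can be found in the printout above, next to an H and a / in the right-hand column.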

Compiled program

A program containing CPU instructions (such as Firefox above) is called a compiled program. Compiled programs are written as text in a programming language (see below) that is much more human friendly. This text, called source code, is compiled ("converted") into a program containing CPU instructions. Examples of compiled programming languages are: C, C++, and Ada.

Here's a short C program that prints "Hello world":

#include <stdio.h>
int main()
{
  printf("Hello world");
  return 0;
}

Hopefully this is easier to read and at least partly understand than the above binary printout.

With compiled programs you execute the compiled version of the source code. To change the program you change the source code, which you then need to compile again to create a new program with the changes built in.

Interpreted program

There are other types of programs, one of them being interpreted programs. These are written as text in a programming language but they are not compiled into programs directly. Instead the programs (in text form) are read by an interpreter, which in turn converts the instructions (in the programming language) into instructions for the CPU. So you need a program (the interpreter) to run the program you wrote. Examples of interpreted programming languages are: bash, Python, PHP and Perl.

Here's a short bash program that prints "Hello":

#!/bin/bash
echo "Hello"

With interpreted programs there's no compiled version so the source code and the program are one and the same.

Byte compiled program

Another type of program is a byte compiled program. This is, as above, written as text in a programming language. As with the compiled program, the text (source code) is compiled. But the result is not instructions for the CPU but for a so-called virtual machine or byte code interpreter. Java, which is the subject of this course, is one example of such a programming language.

Here's a short Java program that prints "Hello World":

public class HelloWorld {
  public static void main(String[] args) {
    System.out.println("Hello World");
  }
}

Just as with compiled programs you make changes to your program in the source code which must be compiled (again) if you want your changes to be present in the program that you execute.

Videos

  1. Program (eng) (sv) (download presentation)

Section links (program)

Questions and Answers


Q: “The RAM memory is fast and suitable for storing instructions for the CPU, which can read and execute those instructions in a quick fashion. Those instructions are often called a ‘computer program’ or a piece of ‘software’. But where is the whole program stored?”
A: The program, with all its instructions, is often stored in persistent memory (you may think of it as ‘long time storage’), e.g. on a hard drive. The reason is that most programs in a computer system are typically not running all the time. The user launches a program when the user needs the program to perform some task. The operating system loads the program from the hard drive into the RAM when the program is started. Usually, the program is run many times on different occasions, which is why programs are normally stored on this ‘long time storage’. The faster, but short term, memory called RAM is more of a temporary player in the system. It doesn’t remember anything when the computer is powered down. Programs and other files on the hard drive remain even when the power is turned off. They will remain until someone uses some program to actively delete them from the hard drive.

Q: “You were talking of ‘named memory’, or ‘variables’ as you also called it. I understand that a name I come up with myself is much easier for me to remember and understand than some hardware address in a numeric format. But when and how do I create a variable?”
A: Great question! The variables (which is the more technical name for this named memory) are typically part of the source code for a computer program. You will learn how to write programs (using the Java programming language) throughout this course. But there are actually variables in other places than in the source code of programs! Most operating systems use variables (often called environment variables) to keep track of information that might vary from user to user or vary over time. For instance, many systems have an environment variable to keep track of the user’s preferred language for interaction. Other common environment variables include HOME (the path to the user’s home directory), EDITOR (the path to the user’s preferred text editor software), TZ (the standardized name of the user’s time zone), and PATH (a list of directory paths where the system should expect to find the program executables the user invokes using a command line interface).

In this course, however, we will touch upon environment variables only when we need them, and we will focus on variables used in the programs we learn how to write. Typically, a program deals with data (information) and such data is typically stored in variables. We give variables good names so that we, when we read the code later on, will understand what data is stored in them. Variable names are for humans, memory addresses (where the data is actually stored) are for computers!
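Both kinds of variables can be seen in one small Java sketch. The variable value below is a variable in the program, while HOME, EDITOR, TZ and PATH are the environment variables mentioned in the answer. The class name EnvDemo and the method describe are our own names for this example, and which of the environment variables are actually set will differ between systems.

```java
class EnvDemo {
    // Returns a line describing the environment variable with the given name.
    static String describe(String name) {
        // System.getenv returns null when the environment variable is not set.
        String value = System.getenv(name);
        return name + " = " + (value == null ? "(not set)" : value);
    }

    public static void main(String[] args) {
        String[] names = {"HOME", "EDITOR", "TZ", "PATH"};
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```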

Q: “You talked about two main types of files: text files and binary files. Is it true that the computer can store text directly on the hard drive in the form of a text file? How is such text actually stored on the hard drive?”
A: Again, a very good question! This is actually a common source of confusion. When we say “text file” and that it “stores text”, we actually simplify things a lot. It might be easy to understand that an audio file doesn’t store sounds (like notes and keys) directly on the hard drive. The audio file must be coded into some format that some program later can decode and interpret as music. We need a program to make sense of an audio file (and to enjoy for instance the music in the file). How is this different from text files, you may ask? Actually, text files are also coded before being saved on the hard drive as a text file. But the coding is very simple - it just codes every character or symbol into a number according to some table (usually the ASCII code table). The only thing the numbers represent are what symbols (like letters, digits and special characters) from a standard table (like the ASCII table) the file contains and in what order they occur. There is no information about the fonts or style of the text - so no notion about bold or italic text exists, and no notion about text size. It is only what symbols occur in the file and in what order they occur that is saved in a text file. This is why such text is often referred to as “plain text”. The translation from symbols to numbers from a symbol table (like the ASCII table) is a very simple coding and decoding scheme compared to coding audio or video for instance. In fact it is so simple that there are a great number of programs that can display or save plain text available. These, so called text editors, are so simple and common that we tend to think of the content of the file (the number codes for all symbols) as “human readable” even though it is actually not the numeric symbol code numbers we are reading, it is their character representation. So we could alternatively think of text files as a special case of binary files. 
Most terminal programs (the text based interface to the operating system where we can enter commands using the keyboard) have built in support (or come with commands) for displaying the contents of a text file in the terminal window itself. This adds to the notion of text files being human readable, compared to for instance audio files, which cannot be displayed as human readable content in a terminal or in any other program!
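The coding of text into numbers described above can be made visible with a few lines of Java. The sketch below (the names TextCodes and codes are hypothetical) converts a text into the numbers that a plain text file using the ASCII table would contain.

```java
import java.nio.charset.StandardCharsets;

class TextCodes {
    // Returns the ASCII code of every character in the text, separated by spaces.
    static String codes(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        StringBuilder result = new StringBuilder();
        for (byte b : bytes) {
            if (result.length() > 0) {
                result.append(' ');
            }
            result.append(b);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(codes("Hello")); // prints 72 101 108 108 111
    }
}
```

A text editor does the reverse: it reads the numbers from the file and shows the corresponding characters on the screen.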

Q: “If text files after all are not that different from binary files in general, what’s the fuss about them, then?”
A: Text files are important in the context of learning and practicing programming, because programs are entered as plain text (without any styling, sizes or fonts) using text editors. Understanding that a program is just a text file is also important for understanding the process of compiling programs. Compiling a program means using another program (called a compiler) to read the text file with the program you have written, check what was written for correctness of syntax and other rules, and lastly transform it into a binary file which can be run (executed) on the computer. All this also helps us understand that in order to write a program, we need to use some kind of text editor capable of handling plain text!

Q: “OK, so all program source code files are written as plain text. But is it always necessary to compile programs into a binary format, before we can execute the programs?”
A: Great that you ask that! No, there are actually programming languages where the program is run “as-is” in the plain text format, using some other program to interpret the program code typically line-by-line. Such programming languages are said to be “interpreted”.

Q: “I think you said in the lecture video that Java programs are compiled before they can run. So Java programs are not interpreted, then?”
A: That’s a tricky question, but we’ll try to answer it the best we can! It is true that you need to compile a Java program before you can run it. The thing is, however, that it is not the operating system that is running the resulting binary program directly. It is a program called “Java Virtual Machine” that is responsible for executing the binary file from the compilation. The virtual machine is actually interpreting the binary files, but that is not something that is important to know in order to learn to program in Java in this introductory course! But since you asked, we felt obliged to answer!

Q: “It seems, from looking at the lectures, that most operating systems use something called ‘file system’ for organizing the files on a hard drive. Can’t files simply be stored on the hard drive in sequence or some other order?”
A: Of course, the files could be organized in this way or any other way actually. But, if files were stored simply in sequence, couldn’t that actually also be called a system? We could call that too a ‘file system’. The thing about file systems is that they allow us to make abstractions over how files are arranged on the hard drive! We don’t need to care where on the physical thing the data is stored. We don’t need to care about how to make use of the free space that opens up on the physical thing when we delete a large file somewhere. What file systems allow us to do is to focus on a logical organization of files instead, where we can have names of files and locations where they are stored using the abstraction of directories. Directories are arranged in a file system in a hierarchical manner so that we can get relatively short paths describing the location of a particular file. The directories are often given descriptive names so that the hierarchical path makes more sense to us humans. For instance, we could have the following paths (using / as the directory separator):

/home/adam/music/classical/Bach/BrandenburgConcertos.ogg
/home/adam/music/rock/IronMaiden/RunToTheHills.ogg
/home/ana/programming/java/MyFirstProgram.java

The paths contain the order of directories from the root directory and the directories have descriptive names, helping us to understand and remember where files are located and what type of files to expect next to some file in the same directory.
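A program can pick such a path apart into its directories and file name. The Java sketch below uses one of the example paths above. The class name PathDemo and its methods are our own names for this illustration, and the example assumes a system that uses / as the directory separator.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathDemo {
    // Returns the last part of the path, i.e. the name of the file itself.
    static String fileNameOf(String path) {
        return Paths.get(path).getFileName().toString();
    }

    // Returns how many names (directories plus the file) the path consists of.
    static int depthOf(String path) {
        return Paths.get(path).getNameCount();
    }

    public static void main(String[] args) {
        String path = "/home/ana/programming/java/MyFirstProgram.java";
        Path parent = Paths.get(path).getParent();
        System.out.println("File name: " + fileNameOf(path));
        System.out.println("Directory: " + parent);
        System.out.println("Number of names in the path: " + depthOf(path));
    }
}
```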

Q: “Why do we need operating systems? What do they do for us?”
A: Operating systems are also about abstractions. Humans like abstractions because they help us think about and relate to the world in a way that is much less complicated than it really is. Even if a computer isn’t really a complicated machine, it is a very technical one. We like to be able to use a computer (and even write new programs for it to change or add to its capabilities) without having to be electrical engineers. For instance, we like to think about files as the representation of something (like an image) and that we can simply “save” the image, and later find and look at it. We don’t like to think about all the wiring and electrical signalling that takes place in such a simple operation as opening or saving an image. The same applies when we are programming. We might write a program capable of displaying and saving images. As programmers, we are like most other humans: we don’t like to focus on technical details. As programmers too, we prefer to use abstractions that allow us to write code that “saves” an “image” in some “folder”. Someone else has (thankfully) written sets of really technical and complicated software which offer us these high-level abstractions for operations such as saving files. The sets of such complicated software are called operating systems. Without an operating system, using a computer would be a very complicated and technical business. Having said this, parts of the operating systems may be quite complicated and technical too. As programmers to be, we would be greatly rewarded if we took the time to also learn a lot about the operating system we are using, so that we can better tune and use it to our benefit and convenience!

Links

Further reading

Where to go next
