Difference between revisions of "ITIC:Digital representation  Binary"
m (Rikard moved page Computing:Binary representation to ITIC:Digital representation  Binary: Scrapped the Computing book, created Introdution to IT book) 
(→Navigation) 

(One intermediate revision by the same user not shown)  
Line 486:  Line 486:  
* TODO  * TODO  
=Navigation=  =Navigation=  
−  {{Nav linksprev=  +  Next page contains [[ITIC:Digital_representation__Binary__Exercisesexercises on binary representation]]. 
+  
+  {{Nav linksprev=ITIC:Computers_and_hardwareTOC=Introduction_to_IT_and_computingnext=ITIC:Digital_representation__Binary__Exercises}} 
Revision as of 10:56, 24 June 2019
Contents
Digital representation
Binary numbers
bits and Bytes
A computer stores data in "digital" format, meaning that it stores information as numerical representation, using what is called "binary" numbers.
A Binary number is a number which can be represented using only two symbols (normally 0 and 1). One such binary digit (a 1 or a 0) is called a "bit" (which is actually short for BInarydigiT).
Using only 0 and 1, we can represent two values in one such bit. The decimal numbers we can represent using only one bit are also 0_{10} and 1_{10}.
If we use two bits, we can represent four different decimal numbers:
 0_{10}
00_{2}
 1_{10}
01_{2}
 2_{10}
10_{2}
 3_{10}
11_{2}
What if we use three bits? How many combinations can we get? Each combination will represent one decimal number. It turns out that we get eight combinations from three bits, and 16 combinations from four bits. The pattern comes from powers of 2:
 (2^{0} = 1) here for completion  we don't use 0 bits to store information ;)
 2^{1} = 2 using one bit
 2^{2} = 4 using two bits
 2^{3} = 8 using three bits
 2^{4} = 16 using four bits
 etc etc
Of course, the value doubles every time the exponent increases by one (because that's what happens when you multiply a number by 2 ;) ).
The powers of 2 are the number of decimal numbers we can represent. If we start at 0 and only represent positive numbers, then the decimal numbers we can represent will be 0 through the exponent minus 1:
 2^{1} = 2  the numbers 0  1 since (21=1)
 2^{2} = 4  the numbers 0  3 since (41=3)
 2^{3} = 8  the numbers 0  7 since (81=7)
 2^{4} = 16  the numbers 0  15 since (161=15)
We take the power minus one, since one combination will be the zero, the next one etc.
A common term used for data in binary format is "byte". A byte normally means eight bits. How many combinations can you get using eight bits? 256! Since 2^{8} is 256. So, for positive numbers, we can represent 0  255 using a byte.
Negative numbers
What about negative numbers?
A common way to represent negative numbers is to use a "sign bit", for instance the leftmost bit (when we write the binary number down). This means that we have one bit less to represent the value (since we are wasting one bit for the sign). Typically, the sign bit is implemented such that a one means "negative" and a zero means "positive".
The following could be one way of representing the number minus one using eight bits: 10000001
, in binary. But then the problem becomes deciding how to represent the value zero. Should we have two representations for zero? Minus zero: 10000000
and positive zero: 00000000
?
Back in the day, a technique called one's complement was used for representing negative numbers. A leftmost 1 still signifies "negative" and a leftmost 0 "positive". To represent a negative number, one takes its positive representation and then flips all bits (all 0s become 1 and all 1s become 0). This way too renders two zeros, negative zero (in eight bits): 1111 1111
and positive zero 0000 0000
. Counting with one's complement numbers is also a little tricky when it comes to subtraction. See one's complement for details.
A common way to "solve" these "problems" is to use an algorithm (yeah, we love to use that word, it gives us 5 imaginary academic points every time we write or say "algorithm") called "two's complement". In this way of representing numbers, there is no representation of "minus zero" because in this way of representing numbers in binary, we only need "one" single representation of zero. In fact, trying to represent minus zero using this technique, renders the same pattern as for positive zero.
The "trick" (or "algorithm" if you must) used in two's complement, is to take these steps:
 For a positive number, represent it with a leading zero for the sign bit, and use the remaining bits to represent the number in binary format
 For a negative number, represent it with a leading one for the sign bit, then represent the number as the binary number with all bits flipped (all ones become zeros and vice versa), and then add one to the resulting number.
 Or: represent the number as normal, flip all bits (including the sign bit), add one to the result.
Some examples:
 the number +3_{10} in eight bit binary:
00000011_{2}
To evaluate that binary number using the two's complement, we only have to look at the sign bit (leftmost) and see that it is a positive number, so the seven "value bits" are then to be evaluated as normal:
 the rightmost bit is a one, so it represents the value 1 (1*2^{0} = 1)
 the next (going left) bit is also one, so it represents the value 2 (1*2^{1} = 2)
 1 + 2 = 3
 the value
00000011_{2}
is then 3_{10}
To represent the value 3_{10} in eight bit binary (using two's complement), we'd do the following:
 calculate the binary number for 3 in eight bits, which is
0000 0011_{2}
 flip all the bits, which gives us
1111 1100
 add one, which gives us:
1111 1100 +0000 0001  1111 1101
Conclusion: 3_{10} is 1111 1101_{2}
using two's complement.
To go the other way, from 1111 1101_{2}
to decimal, we'd look at the sign bit and see that it is a negative number.
It's a negative number, so we have to calculate its "two's complement" representation.
a) Flip all the bits 1111 1101 becomes: 0000 0010
Yeah. Then, according to academic scholars (ask your local informatics doctor for advice), we have to add one to this new binary number:
b) add one: 0000 0010 +0000 0001  0000 0011 (3 in decimal)
The number we now have in binary, should be interpreted "normally", and then changed to its inverse number (itself times 1):
Change the sign of 3 to 3.
Answer: 11111101 in binary is 3 in decimal.
Another example:
What decimal number is the binary number, in eight bits, 0000 1000 (using two's complement)?
The number starts with 0, which means positive, so we don't need to take the number's two's complement.
The answer is: The number is 8 in decimal.
One more example: What decimal number is the binary number, in eight bits, 1000 1000 (using two's complement)?
The number starts with a 1, so it's negative and we have to take its two's complement:
a) Flip all bits: 0111 0111 b) add one 111 (carry) 0111 0111 +0000 0001  0111 1000 64 + 32 + 16 + 8 = 120 Change the sign, so the answer is: 1000 1000 represented using two's complement is 120 in decimal (using eight bits).
Finally, lets see how the two's complement work with the value negative zero. We'll take zero and flip the bits and add one:
1111 1111 +0000 0001  1 0000 0000 In eight bits, that's 0000 0000
(the leftmost 1 in the result is truncated off, since we are only looking at eight bits)
This means we end up with the same bit pattern as we started with, meaning we have only one representation of zero.
Adding and subtracting
How does addition and subtraction work in binary? The same as in decimal (or any base!). Let's look at some examples using four bits for brevity:
1 < carry since 1+1 = 10 in binary 0001 +0001  0010 In decimal: 1 +1  2 Let's subtract: 1 < borrow 1 from the digit to the left 0010 0001  0001 In decimal: 2 1  1 One more example: ***1 (borrow) 1000 (8) 0001 (1)  0111 (7)
Working from right to left, 0  1, we have to borrow. But the next bit is also 0, so we have to borrow. And again, until we borrow from the leftmost 1 in 1000.
We borrow 10 which is 2 in decimal. When we borrow 10 from the leftmost 1 in 1000, we get 10 to the right of it and so on.
Substraction in decimal works the very same:
Decimal 1000  1: ***1 (borrow 10s) 1000  1  0999
Only in decimal, we borrow 10 which is 10 in base 10. In binary, we also borrow 10 (which is 2 in decimal).
In decimal, when we borrow 10, and subtract 1, we get 9.
In binary, when we borrow 10, and subtract 1, we get 1 (2  1 is 1)
Put differently, in binary:
1  0 = 1 1  1 = 0 0  1 = 1 (borrow 10 from left) 10  1 = 1
What numbers can we represent?
Now that we know how the two's complement work (and that it was "invented" to allow us to get only one representation of zero), and that signed numbers use one bit for the sign (1 means negative, 0 means positive), we also understand that we can represent fewer values when we have signed values.
If we take a type for a signed byte of eight bits, like the Java primitive byte
type, we will thus only be able to represent the following values: 128_{10} through +127_{10} (256 values including the 0). This is because we only have seven bits for the value and one bit for the sign. Again, 2^8 is 256, so that's the number of combinations of ones and zeros, and thus the number of representations for decimal numbers. The span of the0 negative numbers are 128 through 1, then we have zero, and then the positive numbers go from 1 through 127. This is a pattern, an eight bit type (eight bits of memory) can only produce 127 different positive numbers and zero since 2^7 is 128. All zeros would represent 128.
So, in two's complement, the values are the from minus 2^(number of bits 1) to 2^(number of bits 1) 1:
 8 bits: 2^{7}  (+2^{7})  1 (128  +127)
 16 bits: 2^{15}  (+2^{15}) 1 (32,768  32,767)
 32 bits: 2^{31}  (+2^{31}) 1
 etc
Here's how you would arrive at the eight bit two's complement for 127:
In decimal, 127 is 64 + 32 + 16 + 8 + 4 + 2 + 1, so the bits to set are for the binary number is: 111 1111   `2^0, which is 1     `2^1, which is 2     `2^2, which is 4    `2^3, which is 8   `2^4, which is 16   `2^5, which is 32  `2^6, which is 64 Padded with one zero to get eight bits we get: 0111 1111 Flip the bits: 1000 0000 Add one: 1000 0000 +0000 0001  1000 0001 Answer: the two's complement representation for 127 is 1000 0001
Again, to go the other way around, if we have the eight bit two's representation 1000 0001
and want to know what decimal number that represents, we'd do:
We have: 1000 0001 1. Flip the bits: 0111 1110 2. Add one: 0111 1110 +0000 0001  0111 1111 3.Calculate the value of the result (as if it were a positive number): 111 1111   `2^0, which is 1     `2^1, which is 2     `2^2, which is 4    `2^3, which is 8   `2^4, which is 16   `2^5, which is 32  `2^6, which is 64 so, 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 (decimal) 4. Take the inverse of that value: 127 * 1 = 127 Answer: The two's complement representation 1000 0001 represents the decimal number 127.
We can check the maximum positive value possible to represent in two's complement representation in eight bits, 0111 1111, and calculate that it is 127 (see above).
To calculate the minimum value, we know that it starts with a 1 (sign bit for negative). Since 2^{7} is 128, this would be 1000 0000
using unsigned binary. Let's try to represent 128 in two's complement to see if it is possible.
1. Take the absolute number of 128 and represent it in binary: 1000 0000 2. Flip the bits: 0111 1111 3. Add one: 1111 111 (carry) 0111 1111 +0000 0001  1000 0000 Answer: The two's complement representation for 128 is 1000 0000 Comment: It is a rather weird number, since its two's complement is the very same number.
Would it be possible to represent an even smaller number than 128 in eight bits using two's complement representation? No, since we have only eight bits, the maximum number of representations is 256 (2^{8}). We saw that we can represent 127 positive numbers and zero. That's 128 representations (bit combinations). 256  128 = 128. We saw that we could represent 128 negative numbers. So there can't be a bit pattern for 129.
Padding
For positive values, the number is padded with zeros from left to the first one. For negative values, the number is padded with ones from left to the first zero.
Example: eight bit 3_{10} is:
0000 0011
Same number using 16 bits:
0000 0000 0000 0011
Eight bit 3_{10} is:
1111 1101
Same number in 16 bits:
1111 1111 1111 1101
Text
Text is stored in a computer as any data is stored  using bits representing binary numbers. But, you may rightfully wonder, can a number represent a piece of text? There are standard character tables (the most common of which is the ASCII table), which map numbers to characters.
So when a computer stores a string of characters, it stores the numbers of the characters from some table as binary numbers (ones and zeros as above).
Let's look at an example of storing ABC using the ASCII table. Here's the relevant part of the ASCII table for the letters A, B and C:

A
65 
B
66 
C
67
In binary, A is coded as 65 in binary, which is 100 0001. B is 66 which is 100 0010, and C is 67 which is 100 0011. So when the computer stores ABC
in e.g. a file, using the ASCII table, the computer will store the following bits using 8 bit bytes: 0100 0001 0100 0010 0100 0011
. Note that we group the bits in chunks of four for increased readability here (the computer stores them consecutively). This is so common, that one actually uses special notation for a chunk of four bits, using base 16 (a.k.a. hexadecimal). Since in base 16, unsigned numbers go from 0 to 15, we need notation to express those values. In hexadecimal, values go from 0 to F:
 0
 1
 2
 3
 4
 5
 6
 7
 8
 9
 A (10_{10})
 B (11_{10})
 C (12_{10})
 D (13_{10})
 E (14_{10})
 F (15_{10})
Since we need four bits to represent decimal values 0  15, grouping bits in fours and translating to hexadecimal is very convenient (and common).
Here's another example of encoding text, the text XYZ
will be encoded like this:
 X binary: 0101 1000 decimal: 88 hex: 58
 Y binary: 0101 1001 decimal: 89 hex: 59
 Z binary: 0101 1010 decimal: 90 hex: 5A
So the data stored will be:
0101 1000 0101 1001 0101 1010 which can be written for us humans as hex: 58 59 5a the mapping is: 0101 1000 0101 1001 0101 1010 5 8 5 9 5 A
What about text with lines, tabs, spaces and sections?
Those characters are also present in the ASCII table. A line of text is simply text ending with a newline character. The newline character has decimal value 10 in the ASCII table. Two newline characters in a row will make the text look like it is divided in sections. A tab is simply a number in the ASCII table, 9. Space has number 32 (decimal) in the ASCII table.
Note that we can only represent structured text using the ASCII table. There is no formatting such as fonts, font faces (bold, italics, underline etc). Data which contains only structured text from the ASCII table is called "plain text". In order to represent style and fonts, we need a different encoding (like a file format for Libre Office Write, or Microsoft Word). Styled documents are NOT called text documents or plain text. They are called something else, like word processor documents etc.
Overflow and underflow
If the largest number we can store in eight bits, using two's complement, is 127, what happens if we try to add one to that number?
1111 111 < carry 0111 1111 +0000 0001  1000 0000
In binary, 1+1 = 10. So we get one in carry. Eventually, the sign bit gets changed.
This means that if we have an eight bit signed type like byte
in Java, adding 1 to 127 using this type, will produce 128. This might seem surprising, but if we think of it, how would you represent +128 if you had as many bits you needed?
1000 0000 = 2^7 2^7 = 128
But using two's complement, the leftmost bit isn't worth 2^{7}, since it is the sign bit. So, if we are using eight bits and two's complement, we'd interpret the same bit pattern differently:
1. Flip the bits: 0111 1111 2. add one: 1111 111 (carry) 0111 1111 + 0000 0001  1000 0000 (128) So the number is 128 in eight bit binary (using two's complement).
So, the same bits represent 128 and 128. If we have 16 bits, 128 would be:
0000 0000 1000 0000
If we only have eight bits, we can't represent 128 (if we represent it using two's complement), so the same bit pattern will represent 128:
1000 0000
From this we hopefully learn that how to interpret a given bit pattern depends on what method is used for representation. 1000 0000
represents 128 if we are using two's complement using eight bits. If we are using the binary number system mathematically for unsigned numbers, 1000 0000
instead represents +128.
A related question is, what happens if we subtract 1 from 128 using eight bits (also using two's complement)?
**** *** (borrow) 1000 0000 (128) 0000 0001 (1)  0111 1111 (127)
Now the sign changed to positive, and the value is 127.
Here's a small Java program showing what happens:
public class Bin {
public static void main(String[] args) {
String padding = "000000000000000000000000";
byte b = 125;
System.out.println(b + " + 1 = " + ++b);
System.out.println(b + " + 1 = " + ++b);
System.out.println(b + " + 1 = " + ++b);
System.out.println(b + " + 1 = " + ++b);
System.out.println(b + "  1 = " + b);
System.out.println(b + "  1 = " + b);
System.out.println(b + "  1 = " + b);
System.out.println();
System.out.println("32 bits for 128: " + Integer.toBinaryString(128));
System.out.println("32 bits for +128: " + padding + Integer.toBinaryString(128));
System.out.println("Using 8 bits:"); // int is actually 32 bits, but we cheat and take the last eight chars
System.out.println("128: " + Integer.toBinaryString(128).substring(24));
System.out.println("+128: " + Integer.toBinaryString(128)); // the 24 leading zeros are not shown
}
/*
$ javac Bin.java && java Bin
125 + 1 = 126
126 + 1 = 127
127 + 1 = 128
128 + 1 = 127
127  1 = 128
128  1 = 127
127  1 = 126
32 bits for 128: 11111111111111111111111110000000
32 bits for +128: 00000000000000000000000010000000
Using 8 bits:
128: 10000000
+128: 10000000
*/
Links
Further reading
 WolframAlpha  calculator
 Oracle tutorial  Java datatypes
 Wikipedia  Two's compliment
 Wikipedia  Binary number
 Signed Binary/Decimal Conisadfsdfnversion Using the Two's Complement Representation (University of British Columbia, Canada)
  Exploring Binary  Decimal/Two’s Complement Converter
Lecture presentation and videos
 Binary Representation (Full playlist)  Binary Representation 1/8  2/8  3/8  4/8  5/8  6/8  7/8  8/8  Presentation: Computing  Binary representationtwo s complement.pdf
 Binary Representation  Text (Full playlist)  Binary Representation  Text  1/5  2/5  3/5  4/5  5/5  Presentation: Computing  Binary representation  Text.pdf
Source code
 TODO
Next page contains exercises on binary representation.
« Previous • Book TOC • Next »