- Richard Feynman -
Difference between revisions of "Database:Assignment 2 - JOINs and Constraints"
m (→Some tips for those who want to learn some more tricks: spelling) |
(→Task 2 - Create the tables colors and makes: Coding style changes) |
||
Line 204: | Line 204: | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
<source lang="SQL"> | <source lang="SQL"> | ||
− | CREATE TABLE colors(color | + | CREATE TABLE colors(color TEXT, color_id INTEGER PRIMARY KEY); |
− | CREATE TABLE makes(make | + | CREATE TABLE makes(make TEXT, make_id INTEGER PRIMARY KEY); |
</source> | </source> | ||
</div></div> | </div></div> | ||
Line 221: | Line 221: | ||
Log into the sqlite3 interactive shell for the database <code>many_cars</code> and verify that you have the colors and makes inserted in your tables: | Log into the sqlite3 interactive shell for the database <code>many_cars</code> and verify that you have the colors and makes inserted in your tables: | ||
<source lang="SQL"> | <source lang="SQL"> | ||
− | sqlite> | + | sqlite> SELECT * FROM colors; |
Blue|1 | Blue|1 | ||
Red|2 | Red|2 | ||
Line 230: | Line 230: | ||
Brown|7 | Brown|7 | ||
Silver|8 | Silver|8 | ||
− | sqlite> | + | sqlite> SELECT * FROM makes; |
Volvo|1 | Volvo|1 | ||
Mazda|2 | Mazda|2 |
Revision as of 11:02, 15 November 2018
Contents
- 1 Assignment 2 - JOINs and constraints
- 1.1 Goal of the assignment
- 1.2 Files to be handed in to the teacher
- 1.3 The Database and a short introduction
- 1.4 Getting started with the assignment
- 1.5 Task 1 - Create the table cars with a constraint for LicenseNumber
- 1.6 Task 2 - Create the tables colors and makes
- 1.7 Task 3 - Create the final version of the cars table
- 1.8 Task 4 - Populate the cars table with cars
- 1.9 Task 5 - Make queries about cars
- 1.10 Check list
- 2 Links
Assignment 2 - JOINs and constraints
This chapter makes most sense if you are using this book as the course literature in a database course. If you are using it for self-studies, you obviously must ignore the passages mentioning handing stuff in to the "teachers".
Goal of the assignment
The goal of this assignment is to verify that you are confident with the concepts from the chapters about JOIN and constraints. It also contains parts about using bash to work with your database from the command line. As with assignment 1, you will get information and knowledge from the text for the various tasks so you will hopefully learn some new stuff from this assignment too.
Files to be handed in to the teacher
You shall create a zip file with the exact name answer-handin02.zip
. This is how you create it in bash:
- Create a directory named exactly
answer-handin02
- The directory must contain the following files (created throughout the assignment):
-
name.txt
(shall contain your complete name - you create this file manually) -
license_number_constraint.txt
(text file with the create talbe statement with constraints for validLicenseNumber
) -
test_cases_license_number.txt
(text file with yourINSERT
statements to verify that illegalLicenseNumber
genereratesError: CHECK constraint failed
) -
silver_cars.txt
(see below) -
dodges.txt
(see below) -
red_mazdas.txt
(see below) -
select_statements.txt
(see below)
-
Make sure you have the zip
command installed in your bash environment. In order to zip the directory with all files (of course, this is something you do when you are done with all steps in the assignment!) in bash, you do this from the directory containing answer-handin02
. See assignment 1 if you forgot the syntax for zipping the directory and all its contents. The file to hand in must be called answer-handin02.zip
.
Hint: You cannot "be" in the directory you are zipping. You should be in the same directory where you created the answer-handin02
directory. When you are finished with a task, put the files you create in the task (from the list above) in this directory. Make sure every file is there (and has correct name) before you zip and hand in the zip file.
The Database and a short introduction
We will continue to work with the database many_cars
and the table cars
from the first assignment. The first thing we will do is to add a constraint to the column LicenseNumber
so that we only allow LicenseNumber
texts which are valid, i.e. three capital letters followed by one space followed by three numbers. A formal way to express that could be
[A-Z][A-Z][A-Z] [0-9][0-9][0-9]
.
The next step will be to create two additional tables, colors
and make
in order to "normalize" the cars
table. Remember, normalizing a table was what we did when we moved columns from one self-contained table to tables of their own, in order to avoid repetition of values in many rows in the first table. When we are finished with the re-design of our database, the cars
table will only contain the following columns: make_id
, color_id
and LicenseNumber
. We will also take measures to keep the integrity of the tables (make sure the tables contain sane data).
Getting started with the assignment
Create a directory named handin2
. This is where you will do all the work for this assignment. You will not submit this directory, it is only your project directory where you do the work.
Task 1 - Create the table cars with a constraint for LicenseNumber
We will start from scratch with a new database called many_cars
, initially without any tables. Create a table called cars2
with only one column, LicenseNumber TEXT PRIMARY KEY
, but we will also add a constraint for the license numbers. This table is only for testing and learning how to create this constraint! When we see that the constraint works as specified, we'll throw away this table and start over.
When you have set up a working test table cars2
, you shall come up with test cases for your constraint (INSERT statements showing that the constraint works). This task will produce two text files to put in your answer-handin02
directory:
-
license_number_constraint.txt
(where you shall paste the CREATE TABLE statement with the constraint) -
test_cases_license_number.txt
(where you should paste at least three INSERT statements with malformed license numbers)
The point of the files is to let the teachers verify that the CREATE TABLE creates a table with a working constraint for license numbers and that the SQLite program rejects the INSERT statements in your test cases. The teachers will, of course, also verify that the table accepts well-formed license numbers! You don't have to provide any working INSERT statements, however, the teachers will handle that part.
Now, lets get to work! In order to be able to create a constraint for the license numbers, we'll have to learn the function called GLOB
(below shown as an "infix operator" - an operator which sits between two operands).
One way of understanding the concept of "globbing" is to use it in Bash! The examples below are just to get you familiar with globbing and you don't have to submit any bash code. Make a new directory for this part of the assignment so that you can practice globbing without interfering with your assignmnet code. You will now use some bash in order to learn how *
works and a couple of useful operators.
The operators []
, *
and {}
We will create nine files in your test directory. The files will be empty but created with one single command:
$ touch file_{1..9}.txt # {} is specific to bash and actually not part of globbing!
$ ls
file_1.txt file_3.txt file_5.txt file_7.txt file_9.txt
file_2.txt file_4.txt file_6.txt file_8.txt
Now, lets learn about the globbing symbol *
. A single *
means "any number of any characters". The expression [0-9]
means "one character which is a number between 0 and 9".
Let's use a combination of *
and [0-9]
:
$ ls *_[0-9].txt
#all files that have *_ followed by one digit followed by .txt:
file_1.txt file_3.txt file_5.txt file_7.txt file_9.txt
file_2.txt file_4.txt file_6.txt file_8.txt
Next, create a new empty file:
$ touch file_99.txt
Let's list all the files which have only one number character after the underscore:
$ ls *_[0-9].txt #the file_99.txt file will not be listed!
file_1.txt file_3.txt file_5.txt file_7.txt file_9.txt
file_2.txt file_4.txt file_6.txt file_8.txt
As you see, the new file file_99.txt
didn't match the globbing expression *_[0-9].txt
, since it had more than one number between the underscore and the .txt
part.
Now, let's use a globbing expression which will match file_99.txt
but not the other files:
$ ls *_[0-9][0-9].txt
file_99.txt
# The file was listed, because it, and only it,
# had _ followed by two digits followed by .txt
# When we are done playing, remove the newly created text files.
The function GLOB
in SQLite3 works with the same syntax for the expressions to match. The function returns 1
if the string operator matches the pattern, and 0
otherwise. In SQLite3 you may think of 1
as meaning TRUE and 0
as meaning FALSE.
Some examples in the interactive shell for SQLite3 (you must try this in order to learn about GLOB
!):
sqlite> SELECT 'ABCabc012' GLOB '[A-Z]*';
1
-- any upper case letter followed by anything
sqlite> SELECT 'ABCabc012' GLOB '[A-Z][A-Z][A-Z]*';
1
-- any three upper case letters followed by anything
sqlite> SELECT 'ABCabc012' GLOB '[A-Z][A-Z][A-Z][A-Z]*';
0
-- No match, because string didn’t start with four upper case letters
sqlite> SELECT 'ABCabc012' GLOB '[A-Z][A-Z][A-Z]*[0-9]';
1
-- Matches three A-Z followed by anything followed by a number
sqlite> SELECT 'ABCabc012' GLOB '[A-Z][A-Z][A-Z][a-z][a-z][a-z][0-9][0-9][0-9]';
1
-- Matches the form of the string exactly
sqlite> SELECT 'A 0' GLOB '[A-Z] [0-9]';
1
-- Matches one upper case letter, a space, and one number.
Now you should be familiar with how to use GLOB
as an infix operator to match "one or several capital letters", "a space", and, "one or several numbers". So it should be fairly simple now to create a slightly more complex pattern for the constraint for valid license numbers. Your task now is to create the test table cars2
which uses a GLOB
expression for valid Swedish license numbers: "three capital letters between A and Z" followed by "a single space" followed by "three digits(numbers) between 0 and 9".
We'll give you the CREATE TABLE
syntax but you'll have to figure out the GLOB
part:
CREATE TABLE cars2(
LicenseNumber TEXT PRIMARY KEY
CHECK( LicenseNumber GLOB 'enter the glob pattern here!' )
);
If it's easier to read on one long line, here it is (but your browser will probably break the line anyway):
CREATE TABLE cars2( LicenseNumber TEXT PRIMARY KEY CHECK( LicenseNumber GLOB 'enter the glob pattern here!' ) );
It is that statement you shall save in the file license_number_constraint.txt
to be included in the zip file.
Verify that the following INSERT statements fail:
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('abc 123');
Error: CHECK constraint failed: cars2
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('ABC123');
Error: CHECK constraint failed: cars2
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('ABC 12A');
Error: CHECK constraint failed: cars2
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('ABC 1234');
Error: CHECK constraint failed: cars2
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('ABCD 123');
Error: CHECK constraint failed: cars2
Now, you shall come up with at least three test cases of your own which will look like spelling mistakes for license numbers. Verify that they render the error message as shown above before you add them to the file test_cases_license_number.txt
also to be included in the zip file.
Without creating any files, make sure that the table works for correctly formatted license numbers, e.g.:
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('ABC 123');
sqlite> INSERT INTO cars2 (LicenseNumber) VALUES('ZZZ 999');
You don't have to include such tests. They are mentioned here to give you a method to verify that your constraint isn't too narrow and accepts the format you are allowing!
Some tips for those who want to learn some more tricks
It is rather tiresome to write a lot of test cases manually in the interactive shell. What we will learn instead, is to create a text file with the name bad_licenses.txt
with one malformed license number on each row. Then we'll test that every license number in the file is rejected thanks to our constraint, by creating INSERT
statements from the file. This is just to show you some tesing techniques, so do not include this file in the zip file to be handed in! Here's the syntax for testing every license number from the file (in bash):
$ cat bad_licenses.txt|while read license;do echo "INSERT INTO cars2 VALUES('"$license"');";done | sqlite3 many_cars
Error: near line 1: CHECK constraint failed: cars2
Error: near line 2: CHECK constraint failed: cars2
Error: near line 3: CHECK constraint failed: cars2
Error: near line 4: CHECK constraint failed: cars2
Error: near line 5: CHECK constraint failed: cars2
Error: near line 6: CHECK constraint failed: cars2
Error: near line 7: CHECK constraint failed: cars2
Error: near line 8: CHECK constraint failed: cars2
Now, we might be interested to check how many errors we got. In order to do so, we'll redirect the standard error stream from SQLite to standard out using 2>&1
and pipe the result to wc -l
(the expert in counting rows!):
$ cat bad_licenses.txt|while read license;do echo "INSERT INTO cars2 VALUES('""$license""');";done | sqlite3 many_cars 2>&1 | wc -l
8
$ wc -l bad_licenses.txt
8 bad_licenses.txt
We confirmed that we got 8 errors from 8 rows in the file.
Explanation of the command line used above:
cat bad_licenses.txt # print the contents of the file to standard out
| while read license; # use the printout to read each line into the variable license
do # loop:
echo "INSERT INTO cars2 VALUES('""$license""');";
# print the INSERT statement and use the value of the variable license inside the single quotes
done # end of the while-loop
| sqlite3 my_cars 2>&1 # send the result of the loop to
# SQLite3 for the database my_cars
# but let error printouts from SQLite3 go to standard out
# rather than standard error
| wc -l # pipe any error printouts to wc -l
# which will echo out the number or rows of errors
When you are done experimenting with this, make sure you have created the two text files for this part, put them in the directory to be zipped, and drop the table cars2. Use a search engine if you forgot the syntax for dropping a table.
Task 2 - Create the tables colors
and makes
Note that in task 2, no files should be produced for inclusion in the zip file - you will merely create two tables and read data into them.
In this part, we'll prepare two tables with data, in order to be able to create a more normalized cars
table than the one in assignment 1. We'll do this by changing the table cars
to only contain these columns: make_id
, color_id
and LicenseNumber
.
We move the colors and makes into tables of their own right, so that we don't have to repeat a lot of text strings for color and make in the cars
table. Remember, normalizing was about consolidating data by moving some columns out to tables of their own and reference the tables instead.
So we need two new tables, colors
and makes
.
Create the table colors
with the following columns:
-
color TEXT
-
color_id INTEGER PRIMARY KEY
Create the table makes
with the following columns:
-
make TEXT
-
make_id INTEGER PRIMARY KEY
Expand using link to the right to get a sneak-peek of the create table syntax.
CREATE TABLE colors(color TEXT, color_id INTEGER PRIMARY KEY);
CREATE TABLE makes(make TEXT, make_id INTEGER PRIMARY KEY);
Download the files insert_colors.sql and insert_makes.sql and let SQLite3 run the statements in those files for the database many_cars
.
Expand using link to the right to get a sneak-peek of the syntax.
$ sqlite3 many_cars < insert_colors.sql
$ sqlite3 many_cars < insert_makes.sql
Those statements will only work correctly if you have been successful in creating the tables as detailed above.
Log into the sqlite3 interactive shell for the database many_cars
and verify that you have the colors and makes inserted in your tables:
sqlite> SELECT * FROM colors;
Blue|1
Red|2
Green|3
Yellow|4
Black|5
White|6
Brown|7
Silver|8
sqlite> SELECT * FROM makes;
Volvo|1
Mazda|2
Honda|3
Suzuki|4
Ford|5
Chevrolet|6
Saab|7
Simca|8
Dodge|9
If that is what you see, then this task is done. There is nothing to add to the zip file from this part.
Task 3 - Create the final version of the cars
table
In this task, we'll create the cars
table which you will use in task 4 below. Note that there are no files produced here in task 3 for inclusion in the zip file, but you must finish this task in order for the following tasks to work.
You should now create the cars
table with two foriegn keys, color_id
and make_id
and the column LicenseNumber
with the constaint you created in a previous task. Here's the nearly complete CREATE TABLE
statement, but or course, we'll hide the constraint from you, since that is part of a previous task!
sqlite> CREATE TABLE cars(color_id INTEGER, make_id INTEGER,
LicenseNumber text primary key check(
LicenseNumber GLOB 'your GLOB pattern here!'),
FOREIGN KEY(color_id) REFERENCES colors(color_id),
FOREIGN KEY(make_id) REFERENCES makes(make_id));
Again, note that you must put the globbing pattern you created earlier between the single quotes of the GLOB
expression above.
Task 4 - Populate the cars
table with cars
Now that you have the tables colors
, makes
and cars
, it's time for you to populate the cars
table with rows of some cars, and see how you can make SELECT
statements to get information about the cars, by joining the three tables.
You will want SQLite3 to enforce the foreign keys, so that you make sure that you don't enter rows referencing bad IDs in colors
or makes
. Do you remember the directive to SQLite3 for enabling foreign key checks?
Expand using link to the right to see the directive.
sqlite> PRAGMA FOREIGN_KEYS=ON;
Let's make sure that it works by trying to insert a car with a color id of 9 (which is illegal since there only exist 8 colors):
sqlite> insert into cars values(9,1,'ABC 123');
Error: FOREIGN KEY constraint failed
Do the same, but this time check that the constraint for make_id
works. Use 99
as the value for make_id
.
Expand using link to the right to see the directive.
sqlite> insert into cars values(1,99,'ABC 123');
Error: FOREIGN KEY constraint failed
Since there is no make_id
99
in the makes
table, it should fail.
Now that you verified that the foreign key constraints work, it's finally time for some valid cars. Download insert_normalised_cars.sql and execute the statements in that file for the database many_cars
.
Expand using link to the right to see how to create the directory and enter it.
$ sqlite3 many_cars < insert_normalised_cars.sql
Again, there are no files produced in this task to be included in the zip file. This task was only here for you to set up the tables and populate them with data so that you can do the following tasks where you will demonstrate that you know how to make SELECT
statements joining the three tables together.
Task 5 - Make queries about cars
To refresh your knowledge about joining tables together, here's a SELECT
statement which fetches all information about every car where the value for color is fetched from the colors
table and the value for make is fetched from the makes
table, using the color_id
and make_id
columns as the joining factors:
sqlite> SELECT make, color, LicenseNumber
FROM cars c, colors co, makes m
WHERE
c.color_id=co.color_id
AND
c.make_id=m.make_id;
Alternative syntax using JOIN:
sqlite> SELECT make,color,LicenseNumber
FROM cars
JOIN makes,colors ON
cars.make_id=makes.make_id
AND
cars.color_id=colors.color_id;
An alternative syntax is:
sqlite> SELECT make, color, LicenseNumber
FROM cars
JOIN makes ON
cars.make_id=makes.make_id
JOIN colors ON
cars.color_id=colors.color_id;
An alternative syntax using NATURAL JOIN is:
sqlite> SELECT make, color, LicenseNumber
FROM cars
NATURAL JOIN colors
NATURAL JOIN makes;
We recommend one of the versions using JOIN, because when you add conditions the WHERE clause quickly becomes complicated. The NATURAL JOIN is kind of neat, since it is certainly the shortest. It works because the columns we are joining on have the same name in the joined tables.
Now that you have refreshed the syntax for JOIN, your task is to produce four text files:
-
silver_cars.txt
-
dodges.txt
-
red_mazdas.txt
-
select_statements.txt
The file silver_cars.txt
shall contain all the data on cars with the color Silver. Here's a hint to get you started:
.once silver_cars.txt
SELECT make, color, LicenseNumber -- you'll have to fill in the rest here - look at the examples above!
Hint: Do not use the id columns for your condition. You should use the text 'Silver'
, because it is clearer than a magic number representing Silver in the colors
table. And what if the id for Silver changes in the future?
The file dodges.txt
shall contain all the data on cars of make Dodge. Here's a hint to get you started:
.once dodges.txt
SELECT make, color, LicenseNumber -- you'll have to fill in the rest here - look at the examples above!
Hint: Do not use the id columns for your condition. You should use the text 'Dodge'
, because it is clearer than a magic number representing Dodge in the makes
table. And what if the id for Dodge changes in the future?
The file red_mazdas.txt
shall contain all the data on cars whose make is Mazda and whose color is Red. Hint:
.once red_mazdas.txt
SELECT make, color, LicenseNumber -- you'll have to fill in the rest here - look at the examples above!
Hint: Do not use the id columns for your condition. You should use the texts 'Mazda'
and 'Red'
, because it is clearer than a magic number representing Mazda in the makes
table and a magic number representing Red in the colors
table. And what if the ids for Mazda and Red change in the future?
The file select_statements.txt
shall contain all the SQL-statements you used to create the three files above. You don't need to include the .once
directives.
Hint: Look carefully at the example JOINs above. Add suitable conditions to filter the results using a WHERE
clause at the end of the statements.
Check list
Before submitting the zip file, verify that your directory answer-handin02
contains the following files with the exact same names:
-
name.txt
-
license_number_constraint.txt
-
test_cases_license_number.txt
-
silver_cars.txt
-
dodges.txt
-
red_mazdas.txt
-
select_statements.txt
Zip the directory and the resulting zip file should be named exactly answer-handin02.zip
.
The reason we are so particular about the exact names of the files, is that we use a script to correct the submitted files. The script will only work if you follow our naming conventions. But a more important reason for requiring exact names, is to prepare you for very specific requirements. It is a skill which often is required when working with computers. If you have difficulties spelling names etc, you can use copy and paste for the file names. And double check with a friend before submitting.
The file name.txt
will be used by our script in order to extract your name when correcting, and create a directory with your name. So it is very important that the file is called exactly name.txt
and that it contains your name and nothing else.
Links
Source code etc
- Download the files in one big zip here (via DownGit)
Where to go next
This is the last chapter of the database book! Well done!
« Previous • Book TOC • Next »