ITIC:HTML
Contents
- 1 Introduction to HTML
- 2 Links for Introduction to HTML
- 3 Below is an inclusion of another page on this wiki
- 4 Meta information about this chapter
- 5 Chapter videos
- 6 Introduction
- 7 Links
- 8 Exercises on HTTP
- 9 Exercises
- 9.1 Find header of www.sunet.se
- 9.2 Visit a simple HTML page
- 9.3 Use a command line program to fetch the same simple page
- 9.4 Netcat in listening mode
- 9.5 Connect to your server
- 9.6 Send data between the two
- 9.7 Write a script printing HTTP header and HTML content
- 9.8 Start netcat in listening mode and use wserver.sh
- 9.9 Connect to the above server using curl
- 9.10 Connect to the above server using a browser
- 9.11 Connect to the above server using a browser
- 9.12 Make netcat (listening mode) listen forever
- 9.13 Download an image using curl
- 9.14 Providing data in JSON format
- 9.15 Providing data from a database
- 9.16 Access the JSON data using curl
- 10 Links
- 11 Structure of an HTML document
- 12 More about tags
Introduction to HTML
This module introduces HTML. The purpose is not that you shall be fluent in designing web pages, but rather to give you a basic understanding of the structure of web pages, the meaning of markup and some basic elements, enough for you to write a very simple, albeit not very aesthetically appealing, web page, as well as understand the HTML code written by others.
HTML stands for HyperText Markup Language, and is a way to "mark up" text to give it structure and layout, for software to present (typically as web pages). It's not a programming language, but a way to annotate text so that a computer running software that can parse (make sense of) HTML can represent it visually. The hypertext in the name refers to text with links that allow the reader to instantly follow references to other documents (typically by clicking on a hyperlink with the reference text).
Full frontal - Code up-front
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>This is shown as the title of the window/tab</title>
</head>
<body>
<p>A paragraph of text looks like this.</p>
</body>
</html>
The above shows the basic structure of an HTML document.
History
HTML was created by Tim Berners-Lee around 1991, and together with the first browser and the HTTP protocol (also credited to Mr. Berners-Lee), made up the world wide web (www). The very first web page was at http://info.cern.ch/hypertext/WWW/TheProject.html and the code of that page shows the early version of the HTML markup.
HTML markup
To mark up text is to annotate the text to give it meaning in terms of structure and layout, by adding specially crafted "tags" to the text. The tags are not part of the actual textual content, but mark up parts of the text, to make it possible for computers (running programs that understand HTML) to understand how to lay out and present the text.
In HTML, you use elements that surround some text, or even surround some text including other elements. The ability to nest elements within elements creates a hierarchical structure (that can be represented as a tree) of the whole document.
An element consists of an opening tag, some content, and a closing tag. An example from the document above would be the p element, which has the opening tag <p>, some content (text) and the closing tag </p>:
<p>A paragraph of text looks like this.</p>
Each tag is created by a less than sign, the tag name and possible other information (attributes), and a greater than sign. The closing tag also has a slash immediately after the less than sign that starts the closing tag:
<p>Text</p>
<!-- <p> is the opening tag, </p> is the closing tag -->
Nesting elements
As you might have guessed, the p element stands for paragraph. Most elements have short names or abbreviations like p.
Elements, like we said before, can be nested:
<body>
<p>Paragraph text here.</p>
</body>
Above, the p element (a paragraph of text) is nested inside the body element (the body is the bulk of the document text).
When nesting elements, you must close the nested element inside the enclosing element. The following is wrong:
<p>Some <strong>text</p></strong>
Take some time to look at the full page example at the top of this page (under Full frontal) and make sure you understand how to correctly nest elements.
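The nesting rule can also be checked mechanically. The following is a minimal sketch, not a real HTML validator: the function name check_nesting is made up for this illustration, and it assumes that the markup contains only paired tags (no elements without a closing tag, such as meta, and no comments). It keeps a stack of open element names and requires every closing tag to match the most recently opened element.

```shell
# check_nesting: hypothetical helper, not part of the original examples.
# Prints "ok" if every closing tag matches the most recently opened
# element, otherwise reports the first problem it finds.
check_nesting() {
  # Extract the tags, one per line, then walk through them while keeping
  # a space-separated stack of the currently open element names.
  printf '%s\n' "$1" | grep -oE '</?[a-zA-Z][^>]*>' | (
    stack=""
    while read -r tag; do
      case $tag in
        "</"*)
          name=${tag#"</"}; name=${name%">"}
          # The top of the stack is the word after the last space.
          if [ "${stack##* }" != "$name" ]; then
            echo "mismatch at $tag"
            exit 1
          fi
          stack=${stack% *}          # pop the top element
          ;;
        *)
          name=${tag#"<"}; name=${name%%[ >]*}
          stack="$stack $name"       # push the opened element
          ;;
      esac
    done
    if [ -n "$stack" ]; then
      echo "unclosed element: ${stack##* }"
      exit 1
    fi
    echo "ok"
  )
}

check_nesting '<body><p>Some <strong>text</strong></p></body>'
check_nesting '<p>Some <strong>text</p></strong>' || true
```

The first call prints ok. The second call uses the wrong example from above and prints "mismatch at </p>", since the strong element is still open when the paragraph is closed.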
Block elements and inline elements
One way of categorizing elements is to think about them as two kinds: block elements and inline elements.
Block elements make the content inside appear (when rendered and presented by e.g. a browser) surrounded by newlines. The typical example of a block element is the p element, which creates a paragraph. A paragraph of text is a block of text displayed by itself, which is to say, surrounded by newlines. Note that the text inside a paragraph wraps the lines to fit the element width. In a large browser window, there are fewer lines per paragraph, since more text fits on each line. In a small browser window, a paragraph has more lines, since less text fits on each line.
The purpose of block elements is typically to provide structure to the document, headers, paragraphs etc.
The other kind of elements are the inline elements. As the name indicates, these elements are used inline, for example inside a line of text, without creating a newline before or after. Inline elements are therefore always (at least we can't think of any exception to this rule) nested inside block elements, like a paragraph (block) containing text, where part of the text is bold, or italic. So inline elements often are used for styling text, but also for transforming part of the text, like to make a link out of some words in the text:
<p>
This text contains a <a href="https://en.wikipedia.org/wiki/Hyperlink">hyperlink</a> and some <em>emphasized</em> text.
</p>
The above would render like:
This text contains a hyperlink and some emphasized text.
Attributes - customize elements
There is sometimes a need to provide some metadata to an element. If we take a link (the a element), we need to provide the URL for the linked document. Metadata is data about data, so the link has information about where it links to.
Attributes have names and values. You write the attribute inside the opening tag by adding a space and the attribute name after the element name, and then an equal sign and a quoted value. For the anchor tag, the a tag, one of the possible attributes is the href (hypertext reference) attribute. The value of the href attribute is the URL of the resource we want to link to. An element can have more than one attribute. For the anchor element, you can also have the title attribute, which will produce a tooltip for the user when she hovers the mouse over the link:
<a href="https://en.wikipedia.org/wiki/Hyperlink" title="Click here to get to a page explaining what hyperlinks are">hyperlink</a>
This would be rendered as a link with the text "hyperlink" that shows the title as a tooltip.
Sometimes an attribute is Boolean-valued, that is, either true or false. In this case, you don't need a value. Just putting the attribute name in an opening tag means that the attribute has the value true:
<video src="apa.mpg" muted></video>
The video will have the attribute muted set to true, that is, the video will be muted (sound is off).
White space
HTML was created for text documents (even if you can include images and other media). Text documents are easier to read when there is conventional whitespace like one single space between words, one single empty line between paragraphs etc. So programs that parse HTML and render web pages from it ignore extra whitespace, both between elements and within the text inside an element. So this:
<p>   Hello     whitespace   in    HTML   </p>
Is equivalent to this:
<p>Hello whitespace in HTML</p>
However, it is customary to indent the code in an HTML document. Each level of nesting has a fixed number of spaces for indentation - just to make it easier for humans to read the document and see the structure. So, it's a good thing that extra whitespace is ignored, because it allows humans to prettify the code and make it easier to read.
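To get a feeling for what "ignoring extra whitespace" means, the effect can be imitated on the command line. This is only a rough sketch of the idea, not how a browser is implemented; the function name collapse_ws is made up for this illustration. It turns tabs and newlines into spaces and squeezes runs of spaces into a single space.

```shell
# collapse_ws: hypothetical helper that roughly imitates how extra
# whitespace inside text is treated when HTML is rendered.
collapse_ws() {
  # First turn tabs and newlines into spaces, then squeeze repeated
  # spaces into one.
  printf '%s' "$1" | tr '\t\n' '  ' | tr -s ' '
}

collapse_ws 'Hello     whitespace
      in     HTML'
echo
```

The call above prints the text on one line with single spaces between the words, which is how the paragraph would read in a rendered page.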
Special characters
As with any computer language, we get to a situation where parts of the language conflict with parts of normal text. Since tags start with < and end with >, how do we represent text that also has a < or a >?
We have to use special codes (so called HTML entities) for representing special characters to be presented in the text of a web page. Here's a list of some of the most common special codes:
- < is produced by &lt;
- > is produced by &gt;
- " is produced by &quot;
- ' is produced by &apos;
- & is produced by &amp;
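If you generate HTML from a script, you typically want to replace these characters automatically. Below is a minimal sketch using sed; the function name html_escape is an assumption made for this example. Note that the ampersand has to be replaced first, otherwise the ampersands that belong to the other entities would be escaped a second time.

```shell
# html_escape: hypothetical helper; replaces characters that conflict
# with HTML markup by their entities.
# The ampersand rule must come first, otherwise the "&" produced by
# the other rules would be escaped again.
html_escape() {
  printf '%s\n' "$1" | sed \
    -e 's/&/\&amp;/g' \
    -e 's/</\&lt;/g' \
    -e 's/>/\&gt;/g' \
    -e 's/"/\&quot;/g'
}

html_escape 'Is 1 < 2 & 3 > 2?'
```

The call prints the text with &lt;, &amp; and &gt; in place of the three special characters, so that it can be placed safely inside an element.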
Comments
We often write comments in source code, so that we and other humans can get some explanation of the code.
In HTML, a comment is written like this:
<!-- This text is ignored -->
<p>Normal markup</p>
<!--
even the whole of this crap
on several lines
including markup like <strong>monkey wall paper</strong>
is ignored by the software parsing HTML
-->
Conclusion
This is all that we will share with you about HTML. There are so many excellent online resources for learning HTML that there's no point in us (who are no experts in the matter) trying to create yet another full HTML tutorial here. Please read the external links below, and have a look at the included page at the end.
Links for Introduction to HTML
Summary lecture slides
Video lecture and slides
- TODO - We will add videos here
Workshop slides
- TODO - We will create a workshop and add information about it here
Source code with the examples
- https://github.com/progund/web-misc/tree/master/web-basics/html
- If you know git, you may clone git@github.com:progund/web-misc.git or download files individually from the link above
Further reading
- https://www.girldevelopit.com/materials/intro-web
- https://www.girldevelopit.com/materials/html-intro
- https://www.codecademy.com/learn/learn-html
- https://developer.mozilla.org/en-US/docs/Learn/HTML/Introduction_to_HTML
- https://en.wikipedia.org/wiki/HTML
Where to go next
The next page is Privacy on the web.
Below is an inclusion of another page on this wiki
Note: Below is an inclusion of this page: Web:HTTP and HTTP_-_Exercises.
If you are interested in HTML, you can use the table of contents links to go down to the HTML part of this page, but we recommend that you read the HTTP sections as well. Understanding the HTTP protocol is a useful skill for anyone interested in HTML and web technology.
Meta information about this chapter
Introduction
Introduces the HTTP protocol, to prepare readers/students on web technology and web programming.
This chapter is used in Java Web programming and (as a partial include) in Web basics.
Purpose
In order to understand web, the students/readers need to understand the HTTP protocol.
Requirements
- Basic network terminology skills
- Basic command line skills with Bash
Goal
The student shall:
- Have basic knowledge of the HTTP protocol used in client-server connections between a web client and a web server
Concepts
- Protocol
- HTTP
- Web
- Request
- Response
- Headers
- Response body
- Request body
- Status code
Instructions to the teacher
Common problems
- Students don't understand the concept of "protocol"
- Students confuse HTML and HTTP
- Students don't understand the concept of Header and Body
- Students don't understand the importance of the content-type header
Chapter videos
All videos in this chapter:
- HTTP Introduction (Full playlist) | HTTP Introduction - 01 | 02 | 03 | 04 | 05 | pdf
Introduction
The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, and hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web. - HTTP (wikipedia)
Links
External links
Exercises on HTTP
These exercises are tested on Ubuntu, Fedora, MacOS and cygwin using netcat as installed using Juneday's installation scripts (Software Used). Please note that there might be small differences in how to use these commands on cygwin and MacOS (as well as differences between different versions of bash on different MacOS versions). Unfortunately, there are a couple of different programs called netcat/nc.
We try to keep the list below updated:
- MacOS:
  - HomeBrew: netcat is started using nc (alt /usr/local/bin/nc)
  - MacPorts: netcat is started using nc6 (alt /opt/local/bin/nc6)
  - Without the above the netcat exercises below are, as far as we know, not possible to do
- Cygwin: nc6
- Debian, Ubuntu, Fedora, RedHat: nc
A good way to figure out what to do if something below doesn't work on your computer is to ask the manual (e.g. man nc) or ask the command itself:
nc -help
OpenBSD netcat (Debian patchlevel 1.105-7ubuntu1)
This is nc from the netcat-openbsd package. An alternative nc is available
in the netcat-traditional package.
usage: nc [-46bCDdhjklnrStUuvZz] [-I length] [-i interval] [-O length]
[-P proxy_username] [-p source_port] [-q seconds] [-s source]
[-T toskeyword] [-V rtable] [-w timeout] [-X proxy_protocol]
[-x proxy_address[:port]] [destination] [port]
Command Summary:
-4 Use IPv4
-6 Use IPv6
-b Allow broadcast
-C Send CRLF as line-ending
-D Enable the debug socket option
-d Detach from stdin
-h This help text
-I length TCP receive buffer length
-i secs Delay interval for lines sent, ports scanned
-j Use jumbo frame
-k Keep inbound sockets open for multiple connects
-l Listen mode, for inbound connects
-n Suppress name/port resolutions
-O length TCP send buffer length
-P proxyuser Username for proxy authentication
-p port Specify local port for remote connects
-q secs quit after EOF on stdin and delay of secs
-r Randomize remote ports
-S Enable the TCP MD5 signature option
-s addr Local source address
-T toskeyword Set IP Type of Service
-t Answer TELNET negotiation
-U Use UNIX domain socket
-u UDP mode
-V rtable Specify alternate routing table
-v Verbose
-w secs Timeout for connects and final net reads
-X proto Proxy protocol: "4", "5" (SOCKS) or "connect"
-x addr[:port] Specify proxy address and port
-Z DCCP mode
-z Zero-I/O mode [used for scanning]
Port numbers can be individual or ranges: lo-hi [inclusive]
Troubleshooting
If you run into trouble while running the various scripts on this page, you can look at our Bash FAQ for some common problems and their solutions.
Exercises
Find header of www.sunet.se
Use a command line program (curl, wget, telnet, nc (netcat), ..) to find the HTTP headers sent by www.sunet.se
Suggested solution:
Curl:
$ curl --head www.sunet.se
HTTP/1.1 301 Moved Permanently
Date: Sat, 18 Nov 2017 12:21:45 GMT
Server: Apache/2.4.7 (Ubuntu)
Location: https://www.sunet.se/
Content-Type: text/html; charset=iso-8859-1
Wget: Wget is a bit tougher to use to find the headers. Here's our suggested way to find the headers:
$ wget --server-response www.sunet.se
--2017-11-18 13:25:18-- http://www.sunet.se/
Resolving www.sunet.se (www.sunet.se)... 192.36.171.231, 2001:6b0:8:2::233, 2001:6b0:8:2::232
Connecting to www.sunet.se (www.sunet.se)|192.36.171.231|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 301 Moved Permanently
Date: Sat, 18 Nov 2017 12:25:18 GMT
Server: Apache/2.4.7 (Ubuntu)
Location: https://www.sunet.se/
Content-Length: 306
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
Location: https://www.sunet.se/ [following]
--2017-11-18 13:25:18-- https://www.sunet.se/
Connecting to www.sunet.se (www.sunet.se)|192.36.171.231|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Sat, 18 Nov 2017 12:25:19 GMT
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 11627
Content-Type: text/html; charset=UTF-8
Age: 0
X-Cache: MISS
X-Cache-Hits: 0
Connection: keep-alive
Accept-Ranges: bytes
Length: 11627 (11K) [text/html]
Saving to: ‘index.html’
index.html 100%[===================>] 11,35K --.-KB/s in 0,001s
2017-11-18 13:25:19 (7,80 MB/s) - ‘index.html’ saved [64246]
telnet: You use telnet interactively. In the session below, you type the telnet command and the line HEAD / HTTP/1.0, followed by an empty line; the rest is output:
$ telnet www.sunet.se 80
Trying 192.36.171.231...
Connected to www.sunet.se.
Escape character is '^]'.
HEAD / HTTP/1.0
HTTP/1.1 301 Moved Permanently
Date: Sat, 18 Nov 2017 12:28:42 GMT
Server: Apache/2.4.7 (Ubuntu)
Location: https:///
Connection: close
Content-Type: text/html; charset=iso-8859-1
Connection closed by foreign host.
netcat or nc: You use netcat interactively. In the session below, you type the nc command and the line HEAD / HTTP/1.0, followed by an empty line; the rest is output:
$ nc www.sunet.se 80
HEAD / HTTP/1.0
HTTP/1.1 301 Moved Permanently
Date: Sat, 18 Nov 2017 12:35:23 GMT
Server: Apache/2.4.7 (Ubuntu)
Location: https:///
Connection: close
Content-Type: text/html; charset=iso-8859-1
You may need to end the session by pressing Ctrl-d.
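The interactive sessions above can also be scripted. The sketch below only builds the request text; the function name head_request is made up for this illustration, and the Host header is an addition that the interactive examples above do not send. In HTTP, each line of the request ends with CRLF and an empty line ends the request.

```shell
# head_request: hypothetical helper that prints a minimal HTTP/1.0 HEAD
# request for the host given as $1. Each line ends with CRLF and an
# empty line ends the request. The Host header is an assumption of this
# sketch; the interactive sessions above do not send it.
head_request() {
  printf 'HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n' "$1"
}

# Show the request text (piping it through cat -v displays the CR as ^M).
head_request www.sunet.se

# To actually send it (needs network access; use nc6 where applicable):
#   head_request www.sunet.se | nc www.sunet.se 80
```

Depending on which netcat you have, the piped version may not terminate by itself when the server closes the connection; the -w option listed in the help text above sets a timeout.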
lwp-request -m HEAD (or HEAD):
$ HEAD http://www.sunet.se/
200 OK
Connection: close
Date: Sat, 18 Nov 2017 12:39:02 GMT
Age: 0
Vary: Accept-Encoding
Content-Type: text/html; charset=UTF-8
Client-Date: Sat, 18 Nov 2017 12:39:02 GMT
Client-Peer: 192.36.171.231:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=NL/ST=Noord-Holland/L=Amsterdam/O=TERENA/CN=TERENA SSL CA 3
Client-SSL-Cert-Subject: /C=SE/ST=Stockholm/L=Stockholm/O=SUNET/CN=*.sunet.se
Client-SSL-Cipher: ECDHE-RSA-AES256-GCM-SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
X-Cache: MISS
X-Cache-Hits: 0
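Responses like the ones from curl, telnet and netcat above start with a status line consisting of the protocol version, a numeric status code and a short text. As a small sketch of how the status code could be picked out in a script (the function name status_code is made up for this illustration):

```shell
# status_code: hypothetical helper; reads an HTTP response on stdin and
# prints the numeric status code found on the first line.
status_code() {
  # Keep the status line, drop a possible carriage return, and print
  # the second space-separated field.
  head -n 1 | tr -d '\r' | cut -d ' ' -f 2
}

# Part of the response header from the curl example above.
response='HTTP/1.1 301 Moved Permanently
Location: https://www.sunet.se/
Content-Type: text/html; charset=iso-8859-1'

printf '%s\n' "$response" | status_code
```

For the sample above this prints 301, the code for a permanent redirect, which is why wget went on to fetch the address given in the Location header.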
Visit a simple HTML page
Point your browser to http://wiki.juneday.se/example-data/simple.html. View the source of the page.
Suggested solution:
In Chrome and Firefox, press Ctrl-u.
Use a command line program to fetch the same simple page
Use a command line program to download the same web page (http://wiki.juneday.se/example-data/simple.html) as you viewed with a browser.
Suggested solution:
Wget:
$ wget http://wiki.juneday.se/example-data/simple.html
--2017-11-18 16:28:55-- http://wiki.juneday.se/example-data/simple.html
Resolving wiki.juneday.se (wiki.juneday.se)... 129.16.69.98
Connecting to wiki.juneday.se (wiki.juneday.se)|129.16.69.98|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 83 [text/html]
Saving to: ‘simple.html’
simple.html 100%[===================>] 83 --.-KB/s in 0s
2017-11-18 16:28:55 (3,76 MB/s) - ‘simple.html’ saved [83/83]
Curl:
$ curl -o simple.html http://wiki.juneday.se/example-data/simple.html
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 83 100 83 0 0 83 0 0:00:01 --:--:-- 0:00:01 1566
Compare the content of the source you viewed in the previous exercise with the content of the file simple.html. Are they similar?
Suggested solution:
The contents should be the same:
<html>
<head><title>simple</title></head>
<body>
<p>Hello HTTP</p>
</body>
</html>
Netcat in listening mode
You can start up netcat in something called listening mode, which makes netcat listen for incoming connections (netcat acts like a server). A web server also listens for incoming connections. The web server waits until some client (browser, curl ..) connects and then the two can begin to communicate (using HTTP). You should start netcat (in listening mode, using port 8181) in one terminal. In a way, this is faking a proper web server.
Note: check the text above about netcat/nc to make sure you invoke netcat/nc using the correct name.
Suggested solution:
$ nc -l -p 8181
Note: netcat/nc needs to be started using different names on the platforms Juneday supports. Here's a list:
- MacOS:
  - HomeBrew: netcat is started using nc (alt /usr/local/bin/nc)
  - MacPorts: netcat is started using nc6 (alt /opt/local/bin/nc6)
  - Without the above the netcat exercises below are, as far as we know, not possible to do
- Cygwin: nc6
- Debian, Ubuntu, Fedora, RedHat: nc
Connect to your server
You should now start up netcat in normal mode on port 8181 in another terminal, while keeping the listening netcat running.
Note: You should have two terminals with one netcat process running in each terminal.
Suggested solution:
$ nc localhost 8181
Send data between the two
Type in some data in one of the terminals and see what happens.
Suggested solution:
Typing data in the listening netcat:
Hi there
the following can be seen in the client netcat:
Hi there
If we type something in the client:
I am a client
we should be able to see the same text in the listening netcat:
I am a client
Using netcat we can send data between processes. This is also what a normal web server does. Can we use netcat as a web server? Check out the following exercises.
Write a script printing HTTP header and HTML content
Write a script that prints out a header, typically like this
HTTP/1.0 200 OK
Connection: close
Date: sat nov 18 21:15:21 CET 2017
Server: netcat special deal
Content-Type: text/html; charset=utf-8
Cache-Control: max-age=60
and some content, typically:
<html>
<head><title>Some web page title</title></head>
<body>
<p>Hello HTTP</p>
</body>
</html>
Invoke the program to make sure it works.
Suggested solution:
Store the following text in a text file, typically called wserver.sh
#!/bin/bash

#
# Function to output header
#
header()
{
    echo "HTTP/1.0 200 OK"
    echo "Connection: close"
    echo "Date: $(date)"
    echo "Server: netcat special deal"
    echo "Content-Type: text/html; charset=utf-8"
    echo "Cache-Control: max-age=60"
    # An empty line marks the end of the header section
    echo ""
    echo ""
}

#
# Function to output content
#
content()
{
    echo "<html>"
    echo "<head><title>Some web page title</title></head>"
    echo "<body>"
    echo "<p>Hello HTTP</p>"
    echo "</body>"
    echo "</html>"
}

CONTENT=$(content)
# Size of the content in bytes (computed, but not sent as a header here)
CLENGTH=$(echo "$CONTENT" | wc -c)
LENGTH=$(( CLENGTH + 1 ))

header
# Quote the variable so that the line breaks in the content are kept
echo "${CONTENT}"

# close with EOF
exec 1>&-
Make sure to make the script executable: chmod a+x wserver.sh
and then invoke the script:
$ ./wserver.sh
HTTP/1.0 200 OK
Connection: close
Date: lör nov 18 21:24:48 CET 2017
Server: netcat special deal
Content-Type: text/html; charset=utf-8
Cache-Control: max-age=60
<html>
<head><title>Some web page title</title></head>
<body>
<p>Hello HTTP</p>
</body>
</html>
The source above can be found here: wserver.sh or here wserver.sh if you want to view in your browser.
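The script above computes the length of the content but never sends it. As a sketch of how a Content-Length header could be included, here is a hypothetical variant written as a single function; the name http_response and the use of printf with explicit CRLF line endings are choices made for this illustration, not taken from wserver.sh.

```shell
# http_response: hypothetical variant of the header/content idea above.
# Prints a complete HTTP/1.0 response for the body given as $1,
# including a Content-Length header computed from the body.
http_response() {
  local body=$1 length
  # Count the bytes of the body; the arithmetic expansion strips the
  # padding that some wc implementations add.
  length=$(( $(printf '%s' "$body" | wc -c) ))
  printf 'HTTP/1.0 200 OK\r\n'
  printf 'Content-Type: text/html; charset=utf-8\r\n'
  printf 'Content-Length: %d\r\n' "$length"
  printf '\r\n'              # empty line separates headers from body
  printf '%s' "$body"
}

http_response '<p>Hello HTTP</p>'
echo
```

The body in the example is 17 bytes long, so the response contains the header Content-Length: 17. The output of such a function could be piped to a listening netcat in the same way as the output of wserver.sh.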
Start netcat in listening mode and use wserver.sh
We can now start netcat in listening mode on port 8181 and use wserver.sh to "answer" (output) text when someone connects. To start netcat in listening mode we do:
$ nc -l -p 8181
but then we need to type in the HTTP response and the HTML content ourselves. Let's use pipes instead:
$ ./wserver.sh | nc -l -p 8181
Connect to the above server using curl
Use curl to connect to localhost on port 8181. What do you see?
Suggested solution:
$ curl localhost:8181
<html>
<head><title>Some web page title</title></head>
<body>
<p>Hello HTTP</p>
</body>
</html>
What we see is the content from wserver.sh. Cool, isn't it? :)
Connect to the above server using a browser
Start netcat in listening mode with wserver.sh
as you did before. Now, instead of connecting using curl, you should use a browser. Type in http//localhost:8181
in the address bar (location). What do you see?
Use curl to connect to localhost
on port 8181
. What do you see?
Expand using link to the right to see a suggested solution/answer.
You should see
Hello HTTP
Connect to the above server using a browser
Try reloading the page above. It doesn't work. Why?
Suggested solution:
Since netcat (in listening mode) exits after the communication is done, there is no longer a server listening on port 8181.
Make netcat (listening mode) listen forever
You can use a while loop in bash to make netcat start up again once it has finished communicating with a client. If you do this you can reload the page in a browser.
Suggested solution:
$ while (true); do ./wserver.sh | nc -l -p 8181; done
Download an image using curl
You can use curl to download other stuff than HTML pages. Let's try with this image: http://wiki.juneday.se/example-data/progund.png
Suggested solution:
$ curl --output progund.png http://wiki.juneday.se/example-data/progund.png
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 10135 100 10135 0 0 10135 0 0:00:01 --:--:-- 0:00:01 224k
or
$ curl -Oq http://wiki.juneday.se/example-data/progund.png
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 10135 100 10135 0 0 10135 0 0:00:01 --:--:-- 0:00:01 224k
Note that also curl may come in different versions. If you want to investigate the various options, you can ask curl what they mean:
$ curl --help |grep output
 -f, --fail Fail silently (no output at all) on HTTP errors (H)
 -i, --include Include protocol headers in the output (H/F)
 -N, --no-buffer Disable buffering of the output stream
 -o, --output FILE Write to FILE instead of stdout
 -O, --remote-name Write output to a file named as the remote file
 -R, --remote-time Set the remote file's time on the local output
 -s, --silent Silent mode (don't output anything)
 --trace-ascii FILE Like --trace, but without hex output
 --trace-time Add time stamps to trace/verbose output
 -w, --write-out FORMAT Use output FORMAT after completion
$ curl --help |grep '\-O'
 -O, --remote-name Write output to a file named as the remote file
Providing data in JSON format
Let's say we want to provide a client (a program) with a list of products. One way would be to open up our database and let anyone connect. Not very secure, is it? OK, we know, you could require a user name and password, but this is still not a good solution. The usual way to transfer information such as a list of products is to use XML or JSON. Instead of writing a server that reads data from a database and produces JSON, we can fake it all. This is often a useful strategy if you want to focus on the client instead of the server. It is not far-fetched to imagine a case where a development team is divided into two smaller teams, one developing the server and the other developing the client.
Download (info on downloading) the script produce-json-static.sh. Make sure the script has execute permission (chmod a+x produce-json-static.sh). You can now execute the script:
$ ./produce-json-static.sh
HTTP/1.0 200 OK
Connection: close
Date: Sat Nov 18 23:17:22 CET 2017
Server: netcat special deal
Content-Type: application/json
Cache-Control: max-age=60
[
{
"name" : "Johanneshof Reinisch Pinot Noir",
"price" : "143.0",
"alcohol" : "13.0"
},
{
"name" : "Dobogó Tokaji Furmint",
"price" : "187.0",
"alcohol" : "13.5"
}
]
You should now start up netcat in listening mode and use the script's output to be sent to the client. The netcat program should restart for ever and ever ....
Suggested solution:
while (true) ; do ./produce-json-static.sh | nc -l -p 8181 ; done
Providing data from a database
This time we're going to use a small script that we have written, which outputs (to stdout) the products in a database in JSON format.
Note: we do not require you to understand the JSON format nor the bash script producing JSON. We want you to have seen a typical way of transferring data between a client and a server.
Download the script produce-json.sh and the database bolaget.db. Make sure the script has execute permission (chmod a+x produce-json.sh). You can now execute the script:
$ ./produce-json.sh
The script limits the number of products to 20. If you want to output them all:
$ ./produce-json.sh --all
If you want to use another limit (e.g. 3):
$ ./produce-json.sh 3
Ok, finally we can start to work. You should now start up netcat in listening mode and use the script's output to be sent to the client. The netcat program should restart for ever and ever ....
Note: this bash script is painfully slow. We do not in any way recommend writing critical web servers using bash together with netcat. We use it here since we believe it gives you a good insight into how data is transferred.
Suggested solution:
while (true) ; do ./produce-json.sh 2 | nc -l -p 8181 ; done
Access the JSON data using curl
Use curl (in another terminal) to get the JSON data.
Suggested solution:
$ curl localhost:8181
[
{
"name" : "Johanneshof Reinisch Pinot Noir",
"price" : "143.0",
"alcohol" : "13.0"
},
{
"name" : "Dobogó Tokaji Furmint",
"price" : "187.0",
"alcohol" : "13.5"
}
]
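A client usually wants to do something with the data it receives. As a rough sketch, the product names can be picked out of the JSON with sed; the function name product_names is made up for this illustration. The approach relies on every field being on a line of its own, as in the output above, so it is not a general JSON parser (a dedicated JSON tool would be the robust choice).

```shell
# product_names: hypothetical helper; reads JSON on stdin and prints the
# value of every "name" field, one per line. It relies on each field
# being on its own line, so it is not a general JSON parser.
product_names() {
  sed -nE 's/.*"name" *: *"([^"]*)".*/\1/p'
}

# Sample data copied from the output above.
product_names <<'EOF'
[
  {
    "name" : "Johanneshof Reinisch Pinot Noir",
    "price" : "143.0",
    "alcohol" : "13.0"
  },
  {
    "name" : "Dobogó Tokaji Furmint",
    "price" : "187.0",
    "alcohol" : "13.5"
  }
]
EOF
```

With the server loop from the previous exercise running, the same function could be fed from curl, for example curl -s localhost:8181 | product_names.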
Links
Solutions (source code)
TODO: Verify that these are the correct files
- https://raw.githubusercontent.com/progund/web-misc/master/exercises/produce-json-static.sh
- https://github.com/progund/web-misc/blob/master/exercises/produce-json.sh
- https://github.com/progund/web-misc/raw/master/exercises/bolaget.db
- https://raw.githubusercontent.com/progund/web-misc/master/exercises/wserver.sh
End inclusion Web:HTTP and HTTP_-_Exercises
Note: this page is an inclusion of this page: Introduction_to_HTML.
This chapter introduces the very basics of HTML - HyperText Markup Language.
HTML is a text markup language used on the web. It is typically viewed with a web browser, which is a program for navigating the web and parsing and displaying documents written in HTML. Using tags to mark up content, HTML will represent a "page" with structure and elements.
Most of a page written in HTML consists of text, but HTML can also include images, sound and even video. Using embedded content written in certain web programming languages, a web page can also contain applications with interactive capabilities, like games and user interfaces. Client-side logic can also be achieved by incorporating (or referencing) JavaScript, a scripting language originally developed for adding programmable logic to (otherwise static) web pages.
Another key component of a web page is the hyperlink. A hyperlink is a clickable object (typically text but images can also be used as a clickable link).
Now, a web page is meant to be rendered by a browser (or some other application with HTML capabilities) and it has to be written using the HTML markup language. However, this is not a web design course (and the authors are certainly not web developers or designers!). We include this introduction to HTML in our Web basics book, purely for orientation since we do believe that a basic understanding of the web requires also basic knowledge of HTML and related topics.
Structure of an HTML document
The document starts with a doctype declaration. In HTML5 it looks as the following:
<!DOCTYPE html>
In HTML4, an example of a doctype declaration could be:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
HTML prior to HTML5 includes a DTD in the doctype. A DTD is a document type definition, a formal grammar (set of rules) for the markup language version in question. The one above is for HTML 4.01 with strict rules.
After the doctype declaration follows a hierarchy of "tags". Tags are written using < and >. A tag is usually a pair of "start tag" and "end tag" with some content (or other tags) in between. The hierarchical properties of the tags come from the fact that they can be nested in a hierarchy, like a tree structure. The root element is the root of the tree, and inside it you put branches and leaves. The root element in HTML is <html>. The end tag matching <html> is </html> (which is also the last tag in the document). Most tags work like this: you open with <sometag>, then some content, and close with </sometag>.
Inside the root element, we find the <head> element (which can contain for instance a <title> element). After the <head> follows the body.
Here's a small but complete document:
<!DOCTYPE html>
<html>
<head>
<title>This is the page title</title>
</head>
<body>
<h1>This is a level one header</h1>
<p>
This is a paragraph of text with a <a href="http://wiki.juneday.se">link to Juneday</a>.
</p>
</body>
</html>
You can save the above markup in a file called example.html and open it in a browser to investigate how the browser will display the content. You can download the file here (use e.g. wget to download it).
As you see in the example above, you may use indentation to make the structure more visible. White space is generally ignored in HTML.
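If you prefer the command line, a document like this can also be created directly from the shell. The sketch below writes a slightly shortened version of the example to a file named example.html in a temporary directory (the directory is an assumption of this sketch; any directory works) and then lists the tags it finds, as a quick way to see the structure.

```shell
# Write a small HTML document to example.html in a temporary directory
# (the directory is an assumption of this sketch; any directory works).
dir=$(mktemp -d)
cat > "$dir/example.html" <<'EOF'
<!DOCTYPE html>
<html>
  <head>
    <title>This is the page title</title>
  </head>
  <body>
    <h1>This is a level one header</h1>
    <p>This is a paragraph of text.</p>
  </body>
</html>
EOF

# Print the path of the file, then the opening and closing tags found
# in it, one per line (closing tags keep their slash).
echo "$dir/example.html"
grep -oE '</?[a-zA-Z0-9]+' "$dir/example.html" | tr -d '<'
```

The printed path can be opened in a browser to see how the document is rendered.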
More about tags
Opening tags, closing tags and attributes
So, a tag usually comes in a pair with an opening tag and some content (which might be text or other tags) and a closing tag, like <title>This is the title</title>. Tags have a name (specified in the HTML specification) but can also have "attributes". Attributes are like metadata about the tag defining some property of the tag.
For instance the so called anchor tag for links has an attribute defining the target of the link (the destination for those clicking on the link):
<a href="http://wiki.juneday.se">This is the link text</a>
The attribute has a name and a value. The value follows the equal sign after the name and should be placed inside double quotes.
Structural elements
The tags for purely structuring the document are many. We'll list a few of them here.
Headers
Headers are for a logical structure of your document. They have levels where the level one headers (<h1>Top level header</h1>) are for top level section headers of your text. A header for a subsection would then be <h2>Subsection header</h2> and then level 3, level 4 etc. The names of the header tags start with an "h" followed by the number of the level:
<h1>This is the top level header</h1>
<p>A paragraph of text for the top level section of your text</p>
<h2>This is a level 2 subsection header</h2>
<p>This is some text for the subsection</p>
The top level header will be rendered larger by the browser. The following levels will be rendered increasingly smaller and smaller (to some degree - rendering varies between browsers). The structural elements (tags) are for logical structure and how they affect the rendered page varies between browsers.
Paragraphs
Paragraphs of text are created with the "p tag":
<p>
This is a paragraph of text.
</p>
<p>
This is another paragraph of text.
</p>
Lists
You can create structural lists using the ol tag (for ordered lists) or the ul tag (for un-ordered lists). The elements of a list are created using the li tag:
<ul>
<li>This is element one of an unordered list</li>
<li>This is element two of the list</li>
<li>And this is element three</li>
</ul>
<ol>
<li>This list is automatically ordered (numbered)</li>
<li>This will be numbered 2 then</li>
<li>And this will be numbered 3!</li>
</ol>
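Since lists have such a regular structure, they are easy to generate from a script. Here is a minimal sketch; the function name html_list is made up for this illustration, and the items are printed as they are, so they must not contain characters that need to be written as HTML entities.

```shell
# html_list: hypothetical helper; prints an HTML list.
# $1 is the list tag (ul or ol), the remaining arguments are the items.
# The items are printed as they are, without any escaping.
html_list() {
  local tag=$1 item
  shift
  printf '<%s>\n' "$tag"
  for item in "$@"; do
    printf '  <li>%s</li>\n' "$item"
  done
  printf '</%s>\n' "$tag"
}

html_list ol "Fetch the page" "Read the headers" "Read the body"
```

The call prints an ol element with three li elements, which a browser would render as a list numbered 1 to 3. Passing ul as the first argument gives an un-ordered list instead.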
End inclusion Introduction to HTML