CSCE 515: Computer Network Programming
Web Servers & Dynamic Web Documents
Wenyuan Xu
http://www.cse.sc.edu/~wyxu/csce515f07.html
Department of Computer Science and Engineering
University of South Carolina
Reference: http://www.cs.rpi.edu/~hollingd/netprog/notes/dyn_doc/dyn_doc.ppt
Web Server
Talks HTTP
Looks at METHOD, URI to determine what the client wants.
For GET, the URI is often just the path of a file (relative to some directory on the web server).
2007
CSCE515 – Computer Network Programming
[Figure: the request GET /foo/blah mapped onto a directory tree on the web server. Nodes shown: /, usr, bin, www, etc, foo, fun, gif, blah.]
In the good old days...
Years ago… the WWW was made up of (mostly) static documents.
  Each URL corresponded to a single file stored on some hard disk.
Today - many of the documents on the WWW are built at request time.
  A URL doesn’t correspond to a single file.
Dynamic Documents
Dynamic Documents can provide:
  automation of web site maintenance
  customized advertising
  database access
  shopping carts
  date and time service
Web Programming
Writing programs that create dynamic documents has become very important.
There are a number of general approaches:
Create a custom server for each service desired.
  Each is available on a different port.
Have the web server run external programs.
Develop a really smart web server.
  SSI, scripting, server APIs.
Custom Server
Write a TCP server that watches a “well known” port for requests.
Develop a mapping from HTTP requests to service requests.
Send back HTML (or whatever) that is created/selected by the server process.
Have to handle HTTP errors, headers, etc.
An Example Custom Server
We want to provide a time and date service.
Anyone in the world can find out the date and time (according to our computer)!
Custom Server request
We don’t care what is in the HTTP request; our reply doesn’t depend on it.
We assume the request comes from a browser that wants the content formatted as an HTML document.
WWW-based time and date server
Copyright @2003 DaveH Enterprises
Listen on a well-known TCP port.
Loop forever:
  Accept a connection.
  Find out the current time and date.
  Convert the time and date to a string.
  Send back some HTTP headers (Content-Type).
  Send the string wrapped in HTML formatting.
  Close the connection.
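The steps above can be sketched in C roughly as follows. This is a minimal sketch, not the course's reference implementation: the function names build_time_reply and serve_time are hypothetical, the backlog of 16 is arbitrary, and the time is reported in GMT (rather than local time) only so that the output is predictable.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: format a complete HTTP reply that carries the
 * given time as a small HTML page. Returns the reply length, or -1 if
 * buf is too small. */
int build_time_reply(time_t now, char *buf, size_t buflen)
{
    char stamp[64];
    char body[256];
    struct tm tm_utc;

    gmtime_r(&now, &tm_utc);
    if (strftime(stamp, sizeof(stamp), "%a, %d %b %Y %H:%M:%S GMT", &tm_utc) == 0)
        return -1;

    int body_len = snprintf(body, sizeof(body),
        "<HTML><BODY><H1>The time is %s</H1></BODY></HTML>\n", stamp);
    if (body_len < 0 || (size_t)body_len >= sizeof(body))
        return -1;

    int n = snprintf(buf, buflen,
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "Content-Length: %d\r\n"
        "\r\n"
        "%s", body_len, body);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}

/* Hypothetical server loop. Never returns on success; returns -1 if
 * the listening socket cannot be set up. */
int serve_time(unsigned short port)
{
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 16) < 0) {
        close(listenfd);
        return -1;
    }

    for (;;) {                                  /* loop forever */
        int connfd = accept(listenfd, NULL, NULL);
        if (connfd < 0)
            continue;

        /* Read whatever the client sent and ignore it:
         * the reply does not depend on the request. */
        char scratch[1024];
        if (read(connfd, scratch, sizeof(scratch)) < 0) {
            close(connfd);
            continue;
        }

        char reply[512];
        int n = build_time_reply(time(NULL), reply, sizeof(reply));
        if (n > 0 && write(connfd, reply, (size_t)n) < 0) {
            /* client went away; nothing more to do in this sketch */
        }
        close(connfd);
    }
}
```

Keeping the reply formatting separate from the socket loop makes the formatting easy to check without a network connection.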
Accessing our custom server
We can publish the URL to our server, or embed links to the server in other HTML documents.
We need to make sure the server is always running (on the published host and port).
Once we are famous we can include advertisements and make money!
Another Example
Keep track of how many times our server is hit each day.
Report on the number of hits our server got on any day in the past!
The Request and Reply
The reply now does depend on the request.
We have to remember that the request comes from an HTTP client, so we need to accept HTTP requests.
Time & Date Hit Server
Each request comes as a string (URI) specifying a resource.
Our requests will look like this: /mm/dd/yyyy (Y2K compliant!)
An example URL for our service:
http://www.timedate.com:4567/02/10/2000
We will get a request like: GET /02/10/2000 HTTP/1.1
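A request line of the form shown above can be split into month, day and year with sscanf. The following is a minimal sketch; the function name parse_date_request and the simple range checks are assumptions for illustration, and a real server would also validate the day against the month.

```c
#include <stdio.h>

/* Hypothetical parser for request lines of the form
 * "GET /mm/dd/yyyy HTTP/1.x". Returns 0 and fills in the three
 * output values on success, -1 on any mismatch. */
int parse_date_request(const char *line, int *month, int *day, int *year)
{
    int m, d, y, minor;

    /* All four conversions must succeed for the line to match. */
    if (sscanf(line, "GET /%2d/%2d/%4d HTTP/1.%d", &m, &d, &y, &minor) != 4)
        return -1;

    /* Coarse sanity checks only. */
    if (m < 1 || m > 12 || d < 1 || d > 31 || y < 0)
        return -1;

    *month = m;
    *day = d;
    *year = y;
    return 0;
}
```

Any request that does not match (a different method, a path that is not a date) is rejected, which is where the custom server would send back an HTTP error.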
Fancy means $$$
We want to provide a table that lists the number of hits received each hour of the day in question.
timedate.com hit report for 01/17/1999
hour | number of hits |
12-1AM | 4,320 |
1-2AM | 18,986 |
2-3AM | 246 |
HTML Basics
HTML Tables:
<TABLE>, </TABLE>  start/end a table
<TR>, </TR>  start/end a table row
<TD>, </TD>  start/end a table cell
<TH>, </TH>  start/end table header cell
timedate.com Hit Table
    <TABLE>
    <TR> <TH>hour</TH> <TH>number of hits</TH> </TR>
    <TR> <TD>12-1AM</TD> <TD>4,320</TD> </TR>
    <TR> <TD>1-2AM</TD> <TD>18,986</TD> </TR>
    </TABLE>
[Figure: the table as rendered by a browser, with rows 12-1AM 4,320; 1-2AM 18,986; 2-3AM 246]
New code
Record the “hit” in database.
Read request - parse request to month,day,year
Lookup hits for month,day,year in database.
Send back some http headers (Content-Type)
Create HTML table and send back to client.
Close the connection.
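The "create HTML table" step could look roughly like the sketch below. The function name build_hit_table and the layout of the hits array (one count per hour, starting at midnight) are assumptions; the database lookup is out of scope here, and thousands separators are omitted for brevity.

```c
#include <stdio.h>

/* Hypothetical formatter: write an HTML table with one row per hour
 * into buf. hits[h] is the count for hour h, where h = 0 is the hour
 * labelled "12-1AM". Returns the number of bytes written, or -1 if
 * buf is too small. */
int build_hit_table(const long *hits, int nhours, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen,
        "<TABLE>\n<TR><TH>hour</TH><TH>number of hits</TH></TR>\n");
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    size_t used = (size_t)n;

    for (int h = 0; h < nhours; h++) {
        /* Convert 24-hour indices to 12-hour labels: 0 -> 12, 13 -> 1. */
        int from = (h % 12 == 0) ? 12 : h % 12;
        int to = ((h + 1) % 12 == 0) ? 12 : (h + 1) % 12;
        const char *ampm = (h < 12) ? "AM" : "PM";

        n = snprintf(buf + used, buflen - used,
            "<TR><TD>%d-%d%s</TD><TD>%ld</TD></TR>\n", from, to, ampm, hits[h]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, buflen - used, "</TABLE>\n");
    if (n < 0 || (size_t)n >= buflen - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

The returned length is what the server would put in a Content-Length header before sending the table to the client.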
Drawbacks to Custom Server Approach
We might have lots of ideas for custom services.
Each requires a dedicated address (port).
Each needs to include:
  basic TCP server code
  parsing HTTP requests
  error handling
  headers
  access control (might want users to pay each time they check the time and date!)
Another Approach
Take a general purpose Web server (that can handle static documents) and have it process requested documents as it sends them to the client.
The documents could contain commands that the server understands (the server includes some kind of interpreter).
Example Smart Server
Have the server read each HTML file as it sends it to the client.
The server could look for this:
<SERVERCODE> some command
The server doesn’t send this part to the client; instead it interprets the command and sends the result to the client.
Everything else is sent normally.
Example Commands
<SERVERCODE> Time
<SERVERCODE> Date
<SERVERCODE> Hitlist
<SERVERCODE> Include file
<SERVERCODE> randomfile directory
Example Document
<TITLE>timedate.com Home Page</TITLE>
Welcome to timedate.com
<SERVERCODE> include fancygraphic
The current time is <SERVERCODE> time .
Today is <SERVERCODE> date .
Visit our sponsor: <SERVERCODE> random sponsor
Real Life - Server Side Includes
Many real web servers support this idea (but not the syntax I’ve shown).
Server Side Includes (SSI) provides a set of commands that a server will interpret.
SSI Configuration
Typically the server is configured to look for commands only in specially marked documents (so normal documents aren’t slowed down).
SSI Directives
SSI commands are called directives.
Directives are embedded in HTML comments.
A comment looks like this:
<!-- this is a comment -->
A directive looks like this:
<!--#command parameter="argument" -->
Some SSI Directives
echo: inserts the value of an environment variable into the page.
SSI servers keep a number of useful things in environment variables, for example DATE_LOCAL.
SSI echo example
The GMT time is: <!--#echo var="DATE_GMT" -->
The local time is: <!--#echo var="DATE_LOCAL" -->
Today is
SSI Directives
include: inserts the contents of a text file.
flastmod: inserts the time and date that a file was last modified.
This file last modified
SSI Directives (cont.)
exec: runs an external program and inserts the output of the program.
Current users:
Danger! Danger! Danger!
SSI Example
It is now:
Today is:
This file last modified
More Power
Some servers support elaborate scripting languages.
Scripts are embedded in HTML documents; the server interprets the script:
Microsoft Active Server Pages (ASP)
  VBScript, .asp
JavaServer Pages (JSP)
  java, .jsp
Hypertext Preprocessor (PHP)
  .php
Server Mapping and APIs
Some servers include a programming interface that allows us to extend the capabilities of the server by writing modules.
Specific URLs are mapped to specific modules instead of to files.
We could write our timedate.com server as a module and merge it with the web server.
External Programs
Another approach is to provide a standard interface between external programs and web servers.
  We can run the same program from any web server.
The web server handles all the HTTP; we focus on the special service only.
It doesn’t matter what language we use to write the external program.
Common Gateway Interface
CGI is a standard interface to external programs supported by most (if not all) web servers.
The interface that is defined by CGI includes:
  Identification of the service (external program).
  Mechanism for passing the request to the external program.
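To illustrate the second point: under CGI the web server passes the request to the external program through environment variables (and standard input for a request body), and the program writes header lines, a blank line, and then the document to standard output. The following is a minimal sketch in C. PATH_INFO is a standard CGI variable; the function name cgi_reply is hypothetical, and taking a FILE * parameter instead of writing directly to stdout is only there to make the function easy to exercise.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write a CGI reply to `out`: header lines, a blank line, then the
 * body. The web server sets PATH_INFO to the part of the URL path
 * that follows the program's own name. */
void cgi_reply(FILE *out)
{
    const char *path = getenv("PATH_INFO");
    if (path == NULL)
        path = "(none)";

    /* Plain text keeps the sketch free of HTML escaping concerns. */
    fprintf(out, "Content-Type: text/plain\n\n");
    fprintf(out, "You asked for: %s\n", path);
}
```

A real CGI program would call cgi_reply(stdout) from main; the web server adds the status line and any remaining HTTP headers.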
CGI Programming
CGI programs are often written in scripting languages (Perl, Tcl, etc.).
Java Servlets
  Java technology’s answer to CGI.
  Install software that implements Java Servlets:
    Apache Tomcat
    JavaServer Web Development Kit
    Sun’s Java Web Server
A Servlet Example
    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class HelloWWW extends HttpServlet {
        public void doGet(HttpServletRequest request,
                          HttpServletResponse response)
                throws ServletException, IOException {
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            out.println("<HTML>\n" +
                        "<HEAD><TITLE>Hello WWW</TITLE></HEAD>\n" +
                        "<BODY>\n" +
                        "<H1>Hello WWW</H1>\n" +
                        "</BODY></HTML>");
        }
    }
Result
[Figure: browser window displaying the page produced by the HelloWWW servlet]
SSI vs. CGI
SSI:
  Static HTML + embedded command
  add small amounts of dynamic content to pages, without doing a lot of extra work
CGI:
  Generate a complete HTML document in real time
  Powerful
Concurrent Server Design Alternatives
One child process per client
Spawn one thread per client
Preforking multiple processes
Prethreaded Server
One child per client
• Traditional Unix server:
  TCP: after call to accept(), call fork().
  UDP: after recvfrom(), call fork().
  Each process needs only a few sockets.
  Small requests can be serviced in a small amount of time.
• Parent process needs to clean up after children (call wait()).
One thread per client
• Almost like using fork - call pthread_create instead.
• Using threads makes it easier (less overhead) for sibling threads to share information.
• Sharing information must be done carefully (use pthread_mutex).
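As an illustration of careful sharing, the sketch below protects a hit counter that every client thread updates. The names hit_count, record_hit, read_hits and hit_worker are hypothetical and chosen to match the hit-counting example from earlier in these notes.

```c
#include <pthread.h>

/* Shared state: visible to every thread in the process. */
static long hit_count = 0;
static pthread_mutex_t hit_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called by each client-handling thread. The increment is a
 * read-modify-write, so it must happen while holding the mutex. */
void record_hit(void)
{
    pthread_mutex_lock(&hit_lock);
    hit_count++;
    pthread_mutex_unlock(&hit_lock);
}

/* Read the counter under the same lock. */
long read_hits(void)
{
    pthread_mutex_lock(&hit_lock);
    long n = hit_count;
    pthread_mutex_unlock(&hit_lock);
    return n;
}

/* Thread start routine for trying this out: records *arg hits. */
void *hit_worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        record_hit();
    return NULL;
}
```

Without the mutex, two threads could read the same old value and both write back old + 1, losing a hit.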
Example
Process version
    int main(int argc, char **argv)
    {
        listenfd = socket(…);
        Bind(listenfd…);
        Listen(listenfd, LISTENQ);
        Signal(SIGCHLD, sig_chld);
        Signal(SIGINT, sig_int);
        for ( ; ; ) {
            connfd = Accept(listenfd, …);
            if ( (pid = Fork()) == 0) {
                Close(listenfd);    /* child: close listening socket */
                doit(connfd);       /* child: serve the client */
                Close(connfd);
                exit(0);
            }
            Close(connfd);          /* parent: close connected socket */
        }
    }
Thread version
    int main(int argc, char **argv)
    {
        pthread_t tid;
        listenfd = socket(…);
        Bind(listenfd…);
        Listen(listenfd, LISTENQ);
        for ( ; ; ) {
            connfd = Accept(listenfd, …);
            Pthread_create(&tid, NULL, &doit, (void *) connfd);
        }
    }

    static void *doit(void *arg)
    {
        …
        Close((int) arg);
        return NULL;
    }
Prefork()’d Server
• Creating a new process for each client is expensive.
• We can create a bunch of processes, each of which can take care of a client.
• Each child process is an iterative server.
Prefork()’d TCP Server
• Initial process creates socket and binds to well known address.
• Process now calls fork() a bunch of times.
• All children call accept().
• The next incoming connection will be handed to one child.
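listen()
[Figure: TCP keeps two queues for a listening socket. An arriving SYN creates an entry in the incomplete connection queue; when the 3-way handshake completes, the entry moves to the completed connection queue, from which the server’s accept() takes connections. The sum of both queues cannot exceed backlog.]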
Preforking
• As the book shows, having too many preforked children can be bad.
• Using dynamic process allocation instead of a hard-coded number of children can avoid problems.
• The parent process just manages the children, doesn’t worry about clients.
Sockets library vs. system call
• A preforked TCP server won’t usually work the way we want if sockets are not part of the kernel:
  calling accept() is a library call, not an atomic operation.
• We can get around this by making sure only one child calls accept() at a time using some locking scheme.
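One possible locking scheme is an fcntl() record lock on a file that the parent opens before forking. The sketch below is one way to do it under stated assumptions: the names accept_lock_init and locked_accept and the /tmp file template are hypothetical, and a mutex in shared memory would be another option.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int lock_fd = -1;

/* Call once in the parent, before fork(). The children inherit the
 * descriptor but not any lock, so they exclude one another. */
int accept_lock_init(void)
{
    char path[] = "/tmp/accept_lock.XXXXXX";   /* assumed location */
    lock_fd = mkstemp(path);
    if (lock_fd < 0)
        return -1;
    unlink(path);   /* the file lives on until every descriptor closes */
    return 0;
}

/* Set or clear a lock covering the whole file, waiting if necessary. */
static int set_lock(short type)
{
    struct flock fl;
    int rc;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;           /* F_WRLCK to lock, F_UNLCK to unlock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* length 0 means "to end of file" */
    do {
        rc = fcntl(lock_fd, F_SETLKW, &fl);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

/* Only the child that holds the lock is inside accept(). */
int locked_accept(int listenfd)
{
    if (set_lock(F_WRLCK) < 0)
        return -1;
    int connfd = accept(listenfd, NULL, NULL);
    int saved_errno = errno;    /* keep accept()'s error for the caller */
    set_lock(F_UNLCK);
    errno = saved_errno;
    return connfd;
}
```

Each child would call locked_accept(listenfd) in place of accept(listenfd, …) in its loop.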
Prethreaded Server
• Same benefits as preforking.
• Can also have the main thread do all the calls to accept() and hand off each client to an existing thread.
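The hand-off in the second bullet needs a shared structure between the main thread and the workers. A bounded queue guarded by a mutex and two condition variables is one common choice, sketched below. The names queue_put and queue_get and the capacity QSIZE are assumptions for illustration.

```c
#include <pthread.h>

#define QSIZE 32    /* assumed capacity of the hand-off queue */

static int queue[QSIZE];
static int q_head = 0, q_tail = 0, q_count = 0;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_nonempty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t q_nonfull = PTHREAD_COND_INITIALIZER;

/* Main thread: call after each accept() to hand off the client.
 * Blocks while the queue is full. */
void queue_put(int connfd)
{
    pthread_mutex_lock(&q_lock);
    while (q_count == QSIZE)
        pthread_cond_wait(&q_nonfull, &q_lock);
    queue[q_tail] = connfd;
    q_tail = (q_tail + 1) % QSIZE;
    q_count++;
    pthread_cond_signal(&q_nonempty);   /* wake one waiting worker */
    pthread_mutex_unlock(&q_lock);
}

/* Worker thread: blocks until a client is available, then returns
 * its connected socket. Clients come out in arrival order. */
int queue_get(void)
{
    pthread_mutex_lock(&q_lock);
    while (q_count == 0)
        pthread_cond_wait(&q_nonempty, &q_lock);
    int connfd = queue[q_head];
    q_head = (q_head + 1) % QSIZE;
    q_count--;
    pthread_cond_signal(&q_nonfull);    /* wake the producer if waiting */
    pthread_mutex_unlock(&q_lock);
    return connfd;
}
```

Each worker thread loops on queue_get(), serves the client, and closes the socket; the main thread only calls accept() and queue_put().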
What’s the best server design for my application?
• Many factors:
  Expected number of simultaneous clients.
  Transaction size (time to compute or look up the answer).
  Variability in transaction size.
  Available system resources (perhaps what resources can be required in order to run the service).
Server Design
• It is important to understand the issues and options.
• Knowledge of queuing theory can be a big help.
• You might need to test a few alternatives to determine the best design.