
Alan Williamson



Related Topics: Apache Web Server Journal, Java Developer Magazine


To Serve or not to Serve

Introduction
Just when you were beginning to get the hang of Java and had figured out it was more than just an animation tool, out comes yet another Java-related technology, complete with its own set of rules and conditions to dazzle and confuse. But what is so special about this new one, that we should stop and take note of it?

Up until now, Java has been very much associated with the client side of the client/server equation, popping up in applets and Beans. The server side has been relatively untouched. In fact, for a while it looked as if the Java community had forgotten all about the server, determined to take over the world with applets. But now that is about to change. The Java Servlet API has arrived and world domination is just around the corner!

OK, I may be a little bit passionate, but the Servlet API is going to make an enormous impact on the Web community and especially the client/server arena. This article addresses what the Servlet API actually is and where it can be used. Towards the conclusion, we will look at implementing an example program that demonstrates the power of this new technology.

Servlets
So what is the Servlet API? A servlet is a small program that runs in response to a client connection or request. The server facilitates the connection process with the client and, once the connection has been established, hands over the processing to a servlet.

As a demonstration of its power and flexibility, JavaSoft has developed a Java-based web server, JWS. JWS is a commercially available web server supporting all the features currently found in mainstream web servers but with the addition of supporting servlet execution. Fortunately, JWS is not the only web server to be supporting the servlet API. At the time of this writing, Apache, Netscape and Microsoft have released extensions to their existing web servers to support servlets.

Note: The term 'servlets' is not unique to JavaSoft's implementation. Netscape has its own Servlet API that works very similarly to JavaSoft's offering. However, this version, although Java-based, is supported only on Netscape's own Web and Enterprise Servers and no plans for other server vendors to support it have been announced.

Servlets are to the server what applets are to the client. They extend the functionality offered by the server just as applets enrich the browser environment.

For this reason, this article concentrates on the servlet API, demonstrating the power this new set of classes has over existing technologies.

A Better CGI?
The CGI (Common Gateway Interface) standard is a set of rules governing the collection and passing of data to back end programs. For this reason, CGI was well suited to the Web community. CGI gave the ability to create dynamic Web pages, depending on input from the user. For example, HTML form processing is commonly implemented with a CGI script. Another popular area in which CGI has been dutifully employed is hit counters - counting the number of visits to a particular page or resource.

So, are servlets just another alternative to the CGI interface? If only the world were as simple as this, wouldn't it be a wonderful place to live? In the most rudimentary definition, the answer would be yes. However, to appreciate the servlet API, one has to look at what CGI actually is and what its shortcomings are.

CGI does not dictate the underlying programming language; current script implementations can be found in Perl, C and C++, to name a few. Every time a CGI script is called upon to execute, a new program instance is created. This involves loading the code, allocating memory and then executing the program, taking the standard output and passing it straight back to the client. All of this takes time, and because every new connection starts a new program instance, hundreds or possibly thousands of simultaneous requests can clog the system with processes very quickly.

Servlets, on the other hand, operate differently. When a client connection comes in, the servlet is loaded into memory and then run. However, the servlet is not removed. Subsequent requests are run with a simple method call, with multiple requests being handled by the same servlet under the Java multithreading environment. The server maintains a pool of threads and then allocates a thread of execution to each client request that comes in. If a thread is not available, the client blocks until one does become free. This approach has many advantages.

The server never becomes uncontrollably consumed with processes. A fixed number of threads is set by the administrator, which eliminates the scenario in which 1000 client requests equate to 1000 separate processes being created, using up memory and processing cycles. The need to continually reload and initialize programs is completely eliminated by a 'load-once, run-many' policy.
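The thread-pool model described above can be sketched with nothing but the standard library. This is only an illustration under stated assumptions: SharedHandler and FixedPoolDemo are hypothetical names, SharedHandler is a stand-in for a servlet and does not use the Servlet API, and java.util.concurrent is assumed as the pool implementation, whereas a real server manages its own threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet: one instance, shared by every request.
class SharedHandler {
    private final AtomicInteger served = new AtomicInteger();
    private final Set<String> threadNames = ConcurrentHashMap.newKeySet();

    // Plays the role of service(...): an ordinary method call per request.
    void service(int requestId) {
        threadNames.add(Thread.currentThread().getName());
        served.incrementAndGet();
    }

    int served() { return served.get(); }
    int distinctThreads() { return threadNames.size(); }
}

class FixedPoolDemo {
    // Runs `requests` simulated requests on a pool of `poolSize` threads.
    static SharedHandler run(int requests, int poolSize) {
        SharedHandler handler = new SharedHandler();      // loaded once
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < requests; i++) {
            final int id = i;
            pool.execute(() -> handler.service(id));      // run many
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        SharedHandler h = run(1000, 4);
        System.out.println(h.served() + " requests handled by "
                + h.distinctThreads() + " thread(s)");
    }
}
```

Unlike the server described above, this pool queues surplus requests instead of blocking the caller. The effect on the machine is the same, though: the number of worker threads never exceeds the configured limit, and no program is loaded per request.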

To be fair to the CGI community, they had foreseen this shortcoming long before servlets were on the agenda. To address it, they came up with a new standard called FastCGI. Instead of loading the program for every request, FastCGI starts it once and keeps it running as a persistent process that handles many requests, which eliminates the load overhead. Writing such programs in C gives the fastest possible execution time, although the standard itself is not tied to one language. While this significantly improves CGI performance, as you will discover, FastCGI still falls far short of what the servlet-based solution offers.

Since CGI is a set of rules on how to move data from one point to another, it has no standard libraries or methods as such. For this reason, CGI scripts rely heavily on third party programs to do the majority of their work. For example, when an HTML form has been posted, the most common action is to mail the data to an e-mail account somewhere on the Internet. To achieve this, the CGI script must package the data into a file and then call upon some other program to perform the actual sending. This is not efficient: not only does the server have to load and initialize the CGI script itself, it now has to load another program and start its execution. Suddenly 1000 client connections result, in a worst-case scenario, in 2000 running processes.

Servlets are implemented in Java, which comes with a rich set of standard libraries that ensure the majority of tasks are kept inside the Java environment and do not require the assistance of any other program.

Another area of hot debate is the portability of CGI. Many claim it to be very portable between different platforms. However, if a script relies on a third party piece of software, such as the standard 'sendmail' program used for sending e-mails in the UNIX environment, how can this be portable? In order for it to work on an NT system, a 'sendmail' utility would need to be available somewhere on the system with exactly the same operational characteristics. For this reason, CGI may never be 100% portable across platforms.

Java, on the other hand, is Java; a true multiplatform language that relies on no underlying operating system. If everything is implemented in Java, then porting is no longer an issue. It is merely a question of copying the bytecodes (or .class files) to the server and then running. No recompiling and no awkward path names or environment variables to wrestle with. To many, this feature alone is enough to make servlets the winner hands down. Table 1 summarizes the differences between the two methods of processing client connections.

"Hello World"
Now that you have seen why you should be using servlets, let's take a look at how you use servlets and what the class hierarchy is.

All servlets implement the Servlet interface from the javax.servlet package, most simply by extending that package's GenericServlet class. This class supplies the basic functionality required in processing a client connection. Listing 1 shows the most famous program in the world: 'Hello World'.

As you can see, the servlet has two methods: init(...) and service(...). The init(...) method is called once at servlet initialization. All global variable initialization can be performed in this method, and if the servlet keeps state, a configuration file can be loaded here. Not every implementation needs to override init(...).

The service(...) method is where all the work is performed. For each client connection that is accepted, this method is called. This is why a servlet implementation is much more efficient than a CGI implementation. Instead of a new program being created for each client request, a method is simply run.
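To make the init-once, service-many pattern concrete, here is a minimal sketch of a hit counter. It deliberately avoids the Servlet API so that it runs on its own: HitCounter and HitCounterDemo are hypothetical names, and the parameter lists of init and service are simplified assumptions that mirror only the shape of the real methods.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a servlet; it mirrors only the init/service shape.
class HitCounter {
    private final AtomicLong hits = new AtomicLong();

    // Called once, when the instance is first loaded.
    void init(long startingCount) {
        hits.set(startingCount);
    }

    // Called for every request; the count survives between calls because
    // the same instance stays in memory.
    String service() {
        long n = hits.incrementAndGet();
        return "<html><body>You are visitor number " + n + "</body></html>";
    }
}

class HitCounterDemo {
    // Loads one HitCounter, initializes it once, then serves `requests` calls.
    static String[] simulate(int requests) {
        HitCounter servlet = new HitCounter();
        servlet.init(0);
        String[] pages = new String[requests];
        for (int i = 0; i < requests; i++) {
            pages[i] = servlet.service();
        }
        return pages;
    }

    public static void main(String[] args) {
        for (String page : simulate(3)) {
            System.out.println(page);
        }
    }
}
```

A CGI script would have to store the count in a file between invocations, because each invocation is a fresh program. Here the count simply lives in a field.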

A servlet has an input and an output stream. The input stream is where data from the client is passed. For example, a POST or a GET may contain information from the client as a result of a form posting. The output stream is the data that is sent back to the client and is typically displayed in their browser. Two interfaces have been designed to support this mechanism: ServletRequest and ServletResponse.

Those of you familiar with the Java environment will know all input and output is performed using Streams, not unlike the Streams used in C++.

// Inside service(...): _Req is the ServletRequest, _Res the ServletResponse
InputStream in = _Req.getInputStream();
OutputStream out = _Res.getOutputStream();

Having a reference to both streams means any of the standard stream handlers can be used for communication. For instance, in our 'Hello World' example we used the PrintStream to give us a very easy means to send strings to the client. We then used it to send out a very simple HTML page with the words 'Hello World!!!' embedded in it.
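The idea of wrapping the raw output stream in a PrintStream can be tried without a server. In the sketch below, a ByteArrayOutputStream is assumed as a stand-in for the stream a servlet would obtain from its response object, and HelloPage with its methods is a hypothetical name chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

class HelloPage {
    // Writes a tiny HTML page to any OutputStream, much as a servlet would
    // write to the stream returned by getOutputStream().
    static void writePage(OutputStream rawOut) {
        PrintStream out = new PrintStream(rawOut);
        out.println("<html>");
        out.println("<head><title>Hello World</title></head>");
        out.println("<body><h1>Hello World!!!</h1></body>");
        out.println("</html>");
        out.flush();
    }

    // Captures the page in memory so it can be inspected without a server.
    static String render() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writePage(buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

Because writePage accepts any OutputStream, the same method could be handed the response stream inside a real service(...) method.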

This example, although not exactly rocket science, demonstrates the use of servlets in creating dynamic HTML pages. This servlet is referenced just like any other resource on the Internet - through the use of a URL. For example, this servlet can have an alias set up to it, which would result in it being accessed using:

http://www.n-ary.com/helloworld.html

The next example will look at a real life problem and show how the servlet is best suited for the task of providing server side solutions.

HTML Form Example
To illustrate servlets in more detail, let's take a look at a very simple servlet that will take all the fields filled in on an HTML page and e-mail them to the user admin@somewhere.com (see Listing 2). For convenience we have employed the services of the sun.net.smtp.SmtpClient class from the Sun libraries, which forms part of the Java Web Server release. The use of these classes is not recommended for production software as they are liable to change in the future, but for this example they will be perfect.

The form2email class is based on the javax.servlet.http.HttpServlet class, since we are servicing an HTTP request. Since this servlet requires no initializing, the init(...) has not been overridden but the service(...) method has. It is here that the client request comes in.

The first part of the service method deals with setting up the outgoing mail: setting the to, from and subject fields of the mail message. The next stage retrieves all the parameters that have been sent by the browser to the servlet. Each parameter can be addressed using the HttpServletRequest.getParameter(...) method, using the name of the variable as defined in the HTML file as the input. However, in this example the variable names are not known in advance. Not to worry. The HttpServletRequest.getParameterNames() method returns an Enumeration of the variable names that are contained within the request. By looping over it and calling the getParameter(...) method for each name, we can collate all known parameters from the client.
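The enumerate-then-look-up loop described above can be sketched as follows. To keep the example self-contained, a plain Map is assumed as a stand-in for the request: Collections.enumeration over its keys plays the role of getParameterNames(), and Map.get plays the role of getParameter(...). FormCollator and the sample field names are hypothetical.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;

class FormCollator {
    // Builds the body of the e-mail from whatever fields the form contained,
    // without knowing the field names in advance.
    static String collate(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        // Stand-in for getParameterNames(): an Enumeration of field names.
        Enumeration<String> names = Collections.enumeration(params.keySet());
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            String value = params.get(name);   // stand-in for getParameter(name)
            body.append(name).append(" = ").append(value).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("name", "Ada");
        form.put("comment", "Nice article");
        System.out.print(collate(form));
    }
}
```

A LinkedHashMap is used in the demo only so that the fields come out in insertion order; a real request makes no such ordering promise.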

With the e-mail message constructed and sent, the final task is to inform the client that the request was dealt with. We will send out a simple web page thanking them for filling out the form. As you can see in Listing 2, this is very easy. We set the MIME type of the outgoing data, get the output stream and then write the lines of text to the output stream. That's it.

Although it is a very trivial example, this serves to illustrate the ease with which servlets can be used to process client requests.

A Day in the Life of a Servlet
A servlet is loaded once and remains in memory until the server is restarted. Although on the whole this is true, a servlet can be loaded and unloaded on demand if required. This is controlled through the NAME attribute of a servlet. If a servlet has been given a logical or symbolic name, it remains in memory, and every call to that named servlet invokes its service(...) method. However, if a servlet has been given no name, it is removed from memory again after execution, which makes it operate along the same lines as the CGI alternative.
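The difference between named and unnamed servlets can be illustrated with a toy loader. This is a sketch of the behavior as described above, not of any server's actual implementation: TinyLoader and TinyLoaderDemo are hypothetical, a Runnable stands in for the servlet, and a null name is assumed to mean "no name given".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical loader: keeps named servlets in memory, reloads unnamed ones.
class TinyLoader {
    private final Map<String, Runnable> named = new HashMap<>();
    private int loads;

    // Handles one request. `factory` creates (loads) a new servlet instance.
    void request(String name, Supplier<Runnable> factory) {
        Runnable servlet;
        if (name == null) {
            servlet = load(factory);           // unnamed: loaded for this request only
        } else {
            servlet = named.get(name);
            if (servlet == null) {             // named: loaded on first use, then kept
                servlet = load(factory);
                named.put(name, servlet);
            }
        }
        servlet.run();                         // plays the role of service(...)
    }

    private Runnable load(Supplier<Runnable> factory) {
        loads++;
        return factory.get();
    }

    int loads() { return loads; }
}

class TinyLoaderDemo {
    // Sends `requests` requests for the same servlet and reports how many
    // times it had to be loaded.
    static int loadsFor(String name, int requests) {
        TinyLoader loader = new TinyLoader();
        Supplier<Runnable> factory = () -> () -> { /* handle one request */ };
        for (int i = 0; i < requests; i++) {
            loader.request(name, factory);
        }
        return loader.loads();
    }

    public static void main(String[] args) {
        System.out.println("named:   " + loadsFor("counter", 3) + " load(s)");
        System.out.println("unnamed: " + loadsFor(null, 3) + " load(s)");
    }
}
```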

Future of the Server
In today's heavily used Web environment, the need for more and more processing is becoming quite evident. However, at what side of the equation should this processing be performed: the client or the server?

Fortunately, answering this question is a lot easier when you take the current constraints of the Internet into consideration. At the moment, limited bandwidth dictates that the amount of data sent to the client should be kept as small as possible. For this reason, the Java applet is not always the best solution. How many times have you sat looking at a grey rectangle, waiting for the applet to load and begin execution? This introduces an unnecessary wait and tugs at the patience of the user.

The reason the growth in applets took place was to increase the functionality at the client side while reducing the load at the server end. This was before servlets came onto the scene. While a lot of the areas where applets are used cannot be placed back onto the server, there are a lot of applications that could benefit from a servlet implementation instead of an applet solution.

The applet is bogged down by the speed of loading. Applets are growing in size, with the average applet ranging anywhere from 10K to 200K. This, in today's Internet environment, is unacceptable to the majority of users. HTML is fast because the actual data transferred is very small, and because pages are constructed as the data starts arriving rather than after all the data has arrived, as with the applet.

Areas that can benefit from a redesign include database front-ending. Typically, solutions involved a Java applet talking to a CGI script or custom server application to retrieve data from the database. This solution is fraught with problems. No section of the solution can withstand a change in another section. For example, if the database were to be replaced with a much bigger system, the CGI scripts would need to be redeveloped. This has a knock-on effect on the Java applet that is front-ending the whole system.

Before servlets, providing a complete Java solution was not that practical. Something still had to take the request from the client and pass it on to the database, and this was either a custom built server or reliance on a CGI solution. Now a complete Java solution is possible.

Servlets can be employed to talk to the client and the database using the JDBC (Java Database Connectivity) interface. JDBC gives a generic interface to the database without worrying about the actual details of the connection. This is handled by a JDBC driver, supplied by the database vendor. By employing this technique the database can be switched in and out without replacing or recompiling any code elsewhere in the solution. The servlet can either send the data back to an applet to display, or alternatively create dynamic HTML pages that not only load quicker, but are faster to process.
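A rough sketch of the database-to-HTML path follows. It uses only the standard java.sql interfaces, so the JDBC URL passed to fetch is what selects the vendor's driver; that driver is assumed to be on the class path. DbPage and its method names are hypothetical, and the HTML layout is an illustrative choice rather than anything prescribed by JDBC.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class DbPage {
    // Reads every row of the query result as strings. Only jdbcUrl is
    // vendor-specific; the code itself does not change with the database.
    static List<String[]> fetch(String jdbcUrl, String sql) throws SQLException {
        List<String[]> rows = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                String[] row = new String[columns];
                for (int i = 0; i < columns; i++) {
                    row[i] = rs.getString(i + 1);   // JDBC columns are 1-based
                }
                rows.add(row);
            }
        }
        return rows;
    }

    // Turns the rows into an HTML table that can be sent to the browser.
    static String toHtml(List<String[]> rows) {
        StringBuilder html = new StringBuilder("<table>\n");
        for (String[] row : rows) {
            html.append("<tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>\n");
        }
        return html.append("</table>\n").toString();
    }

    // Minimal escaping so that data cannot break the markup.
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // No database is needed to try the rendering half.
        List<String[]> sample = new ArrayList<>();
        sample.add(new String[] {"1", "Widgets & gadgets"});
        System.out.print(toHtml(sample));
    }
}
```

Keeping fetch and toHtml separate mirrors the point made above: swapping the database changes only the URL and driver, while the page-building code stays untouched.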

This is but one area where a complete Java solution is now practical. On a smaller scale, servlets can be employed everywhere CGI solutions were implemented. These include:

  • Search engines
  • Form processing
  • Page Counters
  • Push technology
  • Random links
  • Localization of pages
  • HTML filters
  • HTML-based news groups
  • HTML chat
  • Guest books
  • Live video/music feeds

Summary
This article presented an overview of the new Servlet API available from JavaSoft. Looking at the history of Java, it won't take long before we start seeing servlets replacing the time-honored tradition of CGI scripts.

Developers can now develop custom server side applications without worrying about the internal details of each platform. They can get on with solving the problem rather than working around each individual operating system.

Alan Williamson is widely recognized as an early expert on Cloud Computing. He is Co-Founder of aw2.0 Ltd, a software company specializing in deploying software solutions within Cloud networks. Alan is a Sun Java Champion and creator of OpenBlueDragon (an open source Java CFML runtime engine). With many books, articles and speaking engagements under his belt, Alan likes to talk passionately about what can be done TODAY and not get caught up in the marketing hype of TOMORROW. Follow his blog, http://alan.blog-city.com/ or e-mail him at cloud(at)alanwilliamson.org.
