Connecting Java to CGI


A Java program can send a CGI request to a server just like an HTML page can. As with HTML pages, this request can be either a GET or a POST. In addition, the Java program can intercept the output of the CGI program, so you don't have to rely on the program to format a new page and force the user to back up from one page to another if something goes wrong. In fact, the appearance of the program can be the same as the previous version.

It also turns out that the code is simpler, and that CGI isn't difficult to write after all. (An innocent statement that's true of many things - after you understand them.) So in this section you'll get a crash course in CGI programming. To solve the general problem, some CGI tools will be created in C++ that will allow you to easily write a CGI program to solve any problem. The benefit to this approach is portability - the example you are about to see will work on any system that supports CGI, and there's no problem with firewalls.



This example also works out the basics of creating any connection with applets and CGI programs, so you can easily adapt it to your own projects.

Encoding data for CGI

In this version, the name and the email address will be collected and stored in the file in the form:

First Last <email@domain.com>;

This is a convenient form for many mailers. Since two fields are being collected, there are no shortcuts because CGI has a particular format for encoding the data in fields. You can see this for yourself if you make an ordinary HTML page and add the lines:

<Form method="GET" ACTION="/cgi-bin/Listmgr2.exe">
<P>Name: <INPUT TYPE = "text" NAME = "name"
VALUE = "" size = "40"></p>
<P>Email Address: <INPUT TYPE = "text"
NAME = "email" VALUE = "" size = "40"></p>
<p><input type = "submit" name = "submit" > </p>
</Form>

This creates two data entry fields called name and email, along with a submit button that collects the data and sends it to a CGI program. Listmgr2.exe is the name of the executable program that resides in the directory that's typically called "cgi-bin" on your Web server. (If the named program is not in the cgi-bin directory, you won't see any results.) If you fill out this form and press the "submit" button, you will see in the URL address window of the browser something like:

https://www.myhome.com/cgi-bin/Listmgr2.exe?
name=First+Last&email=email@domain.com&submit=Submit

(Without the line break, of course). Here you see a little bit of the way that data is encoded to send to CGI. For one thing, spaces are not allowed (since spaces typically separate command-line arguments). Spaces are replaced by '+' signs. In addition, each field contains the field name (which is determined by the HTML page) followed by an '=' and the field data, and terminated by a '&'.

At this point, you might wonder about the '+', '=', and '&'. What if those are used in the field, as in "John & Marsha Smith"? This is encoded to:

John+%26+Marsha+Smith

That is, the special character is turned into a '%' followed by its ASCII value in hex.
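The encoding rule just described can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name FormEncodeSketch and the methods isSafe( ) and encodeField( ) are hypothetical names introduced here, and the set of characters passed through unchanged (letters, digits, and . - * _) is an assumption that mirrors common form-encoding practice. Spaces become '+' and every other byte becomes '%' followed by two hex digits.

```java
import java.nio.charset.StandardCharsets;

class FormEncodeSketch {
  // Characters that are copied through unchanged (an assumption):
  static boolean isSafe(int c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9')
        || c == '.' || c == '-' || c == '*' || c == '_';
  }

  // Encode one field value: space -> '+', unsafe byte -> %XX
  static String encodeField(String s) {
    StringBuilder out = new StringBuilder();
    for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
      int c = b & 0xFF;
      if (isSafe(c)) {
        out.append((char) c);
      } else if (c == ' ') {
        out.append('+');
      } else {
        out.append('%')
           .append(Character.toUpperCase(Character.forDigit(c >> 4, 16)))
           .append(Character.toUpperCase(Character.forDigit(c & 0xF, 16)));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Prints: John+%26+Marsha+Smith
    System.out.println(encodeField("John & Marsha Smith"));
  }
}
```

In practice you would not write this by hand, since the library already provides it; the sketch is here only to make the rule concrete.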

Fortunately, Java has a tool to perform this encoding for you. It's a static method of the class URLEncoder called encode( ). You can experiment with this method using the following program:

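//: EncodeDemo.java
// Demonstration of URLEncoder.encode()
import java.net.*;

public class EncodeDemo {
  public static void main(String[] args) {
    String s = "";
    for(int i = 0; i < args.length; i++)
      s += args[i] + " ";
    System.out.println(URLEncoder.encode(s.trim()));
  }
} ///:~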

This takes the command-line arguments and combines them into a string of words separated by spaces (the final space is removed using String.trim( )). These are then encoded and printed.

To invoke a CGI program, all the applet needs to do is collect the data from its fields (or wherever it needs to collect the data from), URL-encode each piece of data, and then assemble it into a single string, placing the name of each field followed by an '=', followed by the data, followed by an '&'. To form the entire CGI command, this string is placed after the URL of the CGI program and a '?'. That's all it takes to invoke any CGI program, and as you'll see you can easily do it within an applet.
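The assembly steps in the paragraph above can be sketched outside of any applet. This is a minimal illustration, not the applet's own code: the class CgiGetUrlSketch and its methods enc( ), buildQuery( ), and cgiUrl( ) are hypothetical helpers, UTF-8 is an assumed encoding, and the host name is the placeholder used earlier in this section. The sketch puts '&' between fields rather than after each one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

class CgiGetUrlSketch {
  // URL-encode one field value (UTF-8 is an assumption):
  static String enc(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always present
    }
  }

  // Arguments alternate: name, value, name, value, ...
  static String buildQuery(String... namesAndValues) {
    StringBuilder q = new StringBuilder();
    for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
      if (q.length() > 0) q.append('&');
      q.append(namesAndValues[i]).append('=')
       .append(enc(namesAndValues[i + 1]));
    }
    return q.toString();
  }

  // The query goes after the program's URL and a '?'.
  // URI.create() only checks that the result is well formed.
  static String cgiUrl(String programUrl, String query) {
    return URI.create(programUrl + "?" + query).toString();
  }

  public static void main(String[] args) {
    String query = buildQuery(
      "name", "First Last",
      "email", "email@domain.com",
      "submit", "Submit");
    // Prints: name=First+Last&email=email%40domain.com&submit=Submit
    System.out.println(query);
    System.out.println(
      cgiUrl("http://www.myhome.com/cgi-bin/Listmgr2.exe", query));
  }
}
```

Note that URLEncoder also encodes the '@' as %40, which a CGI program decodes back like any other hex escape.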

The applet

The applet is actually considerably simpler than NameSender.java, partly because it's so easy to send a GET request and also because no thread is required to wait for the reply. There are now two fields instead of one, but you'll notice that much of the applet looks familiar, from NameSender.java.

//: NameSender2.java
// An applet that sends an email address
// via a CGI GET, using Java 1.02.
import java.awt.*;
import java.applet.*;
import java.net.*;
import java.io.*;

public class NameSender2 extends Applet {
  // ...
  public boolean action (Event evt, Object arg) {
    if(/* the button was pressed */) {
      // ...
      str = email.getText().trim();
      if(str.indexOf(' ') != -1) { /* ... */ }
      if(str.indexOf(',') != -1) { /* ... */ }
      if(str.indexOf('@') == -1) { /* ... */ }
      if(str.indexOf('@') == 0) { /* ... */ }
      String end =
        str.substring(str.indexOf('@'));
      if(end.indexOf('.') == -1) { /* ... */ }
      // Build and encode the email data:
      String emailData =
        "name=" + URLEncoder.encode(
          name.getText().trim()) +
        "&email=" + URLEncoder.encode(
          email.getText().trim().toLowerCase()) +
        "&submit=Submit";
      // Send the name using CGI's GET process:
      try { /* ... */ }
      catch(MalformedURLException e) { /* ... */ }
      catch(IOException e) { /* ... */ }
    }
    else return super.action(evt, arg);
    return true;
  }
} ///:~
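The chain of checks that action( ) applies to the email address can be pulled out into a small stand-alone method. The following is a sketch under assumptions: the class EmailCheckSketch, the method problemWith( ), and the wording of the messages are all hypothetical. Only the five conditions themselves are taken from the listing above.

```java
class EmailCheckSketch {
  // Returns null if the address passes every check, otherwise
  // a (hypothetical) message for the first check that fails.
  static String problemWith(String email) {
    String str = email.trim();
    if (str.indexOf(' ') != -1)
      return "Spaces not allowed in email name";
    if (str.indexOf(',') != -1)
      return "Commas not allowed in email name";
    if (str.indexOf('@') == -1)
      return "Email name must include '@'";
    if (str.indexOf('@') == 0)
      return "Name must precede '@' in email name";
    // Everything from the '@' onward must contain a '.':
    String end = str.substring(str.indexOf('@'));
    if (end.indexOf('.') == -1)
      return "Portion after '@' must have an extension";
    return null;
  }

  public static void main(String[] args) {
    String[] samples = {
      "email@domain.com", "first last@domain.com",
      "@domain.com", "email@domain" };
    for (String s : samples)
      System.out.println(s + " -> " + problemWith(s));
  }
}
```

These checks are deliberately loose; they catch common typing mistakes and are not a full validation of email syntax.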

The name of the CGI program (which you'll see later) is Listmgr2.exe. Many Web servers are Unix machines (mine runs Linux) that don't traditionally use the .exe extension for their executable programs, but you can call the program anything you want under Unix. By using the .exe extension the program can be tested without change under both Unix and Win32.

As before, the applet sets up its user interface (with two fields this time instead of one). The only significant difference occurs inside the action( ) method, which handles the button press. After the name has been checked, you see the lines:

String emailData =
  "name=" + URLEncoder.encode(
    name.getText().trim()) +
  "&email=" + URLEncoder.encode(
    email.getText().trim().toLowerCase()) +
  "&submit=Submit";
// Send the name using CGI's GET process:
try {
  // ...

The beauty of the URL class is how much it shields you from. You can connect to Web servers without knowing much at all about what's going on under the covers.
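Once a URL has handed back an input stream, collecting the CGI program's reply is just a matter of reading lines until the stream ends. The sketch below assumes that shape; the class ReplyReaderSketch and the method readLines( ) are hypothetical names, and a ByteArrayInputStream stands in for the stream that URL.openStream( ) would return, so the example runs without a server.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReplyReaderSketch {
  // Read every line of the reply until end of stream:
  static List<String> readLines(InputStream in) {
    List<String> lines = new ArrayList<>();
    try (BufferedReader r = new BufferedReader(
           new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = r.readLine()) != null)
        lines.add(line);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return lines;
  }

  public static void main(String[] args) {
    // Stand-in for the stream a real URL would provide:
    InputStream fakeReply = new ByteArrayInputStream(
      "First Last <email@domain.com> added to list\n"
        .getBytes(StandardCharsets.UTF_8));
    for (String line : readLines(fakeReply))
      System.out.println(line);
  }
}
```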

The CGI program in C++

At this point you could follow the previous example and write the CGI program for the server using ANSI C. One argument for doing this is that ANSI C can be found virtually everywhere. However, C++ has become quite ubiquitous, especially in the form of the GNU C++ Compiler (g++) that can be downloaded free from the Internet for virtually any platform (and often comes pre-installed with operating systems such as Linux). As you will see, this means that you can get the benefit of object-oriented programming in a CGI program.

To avoid throwing too many new concepts at you all at once, this program will not be a "pure" C++ program; some code will be written in plain C even though C++ alternatives exist. This isn't a significant issue because the biggest benefit in using C++ for this program is the ability to create classes. Since what we're concerned with when parsing the CGI information is the field name-value pairs, one class (Pair) will be used to represent a single name-value pair and a second class (CGI_vector) will automatically parse the CGI string into Pair objects that it will hold (as a vector) so you can fetch each Pair out at your leisure.

This program is also interesting because it demonstrates some of the pluses and minuses of C++ in contrast with Java. You'll see some similarities; for example the class keyword. Access control has identical keywords public and private, but they're used differently: they control a block instead of a single method or field (that is, if you say private: each following definition is private until you say public:). Also, when you create a class, all the definitions automatically default to private.

One of the reasons for using C++ here is the convenience of the C++ Standard Template Library. Among other things, the STL contains a vector class. This is a C++ template, which means that it will be configured at compile time so it will hold objects of only a particular type (in this case, Pair objects). Unlike the Java Vector, which will accept anything, the C++ vector template will cause a compile-time error message if you try to put anything but a Pair object into the vector, and when you get something out of the vector it will automatically be a Pair object, without casting. Thus, the checking happens at compile time and produces a more robust program. In addition, the program can run faster since you don't have to perform run-time casts. The vector also overloads the operator[] so you have a convenient syntax for extracting Pair objects. The vector template will be used in the creation of CGI_vector, which you'll see is a fairly short definition considering how powerful it is.

On the down side, look at the complexity of the definition of Pair in the following code. Pair has more method definitions than you're used to seeing in Java code, because the C++ programmer must know how to control copying with the copy-constructor and assignment with the overloaded operator=. As described in Chapter 12, occasionally you need to concern yourself with similar things in Java, but in C++ you must be aware of them almost constantly.

The project will start with a reusable portion, which consists of Pair and CGI_vector in a C++ header file. Technically, you shouldn't cram this much into a header file, but for these examples it doesn't hurt anything and it will also look more Java-like, so it will be easier for you to read:

//: CGITools.h
// Automatically extracts and decodes data
// from CGI GETs and POSTs. Tested with GNU C++
// (available for most server machines).
#include <string.h>
#include <vector> // STL vector
using namespace std;

// A class to hold a single name-value pair from
// a CGI query. CGI_vector holds Pair objects and
// returns them from its operator[].
class Pair {
  // ...
  Pair(char* name, char* value) { /* ... */ }
  const char* name() const { /* ... */ }
  const char* value() const { /* ... */ }
  // Test for "emptiness"
  bool empty() const { /* ... */ }
  // Automatic type conversion for boolean test:
  operator bool() const { /* ... */ }
  // The following constructors & destructor are
  // necessary for bookkeeping in C++.
  // Copy-constructor:
  Pair(const Pair& p) {
    // ...
  }
  // Assignment operator:
  Pair& operator=(const Pair& p) {
    // ...
    return *this;
  }
  ~Pair() { /* ... */ }
  // If you use this method outside this class,
  // you're responsible for calling 'delete' on
  // the pointer that's returned:
  static char*
  decodeURLString(const char* URLstr) {
    // ...
      else // An ordinary character
        result[j] = URLstr[i];
    }
    return result;
  }
  // Translate a single hex character; used by
  // decodeURLString():
  static char translateHex(char hex) { /* ... */ }
};

// Parses any CGI query and turns it
// into an STL vector of Pair objects:
class CGI_vector : public vector<Pair> {
  // ...
  // Destructor:
  ~CGI_vector() { /* ... */ }
private:
  // Produces name-value pairs from the query
  // string. Returns an empty Pair when there's
  // no more query string left:
  Pair nextPair() {
    // ...
    return Pair(name, value);
  }
}; ///:~

After the #include statements, you see a line that says:

using namespace std;

Namespaces in C++ solve one of the problems taken care of by the package scheme in Java: hiding library names. The std namespace refers to the Standard C++ library, and vector is in this library so the line is required.

The Pair class starts out looking pretty simple: it just holds two (private) character pointers, one for the name and one for the value. The default constructor simply sets these pointers to zero, since in C++ an object's memory isn't automatically zeroed. The second constructor calls the method decodeURLString( ) that produces a decoded string in newly-allocated heap memory. This memory must be managed and destroyed by the object, as you will see in the destructor. The name( ) and value( ) methods produce read-only pointers to the respective fields. The empty( ) method is a way for you to ask the Pair object whether either of its fields is empty; it returns a bool, which is C++'s built-in primitive Boolean data type. The operator bool( ) uses a special case of C++ operator overloading, which allows you to control automatic type conversion. If you have a Pair object called p and you use it in an expression in which a Boolean result is expected, such as if(p), the compiler calls operator bool( ) automatically to perform the conversion.

Much of this bookkeeping disappears if the name and value are held in Standard C++ string objects rather than char* pointers. The class then needs only these members:

Pair(char* name, char* value) { /* ... */ }
const char* name() const { /* ... */ }
const char* value() const { /* ... */ }
// Test for "emptiness"
bool empty() const { /* ... */ }
// Automatic type conversion for boolean test:
operator bool() const { /* ... */ }

(Also, for this case decodeURLString( ) returns a string instead of a char*.) You don't need to define a copy-constructor, operator=, or destructor because the compiler does that for you, and does it correctly. But even if it sometimes works automatically, C++ programmers must still know the details of copy-construction and assignment.

The remainder of the Pair class consists of the two methods decodeURLString( ) and a helper method translateHex( ), which is used by decodeURLString( ). (Note that translateHex( ) does not guard against bad user input such as "%1H.") After allocating adequate storage (which must be released by the destructor), decodeURLString( ) moves through and replaces each '+' with a space and each hex code (beginning with a '%') with the appropriate character.
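For readers more at home in Java, the decoding pass described above can be sketched as follows. This is a Java analogue written for illustration, not a translation of the C++ source: the class UrlDecodeSketch is hypothetical, the method names merely echo the C++ ones, and single-byte characters are assumed. Unlike the translateHex( ) described in the text, this version does reject malformed input such as "%1H".

```java
class UrlDecodeSketch {
  // Value of one hex digit, or -1 if it isn't a hex digit:
  static int translateHex(char hex) {
    return Character.digit(hex, 16);
  }

  // Replace '+' with a space and each %XX with its character.
  // Assumes single-byte characters.
  static String decodeUrlString(String s) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '+') {
        out.append(' ');
      } else if (c == '%') {
        if (i + 2 >= s.length())
          throw new IllegalArgumentException(
            "Truncated escape at index " + i);
        int hi = translateHex(s.charAt(i + 1));
        int lo = translateHex(s.charAt(i + 2));
        if (hi < 0 || lo < 0)
          throw new IllegalArgumentException(
            "Bad hex escape at index " + i);
        out.append((char) (hi * 16 + lo));
        i += 2; // Skip the two hex digits
      } else { // An ordinary character
        out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Prints: John & Marsha Smith
    System.out.println(decodeUrlString("John+%26+Marsha+Smith"));
  }
}
```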

CGI_vector parses and holds an entire CGI GET command. It is inherited from the STL vector, which is instantiated to hold Pairs. Inheritance in C++ is denoted by using a colon at the point you'd say extends in Java. In addition, inheritance defaults to private so you'll almost always need to use the public keyword as was done here. You can also see that CGI_vector has a copy-constructor and an operator=, but they're both declared as private. This is to prevent the compiler from synthesizing the two functions (which it will do if you don't declare them yourself), but it also prevents the client programmer from passing a CGI_vector by value or from using assignment.

CGI_vector's job is to take the QUERY_STRING and parse it into name-value pairs, which it will do with the aid of Pair. First it copies the string into locally-allocated memory and keeps track of the starting address with the constant pointer start. (This is later used in the destructor to release the memory.) Then it uses its method nextPair( ) to parse the string into raw name-value pairs, delimited by '=' and '&' signs. These are handed by nextPair( ) to the Pair constructor so nextPair( ) can return the Pair object, which is then added to the vector with push_back( ). When nextPair( ) runs out of QUERY_STRING, it returns zero.
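The same parsing job can be sketched in Java, which may help in following what CGI_vector and nextPair( ) do. Everything named here is an assumption made for illustration: the class QueryParseSketch, its nested Pair class, and the methods dec( ) and parse( ) are hypothetical, and the decoding is delegated to the library's URLDecoder with UTF-8 assumed.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

class QueryParseSketch {
  // One decoded name-value pair:
  static final class Pair {
    final String name, value;
    Pair(String name, String value) {
      this.name = name;
      this.value = value;
    }
  }

  static String dec(String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always present
    }
  }

  // Split on '&', then split each field at its first '=':
  static List<Pair> parse(String query) {
    List<Pair> pairs = new ArrayList<>();
    if (query == null || query.isEmpty()) return pairs;
    for (String field : query.split("&")) {
      if (field.isEmpty()) continue;
      int eq = field.indexOf('=');
      String name = eq < 0 ? field : field.substring(0, eq);
      String value = eq < 0 ? "" : field.substring(eq + 1);
      pairs.add(new Pair(dec(name), dec(value)));
    }
    return pairs;
  }

  public static void main(String[] args) {
    String q =
      "name=First+Last&email=email%40domain.com&submit=Submit";
    for (Pair p : parse(q))
      System.out.println(p.name + " = [" + p.value + "]");
  }
}
```

As with the C++ version, the pairs come back in the order they appeared in the query string, so a caller can index them by position.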

Now that the basic tools are defined, they can easily be used in a CGI program, like this:

//: Listmgr2.cpp
// CGI version of Listmgr.c in C++, which
// extracts its input via the GET submission
// from the associated applet. Also works as
// an ordinary CGI program with HTML forms.
#include <stdio.h>
#include <stdlib.h> // getenv()
#include "CGITools.h"
const char* dataFile = "list2.txt";
const char* notify = "Bruce@EckelObjects.com";
#undef DEBUG

// Similar code as before, except that it looks
// for the email name inside of '<>':
int inList(FILE* list, const char* emailName) {
  // ...
  return 0;
}

int main() {
  // ...
  // For a CGI "GET," the server puts the data
  // in the environment variable QUERY_STRING:
  CGI_vector query(getenv("QUERY_STRING"));
#if defined(DEBUG)
  // Test: dump all names and values
  for(int i = 0; i < query.size(); i++) { /* ... */ }
#endif(DEBUG)
  Pair name = query[0];
  Pair email = query[1];
  if(name.empty() || email.empty()) { /* ... */ }
  if(inList(list, email.value())) { /* ... */ }
  // It's not in the list, add it:
  fseek(list, 0, SEEK_END);
  fprintf(list, "%s <%s>;\n",
    name.value(), email.value());
  fflush(list);
  fclose(list);
  printf("%s <%s> added to list\n",
    name.value(), email.value());
} ///:~

The inList( ) function is almost identical to the previous version, except that it assumes all email names are inside '<>'.

When you use the GET approach (which is normally done in the HTML METHOD tag of the FORM directive, but which is controlled here by the way the data is sent), the Web server grabs everything after the '?' and puts it into the environment variable QUERY_STRING. So to read that information you have to get the value of QUERY_STRING, which you do using the standard C library function getenv( ). In main( ), notice how simple the act of parsing the QUERY_STRING is: you just hand it to the constructor for the CGI_vector object called query and all the work is done for you. From then on you can pull out the names and values from query as if it were an array. (This is because the operator[] is overloaded in vector.) You can see how this works in the debug code, which is surrounded by the preprocessor directives #if defined(DEBUG) and #endif(DEBUG).

Now it's important to understand something about CGI. A CGI program is handed its input in one of two ways: through QUERY_STRING during a GET (as in this case) or through standard input during a POST. But a CGI program sends its output through standard output, typically using printf( ) in a C program. Where does this output go? Back to the Web server, which decides what to do with it. The server makes this decision based on the content-type header, which means that if the content-type header isn't the first thing it sees, it won't know what to do with the data. Thus, it's essential that you start the output of all CGI programs with the content-type header.

In this case, we want the server to feed all the information directly back to the client program (which is our applet, waiting for its reply). The information should be unchanged, so the content-type is text/plain. Once the server sees this, it will echo all strings right back to the client. So each of the strings you see, three for error conditions and one for a successful add, will end up back at the applet.
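The header-first convention is independent of the language the CGI program is written in. As a minimal sketch, here is how the output might be shaped in Java; the class CgiReplySketch and its methods reply( ) and bodyOf( ) are hypothetical, and bodyOf( ) only illustrates the part of the output the applet ends up reading, not how any particular server processes it.

```java
class CgiReplySketch {
  // The content-type header comes first, followed by a
  // blank line, and only then the text for the client:
  static String reply(String body) {
    return "Content-type: text/plain\n\n" + body;
  }

  // Everything after the first blank line is what the
  // client (the applet) ends up reading:
  static String bodyOf(String output) {
    int blank = output.indexOf("\n\n");
    return blank < 0 ? "" : output.substring(blank + 2);
  }

  public static void main(String[] args) {
    String out =
      reply("First Last <email@domain.com> added to list\n");
    System.out.print(out);
  }
}
```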

Adding the email name uses the same code. In the case of the CGI script, however, there isn't an infinite loop - the program just responds and then terminates. Each time a CGI request comes in, the program is started in response to that request, and then it shuts down. Thus there is no possibility of CPU hogging, and the only performance issue concerns starting the program up and opening the file, which are dwarfed by the overhead of the Web server as it handles the CGI request.

One of the advantages of this design is that, now that Pair and CGI_vector are defined, most of the work is done for you so you can easily create your own CGI program simply by modifying main( ). Eventually, servlet servers will probably be ubiquitous, but in the meantime C++ is still handy for creating fast CGI programs.

What about POST?

Using a GET is fine for many applications. However, GET passes its data to the CGI program through an environment variable, and some Web servers can run out of environment space with long GET strings (you should start worrying at about 200 characters). CGI provides a solution for this: POST. With POST, the data is encoded and concatenated the same way as a GET, but POST uses standard input to pass the encoded query string to the CGI program. All you have to do is determine the length of the query string, and this length is stored in the environment variable CONTENT_LENGTH. Once you know the length, you can allocate storage and read the precise number of bytes from standard input.

The Pair and CGI_vector from CGITools.h can be used as is for a CGI program that handles POSTs. The following listing shows how simple it is to write such a CGI program. In this example, "pure" C++ will be used so the stdio.h library will be dropped in favor of iostreams. With iostreams, two predefined objects are available: cin, which connects to standard input, and cout, which connects to standard output. There are several ways to read from cin and write to cout, but the following program takes the common approach of using '<<' to send information to cout, and the use of a member function (in this case, read( )) to read from cin.

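//: POSTtest.cpp
// CGI_vector works as easily with POST as it
// does with GET. Written in "pure" C++.
#include <iostream>
#include <stdlib.h> // getenv(), atoi()
#include "CGITools.h"

int main() {
  cout << "Content-type: text/plain\n" << endl;
  // For a CGI "POST," the server puts the length
  // of the content string in the environment
  // variable CONTENT_LENGTH:
  char* clen = getenv("CONTENT_LENGTH");
  if(clen == 0) {
    cout << "Zero CONTENT_LENGTH" << endl;
    return 0;
  }
  int len = atoi(clen);
  char* query_str = new char[len + 1];
  cin.read(query_str, len);
  query_str[len] = '\0';
  CGI_vector query(query_str);
  // Test: dump all names and values
  for(int i = 0; i < query.size(); i++)
    cout << "query[" << i << "].name() = [" <<
      query[i].name() << "], " <<
      "query[" << i << "].value() = [" <<
      query[i].value() << "]" << endl;
  delete []query_str; // Release storage
} ///:~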

The getenv( ) function returns a pointer to a character string representing the content length. If this pointer is zero, the CONTENT_LENGTH environment variable has not been set, so something is wrong. Otherwise, the character string must be converted to an integer using the ANSI C library function atoi( ). The length is used with new to allocate enough storage to hold the query string (plus its null terminator), and then read( ) is called for cin. The read( ) function takes a pointer to the destination buffer and the number of bytes to read. The query_str is then null-terminated to indicate the end of the character string.
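The sequence just described (check that the length is present, convert it to an integer, read exactly that many bytes) looks like this when sketched in Java. The class PostBodySketch and the method readBody( ) are hypothetical names, and the length string and input stream are passed in as parameters so the example can run with stand-in data instead of a real CONTENT_LENGTH and standard input.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PostBodySketch {
  // contentLength is the text of CONTENT_LENGTH, or null
  // if the variable was never set.
  static String readBody(String contentLength, InputStream in) {
    if (contentLength == null)
      throw new IllegalArgumentException("Zero CONTENT_LENGTH");
    int len = Integer.parseInt(contentLength.trim());
    if (len < 0)
      throw new IllegalArgumentException("Negative CONTENT_LENGTH");
    byte[] buf = new byte[len];
    try {
      // Reads exactly len bytes, or fails if the input is short:
      new DataInputStream(in).readFully(buf);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(buf, StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) {
    // Stand-ins for CONTENT_LENGTH and standard input:
    byte[] posted = "Field1=a+b&submit=Submit"
      .getBytes(StandardCharsets.ISO_8859_1);
    String body = readBody(
      String.valueOf(posted.length),
      new ByteArrayInputStream(posted));
    System.out.println(body);
  }
}
```

In a real CGI program the two arguments would come from System.getenv("CONTENT_LENGTH") and System.in.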

At this point, the query string is no different from a GET query string, so it is handed to the constructor for CGI_vector. The different fields in the vector are then available just as in the previous example.

To test this program, you must compile it in the cgi-bin directory of your host Web server. Then you can perform a simple test by writing an HTML page like this:

<HTML>
<HEAD>
<META CONTENT="text/html">
<TITLE>A test of standard HTML POST</TITLE>
</HEAD>
Test, uses standard html POST
<Form method="POST" ACTION="/cgi-bin/POSTtest">
<P>Field1: <INPUT TYPE = "text" NAME = "Field1"
VALUE = "" size = "40"></p>
<P>Field2: <INPUT TYPE = "text" NAME = "Field2"
VALUE = "" size = "40"></p>
<P>Field3: <INPUT TYPE = "text" NAME = "Field3"
VALUE = "" size = "40"></p>
<P>Field4: <INPUT TYPE = "text" NAME = "Field4"
VALUE = "" size = "40"></p>
<P>Field5: <INPUT TYPE = "text" NAME = "Field5"
VALUE = "" size = "40"></p>
<P>Field6: <INPUT TYPE = "text" NAME = "Field6"
VALUE = "" size = "40"></p>
<p><input type = "submit" name = "submit" > </p>
</Form>
</HTML>

When you fill this out and submit it, you'll get back a simple text page containing the parsed results, so you can see that the CGI program works correctly.

Of course, it's a little more interesting to submit the data using an applet. Submitting POST data is a different process, however. After you invoke the CGI program in the usual way, you must make a direct connection to the server so you can feed it the query string. The server then turns around and feeds the query string to the CGI program via standard input.

To make a direct connection to the server, you must take the URL you've created and call openConnection( ) to produce a URLConnection. Then, because a URLConnection doesn't usually allow you to send data to it, you must call the magic function setDoOutput(true) along with setDoInput(true) and setAllowUserInteraction(false). Finally, you can call getOutputStream( ) to produce an OutputStream, which you wrap inside a DataOutputStream so you can talk to it conveniently. Here's an applet that does just that, after collecting data from its various fields:
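Stripped of the user interface, the sequence of calls described above can be sketched as a single method. This is an illustration under assumptions rather than the applet's code: the class PostSketch and the method post( ) are hypothetical, plain OutputStream and BufferedReader are used instead of the DataOutputStream wrapper mentioned above, and main( ) only contacts a server if you pass it a URL and a query string on the command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class PostSketch {
  // Send an already-encoded query string as the POST body
  // and return whatever text the CGI program sends back.
  static String post(URLConnection conn, String query)
      throws IOException {
    conn.setDoOutput(true);   // We intend to write to it
    conn.setDoInput(true);    // ... and to read the reply
    conn.setAllowUserInteraction(false);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(query.getBytes(StandardCharsets.US_ASCII));
    }
    StringBuilder reply = new StringBuilder();
    try (BufferedReader in = new BufferedReader(
           new InputStreamReader(
             conn.getInputStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = in.readLine()) != null)
        reply.append(line).append('\n');
    }
    return reply.toString();
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.out.println("Usage: java PostSketch <url> <query>");
      return;
    }
    URLConnection conn =
      URI.create(args[0]).toURL().openConnection();
    System.out.print(post(conn, args[1]));
  }
}
```

Taking the URLConnection as a parameter keeps the method independent of where the connection came from, which also makes it possible to exercise it with a stand-in connection.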

//: POSTtest.java
// An applet that sends its data via a CGI POST
import java.awt.*;
import java.applet.*;
import java.net.*;
import java.io.*;

public class POSTtest extends Applet {
  // ...
    p.add(l);
    p.add(submit);
    add("North", p);
    add("South", ta);
  }
  public boolean action (Event evt, Object arg) {
    // ...
        in.close();
      }
      catch (Exception e) { /* ... */ }
    }
    else return super.action(evt, arg);
    return true;
  }
} ///:~

Once the information is sent to the server, you can call getInputStream( ) and wrap the return value in a DataInputStream so that you can read the results. One thing you'll notice is that the results are displayed as lines of text in a TextArea. Why not simply use getAppletContext().showDocument(u)? Well, this is one of those mysteries. The code above works fine, but if you try to use showDocument( ) instead, everything stops working - almost. That is, showDocument( ) does work, but what you get back from POSTtest is "Zero CONTENT_LENGTH." So somehow, showDocument( ) prevents the POST query from being passed on to the CGI program. It's difficult to know whether this is a bug that will be fixed, or some lack of understanding on my part (the books I looked at were equally abstruse). In any event, if you can stand to limit yourself to looking at the text that comes back from the CGI program, the above applet works fine.



You can test this under Windows32 using the Microsoft Personal Web Server that comes with Microsoft Office 97 and some of their other products. This is a nice way to experiment since you can perform local tests (and it's also very fast). If you're on a different platform or if you don't have Office 97 you might be able to find a freeware Web server for testing by searching the Internet.

GNU stands for "Gnu's Not Unix." The project, created by the Free Software Foundation, was originally intended to replace the Unix operating system with a free version of that OS. Linux appears to have replaced this initiative, but the GNU tools have played an integral part in the development of Linux which comes packaged with many GNU components.

My book Thinking in C++ (Prentice-Hall, 1995) devotes an entire chapter to this subject. Refer to this if you need further information on the subject.

I can't say I really understand what's going on here, but I managed to get it working by studying Java Network Programming by Elliotte Rusty Harold (O'Reilly 1997). He alludes to a number of confusing bugs in the Java networking libraries, so this is an area in which you can't just write code and have it working right away. Be warned.


