This chapter provides guidelines for setting up Toolkit applications. To set up a Toolkit application, you must understand the following concepts:
Empress includes the Apache Web Server in the distribution. For clarity, this document often refers to locations that are specific to the Empress distribution and the Apache server. If you are using a different Web server, substitute the actual locations for these references.
This document assumes that you have at least a basic knowledge of browser technology, HTML programming, and CGI.
The Empress Web HTML Toolkit (Toolkit) is a set of CGI programs which perform operations on databases using arguments on the URL, input from HTML Forms, and/or markup in template text files. They allow text pages to be built up on the fly from the results of database queries and from user input.
Applications using the Toolkit are designed to use an HTML, or XML, Browser as the user interface, and to communicate with Empress networked databases.
Setting up an environment where a developer has a good connection from HTML to database may be a job for both an Empress Database Administrator and a Network Administrator. The following items must exist for your Toolkit Applications to run correctly:
The HTML/XML Browser
This is the user interface of your application. An HTML browser will only interpret standard HTML; XML browsers will only recognise XML for which they have been programmed. Extra Empress-specific tags like <QUERY> must first be processed by the Toolkit program and converted into HTML/XML. Whatever processing and connectivity goes on, the user ultimately interacts with standard markup documents in the browser.
The HTTP Server
The hypertext transfer protocol (HTTP) server is needed for any Internet site. It serves HTML documents from your site (host name plus port number) to the users of the web. A browser requests the URL (essentially the Internet location) of an HTML document, and the HTTP Server delivers that document. Although they were originally made for one-way transfer of flat, unchanging text files, modern HTTP Servers can also accept an incoming stream of values posted from the HTML user-input forms of client browsers and deliver those values to scripts (or compiled applications) called CGI (Common Gateway Interface) programs. CGI programs can process this information however they choose. If those CGI programs are ODBC, JDBC, or Dynamic SQL applications, they can talk to a Database Management System. This is how it all fits together. CGI programs can also write strings of data in HTML format back to the HTTP Server, which passes them on to the browser just as if they came from a real file. Thus HTML tables can be built and displayed at run time, based entirely on the current contents of the database.
There are two common ways of configuring your HTTP server to recognize your Toolkit CGI Programs:
One is to put them all in a ScriptAlias directory, and,
the other is to configure your server to recognize a particular file extension (for example, .cgi for a CGI program) and rename your Toolkit CGI Programs with this extension. These programs then can reside anywhere in the system.
Every HTTP Server is slightly different. Refer to the documentation for your server, or ask your Webmaster to set up the server correctly.
The Toolkit CGI Programs
There are two Empress Web HTML Toolkit CGI Programs, ehsql.cgi and ehlink.cgi. (Note: there is a third program, ehdbi, but it runs as a sub-process of ehlink.cgi and is not accessed directly. However, it must always be placed in the CGI directory with ehlink.cgi.)
These programs pass user information through HTML Form submissions (or URL parameters). They can also pass entire HTML and EHTML (Empress-specific HTML) documents to use as templates, so that you can control how the display should look. Throughout this document, you will see URLs, such as those in Form actions, written as double paths: the first part is the path to a CGI program, and the second is the template used to display the results. Applications are built up from forms and hyperlinks. Submitting a Form brings up a template file, which might do more than just display results; it might have its own Form and its own call to a Toolkit CGI Program with another template file, and so on. Because user-input variables and database query results can be passed on through these forms and URL parameters, branching and control flow can be achieved by allowing the user or the database results to specify which page should be brought up next.
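The double path can be pictured with a small sketch. The helper name split_double_path is hypothetical and is not part of the Toolkit; it only shows how a path such as /My-bin/ehsql.cgi/docs/MyFirst.ehtml divides into a CGI program and a template. In standard CGI the Web server performs this split itself and hands the trailing part to the program in the PATH_INFO environment variable; whether the Toolkit reads the template from PATH_INFO is an assumption here.

```python
def split_double_path(url_path, cgi_suffix=".cgi"):
    """Return (cgi_program_path, template_path) for a double path.

    Hypothetical helper for illustration only. It assumes the CGI program
    name ends in cgi_suffix and that the template path follows it directly.
    """
    marker = cgi_suffix + "/"
    index = url_path.find(marker)
    if index == -1:
        # No template part: the whole path names the CGI program.
        return url_path, ""
    split_at = index + len(cgi_suffix)
    return url_path[:split_at], url_path[split_at:]


program, template = split_double_path("/My-bin/ehsql.cgi/docs/MyFirst.ehtml")
print(program)   # /My-bin/ehsql.cgi
print(template)  # /docs/MyFirst.ehtml
```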
All this allows for fairly extensive user-driven web authoring without ever making a database connection, but of course the real power of the Toolkit CGI Programs is their back-end - an Empress database interface (DBI) which can make calls across a network to any Empress Connectivity Server. Almost all of the functionality of a Toolkit Application is geared towards retrieving from and updating to a database.
The HTML and EHTML Pages
HTML and EHTML (Empress-specific HTML) are the programming languages of the Toolkit. All applications are written primarily as pages of HTML and EHTML linked together with the database(s). A knowledge of HTML formatting tags, Forms, and URL syntax is essential for writing a Toolkit application. Some of the tools require that an HTML form be built containing fields with particular Empress-specific names and values. When posted to the correct Toolkit CGI program, these variables are used to build an SQL string, which is sent to the Connectivity Server. Output is generated largely automatically, but template documents can also be passed into the tool so that the developer can control the output.
The Empress Connectivity Server
The Connectivity Server manages network requests coming into Empress from remote Empress client applications. The CGI programs interpret your Toolkit application and make SQL calls to the Empress Connectivity Server. Coordination between you, the Webmaster, the Network Administrator, and the DBA is necessary because:
Any HTTP Server Directory Control that is set up in the Toolkit application will likely affect which users must be registered (and given passwords) in the Empress Connectivity Server password file.
Database privileges must be granted on the database tables that the application accesses.
You will also need to declare logical (Toolkit application specific) names to correspond to the pathnames of the databases in an empress_config_exml.ini file on the same machine as the Toolkit.
Finally, the Connectivity Server can be on a different host from the Toolkit and HTTP Server, and all of these can be on a different host machine from the actual browser.
The Empress RDBMS
The Connectivity Server talks to the Database Management System. The Database Management System takes various user requests and manages the stored data under its control according to the mathematical definition of the Relational Model of a Database. In this model, databases are organized into sets of tables, and tables into sets of attributes. Well-defined rules govern the relationships between these structures. Most commands to a Relational Database Management System are issued in Structured Query Language (SQL). To communicate with the databases, you need a working knowledge of SQL.
The Database(s)
The database files that you will ultimately connect to lie far below the level of HTML authoring. They are stored on disk in a format that only the Empress RDBMS can read. If you are starting from scratch, you will need to know how to run the Empress utilities empmkdb to make a new database, and empsql to run an SQL session where you can create tables for your new database.
The browser decodes the first part of the URL and contacts the server.
The browser supplies the remainder of the URL to the server.
The server translates the URL into a path and file name.
The server recognizes that the URL points to a Toolkit CGI program.
It does so because the URL path contains the ScriptAlias directory name (referred to as the CGI directory, where the CGI programs reside).
The server prepares the environment and executes the Toolkit CGI program.
The server executes the ehsql.cgi program located in the CGI directory.
The Toolkit CGI program executes and reads the environment variables and STDIN.
ehsql.cgi reads resource variables from the ehtml.ini file and, if needed, data source information from the empress_config_exml.ini file.
The Toolkit CGI program sends the proper MIME headers to STDOUT for the forthcoming content.
The Toolkit CGI program supplies a short piece of code (usually called a header), following the Multipurpose Internet Mail Extensions (MIME) specification, to tell the browser what kind of file (audio, graphics, plain text, HTML, etc.) is about to come across the network.
The Toolkit CGI program sends the rest of its output to STDOUT and terminates.
The server notices that the Toolkit CGI program has finished and closes the connection to the browser.
The browser displays the output from the Toolkit CGI program.
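The CGI side of the sequence above can be sketched with a small stand-in program. This is not the Toolkit's code; run_cgi is a hypothetical function that only mirrors the documented order of events: read the environment variables and STDIN, write the MIME header followed by a blank line, then write the rest of the output. REQUEST_METHOD, CONTENT_LENGTH and QUERY_STRING are the standard CGI variable names. For simplicity the sketch reads text rather than bytes.

```python
import html
import io
from urllib.parse import parse_qs


def run_cgi(environ, stdin, stdout):
    """Illustrative stand-in for a CGI program; not the Toolkit itself."""
    # Read the environment variables and, for a POST, the form data on STDIN.
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)

    # Send the MIME header and a blank line before any content.
    stdout.write("Content-Type: text/html\r\n\r\n")

    # Send the rest of the output; the server forwards it to the browser.
    stdout.write("<html><body><ul>\n")
    for name, values in sorted(fields.items()):
        stdout.write("<li>%s = %s</li>\n" % (html.escape(name), html.escape(values[0])))
    stdout.write("</ul></body></html>\n")


# Simulate a GET request without a Web server.
out = io.StringIO()
run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe"}, io.StringIO(""), out)
print(out.getvalue())
```

Passing the environment and streams as arguments keeps the sketch easy to try without a server; a real CGI program would use the process environment, standard input and standard output directly.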
There are two possible ways to establish a connection to the database:
Specify the connection information in the HTML tag
It is possible to establish a connection to the database by specifying all the connection information in the HTML tag. The syntax is the same as the standard ODBC driver connect syntax:
DBASE="SERVER=server_name;PORT=6322;DATABASE=database_name"
The user name and password can optionally be added to the connection information:
DBASE="SERVER=server_name;PORT=6322;DATABASE=database_name;
UID=user_name;PWD=pass_word"
The user name and password set in this way override all other configurations.
Using these options is not advisable unless you are running an open system, because they expose the connection information to the client.
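As a rough illustration of the driver connect syntax, the following sketch splits a DBASE value into its KEY=value parts. The function parse_dbase is hypothetical and is not how ehsql.cgi is implemented; it assumes that a value containing no equal sign should be treated as a plain name rather than as connection information.

```python
def parse_dbase(value):
    """Split a DBASE value in driver connect syntax into a dictionary.

    Returns None when the value contains no KEY=value pair, in which case
    it is assumed to be a plain name instead of connection information.
    """
    if "=" not in value:
        return None
    settings = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, _, val = part.partition("=")
        settings[key.strip().upper()] = val.strip()
    return settings


print(parse_dbase("SERVER=server_name;PORT=6322;DATABASE=database_name"))
# {'SERVER': 'server_name', 'PORT': '6322', 'DATABASE': 'database_name'}
print(parse_dbase("database_name"))
# None
```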
Using data source file
Another way to establish a database connection is to use a data source file.
This file is read by ehsql.cgi when it encounters a data source name that does not conform to the driver connect syntax (see the previous section).
There are several possible locations for the data source file. These are:
These locations are searched in sequence, and the first file found is used.
The lines within the data source file provide the names of the databases to be used in the application and the information the server needs to establish a connection: the host name of the server machine, and the full path and name of the actual database. A typical data source file might be structured as follows:
[customers]
Server=workstn1
Database=/usr/empress/dbases/cust_info
[employees]
Server=workstn2
Database=/home/server/emp-dbases/employees
This file has two sections, one for each of the Data Sources needed in a hypothetical application:
In this case only the logical name of the database is embedded in the HTML documents. For example:
DBASE="customers"
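The lookup from a logical name to a server and database path can be sketched with Python's standard configparser module, which reads the same section and Key=value layout as the sample file. This only illustrates the idea; it is not the Toolkit's own parser, and the function name lookup_data_source is hypothetical.

```python
import configparser

# Sample data source file from the text.
DATA_SOURCE_TEXT = """\
[customers]
Server=workstn1
Database=/usr/empress/dbases/cust_info

[employees]
Server=workstn2
Database=/home/server/emp-dbases/employees
"""


def lookup_data_source(text, name):
    """Return (server, database) for the logical data source called name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser[name]  # raises KeyError for an unknown name
    return section["Server"], section["Database"]


print(lookup_data_source(DATA_SOURCE_TEXT, "customers"))
# ('workstn1', '/usr/empress/dbases/cust_info')
```

Only the logical name appears in the HTML page, so the host name and database path can change without editing any documents.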
Resource variables are used by the Toolkit, and their values can be referenced from within HTML documents that run through ehsql.cgi. The following is a list of these Resource Variables and their description:
Table 2-1: Empress Web HTML Toolkit Resource Variables
| Variable | Description |
| DEFAULT_PASSWORD | The password for the Connectivity Server for the default user. The default password is a null string. |
| DEFAULT_USER | A user name in a connectionless, HTTP security-less application. Default is anonymous. |
| DOCUMENT_ROOT | The root directory of the HTTP Server. Some servers, such as Apache send this as part of the CGI. Some servers do not. This variable must be set if the server does not send this as part of the CGI. |
| EH_COOKIE_PATH | Defines the sub-path below the document root for which the ehlink.cgi connection is valid. The default is /. Note: normally this should be left at its default setting. |
| EH_DELAY | Retry time-out period for establishing database connections. The default is one second. The range is one to ten seconds. Increasing this value helps to prevent connection failures. |
| EH_DOC_ROOT | This is a path starting at DocumentRoot and ending at a directory which the writer tool will treat as the DocumentRoot. This is important for security. See the chapter on Security. |
| EH_EXPIRE | The expiry date of an HTML page is used by a browser to decide whether to use a client-side cached document or to request an update from the server. Normally HTML pages generated by the Toolkit do not have expiry dates. If EH_EXPIRE is set, this date is transmitted with a document so that the browser knows when the document should be retrieved again from the server. The most common use for this variable is to set the expiry date to some time in the past - this guarantees that a document is retrieved again every time a user loads it into their browser, which is very important for dynamic pages whose content changes frequently. |
| EH_LINK_TIMEOUT | The time in seconds after which a session-based (ehlink.cgi) connection will time out if not used by the client. The default is 900 seconds. After this time a new connection will need to be established. |
| EH_MAX_RETRIES | The number of attempts an application will make when establishing a connection with a database. The default is five, with a range of one to ten. The maximum wait time for a connection is EH_MAX_RETRIES multiplied by EH_DELAY seconds. |
| EH_SQLTRACE | Setting this value activates tracing in the logs/sql.log file of the CGI directory. The logs directory must be manually created. The trace contains SQL and time stamps for the last submitted page for the ehsql.cgi program. The next submitted page overwrites the old entries. Note: This option is intended for application development and debugging; it should be disabled on a working application. |
| EH_TIMEOUT | Maximum time in seconds for a web page submission to return results. The default is infinite. This setting allows the CGI to gracefully exit before the web server aborts the process. |
| IGNORE_REMOTE_USER | Determines the user ID for the database connections. If not set, the Toolkit uses the REMOTE_USER environment variable, which is set when the web server asks a user to log in. If this variable is set to a value, or if no web server security is enabled, the ID defined by DEFAULT_USER becomes the database connection ID. |
| LOCATION | LOCATION is not a true resource variable, because it is generated dynamically by the Toolkit and cannot be set in a resource file. It is included in this list because it is referenced in the same way as the resource variables in an application. When it is included in an application page it expands to the path (from DocumentRoot only) of the document which uses it. This is useful in the URL of a form action or hyperlink to make the document hierarchy portable. As an example, suppose a regular .html document contains a tag: <IMG SRC="picture.gif"> This references a GIF format bitmap in the same directory as the document. Now, upon converting this document to .ehtml and calling it through ehsql.cgi, the relative link won't work anymore because it looks for the GIF in the ehsql.cgi directory. One solution would be to specify a complete directory path for the GIF, but that makes it hard to relocate the finished application to another place. The better solution is to prefix the name of the GIF file with the LOCATION variable like this: <IMG SRC="$$LOCATION$$/picture.gif"> Note that this is less important after v3.20 of the Toolkit, since the Toolkit itself can load any files directly. |
| MSCONFIG_FILE | This is a general version of MSCONFIG_FILE_EXML for general use by a variety of clients. |
| MSCONFIG_FILE_EXML | Name for the database configuration file. The default name is empress_config_exml.ini |
| MSHYPERPATH | This defines the path to the Toolkit installation directory. Note that this is the location of the Empress installation, not of the CGI directories. It is used to locate the licence key and the resources directory. |
| MSLANG | Language for runtime messages. Should be set to the name of the language in which messages are written to the HTTP server logs. The Empress-supplied languages are English (en_US), German (de_DE), French (fr_FR), and Japanese (ja_JP.eucJP). The default is en_US. |
| MSUSERLANG | Should be set to the name of the language which is to be displayed by the application. The Empress-supplied languages are English (en_US), German (de_DE), French (fr_FR), and Japanese (ja_JP.eucJP). The default is en_US. |
| MSUSERPATH | Directory for user defined run time message (RTM) files. |
| TMPDIR | Directory for the CGI scratch files. The default is /tmp. Note that if this is set incorrectly, the select tool will be unable to record queries. |
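Two entries in the table lend themselves to a short worked example: the maximum connection wait (EH_MAX_RETRIES multiplied by EH_DELAY) and the expansion of $$LOCATION$$. The function names below are hypothetical and the code is only a sketch. In particular, it assumes that LOCATION expands to the directory containing the document, which is what the IMG example in the table implies.

```python
import posixpath


def max_connection_wait(max_retries=5, delay_seconds=1):
    """Upper bound on the connection wait in seconds: retries times delay."""
    return max_retries * delay_seconds


def expand_location(markup, document_path):
    """Replace $$LOCATION$$ with the directory part of document_path.

    document_path is the document's path from DocumentRoot, for example
    /docs/page.ehtml. Using the containing directory is an assumption.
    """
    location = posixpath.dirname(document_path)
    return markup.replace("$$LOCATION$$", location)


print(max_connection_wait())        # 5, using the documented defaults
print(max_connection_wait(10, 10))  # 100, with both settings at their maximum
print(expand_location('<IMG SRC="$$LOCATION$$/picture.gif">', "/docs/page.ehtml"))
# <IMG SRC="/docs/picture.gif">
```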
To configure a Toolkit application, set the resource variables in an ehtml.ini file. The following is an example of an ehtml.ini file:
MSHYPERPATH ${EMPRESSPATH}/hypmedia
DOCUMENT_ROOT /usr/joe/JIF/docroot
DEFAULT_USER joe
DEFAULT_PASSWORD joeuser
MSUSERPATH /usr/joe/JIF/docroot/.ehtml
The format is one name/value pair per line, with either an equal sign (=) between them, or just whitespace. There is more than one location possible for the resource file:
The default ehtml.ini file is in the $MSHYPERPATH/resources/init directory. This is read first, and is used to set standard, system-wide variables.
Next, an ehtml.ini file can go in a .ehtml directory underneath the location of ehsql.cgi. This ehtml.ini file is read next, and any variables with the same name as those in the default resource file are overwritten with the new values. This location is included for historical compatibility with Toolkit 2.0.
Finally, an ehtml.ini file can go right in the directory which contains ehsql.cgi. This file contains any variable settings specific to that area of the application. (This is useful in large applications, which might have more than one CGI directory and different settings of resource variables for security reasons.)
The important thing to remember is that the search will look for all the ehtml.ini files listed above, and set the variables found in each one, overwriting previously set variables in more global ehtml.ini files.
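The layering described above can be sketched as follows. The parser accepts one name/value pair per line, separated by an equal sign or by whitespace, and the merge applies the files from most global to most specific so that later values overwrite earlier ones. Both function names are hypothetical, and the handling of blank or malformed lines (they are skipped) is an assumption; this is not the Toolkit's own reader.

```python
import re

# A name, then an optional "=" and/or whitespace, then the value.
PAIR = re.compile(r"([^\s=]+)\s*=?\s*(.*)")


def parse_resource_text(text):
    """Parse name/value pairs separated by '=' or by whitespace."""
    variables = {}
    for line in text.splitlines():
        match = PAIR.fullmatch(line.strip())
        if match:  # blank or malformed lines are skipped
            variables[match.group(1)] = match.group(2)
    return variables


def merge_resource_layers(layers):
    """Apply resource texts in order; later layers overwrite earlier ones."""
    merged = {}
    for text in layers:
        merged.update(parse_resource_text(text))
    return merged


system_wide = "DEFAULT_USER anonymous\nTMPDIR /tmp\n"
cgi_directory = "DEFAULT_USER=joe\nDEFAULT_PASSWORD=joeuser\n"
print(merge_resource_layers([system_wide, cgi_directory]))
# {'DEFAULT_USER': 'joe', 'TMPDIR': '/tmp', 'DEFAULT_PASSWORD': 'joeuser'}
```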
Note that any resource variable that is global to your HTTP Server environment (for example, MSHYPERPATH) can be set as an environment variable in your HTTP server process space. For example, with the Apache server, this could be done by adding the following lines to the httpd.conf file:
SetEnv EMPRESSPATH empress_installation_path
SetEnv MSHYPERPATH ${EMPRESSPATH}/hypmedia
SetEnv MSCONFIGFILEPATH ${EMPRESSPATH}/config
SetEnv MSNETSERVERCONFIGFILE ${EMPRESSPATH}/config/netserver.cfg
This section provides step-by-step instructions for setting up a Toolkit application from scratch. It assumes that the Empress RDBMS and Hypermedia (which includes the Web HTML Toolkit and the Apache server) have been installed properly on the system.
There are four very important locations (directories) for setting up Toolkit applications. They are:
Empress Hypermedia is located using the MSHYPERPATH environment variable or resource variable. This is needed so that the Toolkit CGIs can locate the licence key and the default resource files. The directory where Empress is installed is given by the value of the environment variable $EMPRESSPATH. The MSHYPERPATH environment variable would then be ${EMPRESSPATH}/hypmedia.
There is an Apache server in the Empress Hypermedia standard distribution. It is located under ${MSHYPERPATH}/apache (for example, ${EMPRESSPATH}/hypmedia/apache).
The CGI directory is a directory where all the CGI programs and the data source file are located. When multiple CGI locations are used on a Web server, there are usually some resource variables that are global to the entire installation and others that are specific to each CGI location. Global variables are usually set either as environment variables or as resources in the global resource file (either ${MSHYPERPATH}/resources/init/ehtml.ini or DOCUMENT_ROOT/.ehtml/ehtml.ini). CGI-specific resources are located in an ehtml.ini file in the same directory as the CGI programs. You can have as many CGI directories in the system as you need. The CGI directory is defined with ScriptAlias in the conf/srm.conf file under the Apache server directory.
The document root directory is the root of a hierarchy of directories where HTML pages are located. The document root directory is defined with DocumentRoot in the conf/srm.conf file under the Apache server directory.
In this example:
Let /usr/joe/netapps/My-bin be the CGI directory.
In this case, the Toolkit CGI program (ehsql.cgi), the resource variable file (ehtml.ini), and the data source file (empress_config_exml.ini) will reside in this directory.
There are two ways to install ehsql.cgi:
Copy the Toolkit CGI program from the Empress Hypermedia system directory (that is, ${MSHYPERPATH}/ehtml/bin/ehsql.cgi) to this directory (that is, /usr/joe/netapps/My-bin).
or
Create a soft link to the ehsql.cgi file in this CGI directory. For example, on a Unix system:
ln -s $MSHYPERPATH/ehtml/bin/ehsql.cgi .
Here $MSHYPERPATH is ${EMPRESSPATH}/hypmedia.
There can be many CGI directories for different applications. You can also rename ehsql.cgi to any name that is more meaningful to the application, with one exception: if you want to use the select tool, the name must begin with ehsql (or ehlink).
Let /usr/joe/netapps be the document root directory.
In this case, all HTML pages can be found in this directory or in directories below it. For example, input.ehtml and output.ehtml reside in the /usr/joe/netapps/docs directory.
The Empress distribution comes with a default Apache server configuration for the Web HTML Toolkit administration tools (JumpStart) and the Toolkit demos. For your own applications, start a separate Apache server with a different configuration. For example:
The CGI directory is defined through the ScriptAlias in the Apache server configuration file (conf/srm.conf). For example, edit /usr/joe/netapps/server_conf/conf/srm.conf and insert a line:
ScriptAlias /My-bin/ /usr/joe/netapps/My-bin/
There can be many ScriptAlias definitions.
The document root directory is defined through the DocumentRoot in the Apache server configuration file (srm.conf). For example, edit /usr/joe/netapps/server_conf/conf/srm.conf and modify the default location to:
DocumentRoot /usr/joe/netapps
There can be only one DocumentRoot definition.
The Apache server is started by executing httpd. This program is located under the Apache server directory. The command is:
httpd -d server_root_directory
In this example, the command would be:
$MSHYPERPATH/apache/httpd -d /usr/joe/netapps/server_conf
Here $MSHYPERPATH is ${EMPRESSPATH}/hypmedia, and -d tells httpd to look in the specified directory for the configuration directory (conf) and the log directory (logs). The Apache server is now running on port number 8285, configured with the specified CGI directory and document root directory.
Create a simple HTML page named MyFirst.html in the docs directory under the document root directory (for example, /usr/joe/netapps/docs) as follows:
<html>
<body bgcolor="#FFFFFF">
<h1>This is first page of Toolkit example</h1>
</body>
</html>
Assuming the FQDN (Fully Qualified Domain Name) of your system is mine.empress.com, type in the URL http://mine.empress.com:8285/docs/MyFirst.html. If you see the page displayed in the browser, then the server is working correctly.
By default the Toolkit treats output as HTML text. This can be changed either by special file extensions or by the CONTENT-TYPE option of the EXECUTE tag. The Toolkit will accept the .html extension shown above, but it is often useful to use a .eht or .ehtml file extension, so that files which work with the database are easier to spot.
http://mine.empress.com:8285/My-bin/ehsql.cgi/docs/MyFirst.ehtml
or
http://mine.empress.com:8285/My-bin/ehsql.cgi/docs/MyFirst.html
would both return the file as an HTML type (text/html). But
http://mine.empress.com:8285/My-bin/ehsql.cgi/docs/MyFirst.txt
would return the file as plain text (text/plain).
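The extension rule can be summarised in a small sketch. The table below lists only the extensions whose content types are stated in the text; falling back to text/html for anything else follows the documented default, but how the Toolkit treats other extensions is an assumption. The function name content_type_for is hypothetical.

```python
import posixpath

# Extensions and content types stated in the text.
CONTENT_TYPES = {
    ".html": "text/html",
    ".ehtml": "text/html",
    ".txt": "text/plain",
}


def content_type_for(url_path):
    """Pick a content type from the extension of the template file."""
    extension = posixpath.splitext(url_path)[1].lower()
    # Assumption: anything not listed falls back to the HTML default.
    return CONTENT_TYPES.get(extension, "text/html")


print(content_type_for("/My-bin/ehsql.cgi/docs/MyFirst.ehtml"))  # text/html
print(content_type_for("/My-bin/ehsql.cgi/docs/MyFirst.txt"))    # text/plain
```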
If you see the page displayed in the browser, then you are ready to use the Toolkit to its full capacity.