CHAPTER 2: |
Persistent Stored Modules |
One of the main features of Empress is Persistent Stored Modules (PSM) that allows users to store user-defined functions, operators, stored procedures and triggers in the database, as objects in a schema. These objects can then be invoked from SQL statement and application programs.
A Stored Procedure is a procedural database object that can be invoked by any client program.
Triggers are special types of stored procedures that get invoked ("fired") automatically whenever certain events occur, and the conditions satisfy. Traditionally, triggers have been associated with database events such as insertions, updates or deletes.
There are reasons why PSM is a very useful enhancement of Empress RDBMS:
Quite often an application program performs a fare number of database accesses to produce a limited number of results. If this program requires to be executed repeatedly in the client/server environments, it can be time consuming because of network overhead. It is more efficient to run this program on the server side. One way to do this is to put the program into a procedure and store into the database as part of schema definition. This way, not only the performance gets improved, but also the application programming becomes more structured.
Empress provides many functions and operators which are included with the standard distribution (refer to Empress SQL: Reference manual under Functions). However, when an additional function is required to operate on the data, you can extend the functions of Empress to fulfill your specific needs through the User Defined Functions capability. Persistent Stored Modules provide an easy interface to incorporate user defined functions into Empress RDBMS.
This section describes how to incorporate a user defined function (UDF) into Persistent Stored Module as a dynamically loadable library or a static library.
Here are the steps to incorporate a user defined function (UDF) into Persistent Stored Module as a dynamically loadable library:
Consider a C program file fact.c located in an imaginary directory /usr/joe. This file contains the factorial function that performs a factorial calculation on the single parameter it accepts, and returns the result.
long factorial(long num)
{
int i;
long res=1;
for (i=1; i<=num; i++)
res *= i;
return (res);
}
Compile fact.c and create shared object fact.dll:
emppsmcc -o fact.dll fact.c
The same command can be used if the source file includes Empress routines, such as mr routines, in order to access Empress database.
The following steps show how to store external function fact into a database as a PSM. Once this is completed, fact becomes an Empress function for that database, and it will be a keyword until the DROP MODULE command is issued.
Invoke Empress Interactive SQL session from the operating prompt:
empsql db_name
Within the Interactive SQL session, issue the following SQL commands:
CREATE MODULE module_fact
FUNCTION fact (GENERIC INTEGER) RETURNS GENERIC INTEGER
EXTERNAL NAME factorial;
END MODULE;
UPDATE MODULE module_fact
FROM "/usr/joe/fact.dll"
Now you can use fact function in SQL statements, for example:
SELECT attr1, attr2, fact(attr2) FROM table_name;
SELECT fact(attr1) - fact<attr2) PRINT "Differences"
FROM table_name;
a User Defined Function can also be used in conditional expression of a WHERE clause:
SELECT * FROM table_name
WHERE attr1 * 1000 > fact(attr3);
where attr1, attr2, and attr3 are numerical attributes in the table table_name.
Empress allows User Defined Functions to be used in several places:
1. In the Empress SQL they can be used in:
Their usage is the same as Build-In Functions in Empress.
2. In the Empress 4GL they can be used in:
The usage is the same as Empress Built-In Functions and invoked 4GL procedures.
3. In Empress Report Writer, they can be used in:
Their usage is the same as Build-In Functions in Empress.
4. In Empress Hypermedia - Internet Application Toolkit, they can be used in:
Their usage is the same as Build-In Functions in Empress.
User Defined Functions can also be used in C programs through expressions built with mr expression routines. (See Empress Host Language: C Kernel Level Interface mr Routines).
On the platforms where dynamically loading shared library is not available (for example: QNX), users can use statically linking PSM to define and call User Defined Functions.
In the platforms where dynamically loading shared library is available, the Statically Linking Persistent Stored Modules would not work. Currently only a few platforms need to use Statically Linking Persistent Stored Modules.Here are the steps to incorporate a user defined function (UDF) into Persistent Stored Module in a statically linking environment:
The example function, fact.c located in /usr/joe directory, performs a factorial calculation. It accepts a single integer argument and returns a long integer result value.
long factorial(long num)
{
int i;
long res=1;
for (i=1; i<=num; i++)
res *= i;
return (res);
}
Compile fact.c and create library fact.a:
emppsmcc -o fact.a fact.c
The following steps show how to store external function fact into a database as a PSM. Once this is completed, fact becomes an Empress function for that database, and it will be a keyword until the DROP MODULE command is issued.
Within the Interactive SQL session, issue the following SQL commands:
CREATE MODULE module_fact
FUNCTION fact (GENERIC INTEGER) RETURNS GENERIC INTEGER
EXTERNAL NAME factorial;
END MODULE;
UPDATE MODULE module_fact
FROM "/usr/joe/fact.a"
Unlike dynamic loading PSM, this function is not available until it is explicitly linked into the executable file(s).
As Empress System Administrator, you can make the function available to all the users.
First, rebuild the library:
cd $EMPRESSPATH/rdbms/custom/src/usrfns ./mklib db_name
Then, make executable file(s), for example, to make fact available to empsql:
cd $EMPRESSPATH/rdbms/conf_bin ./mkexec empsql
Note that $EMPRESSPATH/rdbms/custom/src/usrfns/mklib command is used to modify system library (one need to be Empress System Administrator to issue this command). It has the following syntax:
mklib [db_name ... ]
It takes a list of database names from which the modules will be added to system library.
Now, you can use the function fact in empsql.
Invoke Empress Interactive SQL session from the operating prompt:
empsql db_name
Within the Interactive SQL session, issue the following SQL commands:
SELECT attr1, attr2, fact(attr2) FROM table_name;
SELECT fact(attr1) - fact<attr2) PRINT "Differences"
FROM table_name;
or in the WHERE clause:
SELECT * FROM table_name
WHERE attr1 * 1000 > fact(attr3);
where attr1, attr2, and attr3 are attributes in the table table_name.
It is also possible to make your own executable programs from the PSM routines. This can be done using the interface definition of PSM routines already stored in one or more database.
Consider a C program that uses the factorial function already stored as a user-defined function.
msmain()
{
long i = 5;
printf ("fact(%ld) : %ld", i, fact(i));
}
Use emppsm_mkexec to make an executable:
emppsm_mkexec -psmdb db_name -o fact fact.c
Notes:
Since UDF has to be linked to the executable file before the actual executaion of the program, certain limitation applies to Static Linking PSM:
Currently module name is used to differentiate the table entries for User Defined Function, for static link, there might be conflict if modules in two databases have the same name.
If same external name (C routine name) are being used, there will be conflict when doing static link.
The following is a listing of Persistent Stored Modules SQL commands:
Table 2-1 SQL/PSM Commands
| Command | Descriptions |
| CREATE MODULE | This command is used to define the definition of the module into the data dictionary. This creates an entry point for the user defined functions, operators, procedures or aggregate functions. The UPDATE MODULE command is necessary for linking the module definition with the module loadable shared library. |
| UPDATE MODULE | This command links the module definition with the module loadable shared library. |
| DROP MODULE | This command removes the module definition from the database data dictionary. Any functions, procedures and operators that are defined in the module will no longer recognized by Empress. |
| DISPLAY MODULE | This command displays the module definition from the database data dictionary. |
| CREATE TRIGGER | This command is used to define a trigger definition into the data dictionary. Once it is defined, the trigger is enabled automatically. ALTER TABLE command can be used to temporarily disable the trigger which can then be enabled again when is needed. |
| DROP TRIGGER | This command is used to remove the trigger definition from the data dictionary. |
| ALTER TABLE | this command is used to enable or disable trigger and can also be used to change trigger priority. |
| CREATE COMMENT | this command is used to attach a comment on the module or trigger. |
| DROP COMMENT | this command is used to remove comment on the module or trigger. |
These commands are described in the A4: SQL Reference manual.
When a User Defined Function is declared in CREATE MODULE command, a list of its parameters and a return type are specified. For example:
DECLARE FUNCTION funct1 (integer, IN float)
RETURNS INTEGER
EXTERNAL NAME func2
PARAMETER STYLE GENERAL;
The parameter list can follow one of two styles, PARAMETER STYLE GENERAL and PARAMETER STYLE SQL. These are defined as follows:
This style of parameter passing to a User Defined Function means that the effective parameter type list is the same as the one explicitly stated. Null value is restricted with the parameter type of IN or INOUT mode.
This style of parameter passing to a User Defined Function means that the effective parameter type list is defined as follows:
The first n parameters are the n explicitly defined ones. Parameter mode is as defined.
If and only if the routine is a function, operator or aggregate function, the next parameter is an output parameter corresponding to the result to be returned . Parameter mode is OUT.
The next n parameters (n+1, if the routine is a function, operator or aggregate function) are null indicator parameters, in 1:1 correspondence with the first n (or n+1) parameters. Parameter mode is whatever defined in the first n parameters.
The next parameter is an INOUT parameter; it indicates success or failure of the routine. Its data type character type with length 5.
The next is an IN parameter; it is used to pass the name of the external routine to be invoked.
The next parameter is used to pass the specific name of the external routine to be invoked with parameter mode IN.
The final parameter is message text to be placed in the diagnostics area with parameter mode INOUT.
PARAMETER STYLE GENERAL parameter style, as indicates by it's name, is more general hence it is easier to use; but it does not deal with null value, hence, it is less flexible. PARAMETER STYLE SQL provides users with greater control.
Empress allows three groups of data types for the Parameter Type and Return Value:
Generic data types represent data that is not bound by parameters.
MS data types represent data that is bound by parameters. For Example the CHAR data type accepts parameters for length and type. MS data types are associated with specific generic data types. For example, the MS data types DATE, TIME and MICROTIMESTAMP are based on the generic date, gen_date, which has the C structure definition:
typedef struct
{
long gen_year; /* actual year. e.g. 1999, 2000, etc */
long gen_month; /* 1 to 12. 1 = January, ... */
long gen_day; /* 1 to 31 */
long gen_hour; /* 0 10 23 */
long gen_minute; /* 0 to 59 */
long gen_second; /* 0 to 59 */
long gen_microsec; /* 0 to 999999 */
} gen_date;
This association allows computations involving MS data types to use the same computation routines as the generic data types on which they are based on.
The gen_binary has the following C structure definition:
struct gen_binary_segment
{
long data_len; /* length of binary data in segment */
msbyte* data; /* binary data */
long reserved[6];
};
typedef struct gen_binary_segment gen_binary_segment;
struct gen_binary
{
long total_data_len; /* total length of binary data */
int flags; /* bit arrray of flags */
int num_segments; /* number of segments, typically 1 */
gen_binary_segment segment [1 /* actually: num_segments */];
};
typedef struct gen_binary gen_binary;
#define GEN_BINARY_SEGMENTS_ALLOCATED (1 << 0)
This structure is defined in $EMPRESSPATH/include/usrfns/generic.h
Note that:
The total_data_len should be the sum of data_len from each gen_binary_segment.
If GEN_BINARY_SEGMENTS_ALLOCATED is set in flag, it means segment data is in individually allocated buffers (to be freed); else segment data is in static buffers which will not be freed.
The correspondences to the Empress User Defined Functions C program data type structures (as defined in $EMPRESSPATH/include/usrfns.h file) are as follow:
Table 2-2 Generic Data Type Correspondences
| Generic Data Type | UDF C Program Data Type (Returns/Parameter Mode IN) |
UDF C Program Data Type (Parameter Mode OUT/INOUT) |
| GENERIC CHAR | char * | char ** |
| GENERIC INTEGER | long | long * |
| GENERIC DECIMAL | gen_dec | gen_dec * |
| GENERIC FLOAT | double | double * |
| GENERIC DATE | gen_date | gen_date * |
| GENERIC BOOLEAN | int | int * |
| GENERIC EXTERNAL | char * | char ** |
| GENERIC INTERVAL | gen_date | gen_date * |
| GENERIC BINARY | gen_binary * | gen_binary ** |
Table 2-3 MS Data Type Correspondences
| MS Data Type | UDF C Program Data Type |
| CHAR CHARACTER |
char * |
| SHORTINTEGER SMALLINT |
short * |
| INTEGER INT |
long * |
| LONGINTEGER | long * |
| DATE | gen_date * |
| DOLLAR | gen_dec * |
| REAL | float * |
| LONGFLOAT DOUBLE PRECISION |
double * |
| FLOAT | If decimal places <= 6: float * If decimal places > 6: double * |
| TEXT | char ** |
| BULK | gen_binary ** |
| DECIMAL DEC |
gen_dec * |
| TIME | gen_date * |
| NLSCHAR NLSCHARACTER |
char * |
| NLSTEXT | char ** |
| WON | gen_dec * |
| MICROTIMESTAMP | gen_date * |
Table 2-4 C Data Type Correspondences for UDF
| C Data Type | UDF C Program Data Type (Returns/Parameter Mode IN) |
UDF C Program Data Type (Parameter Mode OUT/INOUT) |
| "C char" | char | char * |
| "C schar" | signed char | signed char * |
| "C uchar" | unsigned char | unsigned char * |
| "C short" | short | short * |
| "C ushort" | unsigned short | unsigned short * |
| "C int" | int | int * |
| "C uint" | unsigned int | unsigned int * |
| "C long" | long | long * |
| "C ulong" | unsigned long | unsigned long * |
| "C float" | float | float * |
| "C double" | double | double * |
| "C string" | char * | char ** |
| "C string_static" | char * | char ** |
Warnings: For some compilers, "C float" for parameter mode IN might not work. Please avoid using "C float" for parameter mode IN.
This section provides a list and brief description of
PSM syntax and user error messages.
2.6.1 Syntax Error Messages
You issued a CREATE MODULE command, but did not specify a module name.
You issued a DROP MODULE command, but did not specify the module name.
You specified the LANGUAGE option in a CREATE MODULE command without language specification.
You specified LANGUAGE option with parameter in a CREATE MODULE command without language parameter after the parentheses.
You specified FOR system_name option in UPDATE MODULE command without the system name.
You issued an UPDATE MODULE command without a DLL file name.
While declaring a SQL-invoked routine in CREATE MODULE command, you did not give a correct routine name.
While declaring an OPERATOR in CREATE MODULE command, you did not specify an operator class (eg. PREFIX, POSTFIX ... etc.) for the operator.
While declaring SQL-invoked routine in CREATE MODULE command, you did not specify the parameter list correctly.
While declaring FUNCTION, OPERATOR or AGGREGATE FUNCTION in CREATE MODULE command, you did not specify the return data type after the keyword RETURNS.
While declaring FUNCTION, OPERATOR or AGGREGATE FUNCTION in CREATE MODULE command, you did not specify the return data type.
While declaring a SQL-invoked routine in CREATE MODULE command, the external specification is missing (ie. keyword EXTERNAL).
While declaring a SQL-invoked routine in CREATE MODULE command, the specification after the keyword EXTERNAL is not correct.
While declaring a SQL-invoked routine in CREATE MODULE command, the parameter style specification is not correct (ie. PARAMETER STYLE SQL or PARAMETER STYLE GENERAL).
You specified ALLOCATE WORKSPACE option without the size of workspace in CREATE MODULE command.
While declaring a SQL-invoked routine in CREATE MODULE command, you specified the BEGIN EXPRESSION with incorrect external name.
While declaring an AGGREGATE FUNCTION in CREATE MODULE command, you did not give BEGIN GROUP specification.
While declaring an AGGREGATE FUNCTION in CREATE MODULE command, you did not give BODY specification.
While declaring an AGGREGATE FUNCTION in CREATE MODULE command, you did not give END GROUP specification.
You issued a CREATE TRIGGER command but did not give the specification for the trigger.
You issued an ALTER TABLE command to change the property of a trigger, but did not give the name of the trigger.
You issued a CREATE TRIGGER command without specifying BEFORE or AFTER of the triggering event(s).
You issued a CREATE TRIGGER command without specifying the triggering event (ie. DELETE, INSERT or UPDATE.
You issued a CREATE TRIGGER command without specifying the trigger procedure name (ie. EXECUTE trigger_procedure_name).
You issued a CREATE TRIGGER command without EXECUTE specification.
You specified the PRIORITY option in the ALTER TABLE command without specifying the priority.
You issued an ALTER TABLE command to change the priority of a trigger but missing the keyword PRIORITY.
You issued a CREATE MODULE command, the EXTERNAL specification is incorrect.
You issued a CREATE MODULE command, the keyword MODULE in END MODULE was missing.
While declaring an AGGREGATE FUNCTION in CREATE MODULE command, the keyword FUNCTION after AGGREGATE was missing.
You specified ALLOCATE WORKSPACE option in CREATE MODULE command, the keyword WORKSPACE after ALLOCATE was missing.
You specified FOR EACH ROW option in CREATE TRIGGER command, the keyword EACH after FOR was missing.
You specified FOR EACH ROW option in CREATE TRIGGER command, the keyword ROW after EACH was missing.
You issued an ALTER TABLE command to ENABLE or DISABLE ALL TRIGGERS, the keyword TRIGGERS after ALL is missing.
You declared a FUNCTION, OPERATOR or AGGREGATE FUNCTION with a parameter mode OUT in CREATE MODULE command, this mode is only valid for PROCEDURE parameters.
You declared a FUNCTION, OPERATOR or AGGREGATE FUNCTION with a parameter mode INOUT in CREATE MODULE command, this mode is only valid for PROCEDURE parameters.
You declared an infix operator (OPERATOR INFIX precedence) with arithmetic parameter in CREATE MODULE command, but the precedence was out of range (1..6).
You declared an infix operator (OPERATOR INFIX precedence) with boolean parameter, but the precedence was out of range (1..2).
You issued a CREATE MODULE command, but did not give SQL-invoked routine specification or language specification.
For prefix and postfix operator, only one parameter is allowed.
For infix, comparison and equality operator, only two parameters are allowed.
You declared a SQL-invoked routine with the same external name as another routine within the same module.
You attempted to use a file which does not exist.
You specified an incompatible parameter.
You specified an invalid module name.
You specified an invalid routine name.
You specified an invalid parameter.
You specified an invalid number of parameters.
You specified an invalid external name for a SQL-invoked routine.
You specified an invalid trigger name.
You attempted to create a module with the duplicate name of an existing module.
You attempted to used a module which does not exist.
You specified a parameter with the same name as another parameter within the same routine.
You attempted to used a parameter which does not exist.
The parameter data type should be boolean.
The parameter data type should be arithmetic.
The return data type should be boolean.
The return data type should be arithmetic.
You declared an aggregate function which returns boolean data type.
You declared multiple routines with the same names but different routine classes.
You specified too many parameters for an operator.
You attempted to create a trigger with the same name as an existing trigger.
You attempted to use a trigger which does not exist.
You attempted to declare a routine with the same name as an existing routine.
You attempted to use a routine which does not exist.
When calling mrefunc(), invalid routine or incompatible parameter are detected.