CHAPTER 4: Command Logging and Backup

4.1 Introduction

Empress provides three kinds of command logging:

audit trail
database log
recovery log

and two kinds of database backup and recovery:

archives
on-line backup

The audit trail is used for keeping track of all accesses to the database. It records all operations on the database, even those that do not modify the data.

The database log is used for keeping track of all accesses to the database and records all operations on the database, even those that do not modify the data.

The recovery log records operations that affect database contents. It is used to recover the current state of a database in the event of damage to the database for instance, from a disk failure.

Database archive will create a copy of database onto a backup medium. No users should access the database during the process.

The on-line backup utility will copy a database onto a backup medium while users still accessing the database. This facility is important to sites with databases that must be available at all times.

4.2 Audit Trail

The audit trail records detailed information regarding all commands run by all users. The information includes the user's identity, the commands entered and time of entry, the tables opened and records examined, and so on. The audit trail is intended to provide you with a record of database access for control purposes.

An audit trail is stored in the file specified by the Empress database administrative variable MSDBAUDITTRAIL1. You can make a copy of this log in the file specified by the Empress database administrative variable MSDBAUDITTRAIL2.

By setting a filename (including the full pathname) to MSDBAUDITTRAIL1 in tabzero the audit trail will start creating the file if it did not previously exist. The trail remains enabled until the variable is set to nothing in tabzero.

Be aware that an audit trail can grow very large quite quickly. If disk space is at a premium, audit trails should be truncated frequently.

4.3 Database Log File

The database log records only accesses to the database. It serves as an input for Empress Database Log Analyzer utility.

Here is the format of the Database Log file, which is created to keep history of accesses to an Empress Databases. Empress Database Log File can be specified in the tabzero file by setting a filename (including the full pathname) to MSDBLOGFILE administrative variable. For example:

     MSDBLOGFILE=/usr/dblogfile.log

If not specified, there will not be any default database log file.

4.3.1 Database Log Analyzer Utility

Empress Database Log Analyzer Utility empdblogstat is used to analyze and display data from the Empress Database Log File. Empress Database's log file is created on a first access to an Empress Database.

It contains useful information that could be used in order to get more insight in the usage of the particular Empress Database. Empress Database Log Analyzer Utility empdblogstat is implemented to give users additional capability to analyze the contents of the Database Log File and display aggregate statistical results for the time frame in question.

Several log analysis functions have been provided. They can report the unique user information, the number of connections, connection duration, and number of rejected connections due to the insufficient licenses.

The Empress Database Log Analyzer utility empdblogstat has the following syntax:

    empdblogstat LOGFILE_SPEC [OPTION ...] [TIME_FRAME] [dump]

where:

`LOGFILE_SPEC`: There are two ways to inform the utility about the location of the Database Log File:	`db_name`	This utility will retrieve the Database Log File name from the tabzero file, which contains default values for a number of Empress database administrative variables.
	`-f file`	Use the -f option to specify the Database Log File name
`OPTION`	user, uniqueusers	Display data about unique users connected to the Database. It shows the user identification, the number of connections and the average connection duration per user. It also displays the total number of unique users.
	conn, connection	Display data about user connections. It shows the start time of the interval, the number of connections in the interval, as well as aggregate data, the minimum the maximum and the average number of connections per interval. It also displays total number of connections for the time frame in question.
	dur, connectiondurations	Display data about user connection durations. It shows the start time of the interval, the average connection duration for of connections started in the interval, as well as aggregate data, the minimum the maximum and the average connection duration per interval. It also displays the average connection duration for all connections for the time frame in question.
	lic, licenceexceeded	Display data about user connections rejected due to insufficient licences.
`TIME_FRAME`: If the optional TIME_FRAME is specified, this utility will analyze a section of the Database Log File. By default, the whole Database Log File is analyzed. The time frame specification requires the start time, the end time and the optional time interval.	start time	\| yyyy-mm-dd hh:mm:ss \| \| hh:mm:ss \| \| start \|
	end time	\| yyyy-mm-dd hh:mm:ss \| \| hh:mm:ss \| \| start \|
	interval (optional)	[hh:mm:ss]
`dump`	The "dump" option suppresses most of the labels and prints values alone. The values are separated by the Empress separator <Ctrl+V>. The output, when using "dump" option, could be useful as input to some of the 3rd party graphical tools as a plot for Y-axis.

For example, if the utility is used to display information associated to all of the keywords (i.e. user, conn, dur, lic) the following command at the operating system prompt could be used:

      empdblogstat db1

where db1 is an Empress database name.

4.4 Recovery Log

For some sites, it is quite sufficient to make a backup of databases once per day or once per week, risking at most one day's worth or one week's worth of information loss. Other sites, however, cannot afford to lose any data, and use a recovery log to prevent such a loss. A recovery log contains a copy of all changes which have been made to the database since the last backup; if many changes are made, the recovery log can become quite large.

In the event of damage to the database, the database is restored from the most recent backup, then the changes which were tracked in the recovery log are applied to the restored data. This makes the data in the database completely current to the time of the incident.

To start this continuous recovery logging:

First, set the tabzero variables MSDBRECOVERYLOG1 and MSDBRECOVERYLOG2 to the name(s) of the recovery log file(s). The good practice would be to define the recovery log file(s) with the full path.
Next, issue emprlini to create the log files.
Last, backup the database either using archiving (emparch) utility or on-line backup (empolbak) utility.

It is important that the database and the recovery log file(s) are on different devices or file systems. This is to minimize the chance of both being destroyed in case of hardware or system problem.

The common syntax for emprlini is:

   emprlini database

This will create the log file(s) as indicated in tabzero, and attach an appropriate header to the file as needed by Empress. Each physical log file will also be given an extension to its name to indicate the version number. After doing emprlini, the version number extension will always be .000.

For example if MSDBRECOVERYLOG1 is set to:

   /usr2/accounting/logs/log1

then after issuing emprlini, you will see the following physical file created:

   /usr2/accounting/logs/log1.000

This is the file to which all database activity will be automatically logged.

Continuous logging of database commands will not occur unless emprlini is used to create the original recovery logs. If a log file is created without the appropriate header, an error message will occur for each database command.

4.5 Archives and Recovery

4.5.1 Database Archives

Empress provides a utility, emparch, to create a copy of a database onto a backup medium. Any current recovery log files will be emptied and re-started. The program will back up all data in the database, including the data dictionary and any remote tables.

You should not keep non-database files in the database directory or in directories used for remote tables. These files will not be backed up and the restore process will destroy them.

emparch will question you for various details during the process. Its syntax is:

   emparch [-a file] [-f] database

where:

`-a file`	Create an archive of the database on file, which can be a file or a device.
`-f`	Force execution without confirmation.
`database`	is the name of a database.

The program emparch will not run unless a recovery is specified in tabzero.

Note that emparch come configured for /dev/rmt0 as the default backup device and tar or cpio as the archive mechanism. If these are not suitable for your system, edit the files accordingly. The programs can be found in the bin directory where Empress is installed.

4.5.2 Database Recovery from the Archives

In the event of a disaster to a database (such as an operating system or disk failure), the program emprecov will restore the database from a backup copy produced by emparch, and use the recovery log to re-run the commands entered since the last backup. By backing up files regularly and using the recovery log, a database can always be brought back to the state it was in immediately before it was corrupted.

Note that emprecov comes configured for /dev/rmt0 as the default backup device and tar or cpio as the archive mechanism. If these are not suitable for your system, edit the files accordingly. The two programs are found in the bin directory where Empress is installed.

A recovery log is stored in the file cited by the Empress database administrative variable MSDBRECOVERYLOG1 in tabzero. A copy of this log may be made in the file specified by the Empress database administrative variable MSDBRECOVERYLOG2.

4.6 On-Line Backup and Recovery

4.6.1 On-Line Backup

Performing an on-line backup of a database means copying the database while it is being accessed by users. This facility is important to sites with databases that must be available at all times; these sites cannot afford to take the database off-line for the duration of a backup. The on-line backup procedure is completely transparent to the users and database integrity is maintained.

On-line backup can also be performed on databases which have no active users. It then serves as an equivalent (but more flexible) alternative to the previous method of doing backups: namely the utility emparch to create the backup, and emprecov to restore the database.

The empolbak command is used to perform an on-line backup. The backup can be made to an off-line device like a cartridge or a DAT tape, or to a file system. If you are using an off-line device make sure that your backup medium has been formatted and has enough space to store the database. Note that empolbak does not support multiple volumes; you must make sure that the database will fit fully onto your backup medium. The command syntax for empolbak is:

   empolbak options_list argument_list database

where database is the physical name or logical name of the database to back up. The option_list and the argument_list is described in the following sections. Although you may run on-line backup through a server, you must be on the local node where the physical database resides.

On-line backup can be done with or without the existence of continuous recovery logging in place.

4.6.1.1 On-Line Backup Device

The database will be backed up to the default device given by the Empress variable, MSOLBBACKUPDEVICE. This is usually set to /dev/rst0 by default. The block size for the backup is given by the Empress variable MSOLBBLOCKSIZE. It is 1024 by default.

In order to specify backup to a different device or file, use the -f (file/device) option followed by the name of the file or device, for example:

   empolbak -f /tmp/backup movies

Here, /tmp/backup is the name of the file in which to store the backup, and movies is the name of the database. To save to a DAT tape for example, you might use /dev/tape as the name of the device:

   empolbak -f /dev/tape movies

Note that using "-" as the file name will send the backup to the process' standard output.

4.6.1.2 On-Line Backup Blocking Factor

Depending on the medium you use to store your backup, you may wish to specify the blocking factor. The blocking factor is the number of 512-byte data blocks which will be stored in one media block; see your System Administrator to discover what value you should use.

Cartridge tapes typically have a blocking factor of 126. To specify a blocking factor of 126 (512 * 126 = 64512 data bytes per media block) to a cartridge tape, use:

   empolbak -b -f 126 /dev/rst0 movies

Note that the Empress variable MSOLBBLOCKSIZE specifies the default block size for the backup and indirectly the blocking factor. You therefore need to use the -b (blocking factor) option only if the blocking factor you wish to use is different from the default.

All of the options must be specified first, followed by their arguments; the arguments must be in the same order as the options. The options may also be concatenated together, however the order still remains relevant. The following command:

   empolbak -bf /dev/rst0 126 movies

is incorrect. It specifies a blocking factor of /dev/rst0 and uses 126 as the name of the backup file. The correct command would be:

   empolbak -bf 126 /dev/rst0 movies

4.6.1.3 On-Line Backup Process Information

The final option is -v, which makes empolbak produce verbose information about the backup process. The name of each database file is printed as it is backed up, as well as the current phase of the backup:

   empolbak -bfv 126 /dev/rst6 movies

4.6.1.4 On-Line Backup with Continuous Recovery Logging

If on-line backup is done with continuous recovery logging in place, the logging of database activities will continue while backup is in progress. All database files including transaction log and the recovery log will be copied to the backup medium and a recovery log file(s) with new extension number will be created at the end of backup process.

The log file extension will be increased by 1 each time on-line backup is performed. For example, if after issuing emprlini the log file is named log1.000, then after the first on-line backup, the new log file will be named log1.001, after another on-line backup it will be named log1.002, and so on.

The log file extension numbers will be automatically maintained by the coordinator facility, and will be applied by the empolbak command during re-initialization of the log file. It is not necessary to re-issue the emprlini command after doing an on-line backup.

4.6.1.5 On-Line Backup without Continuous Recovery Logging

If continuous recovery logging is not in place, on-line backup will create its own recovery log file which will be used during the backup process only. This log file will contain all changes which are made to the database during the on-line backup. All database files as well as this recovery log file will be copied to the backup medium. This produces a full backup of the database which is current to the time of the backup. Since continuous recovery logging is not in place, subsequent changes to the database will not be logged once the backup process completed.

The name of the log file is determined by the variable MSOLBRECOVERYLOG in the initfile file under the Empress system custom directory. The default file name is olbreclog and will be created relative to the local directory where on-line backup started. The setting of the variable can be changed by:

Changing the value of MSOLBRECOVERYLOG in the $EMPRESSPATH/config/initfile file using system editor. For example:
```
   MSOLBRECOVERYLOG=/usr2/accounting/tmp_logs/log1
```
Or,
Setting the value of MSOLBRECOVERYLOG in the environment of the DBA before on-line backup is run. For example:
```
   In UNIX csh:

   setenv MSOLBRECOVERYLOG /usr2/accounting/tmp_log1 
```

MSOLBRECOVERYLOG should always be set to the full path specification of the log file so that client processes are able to find the file.

It is possible, although not advisable, to run two simultaneous on-line backups on different databases. It will be necessary to insure that MSOLBRECOVERYLOG is set to different values before running each on-line backup.

The recovery log file created by empolbak will be removed from disk after the backup process completed. However, in the case where a client process is active during on-line backup but becomes inactive for a significant period of time, therefore, cannot be notified that the on-line backup is complete. In this case, empolbak will issue a message stating that the log file could not be removed. It should then be manually removed at some later time by the DBA. The database backup itself will be complete and correct.

4.6.2 Database Recovery

In case of hardware problem, system failure or user error that destroys the database, the empolrec can be used to restore data from the last on-line backup copy of the database, and roll forward the data from the current recovery log file. This will bring the database up to the point right before the incident.

Make sure that the tape with the last backup is in the drive and ready to be read or that the backup file on disk is available before issuing this command.

Since database is recovering from a backup copy, it is reasonable to assume that the database is damaged and therefore not in use. Otherwise, you should verify that the database is not being accessed before the recovery process. Use empadm database coordinfo or empadm database lockinfo to look for active users.

Please be aware that empolrec will fully remove the old database before restoration begins.

When the recovery process begins, it will remove shared memory, copy the database files back from the backup medium, then the transaction logs. Next, it rolls the recovery log forward directly from the backup medium. At this point, based on the transaction logs and the recovery log from the continuous recovery logging (if present), it determines what transactions must be roll back. Finally, it runs any remaining operations from the recovery log.

The command syntax for empolrec is:

   empolrec option_list argument_list database

Recovery will be from the default device as given by the Empress variable, MSOLBBACKUPDEVICE. In order to specify a different device or file name from which to recover the database, use the -f (file/device) option followed by the name of the file or device. For example, to recover from a file, use the command:

   empolrec -f /tmp/backup movies

Here, /tmp/backup is the name of the file from which to recover the database and movies is the name of the database which is to be recovered. To recover from a DAT tape, you might use /dev/tape as the name of the device:

   empolrec -f /dev/tape movies

Note that using "-" as the file name will read the data for the recovery from the process' standard input.

The blocking factor that was used in making the backup is stored within the header of the backup. Therefore, it is not necessary to specify a block size or blocking factor.

When recovering a database, same pathname must be used as the one from which the database was archived. Attempting to recover database with different pathname from a backup will result in an error. Also attempting to restore database to a different absolute location from which it was backed up, will result in error.

In order to see the contents of the backup, use the -t option:

   empolrec -tf /tmp/backup movies

This will only display the contents of the backup; the database will not be recovered.

The final option is -v which makes empolrec produce verbose information about the recovery process. The name of each database file is printed as it is recovered, as well as the current phase of the recovery. For example:

   empolrec -fv /dev/rst8 movies