Chapter 5
Synchronization in Empress Replication
5.1 Introduction
This chapter explains synchronization in Empress Replication.
It describes the client/server architecture of synchronization in terms of the relationship between the Replication Master Server and the Synchronization Client.
It also explains how to use the Synchronization Utility as a client.
5.2 Replication Master Server and Synchronization Client
A Replication Master Server is an Empress Server that is used for replication purposes. It resides on the RMT-Side and accesses RMTs.
Execution of a Replication Master Server is controlled by the Empress Server Administration Utility (empsvadm).
A Replication Master Server must access and control the replication operations of every RMT that participates in a replication relation. It does so by residing on the RMT-Side and accessing that RMT.
In a Replication World, only Replication Tables that are not RMTs do not need to be accessed by a Replication Master Server.
A Synchronization Client is a client of a Replication Master Server. It accesses the RRT and sends synchronization requests to a Replication Master Server.
The relation between a Replication Master Server and a Synchronization Client is shown in [Figure: Replication Master Server and Synchronization Client].
Figure 5-1
Replication Master Server and Synchronization Client
5.2.1 Connection between an RMT-Side and an RRT-Side
A connection between an RMT-Side and an RRT-Side is a network connection between:
- A Synchronization Client residing on the RRT-Side, and
- A Replication Master Server residing on the RMT-Side.
To establish this connection, the following conditions must be satisfied:
- On the RMT-Side, a Replication Master Server must be started and must be accessing the RMT.
- The Synchronization Client must be configured to access the Replication Master Server on the RMT-Side.
- The user of the Replicate Table must be authorized to access the Replication Master Server residing on the RMT-Side. This authorization is given by the Administrator of the Replication Master Server to control the access of Synchronization Clients to that server. It might require Synchronization Clients to identify themselves by sending their login name and password. (See [User Authorization].)
- A network connection between the Synchronization Utility and the Replication Master Server must exist.
- The Synchronization Client must have the privilege to access the RRT.
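The network condition in the list above can be probed from the RRT-Side before a synchronization is attempted. The following is a minimal Python sketch, assuming the Replication Master Server listens on a TCP host and port taken from the network configuration. The function name is hypothetical, and the check covers reachability only, not user authorization or access to the RMT.

```python
import socket


def server_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    This only probes network reachability. It does not speak the
    Empress protocol and does not check user authorization.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False
```

A call such as server_reachable(host, port), with the host name and port number from the network server configuration file, returns False when the server host cannot be reached, which helps separate network problems from configuration or authorization problems.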
5.2.2 Setting up a Replication Master Server
Setting up a Replication Master Server is done as follows:
1. Setting required environment variables (optional)
   These environment variables are used by the Empress Server Administration Utility. Generally, four environment variables, MSUSERAUTHCONFIGFILE, MSNETSERVERCONFIGFILE, MSNETTYPECONFIGFILE and MSCONFIGFILEPATH, should be set before starting a Replication Master Server. Their default values are given in $EMPRESSPATH/config/initfile. These environment variables are explained in [Configuring Empress Server and its Clients].
   If there is no need to change the contents of the network configuration files or to set up User Authorization security, go directly to step 3.
2. Network Configuration and User Authorization Configuration (optional, depends on step 1)
   Users can change the contents of the network server configuration file and the network type configuration file, such as the server name, host name and port number. Refer to [Network Configuration] for setting network configurations. If increased security is required by checking the username and password of administrators or users, the corresponding password file and user authorization should be created. Refer to [User Authorization] for increasing the security and for user authorization.
3. Creating an Empress Server Start Configuration File (optional)
   This step optionally specifies the database and the RMTs to be handled by a Replication Master Server. It is explained in [Starting an Empress Server].
4. Starting an Empress Server
   In this step an Empress Server is started using the Empress Server Administration Utility. This is explained in [Starting an Empress Server].
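Before starting the server, it can be useful to see which of the four environment variables are set explicitly and which fall back to their defaults. The following is a small Python sketch of such a check. The variable names come from the text above, while the function name and the reporting format are assumptions made for illustration.

```python
import os
from typing import List, Mapping

# The four environment variables named in the text.
CONFIG_VARS = (
    "MSUSERAUTHCONFIGFILE",
    "MSNETSERVERCONFIGFILE",
    "MSNETTYPECONFIGFILE",
    "MSCONFIGFILEPATH",
)


def unset_config_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of the configuration variables that are not set
    (or are set to an empty string) in the given environment."""
    return [name for name in CONFIG_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in unset_config_vars():
        # Unset variables take their defaults from $EMPRESSPATH/config/initfile.
        print(f"{name} is not set; the default value applies")
```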
5.2.3 Setting up a Synchronization Client
Setting up a Synchronization Client is done as follows:
1. Setting required environment variables (optional)
   As when setting up a Replication Master Server, four environment variables should be set for the Empress Server Utility and the Synchronization Utility on the client side. Usually the client of an Empress Server needs to set the environment variable MSNETSERVERCONFIGFILE to point to a new network server configuration file. These environment variables are explained in [Configuring Empress Server and its Clients].
2. Network Configuration and User Authorization Configuration (optional, depends on step 1)
   Usually the client of an Empress Server needs to change the contents of a network server configuration file so that the Synchronization Utility can access the Empress Server.
3. Using the Synchronization Utility or the Empress Server Administration Utility
   In this step the [Synchronization Utility] sends requests to a running Empress Server. The functions of the [Empress Server Administration Utility] can also be used to perform administrative and non-administrative operations on Empress Servers.
5.3 Synchronization Utility
The Synchronization Utility, emprepsync, is a client of a Replication Master Server and runs on the synchronization client side.
To establish the connection to a Replication Master Server, the network configuration must be set up before the utility is executed.
Refer to [Configuring Empress Server and its Clients].
The usage for synchronizing table replicate_table in database replicate_database is as follows:
$ emprepsync replicate_database replicate_table
Complete options and arguments to this utility are given in [References: Replication Synchronization Utility].
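When the utility is driven from a script, the command line shown above can be wrapped in a small helper. The following Python sketch builds exactly that command and runs it. It assumes that emprepsync is on the PATH and that an exit status of 0 means success; the actual exit codes should be taken from the utility reference. The function names are hypothetical.

```python
import subprocess
from typing import List


def build_sync_command(database: str, table: str) -> List[str]:
    """Build the argument list for: emprepsync <database> <table>."""
    return ["emprepsync", database, table]


def run_sync(database: str, table: str) -> bool:
    """Run emprepsync and report whether it exited with status 0.

    Assumption: a zero exit status indicates a successful run.
    """
    result = subprocess.run(build_sync_command(database, table))
    return result.returncode == 0
```

Calling run_sync("replicate_database", "replicate_table") is then equivalent to typing the command above at the shell prompt.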
5.4 Synchronization
Synchronization in Empress Replication is the process of updating a Replicate Table with the data of its RMT that has changed since the last successful synchronization.
The changes to the data of an RMT can be insertions of new records, and deletions and updates of existing records.
A synchronization request can be made manually, or can be automated through a facility such as "cron" in a Unix environment. The Replication Master Server sends the changed data, and the Synchronization Client then updates the replicate table based on this data.
The RRT-Side of a Replication Relation does not have to be permanently connected to the host machine running the Replication Master Server. It needs to be connected only for the duration of the synchronization.
Note that Empress Replication is an asynchronous process. At any given time, the contents of an RRT might differ from those of its RMT. Only at the end of a successful synchronization are the contents of the RRT and its RMT synchronized.
5.4.1 Table Timestamp and Recovery Timestamp
In order to discuss synchronization algorithms, the following concepts
are needed:
- Current Master Table Start Timestamp
- Table Timestamp
- Recovery Timestamp
Current Master Table Start Timestamp (CMTS) is discussed in [Replication Table Switch].
Table Timestamp (TTS) is the time at which a replicate table holds a snapshot of its master table. A replicate table with Table Timestamp TTS is consistent with its master table as of TTS. Only a successful synchronization can change the table timestamp of a replicate table.
For a master table, the table timestamp is the current timestamp, which increases with time.
When a replicate table is switched to master table, its original table timestamp is defined as the Recovery Timestamp (RTS), and the current timestamp becomes the table timestamp of the new master table. When a replicate table is explicitly or implicitly synchronized with the new master table, it automatically inherits the recovery timestamp and passes it on to its own RRTs at their next synchronization.
Generally, the Recovery Timestamp of a replication table is less than its Table Timestamp.
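The relationship between the two timestamps can be illustrated with a small model. The Python sketch below uses plain integers as timestamps and hypothetical names throughout. It covers only the TTS and RTS transitions described above, not the CMTS, and is not a description of how Empress stores these values.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TableTimestamps:
    """Timestamps of one replication table (plain integers for illustration)."""
    tts: int                   # Table Timestamp
    rts: Optional[int] = None  # Recovery Timestamp, None if no switch occurred


def switch_to_master(state: TableTimestamps, now: int) -> TableTimestamps:
    """A replicate table is switched to master table: the old table
    timestamp becomes the RTS and the current time becomes the TTS."""
    return replace(state, rts=state.tts, tts=now)


def inherit_recovery(replica: TableTimestamps, master: TableTimestamps) -> TableTimestamps:
    """A replicate table synchronizes successfully with the new master:
    it inherits the master's RTS and is consistent with it as of its TTS."""
    return replace(replica, rts=master.rts, tts=master.tts)


old_replica = TableTimestamps(tts=100)
new_master = switch_to_master(old_replica, now=150)      # rts=100, tts=150
other_replica = inherit_recovery(TableTimestamps(tts=90), new_master)
```

In this model the RTS of the new master (100) is less than its TTS (150), which matches the general rule stated above.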
5.4.1 Choosing Replication Master
For synchronization, the Synchronization Client attempts to establish a connection between the RRT-Side containing the RRT and the RMT-Side containing a "chosen" RMT. This connection is a network connection between the Synchronization Client (residing on the RRT-Side) and the Replication Master Server assigned to the "chosen" RMT (residing on the RMT-Side). Choosing an RMT for a Replicate Table RT means finding an appropriate RMT that will serve as the source of data for updating the RT.
The Synchronization Client chooses an RMT for synchronization automatically, considering the Replication Master Entries for the RT.
The process of choosing a Replication Master is as follows.
The Synchronization Utility opens the assigned replicate table and gets all of its enabled Replication Master Entries.
Then, following the Replication Master Order, the Synchronization Utility tries to establish a connection to the Replication Master Server assigned to the chosen Replication Master Entry.
If the connection is established, the Synchronization Utility sends the server the following information:
- Database name and table name of the RMT
- Host name, database name and table name of itself (the RRT)
- Original master table information, current master table start timestamp, table timestamp of the RRT and synchronization mode
After the server receives this information, it performs the following checks:
- Whether the assigned RMT exists
- Whether the RMT is in the list of replication tables of the [Replication Master Server Start Configuration File], if the Replication Master Server was started with one
- Whether the RRT is authorized to synchronize from the RMT
- Whether the RRT and the RMT are in the same replication world
If all conditions are satisfied, the RMT is chosen. Otherwise, the synchronization fails.
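The sequence of server-side checks can be pictured as a chain of early exits. The following Python sketch is an illustration only: the data structures, field names and failure messages are hypothetical stand-ins for whatever the server actually consults, and the order of the checks simply follows the list above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

TableId = Tuple[str, str]  # (database name, table name)


@dataclass
class ServerState:
    """Hypothetical in-memory stand-in for what the server can look up."""
    world_of: Dict[TableId, str]         # existing RMT -> replication world
    authorized: Dict[TableId, Set[str]]  # RMT -> RRT ids allowed to synchronize
    # None models a server started without a start configuration file.
    start_config_tables: Optional[Set[TableId]] = None


def check_request(server: ServerState, rmt: TableId,
                  rrt_id: str, rrt_world: str) -> Optional[str]:
    """Return None if the RMT can be chosen, else the reason for failure."""
    if rmt not in server.world_of:
        return "RMT does not exist"
    if server.start_config_tables is not None and rmt not in server.start_config_tables:
        return "RMT is not listed in the start configuration file"
    if rrt_id not in server.authorized.get(rmt, set()):
        return "RRT is not authorized to synchronize from the RMT"
    if server.world_of[rmt] != rrt_world:
        return "RRT and RMT are in different replication worlds"
    return None
```

Any non-None result corresponds to a failed synchronization in the description above.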
Note that the list of Replication Master Entries can be altered manually. Such alterations affect the way the Synchronization Client "chooses" an RMT.
Only RMTs that are accessed through an "enabled" Replication Master Entry can be "chosen". Of these, Empress RDBMS "chooses" the RMT that has the smallest [Replication Master Order].
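The effect of the "enabled" flag and the Replication Master Order on the choice can be sketched as a filter followed by a sort. The Python fragment below is a minimal illustration with hypothetical field names. The text does not state whether later entries are tried when the first candidate fails, so the sketch only produces the ordering and leaves that question open.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MasterEntry:
    """One Replication Master Entry (field names are hypothetical)."""
    server_name: str
    order: int      # Replication Master Order
    enabled: bool


def candidate_entries(entries: Iterable[MasterEntry]) -> List[MasterEntry]:
    """Enabled entries, sorted by ascending Replication Master Order."""
    return sorted((e for e in entries if e.enabled), key=lambda e: e.order)


entries = [
    MasterEntry("server_b", order=2, enabled=True),
    MasterEntry("server_a", order=1, enabled=False),  # disabled: never chosen
    MasterEntry("server_c", order=3, enabled=True),
]
candidates = candidate_entries(entries)
chosen = candidates[0] if candidates else None
```

Here server_a has the smallest order but is disabled, so server_b is the entry that would be chosen.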
The start of a Synchronization Request updates the Last Request Time information of the RRT. The arrival of the Synchronization Request at the RMT updates the Last Request Time information of the RMT. A successful synchronization updates the Last Successful Request Time and the Last Request Status on both the RMT-Side and the RRT-Side.
5.4.2 Synchronization Requirements
A successful synchronization has three main steps:
- "Choosing" an RMT, as described in [Choosing Replication Master].
- Establishing a connection between the Synchronization Client on the RRT-Side where the RRT resides and the Replication Master Server on the RMT-Side where the "chosen" RMT resides, as explained in [Connection between an RMT-Side and an RRT-Side].
- Synchronizing the RRT data with the data of the RMT that has changed since the last synchronization.
5.4.3 Synchronization Algorithms
After the RMT is chosen, the server selects the algorithm used to complete the synchronization by comparing the current master table start timestamp (CMTS) and the table timestamp (TTS) of the RRT and the RMT.
A flowchart of the synchronization algorithms is shown in [Figure 5-2: Flowchart for Synchronization Algorithms].
Synchronization algorithms for subset replication are not discussed in this section. They are similar to the full-set synchronization algorithms, except that SRSC must be applied for synchronization.
5.4.3.1 Algorithm 1: Forward algorithm
The Forward algorithm is applied in the following cases:
- CMTS(RRT) == CMTS(RMT) and TTS(RRT) < TTS(RMT)
- CMTS(RRT) < CMTS(RMT) and TTS(RRT) < RTS(RMT)
The Replication Master Server collects all data changed on the RMT between TTS(RRT) and TTS(RMT), i.e. it performs the following pseudo-query:
select * from RMT
where EMPRESS_TIMESTAMP > TTS (RRT)
and EMPRESS_TIMESTAMP <= TTS (RMT)
and sends the result to the Synchronization Client.
After the Synchronization Client receives the changed data, the Synchronization Utility automatically updates the replicate table and changes its TTS and synchronization status. If the CMTS of the RRT is less than the CMTS of its RMT, then its RTS and CMTS are also automatically modified.
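The timestamp window used by the Forward algorithm can be tried out on a toy table. The sketch below uses Python's built-in sqlite3 module purely as a stand-in: the table layout is invented, EMPRESS_TIMESTAMP is modelled as an ordinary integer column, and record deletions are not modelled. Only the window condition is taken from the pseudo-query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the RMT; empress_timestamp is an ordinary column here.
conn.execute(
    "CREATE TABLE rmt (id INTEGER PRIMARY KEY, val TEXT, empress_timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO rmt VALUES (?, ?, ?)",
    [(1, "unchanged", 90), (2, "updated", 120), (3, "inserted", 150)],
)


def forward_changes(conn, tts_rrt, tts_rmt):
    """Rows changed on the RMT after TTS(RRT), up to and including TTS(RMT)."""
    cur = conn.execute(
        "SELECT id, val, empress_timestamp FROM rmt "
        "WHERE empress_timestamp > ? AND empress_timestamp <= ? "
        "ORDER BY id",
        (tts_rrt, tts_rmt),
    )
    return cur.fetchall()


# The RRT was last synchronized at 100; the RMT is now at 150.
changes = forward_changes(conn, tts_rrt=100, tts_rmt=150)
```

With these sample rows, only records 2 and 3 fall inside the window, so only they would be sent to the client.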
5.4.3.2 Algorithm 2: Backward algorithm
The Backward algorithm is applied in the following case:
- CMTS(RRT) == CMTS(RMT) and TTS(RRT) > TTS(RMT)
Its behavior depends on the "force synchronization" option:
- If the "force synchronization" option is not applied, no synchronization is done.
- If the "force synchronization" option is applied, the Replication Master Server first asks the client to send back the list of changes made on the RRT between TTS(RMT) and TTS(RRT). Using this change list, it performs the following pseudo-query on the RMT:
select * from RMT
where EMPRESS_RECORD_NUMBER (or Primary key) in changed list
It thus collects the RMT data for the records in the change list and sends it to the client. After the Synchronization Client receives the data, it first physically deletes all data on the RRT with EMPRESS_TIMESTAMP between TTS(RMT) and TTS(RRT):
delete from RRT
where EMPRESS_TIMESTAMP > TTS(RMT)
and EMPRESS_TIMESTAMP < TTS(RRT)
then applies the changed data to RRT:
insert into RRT
(select * from RMT
where EMPRESS_TIMESTAMP > TTS(RMT)
and EMPRESS_TIMESTAMP < TTS(RRT))
It then rolls the RRT back to its state at TTS(RMT), and finally updates TTS(RRT) and the synchronization status.
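The forced Backward algorithm can be followed step by step on in-memory data. The Python sketch below is a simplified model under stated assumptions: tables are dictionaries keyed by record key, each row is a (timestamp, value) pair, and records deleted on the RRT inside the window are not represented. The window boundaries follow the pseudo-queries above; all names are hypothetical.

```python
from typing import Dict, Set, Tuple

Row = Tuple[int, str]   # (EMPRESS_TIMESTAMP, value), illustrative only
Table = Dict[int, Row]  # record key -> row


def change_list(rrt: Table, tts_rmt: int, tts_rrt: int) -> Set[int]:
    """Keys of RRT records changed between TTS(RMT) and TTS(RRT)."""
    return {key for key, (ts, _) in rrt.items() if tts_rmt < ts < tts_rrt}


def force_backward(rrt: Table, rmt: Table,
                   tts_rmt: int, tts_rrt: int) -> Tuple[Table, int]:
    """Return the rolled-back RRT and its new table timestamp."""
    changed = change_list(rrt, tts_rmt, tts_rrt)
    # Server side: collect the RMT versions of the changed records.
    from_master = {key: rmt[key] for key in changed if key in rmt}
    # Client side: drop the records in the window, then apply the RMT data.
    rolled_back = {key: row for key, row in rrt.items() if key not in changed}
    rolled_back.update(from_master)
    return rolled_back, tts_rmt


rrt = {1: (50, "a"), 2: (120, "b-local"), 3: (140, "c-local")}
rmt = {1: (50, "a"), 2: (80, "b")}
new_rrt, new_tts = force_backward(rrt, rmt, tts_rmt=100, tts_rrt=150)
```

In this example record 2 reverts to the RMT version and record 3, which exists only on the RRT, disappears, leaving the RRT consistent with the RMT as of timestamp 100.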
5.4.3.3 Algorithm 3: Recovery algorithm
The Recovery algorithm is applied in the following case:
- CMTS(RRT) < CMTS(RMT) and TTS(RRT) > RTS(RMT)
The Replication Master Server first asks the client to send back the list of changes made on the RRT between RTS(RMT) and TTS(RRT). It then collects from the RMT the data for the records in the change list together with the data changed between RTS(RMT) and TTS(RMT), i.e. it performs the following pseudo-query:
select * from RMT
where EMPRESS_RECORD_NUMBER (or Primary key) in changed list
or (EMPRESS_TIMESTAMP > RTS(RMT) and EMPRESS_TIMESTAMP <= TTS(RMT))
and sends the result to the client. After the Synchronization Client receives the data, it first physically deletes all data on the RRT with EMPRESS_TIMESTAMP between RTS(RMT) and TTS(RRT):
delete from RRT
where EMPRESS_TIMESTAMP > RTS(RMT)
and EMPRESS_TIMESTAMP < TTS(RRT)
then applies the changed data to RRT:
insert into RRT
(select * from RMT
where EMPRESS_TIMESTAMP > RTS(RMT)
and EMPRESS_TIMESTAMP < TTS(RMT))
Finally, it updates TTS(RRT), RTS(RRT), CMTS(RRT) and the synchronization status.
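The case conditions of the three algorithms can be collected into a single decision function. The Python sketch below is only a summary of the conditions listed in this section, with a hypothetical function name and plain numbers as timestamps. Combinations that the text does not describe, such as equal table timestamps, return None rather than a guess.

```python
from typing import Optional


def choose_algorithm(cmts_rrt: int, tts_rrt: int,
                     cmts_rmt: int, tts_rmt: int,
                     rts_rmt: Optional[int] = None) -> Optional[str]:
    """Map the timestamp comparison to 'forward', 'backward' or 'recovery'.

    Returns None for combinations that this section does not describe.
    """
    if cmts_rrt == cmts_rmt:
        if tts_rrt < tts_rmt:
            return "forward"
        if tts_rrt > tts_rmt:
            return "backward"  # a no-op unless "force synchronization" is set
        return None
    if cmts_rrt < cmts_rmt and rts_rmt is not None:
        if tts_rrt < rts_rmt:
            return "forward"
        if tts_rrt > rts_rmt:
            return "recovery"
    return None
```

For example, an RRT with CMTS 10 and TTS 130 that contacts an RMT with CMTS 20, TTS 150 and RTS 100 falls under the Recovery algorithm, because its TTS lies beyond the RTS of the RMT.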
Figure 5-2
Flowchart for Synchronization Algorithms