History Plugin
The History plugin keeps a history of records in the hub.
The main features are:
- Configurable projection of the MDM model and data changes.
  - Only specified entities and columns are stored. For instance entities, only records from the configured source systems are stored.
  - Only changes to specified columns are treated as changes to be stored.
- Independent optimized storage.
  - Historical data can be stored in a separate database (for example, with a different SLA or lower price per storage unit).
  - Data is stored as a compressed BLOB.
- Native services and batch exports, including time query support.
How to Enable the History Plugin
Previously, the history plugin was configured in the main configuration file nme-config.xml (see MDM Configuration):
<?xml version="1.0"?>
<config>
  <!-- NME configuration -->
  <plugins>
    <plugin class="com.ataccama.nme.history.NmeHistoryPlugin">
      <storageDirectory>../storage/history</storageDirectory>
      <persistenceLayer class="com.ataccama.nme.persistence.vldb.VldbPersistenceFactory">
        <dataSource>mdc_db</dataSource>
        <prefix>H_</prefix>
      </persistenceLayer>
      <configFile>nme-history.xml</configFile>
    </plugin>
    <!-- more plugins -->
  </plugins>
</config>
Attribute | Description |
---|---|
storageDirectory | Directory where events are temporarily stored before publishing to the history persistence. |
persistenceLayer | Persistence where historical data is permanently stored. |
configFile | XML file containing the definition of the history model: entities and columns. |
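Because the feature list mentions that historical data can be stored in a separate database, a variant of the plugin element might point persistenceLayer at a dedicated data source. The following is only a sketch: history_db is a hypothetical data source name that would have to be defined elsewhere in the MDM configuration, and all other values are copied from the example above.

```xml
<?xml version="1.0"?>
<config>
  <plugins>
    <plugin class="com.ataccama.nme.history.NmeHistoryPlugin">
      <!-- temporary event storage, same as in the example above -->
      <storageDirectory>../storage/history</storageDirectory>
      <persistenceLayer class="com.ataccama.nme.persistence.vldb.VldbPersistenceFactory">
        <!-- hypothetical data source dedicated to historical data -->
        <dataSource>history_db</dataSource>
        <prefix>H_</prefix>
      </persistenceLayer>
      <configFile>nme-history.xml</configFile>
    </plugin>
  </plugins>
</config>
```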
Due to the broken compatibility after refactoring (see Ataccama 15.2.0 Release Notes, section Known Issues), the history plugin configuration defined in the The issue will be resolved in 15.4.0.
After version 15.4.0, definitions in the |
Detailed Configuration of the History Model
The default configuration file with the history model is nme-history.xml, with the following structure:
<?xml version="1.0"?>
<config>
  <model>
    <entities>
      <entity name="party" layer="instance">
        <sourceSystems>
          <sourceSystem>crm</sourceSystem>
          <sourceSystem>life</sourceSystem>
        </sourceSystems>
        <columns>
          <column name="src_first_name" />
          <column name="uni_rule_name" traced="false" />
          <column name="master_id" />
          <!-- more columns -->
        </columns>
      </entity>
      <entity name="party" layer="master" masterView="mas">
        <columns>
          <column name="cmo_first_name" />
          <column name="cmo_sin" />
          <!-- more columns -->
        </columns>
      </entity>
      <!-- more entities -->
    </entities>
  </model>
</config>
The configuration identifies entities and columns from the model (see Model) and defines the history settings.
Every listed entity must be present in the model and is identified using the following attributes:
Attribute | Description |
---|---|
name | Name of the entity in the model |
layer | Type of layer: instance or master |
masterView | Name of the master layer |
sourceSystems | (Optional; applicable to the instance layer only) List of source systems whose records are historized. If empty, records from all source systems are historized. |
Entities contain a list of columns with the following attributes:
Attribute | Default | Description |
---|---|---|
name | N/A | Name of the column in the NME model |
traced | true | true: changes to this column count as record changes, so a new version of the record is created and stored. false: changes to this column are ignored; its value is stored only when another, traced column changes. |
searchable | false | true: the column is stored as a separate persistence column to support filtering (usable for permissions). false: the column is stored in a BLOB; all non-searchable columns are stored in one compressed BLOB to conserve space, but efficient filtering on the column is not possible. |

Historical records are linked to regular MDM data through the ID column: a record in the history table takes over its MDM ID from the data table. This is how both instance and master records are linked to their history.
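The sample model does not use the searchable attribute. As a sketch, an instance entity that historizes records from all source systems (no sourceSystems element) and keeps one column filterable could look like the fragment below. The column names are taken from the sample model; the choice of which column to make searchable is an assumption made purely for illustration.

```xml
<entity name="party" layer="instance">
  <!-- no sourceSystems element: records from all source systems are historized -->
  <columns>
    <!-- traced by default; stored as a separate persistence column for filtering -->
    <column name="master_id" searchable="true" />
    <!-- traced by default; stored in the shared compressed BLOB -->
    <column name="src_first_name" />
    <!-- stored with each version, but a change to it alone creates no new version -->
    <column name="uni_rule_name" traced="false" />
  </columns>
</entity>
```

Keeping searchable columns to a minimum preserves the space savings of the compressed BLOB, since each searchable column is stored separately.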
History Plugin Native Services
The History plugin provides the new native services get<entityName>InstanceHistoryById and get<entityName>MasterHistoryById.
These services return the full history of one record identified by id (the NME internal ID).
get<entityName>InstanceHistoryById
Simple request:
<get:getpartyInstanceHistoryById>
  <get:id>91212</get:id>
</get:getpartyInstanceHistoryById>
Response:
<getpartyInstanceHistoryByIdResponse xmlns="http://www.ataccama.com/ws/nme/getpartyInstanceHistoryById">
  <party>
    <metadata>
      <id>91212</id>
      <origin>life#party#party</origin>
      <sourceSystem>life</sourceSystem>
      <valid_from>2015-04-10T16:50:32+02:00</valid_from>
      <valid_to>2015-04-10T16:50:55+02:00</valid_to>
      <active>true</active>
      <sourceTimestamp>2015-04-10T16:50:54+02:00</sourceTimestamp>
    </metadata>
    <attributes>
      <source_id>300</source_id>
      <src_first_name>Jack</src_first_name>
    </attributes>
  </party>
  <party>
    <metadata>
      <id>91212</id>
      <origin>life#party#party</origin>
      <sourceSystem>life</sourceSystem>
      <valid_from>2015-04-10T16:49:28+02:00</valid_from>
      <valid_to>2015-04-10T16:50:32+02:00</valid_to>
      <active>true</active>
      <sourceTimestamp>2015-04-10T16:50:31+02:00</sourceTimestamp>
    </metadata>
    <attributes>
      <source_id>300</source_id>
      <src_first_name>Johny</src_first_name>
    </attributes>
  </party>
  <!-- more history versions of this record -->
</getpartyInstanceHistoryByIdResponse>
Instance history can be retrieved via a combination of source_id and origin instead of id:
<get:getpartyInstanceHistoryById>
  <get:source_id>300</get:source_id>
  <get:origin>life#party#party</get:origin>
</get:getpartyInstanceHistoryById>
To limit the returned history, use timeQuery with the from and/or to elements:
<get:getpartyInstanceHistoryById>
  <get:id>91212</get:id>
  <get:timeQuery>
    <get:from>2015-04-10T10:50:32+02:00</get:from>
    <get:to>2015-04-10T16:50:32+02:00</get:to>
  </get:timeQuery>
</get:getpartyInstanceHistoryById>
The service returns all history versions of the record that were valid after the from value and before the to value.
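Since from and to can be used independently, an open-ended window is also possible. As a sketch, the following request omits to and therefore asks for every version of record 91212 whose validity ends after the given instant:

```xml
<get:getpartyInstanceHistoryById>
  <get:id>91212</get:id>
  <get:timeQuery>
    <!-- no upper bound: all versions still valid after this instant -->
    <get:from>2015-04-10T16:50:32+02:00</get:from>
  </get:timeQuery>
</get:getpartyInstanceHistoryById>
```

Going by the conditions listed in How Time Validity and Time Query Work (record.validTo > query), only the Jack version from the sample response would match here, because the Johny version ends exactly at the from value.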
To get the record history at a specific point in time, use timeQuery with the at element:
<get:getpartyInstanceHistoryById>
  <get:id>91212</get:id>
  <get:timeQuery>
    <get:at>2015-04-10T10:50:32+02:00</get:at>
  </get:timeQuery>
</get:getpartyInstanceHistoryById>
The service returns a single history version, or nothing if the record did not exist at the given point in time. If there were no relevant record changes, the service also returns nothing, because the requested version is in the operational storage of NME.
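No master-layer request is shown in this section. Assuming that get<entityName>MasterHistoryById accepts the same id and timeQuery elements as the instance service (an assumption, not confirmed here), a request for the party entity might look like the following sketch. The id value is a made-up placeholder.

```xml
<get:getpartyMasterHistoryById>
  <!-- hypothetical NME internal id of a master record -->
  <get:id>1001</get:id>
  <!-- optional; assumed to behave as for the instance service -->
  <get:timeQuery>
    <get:at>2015-04-10T16:50:40+02:00</get:at>
  </get:timeQuery>
</get:getpartyMasterHistoryById>
```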
History Plugin Batch Export
When the history plugin is active, the following two data providers for batch exports are available:
- com.ataccama.nme.history.batch.HistoryInstanceEntityDataSource
- com.ataccama.nme.history.batch.HistoryMasterEntityDataSource
HistoryInstanceEntityDataSource
This data source provides history records of the specified instance entity.
Example configuration:
<dataProvider class="com.ataccama.nme.history.batch.HistoryInstanceEntityDataSource">
  <prefix>history_inst_</prefix>
  <entity>party</entity>
</dataProvider>
Attribute | Description |
---|---|
prefix | Prefix of integration inputs in the export plan using this provider |
entity | Name of the entity |
sourceSystem | Only records from this source system are provided |
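The example configuration does not use sourceSystem. As a sketch, restricting the provider to a single source system might look like the fragment below. It assumes sourceSystem is written as a child element in the same way as prefix and entity; the crm value comes from the sample history model, and the prefix value is a hypothetical name.

```xml
<dataProvider class="com.ataccama.nme.history.batch.HistoryInstanceEntityDataSource">
  <!-- hypothetical prefix for this provider's integration inputs -->
  <prefix>history_inst_crm_</prefix>
  <entity>party</entity>
  <!-- assumed element form: only history records from the crm source system -->
  <sourceSystem>crm</sourceSystem>
</dataProvider>
```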
HistoryMasterEntityDataSource
This data source provides history records of the specified master entity.
Example configuration:
<dataProvider class="com.ataccama.nme.history.batch.HistoryMasterEntityDataSource">
  <prefix>history_inst_</prefix>
  <entity>party</entity>
  <viewName>masters</viewName>
</dataProvider>
Attribute | Description |
---|---|
prefix | Prefix of integration inputs in the export plan using this provider |
entity | Name of the entity |
viewName | Name of the master layer |
Limit Export with Time Query
To limit history by timeQuery, start a batch export operation with the following parameters:
Parameter | Description |
---|---|
timeQueryFrom | Only history versions valid from this date |
timeQueryTo | Only history versions valid to this date |
The time format is either the SOAP format, 2015-04-10T16:50:32+02:00 (when invoked via the native service invokeExportOperation), or 2015-04-10 16:50:32.
Available Metadata Columns
Besides columns defined in the history model, the following columns are available in the export plan:
Column | Type | Description | Instance/Master |
---|---|---|---|
id | LONG_INT | Record ID (internal engine ID) | I, M |
eng_valid_from | DATE | Date from which the record version was valid | I, M |
eng_valid_to | DATE | Date until which the record version was valid | I, M |
eng_active | BOOLEAN | Record version activity (true/false) | I, M |
eng_origin | STRING | Record origin | I |
eng_source_system | STRING | Record source system | I |
eng_source_timestamp | DATE | Record version source timestamp | I |
How Time Validity and Time Query Work
Each record provided in the response has a time validity defined by the validFrom and validTo attributes, which are taken from the so-called commitTimestamp of RW transactions.
A newer record's validFrom is equal to the previous record's validTo, that is, record1.validTo = record2.validFrom, and the validity period is the half-open interval [validFrom, validTo).
This means that if an at query equals record1.validTo = record2.validFrom, record2 is returned.
In general, which records are returned as a response to a query is determined by the following conditions:
Type of query | Condition |
---|---|
at | record.validFrom <= query < record.validTo |
from | record.validTo > query |
to | record.validFrom <= query |
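As a worked example, take the two sample versions of record 91212 from the getpartyInstanceHistoryById response: Johny, valid in [16:49:28, 16:50:32), and Jack, valid in [16:50:32, 16:50:55). An at query placed exactly on the shared boundary satisfies validFrom <= query < validTo only for the later version, so by the conditions above it should return the Jack version:

```xml
<get:getpartyInstanceHistoryById>
  <get:id>91212</get:id>
  <get:timeQuery>
    <!-- equals the Johny version's valid_to and the Jack version's valid_from -->
    <get:at>2015-04-10T16:50:32+02:00</get:at>
  </get:timeQuery>
</get:getpartyInstanceHistoryById>
```

Moving at one second earlier, to 2015-04-10T16:50:31+02:00, would select the Johny version instead, because that instant falls inside its half-open interval.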