Event Handler

A Data Event Handler is an interface in the MDM Hub solution that captures changes in entity attributes during data processing and forwards change events to external systems over various connection channels. Change events in record data always originate from a load operation (either a batch load or an online processing request). Change events can be transformed and published with both current and old attribute values, and filtering rules can be applied to limit which events are published. Data change events originate from monitored entities across all MDM Hub model layers.

Event Handlers Configuration File

The event handlers configuration file lists all event handlers that are to be used. The file is referenced in the model configuration (see Model), and the configuration is generated to nme-event_handler.gen.xml.

<?xml version="1.0" encoding="utf-8"?>
<handlers>
  <handler class="first.desired.handler.Class" name="firstHandler">
    ... handler configuration ...
  </handler>
  <handler class="second.desired.handler.Class" name="secondHandler">
    ... handler configuration ...
  </handler>
  ...
</handlers>

Every handler can have a name; if none is given, the name of the event handler implementation class is used. However, the name must be unique: if several event handlers are used, their names must be set and must be different.

Event Handlers

EventHandlerAsync

Asynchronous event handler. During an NME operation, it stores all necessary data in the specified directory, and it starts publishing the data only after the NME operation has completed. If publishing of an event fails for any publisher, publishing stops. When publishing resumes, the event at which the failure occurred is published again by all publishers, so some publishers may distribute it two or more times. The preceding events are not published again.

<handler class="com.ataccama.nme.engine.event.handler.EventHandlerAsync">
  <processor class="processor.class"/>
  <persistenceDir>/persistence/directory</persistenceDir>
  <filter>
    ...
  </filter>
  <publishers>
    <publisher class="first.desired.publisher.Class">
      ... publisher configuration ...
    </publisher>
    <publisher class="second.desired.publisher.Class">
      ... publisher configuration ...
    </publisher>
    ...
  </publishers>
</handler>
  • persistenceDir - the path to a directory where all persisted data is stored temporarily, before it is published. This directory should not be used to store anything else and must not be shared by multiple event handlers.

  • filter - a filter (see Filters) can pre-select events that will be persisted (and, consequently, published).

  • publishers - the list of all publishers (see Publishers) which will distribute the data into various systems, depending on their nature and configuration.

Figure: Event handler publishing schema
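The handler skeleton above can be filled in as follows. This is a minimal sketch that only combines elements documented on this page: a SimpleEventProcessor (see Processors), a filter (see Filters), and a StdOutPublisher with a SimpleXmlTransformer (see Publishers and Transformers). The handler name and the persistence directory path are hypothetical placeholders, and the party entity is borrowed from the other examples on this page.

```xml
<?xml version="1.0" encoding="utf-8"?>
<handlers>
  <!-- "partyEvents" is a hypothetical handler name; it must be unique among handlers -->
  <handler name="partyEvents" class="com.ataccama.nme.engine.event.handler.EventHandlerAsync">
    <processor class="com.ataccama.nme.engine.event.handler.SimpleEventProcessor"/>
    <!-- Hypothetical path; the directory must be dedicated to this handler only -->
    <persistenceDir>/path/to/eh-persistence/partyEvents</persistenceDir>
    <!-- Persist (and therefore publish) only events from the party instance entity -->
    <filter>
      <entities>
        <entity name="party" layer="instance"/>
      </entities>
    </filter>
    <publishers>
      <!-- Prints each event as XML to standard output -->
      <publisher class="com.ataccama.nme.engine.event.handler.publishers.StdOutPublisher">
        <transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.SimpleXmlTransformer">
          <indent>true</indent>
          <includeOldValues>true</includeOldValues>
        </transformer>
      </publisher>
    </publishers>
  </handler>
</handlers>
```

Printing to standard output keeps the sketch free of external dependencies, which makes it a reasonable starting point for checking that events are captured before a real target system is configured.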

Processors

Processors consume events, each in a different way. A single Event Handler configuration defines exactly one Processor. The following Processors are available.

SimpleEventProcessor

Passes events on the input straight to the configured publishers (grouping events per operation).

<handlers>
    <handler name="default" class="com.ataccama.nme.engine.event.handler.EventHandlerAsync">
        <processor class="com.ataccama.nme.engine.event.handler.SimpleEventProcessor"/>
        ...
    </handler>
    ...
</handlers>

BatchingEventProcessor

Groups events by entity type and then passes them to publishers in small groups. The group size is set by the nme.eventHandler.publish.batchSize runtime parameter (see Runtime Parameters).

<handlers>
    <handler name="default" class="com.ataccama.nme.engine.event.handler.EventHandlerAsync">
        <processor class="com.ataccama.nme.engine.event.handler.BatchingEventProcessor"/>
        ...
    </handler>
    ...
</handlers>

GroupingEventProcessor

Groups related events. Related events are defined in a logical-model-like configuration of entities (the <roots> element contains these models). There can be several roots. Master entity-based roots can have both other master entities (<childEntities>) and instance entities (<instance name="">) as related entities. Instance entity-based roots can have other instance entities as related entities (children). Entities belonging to one group are sent to publishers together.

<handlers>
    <handler name="grouping" class="com.ataccama.nme.engine.event.handler.EventHandlerAsync">
        <processor class="com.ataccama.nme.engine.event.handler.GroupingEventProcessor">
            <roots>
                <root>
                    <master name="party" view="masters">
                        <childEntities>
                            <entity name="address" relationshipName="party"/>
                            <entity name="contact" relationshipName="party_has_contact"/>
                            <entity name="rel_party_party" relationshipName="parent child"/>
                            <entity name="id_document" relationshipName="party"/>
                            <entity name="consent" relationshipName="party"/>
                        </childEntities>
                    </master>
                </root>
            </roots>
        </processor>
        ...
    </handler>
    ...
</handlers>

Processor Behavior Comparison

The following tables show the differences in Event Processor behavior. There are two master entities with a 1:N relation (parent: party, child: contact). Assume there are also two Event Handler publishers (one plan publisher for party+contact and one party SQL publisher) and that nme.eventHandler.publish.batchSize = 1.

Table 1. Changed Entity Data Events

Party (PK: id)       | Contact (FK: party_id)
id=1, event_type="U" | /
id=2, event_type="I" | party_id=2, event_type="I"
id=3, event_type="U" | /
/                    | party_id=4, event_type="U"
/                    | party_id=5, event_type="U"

Event Processor Type: Simple

  • Description: There are 3 distinct Party events and 3 distinct Contact events. The processor calls the Plan publisher once and passes all 6 data events (3 Party + 3 Contact) in one publishing action. All events are available, therefore aggregation or joining of related records is possible.

  • Number of Executed Batches per Publisher: 1

  • Recommended Use: Aggregations; file outputs

Event Processor Type: Batching

  • Description: There are 3 distinct Party events and 3 distinct Contact events. The processor calls the Plan publisher M*N times, where M = number of entities and N = roundUp(distinct records / nme.eventHandler.publish.batchSize, 0). Here M = 2 and N = roundUp(3 / 1, 0) = 3, which gives 6 calls. Party events and Contact events are delivered to the plan publisher sequentially, therefore aggregations and joins are not possible.

  • Number of Executed Batches per Publisher: 6

  • Recommended Use: Faster events delivery (data chunks); append storage (DB)

Event Processor Type: Grouping

  • Description: There are 3 distinct Party events and 3 distinct Contact events. The Grouping Event Processor is configured with master Party as the root model entity and Contact as the related (silver) entity. The Grouping Processor uses the model relationships to send the Party record (id=2) together with the Contact record (party_id=2). All other records (Party and Contact) are sent to the Plan publisher independently (for nme.eventHandler.publish.batchSize=1) or in a small group. Here that gives 5 batches: one combined group and four single-record groups. The plan publisher therefore always receives corresponding events together; however, missing records (those that did not trigger an event) are not retrieved from the MDM Storage. See TraversingPlanPublisher.

  • Number of Executed Batches per Publisher: 5

  • Recommended Use: Grouping of related events

Processors and Publishers are two independent sub-modules of the Event Handler feature. Different Processors and Publishers can be combined; however, some combinations produce redundant configurations (for example, Grouping Processor with Traversing Publisher), in which case the developer is notified. The configuration might still be syntactically valid.

Filters

A filter is an element that pre-filters events so that only certain events are accepted.

<filter>
  <filterExpression>meta.event_type = "UPDATE"</filterExpression>
  <entities>
    <entity name="party" layer="instance">
      <filterExpression>new.src_name != old.src_name</filterExpression>
    </entity>
    <entity name="party" layer="master" masterView="MasterView" />
  </entities>
</filter>
  • filterExpression - a boolean DQC expression. If specified, only the events for which the expression evaluates to true are accepted; if omitted, all events are accepted. Only metadata columns are available here. For more information, see [Evaluation of DQC Expressions].

  • entities - the second filtering tool restricts the entities from which events are accepted (events from all other entities are filtered out). Each entity is identified by its name, its layer (instance/master), and, for master entities only, its masterView. It has to be an existing entity that has not already been filtered out earlier (usually by a filter on the parent element). Each entity can have its own filterExpression, which works exactly like the filterExpression on the filter, except that all data columns are available (old and new values of the changed records as well as metadata columns).

Delegating Publishers

A delegating publisher can be used wherever a publisher is expected, but it does not do the actual publishing. Instead, it adds another kind of functionality and uses another publisher (the delegate) for the publishing itself. The delegate can also be a delegating publisher, which extends the event handling chain by one more level.

FilteringPublisher

Adds the possibility to filter the flow of incoming events to the underlying publisher.

<publisher class="com.ataccama.nme.engine.event.handler.publishers.FilteringPublisher">
  <filter>
    ...
  </filter>
  <delegate class="delegate.publisher.Class">
    ...
  </delegate>
</publisher>
  • filter - a filter (see Filters).

  • delegate - another publisher that will be used for the actual publishing of the events that pass through the specified filter (this publisher can also be another delegating publisher).

RetryingPublisher

Adds the possibility to retry a failed attempt to publish an event. Each time publishing of an event fails, one global retry is consumed, and publishing of that event is retried after the specified retry delay. If no global retries are left, publishing fails as it would without the Retrying Publisher. In addition, after the specified number of consecutive successful publishing attempts, one additional global retry is generated. This way the reserve of global retries can grow (up to the specified maximum number of retries) to be used later.

<publisher class="com.ataccama.nme.engine.event.handler.publishers.RetryingPublisher">
  <globalRetries>5</globalRetries>
  <retryDelay>20</retryDelay>
  <numberOfConsecutiveSuccessesGrantingRetry>10</numberOfConsecutiveSuccessesGrantingRetry>
  <maximumRetries>30</maximumRetries>
  <delegate class="delegate.publisher.Class">
    ...
  </delegate>
</publisher>
  • globalRetries - the initial number of retries.

  • retryDelay - the delay after a failed attempt to publish an event before retry (in seconds).

  • numberOfConsecutiveSuccessesGrantingRetry - the number of consecutive successes that generates another retry; 0 means no retries will be generated.

  • maximumRetries - the limit of retries in reserve.
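Because a delegate can itself be a delegating publisher, the two delegating publishers can be chained. The following sketch places a FilteringPublisher in front of a RetryingPublisher, which in turn delegates to a StdOutPublisher (see Publishers) with a SimpleXmlTransformer (see Transformers). It only recombines elements documented on this page and assumes that a nested delegate is configured the same way as a top-level publisher; the filter and the retry values are illustrative, not recommended settings.

```xml
<publisher class="com.ataccama.nme.engine.event.handler.publishers.FilteringPublisher">
  <!-- Only UPDATE events of the party instance entity reach the delegate -->
  <filter>
    <filterExpression>meta.event_type = "UPDATE"</filterExpression>
    <entities>
      <entity name="party" layer="instance"/>
    </entities>
  </filter>
  <!-- The delegate is itself a delegating publisher -->
  <delegate class="com.ataccama.nme.engine.event.handler.publishers.RetryingPublisher">
    <globalRetries>5</globalRetries>
    <retryDelay>20</retryDelay>
    <numberOfConsecutiveSuccessesGrantingRetry>10</numberOfConsecutiveSuccessesGrantingRetry>
    <maximumRetries>30</maximumRetries>
    <!-- The innermost delegate does the actual publishing -->
    <delegate class="com.ataccama.nme.engine.event.handler.publishers.StdOutPublisher">
      <transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.SimpleXmlTransformer">
        <indent>true</indent>
        <includeOldValues>true</includeOldValues>
      </transformer>
    </delegate>
  </delegate>
</publisher>
```

Filtering first means that events rejected by the filter never consume a retry; reversing the order of the two delegating publishers would be equally valid syntactically.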

Publishers

These publishers do the actual publishing: they send the events in the way determined by their type and configuration.

StdOutPublisher

This publisher outputs events to StdOut. It is useful for debugging and presenting.

<publisher class="com.ataccama.nme.engine.event.handler.publishers.StdOutPublisher">
  <transformer>
    ...
  </transformer>
</publisher>

EventHttpSoapPublisher

This type of publisher constructs a SOAP message and then sends it over HTTP to a specified destination.

<publisher class="com.ataccama.nme.engine.event.handler.publishers.EventHttpSoapPublisher">
  <urlResourceName>dataChangesTargetSystem</urlResourceName>
  <soapAction>SOAP_ACTION</soapAction>
  <soapVersion>SOAP_1_1</soapVersion>
  <timeout>5000</timeout>
  <encoding>UTF-8</encoding>
  <delayBetweenRequestsMs>0</delayBetweenRequestsMs>
  <transformer>
    ...
  </transformer>
</publisher>
  • urlResourceName - name of the URL resource configured in the server runtime configuration file (see Runtime Configuration).

  • soapVersion - optional entry, defaulting to SOAP 1.1. Allowed values are SOAP_1_1 and SOAP_1_2.

  • timeout, encoding and delayBetweenRequestsMs - optional entries, with defaults listed in the example configuration.

  • transformer - this is the specification of a transformer element (see Transformers), an element that will transform the event into text (in this instance, the SOAP message).

iSMPublisher

Compiles an XML message from every event it receives. Each XML message is prefixed with its length as a 4-byte binary integer (big endian) and sent separately to the specified host and port.

<publisher class="com.ataccama.nme.engine.event.handler.publishers.IsmPublisher">
  <host>localhost</host>
  <port>8888</port>
  <targetSystem>SYSTEM</targetSystem>
</publisher>
  • host - target host.

  • port - target port.

  • targetSystem - the "targetSystem" property in the message. This property is a template (see Templates) of general nested DQC expressions (see [Evaluation of DQC Expressions]).

Example of the message:

<?xml version='1.0' encoding='UTF-8'?>
<mdc.Event targetSystem="SYSTEM">
    <party changeType="UPDATE">
        <attributes>
            <source_id>100</source_id>
            <master_id>2</master_id>
            <src_name>John Smith</src_name>
            ... more data columns ...
            <eng_active>true</eng_active>
        </attributes>
    </party>
</mdc.Event>

EventSqlPublisher

This type of publisher executes an SQL statement on a database.

<publisher class="com.ataccama.nme.engine.event.handler.publishers.EventSqlPublisher">
  <dataSource>DATA_SOURCE</dataSource>
  <templates>
    <template>
      <entity name="party" layer="master" masterView="MasterView" />
      <template>insert into TABLE values (${new.src_name}, ${old.src_name})</template>
    </template>
    ...
  </templates>
</publisher>
  • dataSource - specifies the MDM data source: a named database connection defined in server configuration.

  • templates - unlike other publishers, the SQL publisher does not use a transformer to create target text messages. Instead, a set of SQL-specific templates is used to create SQL statements directly, which considerably increases publishing performance. SQL templates have additional constraints compared to common templates (see Templates): they only allow binding DQC expressions to SQL variables and do not allow changing the SQL statement itself.

    Every template is mapped to one entity (one template per entity). If no template is defined for an entity, events for that entity are not published.

    Allowed templates:

    insert into TEST (column1, column2) values (${old.src_name}, ${new.src_name})
    
    delete from TEST where column1=${
      case (
        meta.event_type is 'INSERT', 1,
        meta.event_type is 'UPDATE', 2,
        3) }
    
    { call PACKAGE_NAME.PL_SQL_FUNCTION_NAME(${parameter1}, ${parameter2}) }

    Not allowed:

    ${'INSERT'} into TEST values (1)
    
    insert into ${old.table_name} values (1)
    
    { call ${'PACKAGE_NAME'}.${PL_SQL_FUNCTION_NAME()} }

EventPlanPublisher

This publisher sends events as records to a configured plan. Only events of the configured entities are processed. The plan is generated with one Integration Input step for each configured entity, named according to the following convention: <entity_name>.master.<view> for master entities and <entity_name>.instance for instance entities.

<publisher class="com.ataccama.nme.dqc.event.EventPlanPublisher">
    <planFileName>../engine/events/eh-test.comp</planFileName>
    <entities>
        <entity layer="master" masterView="master" name="party"/>
        <entity layer="instance" name="party"/>
        <entity layer="instance" name="address"/>
        <entity layer="instance" name="contact"/>
        <entity layer="instance" name="contact_info"/>
    </entities>
</publisher>

EventPlanPublisher propagates data in batches (this applies to the Batching and Grouping Processors). Therefore, steps that operate on all records in one processing run (such as Text File Writer, Representative Creator, or JDBC Writer with a "before" script) do not behave as expected. Data change events might get overwritten with each batch, or might not be part of the same publishing batch.

This behavior is required to support the at-least-once semantics of EventHandlerAsync and related publishers.

The Integration Input step can have the following columns:

  • meta_<metaColumnName> - see the list of columns in the [Evaluation of DQC Expressions].

  • new_<columnName> - columns of the configured entity, including eng_active.

  • old_<columnName> - columns of the configured entity, including eng_active.

TraversingPlanPublisher

Traversing Plan Publisher (seen in the XML configuration as TraversingEventAdapter) is an extension of the plan publisher that provides input to the plan by traversing a model-like structure (derived from the MDM Hub model layers). It ensures that all related records defined by the model are present in the plan. The configuration consists of defining a root entity and related entities. Master entity-based roots can have both other master entities (<childEntities>) and instance entities (<instance name="">) as related entities. Instance entity-based roots can have other instance entities as related entities. When an event containing one or more entities from the defined model is sent to the publisher, the adapter traverses all the relationships between the entities, retrieves their data, and provides the data to the corresponding Integration Inputs of the plan.

The Traversing Plan Publisher is called by the Processor for each group of related records that will always contain at least one record that had some change event (see further). The record group will always be consistent and complete based on the defined publisher’s model configuration (that is usually a subset of the MDM hub model).

With the configured adapter, the publisher configuration looks as follows:

<publisher class="com.ataccama.nme.dqc.event.EventPlanPublisher">
    <planFileName>../engine/events/tp_party_master.comp</planFileName>
    <entities>
        <entity layer="master" masterView="masters" name="party"/>
        <entity layer="master" masterView="masters" name="address"/>
        <entity layer="master" masterView="masters" name="contact"/>
        <entity layer="master" masterView="masters" name="id_document"/>
        <entity layer="master" masterView="masters" name="consent"/>
        <entity layer="instance" name="party"/>
    </entities>
    <adapter class="com.ataccama.nme.engine.event.handler.publishers.TraversingEventAdapter">
        <root>
            <master name="party" view="masters">
                <childEntities>
                    <entity name="address" relationshipName="party" view="masters"/>
                    <entity name="contact" relationshipName="party_has_contact" view="masters"/>
                    <entity name="id_document" relationshipName="party" view="masters"/>
                    <entity name="consent" relationshipName="party" view="masters"/>
                </childEntities>
                <instance name="party"/>
            </master>
        </root>
    </adapter>
</publisher>

The Integration Input step can have the following columns:

  • meta_<metaColumnName> - see the list of columns in the table in [Evaluation of DQC Expressions].

  • record_<columnName> - columns of the configured entity with current (actual) record values.

  • old_<columnName> - columns of the configured entity with old (previous) record values taken from event.

  • meta_id (long) - NME identifier of the record.

  • has_event (boolean) - this flag is true for records triggering the event, false for records preloaded into the data flow using model traversing.

Transformers

A transformer is an element that transforms an event into a string - a message that will be distributed by a publisher.

Simple XML Transformer

Transforms events to a simple predefined XML format. It neither requires nor supports any template configuration.

<transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.SimpleXmlTransformer">
  <indent>true</indent>
  <includeOldValues>true</includeOldValues>
</transformer>
  • indent - optional, if true (default) XML is pretty-printed - indented using \t.

  • includeOldValues - optional, if true (default) XML contains old values of attributes.

SimpleXmlTransformer produces the following XML:

<event>
  <metadata>
    <id>1001</id>
    <event_type>UPDATE</event_type>
    <entity>party</entity>
    ...
  </metadata>
  <attributes>
    <source_id>306</source_id>
    <src_name>John Smith</src_name>
    ...
  </attributes>
  <oldAttributes>
    <source_id>306</source_id>
    <src_name>John Smith Jr.</src_name>
    ...
  </oldAttributes>
</event>

Expression Template Transformer

A transformer based on a template.

<transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.ExpressionTemplateTransformer">
  <entity name="party" layer="master" masterView="MasterView" />
  <template>Some text with ${'DQC expressions'} inside it.</template>
</transformer>
  • entity - optional entry that declares the entity that flows into this transformer (if an event arrives from a different entity, an exception is thrown); if omitted, all entities are accepted, but no data columns are available.

  • template - the intended text with the nested DQC expressions.

  • begin - optional entry; can change the default begin mark of nested DQC expressions.

  • end - optional entry; can change the default end mark of nested DQC expressions.

Multi Transformer

Multi transformer can join several transformers, each handling one entity, to a single transformer that can handle all of them. This is the preferred way to deal with transforming several entities where the data columns contents are necessary.

<transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.MultiTransformer">
  <transformers>
    <transformer class="first.transformer.Class">
      ...
    </transformer>
    <transformer class="second.transformer.Class">
      ...
    </transformer>
    ...
  </transformers>
</transformer>
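As an illustration, the skeleton above can be filled in with two Expression Template Transformers, one per entity. This is a sketch built only from elements shown on this page; the message texts are made up, and src_name is assumed to be a STRING column of the party entity, as in the other examples.

```xml
<transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.MultiTransformer">
  <transformers>
    <!-- Handles events from the party master entity -->
    <transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.ExpressionTemplateTransformer">
      <entity name="party" layer="master" masterView="MasterView" />
      <template>Master party event: ${meta.event_type}</template>
    </transformer>
    <!-- Handles events from the party instance entity -->
    <transformer class="com.ataccama.nme.engine.event.handler.publishers.transformers.ExpressionTemplateTransformer">
      <entity name="party" layer="instance" />
      <template>Instance party name changed from ${old.src_name} to ${new.src_name}</template>
    </transformer>
  </transformers>
</transformer>
```

Declaring the entity on each inner transformer is what makes the new and old data columns available in its template.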

Templates

A template is an attribute type. It is essentially a string, but it allows nested DQC expressions, which give you the full expressive power of the DQC engine. By default, expressions are denoted by the ${ ... } notation.

<template>
  <entity name="party" layer="master" masterView="MasterView" />
  <template>This is a common string and ${'this will be evaluated as DQC expression'}</template>
</template>
  • entity - specifies the entity this template applies to. It is therefore possible to use all the entity specific data columns in the template’s expressions.

  • template - the text template itself, with nested DQC expressions (see [Evaluation of DQC Expressions]). Since the expressions are part of a string, they have to be of a STRING type. This includes converting any columns of a non-string type to string (by toString() DQC function) if necessary.
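The STRING requirement can be illustrated with a template that mixes a non-string metadata column with string data columns. This is a sketch under assumptions: the call form toString(meta.id) is assumed for the toString() function mentioned above, and src_name is assumed to be a STRING column of the party entity.

```xml
<template>
  <entity name="party" layer="instance" />
  <!-- meta.id is LONG, so it is converted to a string; src_name is assumed to be STRING already -->
  <template>Record ${toString(meta.id)}: name changed from ${old.src_name} to ${new.src_name}</template>
</template>
```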

Evaluation of DQC Expressions

There are two types of DQC expressions in event handling: general and entity specific.

  • general: this type of DQC expression only allows access to metadata of the event. These are available through dot-source meta. Available columns follow.

    name | type | content
    id | LONG | Internal record id.
    event_type | STRING | Type of event (INSERT/UPDATE/DELETE).
    entity | STRING | Name of the entity this event originated from.
    layer | STRING | The layer of the entity this event originated from (instance/master).
    master_view | STRING | Name of the entity master view this event originated from if it is a master entity; null otherwise.
    origin | STRING | The origin of the data for an instance entity; null otherwise.
    source_system | STRING | The source system of the data for an instance entity; null otherwise.
    activation_date, creation_date, deactivation_date, deletion_date, last_update_date, last_source_update_date | DAY | The date the record was activated, created, deactivated, deleted, last updated, last updated from source, respectively.
    activation_tid, creation_tid, deactivation_tid, deletion_tid, last_update_tid, last_source_update_tid | LONG | Id of the logical transaction that activated, created, deactivated, deleted, last updated, last updated from source the record, respectively.

  • entity specific: this type of DQC expression allows access to all metadata, like a general DQC expression. Furthermore, because it is entity specific, it allows access to all data columns of the record on which the change occurred, both new and old values. These are available through the dot-sources new and old. In addition to all columns of that specific entity, the internal column eng_active is also available through these dot-sources.

Non-functional Features

EventHandler stores events in persistent storage to deliver at-least-once semantics. This approach requires disk space, which should be taken into account for disk sizing and can be computed as follows:

required_disk_space = event_size x event_count.
