Distributed Architecture

Distra's distributed architecture inherently provides a replicated, scalable and fault-tolerant operating environment for enterprise-class payment applications.

The Distra platform follows a true distributed computing model: the application is deployed as a collection of software services spread across separate nodes or servers within the platform.

The platform manages two distinct types of services. Peer-to-peer services treat every instance as a peer, and their state is distributed via the reliable messaging mechanism. Primary-secondary services are replicated so that one instance is designated the primary and the others are considered ordered replicas, or slaves. Each replica is kept in constant synchronisation, and the Distra platform orchestrates failover from the primary to an available secondary instance in the event of failure.
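As an illustration only, the primary-secondary model can be sketched in a few lines of Python. This is not Distra code: the names Instance, ReplicaGroup and fail_over are hypothetical, and the sketch assumes that "ordered replicas" means the first healthy secondary in the list is the one promoted.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    healthy: bool = True


@dataclass
class ReplicaGroup:
    # Ordered list: index 0 is the primary, the rest are ordered secondaries.
    instances: list = field(default_factory=list)

    @property
    def primary(self):
        return self.instances[0]

    def fail_over(self):
        """Promote the first healthy secondary if the primary is unhealthy."""
        if self.primary.healthy:
            return self.primary
        for i, candidate in enumerate(self.instances[1:], start=1):
            if candidate.healthy:
                # Move the promoted instance to the front. The failed primary
                # stays in the group, behind it, until it recovers.
                self.instances.insert(0, self.instances.pop(i))
                return self.primary
        raise RuntimeError("no healthy secondary available")


# Hypothetical three-node group in which the primary fails.
group = ReplicaGroup([Instance("node-a"), Instance("node-b"), Instance("node-c")])
group.instances[0].healthy = False
new_primary = group.fail_over()
```

Keeping the failed instance in the group, rather than discarding it, reflects the idea that it can later rejoin as a secondary once it has recovered.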

Real-time distribution facilitates high performance via dynamic load-balancing, allowing processing to spread across all resources available to the platform.
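One common way to balance load dynamically is to send each request to the node with the fewest requests in flight. The sketch below assumes that policy purely for illustration; the platform's actual balancing strategy is not described here, and pick_node and dispatch are hypothetical names.

```python
def pick_node(load_by_node):
    """Return the least-loaded node; ties are broken by node name."""
    return min(sorted(load_by_node), key=lambda node: load_by_node[node])


def dispatch(requests, nodes):
    """Assign each request to the least-loaded node at the time it arrives."""
    load = {node: 0 for node in nodes}
    assignment = {}
    for request in requests:
        node = pick_node(load)
        load[node] += 1
        assignment[request] = node
    return assignment, load


# Hypothetical example: five requests spread over two nodes.
assignment, load = dispatch(["r1", "r2", "r3", "r4", "r5"], ["node-a", "node-b"])
```

In this simplified version requests never complete, so the counts only grow. A real balancer would also decrement a node's load when a request finishes.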

Synchronisation is managed by a process called Group Communications. Group Communications provides reliable communication between the elements of a distributed system even though the underlying components, such as hardware, operating systems, networks and databases, are inherently unreliable. In practice, the platform uses Group Communications to ensure reliable messaging and synchronisation of distributed services.

Replication ensures that each distributed instance of a service is kept in the same state; modifications to a service's shared state are guaranteed to be consistently replicated to the other instances by the Group Communications mechanism.
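A minimal sketch of this idea, assuming that consistency is achieved by delivering every update to every instance in the same sequence order. That ordering assumption, and the Replica and Group classes, are illustrative and are not taken from the Distra platform.

```python
class Replica:
    def __init__(self):
        self.state = {}
        self.applied = 0  # sequence number of the last update applied

    def apply(self, seq, key, value):
        # Refuse out-of-order updates so replicas cannot silently diverge.
        if seq != self.applied + 1:
            raise ValueError(f"expected update {self.applied + 1}, got {seq}")
        self.state[key] = value
        self.applied = seq


class Group:
    """Stand-in for a group communication layer (hypothetical)."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.next_seq = 1

    def broadcast(self, key, value):
        # Every replica receives the same update with the same sequence number.
        seq = self.next_seq
        self.next_seq += 1
        for replica in self.replicas:
            replica.apply(seq, key, value)


replicas = [Replica(), Replica(), Replica()]
group = Group(replicas)
group.broadcast("txn-1", "authorised")
group.broadcast("txn-1", "settled")
```

Because every replica applies the same updates in the same order, all three end with identical state. The sketch runs in a single process and ignores message loss, which a real group communication layer has to handle.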

Failure detection involves detecting blocking, overload and exception conditions within services and the application server itself. These may be caused by transient conditions such as network outages or blocked threads. Under these circumstances, the platform takes evasive action such as quarantining the problematic service and migrating processing to a known good replica on another instance.
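The quarantine-and-migrate behaviour can be illustrated with a simple heartbeat timeout. This is an assumed detection technique chosen for brevity, not a description of how the platform detects failures, and FailureDetector and its methods are hypothetical names. Times are passed in explicitly so the example is deterministic.

```python
class FailureDetector:
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}       # service name -> time of last heartbeat
        self.quarantined = set()  # services currently excluded from routing

    def heartbeat(self, service, now):
        self.last_seen[service] = now

    def check(self, now):
        """Quarantine every service whose last heartbeat is older than the timeout."""
        for service, seen in self.last_seen.items():
            if now - seen > self.timeout_s:
                self.quarantined.add(service)
        return set(self.quarantined)

    def route(self, replicas):
        """Return the first replica that is not quarantined."""
        for replica in replicas:
            if replica not in self.quarantined:
                return replica
        raise RuntimeError("no healthy replica available")


detector = FailureDetector(timeout_s=5)
detector.heartbeat("svc@node-a", now=0)
detector.heartbeat("svc@node-b", now=0)
detector.heartbeat("svc@node-b", now=8)  # node-a has gone silent
suspects = detector.check(now=10)
target = detector.route(["svc@node-a", "svc@node-b"])
```

Here the instance on node-a misses its heartbeats, is quarantined, and processing is routed to the replica on node-b.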

Recovery involves automatically and seamlessly returning the failed service to a steady state of operation.

 
