~~CLOSETOC~~
<html><font color=#990000 size="+2"><b>Platform Architecture</b></font></html>

The StreamScape data platform, also referred to as a reactive data fabric, facilitates the hosting of application services, manages in-memory data and provides a high-performance communications layer that allows platform components to exchange data in real-time.

The motivation behind the platform’s unified design is simple: deliver solutions faster; don’t waste time integrating solution components.  The challenge of efficient cooperative processing has become a major obstacle to effective system design.  Enterprise systems remain fragmented, with key resources often hidden behind layers of complex integration software.  The tools used by enterprise architects to integrate and automate systems have become too complex and cumbersome.  Often, the tools themselves require significant integration and development effort in order to remain useful. This limitation stifles innovation, increases development cycles and makes change management difficult.  The result is an increase in cost, sub-par solutions and brittle systems.

{{ :doc:st-integration.png?nolink |}}

The platform offers a new way of developing analytic applications that integrate fragmented data sources, allowing users to query data through a development platform that effects a so-called straight-thru integration of features and capabilities otherwise found in application servers, messaging systems and distributed caching technologies.  The product’s light-weight design and seamless integration of services, query language and application tools provides a cost-effective, high-performance alternative to similar solutions implemented using independent software components and a traditional technology stack.

Similar to the way smart-phones integrated e-mail, web browsers and telephony devices into a new application platform, the StreamScape application fabric provides a unified enterprise infrastructure for developing integrated and collaborative applications. While the technology does not change the overall system function, it has significant impact on how such systems are designed, implemented and used.  

=====Federated Architecture=====

The StreamScape application fabric implements a federated architecture design: an approach to enterprise architecture that allows for optimal interoperability and information sharing between autonomous, de-centralized information systems and applications.

As a general principle, federated architecture is an approach to coordinated sharing and exchange of information which is organized into models that describe common concepts and behavior. The approach emphasizes controlled sharing and exchange of information between autonomous components by communication via messages (events). Autonomous fabric components engage in cooperative data processing and are expected to adhere to common data models by using well-defined interfaces.

The goal of a federated architecture is to provide the highest possible autonomy in order to reduce system complexity, which in turn increases agility. The expected result is a high degree of flexibility, which allows architects to address technology problems at the local level and helps users solve complex distributed system problems more effectively.

The application fabric facilitates localized autonomy on several levels.  Security and Authorization, Configuration Repository and Data Management all make use of the federated architecture model, sharing content, data models, semantics and constraints where appropriate.  The fabric implements a state coherence engine that allows independent nodes to share state and common information and function as a flexible, unified system.

{{ :doc:federated-arch.png?nolink |}}

A federated architecture is critical to solving the problem of application change management.  Such problems often arise when a functional business must incorporate new, often non-functional IT requirements.  A federated approach is applicable to decoupling or decentralization projects and heterogeneous environments, where a central one-size-fits-all approach cannot be applied and will not solve the problem of constantly changing underlying realities.

The fabric architecture fosters managed independence between loosely-coupled cooperating components allowing individual system components (services and data applications) to be developed in an autonomous fashion and deployed into the federation.  Fabric components are able to share information and make use of a unified governance mechanism, giving administrators a reliable way to manage the distributed environment. 

 
===== Data Broker =====

At the core of the Reactive Data Platform™ is StreamScape's **<color gray>Data Broker</color>**, implemented using an [[wp>Actor model]] for [[wp>Concurrent Systems]] and built on top of our Service Event Fabric™, an event processing network or so-called event cloud. The platform architecture is a cluster of inter-connected brokers, referred to simply as engines or fabric nodes. A node is a Java-based micro kernel that supports a set of configurable network communication protocols and is capable of hosting transactional memory and application logic.  The terms fabric runtime, data engine and fabric node will be used interchangeably, referring to a single instance of an engine in the StreamScape computing environment.

Nodes may function as an independent runtime process or may be included in a Java application as a runtime library or an embedded database via JDBC, turning such applications into fully functioning fabric nodes. The runtime supports client connections using a variety of protocols such as HTTP, XMPP and TruLink™ Protocol (TLP), allowing the application fabric to be used as a more traditional messaging system. Regardless of application context, the API for accessing platform resources remains the same, allowing developers to easily move application logic between client programs and fabric runtime nodes.

The diagram below illustrates the overall architecture of the runtime environment that comprises an application engine.  Strictly speaking, all system participants are considered components and classified as one of two possible types: clients and resources.  A client is any component that can connect to the fabric and use the API to communicate with other clients or invoke operations on resources. A resource is any component that, in addition to being a client, manages data or performs a task.  Resource components do not require client interaction.  They can interact directly with each other, function as independent daemon services or be organized into automated process flows.

{{ :doc:fabric-runtime.png?nolink |}}

=====Service Event Fabric™ =====

Data exchange between application engine components occurs via the Service Event Fabric™, a self-organizing event cloud that provides adaptive peer-to-peer messaging and communications facilities. The event fabric communications layer is called an Exchange.  It is embedded within each application engine and does not require additional components or message brokers.  As such, the event fabric architecture consists of a network of light-weight messaging agents, hosted in the node’s runtime.  A centralized configuration and peer discovery mechanism allows nodes to be organized into ad hoc, user-defined communication topologies. 

An exchange transparently facilitates communication, discovery and data routing between application fabric nodes and between nodes and clients.  This allows applications to function as stand-alone engines, act as servers or form peer groups (clusters) with other fabric nodes.  Exchanges form an overlay network on top of a physical network topology.  Each participant is assigned a unique address that is used for forming communication links and for event routing and distribution. Virtual addressing facilitates reliability, allowing fabric components to survive network partitioning and engage in advanced communications when using HTTP or the TruLink™ Protocol.

{{:doc:sef.png?nolink |}}

The event fabric allows clients and components to communicate with each other via event datagrams, using Publish/Subscribe, direct Request/Reply Links for point-to-point operations, as well as Message or Process Queues for task-oriented (cooperative) communication models. Events are covered in more detail in [[event_fabric#Events vs. Messages|Event Fabric: Events vs. Messages.]]

Fabric nodes are linked together by TCP/IP based network connections that are dynamically managed by the exchange. Connections may be initiated on either side and, depending on configuration, may be set up as fault-tolerant with role preferences, meaning that which node initiates a connection may be configured as well. This is especially useful in situations where a DMZ or Firewall restriction prohibits a node from making outbound connections or accepting inbound connections.  Exchange connections are bidirectional entities in the sense that once a connection is established, regardless of who initiated it, all communication over that connection is bidirectional, with either node being able to send and receive event datagrams.

The application fabric makes use of a shared-nothing architecture.  Although service components, clients and data collection triggers may query and access resources on remote nodes, every instance keeps a private copy of all relevant security information, service configuration files and relevant state information.  Each application engine is a self-contained entity capable of managing external network connections, hosting business logic and data collections. Each engine has its own configuration repository and may be configured to support multiple protocol acceptors, also referred to as access points within the context of the application fabric.

The event fabric is an Event Stream Processing platform; the most basic event processor unit within the fabric being the participant component (Component).  All components are derived from the Fabric Event Dispatcher (FED), which is the basic building block of all network communication.  In fact, the exchange is simply a special version of the dispatcher that knows how to discover and communicate with other exchanges over the network. Every application fabric component essentially functions like an event broker.  Events may arrive into the component dispatcher in a synchronous or asynchronous fashion, and it is the responsibility of the FED to match the events to the appropriate methods and functions of a component.  See [[Chapter 2: The Fabric Event Dispatcher]] for additional details on the event dispatcher.

In the application fabric all component interactions use message (event) passing as a means of communication, isolating business logic and data resources from direct programmer access.  Isolation guarantees that components interact in a completely location-transparent fashion.  Participants may access any service or data collection within the application fabric and treat it as a local computing resource without regard to its physical location.  Components are either event producers, event consumers or event observers, establishing a simple, powerful interaction model and a reliable way to determine the role and intent of system participants.  For additional information on working with real-time participant views see [[Chapter 5: Moderator Interface]].

{{ :doc:node-replication.png?nolink |}}

The event fabric facilitates state coherence and synchronization of common data elements across application engines.  In the broader architecture, engine instances that engage in data processing, as opposed to those that perform administrative functions, are referred to as processor nodes.  Multiple processor nodes can be joined together to form a so-called sysplex (system complex) that determines a distributed computing domain.  This determination is made by quorum thru the establishment of at least 2 nodes that are linked together and have been configured with the same domain name.  The sysplex grows organically by adding other nodes to the domain and instructing the sysplex to accept new members.  Upon joining the domain a fabric node will automatically synchronize all relevant security information, global variables and shared configuration elements with its repository and become part of the sysplex.  Further modifications to the repository will be checked against the domain’s authorization and replicated across the sysplex.  Domains and the Sysplex are covered extensively in the following section: Chapter 2: Sysplex.

====The Event Cloud==== 

The application fabric is a scalable event cloud capable of hosting business logic and data. Fabric participants exchange structured data by sending and receiving discrete message units called events using the TruLink™ Protocol in a location-transparent manner.  Unlike conventional messaging systems, events do not require a user to define communication channels or mail box constructs such as topics, queues or subjects.  The underlying messaging layer is abstracted in favor of a simpler, data-oriented interface.

Data consumers register interest in an event based on its content and structure.  Data producers advertise event availability by creating (raising) events with discrete content identifiers.  Participants need not be aware of each other’s location or address. The fabric handles all aspects of communication and does not require developers to create or maintain any communication abstractions other than event identifiers. The cloud’s structure is self-organizing allowing developers to optimize the network topology based on data distribution needs.  

{{:doc:event-fab.png?nolink |}}

When the engine’s runtime is initialized the exchange uses its designated strategy to discover the other nodes in the fabric and form dynamic communication links with them.  

Developers may configure the default discovery module provided with the software or develop their own.  This allows the topology to evolve in a user-controlled fashion: it may organize around a central lookup mechanism, or be partitioned by function, value or a distributed hash table.

As the diagram illustrates, there are no limits on what the topology can be. Fabric nodes may be configured to enforce user defined network paths performing traffic shaping in order to guarantee the best performance, optimize data distribution, improve latency or simply conserve bandwidth.  

Regardless of the topology all exchanges share routing and participant information with each other. System members engage in a moderated exchange of structured data; with the Exchange Moderator providing a passive governance mechanism that collects and presents information about system participants.  Components may query the active state of the fabric and dynamically discover other members in the cloud as well as their roles, status and the event types they are using. See [[Chapter 5: Moderator Interface]] for additional details.

=====Application Dataspaces™=====

The application engine offers a set of robust facilities for hosting data collections called Application Dataspaces™. Dataspace collections group multiple data elements of similar format into a single entity such as a table, queue, array or map.  A dataspace is a scalable, distributed general-purpose data storage system capable of storing structured, semi-structured or binary (blob) data and exposing such data as service engine resources to any client or component, including other data space collections.

{{ :doc:adataspace.png?nolink|}}

A data space is a hybrid in-memory data store capable of holding data, decomposed objects, events or serialized object sets depending on the collection’s definition. Developers may use memory in a flexible fashion according to application needs, choosing from several models including memory, logged or persistent data.

Data spaces support several collection types including tables, queues, maps, arrays and files and allow transactional modification of data, providing access via the collections API or industry standard SQL queries.

Within the engine runtime, data spaces appear as independent resource components, each capable of acting as event consumer and event producer.  Participants interact with data spaces via a data collections or event passing API, or by using the Data Space Query Language (DSQL), an extension of standard SQL that allows users to access non-tabular entities such as queues, arrays or persisted objects that have been annotated (indexed).
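
As an illustration, a DSQL query over the standard-SQL subset might look as follows.  The collection and column names here are hypothetical, and the exact syntax for addressing non-tabular collections is defined by the product’s query reference:

<code dsql>
-- Hypothetical collections: a table queried with standard SQL.
SELECT order_id, amount
FROM   Orders
WHERE  amount > 1000;

-- Per the text, a non-tabular collection such as a queue can be
-- queried with the same SELECT form once exposed by the engine.
SELECT * FROM OrderQueue;
</code>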

====Structured Data Collections====

As a general computing principle, all data are structured in the sense that all data must be organized in some discrete fashion in order to be processed by an application.  Data typically belongs to one of three categories.  Structured data usually implies a tabular format, wherein data are organized into relational tables or name/value pairs sometimes referred to as tuples. Semi-structured data refers to non-tabular, self-describing formats such as XML or JSON.  Unstructured data commonly refers to file system content or binary media files. However, the latter term is misleading since all data must have some type of structure and provide a way to identify and group similar types together.
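
For example, the same employee record could take each of the three forms; the values below are purely illustrative:

<code>
Structured (tuple):      (emp_id: 1042, first_name: "Ann", dept: "Sales")
Semi-structured (JSON):  { "employee": { "emp_id": 1042, "first_name": "Ann" } }
Unstructured (binary):   /shared/photos/emp-1042.jpg  (file content, referenced rather than parsed)
</code>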

Data spaces allow users to store and reference all three types of data with the added benefit of providing a simple, object oriented API for accessing the data collections.  Structured data and their definitions are stored as data space cache files.  Semi-structured data elements are stored as text or binary objects within the data space cache files; whereas external files are defined as file tables and only references to the external entities are stored.

====Memory Usage====

Data collections may be configured to reside entirely in-memory allowing for fast, low latency data access.  In-memory collections are volatile and their content is lost when the runtime engine hosting the data space is shut down. This option provides the best possible performance at the expense of reliability.  Although the runtime monitors memory utilization and may be configured to raise advisories or suspend operations when a threshold is exceeded, in-memory collections provide no guarantees against reaching memory limits.

Alternatively, collections may be declared as logged.  In this case collection information is stored in memory but all data modifications are logged to disk.  Logged collections are re-loaded from disk when the data space is opened.  Logging slows down data operations but provides a significant level of reliability in the event of a failure.  Recovery logs may be placed on any disk device including memory mapped devices, further enhancing cache performance. 

Users may also define persistent data collections whose behavior is closer to that of a conventional database.  Persistent collections are disk based, however frequently accessed data are retained in cache.  Persistent data collections allow users to limit memory usage providing a number of performance tuning options.  Persistent collections allow data spaces to grow beyond the JVM memory footprint, supporting storage in excess of 100 GB.

====Managing Data Relationships====

{{ :doc:data-relationsips.png?nolink|}}

Relationships between data collections may be specified or inferred in a variety of ways.  For structured data such as tables, maps and arrays the user may define static relationships using familiar SQL constructs such as primary and foreign key pairs.  Collections that contain semi-structured or binary data may be indexed to expose data elements as keys while those that reference external files or un-structured data (for instance queues) can be exported as views or be treated as ‘indexed’ entities.
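
A static relationship of this kind follows familiar SQL data-definition conventions.  A minimal sketch, assuming hypothetical table names and a standard-SQL DDL subset:

<code dsql>
-- Hypothetical collections; the primary/foreign key pair links the two tables.
CREATE TABLE Departments (
    dept_id  INTEGER PRIMARY KEY,
    name     VARCHAR(64)
);

CREATE TABLE Employees (
    emp_id   INTEGER PRIMARY KEY,
    dept_id  INTEGER REFERENCES Departments(dept_id)
);
</code>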

Alternatively, dynamic relationships between data collections can be declared by using event triggers.  Unlike traditional databases, use of event triggers to model data relationships is encouraged. Trigger meta-data can be queried, allowing users to discover data relationships.  Event triggers can reference collections in multiple data spaces and even those in other fabric nodes by raising data modification events in asynchronous or transacted fashion. For additional information on event triggers and their capabilities see [[Chapter 9. Event Triggers]].

Data spaces allow developers to organize multiple, disparate data elements into a federated data model and provide a way to query and modify such data using the collections API or standards-compliant SQL.  Data space technology solves a critical problem in object oriented programming by allowing developers to infer relationships between objects and collections at runtime without the overhead and complexity of an object database or additional object-relational mapping technologies.  Application Data Spaces are an alternative data management system designed for storing and processing large amounts of transient application data.  They are complementary to conventional database systems.  For more detailed information see Chapter 8. Application Data Spaces™.

====Data Space Events====

The application engine integrates event-driven computing facilities into structured data management by allowing data space collections to act as event producers and consumers.  Collections may be configured to receive and store events raised by fabric components. Depending on application requirements the events may be stored as binary data or decomposed into annotated (indexed), semi-structured objects and organized according to a collection’s data model.  The fabric’s object mediation framework handles all aspects of data marshaling and serialization.  For information on the framework’s concepts and API see Chapter 6. Object Mediation Framework.

Collections that are defined as event consumers may be semantically constrained to receive events only with a specific event id.  Declaring a constraint implicitly means that only data elements of similar structure will be stored within a given collection.  Constrained collections may further use selectors to filter the types of events they receive by their content.  Constrained collections will reject any events that do not match the identifier.  

Data modifications on a collection produce actionable events allowing users to react to changes by ‘publishing’ the deltas to observer components and event consumers or by effecting modifications in other collections.  Changes are captured and processed by declaring event triggers on a given collection.  The type of published events and their content depends on a collection’s data model.  Users may define event selectors on observable data in order to create event streams based on specific content or type of data modification.  Event triggers may raise advisories, exceptions, standard data events or delta events containing before and after images of modified elements.

Using event triggers, changes to a collection can easily be replicated across the application fabric or be made subject to transactional dependency on other components (when declared as synchronous and transactional), injecting external dependency into data modification operations. See Chapter 9: Inversion of Transaction Control for additional information.

====Data Annotation====

Annotations (object indexing) provide a way to extract values from Java objects or XML documents by supplying a reference path to an element or field and allow the data to be used as searchable arguments by event selectors and the query engine. Note that although data spaces can store and query objects, they are not intended for use as an object database or object-relational mapping technology.  Indexing allows users to query objects by their annotation elements and infer relationships between instances without decomposing objects into relational data, resulting in significantly faster performance and simpler application development. 

Data annotations implement an XPATH-like syntax to specify a semantic data reference path (SDR) to object or document elements.  For example, the reference 

<code dsql>
//Employees[3]/employee/first_name 
</code>

points to the first name of the 3rd employee in the Employees Java collection.

When an annotation is declared on an event’s payload the value is automatically extracted and attached to the event datagram as a property.  Annotated events that are declared as data space collection constraints allow users to export annotated fields as searchable columns or indexes.  Indexing allows developers to treat objects and documents as queryable entities as well as establish relationships between potentially unstructured data.  For additional information see [[Chapter 2: Event Annotations]].

=====Application Service Hosting=====

The application engine supports Service Oriented Architecture and provides facilities for hosting application logic components. Users may register any plain old Java object (POJO) as a service or use the open service framework API to develop services that take advantage of the application fabric’s event processing and data storage facilities. 

{{ :doc:aservice.png?nolink|}}

Service programs are ‘wrapped’ into the application fabric by their service container context.  The context provides event dispatching and dynamic method invocation facilities allowing event data to be mapped to class methods.  Users can configure Java classes to function as services by exposing existing methods as event handlers. Services can support both synchronous (direct) and asynchronous method invocation. 
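
Because any plain old Java object can be registered as a service, the service logic itself needs no product-specific code.  Below is a minimal sketch; the class and method names are purely illustrative, and the mapping of the method to an event handler would be declared in the service container configuration rather than in the class:

<code java>
// An ordinary POJO; per the text, an existing method such as applyDiscount
// could be exposed as an event handler through container configuration,
// with the dispatcher mapping incoming event data to the method arguments.
public class PriceService {

    /** Returns the price after applying a percentage discount. */
    public double applyDiscount(double price, double percent) {
        return price - (price * percent / 100.0);
    }
}
</code>

Keeping the class free of framework types is what allows the same logic to run unchanged inside the fabric or in an ordinary Java program.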

The open service framework includes service life cycle, event and exception management facilities and supports external connection factories, metrics and state advisories often required of more complex systems.  See [[Chapter 7. Open Service Framework]] for additional information.

====Event Handlers====

Service logic is accessed by invoking a service’s event handlers.  An event handler is used to associate a service method with the event used to invoke it.  Results of method invocation are raised by the service dispatcher as actionable events that may be presented to observer applications thru the use of event triggers.  A service bean may have several event handlers configured, thereby exposing multiple methods of a service class as event consumers.  Similarly, any number of event triggers may be defined on the resulting method invocations.  Triggers allow developers to declare result filtering logic by using SQL-like syntax, thereby turning service call results into event streams that may be processed by other, down-stream consumers.  [[Chapter 2: Event Handlers]] provides additional information on the subject.

====Service Events====
Results of service method invocation are raised as actionable events.  Using event triggers developers may simply pass the results to other event consumers or raise new events, advisories and exceptions in reaction to service logic calls.  The open service framework API provides access to a service’s Fabric Event Dispatcher allowing developers to take full advantage of the fabric’s messaging and event processing capabilities.  

Service events offer fine-grained control over event scope providing a declarative way to limit event visibility by other fabric components.  Actionable events are part of a service’s meta-data and may be searched for and queried by system participants from anywhere in the environment.  When event triggers publish service events they effectively re-raise actionable events with a new scope, thus becoming event producers.  

Specifying a WHEN clause in an event trigger allows users to create multiple content-driven event streams from a single actionable event.  Users may take advantage of system trigger types, such as Event Publisher, Logger, Auditor, Exception Handler and Acknowledgement. Additionally, user-defined types may be developed that perform more specific functions.  The trigger mechanism is a powerful tool for filtering, enriching and processing events generated by service components, and allows individual services to be organized into event flows in order to facilitate high-performance pipe-line processing.  See [[Chapter 9. Event Triggers]] for additional information.
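
As a hypothetical sketch of the idea, two triggers defined on the same actionable event could each carry a WHEN clause with an SQL-like selector expression; the field names and exact declaration syntax below are illustrative, not the product’s literal trigger grammar:

<code dsql>
-- Trigger 1: publish large successful results as their own event stream.
WHEN status = 'OK' AND amount > 10000

-- Trigger 2: route failures to an Exception Handler system trigger.
WHEN status = 'FAILED'
</code>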

=====Application Fabric Clients=====
 
The application fabric provides several ways to connect to the distributed computing environment, allowing client applications to exchange data with each other, query and manipulate information in the data space and perform administrative tasks.  The service engine supports a variety of protocol acceptors, also referred to as access points, that can be configured to allow network access.   An engine instance can have multiple protocol acceptors defined, allowing users to create routed topologies that can support secure Firewall and DMZ configurations.

Clients are implemented in the runtime context as client components and are functionally the same as service and data space components.  A client component is a proxy representing a connection.  Like other components it is based on the fabric event dispatcher and capable of raising events and requests or subscribing to events from other event producers. Using the administrative interface or the moderator API, developers can query client connection information such as protocol and source IP address, as well as inspect the resources used by clients, such as which event ids a client is using and whether it is producing or consuming the events.  For more information on the moderator see [[Chapter 5. Exchange Moderator]].

====Protocol Support====

Several client protocols are supported by the application fabric allowing a variety of collaborative tools and applications to access fabric resources and communicate with system participants. The core protocol used internally by service engines and clients is the TruLink Protocol™ (TLP), a proprietary protocol for structured data exchange developed by StreamScape Technologies. Additional protocol support is implemented as tunneled proxies over TLP, meaning that external (client facing) protocols such as HTTP or XMPP can establish connections and get access to all the capabilities of a TLP session.  See Chapter 4. Service Event Fabric™ for more information.

====HTTP Client Protocol====

The engine supports two forms of HTTP protocol communication.  The standard REST based exchange allows users to interact with services, clients and other fabric resources by using basic POST and GET requests and navigate the configuration repository.  Light-weight session management based on leased tokens allows REST clients to engage in secure HTTP based data exchange.

The HTTP Acceptor also supports a full-featured JavaScript client for HTTP streaming. The acceptor uses popular Comet Server techniques to provide a cross-browser client that supports the AJAX programming model, allowing Web applications to engage in reliable and secure, session-based exchange of events using asynchronous communication or request/reply. Cross-domain browser access and routed links are supported as well as most of the standard fabric client features.  Both forms of communication support XML and JSON based data exchange, allowing users to work with complex data structures and make use of the engine’s object mediation facilities.

====XMPP Client Protocol====

The fabric provides support for XMPP (Jabber) allowing popular Instant Messenger applications and XMPP clients to connect to the application platform and interact with each other, services and fabric resources using IM messages.  Components may also use XMPP presence to advertise state and availability.

====TLP Client Protocol====

The TruLink Protocol™ is a full-featured, high-performance client library that provides the same object mediation facilities found in the fabric runtime, allowing client applications to define semantic types and event prototypes and work with binary, XML and JSON data structures.  TLP clients may access fabric resources and may be opened as networked or in-memory connections from within the runtime.

=====The Application Engine Sysplex=====

The application fabric implements a distributed system architecture that allows users to partition tasks and workloads between peer nodes.  Peers are equally privileged, equipotent system participants. They are said to form a peer-to-peer (P2P) network of nodes; a so-called System Complex or simply a sysplex.

{{ :doc:sysplex.png?nolink|}}

The sysplex is implemented as a hybrid P2P network and supports infrastructure nodes called Management Nodes that assist with routing and management of system components.


A hybrid P2P network implies:

  * support for network clients
  * structured bootstrapping
  * advanced routing
  * overlay network addressing


The fabric’s architecture is considered to be a structured and self-organizing peer system, employing a globally consistent protocol to ensure that participants can efficiently access their resources.

====Processor Nodes====

Sysplex nodes are categorized by functionality. Processor nodes, also referred to as task nodes or simply t-nodes, are intended for hosting service logic and data collections. In the overall architecture, task nodes are considered resource containers, with services, data collections and application artifacts (such as web pages, event or object definitions) being considered resources. Event flows are assembled from service and data collection interactions and may span multiple resource containers. As such, event flows are not considered physical resources.

Processor nodes may be launched independently as system processes, with each node being an application that runs within a Java Virtual Machine. Alternatively, Java applications can embed the engine runtime as a library and access its functionality through the service engine API. In this case the application essentially becomes a task node within the sysplex.

A sysplex consists of participant nodes that share security, entitlements and routing information.  It is identified by a unique domain name and requires that all nodes within the domain also have unique names, regardless of their role.  Sysplex participant nodes have to be registered with the fabric’s directory services and may be organized into a variety of topologies either by broadcast-based discovery, static routes or user-defined modules.

The application fabric implements a shared-nothing approach to data management, replicating critical system information, global variables and configuration artifacts between participant nodes. Members that become sysplex entities lose their individual security and entitlement information and assume those of the domain, ensuring that access control to components and fabric resources is not compromised. See [[Chapter 2: Security and Authorization]] for more information.

====Management Nodes====

Management nodes provide facilities for administration and deployment of processor nodes allowing users to package and remotely administer resource containers, fabric components, authorization and system entitlements.  A management node may take on the role of a process manager, a router, a lead node that defines the domain or any combination of these functions. 

A management node does not usually host business services or data collections and typically remains running unless the host machine is shut down, making it an optimal candidate for lead node.  In situations where an ad-hoc topology is required, management nodes can be used for establishing routes between groups of nodes, providing a way to perform traffic shaping in order to guarantee the best performance, optimize data distribution, improve latency or simply conserve bandwidth.

====Sysplex Membership and Discovery====

Although the domain is a physical grouping of related nodes, the actual sysplex is a virtual concept in the sense that it only exists if two or more nodes with the same domain name can discover and establish fabric links with each other. As such, the sysplex exists until the last node of the domain is shut down. For replication conflict resolution the longest-running node is considered the ‘peer leader’. Its system information is considered the most accurate, overriding that of other nodes that may join the sysplex at a later time.

As previously stated, sysplex participants share security, entitlement and routing information, and registered nodes may be organized into a variety of topologies through broadcast-based discovery, static routes or user-defined modules. The fabric provides a default discovery mechanism based on a shared file directory and allows users to develop their own discovery modules using an API. On start-up a node loads information about neighboring nodes and preferred sysplex access points, allowing it to establish or solicit fabric links.
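For illustration, a custom discovery module might read neighbor entries from a file in the shared directory. The ''name=host:port'' line format below is a hypothetical example, not the fabric's actual discovery-file syntax:

```java
import java.util.ArrayList;
import java.util.List;

public class DiscoveryFile {
    record Neighbor(String name, String host, int port) {}

    // Parses neighbor entries of the form "name=host:port".
    // The line format is an illustrative assumption only.
    // Blank lines and '#' comments are skipped.
    static List<Neighbor> parse(List<String> lines) {
        List<Neighbor> out = new ArrayList<>();
        for (String line : lines) {
            String s = line.trim();
            if (s.isEmpty() || s.startsWith("#")) continue;
            String[] kv = s.split("=", 2);           // name | host:port
            String[] hp = kv[1].split(":", 2);       // host | port
            out.add(new Neighbor(kv[0], hp[0], Integer.parseInt(hp[1])));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Neighbor> n = parse(List.of("# neighbors", "node-b=10.0.0.5:7400"));
        System.out.println(n.get(0).name() + " -> " + n.get(0).host());
    }
}
```

A real discovery module registered through the fabric API would return such entries to the runtime so the node can establish or solicit links to its neighbors.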

Links between nodes are established in one of two ways. A node that initiates a link actively attempts to connect to a neighbor (peer) based on information in the directory that indicates a link target. Any node may initiate a link, thereby allowing sysplex members to engage in targeted and directional communications. This is useful in situations where DMZ or firewall restrictions may prohibit the opening of connections to/from a participant. Alternatively, a link may be solicited by a participant, instructing the target node to initiate the connection. Such information tends to be ‘sticky’ within a target node in the sense that if a connection is broken, perhaps due to a network failure or a node crash, the node that was asked to solicit the link will attempt to re-connect to its target. Soliciting is an internal operation and is controlled by specifying a combination of source/target nodes and their intended behavior in the event of a communications failure. All links are bi-directional, allowing full-duplex communications between participants once they are formed. See [[Chapter 3: Sysplex Configuration]] for more information.

====Full Mesh vs. Directed Graph====

{{ :doc:fab-full-mesh.png?nolink|}}

In networking parlance, a full-mesh topology implies that every node within the sysplex has a direct connection to all other nodes. Full mesh enables the most direct communication, allowing components to engage in point-to-point data exchange by opening the shortest communication path ([[wp>OSPF]]) between participants.

While this architecture is effective in small and medium-sized systems, allowing for the fastest communications with the least latency, it may be sub-optimal in large distributed systems due to network overhead generated by the large number of direct communication links.  Systems that do not have many cross-communicating components should not deploy full-mesh links as they are likely to be under-utilized.     
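The overhead concern can be quantified: a full mesh of n nodes maintains n*(n-1)/2 bi-directional links, so the link count grows quadratically with sysplex size. A quick sketch:

```java
public class MeshLinks {
    // Number of bi-directional links in a full mesh of n nodes: n*(n-1)/2.
    static long fullMeshLinks(int n) {
        return (long) n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        // Link count grows quadratically with sysplex size.
        for (int n : new int[] {4, 10, 50, 200}) {
            System.out.println(n + " nodes -> " + fullMeshLinks(n) + " links");
        }
    }
}
```

At 10 nodes a full mesh needs 45 links; at 200 nodes it needs 19,900, which is why large sysplexes favor routed topologies.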

Alternatively, the sysplex can be configured to implement a directed graph topology.  Directed graphs let users define routed component links across the sysplex.  This allows the fabric components to exchange data with each other using static, user-defined communication paths where event datagrams potentially pass through intermediate nodes on the way to their destination.

By definition, an edge in a directed graph is a communication path between two participants wherein data can only flow in one direction. However, fabric links may be used in bi-directional fashion, allowing fabric nodes to form duplex communication paths. Hence the topology definition refers to inter-component communications.
  
Static link definitions may result in so-called cyclic directed graphs, wherein multiple paths between fabric nodes form a loop. The fabric event dispatcher implements overlay network addressing that assigns a unique address to each component in the sysplex. Overlay addressing allows components to establish communication links in an optimal fashion, eliminating inefficiencies such as cyclic graphs or redundant paths.
{{ :doc:fab-full-digraph.png?nolink|}}
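The kind of loop the dispatcher must avoid can be detected with a standard depth-first search over the static link definitions. The following self-contained sketch is generic graph code, not the fabric's internal algorithm:

```java
import java.util.*;

public class CycleCheck {
    // Detects a cycle in a directed graph of node links using a
    // depth-first search with three states (unvisited / in-progress /
    // done). A back edge to an in-progress node means the static
    // link definitions form a loop.
    static boolean hasCycle(Map<String, List<String>> links) {
        Map<String, Integer> state = new HashMap<>(); // 0=new, 1=visiting, 2=done
        for (String node : links.keySet()) {
            if (dfs(node, links, state)) return true;
        }
        return false;
    }

    private static boolean dfs(String node, Map<String, List<String>> links,
                               Map<String, Integer> state) {
        int s = state.getOrDefault(node, 0);
        if (s == 1) return true;   // back edge: cycle found
        if (s == 2) return false;  // already fully explored
        state.put(node, 1);
        for (String next : links.getOrDefault(node, List.of())) {
            if (dfs(next, links, state)) return true;
        }
        state.put(node, 2);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> routes = Map.of(
                "A", List.of("B"),
                "B", List.of("C"),
                "C", List.of("A"));  // A -> B -> C -> A forms a loop
        System.out.println(hasCycle(routes)); // true
    }
}
```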
====Sysplex Partitioning====

A properly functioning sysplex depends on a reliable network to guarantee infrastructure availability and ensure that system information and meta-data are properly replicated between nodes.  Global consistency is achieved by organizing the sysplex nodes into a tree-like structure with one of the nodes assuming the role of a peer leader, also referred to as the root node.  The fabric protocol implements a so-called FAIR rule to determine the root node.  FAIR implies that among the sysplex nodes, First Available Is Root and all other nodes synchronize their system state with the root node.  Stability of the sysplex therefore depends on all nodes being able to see the root.
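In simplified form, the FAIR rule reduces to electing the earliest-started (longest-running) node as root. A minimal sketch, assuming each node advertises its start timestamp; the actual protocol negotiates this over fabric links rather than from a static map:

```java
import java.util.Map;

public class FairRule {
    // FAIR: First Available Is Root. Given node start times, the
    // longest-running (earliest-started) node becomes the peer leader.
    static String electRoot(Map<String, Long> startTimes) {
        return startTimes.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow();
    }

    public static void main(String[] args) {
        Map<String, Long> nodes = Map.of(
                "node-a", 1000L,   // started first
                "node-b", 2500L,
                "node-c", 1800L);
        System.out.println(electRoot(nodes)); // node-a
    }
}
```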

In situations where a network becomes unstable, sysplex participants may lose their connection to the root node, resulting in sysplex partitioning wherein a group of nodes may continue to function independently and see each other but have no way to see the root node. Although losing access to a root node is rare, even the most stable networks and virtual environments experience network outages that may lead to partitioning. Fabric links provide reliability and fault tolerance by soliciting links and by performing network scavenger operations at configurable intervals in an attempt to automatically repair the sysplex and re-synchronize state.

When defining the topology of a given domain, developers and architects should always take into consideration possible sysplex partitioning, data exchange patterns between components, physical location of participant nodes and the network topology of an enterprise in order to ensure an optimal and reliable communications network. Sysplex nodes are designed to recover from network outages and dynamically re-partition when network connectivity is re-established. However, the application engine provides a number of configuration settings that allow users to specify each node's behavior in response to a network outage on an individual basis. For additional information see [[Chapter 3: Sysplex Configuration]].
 
====Event Identity Management====
 
Application fabric components exchange information by producing and consuming events. Unlike a message, which is created by a sender and destroyed by a receiver, an event may be a long-lived entity that is sequentially acted on by multiple components without being altered, essentially passing through participants and preserving the chain of causality. Participants may act on event data and re-transmit events, resulting in event flows that may be tracked and visualized. The benefit of this approach is that it yields accurate, self-documenting data distribution graphs without the need for additional technologies or impact to the overall process.

The application fabric allows users to raise events based on service invocations or data modification and organize them into discrete event flows. Groups of events often represent a specific business function or process. Such events are typically grouped into partially ordered sets (or posets) and may need to be matched to the transactions that produce them, routed by common criteria or correlated to other event flows.

Event identity management provides developers with the tools for identifying and correlating posets within discrete event flows, allowing flows and their data content to be related to other mission-critical information across enterprise systems and applications. The service application engine provides a configurable way to manage event identity and track events as they move through the application fabric. For additional details on event identity see [[Chapter 2: Event Identity Manager]], [[Chapter 2: EIM Plug-ins]] and [[Chapter 2: EIM Properties]].
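As an illustration of poset correlation, the sketch below groups an event stream into discrete flows keyed by a correlation id and partially orders each flow by a sequence number. The field names (''correlationId'', ''sequence'') are assumptions for this example, not the EIM API:

```java
import java.util.*;

public class EventCorrelation {
    record Event(String id, String correlationId, long sequence) {}

    // Groups a stream of events into flows keyed by correlation id,
    // then orders each flow by the event's sequence number.
    static Map<String, List<Event>> correlate(List<Event> stream) {
        Map<String, List<Event>> flows = new LinkedHashMap<>();
        for (Event e : stream) {
            flows.computeIfAbsent(e.correlationId(), k -> new ArrayList<>()).add(e);
        }
        for (List<Event> flow : flows.values()) {
            flow.sort(Comparator.comparingLong(Event::sequence));
        }
        return flows;
    }

    public static void main(String[] args) {
        List<Event> stream = List.of(
                new Event("e1", "order-42", 2),
                new Event("e2", "order-7", 1),
                new Event("e3", "order-42", 1));
        Map<String, List<Event>> flows = correlate(stream);
        System.out.println(flows.get("order-42").get(0).id()); // e3
    }
}
```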
 
====Runtime Debugging====

The engine runtime provides a number of facilities for debugging and tracing distributed application logic. At its core, the engine provides state notification facilities called Advisory Events, which notify subscribers of critical runtime actions that may result in conflicts or errors. Advisory events are typically raised by fabric components in reaction to connection failures, resource limit thresholds, exceptions or configuration changes, allowing users to observe and react to system actions on a global level. See [[Chapter 2: Exception Event Types]] for more information.

Application fabric resources such as services and data collections also allow users to declare Exception Triggers. An exception trigger fires as a result of an actionable error that occurs within a component, raising an exception event that can be observed and processed by any subscribed participant. Applications may subscribe to exception events by using explicit event id names, wild cards or event selectors. Event consumers may create granular exception processors that react to specific events anywhere in the application fabric. For more information on exception triggers and event processing see [[Chapter 9: Event Triggers]].
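The wild-card subscription style can be sketched generically. The pattern grammar below (a ''*'' that matches any suffix) is an assumption made for illustration; the fabric's actual selector syntax is described in the Event Triggers chapter:

```java
public class EventSelector {
    // Matches an event id against a simple subscription pattern where
    // '*' matches any remaining characters (e.g. "orders.*" matches
    // "orders.failed"). Hypothetical grammar, for illustration only.
    static boolean matches(String pattern, String eventId) {
        // Translate the wildcard pattern into a regular expression:
        // escape literal dots, then expand '*' to ".*".
        String regex = pattern.replace(".", "\\.").replace("*", ".*");
        return eventId.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matches("orders.*", "orders.failed"));      // true
        System.out.println(matches("orders.*", "billing.failed"));     // false
        System.out.println(matches("orders.failed", "orders.failed")); // true
    }
}
```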

The application engine also provides a dynamic trace facility for class-level API tracing and general logging. Developers can specify multi-level trace information, as well as intercept and re-route the runtime error log so that users can subscribe to error log stream content. Using the trace facility, developers specify which package trace to activate and at what trace level. The language environment allows users to connect to any engine instance and dynamically enable or disable traces for specific packages. For further discussion on trace and logging see [[Chapter 3: Trace and Logging Facilities]].


<html>
<img src="/dokuwiki/_media/icons_large/bowlerhat-transp.png" alt="Smiley face" height="46" width="46" style="margin-left:-6px;">
<a href="/dokuwiki/start" style="margin-left:-1em; font-weight:bold; color:#990000">Back</a>
</html>