diff --git a/src/Core/AdminConsole/Services/Implementations/EventIntegrations/README.md b/src/Core/AdminConsole/Services/Implementations/EventIntegrations/README.md index cfd811d83e..e26d589a96 100644 --- a/src/Core/AdminConsole/Services/Implementations/EventIntegrations/README.md +++ b/src/Core/AdminConsole/Services/Implementations/EventIntegrations/README.md @@ -7,21 +7,19 @@ existing event system without needing to add special handling. By adding a new l pipeline, it gains an independent stream of events without the need for additional broadcast code. We want to enable robust handling of failures and retries. By utilizing the two-tier approach -(described below), we have built-in support at the service level for retries. When we add new integrations, -they can focus solely on the process of sending the specific integration and reporting status, with all the -process of retries and delays offloaded to the messaging system. +(described below), we build in support at the service level for retries. When we add new integrations, +they can focus solely on the integration-specific logic and reporting status, with all the +process of retries and delays managed by the messaging system. -Another goal was to not only support this functionality for the cloud version, but offer it as well to -self-hosted instances. By using RabbitMQ for the self-hosted piece, we have a lightweight way to tie -into the event system outside of the Cloud offering (which is using Azure Service Bus) using the same robust -architecture for integrations locally. +Another goal is to not only support this functionality in the cloud version, but offer it as well to +self-hosted instances. RabbitMQ provides a lightweight way for self-hosted instances to tie into the event system +using the same robust architecture for integrations without the need for Azure Service Bus. -Finally, we wanted to offer organization admins flexibility and control over what events are significant, where +Finally, we want to offer organization admins flexibility and control over what events are significant, where to send events, and the data to be included in the message. The configuration architecture allows Organizations to customize details of a specific integration; see Integrations and integration configurations below for more details on the configuration piece. - # Architecture The entry point for the event integrations is the `IEventWriteService`. By configuring the @@ -37,6 +35,31 @@ When `EventIntegrationEventWriteService` publishes, it posts to the first tier o approach to handling messages. Each tier is represented in the AMQP stack by a separate exchange (in RabbitMQ terminology) or topic (in Azure Service Bus). +``` mermaid +flowchart TD + B1[EventService] + B2[EventIntegrationEventWriteService] + B3[Event Exchange / Topic] + B4[EventRepositoryHandler] + B5[WebhookIntegrationHandler] + B6[Events in Database / Azure Tables] + B7[HTTP Server] + B8[SlackIntegrationHandler] + B9[Slack] + B10[EventIntegrationHandler] + B12[Integration Exchange / Topic] + + B1 -->|IEventWriteService| B2 --> B3 + B3-->|EventListenerService| B4 --> B6 + B3-->|EventListenerService| B10 + B3-->|EventListenerService| B10 + B10 --> B12 + B12 -->|IntegrationListenerService| B5 + B12 -->|IntegrationListenerService| B8 + B5 -->|HTTP POST| B7 + B8 -->|HTTP POST| B9 +``` + ### Event tier In the first tier, events are broadcast in a fan-out to a series of listeners. The message body @@ -46,20 +69,20 @@ at this level: - `EventRepositoryHandler` - The `EventRepositoryHandler` is responsible for long term storage of events. It receives all events and stores them via an injected `IEventRepository` into the database. - - This mirrors the behavior of when event integrations are turned off - Cloud stores to Azure Tables + - This mirrors the behavior of when event integrations are turned off - cloud stores to Azure Tables and self-hosted is stored to the database. - `EventIntegrationHandler` - The `EventIntegrationHandler` is a generic class that is customized to each integration (via the configuration details of the integration) and is responsible for determining if there's a configuration - for this event/organization/integration, fetching that configuration, and parsing the details of the + for this event / organization / integration, fetching that configuration, and parsing the details of the event into a template string. - The `EventIntegrationHandler` uses the injected `IOrganizationIntegrationConfigurationRepository` to pull the specific set of configuration and template based on the event type, organization, and integration type. This configuration is what determines if an integration should be sent, what details are necessary for sending it, and the actual message to send. - - The output of `EventIntegrationHandler` is a new `IntegrationMessage` with the details of this + - The output of `EventIntegrationHandler` is a new `IntegrationMessage`, with the details of this the configuration necessary to interact with the integration and the message to send (with all the event - details incorporated) published to the integration level of the message bus. + details incorporated), published to the integration level of the message bus. ### Integration tier @@ -68,9 +91,9 @@ will be concrete types of the generic `IntegrationMessage` where `` is the specific integration for which they've been sent. These messages represent the details required for sending a specific event to a specific integration, including handling retries and delays. -Handlers are the integration level are tied directly to the integration (e.g. `SlackIntegrationHandler`, +Handlers at the integration level are tied directly to the integration (e.g. `SlackIntegrationHandler`, `WebhookIntegrationHandler`). These handlers take in `IntegrationMessage` and output -`IntegrationHandlerResult`, which tells the listener what the outcome of the integration was (e.g. success/fail, +`IntegrationHandlerResult`, which tells the listener the outcome of the integration (e.g. success / fail, if it can be retried and any minimum delay that should occur). This makes them easy to unit test in isolation without any of the concerns of AMQP or messaging. @@ -80,12 +103,29 @@ Failures will either be sent to the dead letter queue (DLQ) or re-published for ### Retries -One of the goals of introducing the integration level was to simplify and enable the process of multiple retries +One of the goals of introducing the integration level is to simplify and enable the process of multiple retries for a specific event integration. For instance, if a service is temporarily down, we don't want one of our handlers blocking the rest of the queue while it waits to retry. In addition, we don't want to retry _all_ integrations for a specific event if only one integration fails nor do we want to re-lookup the configuration details. By splitting out the `IntegrationMessage` with the configuration, message, and details around retries, we can process each -event/integration individually and retry easily. +event / integration individually and retry easily. + +When the `IntegrationHandlerResult.Success` is set to `false` (indicating that the integration attempt failed) the +`Retryable` flag tells the listener whether this failure is temporary or final. If the `Retryable` is `false`, then +the message is immediately sent to the DLQ. If it is `true`, the listener uses the `ApplyRetry(DateTime)` method +in `IntegrationMessage` which handles both incrementing the `RetryCount` and updating the `DelayUntilDate` using +the provided DateTime, but also adding exponential backoff (based on `RetryCount`) and jitter. The listener compares +the `RetryCount` in the `IntegrationMessage` to see if it's over the `MaxRetries` defined in Global Settings. If it +is over the `MaxRetries`, the message is sent to the DLQ. Otherwise, it is scheduled for retry. + +``` mermaid +flowchart TD +A[Success == false] --> B{Retryable?} + B -- No --> C[Send to Dead Letter Queue DLQ] + B -- Yes --> D[Check RetryCount vs MaxRetries] + D -->|RetryCount >= MaxRetries| E[Send to Dead Letter Queue DLQ] + D -->|RetryCount < MaxRetries| F[Schedule for Retry] +``` Azure Service Bus supports scheduling messages as part of its core functionality. Retries are scheduled to a specific time and then ASB holds the message and publishes it at the correct time. @@ -103,13 +143,12 @@ defaults to `false` which indicates we should use retry queues with a timing che - This allows us to forego using any retry queues and rely instead on the delay exchange. When a message is marked with the header it gets published to the exchange and the exchange handles all the functionality of holding it until the appropriate time (similar to ASB's built-in support). - - The plugin must be setup and enabled before turning this option on (which is why we default to off). + - The plugin must be setup and enabled before turning this option on (which is why it defaults to off). 2. Retry queues + timing check - If the delay plugin setting is off, we push the message to a retry queue which has a fixed amount of time before it gets re-published back to the main queue. - - When a message comes off the queue, we check to see if the `DelayUntilDate` has - already passed. + - When a message comes off the queue, we check to see if the `DelayUntilDate` has already passed. - If it has passed, we then handle the integration normally and retry the request. - If it is still in the future, we put the message back on the retry queue for an additional wait. - While this does use extra processing, it gives us better support for honoring the delays even if the delay plugin @@ -121,20 +160,20 @@ defaults to `false` which indicates we should use retry queues with a timing che To make it easy to support multiple AMQP services (RabbitMQ and Azure Service Bus), the act of listening to the stream of messages is decoupled from the act of responding to a message. -**Listeners** +### Listeners - Listeners handle the details of the communication platform (i.e. RabbitMQ and Azure Service Bus). - There is one listener for each platform (RabbitMQ / ASB) for each of the two levels - i.e. one event listener - and one integration listener + and one integration listener. - Perform all the aspects of setup / teardown, subscription, message acknowledgement, etc. for the messaging platform, but do not directly process any events themselves. Instead, they delegate to the handler with which they are configured. - Multiple instances can be configured to run independently, each with its own handler and subscription / queue. -**Handlers** +### Handlers -- One handler per queue/subscription (e.g. per integration at the integration level). +- One handler per queue / subscription (e.g. per integration at the integration level). - Completely isolated from and know nothing of the messaging platform in use. This allows them to be freely reused across different communication platforms. - Perform all aspects of handling an event. @@ -160,14 +199,14 @@ Organizations can configure integration configurations to send events to differe handler maps to a specific integration and checks for the configuration when it receives an event. Currently, there are integrations / handlers for Slack and webhooks (as mentioned above). -**`OrganizationIntegration`** +### `OrganizationIntegration` - The top level object that enables a specific integration for the organization. - Includes any properties that apply to the entire integration across all events. - For Slack, it consists of the token: - ```json + ``` json { "token": "xoxb-token-from-slack" } ``` @@ -175,34 +214,34 @@ Currently, there are integrations / handlers for Slack and webhooks (as mentione have a webhook `OrganizationIntegration` to enable configuration via `OrganizationIntegrationConfiguration`. -**`OrganizationIntegrationConfiguration`** +### `OrganizationIntegrationConfiguration` - This contains the configurations specific to each `EventType` for the integration. - `Configuration` contains the event-specific configuration. - For Slack, this would contain what channel to send the message to: - ```json + ``` json { "channelId": "C123456" } ``` - For Webhook, this is the URL the request should be sent to: - ```json + ``` json { "url": "https://api.example.com" } ``` - `Template` contains a template string that is expected to be filled in with the contents of the actual event. - The tokens in the string are wrapped in `#` characters. For instance, the UserId would be - `#UserId#` + `#UserId#`. - The `IntegrationTemplateProcessor` does the actual work of replacing these tokens with introspected values from the provided `EventMessage`. - The template does not enforce any structure -- it could be a freeform text message to send via Slack, or a JSON body to send via webhook; it is simply stored and used as a string for the most flexibility. -**`OrganizationIntegrationConfigurationDetails`** +### `OrganizationIntegrationConfigurationDetails` - This is the combination of both the `OrganizationIntegration` and `OrganizationIntegrationConfiguration` into a single object. The combined contents tell the integration's handler all the details needed to send to an @@ -216,37 +255,41 @@ These are all the pieces required in the process of building out a new integrati clarity in naming, these assume a new integration called "Example". ## IntegrationType -Add a new type to `IntegrationType` for the new integration + +Add a new type to `IntegrationType` for the new integration. ## Configuration Models + The configuration models are the classes that will determine what is stored in the database for `OrganizationIntegration` and `OrganizationIntegrationConfiguration`. The `Configuration` columns are the serialized version of the corresponding objects and represent the coonfiguration details for this integration and event type. 1. `ExampleIntegration` - - Configuration details for the whole integration (e.g. a token in Slack) - - Applies to every event type configuration defined for this integration - - Maps to the JSON structure stored in `Configuration` in ``OrganizationIntegration` + - Configuration details for the whole integration (e.g. a token in Slack). + - Applies to every event type configuration defined for this integration. + - Maps to the JSON structure stored in `Configuration` in ``OrganizationIntegration`. 2. `ExampleIntegrationConfiguration` - - Configuration details that could change from event to event (e.g. channelId in Slack) - - Maps to the JSON structure stored in `Configuration` in `OrganizationIntegrationConfiguration` + - Configuration details that could change from event to event (e.g. channelId in Slack). + - Maps to the JSON structure stored in `Configuration` in `OrganizationIntegrationConfiguration`. 3. `ExampleIntegrationConfigurationDetails` - - Combined configuration of both Integration _and_ IntegrationConfiguration - - This will be the deserialized version of the `MergedConfiguration` in `OrganizationIntegrationConfigurationDetails` + - Combined configuration of both Integration _and_ IntegrationConfiguration. + - This will be the deserialized version of the `MergedConfiguration` in + `OrganizationIntegrationConfigurationDetails`. ## Request Models -1. Add a new case to the switch method in `OrganizationIntegrationRequestModel.Validate` -2. Add a new case to the switch method in `OrganizationIntegrationConfigurationRequestModel.IsValidForType` +1. Add a new case to the switch method in `OrganizationIntegrationRequestModel.Validate`. +2. Add a new case to the switch method in `OrganizationIntegrationConfigurationRequestModel.IsValidForType`. ## Integration Handler + e.g. `ExampleIntegrationHandler` -- This is where the actual code will go to perform the integration (i.e. send an HTTP request, etc.) +- This is where the actual code will go to perform the integration (i.e. send an HTTP request, etc.). - Handlers receive an `IntegrationMessage` where `` is the `ExampleIntegrationConfigurationDetails` defined above. This has the Configuration as well as the rendered template message to be sent. -- Handlers return an `IntegrationHandlerResult` with details about if the request was - successful, if it can be retried, when it should be delayed until, etc. +- Handlers return an `IntegrationHandlerResult` with details about if the request - success / failure, + if it can be retried, when it should be delayed until, etc. - The scope of the handler is simply to do the integration and report the result. Everything else (such as how many times to retry, when to retry, what to do with failures) is done in the Listener. @@ -274,23 +317,23 @@ integration below #### Service Bus Emulator, local config In order to create ASB resources locally, we need to also update the `servicebusemulator_config.json` file to include any new subscriptions. -- Under the existing event topic (`event-logging`) add a - subscription for the event level for this new integration (`events-example-subscription`). +- Under the existing event topic (`event-logging`) add a subscription for the event level for this + new integration (`events-example-subscription`). - Under the existing integration topic (`event-integrations`) add a new subscription for the integration level messages (`integration-example-subscription`). - Copy the correlation filter from the other integration level subscriptions. It should filter based on the `IntegrationType.ToRoutingKey`, or in this example `example`. These names added here are what must match the values provided in the secrets or the defaults provided -in Global Settings. This must be in place (and the local ASB emulater restarted) before you can use any +in Global Settings. This must be in place (and the local ASB emulator restarted) before you can use any code locally that accesses ASB resources. ## ServiceCollectionExtensions -In our `ServiceCollectionExtensions`, we pull all the pieces together that have been built to start -listeners on each message tier, with handlers to process the integration. There are a number of helper -methods in here to make this simple to add a new integration - one call per platform. +In our `ServiceCollectionExtensions`, we pull all the above pieces together to start listeners on each message +tier with handlers to process the integration. There are a number of helper methods in here to make this simple +to add a new integration - one call per platform. -Also note that if an integration needs a custom singleton/service defined, the add listeners method is a +Also note that if an integration needs a custom singleton / service defined, the add listeners method is a good place to set that up. For instance, `SlackIntegrationHandler` needs a `SlackService`, so the singleton declaration is right above the add integration method for slack. Same thing for webhooks when it comes to defining a custom HttpClient by name. @@ -303,7 +346,6 @@ defining a custom HttpClient by name. globalSettings.EventLogging.RabbitMq.ExampleIntegrationRetryQueueName, globalSettings.EventLogging.RabbitMq.MaxRetries, IntegrationType.Example); - ``` 2. In `AddAzureServiceBusListeners` add the integration: @@ -313,7 +355,6 @@ services.AddAzureServiceBusIntegration