Data Replay

Overview

Data Replay is a MetaRouter feature that enables you to resend a collection of previously ingested, untransformed events through the MetaRouter platform and out to your configured integrations. This is particularly useful when:

  • An integration’s API experiences downtime.
  • An end destination temporarily cannot accept events.
  • Events were malformed during transformation.

Replayed events follow the transformation and routing logic configured in your playbooks at the time of the replay, so you can replay data through updated configurations if needed.

All events delivered to MetaRouter server-side are eligible for replay, provided they were successfully ingested and stored.
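
The exact shape of a stored event depends on your sources, since events are kept as they arrived at ingestion. As a purely illustrative example, a raw, untransformed track event might look like the following (all field names and values are placeholders):

{
  "type": "track",
  "event": "Order Completed",
  "anonymousId": "<<Anonymous ID>>",
  "timestamp": "2025-05-01T12:34:56Z",
  "properties": {
    "orderId": "<<Order ID>>",
    "total": 42.50
  }
}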

How It Works

When a Data Replay job begins:

  1. MetaRouter accesses your configured data store (e.g., Amazon S3).
  2. Events are read from the store and streamed into the MetaRouter platform.
  3. These events are reprocessed using your existing Playbook configurations.
  4. Events are batched and forwarded to your chosen integrations.

Replay duration depends on the volume of events, the timeframe selected, and the number of integrations involved. More complex replays, such as those spanning longer periods or involving multiple integrations, take longer to process. Events are replayed in batches, and total processing time will also vary with system load and data size.

Prerequisites & Limitations

  • Data Storage: Raw events must be stored in Amazon S3. If you’re using another service (e.g., Google Cloud Storage, Azure Blob Storage), please contact the MetaRouter team.
  • Availability: Events must have successfully reached the S3 bucket. If they were not ingested or stored properly, they will not be available for replay.
  • Contractual Access: Data Replay must be included in your MetaRouter contract. Contact your MetaRouter representative for clarification.

Uploading Your Data Through MetaRouter

Before beginning the steps below, notify the MetaRouter team of your intent to initiate a Data Replay job.

Step 1: Connect Your Bucket

⚠️ MetaRouter recommends setting a bucket lifecycle policy that retains events for at least 30 days. This ensures sufficient time to identify and address issues that may require a Data Replay. You may adjust this duration based on your organization’s data retention policies.
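
As an illustration, an S3 lifecycle configuration that keeps objects for 30 days before expiring them might look like the following (the prefix is a placeholder for the path where your raw events are written; adjust the rule to your own retention policy):

{
  "Rules": [
    {
      "ID": "retain-raw-events-30-days",
      "Filter": {
        "Prefix": "<<S3 Bucket Prefix>>"
      },
      "Status": "Enabled",
      "Expiration": {
        "Days": 30
      }
    }
  ]
}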

Step 2: Set AWS Permissions

Create a new IAM user for MetaRouter and grant it permission to access the relevant S3 bucket.

The user must have the following permissions:

  • s3:GetObject
  • s3:ListBucket

Here is an example S3 bucket policy that grants the replay user the necessary access:

{
  "Id": "mr-replay-access",
  "Version": "2023-10-17",
  "Statement": [
    {
      "Sid": "mr-replay-access-1",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<<S3 Bucket Name>>",
        "arn:aws:s3:::<<S3 Bucket Name>>/*"
      ],
      "Principal": {
        "AWS": [
          "<<Replay User ARN>>"
        ]
      }
    }
  ]
}
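
The policy above is a bucket policy (note the Principal element) and is attached to the S3 bucket itself. If you prefer to attach an identity-based policy directly to the replay IAM user instead, a minimal sketch would omit the Principal:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "mr-replay-user-access",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<<S3 Bucket Name>>",
        "arn:aws:s3:::<<S3 Bucket Name>>/*"
      ]
    }
  ]
}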

Step 3: Clean Up Integrations (If Needed)

If the data you intend to replay was malformed or incorrectly delivered, delete those events from the destination system to prevent duplication. This will create a clean time window that MetaRouter can refill with replayed events.

Step 4: Contact MetaRouter Support

Send the following information to your MetaRouter support contact:

  • WriteKey(s) for the Pipeline(s) to replay events through.
  • Target integrations (within those Pipelines) for the replay.
  • Timeframe for the replay job (in HH:MM:SS DD/MM/YYYY format).
  • [Optional] Event names to be replayed.
  • S3 bucket prefix (path where the events are stored).
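
There is no required format for this request, but summarizing the details in a structured form can help avoid ambiguity. A hypothetical example (all values are placeholders):

{
  "writeKeys": ["<<Write Key>>"],
  "integrations": ["<<Integration Name>>"],
  "timeframe": {
    "start": "00:00:00 01/05/2025",
    "end": "23:59:59 07/05/2025"
  },
  "eventNames": ["Order Completed"],
  "s3BucketPrefix": "<<S3 Bucket Prefix>>"
}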

Once received, the MetaRouter team will initiate the Data Replay job.