Data Replay
Overview
Data Replay is a MetaRouter feature that enables you to resend a collection of previously ingested, untransformed events through the MetaRouter platform and out to your configured integrations. This is particularly useful when:
- An integration’s API experiences downtime.
- An end destination temporarily cannot accept events.
- Events were malformed during transformation.
Replayed events follow the transformation and routing logic currently configured in your playbooks, so if you update a playbook before the replay, the replayed data reflects the new configuration.
All events delivered to MetaRouter server-side are eligible for replay, provided they were successfully ingested and stored.
How It Works
When a Data Replay job begins:
- MetaRouter accesses your configured data store (e.g., Amazon S3).
- Events are read from the store and streamed into the MetaRouter platform.
- These events are reprocessed using your existing Playbook configurations.
- Events are batched and forwarded to your chosen integrations.
Replay duration depends on the volume of events, the timeframe selected, and the number of integrations involved; replays spanning longer periods or multiple integrations take longer to process. Events are replayed in batches, and total processing time also varies with system load.
Prerequisites & Limitations
- Data Storage: Raw events must be stored in Amazon S3. If you’re using another service (e.g., Google Cloud Storage, Azure Blob Storage), please contact the MetaRouter team.
- Availability: Events must have successfully reached the S3 bucket. If they were not ingested or stored properly, they will not be available for replay.
- Contractual Access: Data Replay must be included in your MetaRouter contract. Contact your MetaRouter representative for clarification.
Uploading Your Data Through MetaRouter
Before beginning the steps below, notify the MetaRouter team of your intent to initiate a Data Replay job.
Step 1: Connect Your Bucket
- Create and configure an Amazon S3 Bucket if one is not already set up.
- Connect MetaRouter to this bucket using the S3 integration guide.
- Ensure the bucket is integrated with the Pipeline(s) through which you want to replay events.
MetaRouter recommends setting a bucket lifecycle policy that retains events for at least 30 days. This ensures sufficient time to identify and address issues that may require a Data Replay. You may adjust this duration based on your organization’s data retention policies.
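If you want the bucket itself to enforce this retention window, an S3 lifecycle rule can expire raw event objects automatically. The sketch below is a minimal example of a lifecycle configuration with a 30-day expiration; the rule ID and the empty prefix (which applies the rule to the entire bucket) are illustrative assumptions, and you should scope the rule to the prefix where raw events are written.

{
  "Rules": [
    {
      "ID": "retain-raw-events-30-days",
      "Status": "Enabled",
      "Filter": {
        "Prefix": ""
      },
      "Expiration": {
        "Days": 30
      }
    }
  ]
}

This configuration can be applied in the AWS console or with the aws s3api put-bucket-lifecycle-configuration command, and the number of days can be adjusted to match your organization's retention policy.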
Step 2: Set AWS Permissions
Create a new IAM user for MetaRouter and grant it permission to access the relevant S3 bucket.
The user must have the following permissions:
- s3:GetObject
- s3:ListBucket
Here is an example S3 bucket policy that grants the replay user the necessary access:
{
  "Id": "mr-replay-access",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "mr-replay-access-1",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<<S3 Bucket Name>>",
        "arn:aws:s3:::<<S3 Bucket Name>>/*"
      ],
      "Principal": {
        "AWS": [
          "<<Replay User ARN>>"
        ]
      }
    }
  ]
}
Step 3: Clean Up Integrations (If Needed)
If the data you intend to replay was malformed or incorrectly delivered, delete those events from the destination system to prevent duplication. This will create a clean time window that MetaRouter can refill with replayed events.
Step 4: Contact MetaRouter Support
Send the following information to your MetaRouter support contact:
- WriteKey(s) for the Pipeline(s) to replay events through.
- Target integrations (within those Pipelines) for the replay.
- Timeframe for the replay job (in HH:MM:SS DD/MM/YYYY format; see the example after this list).
- [Optional] Event names to be replayed.
- S3 bucket prefix (path where the events are stored).
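MetaRouter does not require a specific format for these details. The hypothetical example below simply illustrates the information to gather; the field names, placeholder values, and the "Order Completed" event name are illustrative assumptions you would replace with your own values.

{
  "writeKeys": ["<<WriteKey>>"],
  "integrations": ["<<Integration Name>>"],
  "timeframe": {
    "start": "00:00:00 01/06/2025",
    "end": "23:59:59 07/06/2025"
  },
  "eventNames": ["Order Completed"],
  "s3Prefix": "s3://<<S3 Bucket Name>>/<<Prefix>>/"
}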
Once received, the MetaRouter team will initiate the Data Replay job.