Architecture Overview

EZ-CDC uses a distributed architecture with a clear separation between the control plane (managed by EZ-CDC) and the data plane (running in your AWS account).

High-Level Architecture

[Figure: Architecture Overview]

Components

Control Plane (EZ-CDC Managed)

The control plane runs in EZ-CDC's infrastructure and handles:

  • Web Portal — User interface for managing deployments, datasources, and jobs
  • API — Communication with workers and external integrations
  • Job Orchestration — Background task processing, reconciliation, and maintenance
  • Metadata Storage — Stores deployments, jobs, and encrypted connection configs

Data Plane (Your AWS Account)

The data plane runs entirely in your AWS account:

Component | Responsibility
Workers | Orchestrate CDC jobs, manage daemon lifecycle
dbmazz Daemons | Core CDC engine (one per active job)
EC2 / ASG | Compute infrastructure with auto-scaling
VPC Endpoint | PrivateLink connectivity (optional, for enterprise)

Data Flow

1. Job Creation

[Figure: Job Creation Sequence]

Participants: User, Web Portal, HTTP API, Encryptor, PostgreSQL Catalog.

Messages, in order:

  1. Create Job
  2. POST /api/jobs
  3. Encrypt configs (AES-256-GCM)
  4. INSERT status=PENDING
  5. job_id
  6. 201 Created
  7. Job created

The job sits in PENDING until a worker polls and claims it.
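
The sequence above can be sketched as a minimal, illustrative example. It is not EZ-CDC's implementation: the `jobs` table layout, the `create_job` function, and the `placeholder_encrypt` helper are all assumptions made for the sketch. An in-memory SQLite database stands in for the PostgreSQL catalog, and the encryptor is injected as a callable, where a real deployment would use an AES-256-GCM implementation.

```python
import json
import sqlite3
import uuid
from typing import Callable

# Hypothetical schema: sqlite3 stands in for the PostgreSQL catalog.
SCHEMA = """
CREATE TABLE jobs (
    job_id TEXT PRIMARY KEY,
    status TEXT NOT NULL,
    encrypted_config BLOB NOT NULL
)
"""


def create_job(db: sqlite3.Connection,
               config: dict,
               encrypt: Callable[[bytes], bytes]) -> str:
    """Encrypt the connection config, then insert the job as PENDING."""
    job_id = str(uuid.uuid4())
    ciphertext = encrypt(json.dumps(config).encode("utf-8"))
    db.execute(
        "INSERT INTO jobs (job_id, status, encrypted_config) "
        "VALUES (?, 'PENDING', ?)",
        (job_id, ciphertext),
    )
    db.commit()
    return job_id  # an API layer would return this with 201 Created


def placeholder_encrypt(plaintext: bytes) -> bytes:
    # NOT real encryption: a stand-in for an AES-256-GCM encryptor.
    return b"enc:" + plaintext[::-1]


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
job_id = create_job(db, {"host": "db.internal", "port": 5432}, placeholder_encrypt)
status, stored = db.execute(
    "SELECT status, encrypted_config FROM jobs WHERE job_id = ?", (job_id,)
).fetchone()
```

Only the encrypted form of the config reaches the catalog, and the row starts in PENDING so that a later worker poll can pick it up.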

2. Job Assignment

Workers poll the control plane for pending jobs and receive assignments.
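
As a rough sketch of the polling step, the example below uses an in-memory stand-in for the control plane. The `FakeControlPlane` class, the `claim_next` method, and the `ASSIGNED` status name are hypothetical; only the PENDING status and the poll-then-claim behavior come from the description above.

```python
import threading
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    job_id: str
    status: str = "PENDING"
    worker_id: Optional[str] = None


@dataclass
class FakeControlPlane:
    """In-memory stand-in for the control plane API (hypothetical)."""
    jobs: List[Job] = field(default_factory=list)
    _lock: threading.Lock = field(default_factory=threading.Lock)

    def claim_next(self, worker_id: str) -> Optional[Job]:
        # Claim under a lock so two workers cannot take the same job.
        with self._lock:
            for job in self.jobs:
                if job.status == "PENDING":
                    job.status = "ASSIGNED"  # hypothetical status name
                    job.worker_id = worker_id
                    return job
        return None


def poll_once(cp: FakeControlPlane, worker_id: str) -> Optional[Job]:
    """One poll cycle: ask for work and return the claimed job, if any."""
    return cp.claim_next(worker_id)


cp = FakeControlPlane(jobs=[Job("job-1"), Job("job-2")])
first = poll_once(cp, "worker-a")
second = poll_once(cp, "worker-b")
third = poll_once(cp, "worker-a")  # nothing left to claim
```

A real worker would repeat `poll_once` on an interval and issue it as an outbound HTTPS request. The claim must be atomic on the control plane side so that each pending job goes to exactly one worker.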

3. CDC Replication

[Figure: CDC replication pipeline. PostgreSQL (Source) → dbmazz CDC Daemon (WAL Reader, Transform, Ckpt) → StarRocks (Sink). Edge labels: Logical Replication, WAL, Stream Load, Ack LSN.]
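
The read, transform, load, and acknowledge loop in this pipeline can be sketched as follows. This is an illustration under stated assumptions, not the dbmazz implementation: the `replicate` function, the `(lsn, change)` event shape, and the batch size are hypothetical, and the rule of acknowledging an LSN only after the sink accepts the batch is inferred from the order of the "Stream Load" and "Ack LSN" labels.

```python
from typing import Callable, Iterable, List, Optional, Tuple

WalEvent = Tuple[int, dict]  # (LSN, decoded row change); hypothetical shape


def replicate(events: Iterable[WalEvent],
              transform: Callable[[dict], dict],
              load_batch: Callable[[List[dict]], None],
              ack_lsn: Callable[[int], None],
              batch_size: int = 2) -> None:
    """Stream WAL events to the sink, checkpointing after each loaded batch."""
    batch: List[dict] = []
    last_lsn: Optional[int] = None
    for lsn, change in events:
        batch.append(transform(change))
        last_lsn = lsn
        if len(batch) >= batch_size:
            load_batch(batch)   # raises on failure, so the LSN is not acked
            ack_lsn(last_lsn)   # checkpoint only after the sink accepted it
            batch = []
    if batch and last_lsn is not None:
        load_batch(batch)       # flush the final partial batch
        ack_lsn(last_lsn)


sink: List[dict] = []
acked: List[int] = []
wal = [(100, {"id": 1}), (108, {"id": 2}), (116, {"id": 3})]
replicate(wal, lambda c: {**c, "op": "upsert"}, sink.extend, acked.append)
```

Acknowledging after the load means a crash between the two steps replays the last batch instead of losing it, so the sink write should tolerate duplicates.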

4. Health Reporting

Workers continuously send heartbeat messages to the control plane to report health and status.
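
One way a control plane could turn heartbeats into a health status is to track when each worker was last seen and apply a timeout. The sketch below assumes this approach; the `HeartbeatTracker` class and the 30-second threshold are hypothetical and are not taken from EZ-CDC.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HeartbeatTracker:
    timeout_s: float = 30.0  # hypothetical threshold
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, worker_id: str, now: float) -> None:
        """Store the arrival time of a worker's latest heartbeat."""
        self.last_seen[worker_id] = now

    def is_healthy(self, worker_id: str, now: float) -> bool:
        """A worker is healthy if it reported within the timeout window."""
        seen = self.last_seen.get(worker_id)
        return seen is not None and (now - seen) <= self.timeout_s


tracker = HeartbeatTracker(timeout_s=30.0)
tracker.record("worker-a", now=1000.0)
healthy_soon_after = tracker.is_healthy("worker-a", now=1020.0)
healthy_after_gap = tracker.is_healthy("worker-a", now=1031.0)
```

Passing `now` explicitly keeps the logic testable; a live service would supply a monotonic clock reading.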

Communication Patterns

Pull-Based Model

All communication is initiated by workers (outbound from your VPC):

Communication | Direction | Port
Worker Registration | Worker → Control Plane | 443
Heartbeat | Worker → Control Plane | 443
Job Polling | Worker → Control Plane | 443
Test Connection | Worker → Control Plane | 443
Metrics Push | Worker → Control Plane | 443

No inbound connections to your VPC are ever required.
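
As an illustration of what the pull-based model implies for worker firewall rules, the sketch below checks a rule set for the two properties described above: no ingress rules, and outbound access on port 443. The `Rule` type and `check_pull_model` function are hypothetical and do not correspond to any AWS or EZ-CDC API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    direction: str  # "ingress" or "egress"
    port: int
    peer: str       # CIDR or a description of the remote side


def check_pull_model(rules: List[Rule]) -> List[str]:
    """Return a list of problems; an empty list means the rules fit the model."""
    problems = []
    if any(r.direction == "ingress" for r in rules):
        problems.append("ingress rule present; the pull model needs none")
    if not any(r.direction == "egress" and r.port == 443 for r in rules):
        problems.append("no egress rule for port 443")
    return problems


worker_rules = [Rule("egress", 443, "control plane")]
problems = check_pull_model(worker_rules)
bad_problems = check_pull_model([Rule("ingress", 22, "0.0.0.0/0")])
```

Workers also need network access to the source and sink databases inside your VPC. This check covers only traffic between workers and the control plane.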

Connectivity Options

Mode | Network Path | Use Case
Standard | Public internet (HTTPS/gRPC) | Most deployments
PrivateLink | AWS PrivateLink (private IPs) | Enterprise, regulated industries

[Figure: PrivateLink Architecture]

Deployment Topology

Single-Region Deployment

Workers, source databases, and sink databases all run within your AWS account and VPC, with workers making outbound connections to the EZ-CDC control plane.

Security Model

  1. Data stays in your VPC: Source and sink connections are direct, within your network
  2. Encrypted configs: Connection credentials are encrypted with AES-256-GCM
  3. No inbound access: Workers only make outbound connections
  4. Token-based auth: Workers authenticate to the control plane using deployment-specific tokens
  5. Optional PrivateLink: Keeps worker traffic to the control plane off the public internet entirely

Next Steps