Architecture Overview

EZ-CDC uses a distributed architecture that separates the control plane, which EZ-CDC manages, from the data plane, which runs in your own cloud account (AWS or GCP).

High-Level Architecture

[Diagram: Architecture Overview]

Components

Control Plane (EZ-CDC Managed)

The control plane runs in EZ-CDC's infrastructure and handles:

  • Web Portal — User interface for managing deployments, datasources, and jobs
  • API — Communication with workers and external integrations
  • Job Orchestration — Background task processing, reconciliation, and maintenance
  • Metadata Storage — Stores deployments, jobs, and encrypted connection configs

Data Plane (Your Cloud Account)

The data plane runs entirely in your cloud account (AWS or GCP):

| Component | Responsibility |
|---|---|
| Workers | Orchestrate CDC jobs, manage daemon lifecycle |
| dbmazz Daemons | Core CDC engine (one per active job) |
| Compute (EC2 / GCE) | Auto-scaling compute infrastructure |
| Private Connectivity | AWS PrivateLink or GCP Cloud NAT (optional, for enterprise) |

Data Flow

1. Job Creation

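Job Creation Sequence (participants: User, Web Portal, HTTP API, Encryptor, PostgreSQL Catalog)

  1. User → Web Portal: Create Job
  2. Web Portal → HTTP API: POST /api/jobs
  3. HTTP API → Encryptor: Encrypt configs (AES-256-GCM)
  4. HTTP API → PostgreSQL Catalog: INSERT status=PENDING
  5. PostgreSQL Catalog → HTTP API: job_id
  6. HTTP API → Web Portal: 201 Created
  7. Web Portal → User: Job created

The job sits in PENDING until a worker polls and claims it.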

2. Job Assignment

Workers poll the control plane for pending jobs and receive assignments.
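
The poll-and-claim step can be pictured with a small in-memory model. This is a minimal sketch, not EZ-CDC's implementation: the `Catalog` class, the `CLAIMED` status, and the `worker_id` field are hypothetical names chosen for illustration. Only the `PENDING` status and the fact that workers poll for jobs come from this page.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    job_id: int
    status: str = "PENDING"          # initial status set at job creation
    worker_id: Optional[str] = None  # hypothetical field: which worker claimed the job


@dataclass
class Catalog:
    """Hypothetical in-memory stand-in for the control plane's job catalog."""
    jobs: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def create_job(self) -> Job:
        job = Job(job_id=next(self._ids))
        self.jobs[job.job_id] = job
        return job

    def claim_next(self, worker_id: str) -> Optional[Job]:
        """Called when a worker polls: hand out the oldest PENDING job, if any."""
        for job in sorted(self.jobs.values(), key=lambda j: j.job_id):
            if job.status == "PENDING":
                job.status = "CLAIMED"   # hypothetical status name
                job.worker_id = worker_id
                return job
        return None  # nothing pending; the worker polls again later
```

Because the worker asks for work rather than being pushed work, an empty result (`None` here) is a normal outcome and simply leads to another poll.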

3. CDC Replication

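Replication path (from the diagram):

  • PostgreSQL (Source) → dbmazz CDC Daemon: Logical Replication (WAL)
  • Inside the daemon: WAL Reader → Transform → Ckpt (checkpoint)
  • dbmazz CDC Daemon → StarRocks (Sink): Stream Load
  • dbmazz CDC Daemon → PostgreSQL (Source): Ack LSN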
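
The interaction between loading and acknowledging can be sketched as follows. This is an illustration under stated assumptions, not dbmazz code: it assumes the LSN is acknowledged only after the sink has accepted the batch (the usual way to get at-least-once delivery; this page does not state the ordering), and `replicate_batch`, `transform`, `load`, and `ack_lsn` are hypothetical names.

```python
from typing import Callable, Iterable, List, Tuple

# Each WAL event is modelled as (lsn, row); both shapes are assumptions.
Event = Tuple[int, dict]


def replicate_batch(
    events: Iterable[Event],
    transform: Callable[[dict], dict],
    load: Callable[[List[dict]], None],
    ack_lsn: Callable[[int], None],
) -> int:
    """Transform and load one batch, then acknowledge its highest LSN.

    Returns the acknowledged LSN, or -1 for an empty batch.
    If load() raises, ack_lsn() is never called, so the source retains
    the WAL and the batch can be replayed.
    """
    batch = list(events)
    if not batch:
        return -1
    rows = [transform(row) for _, row in batch]
    load(rows)                       # e.g. a Stream Load request to the sink
    last_lsn = max(lsn for lsn, _ in batch)
    ack_lsn(last_lsn)                # confirm only after the sink accepted the rows
    return last_lsn
```

With this ordering, a sink failure leaves the acknowledged position unchanged, so the same rows may be delivered again after a restart but are not skipped.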

4. Health Reporting

Workers continuously send heartbeat messages to the control plane to report health and status.
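
One way to picture how heartbeats translate into a health status is a last-seen timestamp per worker with a timeout. This is a hedged sketch only: `HeartbeatTracker`, the 30-second timeout, and the method names are assumptions for illustration, not EZ-CDC's actual behavior or defaults.

```python
import time
from typing import Callable, Dict


class HeartbeatTracker:
    """Hypothetical control-plane view of worker liveness."""

    def __init__(self, timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s      # assumed value, not an EZ-CDC default
        self._clock = clock             # injectable so the logic can be tested
        self._last_seen: Dict[str, float] = {}

    def record(self, worker_id: str) -> None:
        """Store the arrival time of a heartbeat from worker_id."""
        self._last_seen[worker_id] = self._clock()

    def is_healthy(self, worker_id: str) -> bool:
        """A worker is healthy if it has reported within the timeout window."""
        seen = self._last_seen.get(worker_id)
        return seen is not None and self._clock() - seen <= self.timeout_s
```

A worker that has never reported, or whose last heartbeat is older than the timeout, is treated as unhealthy in this model.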

Communication Patterns

Pull-Based Model

All communication is initiated by workers (outbound from your VPC):

| Communication | Direction | Protocol | Port |
|---|---|---|---|
| Worker Registration | Worker → Control Plane | gRPC | 50051 |
| Heartbeat | Worker → Control Plane | gRPC | 50051 |
| Job Polling | Worker → Control Plane | gRPC | 50051 |
| Test Connection | Worker → Control Plane | HTTPS | 443 |
| Metrics Push | Worker → Control Plane | HTTP | 80 |

No inbound connections to your VPC are ever required.
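
When writing egress rules for the worker subnet, it can help to derive the port list from the flows above. The snippet below is a small sketch of that idea; the `Flow` type and `required_egress_ports` helper are hypothetical, and the data is transcribed from the table on this page.

```python
from typing import List, NamedTuple


class Flow(NamedTuple):
    purpose: str
    protocol: str
    port: int


# Transcribed from the table above; every flow is worker -> control plane.
WORKER_FLOWS: List[Flow] = [
    Flow("Worker Registration", "gRPC", 50051),
    Flow("Heartbeat", "gRPC", 50051),
    Flow("Job Polling", "gRPC", 50051),
    Flow("Test Connection", "HTTPS", 443),
    Flow("Metrics Push", "HTTP", 80),
]


def required_egress_ports(flows: List[Flow]) -> List[int]:
    """Return the distinct outbound ports the flows need, sorted."""
    return sorted({flow.port for flow in flows})
```

Calling `required_egress_ports(WORKER_FLOWS)` returns `[80, 443, 50051]`, which is the full set of outbound ports a firewall or security group would need to allow toward the control plane.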

Connectivity Options

| Mode | Cloud | Network Path | Use Case |
|---|---|---|---|
| Standard | AWS / GCP | Public internet (HTTPS/gRPC) | Most deployments |
| PrivateLink | AWS | AWS PrivateLink (private IPs) | Enterprise, regulated industries |
| Cloud NAT | GCP | No public IPs, egress via Cloud NAT | Private GCP deployments |

[Diagram: PrivateLink Architecture]

Deployment Topology

Single-Region Deployment

Workers, source databases, and sink databases all run within your cloud account and VPC, with workers making outbound connections to the EZ-CDC control plane.

Security Model

  1. Data stays in your VPC: Source and sink connections are direct, within your network
  2. Encrypted configs: Connection credentials are encrypted with AES-256-GCM
  3. No inbound access: Workers only make outbound connections
  4. IAM-based auth: Workers authenticate using deployment-specific tokens
  5. Optional PrivateLink: On AWS, keeps worker-to-control-plane traffic off the public internet entirely