Quickstart Guide

Get your first CDC pipeline running in under 15 minutes. This guide walks you through creating a deployment, adding datasources, and starting replication.

Prerequisites

Before you begin, ensure you have:

  • An EZ-CDC account (sign up at portal.ez-cdc.com)
  • An AWS account with admin access
  • A PostgreSQL database (source)
  • A StarRocks cluster (sink)

Step 1: Create a Deployment

A deployment represents a set of workers in your AWS account.

1.1 Log in to the Portal

Navigate to portal.ez-cdc.com and log in.

1.2 Configure Deployment

Go to Deployments and click New Deployment. Fill in your deployment configuration:

  1. Enter a name for your deployment (e.g., "production")
  2. Select your AWS Region
  3. Choose connectivity mode:
    • Standard: Uses public internet (simpler)
    • PrivateLink: Uses AWS PrivateLink (more secure)

Deployment wizard step 1 showing name and region configuration

The deployment wizard guides you through the initial configuration. Enter a descriptive name and select the AWS region closest to your databases.

1.3 Create IAM Role

EZ-CDC needs an IAM role in your AWS account to provision resources securely:

  1. Click Download CloudFormation Template or use the provided template URL
  2. In the AWS Console, go to CloudFormation → Create Stack
  3. Upload the template and create the stack
  4. Copy the Role ARN from the stack outputs
  5. Paste the Role ARN back in the EZ-CDC portal

AWS permissions step showing CloudFormation template and IAM role setup

The portal provides a CloudFormation template that creates the required IAM role with least-privilege permissions. Copy the Role ARN from CloudFormation outputs after the stack completes.
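Before pasting the Role ARN into the portal, a quick format check can catch copy-paste mistakes. The sketch below is not part of EZ-CDC. It assumes the usual IAM role ARN shape (`arn:<partition>:iam::<12-digit account id>:role/<name>`), and the role name in the usage note is hypothetical.

```python
import re

# Assumed shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):role/([\w+=,.@/-]+)$",
    re.ASCII,
)


def parse_role_arn(arn: str) -> dict:
    """Split a role ARN into its parts, or raise ValueError if it is malformed."""
    match = ROLE_ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    partition, account_id, role_path = match.groups()
    return {
        "partition": partition,
        "account_id": account_id,
        # The role name is the last segment after any path prefix.
        "role_name": role_path.rsplit("/", 1)[-1],
    }
```

For example, `parse_role_arn("arn:aws:iam::123456789012:role/ezcdc-deployment-role")` returns the account ID and role name, while an S3 or user ARN raises `ValueError`.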

1.4 Review and Deploy

After configuring your VPC and subnets, review all settings before deploying:

Review screen showing all deployment configuration before clicking deploy

Review your deployment configuration carefully. Verify the IAM Role ARN, VPC, subnets, and connectivity mode are correct before proceeding.

Click Create Deployment to start provisioning. Provisioning takes about 2-3 minutes.

Deployment in progress showing status and provisioning steps

The portal shows real-time progress as Terraform provisions your infrastructure. Wait for the status to show workers are healthy before continuing.

You'll see worker status turn green when ready:

Deployment: production
Status: Active
Workers: 1/1 healthy

Step 2: Add Data Sources

2.1 Add PostgreSQL Source

  1. Go to Datasources → New Datasource
  2. Select PostgreSQL as the type
  3. Enter connection details:

       Host: your-postgres.example.com
       Port: 5432
       Database: myapp
       Username: cdc_user
       Password: ********
       SSL Mode: require

  4. Click Test Connection to verify
  5. Click Save

PostgreSQL datasource form with connection fields

Fill in your PostgreSQL connection details. The Test Connection button verifies connectivity from your EZ-CDC workers to the database before saving.
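If Test Connection fails, it can help to try the same credentials with a standard PostgreSQL client. The sketch below assembles the form fields above into a libpq-style connection URI. The helper name is hypothetical, and a check from your own machine only confirms the credentials, not network reachability from the EZ-CDC workers.

```python
from urllib.parse import quote


def postgres_uri(host: str, port: int, database: str, user: str,
                 password: str, sslmode: str = "require") -> str:
    """Build a libpq connection URI, percent-encoding user, password, and database."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode={sslmode}"
    )


# Values taken from the example form; the password is a placeholder.
uri = postgres_uri("your-postgres.example.com", 5432, "myapp", "cdc_user", "p@ss/word")
print(uri)
```

The resulting URI can be passed to `psql` directly, for example `psql "<uri>" -c "SELECT 1"`.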

PostgreSQL Requirements

Your PostgreSQL server must have wal_level set to logical. See PostgreSQL Setup for details.
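A minimal sketch of how you might check this requirement yourself: run the `SHOW` statements below against the source database, then pass the results to the helper. Only `wal_level` comes from this guide. The checks on `max_replication_slots` and `max_wal_senders` are general PostgreSQL logical replication prerequisites added here as an assumption about what a CDC job needs, and the helper name is hypothetical.

```python
# Statements to run on the source database (standard PostgreSQL).
PREREQUISITE_QUERIES = [
    "SHOW wal_level;",
    "SHOW max_replication_slots;",
    "SHOW max_wal_senders;",
]


def check_logical_replication_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append(
            f"wal_level is {settings.get('wal_level')!r}, expected 'logical'"
        )
    # Assumption: the job needs at least one free slot and one WAL sender.
    if int(settings.get("max_replication_slots", 0)) < 1:
        problems.append("max_replication_slots must be at least 1")
    if int(settings.get("max_wal_senders", 0)) < 1:
        problems.append("max_wal_senders must be at least 1")
    return problems
```

Note that changing `wal_level` (for example with `ALTER SYSTEM SET wal_level = logical;`) only takes effect after a server restart.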

2.2 Add StarRocks Sink

  1. Go to Datasources → New Datasource
  2. Select StarRocks as the type
  3. Enter connection details:

       Host: your-starrocks.example.com
       MySQL Port: 9030
       HTTP Port: 8040
       Database: analytics
       Username: root
       Password: ********

  4. Click Test Connection to verify
  5. Click Save

After adding both datasources, you'll see them listed:

Datasources list showing PostgreSQL source and StarRocks sink

Your datasources list shows all configured connections. Verify both your PostgreSQL source and StarRocks sink appear with successful connection status.

Step 3: Create a Job

A job defines which tables to replicate from source to sink.

3.1 Create the Job

  1. Go to Jobs → New Job
  2. Select your PostgreSQL source
  3. Select your StarRocks sink

Job creation step 1 showing source and sink selection

Select your source database (PostgreSQL) and sink database (StarRocks) from the dropdown menus. The job will replicate data between these two systems.

  4. Choose tables to replicate:

Table selection interface showing available tables with checkboxes

Select the tables you want to replicate. You can select individual tables or use the select-all option. CDC requires every replicated table to have a primary key.

       [x] public.users
       [x] public.orders
       [x] public.order_items
       [ ] public.audit_logs

  5. Configure options:

       Option             Value          Description
       Replication Slot   ezcdc_slot_1   PostgreSQL slot name
       Publication        ezcdc_pub_1    PostgreSQL publication
       Batch Size         10000          Events per batch

  6. Click Create Job
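Once the job exists, you can confirm on the PostgreSQL side that the publication covers the tables you selected. This guide does not say whether EZ-CDC creates the publication for you, so treat the following as a hedged sketch: the query uses the standard `pg_publication_tables` catalog view, and the helper name is hypothetical.

```python
# Parameterized query for any PostgreSQL driver that uses %s placeholders.
PUBLICATION_TABLES_SQL = (
    "SELECT schemaname, tablename "
    "FROM pg_publication_tables WHERE pubname = %s;"
)


def missing_from_publication(selected_tables, published_rows):
    """Return selected tables (as 'schema.table') that the publication does not include.

    published_rows is the result of PUBLICATION_TABLES_SQL: (schemaname, tablename) pairs.
    """
    published = {f"{schema}.{table}" for schema, table in published_rows}
    return sorted(set(selected_tables) - published)


# Example with the tables selected above and a publication missing one of them.
selected = ["public.users", "public.orders", "public.order_items"]
rows = [("public", "users"), ("public", "orders")]
print(missing_from_publication(selected, rows))
```

An empty result means every selected table is part of the publication named in the job options (`ezcdc_pub_1` in this example).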

3.2 Monitor the Job

The job will start automatically. Monitor progress in the portal:

Jobs list showing running job with status and metrics

The jobs list shows all your replication jobs with their current status, throughput, and lag metrics. A green status indicates the job is running successfully.

Job: postgres-to-starrocks
Status: Running
Events/sec: 1,247
Lag: 0.5 KB
Tables: 3

Step 4: Verify Replication

Check StarRocks

Connect to your StarRocks cluster and verify data is flowing:

-- Check row counts
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM orders;

-- Check audit columns (added automatically)
SELECT
  id,
  name,
  _cdc_updated_at,  -- When the row was last modified
  _cdc_deleted      -- Soft delete flag
FROM users
LIMIT 5;

Check Metrics

In the EZ-CDC portal, go to Jobs → your job → Metrics:

  • Throughput: Events processed per second
  • Lag: Bytes behind the source WAL
  • Latency: Time from change to replication
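To make "bytes behind the source WAL" concrete, the sketch below computes the byte distance between two PostgreSQL LSNs, the same arithmetic that `pg_wal_lsn_diff` performs. How EZ-CDC derives its Lag metric is not documented here, so comparing `pg_current_wal_lsn()` with a slot's `confirmed_flush_lsn` is an assumption, and the function names are hypothetical.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/16B3748' to an absolute byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position written as two 32-bit hexadecimal halves.
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(current_wal_lsn: str, confirmed_flush_lsn: str) -> int:
    """Bytes of WAL written by the source but not yet confirmed by the consumer."""
    return lsn_to_int(current_wal_lsn) - lsn_to_int(confirmed_flush_lsn)


# Illustrative LSN values, not taken from a real system.
print(lag_bytes("0/16B3948", "0/16B3748"))
```

The two input values can be read from the source database with `SELECT pg_current_wal_lsn();` and `SELECT confirmed_flush_lsn FROM pg_replication_slots WHERE slot_name = 'ezcdc_slot_1';`.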

Troubleshooting

Job stuck in "Pending"

  • Verify workers are healthy in Deployments
  • Check that workers can reach both the source and sink databases

Connection test fails

  • Verify security groups allow worker access
  • Check database credentials are correct
  • Ensure SSL settings match your database
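To separate network problems from credential problems, a plain TCP check is often enough. The sketch below uses only the Python standard library. It is only meaningful when run from a host in the same VPC and subnets as your workers, which is an assumption about your setup, and the endpoint list simply reuses the example hosts and ports from this guide.

```python
import socket

# Example endpoints from this guide; replace with your own hosts.
ENDPOINTS = [
    ("your-postgres.example.com", 5432),   # PostgreSQL
    ("your-starrocks.example.com", 9030),  # StarRocks MySQL port
    ("your-starrocks.example.com", 8040),  # StarRocks HTTP port
]


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def reachability_report(endpoints, timeout: float = 3.0) -> dict:
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

Calling `reachability_report(ENDPOINTS)` returns one boolean per endpoint. A `False` entry points at security groups, routing, or DNS rather than at credentials or SSL settings.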

High replication lag

  • Increase batch size for higher throughput
  • Check sink database performance
  • Consider adding more workers

Next Steps

Congratulations! You've set up your first CDC pipeline.

Complete Example

Here's a visual summary of what you've built:

[Diagram: PostgreSQL (Source) → WAL Stream → EZ-CDC Worker (Snapshot, CDC Stream) → Stream Load → StarRocks (Sink)]
Simple CDC Flow: PostgreSQL to StarRocks via EZ-CDC