Sources Overview

Sources are the databases from which EZ-CDC captures changes. EZ-CDC connects to your source database's change stream and captures every INSERT, UPDATE, and DELETE operation in real time.

Supported Sources

Source     | CDC Method                     | Status         | Documentation
PostgreSQL | Logical Replication (pgoutput) | ✅ Stable      | Setup Guide
MySQL      | Binary Log (binlog)            | 🔜 Coming Soon | -
Oracle     | LogMiner                       | 🔜 Planned     | -
SQL Server | Change Tracking                | 🔜 Planned     | -

How Sources Work

Change Data Capture (CDC)

CDC captures changes by reading the database's transaction log:

[Diagram: PostgreSQL (Source) → WAL Stream → EZ-CDC Worker (Snapshot, CDC Stream) → Stream Load → StarRocks (Sink)]
Simple CDC Flow: PostgreSQL to StarRocks via EZ-CDC

Event Types

EZ-CDC captures these event types:

Event  | Description       | Data Included
INSERT | New row added     | All column values
UPDATE | Row modified      | New values + primary key
DELETE | Row removed       | Primary key
BEGIN  | Transaction start | Transaction ID
COMMIT | Transaction end   | Transaction ID, LSN
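
As an illustration of how BEGIN and COMMIT frame the row-level events, the sketch below groups a flat event stream into committed transactions. The `Event` and `Transaction` types and the `group_by_transaction` function are hypothetical names invented for this example, not part of EZ-CDC; it also assumes events of one transaction arrive contiguously, and whether EZ-CDC buffers changes this way is not stated here.

```rust
// Hypothetical event type mirroring the table above (not EZ-CDC's actual type).
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Begin { xid: u32 },
    Insert { table: String },
    Update { table: String },
    Delete { table: String },
    Commit { xid: u32, lsn: u64 },
}

// A committed transaction: its ID, commit LSN, and row-level changes in order.
#[derive(Debug, PartialEq)]
struct Transaction {
    xid: u32,
    commit_lsn: u64,
    changes: Vec<Event>,
}

// Collect the changes between each BEGIN and its matching COMMIT.
// Transactions that never see a COMMIT are not emitted.
fn group_by_transaction(events: Vec<Event>) -> Vec<Transaction> {
    let mut done = Vec::new();
    let mut open: Option<(u32, Vec<Event>)> = None;
    for event in events {
        match event {
            Event::Begin { xid } => open = Some((xid, Vec::new())),
            Event::Commit { xid, lsn } => {
                if let Some((open_xid, changes)) = open.take() {
                    if open_xid == xid {
                        done.push(Transaction { xid, commit_lsn: lsn, changes });
                    }
                }
            }
            // INSERT, UPDATE, or DELETE: attach to the open transaction, if any.
            change => {
                if let Some((_, changes)) = open.as_mut() {
                    changes.push(change);
                }
            }
        }
    }
    done
}
```

Emitting changes only at COMMIT is one way to keep a sink from seeing partial transactions; the commit LSN then serves as a natural checkpoint position.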

Normalized Format

All sources emit events in a normalized CdcRecord format:

enum CdcRecord {
    Insert {
        schema: String,
        table: String,
        values: HashMap<String, Value>,
    },
    Update {
        schema: String,
        table: String,
        values: HashMap<String, Value>,
        key: HashMap<String, Value>,
    },
    Delete {
        schema: String,
        table: String,
        key: HashMap<String, Value>,
    },
    // ...
}

This allows any source to work with any sink.
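
To make the source-agnostic idea concrete, here is a minimal sketch of sink-side code that handles a record without knowing which database produced it. The `Value` enum is a hypothetical stand-in (its real definition is not shown on this page), only the three variants listed above are modeled, and `describe` is an illustrative helper rather than an EZ-CDC function.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the `Value` type, which this page does not define.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    Null,
}

// Only the three variants shown above; the elided ones are left out.
#[derive(Debug, Clone)]
enum CdcRecord {
    Insert { schema: String, table: String, values: HashMap<String, Value> },
    Update {
        schema: String,
        table: String,
        values: HashMap<String, Value>,
        key: HashMap<String, Value>,
    },
    Delete { schema: String, table: String, key: HashMap<String, Value> },
}

// Summarize a record using only the normalized fields, so the same code
// works regardless of the source database.
fn describe(record: &CdcRecord) -> String {
    match record {
        CdcRecord::Insert { schema, table, values } => {
            format!("INSERT {}.{} ({} columns)", schema, table, values.len())
        }
        CdcRecord::Update { schema, table, values, key } => format!(
            "UPDATE {}.{} ({} columns, {} key columns)",
            schema,
            table,
            values.len(),
            key.len()
        ),
        CdcRecord::Delete { schema, table, key } => {
            format!("DELETE {}.{} ({} key columns)", schema, table, key.len())
        }
    }
}
```

A real sink would translate each variant into a write against the target system; the match on the enum is the part that stays the same across sources.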

Source Configuration

Common Settings

All sources share these configuration options:

Setting        | Description                | Default
Connection URL | Database connection string | Required
Tables         | Tables to replicate        | Required
SSL Mode       | TLS connection settings    | prefer
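
The common settings can be modeled as a small configuration type, sketched below. `SourceConfig`, `SslMode`, and their methods are hypothetical names for illustration only; the sketch encodes what the table states (two required fields, SSL mode defaulting to `prefer`) and models only the two SSL modes mentioned on this page.

```rust
// Only the two modes named on this page; a real source may support more.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SslMode {
    Prefer,
    Require,
}

// Hypothetical container for the settings shared by all sources.
#[derive(Debug, Clone, PartialEq)]
struct SourceConfig {
    connection_url: String,
    tables: Vec<String>,
    ssl_mode: SslMode,
}

impl SourceConfig {
    // Both required settings must be supplied; SSL mode starts at `prefer`.
    fn new(connection_url: &str, tables: &[&str]) -> Result<SourceConfig, String> {
        if connection_url.trim().is_empty() {
            return Err("connection URL is required".to_string());
        }
        if tables.is_empty() {
            return Err("at least one table is required".to_string());
        }
        Ok(SourceConfig {
            connection_url: connection_url.to_string(),
            tables: tables.iter().map(|t| t.to_string()).collect(),
            ssl_mode: SslMode::Prefer,
        })
    }

    // Override the default SSL mode, e.g. to enforce encryption.
    fn with_ssl_mode(mut self, mode: SslMode) -> SourceConfig {
        self.ssl_mode = mode;
        self
    }
}
```

For example, `SourceConfig::new("postgres://cdc_user@db.example.com:5432/app", &["public.users"])` yields a configuration with `ssl_mode` set to `Prefer`; the URL and table name are placeholders.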

Source-Specific Settings

Each source has additional settings:

  • PostgreSQL: Replication slot, publication name
  • MySQL: Server ID, binlog position
  • Oracle: LogMiner settings

Adding a Source

Via Portal

  1. Go to Datasources → New Datasource
  2. Select source type (e.g., PostgreSQL)
  3. Enter connection details
  4. Test connection
  5. Save

Best Practices

1. Use Dedicated CDC User

Create a user specifically for CDC with minimal permissions:

-- PostgreSQL example
CREATE USER cdc_user WITH REPLICATION LOGIN PASSWORD 'secret';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO cdc_user;

2. Enable SSL/TLS

Always use encrypted connections:

{
  "ssl_mode": "require"
}

3. Monitor Replication Lag

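Set up alerts on replication lag so that a slow or stalled pipeline is detected early. On PostgreSQL, a replication slot that falls behind also forces the source to retain WAL, which can eventually fill its disk.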
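
On PostgreSQL, lag can be expressed as the byte distance between two LSNs, for example the server's current WAL position (`pg_current_wal_lsn()`) and a slot's `confirmed_flush_lsn` from `pg_replication_slots`. The sketch below parses the textual LSN format (two hexadecimal numbers separated by a slash) and computes that distance. The function names and the idea of a fixed byte threshold are assumptions for illustration; how EZ-CDC itself reports lag is not covered here.

```rust
// Parse a PostgreSQL LSN such as "16/B374D848" into a 64-bit byte position.
// The part before the slash is the high 32 bits, the part after is the low 32.
fn parse_lsn(text: &str) -> Option<u64> {
    let (high, low) = text.split_once('/')?;
    let high = u32::from_str_radix(high, 16).ok()?;
    let low = u32::from_str_radix(low, 16).ok()?;
    Some(((high as u64) << 32) | (low as u64))
}

// Bytes of WAL the consumer still has to process.
// Returns None if either LSN is malformed; never underflows.
fn lag_bytes(current_lsn: &str, confirmed_lsn: &str) -> Option<u64> {
    let current = parse_lsn(current_lsn)?;
    let confirmed = parse_lsn(confirmed_lsn)?;
    Some(current.saturating_sub(confirmed))
}

// Hypothetical alert rule: fire when lag exceeds a chosen byte threshold.
fn should_alert(lag: u64, threshold_bytes: u64) -> bool {
    lag > threshold_bytes
}
```

A monitoring job could fetch both LSNs periodically, pass them to `lag_bytes`, and alert when `should_alert` returns true; a suitable threshold depends on the source's write volume.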

4. Plan for Schema Changes

  • Test schema changes in staging first
  • Prefer backward-compatible changes when possible (add columns rather than removing them)
  • Monitor jobs after schema changes
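
As a rough aid for the second point, the sketch below compares a table's column list before and after a change and flags removals. It treats "compatible" as purely additive, following the guideline above; type changes and renames are not modeled, and `diff_columns` and `SchemaDiff` are hypothetical names, not EZ-CDC features.

```rust
use std::collections::HashSet;

// Columns added and removed between two versions of a table.
#[derive(Debug, PartialEq)]
struct SchemaDiff {
    added: Vec<String>,
    removed: Vec<String>,
}

impl SchemaDiff {
    // Under the additive-only assumption, any removed column is a breaking change.
    fn is_compatible(&self) -> bool {
        self.removed.is_empty()
    }
}

// Compare column names only, preserving the order in which they were listed.
fn diff_columns(old: &[&str], new: &[&str]) -> SchemaDiff {
    let old_set: HashSet<&str> = old.iter().copied().collect();
    let new_set: HashSet<&str> = new.iter().copied().collect();
    let added = new
        .iter()
        .filter(|c| !old_set.contains(*c))
        .map(|c| c.to_string())
        .collect();
    let removed = old
        .iter()
        .filter(|c| !new_set.contains(*c))
        .map(|c| c.to_string())
        .collect();
    SchemaDiff { added, removed }
}
```

Running such a check against staging before applying a migration gives an early signal that a change may need coordination with downstream sinks.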

Next Steps