Network Isolation

EZ-CDC workers run inside your own cloud network, accept no inbound connections, and keep row data within your VPC.

Network Architecture

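[Diagram: the EZ-CDC control plane (API, gRPC) and the customer AWS account. Inside the VPC's private subnet, the worker (security group: no inbound) reads the WAL from PostgreSQL and stream-loads into StarRocks over a private path. Outbound HTTPS on port 443 reaches the control plane through NAT in standard mode or through a VPC endpoint (VPCE) in PrivateLink mode.]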
Network Isolation: Workers initiate all connections (no inbound traffic)

Zero Inbound Connections

Workers require no inbound connections:

Worker Security Group

resource "aws_security_group" "worker" {
  name = "ezcdc-worker"

  # NO INBOUND RULES
  # Workers never accept incoming connections

  # Outbound only
  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # Control plane
  }

  egress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.postgres.id]
  }

  egress {
    from_port       = 8040
    to_port         = 8040
    protocol        = "tcp"
    security_groups = [aws_security_group.starrocks.id]
  }
}

GCP Firewall Rules

# Deny all ingress (priority 1000)
resource "google_compute_firewall" "deny_all_ingress" {
  name      = "ezcdc-deny-all-ingress"
  network   = var.network
  direction = "INGRESS"
  priority  = 1000

  deny {
    protocol = "all"
  }

  source_ranges = ["0.0.0.0/0"]
  target_tags   = ["ez-cdc-worker"]
}

# Deny all egress baseline (priority 1100)
resource "google_compute_firewall" "deny_all_egress" {
  name      = "ezcdc-deny-all-egress"
  network   = var.network
  direction = "EGRESS"
  priority  = 1100

  deny {
    protocol = "all"
  }

  destination_ranges = ["0.0.0.0/0"]
  target_tags        = ["ez-cdc-worker"]
}

# Allow only required egress (priority 900; lower numbers win over the 1100 deny)
resource "google_compute_firewall" "allow_egress_https" {
  name      = "ezcdc-allow-egress-https"
  network   = var.network
  direction = "EGRESS"
  priority  = 900

  allow {
    protocol = "tcp"
    ports    = ["443", "50051"]
  }

  destination_ranges = ["0.0.0.0/0"]
  target_tags        = ["ez-cdc-worker"]
}
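
The deny-all egress baseline above also blocks worker traffic to the source and sink. If the databases are reached over the same VPC network, an additional allow rule would likely be needed. The following is only a sketch: it assumes the PostgreSQL and StarRocks ports from the AWS example (5432 and 8040) and a hypothetical variable `var.data_subnet_cidr` for the range that hosts the databases.

```hcl
# Sketch: allow worker egress to the databases.
# Priority 900 takes precedence over the deny-all egress rule at 1100.
resource "google_compute_firewall" "allow_egress_databases" {
  name      = "ezcdc-allow-egress-databases"
  network   = var.network
  direction = "EGRESS"
  priority  = 900

  allow {
    protocol = "tcp"
    ports    = ["5432", "8040"] # assumed ports, taken from the AWS example
  }

  # Assumption: var.data_subnet_cidr is the CIDR range that hosts the databases
  destination_ranges = [var.data_subnet_cidr]
  target_tags        = ["ez-cdc-worker"]
}
```

Scoping `destination_ranges` to the database range, rather than `0.0.0.0/0`, keeps the rule as narrow as the security-group references used on AWS.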

Why No Inbound?

EZ-CDC uses a pull-based architecture:

  1. Workers poll the control plane for jobs
  2. Workers push metrics and status to the control plane
  3. The control plane never initiates connections to workers

This means:

  • No public IPs on workers
  • No ports exposed to internet
  • No attack surface from inbound traffic
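
The "no public IPs" point can be enforced at launch time. The sketch below assumes workers run as EC2 instances, which this page does not state; `var.worker_ami` and `var.worker_instance_type` are hypothetical names used only for illustration.

```hcl
# Sketch: a worker instance in a private subnet with no public IP.
resource "aws_instance" "worker" {
  ami           = var.worker_ami           # hypothetical variable
  instance_type = var.worker_instance_type # hypothetical variable
  subnet_id     = aws_subnet.private_a.id

  # Never assign a public address, even if the subnet default changes
  associate_public_ip_address = false

  # The security group with egress rules only (no ingress blocks)
  vpc_security_group_ids = [aws_security_group.worker.id]
}
```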

Data Path Isolation

Data Never Leaves Your VPC

PostgreSQL ──────────────────────────▶ StarRocks
    │                                      │
    │           Within Your VPC            │
    │                                      │
    └──────────────────────────────────────┘

Row data flows from source to sink through the worker, entirely within your VPC. Only schema metadata, credentials, and metrics cross the VPC boundary:

Data Type        | Path                   | Crosses VPC?
Row data         | Source → Worker → Sink | ❌ No
Schema metadata  | Worker → Control Plane | ✅ Yes (encrypted)
Credentials      | Control Plane → Worker | ✅ Yes (encrypted)
Metrics          | Worker → Control Plane | ✅ Yes

Connectivity Options

Standard (Internet)

Worker ──▶ NAT Gateway ──▶ Internet ──▶ Control Plane

TLS Encrypted
  • Workers in private subnets
  • Outbound via NAT Gateway
  • All traffic TLS encrypted
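
For the NAT Gateway path, a minimal routing sketch might look like the following. It assumes AWS provider v5 or later (for `domain = "vpc"`), and `aws_route_table.private` is a hypothetical name for the route table attached to the worker subnets.

```hcl
# Sketch: outbound-only internet access for private subnets via a NAT Gateway.
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id # the NAT Gateway itself lives in a public subnet
}

# Default route for the worker subnets points at the NAT Gateway
resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id # hypothetical route table
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main.id
}
```

A NAT Gateway only translates connections initiated from inside the VPC, so it does not open an inbound path to the workers.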

PrivateLink (AWS)

Worker ──▶ VPC Endpoint ──▶ AWS Network ──▶ Control Plane

Private IPs
No Internet
  • No internet exposure
  • Traffic stays on AWS backbone
  • Private IP addresses only
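
The VPC endpoint path could be provisioned roughly as follows. This is a sketch under assumptions: the endpoint service name is not given on this page, so `var.ezcdc_endpoint_service_name` is a hypothetical placeholder, and `aws_security_group.endpoint` is an illustrative security group that admits port 443 from the workers.

```hcl
# Sketch: an interface VPC endpoint for reaching the control plane privately.
resource "aws_vpc_endpoint" "control_plane" {
  vpc_id            = aws_vpc.main.id
  vpc_endpoint_type = "Interface"

  # Assumption: the real service name must come from EZ-CDC; not stated here
  service_name = var.ezcdc_endpoint_service_name

  subnet_ids         = [aws_subnet.private_a.id, aws_subnet.private_b.id]
  security_group_ids = [aws_security_group.endpoint.id] # hypothetical
}
```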

Cloud NAT (GCP)

Worker ──▶ Cloud NAT ──▶ Internet ──▶ Control Plane

mTLS + TLS
No Public IPs
  • No external IPs on workers
  • All egress via Cloud NAT gateway
  • mTLS authentication with control-plane
  • NAT scoped to worker subnetwork only
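
Scoping Cloud NAT to the worker subnetwork only could be sketched as below. `var.region` and the resource names are assumptions made for illustration; `google_compute_subnetwork.workers` refers to the worker subnetwork defined elsewhere in your configuration.

```hcl
# Sketch: Cloud NAT limited to the worker subnetwork.
resource "google_compute_router" "workers" {
  name    = "ezcdc-workers-router"
  network = var.network
  region  = var.region # hypothetical variable
}

resource "google_compute_router_nat" "workers" {
  name                   = "ezcdc-workers-nat"
  router                 = google_compute_router.workers.name
  region                 = var.region
  nat_ip_allocate_option = "AUTO_ONLY"

  # NAT only the listed subnetworks instead of every subnetwork in the region
  source_subnetwork_ip_ranges_to_nat = "LIST_OF_SUBNETWORKS"

  subnetwork {
    name                    = google_compute_subnetwork.workers.id
    source_ip_ranges_to_nat = ["ALL_IP_RANGES"]
  }
}
```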

Database Isolation

Source Database

Only workers need access to your source:

resource "aws_security_group_rule" "postgres_inbound" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.worker.id # Only workers
  security_group_id        = aws_security_group.postgres.id
}

Sink Database

Only workers need access to your sink:

resource "aws_security_group_rule" "starrocks_inbound" {
  type                     = "ingress"
  from_port                = 8040
  to_port                  = 8040
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.worker.id # Only workers
  security_group_id        = aws_security_group.starrocks.id
}

Network Segmentation

For optimal security, segment your VPC into separate subnets for databases (Data Subnet) and CDC workers (CDC Subnet). Workers communicate with databases across subnet boundaries via security group rules, providing an additional layer of isolation.
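
As an illustration of this layout, the two subnets might be declared as follows. `var.cdc_subnet_cidr` is a hypothetical variable introduced here, and the subnet names are assumptions.

```hcl
# Sketch: separate private subnets for databases and CDC workers.
resource "aws_subnet" "data" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.data_subnet_cidr
  map_public_ip_on_launch = false

  tags = {
    Name = "data-subnet"
  }
}

resource "aws_subnet" "cdc" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.cdc_subnet_cidr # hypothetical variable
  map_public_ip_on_launch = false

  tags = {
    Name = "cdc-subnet"
  }
}
```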

Network ACLs

Network ACLs add a stateless, subnet-level layer of control on top of security groups:

resource "aws_network_acl_rule" "worker_outbound_postgres" {
  network_acl_id = aws_network_acl.cdc.id
  rule_number    = 100
  egress         = true
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = var.data_subnet_cidr
  from_port      = 5432
  to_port        = 5432
}
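
Because network ACLs are stateless, an egress rule alone does not admit the replies. A matching ingress rule for the ephemeral port range is typically needed; the sketch below assumes the common 1024-65535 range and reuses the names from the rule above.

```hcl
# Sketch: allow return traffic from the data subnet back to the workers.
resource "aws_network_acl_rule" "worker_inbound_postgres_return" {
  network_acl_id = aws_network_acl.cdc.id
  rule_number    = 100 # ingress and egress rules are numbered independently
  egress         = false
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = var.data_subnet_cidr
  from_port      = 1024 # assumed ephemeral port range
  to_port        = 65535
}
```

This does not reopen the workers to inbound connections: the worker security group is stateful and has no ingress rules, so unsolicited packets that pass the ACL are still dropped.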

Monitoring Network Access

VPC Flow Logs

Enable flow logs to monitor network activity:

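resource "aws_flow_log" "cdc" {
  vpc_id          = aws_vpc.main.id
  traffic_type    = "ALL"
  log_destination = aws_cloudwatch_log_group.flow_logs.arn

  # Required when delivering to CloudWatch Logs; the role name is illustrative
  iam_role_arn = aws_iam_role.flow_logs.arn
}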

GCP VPC Flow Logs

Enable flow logs on the worker subnetwork:

resource "google_compute_subnetwork" "workers" {
  name = "ezcdc-workers"
  # ...

  log_config {
    aggregation_interval = "INTERVAL_5_SEC"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

CloudWatch Insights Query

fields @timestamp, srcAddr, dstAddr, dstPort, action
| filter dstPort = 5432 or dstPort = 8040
| sort @timestamp desc
| limit 100
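
To keep this query alongside the rest of the infrastructure, it could be saved as a CloudWatch Logs Insights query definition. This is a sketch; the query name is an assumption, and the log group is the one used as the flow log destination.

```hcl
# Sketch: store the database-traffic query as a saved Logs Insights query.
resource "aws_cloudwatch_query_definition" "cdc_database_traffic" {
  name            = "ezcdc/database-traffic" # hypothetical name
  log_group_names = [aws_cloudwatch_log_group.flow_logs.name]

  query_string = <<-EOT
    fields @timestamp, srcAddr, dstAddr, dstPort, action
    | filter dstPort = 5432 or dstPort = 8040
    | sort @timestamp desc
    | limit 100
  EOT
}
```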

Best Practices

1. Use Private Subnets

Never place workers in public subnets:

# Good: Private subnet
subnet_ids = [aws_subnet.private_a.id, aws_subnet.private_b.id]

# Bad: Public subnet
# subnet_ids = [aws_subnet.public.id]

2. Minimize Egress Rules

Only allow necessary outbound traffic:

# Good: Specific destinations
egress {
  from_port   = 443
  to_port     = 443
  protocol    = "tcp"
  cidr_blocks = ["<control-plane-ip>/32"] # EZ-CDC control plane IP
}

# Acceptable: All HTTPS
egress {
  from_port   = 443
  to_port     = 443
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]
}

# Bad: All traffic
# egress {
#   from_port   = 0
#   to_port     = 0
#   protocol    = "-1"
#   cidr_blocks = ["0.0.0.0/0"]
# }

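3. Use PrivateLink for Sensitive Workloads

For regulated industries or high-security requirements, use PrivateLink mode so that control-plane traffic stays on the AWS backbone.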

(GCP) Use Cloud NAT for Sensitive Workloads

For regulated industries or high-security requirements on GCP.

4. Enable VPC Flow Logs

Monitor and audit all network traffic.

(GCP) Enable VPC Flow Logs

Monitor and audit all network traffic via subnetwork flow logs.
