StarRocks Requirements

This guide covers the requirements for using StarRocks as a CDC sink with EZ-CDC.

Version Requirements

| Requirement | Minimum | Recommended |
|---|---|---|
| StarRocks Version | 2.5 | 3.0+ |
| Stream Load | Enabled | - |
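
A preflight check can compare the version reported by the cluster against the table above. The following is a minimal Python sketch, not part of EZ-CDC. It assumes the version string begins with a dotted numeric version (for example, the value returned by `SELECT current_version();`), and the helper names `parse_version` and `check_version` are hypothetical.

```python
import re
from typing import Tuple

MINIMUM_VERSION = (2, 5)      # minimum from the table above
RECOMMENDED_VERSION = (3, 0)  # recommended from the table above


def parse_version(raw: str) -> Tuple[int, ...]:
    """Extract the leading dotted numeric version, e.g. '3.1.2 abc123' -> (3, 1, 2)."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", raw)
    if match is None:
        raise ValueError(f"unrecognized version string: {raw!r}")
    parts = tuple(int(part) for part in match.group(1).split("."))
    # Pad to at least major.minor so that "3" compares equal to "3.0".
    return parts + (0,) * (2 - len(parts))


def check_version(raw: str) -> str:
    """Classify a version string as 'unsupported', 'supported', or 'recommended'."""
    version = parse_version(raw)
    if version < MINIMUM_VERSION:
        return "unsupported"
    if version < RECOMMENDED_VERSION:
        return "supported"
    return "recommended"
```

Anything trailing the numeric part, such as a build hash, is ignored by the comparison.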

Architecture Overview

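A StarRocks cluster consists of two kinds of nodes, and EZ-CDC talks to both:

  • Frontend (FE) nodes accept MySQL-protocol connections and handle metadata queries and DDL
  • Backend (BE) nodes store data and receive Stream Load requests over HTTP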

Network Requirements

EZ-CDC workers need access to:

| Component | Port | Protocol | Purpose |
|---|---|---|---|
| Frontend (FE) | 9030 | MySQL | Metadata queries, DDL |
| Backend (BE) | 8040 | HTTP | Stream Load (data ingestion) |
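
To confirm that a worker can reach both ports, a plain TCP connection test is usually enough as a first step. The sketch below is one possible check in Python. The host names `starrocks-fe` and `starrocks-be` are placeholders, and the function names are hypothetical. A successful TCP connection only shows that the port is reachable; it does not validate credentials.

```python
import socket
from typing import Dict, Tuple

# Ports come from the table above; host names are placeholders.
REQUIRED_ENDPOINTS: Dict[str, Tuple[str, int]] = {
    "FE (MySQL protocol)": ("starrocks-fe", 9030),
    "BE (Stream Load HTTP)": ("starrocks-be", 8040),
}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_endpoints(endpoints: Dict[str, Tuple[str, int]] = REQUIRED_ENDPOINTS) -> Dict[str, bool]:
    """Map each endpoint name to whether it is reachable."""
    return {name: can_connect(host, port) for name, (host, port) in endpoints.items()}
```

Running `check_endpoints()` from an EZ-CDC worker host returns one boolean per endpoint, which helps separate firewall problems from authentication problems.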

Firewall Rules

# Allow MySQL protocol to FE
resource "aws_security_group_rule" "starrocks_fe" {
type = "ingress"
from_port = 9030
to_port = 9030
protocol = "tcp"
source_security_group_id = var.ezcdc_worker_sg_id
security_group_id = var.starrocks_sg_id
}

# Allow HTTP to BE (Stream Load)
resource "aws_security_group_rule" "starrocks_be" {
type = "ingress"
from_port = 8040
to_port = 8040
protocol = "tcp"
source_security_group_id = var.ezcdc_worker_sg_id
security_group_id = var.starrocks_sg_id
}

User Permissions

Create CDC User

-- First connect to the StarRocks FE over the MySQL protocol, from a shell:
--   mysql -h starrocks-fe -P 9030 -u root

-- Create user
CREATE USER 'ezcdc_user' IDENTIFIED BY 'your_password';

-- Grant permissions
GRANT SELECT, INSERT, UPDATE, DELETE ON analytics.* TO 'ezcdc_user';
GRANT ALTER ON analytics.* TO 'ezcdc_user';
GRANT CREATE ON analytics.* TO 'ezcdc_user';

-- Required for Stream Load
GRANT LOAD ON analytics.* TO 'ezcdc_user';

Required Privileges

| Privilege | Purpose |
|---|---|
| SELECT | Check table structure |
| INSERT | Load data via Stream Load |
| UPDATE | Update existing rows |
| DELETE | Delete rows |
| ALTER | Modify table schema |
| CREATE | Create new tables |
| LOAD | Use Stream Load |

Table Requirements

Table Engine

Use Primary Key tables for CDC (recommended):

CREATE TABLE orders (
id BIGINT NOT NULL,
customer_id BIGINT,
total DECIMAL(10, 2),
status VARCHAR(50),
created_at DATETIME,
_cdc_updated_at DATETIME,
_cdc_deleted BOOLEAN DEFAULT "false"
)
PRIMARY KEY (id)
DISTRIBUTED BY HASH(id) BUCKETS 8
PROPERTIES (
"replication_num" = "1"
);

Primary Key tables:

  • Support UPSERT operations
  • Enable UPDATE and DELETE
  • Best for CDC workloads
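
Because a Primary Key table replaces the existing row with the same key on load, inserts, updates, and soft deletes can all be written as a row keyed by `id`. The sketch below shows one way a change event could map onto the `orders` columns above. It assumes soft deletes through the `_cdc_deleted` column; the event shape (`op`, `before`, `after`, `ts`) and the function name `to_row` are hypothetical, and the mapping EZ-CDC actually applies may differ.

```python
from typing import Any, Dict

VALID_OPS = ("insert", "update", "delete")


def to_row(event: Dict[str, Any]) -> Dict[str, Any]:
    """Map a change event to a row for a Primary Key table.

    Assumed (hypothetical) event shape:
      {"op": "insert" | "update" | "delete",
       "before": {...}, "after": {...}, "ts": "YYYY-MM-DD HH:MM:SS"}
    """
    op = event["op"]
    if op not in VALID_OPS:
        raise ValueError(f"unknown operation: {op!r}")
    # Deletes carry the last known values in "before"; writes use "after".
    source = event["before"] if op == "delete" else event["after"]
    row = dict(source)
    row["_cdc_updated_at"] = event["ts"]
    row["_cdc_deleted"] = op == "delete"
    return row
```

With this mapping a delete becomes an upsert that sets `_cdc_deleted` to true, so downstream queries need to filter on that column.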

Alternative: Duplicate Key Tables

For append-only workloads:

CREATE TABLE events (
event_id BIGINT,
event_type VARCHAR(50),
payload JSON,
created_at DATETIME
)
DUPLICATE KEY (event_id)
DISTRIBUTED BY HASH(event_id) BUCKETS 8;

Stream Load Configuration

Verify Stream Load

Check that Stream Load is working:

# Test Stream Load endpoint
curl -X PUT \
-H "Expect: 100-continue" \
-H "label:test_$(date +%s)" \
-H "format: json" \
-H "strip_outer_array: true" \
-T test_data.json \
-u ezcdc_user:your_password \
http://starrocks-be:8040/api/analytics/test_table/_stream_load
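
The same request can be assembled programmatically. The following Python sketch mirrors the curl call above using only the standard library. It builds the request without sending it, and the function names are hypothetical. It assumes the Stream Load response is a JSON object whose `Status` field is `Success` for a committed load.

```python
import base64
import json
import time
import urllib.request
from typing import Any, Dict, List, Optional


def build_stream_load_request(
    base_url: str,
    database: str,
    table: str,
    user: str,
    password: str,
    rows: List[Dict[str, Any]],
    label: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a Stream Load PUT request for a JSON array of rows."""
    label = label or f"test_{int(time.time())}"
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/{database}/{table}/_stream_load",
        data=json.dumps(rows).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {credentials}",
            "Expect": "100-continue",
            "label": label,
            "format": "json",
            "strip_outer_array": "true",
        },
    )


def load_succeeded(response_body: str) -> bool:
    """Treat a response whose Status is "Success" as a successful load."""
    return json.loads(response_body).get("Status") == "Success"
```

Passing the request to `urllib.request.urlopen` with a timeout and feeding the decoded response body to `load_succeeded` completes the round trip. Reusing a label makes StarRocks reject the duplicate load, so each batch should get its own label.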

Stream Load Limits

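| Parameter | Default | Description |
|---|---|---|
| streaming_load_max_mb | 10240 | Maximum size of a single Stream Load request (MB) |
| streaming_load_max_batch_size_mb | 100 | Maximum size of a single JSON-format load (MB) |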
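
To stay under these limits, a client can split rows into batches by their serialized size before loading. The sketch below is one way to do that in Python. The function name `batch_by_size` is hypothetical, and it assumes rows are sent as a JSON array encoded with `json.dumps` defaults.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

ARRAY_OVERHEAD = 2   # the enclosing "[" and "]"
SEPARATOR_LEN = 2    # json.dumps joins array items with ", "


def batch_by_size(rows: Iterable[Dict[str, Any]], max_bytes: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of rows whose JSON-array encoding stays within max_bytes."""
    batch: List[Dict[str, Any]] = []
    size = ARRAY_OVERHEAD
    for row in rows:
        encoded_len = len(json.dumps(row).encode("utf-8"))
        if encoded_len + ARRAY_OVERHEAD > max_bytes:
            raise ValueError("a single row exceeds max_bytes")
        extra = encoded_len + (SEPARATOR_LEN if batch else 0)
        if batch and size + extra > max_bytes:
            # Adding this row would overflow: emit the batch and start a new one.
            yield batch
            batch, size = [], ARRAY_OVERHEAD
            extra = encoded_len
        batch.append(row)
        size += extra
    if batch:
        yield batch
```

For example, `batch_by_size(rows, 100 * 1024 * 1024)` keeps each batch within a 100 MB budget; a lower value leaves headroom for memory pressure on the BE.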

Performance Tuning

Backend (BE) Settings

For high-throughput CDC:

# be.conf
streaming_load_rpc_max_alive_time_sec = 1200
load_process_max_memory_limit_percent = 50

Frontend (FE) Settings

# fe.conf
stream_load_default_timeout_second = 600

Verification Script

-- 1. Check StarRocks version
SELECT current_version();

-- 2. Verify user permissions
SHOW GRANTS FOR 'ezcdc_user';

-- 3. Check databases
SHOW DATABASES;

-- 4. Test table creation
CREATE DATABASE IF NOT EXISTS ezcdc_test;
USE ezcdc_test;
CREATE TABLE test_table (
id BIGINT NOT NULL
)
PRIMARY KEY (id)
DISTRIBUTED BY HASH(id) BUCKETS 1;

-- 5. Clean up
DROP TABLE test_table;
DROP DATABASE ezcdc_test;

Common Issues

"Failed to connect to FE"

Cause: Network or authentication issue.

Solutions:

  1. Check that security groups allow port 9030 from the EZ-CDC workers
  2. Verify the username and password
  3. Try connecting with the mysql client from a worker host

"Stream Load timeout"

Cause: Large batch or slow BE.

Solutions:

  1. Reduce batch size in EZ-CDC
  2. Increase stream_load_default_timeout_second
  3. Check BE resource utilization
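
One way to apply the first solution automatically is to halve a batch whenever a load times out. The Python sketch below illustrates the idea with an injected `send` callable, so it makes no assumption about how EZ-CDC sends data. The function name is hypothetical. It also assumes that re-sending rows is safe, which holds for Primary Key tables because a repeated row is an upsert.

```python
from typing import Any, Callable, Sequence


def load_with_split(
    rows: Sequence[Any],
    send: Callable[[Sequence[Any]], None],
    min_rows: int = 1,
) -> int:
    """Send rows; on TimeoutError, split the batch in half and retry each half.

    Returns the number of successful send calls. Re-raises the timeout once
    a batch can no longer be split (len(rows) <= min_rows).
    """
    try:
        send(rows)
        return 1
    except TimeoutError:
        if len(rows) <= min_rows:
            raise
        mid = len(rows) // 2
        return (
            load_with_split(rows[:mid], send, min_rows)
            + load_with_split(rows[mid:], send, min_rows)
        )
```

If batches keep shrinking to a handful of rows, the bottleneck is more likely BE resources or the timeout setting than batch size.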

"Memory limit exceeded"

Cause: Batch too large for available memory.

Solutions:

  1. Reduce batch size
  2. Increase BE memory
  3. Increase load_process_max_memory_limit_percent

Next Steps