SQL Server → Firebolt: 10M CDC ops in 1m 13s

So we set up a Change Data Capture replication pipeline from SQL Server to Firebolt and wanted to benchmark how it performs under a “typical” OLTP load. This article discusses the various characteristics of the benchmark.

I do want to mention here that Firebolt benchmark is a little different, and slightly slower than what it could possibly be and the reason being that our database server is in GCP while our Firebolt bucket is in AWS. That is definitely a network hop, that will do some latency, as opposed to going from GCP to GCP. This probably explains why Firebolt benchmarks are slightly slower.

Benchmark Details

The most important decision here was to determine how to do a replication benchmark, specifically what constitutes a typical load profile of an OLTP database. We realized that a typical OLTP database has the following characteristics

It has multiple tables.
It has some inserts.
It has somewhat more updates.
It has some deletes.
It has small transactions with lot of commits, mostly single row operations and commits.

Given the above characteristics we decided on doing the following for the benchmark.

We will have 10 tables.
We will have 2 million single row inserts.
We will have 6 million single row updates.
We will have 2 million single row deletes.
We will have auto commit on, causing all the above operations to have corresponding commit.

Table Structure

We will have 10 tables called firenibble 1 to 10.

CREATE TABLE dbo.firenibble1 (
    f0 bigint NOT NULL,
    f1 bigint DEFAULT NULL,
    f2 bigint DEFAULT NULL,
    f3 int DEFAULT NULL,
    f4 float DEFAULT NULL,
    f5 float DEFAULT NULL,
    f6 float DEFAULT NULL,
    f7 float DEFAULT NULL,
    f8 varchar(128) DEFAULT NULL,
    f9 varchar(128) DEFAULT NULL,
    f10 date DEFAULT NULL,
    f11 date DEFAULT NULL,
    f12 date DEFAULT NULL,
    f13 varchar(128) DEFAULT NULL,
    f14 varchar(128) DEFAULT NULL,
    f15 varchar(128) DEFAULT NULL,
    PRIMARY KEY (f0)
  );

Insert

Insert will essentially be new primary key insert.

insert into firenibble1 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');

Update

Update will be updating of two columns on a primary key value.

update firenibble1 set f2 = 280396830734761092, f9 = 'n6ChQjN9dfxIl6nc6SNgIzSahfeXfAef' where f0 = 19998;

Delete

Delete will be essentially a primary key delete.

delete from firenibble1 where f0 = 19998;

DML Operations Sequence

Then came the question of how do we order the load so that we can rerun the load over and over again and never have to clean up the previous load. So we decided that the load will be such that we will start with 0 rows in all 10 tables, do a bunch of inserts, do a bunch of updates, do a bunch of deletes and end up with 0 rows in all tables. This kept the load and the tables rather clean. So the exact sequence will look like this.

I know we could have been more granular mixing the inserts, updates and deletes even more, and as we do more benchmarks we might try that strategy but for starters this looked pretty good. Also we wanted to see how our replication performed with large number of insert only, update only and delete only loads. 1 million inserts

1 million inserts
3 million updates
1 million deletes
1 million inserts
3 million updates
1 million deletes.

We will also makes sure that all inserts, updates and deletes happen on 10 tables in a sequence so that the operations are granularily staggered over all tables and not contiguous on one table and the next. Which means, the inserts look like this (and so does the updates and deletes).

insert into firenibble1 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble2 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble3 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble4 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble5 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble6 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble7 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble8 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble9 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');
insert into firenibble10 values (19998, 5963887289447355291, 6644703987735879175, -2094978734, 0.51946, 0.480314, 0.528386, 0.247536, 'Y4Jt79YtKat99KRV8CioT7p6bUVdtsm6', 'hC8SKt3MgvoGkOYJIljgKjdEaoY68YJe', '5354-04-29', '5741-02-18', '9548-08-16', 'K3fiVGzTGeiIB2AfChQDXOAQ7ANZBxEN', 'SETnGb7dxOVQthQtFdH6lbZ6yqVVoXPU', 'fJMS3fPnIJBnsi0CBxkFc2fZDf5gUiSJ');

Benchmark Sequence

A choice we had was to make was do one of the following

Have the wirekite replication running. Run the load and wait for the last event of the load to appear on the target. Measure the time between when the load started and the last event appeared.
Run the load and note down the binlog number and position at which the load started and wait for the load to finish. Now start the wirekite replication from the starting binlog number and position and wait for the last event of the load to appear on the target. Measure the time between when the wirekite replication started and the last event appeared on the target.

We decided to go with option 2. The primary reason being that in option 1 wirekite replication was waiting for the events to finish successfully on the database, and essentially we were not only benchmarking wirekite, but also the database, with the boundaries between the two completely blurred. So we preferred the pre-loaded database with the changes, so that we are only benchmarking wirekite and nothing else.

Machine Configuration

For SQL Server we created an n2-standard-16 GCP instance with all the standard configurations - 8 CPU’s and 64 GB RAM in the us-central region. We installed Ubuntu version 24 and SQL Server version 19.

For Firebolt we created a Medium Size Warehouse also in the us-central1 region to minimize the network distance.

=> ALTER ENGINE my_engine SET TYPE = "M" NODES = 5 MIN_CLUSTERS=1 MAX_CLUSTERS = 2 FAMILY=CO;
ok

We have been using the same configurations for all our benchmarks so that we are comparing apples and apples.

SQL Server Run Around

SQL Server had a really different change capturing scheme than all the other databases. While all the other databases use a central repository of some kinds to track changes (Oracle stores changes in archive logs, MySQL stores them in binlogs, Postgres uses WAL and so forth), and the changes are also stored serially across this data store. This means that the changes need to be traversed in order and you have really no opportunity for parallelism.

SQL Server took a very different route - they chose to have “change tables”. When you enable change tracking on a given table, a corresponding change table is created with all the columns of the original table as well as some meta columns to track the change order.

https://learn.microsoft.com/en-us/sql/relational-databases/system-tables/cdc-capture-instance-ct-transact-sql?view=sql-server-ver17

This means that one can actually have multiple threads tracking changes across multiple tables, and this is why this is one of our fastest benchmark.

Also note that SQL Server chose to use a cryptic LSN ( as opposed to humanly readable SCN in oracle and humanly readable binlog coordinate in MySQL). The fact that it was cryptic made it really difficult to see order in changes since you can’t tell which change happened before/after the other one.

The multi table approach also caused another problem - a transaction that spanned multiple tables appeared in multiple change tables. And SQL Server has no way of telling which tables are involved in a given transaction. So you have to scan all the change tables for a given LSN and stitch it together.

But one of the good things about SQL Server change tracking was that only committed transaction showed up in the change tables, unlike oracle where an uncommitted transaction can happily show up in an archive log, only to become committed in a later archive log. This was a relief.

No Airbyte Run

We could not do an airbyte run since it requires an enterprise level connector. So this is just our benchmark.

Wirekite Run

So how long did it take for wirekite to move 10 million changes — 1 minutes and 13 seconds.

cat wirekite_cdc.log | grep -E 'START|FINISH|ELAPSED'
START: Fri Jun 13 05:07:24 UTC 2025
FINISH: Fri Jun 13 05:08:37 UTC 2025
TOTAL ELAPSED TIME: 0 hrs 1 min 13 sec