# PostgreSQL JSON Benchmark
This page documents a representative `db-import` benchmark for loading denormalized PostgreSQL rows into Redis 8 as JSON documents while a RediSearch JSON index is active. The benchmark estimates cold-load behavior for appointment-search-style workloads in which each source row produces one Redis JSON document.
## Environment

| Component | Value |
|---|---|
| Source | PostgreSQL 16 in Docker |
| Target | Redis 8.6.2 in Docker, with the RediSearch JSON index below active during the load |
| RIOT-X command | `db-import` writing each row with `json.set` (full command below) |
| Documents | 1,000,000 synthetic appointment-like JSON documents |
| Average JSON size | 4,811 bytes |
| RIOT-X settings | `--fetch 10000 --batch 500 --threads 8 --pool 8` |

> This benchmark used local Docker containers on one development machine. Use it as sizing guidance, then validate on your actual PostgreSQL replica, network path, and Redis deployment.
## Source Table

The benchmark source table stores the Redis key field and the JSON document text that `db-import` writes with `json.set`.
```sql
DROP TABLE IF EXISTS public.appointment_json_benchmark;

CREATE UNLOGGED TABLE public.appointment_json_benchmark (
    appointment_id BIGINT PRIMARY KEY,
    org_id         INTEGER NOT NULL,
    doc            TEXT NOT NULL
);
```
The test data was inserted in 100,000-row batches. Each batch used `generate_series(:row_start, :row_end)`:

```sql
INSERT INTO public.appointment_json_benchmark (appointment_id, org_id, doc)
SELECT
    gs AS appointment_id,
    CASE WHEN gs <= 90000 THEN 42 ELSE ((gs % 1690) + 1)::integer END AS org_id,
    jsonb_build_object(
        'appointmentId', gs,
        'orgId', CASE WHEN gs <= 90000 THEN 42 ELSE ((gs % 1690) + 1)::integer END,
        'status', (ARRAY['scheduled','checked_in','loaded','cancelled','completed'])[(gs % 5) + 1],
        'startTime', to_char(timestamp '2026-01-01' + ((gs % 63072000) * interval '1 second'), 'YYYY-MM-DD"T"HH24:MI:SS"Z"'),
        'endTime', to_char(timestamp '2026-01-01' + ((gs % 63072000) * interval '1 second') + interval '2 hours', 'YYYY-MM-DD"T"HH24:MI:SS"Z"'),
        'referenceNumber', 'REF-' || lpad(gs::text, 12, '0'),
        'createdBy', jsonb_build_object('id', gs % 50000, 'name', 'Created User ' || (gs % 50000), 'email', 'created' || (gs % 50000) || '@example.com'),
        'updatedBy', jsonb_build_object('id', gs % 50000, 'name', 'Updated User ' || (gs % 50000), 'email', 'updated' || (gs % 50000) || '@example.com'),
        'dock', jsonb_build_object('id', gs % 25000, 'name', 'Dock ' || (gs % 25000), 'doorNumber', (gs % 400)::text),
        'warehouse', jsonb_build_object('id', gs % 5000, 'name', 'Warehouse ' || (gs % 5000), 'timezone', 'America/Chicago'),
        'loadType', jsonb_build_object('id', gs % 500, 'name', 'Load Type ' || (gs % 500)),
        'carrier', jsonb_build_object('id', gs % 80000, 'name', 'Carrier User ' || (gs % 80000), 'email', 'carrier' || (gs % 80000) || '@example.com'),
        'company', jsonb_build_object('id', gs % 15000, 'name', 'Company ' || (gs % 15000), 'scac', 'SC' || lpad((gs % 9999)::text, 4, '0')),
        'customForm', jsonb_build_object(
            'id', gs % 200000,
            'templateId', gs % 250,
            'answers', jsonb_build_array(
                jsonb_build_object('field', 'po', 'value', 'PO-' || gs),
                jsonb_build_object('field', 'temperature', 'value', (gs % 80)::text)
            )
        ),
        'customFields', jsonb_build_object(
            'priority', (gs % 10),
            'searchText', left(repeat(md5(gs::text || ':custom'), (3900 / 32) + 1), 3900)
        )
    )::text AS doc
FROM generate_series(:row_start::bigint, :row_end::bigint) AS gs;
```
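The batches can be driven from a shell loop. A minimal sketch, assuming the `INSERT` statement above is saved as `insert_batch.sql` (a hypothetical file name) and the database is reachable with the credentials used by the RIOT-X command below:

```bash
# Load 1,000,000 rows in ten 100,000-row batches by binding the psql
# variables :row_start and :row_end referenced in insert_batch.sql.
for start in $(seq 1 100000 1000000); do
  end=$((start + 99999))
  psql 'postgresql://riotx:riotx@localhost:15432/riotx_bench' \
    -v row_start="$start" -v row_end="$end" \
    -f insert_batch.sql
done
```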
The generated corpus measured as follows:

```sql
SELECT count(*) AS rows,
       round(avg(length(doc))::numeric, 2) AS avg_doc_bytes,
       min(length(doc)) AS min_doc_bytes,
       max(length(doc)) AS max_doc_bytes,
       pg_size_pretty(pg_total_relation_size('public.appointment_json_benchmark'::regclass)) AS table_size
FROM public.appointment_json_benchmark;
```
| Metric | Value |
|---|---|
| Rows | 1,000,000 |
| Average document size | 4,811 bytes |
| Minimum document size | 4,732 bytes |
| Maximum document size | 4,820 bytes |
| PostgreSQL table size | 673 MB |
## Redis 8 Index

The Redis 8 target had this JSON index in place before the import started:

```
FT.CREATE idx:appointments ON JSON PREFIX 1 appointment: SCHEMA \
  '$.orgId' AS orgId NUMERIC SORTABLE \
  '$.status' AS status TAG \
  '$.startTime' AS startTime TAG \
  '$.referenceNumber' AS referenceNumber TAG \
  '$.dock.name' AS dockName TAG \
  '$.warehouse.name' AS warehouseName TAG \
  '$.carrier.name' AS carrierName TAG \
  '$.company.name' AS companyName TAG \
  '$.customFields.searchText' AS customSearchText TEXT
```
`referenceNumber` and the `name` fields are modeled as `TAG` fields here. When exact-match or faceted lookup semantics are sufficient, this reduces index records and memory compared with indexing those fields as `TEXT`.
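For example, exact-match and faceted queries against these `TAG` fields look like the following sketch (assuming `redis-cli` against the benchmark instance on port 16380; the reference number shown is the one the generator above produces for `gs = 42`):

```bash
# Exact-match lookup on a TAG field; the hyphen must be escaped in TAG syntax.
redis-cli -p 16380 FT.SEARCH idx:appointments \
  '@referenceNumber:{REF\-000000000042}' RETURN 1 '$.status'

# Faceted count of appointment statuses for one org via the NUMERIC orgId field.
redis-cli -p 16380 FT.AGGREGATE idx:appointments '@orgId:[42 42]' \
  GROUPBY 1 @status REDUCE COUNT 0 AS status_count
```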
## RIOT-X Command

The import query selected only the key field and the JSON text:

```sql
SELECT appointment_id, doc
FROM public.appointment_json_benchmark
```
The RIOT-X command wrote each row as a JSON document:

```bash
riotx db-import \
  'SELECT appointment_id, doc FROM public.appointment_json_benchmark' \
  -u redis://localhost:16380 \
  --pool 8 \
  --jdbc-url 'jdbc:postgresql://localhost:15432/riotx_bench?ApplicationName=riotx-appointment-json-tag-benchmark' \
  --jdbc-user riotx \
  --jdbc-pass riotx \
  --fetch 10000 \
  --batch 500 \
  --threads 8 \
  json.set 'appointment:#{appointment_id}' --value doc
```
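A quick way to verify the load afterward is to spot-check the key count and one document; a minimal sketch, assuming `redis-cli` on the same port:

```bash
# Total key count should equal the 1,000,000 source rows.
redis-cli -p 16380 DBSIZE

# Inspect one document written by json.set 'appointment:#{appointment_id}'.
redis-cli -p 16380 JSON.GET appointment:1 '$.referenceNumber'
```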
## Results

| Metric | Value |
|---|---|
| Import time | 199 seconds |
| Throughput | 5,025 documents/second |
| Redis keys | 1,000,000 |
| Indexed documents | 1,000,000 |
| Redis memory used | 10.63 GB |
Selected `FT.INFO` values after the load:

| Metric | Value |
|---|---|
| `num_docs` | 1,000,000 |
| `num_records` | 9,066,141 |
| `records_per_doc_avg` | 9.07 |
| | 284 MB |
| | 80 MB |
| | 23 MB |
| | 31 MB |
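These statistics come from `FT.INFO`. When `redis-cli` output is piped, each reply element prints on its own line, so individual fields can be pulled with `grep`; a sketch:

```bash
# Print selected index statistics; in piped (raw) redis-cli output each
# field name is followed by its value on the next line.
redis-cli -p 16380 FT.INFO idx:appointments | \
  grep -A1 -E '^(num_docs|num_records|records_per_doc_avg)$'
```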
## Full-Corpus Estimate

At 5,025 documents/second, a 36,900,717-document cold load projects to:

```
36,900,717 docs / 5,025 docs/sec = 7,343 seconds ≈ 2.04 hours
```
For production planning, include headroom for network latency, Redis topology, index complexity, source SQL cost, and any rate limit needed to protect the PostgreSQL replica. A practical estimate for this workload is:
| Scenario | Estimated duration |
|---|---|
| Local benchmark extrapolation | ~2 hours |
| Typical production planning range | 2.5-4 hours |
| Conservative planning range | 4-6 hours |
To complete 36,900,717 documents within 8 hours, RIOT-X must sustain about 1,281 documents/second:

```
36,900,717 docs / (8 * 60 * 60) sec ≈ 1,281 docs/sec
```
The measured local throughput is about 3.9x that threshold.
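The same arithmetic generalizes to other corpus sizes and measured rates; a small sketch using the values from this page:

```bash
# Project cold-load duration in hours from corpus size and measured throughput.
awk -v docs=36900717 -v rate=5025 \
  'BEGIN { printf "%.2f hours\n", docs / rate / 3600 }'   # prints 2.04 hours
```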
Operational Notes
-
Run cold-load benchmarks against a PostgreSQL replica, not the primary database.
-
Use a positive JDBC fetch size. For large result sets,
--fetch 10000avoids loading the full result set upfront. -
Treat
--batch 500 --threads 8 --pool 8as a starting point, not a universal optimum. In local tests, larger thread counts reduced throughput. -
If the PostgreSQL replica needs protection, use
--rateto cap write throughput. For example, 2,000 documents/second still completes 36.9M documents in about 5.1 hours. -
Validate memory with production-like documents and the production index schema. Synthetic large text fields can materially change RediSearch memory usage.
-
Create the RediSearch index before the benchmark when you need to measure indexing cost as part of cold-load time.
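As referenced in the `--rate` note above, a capped variant of the benchmark command might look like this sketch (it reuses the flags documented earlier and assumes `--rate` accepts a documents-per-second value):

```bash
# Rate-limited import capped at 2,000 docs/sec to protect the replica;
# 36.9M documents at this cap finish in roughly 5.1 hours.
riotx db-import \
  'SELECT appointment_id, doc FROM public.appointment_json_benchmark' \
  -u redis://localhost:16380 \
  --pool 8 \
  --jdbc-url 'jdbc:postgresql://localhost:15432/riotx_bench' \
  --jdbc-user riotx \
  --jdbc-pass riotx \
  --fetch 10000 \
  --batch 500 \
  --threads 8 \
  --rate 2000 \
  json.set 'appointment:#{appointment_id}' --value doc
```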