Databricks Lakebase
This guide walks you through integrating RIOT-X with Databricks Lakebase to import data from Lakebase tables into Redis.
Databricks Lakebase is a cloud-native operational database service that provides PostgreSQL-compatible access to your data lakehouse, enabling real-time analytics and applications.
Prerequisites
For this guide you will need:
-
A Databricks account with compute in supported Lakebase regions
-
A Redis database (local or cloud-hosted)
-
RIOT-X installed on your machine
-
Basic familiarity with SQL and Redis commands
This integration uses the Postgres JDBC driver version 42.7.7 packaged with RIOT-X to access Lakebase. If you need to replace this driver with another version make sure it is at least 42.7.6 . It is required due to a bug (PR #3509): https://jdbc.postgresql.org/changelogs/2025-05-28-42/.
|
Setup
Create Lakebase Database
Follow these steps to create and manage a Lakebase database instance.
Once your database is created, note down the connection details as we will need them for RIOT-X:
host
|
Database hostname, for example |
user
|
Your Databricks username, for example |
dbname
|
Name of your PostgreSQL database, for example |
Generate an authentication token:
-
In your Lakebase instance dashboard, click Get OAuth Token
-
Copy the generated token and store it securely
-
This token will be used as the password for database connections
Create and Sync Table
Now let’s create a table and set up synchronization with Lakebase.
Create Source Table
Go to your SQL Editor in Databricks and execute this SQL query to create a sample table:
CREATE TABLE workspace.default.sales_customers
AS SELECT * FROM samples.bakehouse.sales_customers
This creates a table with customer data that we’ll use for demonstration.
Create Synced Table
Follow these steps to create a synced table:
-
Navigate to your Lakebase database instance
-
Create a synced table named
customers
from theworkspace.default.sales_customers
table -
Wait for the initial sync to complete
The synced table will automatically keep your Lakebase database updated with changes from the source table.
Import
Set Environment Variables
Store your authentication token from the Create Lakebase Database step in an environment variable:
export PGPASSWORD="<your-oauth-token>"
Replace <your-oauth-token>
with the token you generated earlier.
Optionally, you can also set other connection parameters:
export PGHOST="<your-host>"
export PGUSER="<your-username>"
export PGDB="<your-database>"
Import Data into Redis
Run the following RIOT-X command to import data from your Lakebase table into Redis.
Replace the placeholders with your actual connection details:
riotx db-import 'SELECT * FROM "default".customers' \
--jdbc-url "jdbc:postgresql://<host>:5432/<dbname>?sslmode=require&ApplicationName=redis-riotx" \
--jdbc-user "<user>" \
--jdbc-pass "$PGPASSWORD" \
hset customer:#{customerID}
Or using environment variables:
riotx db-import 'SELECT * FROM "default".customers' \
--jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
--jdbc-user "$PGUSER" \
--jdbc-pass "$PGPASSWORD" \
hset customer:#{customerID}
This command:
-
Connects to your Lakebase database using PostgreSQL protocol
-
Executes the SQL query to fetch all customer records
-
Creates Redis hash keys in the format
customer:<customerID>
-
Maps each database column to a hash field
Verify Import
Verify that the data has been successfully imported into your Redis database.
Check Import Statistics
The import command will display statistics upon completion:
[INFO] Imported 1000 records in 2.5 seconds (400 records/sec)
Examine Individual Records
Check a specific customer record:
127.0.0.1:6379> HGETALL customer:2000146
1) "customerID"
2) "2000146"
3) "first_name"
4) "Dustin"
5) "last_name"
6) "Meyers"
7) "email_address"
8) "maryschmitt@example.net"
9) "phone_number"
10) "(813)928-0087x664"
11) "address"
12) "0262 Berry Shores Apt. 038"
13) "city"
14) "East Carla"
15) "state"
16) "Nevada"
17) "country"
18) "Japan"
19) "continent"
20) "Asia"
21) "postal_zip_code"
22) "98591"
23) "gender"
24) "female"
Advanced Usage
Custom Key Patterns
You can customize the Redis key pattern to suit your needs:
# Use email as key
riotx db-import 'SELECT * FROM "default".customers' \
--jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
--jdbc-user "$PGUSER" \
--jdbc-pass "$PGPASSWORD" \
hset user:#{email_address}
# Include additional metadata
riotx db-import 'SELECT * FROM "default".customers' \
--jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
--jdbc-user "$PGUSER" \
--jdbc-pass "$PGPASSWORD" \
hset "customer:#{customerID}" --meta-field source=lakebase --meta-field imported_at="$(date)"
Filtering Data
Import only specific records using SQL WHERE clauses:
# Import only customers from specific countries
riotx db-import 'SELECT * FROM "default".customers WHERE country IN ("USA", "Canada")' \
--jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
--jdbc-user "$PGUSER" \
--jdbc-pass "$PGPASSWORD" \
hset customer:#{customerID}
Performance Tuning
For large datasets, consider these performance options:
riotx db-import 'SELECT * FROM "default".customers' \
--jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
--jdbc-user "$PGUSER" \
--jdbc-pass "$PGPASSWORD" \
--threads 8 \
--batch 1000 \
hset customer:#{customerID}
Troubleshooting
Connection Issues
If you encounter connection problems:
-
Verify your Lakebase instance is running and accessible
-
Check that your OAuth token is valid and not expired
-
Ensure the database name and hostname are correct
-
Confirm that SSL is properly configured (
sslmode=require
)