Databricks Lakebase

This guide walks you through integrating RIOT-X with Databricks Lakebase to import data from Lakebase tables into Redis.

Databricks Lakebase is a cloud-native operational database service that provides PostgreSQL-compatible access to your data lakehouse, enabling real-time analytics and applications.

Prerequisites

For this guide you will need:

  1. A Databricks account with compute in supported Lakebase regions

  2. A Redis database (local or cloud-hosted)

  3. RIOT-X installed on your machine

  4. Basic familiarity with SQL and Redis commands

RIOT-X accesses Lakebase through the PostgreSQL JDBC driver, version 42.7.7 of which is packaged with RIOT-X. If you need to replace this driver with another version, make sure it is at least 42.7.6; earlier versions are affected by a bug fixed in PR #3509 (see https://jdbc.postgresql.org/changelogs/2025-05-28-42/).

Setup

Create Lakebase Database

Create a Lakebase database instance in your Databricks workspace. Once the database is created, note down the following connection details, as we will need them for RIOT-X:

  • host: Database hostname, for example instance-16e99031-440f-42a2-8c01-7cc86e37cb2d.database.cloud.databricks.com

  • user: Your Databricks username, for example julien@redis.com

  • dbname: Name of your PostgreSQL database, for example databricks_postgres

Generate an authentication token:

  1. In your Lakebase instance dashboard, click Get OAuth Token

  2. Copy the generated token and store it securely

  3. This token will be used as the password for database connections. Note that OAuth tokens expire, so you may need to generate a fresh one later

Create and Sync Table

Now let’s create a table and set up synchronization with Lakebase.

Create Source Table

Go to your SQL Editor in Databricks and execute this SQL query to create a sample table:

CREATE TABLE workspace.default.sales_customers
    AS SELECT * FROM samples.bakehouse.sales_customers

This creates a table with customer data that we’ll use for demonstration.
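
You can verify the table was created by counting its rows. The exact count depends on the samples.bakehouse dataset shipped with your Databricks workspace:

SELECT COUNT(*) FROM workspace.default.sales_customers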

Create Synced Table

Follow these steps to create a synced table:

  1. Navigate to your Lakebase database instance

  2. Create a synced table named customers from the workspace.default.sales_customers table

  3. Wait for the initial sync to complete

The synced table will automatically keep your Lakebase database updated with changes from the source table.
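
If you have the psql client installed, you can check the synced table directly from your machine. This is a quick sanity check assuming the connection environment variables described in the Import section below are already set (psql picks up the password from PGPASSWORD):

psql "host=$PGHOST dbname=$PGDB user=$PGUSER sslmode=require" \
  -c 'SELECT COUNT(*) FROM "default".customers'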

Import

Set Environment Variables

Store your authentication token from the Create Lakebase Database step in an environment variable:

export PGPASSWORD="<your-oauth-token>"

Replace <your-oauth-token> with the token you generated earlier.

Optionally, you can also set other connection parameters:

export PGHOST="<your-host>"
export PGUSER="<your-username>"
export PGDB="<your-database>"
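
For example, using the connection details noted in the Create Lakebase Database step:

export PGHOST="instance-16e99031-440f-42a2-8c01-7cc86e37cb2d.database.cloud.databricks.com"
export PGUSER="julien@redis.com"
export PGDB="databricks_postgres"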

Import Data into Redis

Run the following RIOT-X command to import data from your Lakebase table into Redis.

Replace the placeholders with your actual connection details:

riotx db-import 'SELECT * FROM "default".customers' \
  --jdbc-url "jdbc:postgresql://<host>:5432/<dbname>?sslmode=require&ApplicationName=redis-riotx" \
  --jdbc-user "<user>" \
  --jdbc-pass "$PGPASSWORD" \
  hset customer:#{customerID}

Or using environment variables:

riotx db-import 'SELECT * FROM "default".customers' \
  --jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
  --jdbc-user "$PGUSER" \
  --jdbc-pass "$PGPASSWORD" \
  hset customer:#{customerID}

This command:

  • Connects to your Lakebase database using PostgreSQL protocol

  • Executes the SQL query to fetch all customer records

  • Creates Redis hash keys in the format customer:<customerID>

  • Maps each database column to a hash field

Verify Import

Verify that the data has been successfully imported into your Redis database.

Check Import Statistics

The import command will display statistics upon completion:

[INFO] Imported 1000 records in 2.5 seconds (400 records/sec)

Examine Individual Records

Check a specific customer record:

127.0.0.1:6379> HGETALL customer:2000146
 1) "customerID"
 2) "2000146"
 3) "first_name"
 4) "Dustin"
 5) "last_name"
 6) "Meyers"
 7) "email_address"
 8) "maryschmitt@example.net"
 9) "phone_number"
10) "(813)928-0087x664"
11) "address"
12) "0262 Berry Shores Apt. 038"
13) "city"
14) "East Carla"
15) "state"
16) "Nevada"
17) "country"
18) "Japan"
19) "continent"
20) "Asia"
21) "postal_zip_code"
22) "98591"
23) "gender"
24) "female"

List All Customer Keys

See all imported customer keys:

127.0.0.1:6379> KEYS customer:*
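
Note that KEYS is a blocking command that scans the entire keyspace, so avoid it on large or production databases. Use SCAN to iterate incrementally instead:

127.0.0.1:6379> SCAN 0 MATCH customer:* COUNT 100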

Advanced Usage

Custom Key Patterns

You can customize the Redis key pattern to suit your needs:

# Use email as key
riotx db-import 'SELECT * FROM "default".customers' \
  --jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
  --jdbc-user "$PGUSER" \
  --jdbc-pass "$PGPASSWORD" \
  hset user:#{email_address}

# Include additional metadata
riotx db-import 'SELECT * FROM "default".customers' \
  --jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
  --jdbc-user "$PGUSER" \
  --jdbc-pass "$PGPASSWORD" \
  hset "customer:#{customerID}" --meta-field source=lakebase --meta-field imported_at="$(date)"

Filtering Data

Import only specific records using SQL WHERE clauses:

# Import only customers from specific countries
riotx db-import "SELECT * FROM \"default\".customers WHERE country IN ('USA', 'Canada')" \
  --jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
  --jdbc-user "$PGUSER" \
  --jdbc-pass "$PGPASSWORD" \
  hset customer:#{customerID}
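
Since the query is an ordinary shell string, you can also build it from variables. For example, to filter on the continent column present in the sample data (the CONTINENT variable here is just for illustration):

# Import only customers from a given continent
CONTINENT="Asia"
riotx db-import "SELECT * FROM \"default\".customers WHERE continent = '$CONTINENT'" \
  --jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
  --jdbc-user "$PGUSER" \
  --jdbc-pass "$PGPASSWORD" \
  hset customer:#{customerID}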

Performance Tuning

For large datasets, consider these performance options:

riotx db-import 'SELECT * FROM "default".customers' \
  --jdbc-url "jdbc:postgresql://$PGHOST:5432/$PGDB?sslmode=require" \
  --jdbc-user "$PGUSER" \
  --jdbc-pass "$PGPASSWORD" \
  --threads 8 \
  --batch 1000 \
  hset customer:#{customerID}

Troubleshooting

Connection Issues

If you encounter connection problems, check the following (a quick connectivity test is shown after this list):

  1. Verify your Lakebase instance is running and accessible

  2. Check that your OAuth token is valid and not expired

  3. Ensure the database name and hostname are correct

  4. Confirm that SSL is properly configured (sslmode=require)
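
A quick way to test basic connectivity is pg_isready, which ships with the PostgreSQL client tools (this assumes the environment variables from the Import section are set):

pg_isready -h "$PGHOST" -p 5432 -d "$PGDB" -U "$PGUSER"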

Data Import Issues

  • Empty results: Verify your SQL query and table name

  • Permission errors: Ensure your user has read access to the table

  • Slow imports: Consider using --threads and --batch options for large datasets

Redis Connection Issues

  • Ensure Redis is running and accessible; a quick PING check is shown after this list

  • Check Redis memory usage if imports fail partway through

  • Verify Redis authentication if required
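
A quick PING against your Redis instance confirms it is reachable:

127.0.0.1:6379> PING
PONG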