Architecture¶
This page covers the key architectural decisions behind fastapi-redis-sdk and the reasoning that shaped them.
Connection lifecycle¶
All components — DI factories and CacheBackend —
share a single async Redis connection pool managed by the
FastAPI lifespan.
Every request borrows from the same pool rather than opening a new TCP socket. This bounds the number of connections to Redis, avoids per-request TLS handshakes, and keeps connection limits predictable under load.
Pool construction is cheap — redis-py pools are lazy and allocate empty
bookkeeping (~48 bytes, ~4 µs) without opening any sockets.
Tying the pool to the lifespan gives deterministic startup and shutdown. The pool is guaranteed to exist before the first request and is drained gracefully when the app stops — no leaked sockets, no race conditions.
sequenceDiagram
participant App as FastAPI
participant LS as FastAPIRedis(app).lifespan()
participant PS as _PoolState
participant R as Redis
App->>LS: startup
LS->>PS: build async pool (no TCP yet)
loop Every request
App->>PS: borrow connection
PS->>R: connect on first use, then GET / SET / DEL
R-->>PS: response
PS-->>App: return to pool
end
App->>LS: shutdown
LS->>PS: close pool
PS->>R: disconnect all sockets
Lifespan wrapping¶
FastAPIRedis(app).lifespan() wraps the app's existing lifespan rather than
replacing it. FastAPI only accepts
one lifespan, so
if the library owned it outright, the user would have to manually compose
it with every other library's lifespan. Wrapping avoids this — multiple
builder calls just nest around whatever is already there, starting up in
registration order and tearing down in reverse.
This relies on app.router.lifespan_context, which is not part of
Starlette's public API but has been stable since 0.20+ and is used
internally by FastAPI's own
router lifespan merging.
Nesting order is determined by call order rather than being explicitly
visible, which can make debugging startup hangs harder.
For scenarios where explicit ordering matters, skip .lifespan() and
compose manually using redis_lifespan:
from contextlib import asynccontextmanager
from redis_fastapi import FastAPIRedis, redis_lifespan
@asynccontextmanager
async def my_lifespan(app):
async with redis_lifespan(app):
async with db_lifespan(app):
yield
app = FastAPI(lifespan=my_lifespan)
FastAPIRedis(app).caching() # no .lifespan() — user owns the lifespan
The builder methods are independent — .caching() does not require
.lifespan() (but it is recommended).
Async-first design and sync endpoint support¶
fastapi-redis-sdk is async-only at the transport layer — the sole Redis
connection pool is an redis.asyncio pool. There is no sync redis.Redis
pool. This section explains why that works for both async def and plain
def endpoints.
For background on how FastAPI handles async def vs def, see
Concurrency and async / await
(especially the Very Technical Details
section).
How FastAPI dispatches endpoints and dependencies¶
| Declaration | Where it runs | Blocking I/O safe? |
|---|---|---|
async def endpoint(…) |
Main event loop | No — would block all requests |
def endpoint(…) |
Worker threadpool (anyio.to_thread) |
Yes |
async def dependency(…) |
Main event loop | No |
def dependency(…) |
Worker threadpool | Yes |
All of fastapi-redis-sdk's DI factories (cache(), cache_evict(),
cache_put(), get_cache_backend()) are async def. They run on the
event loop and use await for every Redis call — no thread is blocked.
Sync endpoints that declare these as Depends(…) still work correctly:
FastAPI awaits the async dependency on the event loop, then hands the
resolved value to the sync endpoint which runs in the threadpool.
CacheBackendDep — async endpoints¶
CacheBackendDep injects a CacheBackend whose methods (get, set,
delete, has, delete_group) are all coroutines. Use it from
async def endpoints:
@app.get("/items/{item_id}")
async def get_item(item_id: int, cb: CacheBackendDep) -> dict:
cached = await cb.get(f"item:{item_id}")
if cached:
return cached
item = await fetch_item(item_id)
await cb.set(f"item:{item_id}", item, ttl=300)
return item
SyncCacheBackendDep — sync endpoints¶
Sync (def) endpoints cannot await. SyncCacheBackendDep provides a
blocking wrapper that bridges each call back to the event loop via
anyio.from_thread.run:
@app.get("/items/{item_id}")
def get_item(item_id: int, cb: SyncCacheBackendDep) -> dict:
cached = cb.get(f"item:{item_id}") # blocking — runs on event loop
if cached:
return cached
item = fetch_item(item_id)
cb.set(f"item:{item_id}", item, ttl=300)
return item
Under the hood, each SyncCacheBackend method does:
worker thread event loop
───────────── ──────────
cb.set("k", v, ttl=60)
└─ anyio.from_thread.run(lambda)
└─ schedules ──────────────► await backend.set("k", v, ttl=60)
blocks thread ◄────────── result / exception
This only works from threads managed by FastAPI's
AnyIO threadpool (sync endpoints and sync
dependencies). Calling SyncCacheBackend from the main thread or an
arbitrary thread raises RuntimeError.
Why no sync Redis pool?¶
The library's purpose is high-level DI features (caching, rate limiting, sessions) — not raw Redis access. All of those features use the async client internally. Maintaining a parallel sync pool would mean:
- Doubling pool-management code in the lifespan.
- Keeping two client wrappers (
Redis+AsyncRedis) in sync. - Opening a second set of TCP connections that most apps never use.
Users who need a raw sync redis.Redis client can create one outside the
library in two lines.
The DI caching factories (cache(), cache_evict(), cache_put())¶
These work with both async def and def endpoints without any extra
setup. They are async def generators resolved by FastAPI's DI system
before the endpoint runs. The endpoint function itself never touches
Redis — caching is handled entirely in the dependency and middleware layers.
See Caching § Sync endpoint support for details.
Why dependency injection, not decorators¶
Many FastAPI caching libraries — most notably
fastapi-cache2 — use a
@cache decorator that wraps the endpoint function. fastapi-redis-sdk
deliberately avoids this pattern and uses FastAPI's native dependency
injection (Depends()) instead. The decorator approach has five concrete
problems in FastAPI:
-
Signature rewriting is fragile. A caching decorator must inject hidden
Request/Responseparameters via__signature__manipulation. FastAPI relies on function introspection for validation, OpenAPI generation, and dependency resolution; rewriting the signature operates outside that system and is a known source of breakage (fastapi#1743, fastapi#5065, article). -
Conflicts with other DI-based libraries. Decorator-based route wrapping prevents other
Depends()-based libraries (pagination, security, DB sessions) from initializing correctly (fastapi-cache#557, fastapi-cache#89). -
dependency_overridescannot reach decorator internals. FastAPI's primary test-mocking mechanism only works withDepends()callables. Logic inside a decorator bypasses the DI container entirely (fastapi#4330). -
Decorator order is silently significant.
@app.getmust come before@cache, which must come before@cache_evict. Reversing the order silently breaks caching. DI dependencies are resolved as a graph — no ordering constraints. -
Ecosystem mismatch. FastAPI consistently models cross-cutting concerns as dependencies: authentication (
Depends(get_current_user)), database sessions (Depends(get_db)). The official documentation specifically shows this pattern for side-effect-only concerns — which is exactly what caching is.
Our DI-based design (cache(), cache_evict(), cache_put()) resolves all
five issues. On cache hit, the dependency raises a CacheHitException
(caught by a registered exception handler) that returns the cached response
directly — the endpoint never executes. On cache miss, a lightweight capture
middleware stores the response in Redis after the endpoint returns. See
Caching § Caching factories for usage.
Why not a full ASGI middleware for cache reads?¶
An earlier design used an ASGI middleware to intercept requests before routing and return cached responses without entering the FastAPI pipeline at all. In theory this is the fastest possible path — no DI resolution, no routing.
In practice, benchmarks showed no measurable improvement over the pure-DI approach. The ~0.5–2 ms saved by skipping FastAPI's pipeline is dwarfed by the Redis round-trip and client network latency (see Benchmarks). The middleware also introduced problems the DI path does not have:
- Route registry fragility. The middleware runs before DI, so per-route config (TTL, eviction group, key builder) must be pre-computed into a lookup table at startup. This breaks with lazy route registration, dynamic routes, and mounted sub-applications.
- Two mechanisms to coordinate. Users must register both the middleware and the per-route dependency; forgetting the middleware silently disables caching.
- Cross-layer key consistency. Eviction and write-through dependencies must produce the same cache keys as the middleware — an error-prone coupling.
- Harder to test.
dependency_overridescovers the DI config but not the middleware; tests require ASGI-level fixtures.
The current design keeps a single lightweight middleware
(CacheResponseCaptureMiddleware) solely for miss-path writes — it
buffers the response body and stores it in Redis after the endpoint returns.
Cache reads and short-circuiting happen entirely in the DI layer via
CacheHitException.
Storage model — strings vs hashes¶
Every cached entry is stored as a standalone Redis
string key,
with eviction groups encoded as key prefixes. Eviction-group deletion uses
SCAN + DEL to find and
remove matching keys.
Redis hashes
would be a natural fit — one hash per eviction group, one field per entry.
Since Redis 7.4 added
per-field expiration
and Redis 8.0 added HSETEX,
the main historical blocker is gone. Group deletion becomes a single
DEL (~59× faster than SCAN + DEL at 1000 entries), and memory drops
~5% from eliminating per-key overhead. Reads and writes are effectively
identical in latency.
Strings remain the default for three reasons:
HSETEX requires
Redis ≥ 8.0 and many deployments still run 7.x; key-level features
(keyspace notifications, MEMORY USAGE per entry) don't work on hash
fields;
Telemetry¶
fastapi-redis-sdk supports OpenTelemetry for observability. Instrumentation is split into three independent layers:
block-beta
columns 1
block:L1:1
columns 2
l1["Layer 1 — HTTP request spans"]
l1src["opentelemetry-instrumentation-fastapi"]
end
block:L2:1
columns 2
l2["Layer 2 — Cache operation spans + metrics"]
l2src["fastapi-redis-sdk OTel"]
end
block:L3:1
columns 2
l3["Layer 3 — Redis command spans + connection metrics"]
l3src["redis-py native OTel"]
end
style L1 fill:#636466,color:#ffffff,stroke:#636466
style L2 fill:#DC382D,color:#ffffff,stroke:#DC382D
style L3 fill:#A41E11,color:#ffffff,stroke:#A41E11
style l1 fill:#636466,color:#ffffff,stroke:none
style l1src fill:#636466,color:#ffffff,stroke:none
style l2 fill:#DC382D,color:#ffffff,stroke:none
style l2src fill:#DC382D,color:#ffffff,stroke:none
style l3 fill:#A41E11,color:#ffffff,stroke:none
style l3src fill:#A41E11,color:#ffffff,stroke:none
Each layer can be enabled independently. When all three are active, a single request produces a nested trace:
Layer 1 — HTTP requests¶
Handled by the standard
FastAPI OTel instrumentation.
Install opentelemetry-instrumentation-fastapi and call
FastAPIInstrumentor.instrument_app(app).
Layer 2 — Cache operations¶
This is what fastapi-redis-sdk adds. Enable with the builder or an environment variable:
Requires pip install fastapi-redis-sdk[otel].
Spans — one per cache operation, named following the
OTel span naming guidelines
({action} {target} pattern, low cardinality, human-readable).
Note: there is no official OTel semantic convention for caching yet — span
and metric naming for cache operations is
under discussion.
Our names may change to align once a convention is adopted.
| Span | Source |
|---|---|
cache.get |
cache() dependency (attributes: cache.hit, cache.key, cache.eviction_group, cache.ttl) |
cache.set |
Capture middleware after a cache miss |
cache.evict |
cache_evict() dependency |
cache.put |
cache_put() dependency |
cache.backend.* |
CacheBackend methods (get, set, delete, delete_group, has) |
Metrics:
| Metric | Type | Labels | Description |
|---|---|---|---|
redis_fastapi.cache.requests |
Counter | result (hit / miss / bypass), eviction_group |
Total cache lookups |
redis_fastapi.cache.writes |
Counter | type (miss_fill / write_through), eviction_group |
Cache writes |
redis_fastapi.cache.evictions |
Counter | type (key / group), eviction_group |
Cache invalidations |
redis_fastapi.cache.latency |
Histogram | operation (get / set / evict), eviction_group |
Operation duration in seconds |
Layer 3 — Redis commands¶
Instruments every GET, SET, DEL, etc. at the driver level. Enable via:
Or use opentelemetry-instrumentation-redis externally — but not both at
once, to avoid duplicate spans.
For full configuration details (all env vars, non-intrusiveness guarantee), see the Configuration guide — OpenTelemetry.