Vectorizers
Vectorizers convert text into numerical vector representations (embeddings) that capture semantic meaning. RedisVL for Java supports multiple vectorization options to fit different use cases and deployment requirements.
What are Vectorizers?
Vectorizers (also called embedding models) transform text into dense vector representations that machine learning models can process. Similar texts produce similar vectors, enabling semantic search and similarity comparisons.
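The "similar texts produce similar vectors" property is typically measured with cosine similarity. As a minimal, self-contained sketch (plain Java, no RedisVL APIs, toy 3-dimensional vectors standing in for real embeddings with hundreds of dimensions):

```java
public class CosineSimilarityDemo {
    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction
    static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "embeddings": related concepts point in similar directions
        float[] cat    = {0.9f, 0.1f, 0.0f};
        float[] kitten = {0.85f, 0.15f, 0.05f};
        float[] car    = {0.0f, 0.2f, 0.95f};
        System.out.printf("cat vs kitten: %.3f%n", cosineSimilarity(cat, kitten));
        System.out.printf("cat vs car:    %.3f%n", cosineSimilarity(cat, car));
    }
}
```

A vector index performs this comparison (or an equivalent distance) between the query vector and every stored vector to rank results.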
Available Vectorizers
RedisVL for Java supports two main vectorization approaches:
- LangChain4J Integration - Use cloud-based or local models via LangChain4J
- Local ONNX Models - Run Sentence Transformers models locally with ONNX Runtime
LangChain4J Vectorizer
LangChain4J provides access to many embedding providers including OpenAI, Azure OpenAI, Cohere, and local models.
Setup
Add LangChain4J dependencies to your project:
- Maven

  <dependency>
      <groupId>dev.langchain4j</groupId>
      <artifactId>langchain4j</artifactId>
      <version>0.35.0</version>
  </dependency>
  <!-- Add specific provider (e.g., OpenAI) -->
  <dependency>
      <groupId>dev.langchain4j</groupId>
      <artifactId>langchain4j-open-ai</artifactId>
      <version>0.35.0</version>
  </dependency>

- Gradle

  implementation 'dev.langchain4j:langchain4j:0.35.0'
  implementation 'dev.langchain4j:langchain4j-open-ai:0.35.0'
Basic Usage
import com.redis.vl.utils.vectorize.LangChain4JVectorizer;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
// Create an embedding model from LangChain4J
EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
.apiKey(System.getenv("OPENAI_API_KEY"))
.modelName("text-embedding-3-small")
.build();
// Wrap it in RedisVL vectorizer
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(embeddingModel);
// Generate embeddings
String text = "Redis is an in-memory database";
float[] embedding = vectorizer.embed(text);
System.out.println("Embedding dimensions: " + embedding.length);
Batch Embedding
Process multiple texts efficiently:
List<String> texts = List.of(
"First document about Redis",
"Second document about vector search",
"Third document about databases"
);
List<float[]> embeddings = vectorizer.embedBatch(texts);
for (int i = 0; i < texts.size(); i++) {
System.out.println("Text: " + texts.get(i));
System.out.println("Embedding length: " + embeddings.get(i).length);
}
Supported Providers
LangChain4J supports many providers. RedisVL4J provides a unified interface to all of them through LangChain4JVectorizer.
OpenAI
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
EmbeddingModel model = OpenAiEmbeddingModel.builder()
.apiKey(System.getenv("OPENAI_API_KEY"))
.modelName("text-embedding-3-small") // or text-embedding-3-large, text-embedding-ada-002
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"text-embedding-3-small", // model name
model, // embedding model
1536 // dimensions
);
Azure OpenAI
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-azure-open-ai</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.azure.AzureOpenAiEmbeddingModel;
EmbeddingModel model = AzureOpenAiEmbeddingModel.builder()
.apiKey(System.getenv("AZURE_OPENAI_API_KEY"))
.endpoint(System.getenv("AZURE_OPENAI_ENDPOINT"))
.deploymentName("text-embedding-ada-002")
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"text-embedding-ada-002",
model,
1536
);
Cohere
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-cohere</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.cohere.CohereEmbeddingModel;
EmbeddingModel model = CohereEmbeddingModel.builder()
.apiKey(System.getenv("COHERE_API_KEY"))
.modelName("embed-english-v3.0") // or embed-multilingual-v3.0
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"embed-english-v3.0",
model,
1024
);
HuggingFace (Remote API)
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-hugging-face</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.huggingface.HuggingFaceEmbeddingModel;
EmbeddingModel model = HuggingFaceEmbeddingModel.builder()
.accessToken(System.getenv("HUGGINGFACE_API_KEY"))
.modelId("sentence-transformers/all-MiniLM-L6-v2")
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"sentence-transformers/all-MiniLM-L6-v2",
model,
384
);
Mistral AI
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-mistral-ai</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.mistralai.MistralAiEmbeddingModel;
EmbeddingModel model = MistralAiEmbeddingModel.builder()
.apiKey(System.getenv("MISTRAL_API_KEY"))
.modelName("mistral-embed")
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"mistral-embed",
model,
1024
);
Google Vertex AI
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-vertex-ai</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.vertexai.VertexAiEmbeddingModel;
EmbeddingModel model = VertexAiEmbeddingModel.builder()
.project(System.getenv("GCP_PROJECT_ID"))
.location("us-central1") // or your preferred location
.modelName("textembedding-gecko@003")
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"textembedding-gecko@003",
model,
768
);
Voyage AI
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-voyage-ai</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.voyageai.VoyageAiEmbeddingModel;
EmbeddingModel model = VoyageAiEmbeddingModel.builder()
.apiKey(System.getenv("VOYAGE_API_KEY"))
.modelName("voyage-large-2") // or voyage-2, voyage-code-2
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"voyage-large-2",
model,
1536
);
AWS Bedrock
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-bedrock</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.bedrock.BedrockEmbeddingModel;
// AWS credentials are typically configured via AWS SDK default credential chain
EmbeddingModel model = BedrockEmbeddingModel.builder()
.region("us-east-1") // or your preferred region
.model("amazon.titan-embed-text-v2:0")
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"amazon.titan-embed-text-v2:0",
model,
1024
);
Ollama (Local)
<!-- Maven dependency -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-ollama</artifactId>
<version>0.35.0</version>
</dependency>
import dev.langchain4j.model.ollama.OllamaEmbeddingModel;
EmbeddingModel model = OllamaEmbeddingModel.builder()
.baseUrl("http://localhost:11434")
.modelName("nomic-embed-text") // or mxbai-embed-large, all-minilm
.build();
LangChain4JVectorizer vectorizer = new LangChain4JVectorizer(
"nomic-embed-text",
model,
768 // dimensions vary by model
);
Caching Embeddings
Cache embeddings to improve performance and reduce API costs:
import com.redis.vl.extensions.cache.EmbeddingsCache;
import redis.clients.jedis.UnifiedJedis;
// Create cache
UnifiedJedis jedis = new UnifiedJedis("redis://localhost:6379");
EmbeddingsCache cache = new EmbeddingsCache("my-embeddings-cache", jedis);
// Set cache on vectorizer
vectorizer.setCache(cache);
// First call - generates embedding and stores in cache
float[] embedding1 = vectorizer.embed("Redis vector search");
// Second call - retrieves from cache (much faster!)
float[] embedding2 = vectorizer.embed("Redis vector search");
// embeddings are identical
assert Arrays.equals(embedding1, embedding2);
The cache works automatically with batch operations too:
List<String> texts = List.of(
"Redis is fast",
"Vector search is powerful",
"Redis is fast" // Duplicate - will be cached
);
List<float[]> embeddings = vectorizer.embedBatch(texts);
// First and third embeddings are identical (from cache)
assert Arrays.equals(embeddings.get(0), embeddings.get(2));
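Conceptually, an embeddings cache is a map from (model name, text) to a stored vector; repeated inputs skip the model call entirely. A minimal in-memory sketch of that idea (plain Java only — the real EmbeddingsCache persists entries to Redis, and this keying scheme is illustrative, not the library's actual format):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class InMemoryEmbeddingCacheDemo {
    private final Map<String, float[]> cache = new HashMap<>();
    private final String modelName;
    private int misses = 0; // counts actual embedding computations

    public InMemoryEmbeddingCacheDemo(String modelName) {
        this.modelName = modelName;
    }

    // Return the cached vector, or compute and store it on a miss.
    // The key includes the model name: the same text embedded by two
    // different models must not share a cache entry.
    public float[] embed(String text, Function<String, float[]> model) {
        String key = modelName + ":" + text;
        return cache.computeIfAbsent(key, k -> {
            misses++;
            return model.apply(text);
        });
    }

    public int getMisses() { return misses; }

    public static void main(String[] args) {
        InMemoryEmbeddingCacheDemo cache = new InMemoryEmbeddingCacheDemo("toy-model");
        // Stand-in "model": a deterministic toy embedding
        Function<String, float[]> model = t -> new float[]{t.length(), t.hashCode() % 7};
        cache.embed("Redis is fast", model);
        cache.embed("Vector search is powerful", model);
        cache.embed("Redis is fast", model); // duplicate: served from cache
        System.out.println("Model invocations: " + cache.getMisses()); // prints 2, not 3
    }
}
```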
Custom Vectorizers
Create your own vectorizer by extending BaseVectorizer:
import com.redis.vl.utils.vectorize.BaseVectorizer;
import java.util.ArrayList;
import java.util.List;
public class MyCustomVectorizer extends BaseVectorizer {
public MyCustomVectorizer() {
super("my-custom-model", 384, "float32");
}
@Override
protected float[] generateEmbedding(String text) {
// Implement your custom embedding logic
// This could call your own API, use a custom model, etc.
// Example: Simple hash-based embedding (not recommended for production!)
float[] embedding = new float[384];
int hash = text.hashCode();
for (int i = 0; i < 384; i++) {
embedding[i] = (float) Math.sin(hash + i);
}
return embedding;
}
@Override
protected List<float[]> generateEmbeddingsBatch(List<String> texts, int batchSize) {
// Implement batch processing
// You can optimize this for your specific use case
List<float[]> results = new ArrayList<>();
for (String text : texts) {
results.add(generateEmbedding(text));
}
return results;
}
}
// Usage
MyCustomVectorizer vectorizer = new MyCustomVectorizer();
float[] embedding = vectorizer.embed("Hello world");
Custom vectorizers automatically support caching and preprocessing:
// Add cache
vectorizer.setCache(cache);
// Use preprocessing
float[] embedding = vectorizer.embed(
"Hello World",
text -> text.toLowerCase(), // Preprocess: convert to lowercase
false, // asBuffer (not used in Java)
false // skipCache
);
Local ONNX Vectorizer
Run Sentence Transformers models locally using ONNX Runtime. No API calls, no internet required, complete privacy.
Setup
The ONNX Runtime dependency is already included in RedisVL for Java. Download a model:
import com.redis.vl.utils.vectorize.HuggingFaceModelDownloader;
// Download a model from Hugging Face
String modelName = "sentence-transformers/all-MiniLM-L6-v2";
String modelPath = HuggingFaceModelDownloader.downloadModel(
modelName,
"~/.cache/redisvl4j/models" // local cache directory
);
System.out.println("Model downloaded to: " + modelPath);
Basic Usage
import com.redis.vl.utils.vectorize.SentenceTransformersVectorizer;
// Create vectorizer with downloaded model
SentenceTransformersVectorizer vectorizer =
new SentenceTransformersVectorizer(modelPath);
// Generate embeddings
String text = "Local embedding generation";
float[] embedding = vectorizer.embed(text);
System.out.println("Generated " + embedding.length + "-dim embedding locally");
Popular ONNX Models
Model | Dimensions | Best For
---|---|---
all-MiniLM-L6-v2 | 384 | Fast, general purpose, good balance
all-mpnet-base-v2 | 768 | High quality, general purpose
all-MiniLM-L12-v2 | 384 | Better quality than L6, still fast
multi-qa-MiniLM-L6-cos-v1 | 384 | Question-answering (Q&A) systems
msmarco-distilbert-base-v4 | 768 | Search and ranking tasks
Complete Example with ONNX
import com.redis.vl.utils.vectorize.SentenceTransformersVectorizer;
import com.redis.vl.utils.vectorize.HuggingFaceModelDownloader;
import com.redis.vl.index.SearchIndex;
import com.redis.vl.query.VectorQuery;
import com.redis.vl.schema.IndexSchema;
import com.fasterxml.jackson.databind.ObjectMapper;
import redis.clients.jedis.UnifiedJedis;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
public class LocalVectorizerExample {
public static void main(String[] args) throws Exception {
// Connect to Redis
UnifiedJedis jedis = new UnifiedJedis("redis://localhost:6379");
// Download model (only once)
String modelName = "sentence-transformers/all-MiniLM-L6-v2";
String modelPath = HuggingFaceModelDownloader.downloadModel(modelName);
// Create vectorizer
SentenceTransformersVectorizer vectorizer =
new SentenceTransformersVectorizer(modelPath);
// Prepare documents
List<String> documents = List.of(
"Redis is an in-memory database",
"Vector search enables semantic similarity",
"Machine learning models process embeddings"
);
// Generate embeddings
List<float[]> embeddings = vectorizer.embedBatch(documents);
// Create search index (JSON storage uses $.field notation)
Map<String, Object> schema = Map.of(
"index", Map.of(
"name", "documents",
"prefix", "doc",
"storage_type", "json"
),
"fields", List.of(
Map.of("name", "$.content", "type", "text"),
Map.of(
"name", "$.embedding",
"type", "vector",
"attrs", Map.of(
"dims", 384, // all-MiniLM-L6-v2 dimensions
"distance_metric", "cosine",
"algorithm", "flat",
"datatype", "float32"
)
)
)
);
// Create index from schema
ObjectMapper mapper = new ObjectMapper();
String schemaJson = mapper.writeValueAsString(schema);
SearchIndex index = new SearchIndex(
IndexSchema.fromJson(schemaJson),
jedis
);
index.create(true);
// Load documents with embeddings
List<Map<String, Object>> data = new ArrayList<>();
for (int i = 0; i < documents.size(); i++) {
data.add(Map.of(
"content", documents.get(i),
"embedding", embeddings.get(i)
));
}
index.load(data);
// Search with a query
String query = "database systems";
float[] queryEmbedding = vectorizer.embed(query);
VectorQuery vq = VectorQuery.builder()
.vector(queryEmbedding)
.field("embedding")
.numResults(3)
.returnFields("$.content")
.build();
List<Map<String, Object>> results = index.query(vq);
System.out.println("Results for query: " + query);
results.forEach(result ->
System.out.println("- " + result.get("$.content"))
);
}
}
Builder Pattern
Use the builder for more control:
import com.redis.vl.utils.vectorize.VectorizerBuilder;
// LangChain4J with builder
LangChain4JVectorizer vectorizer = VectorizerBuilder
.langchain4j()
.embeddingModel(embeddingModel)
.build();
// ONNX with builder
SentenceTransformersVectorizer onnxVectorizer = VectorizerBuilder
.sentenceTransformers()
.modelPath(modelPath)
.build();
Provider Comparison
Provider | Dimensions | Cost | Best For
---|---|---|---
OpenAI | 1536 | Pay per token | High quality, production apps, latest models
Azure OpenAI | 1536 | Pay per token | Enterprise apps, Azure ecosystem, compliance
Cohere | 1024 | Pay per token | Multilingual support, semantic search
HuggingFace | 384-768 | Pay per API call or free (self-hosted) | Wide model selection, experimentation
Mistral AI | 1024 | Pay per token | European provider, privacy-focused
Vertex AI | 768 | Pay per token | Google Cloud ecosystem, scalability
Voyage AI | 1536 | Pay per token | Domain-specific models, high accuracy
AWS Bedrock | 1024 | Pay per token | AWS ecosystem, managed service
Ollama | Varies | Free (self-hosted) | Local development, privacy, no internet
ONNX (Local) | 384-768 | Free | Complete privacy, offline, high volume
Choosing a Vectorizer
Aspect | LangChain4J | Local ONNX
---|---|---
Cost | Pay per API call | Free after initial download
Speed | Network latency + inference | Fast, local inference
Quality | Latest models (e.g., OpenAI embeddings) | Good quality, proven models
Privacy | Data sent to provider | Complete privacy, offline capable
Deployment | Simple, no model management | Requires model files, more setup
Best For | Production apps with cloud access | Privacy-sensitive, offline, high-volume
Integration with Search Index
Combine vectorizers with search indices:
public class VectorizedSearchIndex {
private final SearchIndex index;
private final BaseVectorizer vectorizer;
public VectorizedSearchIndex(
SearchIndex index,
BaseVectorizer vectorizer
) {
this.index = index;
this.vectorizer = vectorizer;
}
public void addDocument(String content, Map<String, Object> metadata) {
// Generate embedding
float[] embedding = vectorizer.embed(content);
// Create document
Map<String, Object> doc = new HashMap<>(metadata);
doc.put("content", content);
doc.put("embedding", embedding);
// Store in index
index.load(List.of(doc));
}
public List<Map<String, Object>> search(String query, int numResults) {
// Vectorize query
float[] queryVector = vectorizer.embed(query);
// Search
VectorQuery vq = VectorQuery.builder()
.vector(queryVector)
.field("embedding")
.numResults(numResults)
.build();
return index.query(vq);
}
}
// Usage
VectorizedSearchIndex vsi = new VectorizedSearchIndex(index, vectorizer);
vsi.addDocument(
"Redis enables real-time vector search",
Map.of("category", "database", "author", "Redis")
);
List<Map<String, Object>> results = vsi.search(
"fast database for vectors",
10
);
Security Best Practices
API Key Management
🔒 NEVER hardcode API keys in your code! Always use secure configuration methods:
✅ Recommended: Environment Variables
// GOOD - Uses environment variable
EmbeddingModel model = OpenAiEmbeddingModel.builder()
.apiKey(System.getenv("OPENAI_API_KEY"))
.build();
Set environment variables in your shell:
# Add to ~/.zshrc or ~/.bashrc
export OPENAI_API_KEY="sk-..."
export HUGGINGFACE_API_KEY="hf_..."
export COHERE_API_KEY="..."
# Reload configuration
source ~/.zshrc
✅ Alternative: Properties File (Not in Git)
Create config.properties (add it to .gitignore). Note that java.util.Properties does not expand ${VAR} placeholders, so store the literal key:
openai.api.key=sk-your-key-here
cohere.api.key=your-key-here
// Load from properties file
Properties props = new Properties();
props.load(new FileInputStream("config.properties"));
EmbeddingModel model = OpenAiEmbeddingModel.builder()
.apiKey(props.getProperty("openai.api.key"))
.build();
Important: Always add secret files to .gitignore:
*.env
.env.*
config.properties
secrets.properties
api-keys.txt
✅ Production: Secret Management Services
For production deployments, use proper secret management:
- AWS Secrets Manager
- Azure Key Vault
- Google Cloud Secret Manager
- HashiCorp Vault
- Kubernetes Secrets
// Example with AWS Secrets Manager (AWS SDK for Java v1)
AWSSecretsManager awsSecretsManager = AWSSecretsManagerClientBuilder.defaultClient();
String apiKey = awsSecretsManager.getSecretValue(
    new GetSecretValueRequest().withSecretId("openai-api-key")
).getSecretString();
EmbeddingModel model = OpenAiEmbeddingModel.builder()
.apiKey(apiKey)
.build();
Best Practices
- Match Dimensions - Ensure your index vector field dimensions match your model:

  // For all-MiniLM-L6-v2 (384 dimensions)
  Map.of("dims", 384, ...)

  // For text-embedding-3-small (1536 dimensions)
  Map.of("dims", 1536, ...)

- Cache Models Locally - Download ONNX models once and reuse them:

  // Check if the model exists before downloading
  Path modelPath = Paths.get(cacheDir, modelName);
  if (!Files.exists(modelPath)) {
      HuggingFaceModelDownloader.downloadModel(modelName, cacheDir);
  }

- Batch Processing - Process multiple texts together for better performance:

  // Less efficient
  for (String text : texts) {
      float[] emb = vectorizer.embed(text);
  }

  // More efficient
  List<float[]> embs = vectorizer.embedBatch(texts);

- Handle Errors Gracefully:

  try {
      float[] embedding = vectorizer.embed(text);
  } catch (Exception e) {
      logger.error("Failed to generate embedding", e);
      // Fallback strategy
  }

- Monitor Token Limits - Some models have maximum token limits:

  // Truncate long texts if necessary
  String text = longText;
  if (text.split("\\s+").length > 512) {
      text = truncate(text, 512); // Implement truncation
  }
  float[] embedding = vectorizer.embed(text);
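The truncate helper in the last item is left unimplemented above. A minimal whitespace-token version might look like this — a naive sketch, since real model limits are counted in tokenizer tokens, not words, so treat the word count as a rough approximation:

```java
public class TruncateDemo {
    // Keep at most maxWords whitespace-separated tokens
    static String truncate(String text, int maxWords) {
        String[] words = text.trim().split("\\s+");
        if (words.length <= maxWords) {
            return text; // already within the limit
        }
        return String.join(" ", java.util.Arrays.copyOfRange(words, 0, maxWords));
    }

    public static void main(String[] args) {
        String text = "one two three four five";
        System.out.println(truncate(text, 3));  // "one two three"
        System.out.println(truncate(text, 10)); // unchanged
    }
}
```

For production use, truncating with the model's own tokenizer gives exact limits; a word-based cutoff only approximates them.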
Next Steps
- LLM Cache - Cache embeddings for performance
- Hybrid Queries - Combine vectors with filters
- Getting Started - Build your first application