Class ExtractiveSelector
java.lang.Object
com.redis.vl.extensions.summarization.ExtractiveSelector
BERT-based extractive summarization using sentence clustering.
This class selects the most representative sentences from a document by embedding sentences with BERT, clustering them with k-means, and selecting the sentence closest to each cluster centroid.
Key Feature: Preserves the original text exactly. This is critical for SubEM (Substring Exact Match) evaluation, where a paraphrased sentence no longer contains the expected answer string and therefore fails to match.
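To make the SubEM point concrete, here is a minimal sketch of a substring exact match check. The class `SubEmSketch` and the method `subEm` are hypothetical names used only for illustration and are not part of this library. The normalization (case folding and trimming) is an assumption; evaluation suites differ in how they normalize text.

```java
import java.util.Locale;

final class SubEmSketch {

    // Hypothetical helper: true if the gold answer occurs verbatim inside the
    // candidate text, ignoring case and surrounding whitespace on the answer.
    static boolean subEm(String candidate, String gold) {
        String c = candidate.toLowerCase(Locale.ROOT);
        String g = gold.trim().toLowerCase(Locale.ROOT);
        return !g.isEmpty() && c.contains(g);
    }

    public static void main(String[] args) {
        String gold = "14 March 2019";
        // An extracted sentence keeps the answer string intact.
        String extracted = "The contract was signed on 14 March 2019 in Oslo.";
        // A paraphrase conveys similar meaning but loses the exact substring.
        String paraphrased = "The parties signed the deal in mid-March 2019.";
        System.out.println(subEm(extracted, gold));   // true
        System.out.println(subEm(paraphrased, gold)); // false
    }
}
```

Because an extractive selector returns input sentences unchanged, any answer string present in a selected sentence remains matchable under a check of this kind.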
Example Usage:
SentenceTransformersVectorizer vectorizer = SentenceTransformersVectorizer.builder()
    .modelName("all-MiniLM-L6-v2")
    .build();
ExtractiveSelector selector = new ExtractiveSelector(vectorizer);
SentenceSplitter splitter = new SentenceSplitter();
String document = "Long document text...";
List<String> sentences = splitter.split(document);
List<String> keySentences = selector.selectKeySentences(sentences, 10);
// keySentences contains the 10 most representative sentences
// in their original order, with exact original text preserved

Nested Class Summary
Nested Classes
- ExtractiveSelector.Builder

Constructor Summary
Constructors
- ExtractiveSelector(SentenceTransformersVectorizer embedder): Create an extractive selector with default settings.
- ExtractiveSelector(SentenceTransformersVectorizer embedder, int defaultNumSentences): Create an extractive selector with a custom default number of sentences.
- ExtractiveSelector(SentenceTransformersVectorizer embedder, int defaultNumSentences, int maxIterations): Create an extractive selector with full configuration.

Method Summary
- static ExtractiveSelector.Builder builder(SentenceTransformersVectorizer embedder): Builder for ExtractiveSelector.
- List<String> selectKeySentences(List<String> sentences): Select the most representative sentences using the default count.
- List<String> selectKeySentences(List<String> sentences, int k): Select the k most representative sentences from the input.

Constructor Details

ExtractiveSelector
Create an extractive selector with default settings.
Parameters:
- embedder - The sentence transformer vectorizer for embeddings

ExtractiveSelector
Create an extractive selector with a custom default number of sentences.
Parameters:
- embedder - The sentence transformer vectorizer for embeddings
- defaultNumSentences - Default number of sentences to select

ExtractiveSelector
public ExtractiveSelector(SentenceTransformersVectorizer embedder, int defaultNumSentences, int maxIterations)
Create an extractive selector with full configuration.
Parameters:
- embedder - The sentence transformer vectorizer for embeddings
- defaultNumSentences - Default number of sentences to select
- maxIterations - Maximum number of k-means iterations

Method Details

selectKeySentences
Select the most representative sentences using the default count.
Parameters:
- sentences - List of sentences to select from
Returns:
- Selected sentences in original order

selectKeySentences
Select the k most representative sentences from the input.
Algorithm:
- Embed all sentences using BERT
- Cluster embeddings using k-means++
- For each cluster, select the sentence closest to the centroid
- Return sentences in their original order
Parameters:
- sentences - List of sentences to select from
- k - Number of sentences to select
Returns:
- Selected sentences in original order (preserves exact text)
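
The clustering and selection steps can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's implementation. The class `CentroidSelectionSketch` and its methods are hypothetical. The embeddings are hard-coded 2-D vectors standing in for BERT output. Seeding uses a deterministic farthest-point rule as a stand-in for k-means++, so that the result is reproducible. Squared Euclidean distance is assumed as the metric.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class CentroidSelectionSketch {

    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (dist2(p, centroids[c]) < dist2(p, centroids[best])) best = c;
        }
        return best;
    }

    // Returns ascending indices (original order) of the point nearest each centroid.
    // May return fewer than k indices if a cluster ends up empty.
    static List<Integer> selectIndices(double[][] x, int k, int maxIterations) {
        int n = x.length;
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < n; i++) all.add(i);
        if (k <= 0 || n == 0) return new ArrayList<>();
        if (k >= n) return all; // nothing to cluster: keep every sentence
        int dim = x[0].length;

        // Assumption: farthest-point seeding instead of randomized k-means++.
        double[][] centroids = new double[k][];
        centroids[0] = x[0].clone();
        for (int c = 1; c < k; c++) {
            int far = 0; double farD = -1;
            for (int i = 0; i < n; i++) {
                double d = Double.MAX_VALUE;
                for (int j = 0; j < c; j++) d = Math.min(d, dist2(x[i], centroids[j]));
                if (d > farD) { farD = d; far = i; }
            }
            centroids[c] = x[far].clone();
        }

        int[] assign = new int[n];
        Arrays.fill(assign, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            for (int i = 0; i < n; i++) {           // assignment step
                int c = nearest(x[i], centroids);
                if (c != assign[i]) { assign[i] = c; changed = true; }
            }
            if (!changed) break;                    // converged
            double[][] sum = new double[k][dim];    // update step
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {
                count[assign[i]]++;
                for (int d = 0; d < dim; d++) sum[assign[i]][d] += x[i][d];
            }
            for (int c = 0; c < k; c++) {
                if (count[c] == 0) continue;        // keep old centroid if empty
                for (int d = 0; d < dim; d++) centroids[c][d] = sum[c][d] / count[c];
            }
        }

        TreeSet<Integer> picked = new TreeSet<>();  // sorted, so original order
        for (int c = 0; c < k; c++) {
            int best = -1; double bestD = Double.MAX_VALUE;
            for (int i = 0; i < n; i++) {
                if (assign[i] != c) continue;
                double d = dist2(x[i], centroids[c]);
                if (d < bestD) { bestD = d; best = i; }
            }
            if (best >= 0) picked.add(best);
        }
        return new ArrayList<>(picked);
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
            "Redis is an in-memory data store.", "It keeps data in RAM.",
            "Redis stores data in memory.", "K-means groups similar vectors.",
            "Clustering partitions points.", "Centroids summarize clusters.");
        double[][] toyEmbeddings = {                // stand-ins for real embeddings
            {0.0, 0.0}, {0.2, 0.0}, {0.1, 0.1}, {5.0, 5.0}, {5.2, 5.0}, {5.1, 5.1}};
        for (int i : selectIndices(toyEmbeddings, 2, 10)) {
            System.out.println(sentences.get(i));   // exact original text
        }
    }
}
```

With these toy vectors the sketch selects indices 2 and 5, one sentence per topic cluster. Collecting indices in a sorted set is one simple way to return the selected sentences in their original order.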

builder
Builder for ExtractiveSelector.