- Store and retrieve vectors using PointStruct, points.upsert, and points.search.
- Control search behaviour with distance metrics, score thresholds, SearchParams, and pagination.
- Fetch, count, and batch-search points using points.get, points.count, and search_batch.
- Embed — Convert text, images, or audio into dense numerical vectors using a model.
- Store — Insert vectors with metadata into a collection.
- Search — Encode a query into the same vector space and find the nearest neighbors.
- Score — Rank results by distance (cosine, Euclidean, dot product, or Manhattan).
Environment setup
Run this command to install the two packages the tutorial depends on.
Step 1: Import and configure
Run this cell to import the SDK classes, set the server address and collection name, and load the embedding model. The two helper functions at the bottom are used throughout the tutorial to convert text into vectors.
Expected output
The cell prints the configured server address and confirms the embedding model loaded successfully with its dimensionality.
Step 2: Create a collection
Run this cell to create the collection that all subsequent steps will use. If the collection already exists, get_or_create returns without error.
Key parameters
The following parameters are passed to collections.get_or_create() to define the collection structure.
| Parameter | Value | Meaning |
|---|---|---|
| size | 384 | Vector dimensionality; must match the embedding model dimension |
| distance | Distance.Cosine | Similarity metric for scoring |
| m | 16 | HNSW graph connectivity (higher = more accurate, more memory) |
| ef_construct | 128 | HNSW build-time search width (higher = better index quality) |
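The table above maps onto a creation call roughly like the following sketch. The parameter names come from the table; the vde client handle and the exact keyword layout are assumptions, not confirmed SDK signatures:

```python
# Sketch only: parameter names are from the table above; the client
# handle (vde) and call shape are assumptions.
vde.collections.get_or_create(
    name=COLLECTION_NAME,
    size=384,                  # must match the embedding model dimension
    distance=Distance.Cosine,  # similarity metric for scoring
    m=16,                      # HNSW graph connectivity
    ef_construct=128,          # HNSW build-time search width
)
```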
Distance metrics
Actian VectorAI DB supports four distance metrics. The choice is made at collection creation time and cannot be changed afterwards.
| Metric | Enum | Score meaning | Best for |
|---|---|---|---|
| Cosine | Distance.Cosine | Higher = more similar | Normalized text/image embeddings |
| Dot product | Distance.Dot | Higher = more similar | When magnitude matters |
| Euclidean | Distance.Euclid | Lower = more similar | Absolute distance measurement |
| Manhattan | Distance.Manhattan | Lower = more similar | Robust to outlier dimensions |
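For intuition, the four metrics can be computed in a few lines of plain Python. These are toy reference implementations; the server computes them natively over 384-dim vectors:

```python
import math

# Toy reference implementations of the four distance metrics.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)                        # higher = more similar

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))       # higher = more similar

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))  # lower = more similar

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))  # lower = more similar

q = [1.0, 0.0]
print(cosine(q, [1.0, 0.0]))     # 1.0 (identical direction)
print(cosine(q, [0.0, 1.0]))     # 0.0 (orthogonal)
print(manhattan(q, [0.0, 1.0]))  # 2.0
```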
Expected output
A single confirmation line prints once the collection is created (or already exists).
Step 3: Embed and store vectors
Run this cell to embed all ten sample documents and store them as points in the collection. Each point has an integer ID, a 384-dimensional vector, and a payload containing the original text plus topic and difficulty metadata.
How it works
The ingestion pipeline converts raw text into vectors and stores them with metadata in a single batch operation.
- embed_texts() converts each document's text into a 384-dimensional float vector using all-MiniLM-L6-v2.
- PointStruct(id, vector, payload) packages the ID, vector, and metadata together.
- points.upsert() inserts (or updates) the points in the collection.
- vde.flush() persists vectors to disk immediately.
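The upsert semantics (insert new IDs, overwrite existing ones) can be mimicked in plain Python. The PointStruct below is a stand-in dataclass for illustration, not the SDK type:

```python
from dataclasses import dataclass

# Stand-in for the SDK's PointStruct; illustrates upsert semantics only.
@dataclass
class PointStruct:
    id: int
    vector: list
    payload: dict

collection = {}  # toy collection: point id -> point

def upsert(points):
    # insert new IDs, overwrite existing ones (that is what "upsert" means)
    for p in points:
        collection[p.id] = p

upsert([PointStruct(0, [0.1] * 384, {"topic": "machine learning"})])
upsert([
    PointStruct(0, [0.2] * 384, {"topic": "machine learning", "difficulty": "beginner"}),
    PointStruct(1, [0.3] * 384, {"topic": "deep learning"}),
])
print(len(collection))  # 2: id 0 was updated in place, not duplicated
```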
Expected output
The count confirms all ten documents were stored successfully.
Step 4: Run a basic similarity search
Run this cell to search the collection using a natural-language query. The function embeds the query string and returns the five most similar documents, each with its ID, cosine score, topic, and a text preview.
How it works
The search follows a three-step flow: encode, retrieve, rank.
- The query text is embedded into the same 384-dim vector space as the stored documents.
- points.search() finds the nearest vectors by cosine similarity.
- Results are returned as scored point objects, ranked by score (highest first for cosine).
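The encode-retrieve-rank loop can be sketched with toy 2-d vectors standing in for real 384-dim embeddings. The corpus, texts, and vectors below are made up for illustration:

```python
import math

# Toy encode/retrieve/rank: 2-d vectors stand in for 384-d embeddings.
docs = {
    1: ([0.98, 0.20], "Neural networks learn layered representations."),
    2: ([0.90, 0.44], "Transformers process sequences with attention."),
    3: ([0.10, 0.99], "Sourdough needs a long, slow fermentation."),
}

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

query_vec = [1.0, 0.0]  # pretend this came from embedding the query text
ranked = sorted(
    ((cosine(query_vec, vec), pid) for pid, (vec, _) in docs.items()),
    reverse=True,
)
for score, pid in ranked[:2]:  # limit=2
    print(pid, round(score, 2), docs[pid][1])
```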
Key parameters
The following parameters are accepted by points.search().
| Parameter | Type | Default | Purpose |
|---|---|---|---|
| vector | list[float] | required | The query embedding |
| limit | int | 10 | Maximum number of results |
| with_payload | bool | True | Include metadata in results |
Expected output
The five closest documents are printed in score order. The top result is about neural networks, followed by transformers and machine learning, demonstrating that the search captured semantic relationships rather than exact keyword overlap.
Step 5: Understanding scores
The score value returned by each search result depends on the distance metric configured on the collection.
Cosine similarity
For cosine distance, the metric used in this tutorial, scores represent the cosine similarity between normalized vectors. When both the stored vectors and query vectors are unit-normalized (as produced by all-MiniLM-L6-v2), scores range from 0 to 1 and are interpreted as follows.
| Score | Interpretation |
|---|---|
| 1.0 | Identical vectors (perfect match) |
| 0.7–0.9 | Strongly similar |
| 0.4–0.7 | Moderately similar |
| 0.1–0.4 | Weakly similar |
| 0.0 | Orthogonal (no similarity) |
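Since unit-normalized vectors make the score equal to the cosine of the angle between query and document, the bands above correspond to angles. A quick plain-Python check:

```python
import math

# With unit-normalized vectors, the cosine score equals cos(angle between them).
def unit(angle_deg):
    r = math.radians(angle_deg)
    return [math.cos(r), math.sin(r)]

def cos_sim(a, b):
    return sum(x * y for x, y in zip(a, b))  # valid because both are unit length

query = [1.0, 0.0]
for deg in (0, 30, 60, 90):
    print(deg, round(cos_sim(query, unit(deg)), 3))
# 0 degrees -> 1.0 (identical), 30 -> 0.866 (strong),
# 60 -> 0.5 (moderate), 90 -> 0.0 (orthogonal)
```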
Comparing queries
Run this cell to issue three different queries against the collection and compare their score distributions. Each query will return three results with scores that reflect how closely the corpus matches that particular topic.
Expected output
Each query surfaces a different set of top results. The scores shift noticeably between topics, confirming that semantic relevance drives the ranking rather than surface-level word matching.
Step 6: Tune search accuracy with SearchParams
SearchParams controls how the HNSW index is traversed at query time. Adjusting these values lets you trade search speed for recall accuracy.
Run this cell to compare the results of three search modes — low-effort approximate, high-effort approximate, and exact brute-force — against the same query.
SearchParams reference
All fields are optional. Omitting SearchParams entirely uses the collection's default HNSW configuration.
| Parameter | Default | Effect |
|---|---|---|
| hnsw_ef | Collection default | Search-time exploration factor. Higher = more accurate, slower. |
| exact | False | True disables HNSW and performs a brute-force scan (100% recall). |
| indexed_only | False | True skips unindexed segments (useful during bulk ingestion). |
| quantization | None | Controls quantized vector search behavior (see below). |
| ivf_nprobe | Collection default | For IVF indexes: number of partitions to search. |
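Drawing on the reference table above, the three modes compared in this step might be constructed like this sketch. Field names are from the table; the specific hnsw_ef values and import path are assumptions:

```python
# Sketch only: field names from the SearchParams reference table above.
fast = SearchParams(hnsw_ef=32)        # low-effort approximate: faster, lower recall
thorough = SearchParams(hnsw_ef=512)   # high-effort approximate: slower, higher recall
brute = SearchParams(exact=True)       # exact brute-force scan: 100% recall
```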
Quantization-aware search
When a collection uses scalar or product quantization, useQuantizationSearchParams to control how quantized vectors are used during the search. The following example enables rescoring, which re-ranks the initial candidates using the original full-precision vectors for higher accuracy.
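A sketch of that rescoring setup, using the field names documented in the table below; nesting the object inside SearchParams is an assumption about the call shape:

```python
# Sketch only: enables quantized first-pass search with full-precision rescoring.
params = SearchParams(
    quantization=QuantizationSearchParams(
        ignore=False,      # use quantized vectors for the initial search (fast)
        rescore=True,      # re-rank candidates with full-precision vectors
        oversampling=2.0,  # retrieve 2x candidates before rescoring
    )
)
```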
| Parameter | Effect |
|---|---|
| ignore=False | Use quantized vectors for initial search (fast) |
| rescore=True | Re-rank candidates with original full-precision vectors |
| oversampling=2.0 | Retrieve 2x candidates before rescoring for higher recall |
Step 7: Score threshold — filter low-confidence results
score_threshold discards results below a minimum similarity score server-side before they are returned. Run this cell to see how raising the threshold progressively narrows the result set for a deep-learning query.
When to use score thresholds
Choose a threshold based on how strictly the results need to match the query intent.
| Scenario | Threshold |
|---|---|
| Exploratory search (cast wide net) | 0.2–0.3 |
| General retrieval | 0.4–0.5 |
| Precise matching (reduce false positives) | 0.6–0.7 |
| Near-duplicate detection | 0.8+ |
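What the server does with score_threshold can be shown client-side with a hard-coded result list (the scores are made up for illustration):

```python
# Client-side picture of what score_threshold does server-side:
# results below the cutoff are dropped before being returned.
results = [(1, 0.82), (2, 0.74), (3, 0.55), (4, 0.31)]  # (id, score), made up

def apply_threshold(results, score_threshold):
    return [(pid, s) for pid, s in results if s >= score_threshold]

print(len(apply_threshold(results, 0.3)))  # 4 results: exploratory
print(len(apply_threshold(results, 0.5)))  # 3 results: general retrieval
print(len(apply_threshold(results, 0.7)))  # 2 results: precise matching
```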
Expected output
The result counts drop as the threshold rises, and the strict pass returns only the two documents that score above 0.7.
Step 8: Pagination with offset and limit
For large result sets, use offset and limit to retrieve results one page at a time. Run this cell to walk through three pages of results for a programming query, with three results per page.
How pagination works
Each call advances the window by incrementing offset by limit. Results are always ranked by similarity score before the window is applied.
| Call | Offset | Limit | Returns |
|---|---|---|---|
| Page 1 | 0 | 3 | Results 1–3 |
| Page 2 | 3 | 3 | Results 4–6 |
| Page 3 | 6 | 3 | Results 7–9 |
offset skips the first N results and limit controls how many are returned per page.
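The window is just a slice over the ranked list, as this plain-Python model shows (the result IDs are made up):

```python
# Pagination is a slice over the score-ranked result list:
# offset skips, limit takes.
ranked_ids = [7, 2, 9, 1, 5, 3, 8, 4, 6]  # 9 results, already sorted by score

def page(ranked, offset, limit):
    return ranked[offset : offset + limit]

print(page(ranked_ids, 0, 3))  # page 1 -> [7, 2, 9]
print(page(ranked_ids, 3, 3))  # page 2 -> [1, 5, 3]
print(page(ranked_ids, 6, 3))  # page 3 -> [8, 4, 6]
```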
Expected output
Three labeled pages print in sequence, each showing a different slice of the ranked result set.
Step 9: Retrieve points by ID
points.get retrieves specific points by their IDs without performing any vector similarity search. Run this cell to fetch points 0, 4, and 6 and print their topic and text.
Parameters
The following parameters control what points.get() returns alongside the point IDs.
| Parameter | Default | Purpose |
|---|---|---|
| ids | required | List of point IDs (int or UUID string) |
| with_payload | True | Include payload in response |
| with_vectors | False | Include vector data in response |
Expected output
The three requested points are returned with their payload metadata. No vector data is included because with_vectors is set to False.
Step 10: Count points
points.count returns the number of points in a collection, with an option to apply a filter. Run this cell to count the total collection, an approximate count, and two filtered subsets.
The exact flag trades speed for accuracy. Choose based on whether the count needs to be precise.
| Mode | Speed | Use case |
|---|---|---|
| exact=True | Slower | Precise counts for reports |
| exact=False | Faster | Dashboard approximations |
Expected output
Both the exact and approximate counts return 10 for this small collection. The filtered counts confirm there are two deep learning documents and three beginner-level documents.
Step 11: Batch search — multiple queries in one call
search_batch sends up to 100 searches in a single gRPC round-trip, which eliminates per-request connection overhead. Run this cell to issue three different queries simultaneously and print their results side by side.
Why batch search matters
Sending multiple searches in a single call eliminates per-request connection overhead and reduces total latency significantly at scale.
| Approach | Network round-trips | Overhead |
|---|---|---|
| 3 separate search() calls | 3 | 3x connection overhead |
| 1 search_batch() call | 1 | Minimal overhead |
Each search in the batch accepts the same parameters as a single search: vector, limit, filter, params, score_threshold, using, and offset. The results are returned in the same order as the input queries.
Maximum batch size: 100 searches per call.
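A toy model of the batch contract: each entry is an independent request, and the response list preserves input order (the corpus and vectors are made up):

```python
# Toy model of search_batch: one list in, one ordered list of result sets out.
docs = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.7, 0.7]}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(vector, limit):
    return sorted(docs, key=lambda i: dot(vector, docs[i]), reverse=True)[:limit]

def search_batch(searches):
    # the real endpoint evaluates all requests server-side in one round-trip
    return [search(**req) for req in searches]

out = search_batch([
    {"vector": [1.0, 0.0], "limit": 1},
    {"vector": [0.0, 1.0], "limit": 2},
])
print(out)  # [[1], [2, 3]]: results line up with the input order
```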
Expected output
All three queries return results in a single round-trip, each with its own ranked list.
Step 12: The universal query endpoint
points.query is a more powerful alternative to points.search. It supports vector search, payload ordering, server-side fusion, random sampling, and multi-stage prefetch — all through a single endpoint.
Vector search via points.query
Run this cell to perform a standard nearest-neighbour search using points.query. It produces the same ranked results as points.search but makes the full query feature set available.
Payload-sorted retrieval
Run this cell to retrieve points sorted by the difficulty payload field rather than by vector similarity. Passing an OrderBy object instead of a vector tells the endpoint to skip similarity computation entirely.
Multi-stage prefetch
Run this cell to run two filtered sub-searches in parallel, one for machine learning documents and one for deep learning documents, and then re-rank the merged candidate pool with a final similarity query, all in a single round-trip.
How prefetch works
Prefetch executes the filtered sub-searches first, then merges their results for a final re-ranking pass.
- In the first stage, the engine fetches candidates matching the machine learning topic filter.
- In the second stage, the engine fetches candidates matching the deep learning topic filter.
- In the final stage, the top-level query re-ranks the merged candidate pool by similarity.
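The three stages above can be modeled in plain Python: two filtered candidate pools, merged, then re-ranked by a final similarity pass (topics and vectors are made up):

```python
# Plain-Python model of the prefetch stages: filter, merge, re-rank.
docs = {
    1: {"topic": "machine learning", "vec": [0.9, 0.1]},
    2: {"topic": "deep learning",    "vec": [0.2, 0.9]},
    3: {"topic": "machine learning", "vec": [0.6, 0.6]},
    4: {"topic": "databases",        "vec": [1.0, 0.0]},
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prefetch(topic):
    # stage 1 / stage 2: filtered candidate fetch, no scoring yet
    return {i for i, d in docs.items() if d["topic"] == topic}

pool = prefetch("machine learning") | prefetch("deep learning")
query = [1.0, 0.0]
# final stage: re-rank the merged pool by similarity to the query
final = sorted(pool, key=lambda i: dot(query, docs[i]["vec"]), reverse=True)
print(final)  # [1, 3, 2]; doc 4 is never scored since it matched no prefetch
```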
Step 13: Return vectors with results
Setting with_vectors=True includes the raw embedding vectors in the response alongside the payload and score. Run this cell to search for “machine learning” and print the dimensionality and first five values of each returned vector.
When to return vectors
Returning vectors increases response size significantly (each 384-dim float32 vector adds approximately 1.5 KB per result), so only enable this when needed.
| Use case | with_vectors |
|---|---|
| Normal search (most cases) | False (default) |
| Client-side re-ranking | True |
| Similarity visualization (t-SNE, UMAP) | True |
| Debugging embeddings | True |
| Export for another system | True |
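The 1.5 KB figure is simple arithmetic, assuming float32 storage at 4 bytes per value:

```python
# Response-size estimate for with_vectors=True.
dims, bytes_per_float32 = 384, 4
per_result = dims * bytes_per_float32
print(per_result)              # 1536 bytes, about 1.5 KB per result
print(10 * per_result / 1024)  # 15.0 KB for a 10-result page
```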
Selective payload with WithPayloadSelector
Instead of with_payload=True (which returns all payload fields), use WithPayloadSelector to include or exclude specific fields.
Expected output
Each result includes the full 384-dimensional vector. The dimensionality confirms the vector is present, and the first five values show a sample of its contents.
Step 14: Combine search with filters
Filters restrict which points are considered during similarity search. The filter is evaluated server-side before ranking, so only matching points are scored. Run this cell to search by topic and by difficulty level separately.
Expected output
The first search returns only machine learning documents, and the second returns only beginner-level documents, regardless of topic.
Step 15: Collection cleanup
Run this cell to flush any pending writes to disk and confirm the vector count. Uncomment the delete lines to remove the collection entirely once finished.
Expected output
The vector count confirms nothing was lost during the session, and the flush line confirms all data is persisted to disk.
Complete API reference
The following tables summarize the methods, parameters, and distance metrics covered in this tutorial.
Core search methods
The primary methods for running vector similarity searches are listed below.
| Method | Purpose |
|---|---|
| points.search(vector, limit, ...) | Find nearest vectors by similarity |
| points.search_batch(searches) | Run up to 100 searches in one call |
| points.query(query, ...) | Universal endpoint: search, order, fuse, sample, prefetch |
| points.query_batch(queries) | Run up to 100 queries in one call |
Retrieval and counting
The following methods fetch points by ID and count collection contents.
| Method | Purpose |
|---|---|
| points.get(ids, ...) | Retrieve specific points by ID |
| points.count(filter, exact) | Count points, optionally filtered |
Search parameters
All search methods accept the following parameters to control retrieval behaviour.
| Parameter | Type | Purpose |
|---|---|---|
| vector | list[float] | Query embedding |
| limit | int | Maximum results |
| filter | Filter | Payload filter conditions |
| params | SearchParams | HNSW ef, exact mode, quantization, IVF nprobe |
| score_threshold | float | Minimum score cutoff |
| offset | int | Skip first N results (pagination) |
| using | str | Named vector to search |
| with_payload | bool or WithPayloadSelector | Control payload in response |
| with_vectors | bool | Control vectors in response |
Distance metrics
The metric must be set at collection creation time and cannot be changed afterwards.
| Metric | Score direction | Distance enum |
|---|---|---|
| Cosine | Higher = more similar | Distance.Cosine |
| Dot product | Higher = more similar | Distance.Dot |
| Euclidean | Lower = more similar | Distance.Euclid |
| Manhattan | Lower = more similar | Distance.Manhattan |
Next steps
Now that you can embed, store, search, and tune vector queries, explore the following tutorials to add more capabilities to your search pipeline.
Predicate filters
Combine similarity search with structured payload constraints
Hybrid search patterns
Mix dense and sparse retrieval with fusion
Filtering with boolean logic
Add must, should, and must_not conditions
Geospatial search
Make retrieval location-aware