What you build
A user describes what they want to watch in natural language — “a suspenseful space movie” — and the system finds the best matches from the database, optionally filtered by genre, year, or minimum rating. The diagram below shows how data flows from raw movie records through embedding and into a searchable vector store.

Prerequisites
Before starting, make sure the following are in place.
- Python 3.10 or later.
- `pip` available in your environment (verify with `pip --version`).
- A virtual environment activated (recommended: `python -m venv .venv && source .venv/bin/activate`).
- An Actian VectorAI DB server running (default: `localhost:50051`).
- Internet access on first run — `sentence-transformers` downloads the embedding model (`all-MiniLM-L6-v2`, approximately 90 MB) from Hugging Face when you first call `SentenceTransformer(EMBED_MODEL)`.
- At least 512 MB of free memory to load the embedding model.
Step 1: Install dependencies
The following command installs the Actian VectorAI SDK and the sentence embedding library. Run it inside your virtual environment.

| Package | Purpose |
|---|---|
| `actian-vectorai` | Official Python SDK — async/sync clients, Filter DSL, gRPC transport. |
| `sentence-transformers` | Open-source library for generating text embeddings. |
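Assuming the PyPI distribution names match the package names in the table above, the install command would be:

```shell
pip install actian-vectorai sentence-transformers
```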
Step 2: Import libraries and configure
The following snippet imports every class needed for this tutorial and sets three constants that identify the server address, collection name, and embedding model. Running it loads the model into memory and prints the resolved configuration so you can confirm the values before proceeding.

| Import | Purpose |
|---|---|
| `AsyncVectorAIClient` | Manages the gRPC connection to VectorAI DB. |
| `Distance` | Enum for similarity metrics (Cosine, Dot, Euclid, Manhattan). |
| `Field` | Builds type-safe conditions on payload fields. |
| `FilterBuilder` | Combines conditions with boolean logic (AND / OR / NOT). |
| `PointStruct` | A data point: ID + vector + payload (metadata). |
| `VectorParams` | Configuration for the vector space: dimension + distance. |
| `HnswConfigDiff` | Tuning parameters for the HNSW search index. |
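A sketch of this setup step. The import names come from the table above; the `actian_vectorai` module path and the exact module layout are assumptions — check your SDK version.

```python
# Sketch: imports and configuration for the tutorial.
# The actian_vectorai module path is an assumption; the class names
# come from the tutorial's import table.
from actian_vectorai import (
    AsyncVectorAIClient, Distance, Field, FilterBuilder,
    PointStruct, VectorParams, HnswConfigDiff,
)
from sentence_transformers import SentenceTransformer

SERVER = "localhost:50051"        # gRPC address of the VectorAI DB server
COLLECTION = "Movies"             # collection created in Step 4
EMBED_MODEL = "all-MiniLM-L6-v2"  # 384-dimensional sentence embeddings

model = SentenceTransformer(EMBED_MODEL)  # downloads ~90 MB on first run

print(f"Server:     {SERVER}")
print(f"Collection: {COLLECTION}")
print(f"Model:      {EMBED_MODEL} ({model.get_sentence_embedding_dimension()}d)")
```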
Expected output
The three constants are printed in order — server address, collection name, and the model name with its output dimension.

Step 3: Connect to the server
The following snippet opens a gRPC connection to the server, calls `health_check()`, and prints the server’s version information. If the connection fails, an exception is raised inside the `async with` block and the error message identifies the problem.
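A sketch of the connection check, assuming `AsyncVectorAIClient` works as an async context manager and that the `actian_vectorai` module path is correct:

```python
import asyncio

from actian_vectorai import AsyncVectorAIClient  # assumed module path

SERVER = "localhost:50051"

async def check_connection():
    # The async context manager opens the gRPC channel on entry
    # and closes it on exit, even if an exception is raised.
    async with AsyncVectorAIClient(url=SERVER) as client:
        health = await client.health_check()
        print(f"Connected to {SERVER}: {health}")

asyncio.run(check_connection())
```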
Expected output
When the server is reachable, health information similar to the following is printed, confirming that the server at `localhost:50051` responded.
When `check_connection()` runs, the `async with AsyncVectorAIClient(...)` block manages the gRPC connection lifecycle. The client opens a channel to `SERVER`, runs the coroutine body including `health_check()`, and closes the channel when the block exits, so resources are released even if something fails. The sequence is as follows.
- `AsyncVectorAIClient(url=SERVER)` creates a client instance.
- `async with` opens a gRPC channel and verifies the server is reachable.
- `health_check()` pings the server and returns status information.
- When the `async with` block exits, the connection is closed cleanly.
Step 4: Create a collection
A collection is a named container for vectors. Think of it as a table in a relational database, but optimized for similarity search. The following snippet calls `get_or_create`, which creates the collection if it does not already exist. On first run it prints `created`; on subsequent runs it prints `already exists`. The function returns a boolean indicating whether a new collection was provisioned.
The arguments to `get_or_create` define the vector dimension, how similarity is measured, and how the HNSW index is built. The table below explains each parameter.
| Parameter | Value | Meaning |
|---|---|---|
| `size=384` | Vector dimension | Must match the embedding model’s output dimension. |
| `distance=Distance.Cosine` | Similarity metric | Cosine similarity is ideal for sentence transformers. |
| `m=16` | HNSW graph connections | Each node connects to 16 neighbours — balances speed and recall. |
| `ef_construct=128` | Build-time search width | Higher values improve index quality at the cost of build time. |
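A sketch of the collection setup, assuming the `get_or_create` signature from the summary table; the `hnsw_config` keyword name is an assumption:

```python
import asyncio

from actian_vectorai import (  # assumed module path
    AsyncVectorAIClient, Distance, VectorParams, HnswConfigDiff,
)

SERVER, COLLECTION = "localhost:50051", "Movies"

async def main():
    async with AsyncVectorAIClient(url=SERVER) as client:
        # True when a new collection was provisioned, False when it existed.
        created = await client.collections.get_or_create(
            COLLECTION,
            vectors_config=VectorParams(size=384, distance=Distance.Cosine),
            hnsw_config=HnswConfigDiff(m=16, ef_construct=128),
        )
        print("created" if created else "already exists")

asyncio.run(main())
```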
Why use get_or_create
get_or_create is safe to call repeatedly. When the collection does not yet exist, the SDK creates it and returns True. When the collection already exists, the SDK skips creation and returns False. This boolean return value lets you log whether a new collection was provisioned, and your scripts become idempotent — safe to re-run without side effects.
Expected output
get_or_create prints whether it provisioned a new collection or found one that already existed.
Step 5: Create embedding helpers
The following two functions wrap the sentence transformer model. `embed_text` encodes a single string; `embed_texts` encodes a list of strings in one forward pass and is significantly faster when processing multiple items.
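A sketch of the two helpers; `SentenceTransformer.encode` accepts either a single string or a list, which is what makes the batched variant a one-liner:

```python
from sentence_transformers import SentenceTransformer

EMBED_MODEL = "all-MiniLM-L6-v2"
model = SentenceTransformer(EMBED_MODEL)

def embed_text(text: str) -> list[float]:
    """Encode one string into a 384-dimensional vector."""
    return model.encode(text).tolist()

def embed_texts(texts: list[str]) -> list[list[float]]:
    """Encode many strings in a single forward pass."""
    return model.encode(texts).tolist()

vec = embed_text("a suspenseful space movie")
print(f"dimension: {len(vec)}")
print("sample values:", vec[:5])
```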
Expected output
The vector dimension confirms the model loaded correctly. The five sample values may differ slightly across platforms and library versions because of floating-point variation, even though the model weights themselves are fixed.
- Speed: `embed_texts` processes all texts in a single forward pass through the model, which is significantly faster than calling `embed_text` in a loop.
- Efficiency: Batching reduces CPU and memory overhead compared to encoding one string at a time.
- Best practice: Always batch when embedding more than a few texts.
Step 6: Prepare your data
Each movie becomes a point in the collection. A point has three parts.
- ID — A unique identifier (integer or UUID string).
- Vector — An embedding of the movie’s plot description.
- Payload — Structured metadata (genre, year, rating, and so on).
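The tutorial's full dataset holds ten movies. The sketch below shows the record shape with the two titles the tutorial names; the plot text is paraphrased and the fields other than Interstellar's rating and director are illustrative, not the tutorial's exact data.

```python
# Record shape for the tutorial dataset. IDs 0 and 9 are the two titles the
# tutorial names; plot text and Blade Runner 2049's rating are illustrative.
movies = [
    {
        "id": 0,
        "title": "Interstellar",
        "plot": "Explorers travel through a wormhole in space in an attempt "
                "to ensure humanity's survival.",
        "genre": "sci-fi",
        "year": 2014,
        "rating": 8.7,
        "director": "Christopher Nolan",
    },
    {
        "id": 9,
        "title": "Blade Runner 2049",
        "plot": "A young blade runner unearths a secret that could plunge "
                "what is left of society into chaos.",
        "genre": "sci-fi",
        "year": 2017,
        "rating": 8.0,
        "director": "Denis Villeneuve",
    },
    # ...eight more records in the full ten-movie dataset...
]

for m in movies:
    print(m["id"], m["title"])
```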
Step 7: Embed and store the data
The following snippet embeds every plot in a single batch, wraps each movie as a `PointStruct`, sends all ten points to the server in one upsert call, flushes the data to disk, and then reads back the total vector count to confirm the write succeeded.
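A sketch of the ingestion step, assuming the `movies` list from Step 6, the embedding helpers and constants from Steps 2 and 5, an open `client` as in Step 3, and the API shapes from the summary table:

```python
from actian_vectorai import PointStruct  # assumed module path

async def ingest(client, movies):
    # One batched forward pass for all ten plots.
    vectors = embed_texts([m["plot"] for m in movies])
    points = [
        PointStruct(id=m["id"], vector=vec, payload=m)
        for m, vec in zip(movies, vectors)
    ]
    await client.points.upsert(COLLECTION, points=points)  # insert-or-update
    await client.vde.flush(COLLECTION)                     # persist to disk
    total = await client.vde.get_vector_count(COLLECTION)
    print(f"stored vectors: {total}")
```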
Expected output
After a successful upsert and flush, the stored count matches the number of points sent. The total reported by `get_vector_count` confirms all ten movies were persisted.
- `embed_texts` converts all 10 plots into 384-dimensional vectors in one batch.
- Each movie becomes a `PointStruct` with an integer ID, the plot vector, and the full metadata as payload.
- `points.upsert` sends the points to the server (“upsert” means insert-or-update).
- `vde.flush` ensures the data is persisted to disk immediately.
- `vde.get_vector_count` confirms how many vectors are stored.
Step 8: Run your first semantic search
The following snippet embeds a natural-language query string, sends the query vector to the server, and prints the top five most similar movies ranked by cosine similarity score. The `search` call accepts three key parameters that control what is returned.
| Parameter | Value | Purpose |
|---|---|---|
| `vector` | Query embedding | The search finds vectors closest to this one. |
| `limit=5` | Top 5 results | Number of results to return. |
| `with_payload=True` | Include metadata | Returns title, genre, year, and other fields with each result. |
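A sketch of the search call using the three parameters above; the `score` and `payload` attributes on each hit are assumptions about the result object's shape:

```python
async def search_movies(client, query: str, limit: int = 5):
    results = await client.points.search(
        COLLECTION,
        vector=embed_text(query),   # embed the natural-language query
        limit=limit,
        with_payload=True,
    )
    for hit in results:
        p = hit.payload
        print(f"{hit.score:.3f}  {p['title']} ({p['year']}) — {p['genre']}")
    return results

# await search_movies(client, "a suspenseful space movie")
```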
Expected output
The results are ranked by cosine similarity score. The embedding model surfaces space-themed films even when the exact query words do not appear in their plot descriptions.

Step 9: Filter by metadata
Filters restrict the candidate set before vector ranking, so similarity scores are only compared within the matching subset. Actian VectorAI DB provides the `Field` and `FilterBuilder` classes for this purpose. The examples below show how to filter by genre, by a minimum rating, and by a combination of both.
Filter by genre
The following snippet defines a `search_by_genre` function that builds a `must` condition on the `genre` field. Only points where `genre` equals the provided value are considered during ranking. Calling the function with `"sci-fi"` returns the top sci-fi matches for the query.
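A sketch of `search_by_genre`, assuming the Filter DSL names from this tutorial and a `filter` keyword on `points.search` (mirroring the delete-by-filter and count calls in the summary table):

```python
from actian_vectorai import Field, FilterBuilder  # assumed module path

async def search_by_genre(client, query: str, genre: str, limit: int = 5):
    # Candidates are restricted to one genre before similarity ranking.
    flt = FilterBuilder().must(Field("genre").eq(genre)).build()
    return await client.points.search(
        COLLECTION,
        vector=embed_text(query),
        limit=limit,
        filter=flt,
        with_payload=True,
    )

# await search_by_genre(client, "a suspenseful space movie", "sci-fi")
```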
Expected output
Only sci-fi movies are scored and returned. Non-matching genres are excluded before ranking, so the similarity scores reflect distance within the sci-fi subset only.

`Field("genre").eq("sci-fi")` creates a condition that passes only movies where `genre` equals `"sci-fi"`. The filter is applied before ranking, so the search only scores matching points.
Filter by minimum rating
The following snippet uses `.gte()` on the numeric `rating` field to restrict results to movies at or above a minimum quality threshold.
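A sketch of the rating filter, under the same assumptions as the genre example:

```python
async def search_min_rating(client, query: str, min_rating: float, limit: int = 5):
    # .gte() passes only points whose rating payload is >= the threshold.
    flt = FilterBuilder().must(Field("rating").gte(min_rating)).build()
    return await client.points.search(
        COLLECTION,
        vector=embed_text(query),
        limit=limit,
        filter=flt,
        with_payload=True,
    )

# await search_min_rating(client, "a suspenseful space movie", 8.8)
```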
Expected output
Only movies with a rating of 8.8 or above are included. The results are still ordered by semantic similarity to the query, not by rating.

The `.gte()` condition passes only points whose `rating` payload value is greater than or equal to the threshold.
Step 10: Combine multiple filters
FilterBuilder supports three types of boolean logic. Each method narrows or expands the candidate set in a different way.
| Method | Meaning | SQL equivalent |
|---|---|---|
| `.must()` | All conditions must match. | AND |
| `.should()` | At least one condition should match. | OR |
| `.must_not()` | Exclude any points that match. | NOT |
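A sketch of the combined filter for this step; `.gt()` (strictly greater than) is an assumed operator alongside the `.gte()` and `.lt()` shown elsewhere in the tutorial:

```python
# Released after 2000 AND rated at least 8.5 AND NOT in the drama genre.
combined = (
    FilterBuilder()
    .must(Field("year").gt(2000))
    .must(Field("rating").gte(8.5))
    .must_not(Field("genre").eq("drama"))
    .build()
)

# results = await client.points.search(
#     COLLECTION, vector=embed_text(query), limit=5,
#     filter=combined, with_payload=True,
# )
```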
Expected output
All three conditions are applied simultaneously. Only films released after 2000, rated at least 8.5, and not in the drama genre are considered for ranking.

Step 11: Retrieve a specific movie by ID
The following snippet fetches movie ID 0 directly from the collection by passing the integer ID to `points.get()`. No search is performed — the server returns the exact point and its payload.
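A sketch of `get_movie`, assuming `points.get` returns a list of point objects carrying a `payload` attribute:

```python
async def get_movie(client, movie_id: int):
    points = await client.points.get(
        COLLECTION, ids=[movie_id], with_payload=True,
    )
    if not points:                       # nothing stored under that ID
        print(f"no point with ID {movie_id}")
        return None
    p = points[0].payload
    print(f"{p['title']} ({p['year']}) — {p['genre']}, rating {p['rating']}")
    print(p["plot"])
    return p

# await get_movie(client, 0)
```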
Expected output
Point 0 is the first movie ingested in this tutorial, so the output shows Interstellar’s full payload.
The snippet calls `points.get()` with `with_payload=True`, so the server returns the exact point and its complete metadata without performing any similarity search. The function checks whether any points were returned, then prints the title, year, genre, rating, and full plot description of the matching record.
Step 12: Update movie metadata
After ingestion, payload fields can be updated without re-embedding the vector. The following snippet calls `set_payload` to change the rating for movie ID 0 to 8.8, then calls `get_movie` to confirm the change was applied.
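A sketch of the rating update, using the `set_payload` signature from the summary table:

```python
async def update_rating(client, movie_id: int, new_rating: float):
    # Merges only the rating field; the vector and other fields are untouched.
    await client.points.set_payload(
        COLLECTION, payload={"rating": new_rating}, ids=[movie_id],
    )

# await update_rating(client, 0, 8.8)
# await get_movie(client, 0)   # read back to confirm the change
```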
Expected output
The second call to `get_movie(0)` confirms the rating was updated from 8.7 to 8.8 while all other fields remain unchanged.
set_payload merges the provided fields into the existing payload. Three properties define its behaviour.
- Merge behaviour: Only the specified fields are updated. All other fields in the existing payload remain unchanged.
- No re-embedding: The vector stays the same — only the metadata is modified, so there is no reprocessing cost.
- Immediate effect: Subsequent searches and retrievals reflect the updated values right away.
Add new fields
set_payload can also add entirely new keys to a point. The following snippet adds a tags list to movie ID 0. Because set_payload merges rather than replaces, the title, plot, genre, and all other existing fields are preserved.
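A sketch of `add_tags`; because `set_payload` merges, adding a new key is the same call as updating an existing one:

```python
async def add_tags(client, movie_id: int, tags: list[str]):
    # The new "tags" key is merged into the payload; existing fields survive.
    await client.points.set_payload(
        COLLECTION, payload={"tags": tags}, ids=[movie_id],
    )

# await add_tags(client, 0, ["space", "wormhole", "survival", "time-dilation"])
```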
The follow-up `get_movie(0)` call confirms the new `tags` field was merged into the payload. All previously stored fields — title, plot, genre, year, and rating — remain intact.
The snippet calls `add_tags` with movie ID 0 and a list of four descriptive tags: space, wormhole, survival, and time-dilation. The `set_payload` call merges the new `tags` field into the existing payload for that point, leaving all previously stored fields — title, plot, genre, year, rating, and director — unchanged. The follow-up call to `get_movie(0)` reads the point back from the collection so you can confirm the tags were stored correctly.
Step 13: Delete points
Points can be removed individually by ID or in bulk by filter.

Delete by ID
The following snippet removes movie ID 9 by passing an explicit ID list to `points.delete()`, then reads back the vector count to confirm the deletion.
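A sketch of delete-by-ID with a flush and count read-back; flushing after the delete follows the "always flush after writes" pattern at the end of this tutorial, though whether `delete` strictly requires it is an assumption:

```python
async def delete_movie(client, movie_id: int):
    await client.points.delete(COLLECTION, ids=[movie_id])
    await client.vde.flush(COLLECTION)   # persist the deletion
    total = await client.vde.get_vector_count(COLLECTION)
    print(f"vectors remaining: {total}")

# await delete_movie(client, 9)   # removes Blade Runner 2049
```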
Expected output
The vector count drops from 10 to 9, confirming that movie ID 9 (Blade Runner 2049) was removed from the collection.
The snippet passes a single ID — 9, corresponding to “Blade Runner 2049”, the last movie in the dataset — to `points.delete()`. After the deletion, `vde.get_vector_count` reads the updated total and prints it so you can confirm the point was removed.
Delete by filter
The following snippet deletes all movies whose rating falls below a given threshold. The `filter_obj` uses `.lt()` (less than) to identify matching points. The vector count is read before and after the operation so the result is visible.
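A sketch of delete-by-filter, reading the count before and after so the effect is visible:

```python
async def delete_below_rating(client, threshold: float):
    before = await client.vde.get_vector_count(COLLECTION)
    # .lt() matches points whose rating is strictly below the threshold.
    filter_obj = FilterBuilder().must(Field("rating").lt(threshold)).build()
    await client.points.delete(COLLECTION, filter=filter_obj)
    await client.vde.flush(COLLECTION)
    after = await client.vde.get_vector_count(COLLECTION)
    print(f"deleted {before - after} points ({before} -> {after})")
```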
Step 14: Count points
The following snippet counts the total number of points in the collection, then runs three filtered counts to check how many sci-fi movies exist, how many have a rating of 8.8 or higher, and how many were directed by Christopher Nolan. The `exact` parameter controls whether the count is precise or approximate. The table below explains the trade-off.
| Value | Behaviour |
|---|---|
| `exact=True` | Scans all points and returns the precise count. |
| `exact=False` | Uses an approximate count from the index (faster for large collections). |
For a small dataset like this one, a full scan is cheap, so the examples use `exact=True`. For millions of points, `exact=False` avoids a full scan.
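A sketch of the four counts; whether `points.count` returns a raw integer or a response object with a `.count` attribute depends on the SDK version:

```python
async def show_counts(client):
    total = await client.points.count(COLLECTION, exact=True)
    sci_fi = await client.points.count(
        COLLECTION,
        filter=FilterBuilder().must(Field("genre").eq("sci-fi")).build(),
        exact=True,
    )
    top_rated = await client.points.count(
        COLLECTION,
        filter=FilterBuilder().must(Field("rating").gte(8.8)).build(),
        exact=True,
    )
    nolan = await client.points.count(
        COLLECTION,
        filter=FilterBuilder()
        .must(Field("director").eq("Christopher Nolan"))
        .build(),
        exact=True,
    )
    # If your SDK returns response objects, print e.g. sci_fi.count instead.
    print(total, sci_fi, top_rated, nolan)
```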
`client.points.count()` returns a count response object. The integer count is accessed via the `.count` attribute (for example, `sci_fi.count`). The code samples above print the response object directly for readability; update them to access `.count` if your SDK version returns a structured object rather than a raw integer.

Expected output
Counts reflect the tutorial dataset after the earlier deletion of movie ID 9.
All four counts use `exact=True`. The first count returns the total number of points currently in the collection. The second filters by `genre == "sci-fi"`, the third by `rating >= 8.8`, and the fourth by `director == "Christopher Nolan"`. The results reflect the dataset state after movie ID 9 was deleted in Step 13.
Step 15: Inspect collection status
The following snippet retrieves the collection’s status, configuration, and current VDE lifecycle state, then prints them alongside the vector count. Run this at any point to verify the collection is healthy before running searches.

Expected output

A green status and an active VDE state indicate the collection is healthy and ready for searches.

The code calls `collections.get_info` to retrieve the collection’s operational status and vector configuration, then calls `vde.get_state` to read the current VDE lifecycle state, and finally calls `vde.get_vector_count` to confirm the number of stored vectors. All three values are printed together so you can verify the collection is healthy and correctly configured before running searches.
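A sketch of the status check, using the three calls named in this step:

```python
async def show_status(client):
    info = await client.collections.get_info(COLLECTION)   # status + config
    state = await client.vde.get_state(COLLECTION)         # VDE lifecycle state
    total = await client.vde.get_vector_count(COLLECTION)
    print(f"status:    {info}")
    print(f"VDE state: {state}")
    print(f"vectors:   {total}")
```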
Step 16: List all collections
The following snippet retrieves the names of every collection on the server and prints them as a numbered list. This is useful for confirming which collections are available before connecting a client.

Expected output
Because only one collection was created in this tutorial, `collections.list()` returns a single entry. The count in the header updates automatically as collections are added or removed.
The snippet calls `collections.list()`, which returns the names of all collections currently provisioned on the server. The result is printed as a numbered list with the total count shown in the header. In this tutorial only one collection has been created, so the output lists Movies as the single entry.
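A sketch of the listing, assuming `collections.list()` returns an iterable of collection names:

```python
async def list_collections(client):
    names = await client.collections.list()
    print(f"Collections ({len(names)}):")
    for i, name in enumerate(names, start=1):
        print(f"  {i}. {name}")
```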
Step 17: Put it all together — a complete search function
The previous steps introduced each operation individually. This section consolidates them into a single reusable `recommend_movies` function that accepts optional filters and applies only the ones provided.
The function below accepts a natural-language query and four optional filter parameters. For each filter that is not `None`, the corresponding condition is added to the `FilterBuilder`. Running the three example calls prints results for an unfiltered sci-fi query, a crime story filtered to high-rated movies, and a feel-good query that excludes crime films and is limited to movies from 1990 onwards.
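A sketch of `recommend_movies`, combining the helpers from earlier steps. Passing `filter=None` when no conditions were added is an assumption about how the SDK treats an absent filter:

```python
async def recommend_movies(
    client,
    query: str,
    *,
    genre: str | None = None,
    min_rating: float | None = None,
    min_year: int | None = None,
    exclude_genre: str | None = None,
    limit: int = 5,
):
    # `is not None` keeps valid falsy values (e.g. a 0.0 threshold) applied.
    builder = FilterBuilder()
    active = []
    if genre is not None:
        builder = builder.must(Field("genre").eq(genre))
        active.append(f"genre={genre}")
    if min_rating is not None:
        builder = builder.must(Field("rating").gte(min_rating))
        active.append(f"rating>={min_rating}")
    if min_year is not None:
        builder = builder.must(Field("year").gte(min_year))
        active.append(f"year>={min_year}")
    if exclude_genre is not None:
        builder = builder.must_not(Field("genre").eq(exclude_genre))
        active.append(f"not-{exclude_genre}")

    results = await client.points.search(
        COLLECTION,
        vector=embed_text(query),
        limit=limit,
        filter=builder.build() if active else None,
        with_payload=True,
    )
    print(f"query={query!r} filters={active or 'none'} matches={len(results)}")
    for hit in results:
        p = hit.payload
        print(f"  {hit.score:.3f}  {p['title']} ({p['year']}) — {p['plot'][:60]}...")
    return results
```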
Expected output
Three calls are made with different queries and filter combinations. Each block shows the active filters and how many results matched before the ranked list is printed. The first call runs the sci-fi query with no filters. The second call applies a `min_rating >= 8.8` filter, narrowing results to only highly rated movies that match “an intense crime story”. The third call combines an `exclude_genre="crime"` exclusion with a `min_year=1990` lower bound, so the search for “a feel-good movie about life” returns only non-crime films from 1990 onwards. Each call prints the query, active filters, result count, and ranked movies with truncated plot descriptions.
Step 18: Cleanup
The following snippet flushes any pending writes to disk and prints the current movie count. The two lines that delete the collection are commented out so the data is preserved by default — uncomment them only when the collection is no longer needed.

Expected output

The vector count reflects the state of the collection after all previous steps. The flush confirmation line indicates that any pending writes have been safely persisted to disk.

The snippet calls `vde.flush` to ensure any pending writes are persisted to disk. The two lines that delete the collection are commented out — they are safe to uncomment when the tutorial data is no longer needed, but the collection is preserved by default so the data remains available for further experimentation.
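A sketch of the cleanup step:

```python
async def cleanup(client):
    await client.vde.flush(COLLECTION)   # persist any pending writes
    total = await client.vde.get_vector_count(COLLECTION)
    print(f"flushed; {total} movies stored")
    # Uncomment to remove the collection and all its data:
    # await client.collections.delete(COLLECTION)
    # print(f"collection {COLLECTION!r} deleted")
```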
What you learned
The table below summarises every concept and API used in this tutorial.

| Concept | API | What it does |
|---|---|---|
| Connect | AsyncVectorAIClient(url=...) | Open a gRPC connection to VectorAI DB. |
| Health check | client.health_check() | Verify the server is reachable. |
| Create collection | collections.get_or_create(vectors_config=VectorParams(...)) | Define a vector space with dimension and distance metric. |
| Embed text | SentenceTransformer.encode() | Convert text to a numerical vector. |
| Store data | points.upsert(collection, points=[PointStruct(...)]) | Insert or update points with vectors and metadata. |
| Persist | vde.flush(collection) | Write pending data to disk. |
| Semantic search | points.search(collection, vector=..., limit=5) | Find the most similar vectors. |
| Filter (equality) | Field("genre").eq("sci-fi") | Match a specific value. |
| Filter (range) | Field("rating").gte(8.5) | Numeric comparison. |
| Filter (exclude) | FilterBuilder().must_not(...) | Exclude matching points. |
| Combine filters | FilterBuilder().must(...).must(...).build() | Boolean AND/OR/NOT logic. |
| Get by ID | points.get(collection, ids=[0]) | Retrieve specific points. |
| Update metadata | points.set_payload(collection, payload={...}, ids=[0]) | Merge new fields into existing payloads. |
| Delete by ID | points.delete(collection, ids=[0]) | Remove specific points. |
| Delete by filter | points.delete(collection, filter=...) | Remove points matching conditions. |
| Count | points.count(collection, filter=..., exact=True) | Count matching points. |
| Vector count | vde.get_vector_count(collection) | Total vectors in the collection. |
| Collection info | collections.get_info(collection) | Status and configuration. |
| Collection state | vde.get_state(collection) | VDE lifecycle state. |
| List collections | collections.list() | All collection names on the server. |
| Delete collection | collections.delete(collection) | Remove a collection entirely. |
Common patterns quick reference
The patterns below capture the idioms used most often when building applications with Actian VectorAI DB.

Pattern 1: Search with optional filters
Build the filter conditionally so the same function works with or without constraints. Using `is not None` rather than a truthiness check prevents valid falsy values such as `0.0` from being silently skipped.
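A condensed sketch of the pattern, using a hypothetical `build_filter` helper:

```python
from actian_vectorai import Field, FilterBuilder  # assumed module path

def build_filter(genre=None, min_rating=None):
    builder = FilterBuilder()
    has_conditions = False
    if genre is not None:
        builder = builder.must(Field("genre").eq(genre))
        has_conditions = True
    if min_rating is not None:   # `is not None`: a 0.0 threshold is valid
        builder = builder.must(Field("rating").gte(min_rating))
        has_conditions = True
    return builder.build() if has_conditions else None
```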
Pattern 2: Upsert is idempotent
Calling `upsert` with the same ID replaces the existing point, so ingestion scripts can be re-run safely without creating duplicates.
Pattern 3: Always flush after writes
Call `vde.flush()` immediately after `points.upsert()` to ensure data survives server restarts. Without it, recent writes may be lost if the server crashes.
Pattern 4: Use get_or_create for collections
get_or_create is safe to run on every application startup. It creates the collection if it does not exist and does nothing if it already does, so startup code does not need a separate existence check.
Next steps
Predicate filters
Master the full Filter DSL with all field types and operators.
Similarity search fundamentals
Explore search parameters, score thresholds, and pagination.
Use open-source embedding models
Choose the right model and configure quantization for production.
Optimizing retrieval quality
Tune HNSW parameters, quantization, and search settings.