Architecture
The system splits contracts into two parallel embedding pipelines. Individual clauses are stored in a clause collection using Dot distance, while full contract summaries are stored in a contract collection using Cosine distance. At query time, clause search runs with OrderBy date ordering, and lookup_from enriches each result with context from the contract collection before the analysis engine generates a risk report.
Environment setup
Before running any of the code in this tutorial, install the required Python packages. The setup uses two libraries: the Actian VectorAI SDK for database operations and Sentence Transformers for local embedding generation. Run the following command to install both packages:

- actian-vectorai — Official Python SDK for Actian VectorAI DB (connection pooling, cross-collection lookup, OrderBy retrieval, advanced rebuild management, gRPC transport).
- sentence-transformers — For generating text embeddings with all-MiniLM-L6-v2.
Implementation
The following steps build the agent end-to-end: setting up collections with custom distance metrics and durability config, ingesting clause and contract data, running cross-collection lookups and payload-sorted queries, and operating a risk analysis engine over the results.
Step 1: Import dependencies and configure
The first step imports all required types, including connection pooling types, WAL and optimizer configs, quantization search params, rebuild management types, OrderBy, and alternative distance metrics. It then sets the server address and collection names and loads the all-MiniLM-L6-v2 embedding model. The four print statements confirm the active configuration values so every subsequent step can be verified against a known baseline.
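As a sketch, the configuration constants and confirmation prints might look like the following; every concrete value here (server address, collection names) is a placeholder assumption rather than the tutorial's exact configuration:

```python
# Sketch of the Step 1 configuration block. The constant names and
# values below are illustrative assumptions, not the tutorial's exact
# identifiers.
SERVER_ADDRESS = "localhost:50051"       # gRPC address of the VectorAI server
CLAUSE_COLLECTION = "legal_clauses"      # clause-level collection (Dot distance)
CONTRACT_COLLECTION = "legal_contracts"  # contract-level collection (Cosine distance)
EMBED_MODEL_NAME = "all-MiniLM-L6-v2"    # 384-dimensional sentence embedding model

print(f"Server address:      {SERVER_ADDRESS}")
print(f"Clause collection:   {CLAUSE_COLLECTION}")
print(f"Contract collection: {CONTRACT_COLLECTION}")
print(f"Embedding model:     {EMBED_MODEL_NAME}")
```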
Expected Output
Step 2: Define embedding helpers
These two helper functions wrap the Sentence Transformers model to convert text into 384-dimensional vectors. The single-text version is used for queries, while the batch version is used during data ingestion. Defining these functions produces no output; they are called in later steps.
Step 3: Create collections with alternative distance metrics, WAL, and optimizer config
Two collections are created using different distance metrics: the clause collection uses Dot product distance, while the contract collection uses Cosine distance, and WAL and optimizer settings are configured on the clause collection for production workloads. The code calls get_or_create twice: once for the clause collection using Distance.Dot with WAL durability and optimizer settings tuned for large ingest workloads, and once for the contract collection using Distance.Cosine with a higher HNSW m value for broader graph connectivity. Both collections use 384-dimensional vectors to match the embedding model configured in Step 1. Each call prints a readiness message confirming the distance metric and any additional configuration applied.
Expected Output
WalConfigDiff controls Write-Ahead Log durability. Setting wal_capacity_mb=64 rotates WAL segments at 64 MB, and wal_segments_ahead=2 keeps two segments ahead for crash recovery. OptimizersConfigDiff controls background optimization: indexing_threshold=10000 delays HNSW index construction until 10,000 points are present, flush_interval_sec=30 sets the automatic flush interval, and max_optimization_threads=2 limits concurrent optimization threads.
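The two config objects might be constructed as below. The class and field names come from this tutorial, but the import path is an assumption about the SDK's module layout:

```python
# Sketch of the durability and optimizer settings described above.
# The import path is assumed; class and field names are as documented
# in this tutorial.
from actian_vectorai import WalConfigDiff, OptimizersConfigDiff

wal_config = WalConfigDiff(
    wal_capacity_mb=64,    # rotate WAL segments at 64 MB
    wal_segments_ahead=2,  # keep two segments ahead for crash recovery
)

optimizer_config = OptimizersConfigDiff(
    indexing_threshold=10000,    # delay HNSW build until 10,000 points exist
    flush_interval_sec=30,       # automatic flush interval
    max_optimization_threads=2,  # cap concurrent optimization threads
)
```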
Alternative distance metrics — previous tutorials used Distance.Cosine. Actian VectorAI DB supports four metrics:
| Metric | Value | Best for |
|---|---|---|
| Distance.Cosine | 1 | Normalized embeddings, direction-based similarity. |
| Distance.Dot | 3 | Raw dot product, magnitude-sensitive. |
| Distance.Euclid | 2 | Absolute distance in embedding space. |
| Distance.Manhattan | 4 | L1 norm, robust to outliers. |
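The differences between the four metrics can be illustrated in plain Python. The helper functions below are standalone reimplementations for intuition, not SDK calls:

```python
import math

def cosine(a, b):
    # Direction-only similarity: magnitude is normalized away.
    dot_ab = sum(x * y for x, y in zip(a, b))
    return dot_ab / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def dot(a, b):
    # Raw dot product: sensitive to vector magnitude.
    return sum(x * y for x, y in zip(a, b))

def euclid(a, b):
    # Absolute (L2) distance in embedding space.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # L1 norm: sums absolute per-dimension differences.
    return sum(abs(x - y) for x, y in zip(a, b))

a = [3.0, 4.0]
c = [6.0, 8.0]  # same direction as a, twice the magnitude

print(cosine(a, a), cosine(a, c))  # 1.0 1.0 — direction only, magnitude ignored
print(dot(a, a), dot(a, c))        # 25.0 50.0 — doubles with magnitude
print(euclid(a, c))                # 5.0
print(manhattan(a, c))             # 7.0
```

Because Dot is magnitude-sensitive, unnormalized embeddings with larger norms score higher, which is why the clause collection's Dot metric behaves differently from the contract collection's Cosine metric.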
Step 4: Create payload indexes on both collections
Payload indexes accelerate filtered queries by creating lookup structures on specific fields. This step creates indexes on contract_date and contract_type on both collections, then adds clause_type and risk_score indexes on the clause collection to support type-filtered and risk-sorted retrieval. Running this block creates all indexes and prints a confirmation.
On both collections, the code creates a datetime index on contract_date (marked as principal for optimized OrderBy queries) and a keyword index on contract_type. It then creates two additional indexes on the clause collection only: a keyword index on clause_type for type-filtered retrieval, and a float index on risk_score (also marked as principal) to support fast range queries against numeric risk values. All six index operations run sequentially and a single confirmation message is printed once every index is in place.
Expected Output
Step 5: Prepare sample contract and clause data
Two datasets are defined: contract-level summaries and clause-level extracts. The code defines two Python lists: contracts, which contains four contract-level records spanning vendor agreements, an NDA, a software license, and an employment contract; and clauses, which contains eight clause-level extracts covering indemnification, liability cap, force majeure, confidentiality, IP rights, SLA, non-compete, and IP assignment clause types. Each clause includes a contract_id field that links it to its parent contract, enabling cross-collection enrichment in later steps. Running this block loads the data into memory; the final print statement reports how many records of each type were loaded.
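A sketch of the data layout, abbreviated to one record per list (the tutorial itself uses four contracts and eight clauses); the field names follow the payload fields referenced throughout this tutorial, but the values are illustrative:

```python
# Abbreviated sketch of the two datasets. Field names match the payload
# fields used in this tutorial; record contents are invented examples.
contracts = [
    {
        "id": 0,
        "contract_type": "vendor_agreement",
        "contract_date": "2024-06-01",
        "summary": "Master vendor agreement covering services and liability.",
    },
]

clauses = [
    {
        "id": 0,
        "contract_id": 0,  # links the clause to its parent contract
        "clause_type": "indemnification",
        "contract_date": "2024-06-01",
        "risk_score": 7.5,
        "clause_text": "Vendor shall indemnify and hold harmless the Company...",
    },
]

print(f"Loaded {len(contracts)} contracts and {len(clauses)} clauses")
```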
Expected Output
Step 6: Ingest data into both collections
This step embeds all contract summaries and clause texts, then upserts them into their respective collections. After upserting, it flushes both collections to disk and prints the final point counts. The code embeds both datasets with the embed_texts helper, constructs PointStruct objects that pair each numeric ID with its vector and payload, and upserts them into the corresponding collections. After each upsert, vde.flush is called to persist the writes to disk immediately rather than waiting for the background flush interval. The final two get_vector_count calls read back the indexed totals from each collection to confirm all points were successfully written.
Expected Output
Step 7: OrderBy — payload-sorted retrieval
The query endpoint supports OrderBy for sorting results by a payload field instead of vector similarity. The function below accepts a clause type, filters to matching clauses, and returns them sorted by contract_date descending so the most recent clauses appear first. Running this block queries for indemnification and IP rights clauses and prints their dates and contract IDs.
The code calls get_recent_clauses twice: first filtering for clauses where clause_type equals "indemnification", then for "ip_rights". Each call issues a query against the clause collection with an OrderBy directive on contract_date in descending order, so the most recently dated clause of each type is returned first. No vector is provided; the query relies entirely on payload filtering and date sorting. Each result is printed with its point ID, contract date, contract ID, and the first 100 characters of the clause text.
Expected Output
OrderBy replaces vector similarity ranking with payload-field sorting. The query endpoint accepts it through a dict: {"order_by": OrderBy(key="contract_date", direction=Direction.Desc)}. Direction.Desc sorts newest first; Direction.Asc sorts oldest first. The is_principal=True flag set on the datetime index in Step 4 optimizes this ordering. This retrieval mode is essential for legal workflows where recency matters — the most recent version of an indemnification clause is more relevant than one from five years ago.
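The effect of Direction.Desc can be illustrated in plain Python: payload-sorted retrieval is equivalent to sorting the matching records by the key field, newest first. The records below are invented stand-ins; ISO-8601 date strings sort correctly as plain text:

```python
# Illustrative records standing in for filtered clause payloads.
clauses = [
    {"id": 0, "contract_date": "2022-03-15", "clause_type": "indemnification"},
    {"id": 5, "contract_date": "2024-06-01", "clause_type": "indemnification"},
    {"id": 2, "contract_date": "2023-01-10", "clause_type": "indemnification"},
]

# Direction.Desc semantics: sort by the key field, newest first.
newest_first = sorted(clauses, key=lambda c: c["contract_date"], reverse=True)
print([c["id"] for c in newest_first])  # [5, 2, 0]
```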
Step 8: Cross-collection lookup with lookup_from
The lookup_from parameter on query enriches results from one collection with data from another. The function below searches the clause collection by vector similarity and uses lookup_from to draw in contract-level vectors during scoring. Running this block searches for clauses related to liability limitation and prints the top matches with their scores.
The code embeds the query "Liability limitation and cap on damages" into a 384-dimensional vector and searches the clause collection for the top five most similar points. The lookup_from parameter instructs the query to draw in vectors from the contract collection during scoring, enriching clause-level results with contract-level embedding context. Each result is printed with its point ID, similarity score, clause type, parent contract ID, and the first 100 characters of the clause text.
Expected Output
The lookup_from parameter accepts two keys: "collection" for the name of the external collection to look up vectors from, and "vector_name" for the named vector to use (an empty string selects the default vector). The result is clause-level granularity for matching combined with contract-level embeddings for contextual scoring — useful when you need both.
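Based on the two keys described above, the lookup_from argument might look like this; the collection name is an illustrative assumption:

```python
# Sketch of the lookup_from argument. The collection name is a
# placeholder; the two keys are as documented in this tutorial.
lookup_from = {
    "collection": "legal_contracts",  # external collection to pull vectors from
    "vector_name": "",                # empty string selects the default vector
}
```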
Step 9: Retrieve vectors alongside payloads
By default, points.get and points.search return only payloads and scores. Setting with_vectors=True includes the actual embedding data in the response, enabling client-side similarity analysis. The code below retrieves the embeddings for three specific clauses by ID, computes their pairwise cosine similarity, and then runs a search that also returns vectors. Running this block prints the vector dimension and first three values for each retrieved point, followed by the similarity score between clause 0 and clause 1.
The code retrieves the three clause points by ID with with_vectors=True, then computes the cosine similarity between clause 0 (indemnification) and clause 1 (liability cap) using a manual dot-product calculation. It also runs a similarity search for "limitation of liability" with with_vectors=True to show that search results can carry their full embedding arrays. Each retrieved point is printed with its clause type, vector dimension (confirming the 384-dim model), and the first three float values of the embedding. The pairwise similarity and the ranked search results follow.
Expected Output
Returning vectors adds significant response size, so enable with_vectors=True selectively. The primary use cases are client-side pairwise comparison, visualization (t-SNE or UMAP projections), debugging embedding quality, and exporting vectors for use in other systems.
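Client-side pairwise comparison is ordinary vector math. Below is a standalone sketch with short stand-in vectors; real vectors returned with with_vectors=True would be 384 floats each:

```python
import math

def cosine_similarity(v1, v2):
    # Manual dot-product cosine similarity over plain Python lists.
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)

# Stand-in embeddings for clause 0 (indemnification) and clause 1
# (liability cap); the values are invented for illustration.
indemnification_vec = [0.12, -0.05, 0.33, 0.08]
liability_cap_vec = [0.10, -0.02, 0.30, 0.11]

sim = cosine_similarity(indemnification_vec, liability_cap_vec)
print(f"Similarity between clause 0 and clause 1: {sim:.4f}")
```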
Step 10: Approximate count for fast dashboards
points.count supports an exact flag. When set to False, it returns an approximate count using index metadata rather than scanning all segments. The function below counts clauses and contracts both ways, then runs two filtered approximate counts — one for high-risk clauses and one for vendor agreement clauses. Running this block prints all four counts.
The code calls points.count with exact=True and exact=False on both the clause and contract collections to illustrate the speed-accuracy trade-off between full segment scans and index-metadata reads. The third count applies a float filter (risk_score >= 5.0) on the clause collection to return an approximate tally of high-risk clauses. The fourth applies a keyword filter (contract_type == "vendor_agreement") to count vendor agreement clauses only. All results are printed in sequence.
Expected Output
| Mode | Speed | Accuracy |
|---|---|---|
| exact=True | Slower (scans all segments). | 100% accurate. |
| exact=False | Fast (uses index metadata). | May differ slightly from exact. |
Step 11: Strict deletion — validate before removing
strict=True on points.delete validates that all specified IDs exist before performing the deletion. If any ID is missing, the entire operation is rejected without deleting anything. The code below inserts a temporary test clause, deletes it successfully with strict=True, then attempts a second deletion that includes a non-existent ID to demonstrate the error behavior. Running this block prints the result of each operation.
The code inserts a temporary test clause with ID 999 using upsert_single, then deletes it immediately with strict=True to confirm that a valid deletion succeeds and returns UpdateStatus.Completed. It then attempts a second deletion that includes both ID 999 (now removed) and ID 9999 (never existed) to trigger the strict validation failure. Because at least one ID in the batch cannot be found, the operation is rejected entirely and a PointNotFoundError is raised listing all missing IDs. The collection is flushed at the end to persist the final state.
Expected Output
| Mode | Behavior |
|---|---|
| strict=False (default) | Silently ignores non-existent IDs. |
| strict=True | Raises PointNotFoundError listing every missing ID, deletes nothing. |
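The two modes can be modeled in plain Python. This is not the SDK's implementation, only a sketch of the validate-then-delete contract it describes, using a dict as the point store:

```python
class PointNotFoundError(KeyError):
    """Stand-in for the SDK error raised by strict deletion."""

def delete_points(store, ids, strict=False):
    # Sketch of strict-delete semantics over a plain dict, for intuition.
    missing = [i for i in ids if i not in store]
    if strict and missing:
        # Strict mode: reject the whole batch, delete nothing.
        raise PointNotFoundError(f"Points not found: {missing}")
    for i in ids:
        store.pop(i, None)  # non-strict mode silently skips missing IDs

store = {999: {"clause_type": "test"}}
delete_points(store, [999], strict=True)  # valid deletion succeeds

try:
    delete_points(store, [999, 9999], strict=True)  # both now missing
except PointNotFoundError as err:
    print("Rejected:", err)
```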
Step 12: Quantization-aware search with SearchParams
When a collection uses quantization (scalar, product, or binary), search can use the compressed vectors for speed and optionally rescore with full-precision vectors for accuracy. The QuantizationSearchParams object inside SearchParams controls this behavior. The function below runs a clause search with oversampling and rescoring enabled to maximize accuracy. Running this block searches for liability cap provisions and prints the top matching clauses.
The code embeds the query "Limitation of damages and liability cap provisions" and searches the clause collection using a SearchParams object that configures quantization-aware retrieval. ignore=False allows the initial HNSW scan to use compressed quantized vectors for speed, oversampling=2.0 doubles the candidate pool to 10 before final ranking, and rescore=True re-evaluates the top candidates using full-precision vectors to recover accuracy lost during compression. Each result is printed with its point ID, similarity score, clause type, and numeric risk score from the payload.
Expected Output
QuantizationSearchParams settings interact as follows:
| Parameter | Effect |
|---|---|
| ignore=False | Use quantized vectors during search (fast). |
| ignore=True | Skip quantization, use full-precision vectors. |
| rescore=True | After initial retrieval with quantized vectors, rescore candidates with full-precision vectors. |
| oversampling=2.0 | Retrieve 2x candidates before rescoring (higher recall at cost of latency). |
The combination ignore=False, rescore=True, oversampling=2.0 provides the best accuracy-speed trade-off: fast initial search with quantized vectors, then precise rescoring with original vectors over a 2x candidate pool.
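A sketch of the parameter object, assuming the SDK nests QuantizationSearchParams under a quantization field on SearchParams and exposes both classes at the package root (both are assumptions; the field names inside QuantizationSearchParams are as documented above):

```python
# Assumed import path and SearchParams field name; the three
# QuantizationSearchParams fields are as described in this tutorial.
from actian_vectorai import SearchParams, QuantizationSearchParams

params = SearchParams(
    quantization=QuantizationSearchParams(
        ignore=False,      # scan with compressed quantized vectors (fast)
        rescore=True,      # re-rank top candidates with full-precision vectors
        oversampling=2.0,  # fetch 2x candidates before rescoring
    )
)
```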
Step 13: Update specific payload keys with the key parameter
The key parameter on set_payload places the new payload data under a specific nested key path rather than merging it at the top level. The code below adds review metadata to three clauses under the key "review_metadata", then retrieves clause 0 to confirm the nested structure. Running this block prints the updated payload keys and the content of the new nested field.
The code calls set_payload with key="review_metadata" to write three review fields — reviewed, reviewer, and review_date — as a nested object under that key for clause points 0, 1, and 2. Without the key parameter, these fields would be merged at the top level alongside clause_type and clause_text. After the update, points.get retrieves clause 0 to confirm that review_metadata appears as a new top-level key and that its nested content matches what was written.
Expected Output
The three review fields are nested under "review_metadata" rather than added at the top level alongside clause_type and clause_text.
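The merge semantics of the key parameter can be modeled in plain Python. The function below is an illustration of the behavior, not the SDK call, and the payload values are invented:

```python
def set_payload(payload, new_fields, key=None):
    # Plain-Python model of set_payload's merge semantics.
    if key is None:
        payload.update(new_fields)  # default: top-level merge
    else:
        payload.setdefault(key, {}).update(new_fields)  # nest under key
    return payload

clause = {"clause_type": "indemnification", "clause_text": "Vendor shall..."}
review = {"reviewed": True, "reviewer": "jsmith", "review_date": "2025-01-15"}

set_payload(clause, review, key="review_metadata")
print(sorted(clause))  # ['clause_text', 'clause_type', 'review_metadata']
print(clause["review_metadata"]["reviewer"])  # jsmith
```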
Step 14: Connection pooling for production workloads
pool_size on AsyncVectorAIClient creates multiple concurrent gRPC connections for high-throughput scenarios. The code below initializes a client with four connections and issues four clause searches in parallel using asyncio.gather, distributing the queries across the connection pool. Running this block prints the top results for each of the four queries.
The code initializes an AsyncVectorAIClient with pool_size=4, which opens four concurrent gRPC channels to the server. Four search queries are defined — covering indemnification, liability cap, force majeure, and IP assignment — and submitted simultaneously using asyncio.gather, which distributes each query across the available pool connections. Each search_one coroutine embeds its query string independently and retrieves the top three matching clauses. The results for all four queries are printed in order once all concurrent searches complete.
Expected Output
| Parameter | Default | Purpose |
|---|---|---|
| pool_size | 1 | Number of concurrent gRPC channels. |
| timeout | 30.0 | Per-call timeout in seconds. |
| max_retries | 3 | Automatic retry on transient failures. |
With pool_size=4, four concurrent searches run simultaneously over separate gRPC channels. A single connection would serialize all requests, creating a bottleneck for production workloads with high query concurrency.
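The fan-out pattern can be sketched with asyncio alone. The search_one stub below simulates the embed-and-search round trip with a short sleep; in the tutorial it would call the pooled client instead:

```python
import asyncio

async def search_one(query):
    # Stub standing in for an embed-and-search round trip; the sleep
    # simulates server latency on one pooled gRPC channel.
    await asyncio.sleep(0.05)
    return f"top clauses for: {query}"

async def main():
    queries = ["indemnification", "liability cap", "force majeure", "ip assignment"]
    # gather submits all four searches at once; with pool_size=4 each
    # request can travel over its own gRPC channel.
    results = await asyncio.gather(*(search_one(q) for q in queries))
    for r in results:
        print(r)

asyncio.run(main())
```

Because the stubs sleep concurrently, the whole batch finishes in roughly the time of one request rather than four.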
Step 15: Advanced rebuild management
trigger_rebuild with a full configuration object provides fine-grained control over index rebuilds. The code below triggers a rebuild on the clause collection, immediately checks its status with get_rebuild_task, and then lists all rebuild tasks for the collection. Running this block prints the task ID, initial state, and a summary of all active tasks.
The code triggers a rebuild with a RebuildDataSourceConfig (reading from the current index), a RebuildTargetConfig (rebuilding to HNSW), and a RebuildRunConfig with a batch size of 1,000 vectors. wait=False means the call returns immediately after submitting the task rather than blocking until completion. The assigned task ID is then used with get_rebuild_task to fetch the initial task state, and list_rebuild_tasks retrieves all rebuild tasks for the collection so their states can be compared. Because the collection is small, the task may already be in TASK_COMPLETED state by the time list_rebuild_tasks is called.
Expected Output
The trigger_rebuild parameters and monitoring methods are as follows:
| Parameter | Type | Purpose |
|---|---|---|
| source | RebuildDataSourceConfig | Where to read vectors from (current index, storage, snapshot). |
| target | RebuildTargetConfig | What index type to build (HNSW, IVF, flat). |
| run_config | RebuildRunConfig | Batch size, catchup rounds, and other runtime settings. |
| wait | bool | Block until rebuild completes. |
| priority | int | Higher priority tasks execute first. |
| Method | Purpose |
|---|---|
| get_rebuild_task(task_id) | Get status and progress of a specific task. |
| list_rebuild_tasks(...) | List all tasks, optionally filtered by collection or state. |
| cancel_rebuild_task(task_id) | Cancel a running rebuild. |
Step 16: Compaction with CompactOptions
CompactOptions provides fine-grained control over collection compaction — merging segments, purging deleted vector tombstones, and reclaiming storage. The code below triggers a compaction on the clause collection with wait=True so the call blocks until completion, then retrieves the collection state to confirm it is ready. Running this block prints the task ID, any available stats, and the post-compaction collection state.
The code calls compact_collection on the clause collection with wait=True and a 120-second timeout, which blocks the calling coroutine until all compaction work finishes. Compaction merges small segments, purges deleted vector tombstones accumulated from earlier operations such as the strict deletion demo in Step 11, and reclaims the freed disk space. Once compaction completes, vde.get_state is called to confirm the collection has returned to CollectionState.READY and is ready to serve queries.
Expected Output
Here, wait=True blocks until compaction finishes, and wait_timeout=120 sets a maximum wait time in seconds.
Step 17: Build the contract analysis engine
The analysis engine takes a query clause and a list of retrieved precedent clauses, scores each precedent by risk level and match quality, and returns a structured risk report. This function does not call the database — it operates entirely on the ScoredPoint objects returned by earlier search steps. Defining this function produces no output; it is called in Step 18 where the full analysis runs.
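A sketch of the engine's shape, with a minimal ScoredPoint stand-in and illustrative thresholds and labels (the tutorial's exact scoring rules are not shown here, so every numeric cutoff below is an assumption):

```python
from dataclasses import dataclass, field

@dataclass
class ScoredPoint:
    # Minimal stand-in for the SDK's ScoredPoint result type.
    id: int
    score: float
    payload: dict = field(default_factory=dict)

def analyze_clauses(query_text, precedents):
    # Sketch of the analysis engine; thresholds and labels are
    # illustrative assumptions, not the tutorial's exact values.
    findings = []
    for p in precedents:
        risk = p.payload.get("risk_score", 0.0)
        quality = "strong" if p.score >= 0.6 else "moderate" if p.score >= 0.4 else "weak"
        findings.append({
            "id": p.id,
            "clause_type": p.payload.get("clause_type"),
            "match_quality": quality,
            "risk_score": risk,
            "alert": risk >= 5.0,  # flag high-risk precedents
        })
    risks = [f["risk_score"] for f in findings] or [0.0]
    avg_risk, max_risk = sum(risks) / len(risks), max(risks)
    level = "high" if max_risk >= 7.0 else "medium" if avg_risk >= 4.0 else "low"
    return {"query": query_text, "avg_risk": avg_risk, "max_risk": max_risk,
            "risk_level": level, "findings": findings}

report = analyze_clauses("liability limitation", [
    ScoredPoint(1, 0.72, {"clause_type": "liability_cap", "risk_score": 6.0}),
    ScoredPoint(0, 0.55, {"clause_type": "indemnification", "risk_score": 7.5}),
])
print(report["risk_level"], round(report["avg_risk"], 2))  # high 6.75
```

The structure mirrors what Step 18 consumes: per-precedent findings with match-quality labels and risk alerts, plus aggregate average, maximum, and overall risk level.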
Step 18: Run the end-to-end contract analysis
analyze_new_clause embeds an incoming clause, searches the clause collection for the top five most similar precedents above a score threshold of 0.3, and passes the results to analyze_clauses to produce a risk report. Running this block analyzes two new clauses — a liability limitation clause and a non-compete clause — and prints a formatted risk report for each.
For each incoming clause, analyze_new_clause embeds the text, searches the clause collection for the top five precedents with a minimum similarity threshold of 0.3, and passes the results to analyze_clauses. The analysis engine scores each precedent by risk level and match quality, computes the average and maximum risk scores across all findings, and determines an overall risk level of low, medium, or high. Each report is printed with a header separator, the truncated clause text, the summary message, risk metrics, and a ranked list of precedent findings annotated with match quality labels and risk alerts.
Expected Output