LangChain is a framework for developing applications powered by large language models (LLMs). It unifies interfaces to embedding providers and vector stores, so you can focus on application logic rather than infrastructure. Actian VectorAI DB integrates with LangChain as a vector store through the langchain-actian-vectorai package. The integration supports all standard LangChain vector store operations, including adding documents, similarity search, max marginal relevance search, and use as a retriever in LangChain chains.

Installation

Install the VectorAI DB vector store integration for LangChain:
pip install langchain-actian-vectorai
This package includes actian_vectorai as a transitive dependency, so you do not need to install it separately. You also need an embedding provider such as langchain-openai:
pip install langchain-openai

Requirements

Before using this integration, make sure your environment meets the following prerequisites:
  • Python 3.10 or later
  • A running Actian VectorAI DB instance (default endpoint: localhost:50051). See Docker installation for setup instructions.
  • An OPENAI_API_KEY environment variable set with a valid OpenAI API key, if using OpenAIEmbeddings as your embedding provider.

Quickstart

The following example connects to a VectorAI DB server, creates a collection with cosine distance, adds two texts, and runs a similarity search. The ActianVectorAIVectorStore handles embedding generation and vector storage automatically. The vector dimension is set to 1536 to match the default OpenAIEmbeddings model.
from actian_vectorai import VectorAIClient, VectorParams, Distance
from langchain_actian_vectorai import ActianVectorAIVectorStore
from langchain_openai import OpenAIEmbeddings

# Connect to the VectorAI DB server.
client = VectorAIClient("localhost:50051")
client.connect()

# Create a collection configured for OpenAI embeddings with cosine distance.
client.collections.create(
    "my_collection",
    vectors_config=VectorParams(size=1536, distance=Distance.Cosine),
)

# Initialize the vector store with the collection and embedding provider.
store = ActianVectorAIVectorStore(
    client=client,
    collection_name="my_collection",
    embedding=OpenAIEmbeddings(),
)

# Add texts and run a similarity search.
ids = store.add_texts(["hello world", "goodbye world"])
results = store.similarity_search("hello", k=1)

Creating a vector store

You can create a vector store from plain text strings or from LangChain Document objects. Both methods handle collection creation and vector insertion in a single call. Use these helper constructors when you want automatic setup. Use the manual approach shown in the Quickstart when you need explicit control over collection parameters such as vector dimension or distance metric.

From texts

Use from_texts to create a vector store, set up a collection, and add texts in a single call. The metadatas parameter attaches metadata to each text as payload in VectorAI DB, which you can use for filtering during search.
store = ActianVectorAIVectorStore.from_texts(
    texts=["the cat sat on the mat", "the dog played in the park"],
    embedding=OpenAIEmbeddings(),
    metadatas=[{"source": "book"}, {"source": "article"}],
    collection_name="my_collection",
    url="localhost:50051",
)

From documents

Use from_documents to create a vector store from LangChain Document objects. The page_content field is embedded and stored as a vector, and the metadata field is stored as payload in VectorAI DB.
from langchain_core.documents import Document

docs = [
    Document(page_content="foo", metadata={"baz": "bar"}),
    Document(page_content="thud", metadata={"bar": "baz"}),
]
store = ActianVectorAIVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection_name="my_collection",
    url="localhost:50051",
)
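
Each document's fields end up in the collection payload under the content_payload_key and metadata_payload_key settings described later under Configuration. A minimal sketch of that mapping, assuming the default keys (illustrative only; the store builds payloads internally, so you never construct them by hand):

```python
# Sketch of how a Document's fields map into a VectorAI DB payload
# under the default payload keys. Illustrative only: the store
# performs this mapping internally.
def to_payload(page_content: str, metadata: dict,
               content_key: str = "page_content",
               metadata_key: str = "metadata") -> dict:
    return {content_key: page_content, metadata_key: metadata}

print(to_payload("foo", {"baz": "bar"}))
# {'page_content': 'foo', 'metadata': {'baz': 'bar'}}
```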

Async operations

All creation and search methods have async counterparts for non-blocking operations. Async methods use AsyncVectorAIClient under the hood. The await calls in this section must run inside an async function or an environment that supports top-level await, such as a Jupyter notebook.
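
Outside a notebook, wrap the await calls in this section in a coroutine and hand it to asyncio.run. A minimal skeleton (the store calls appear as commented placeholders, since they require a running server and an embedding provider):

```python
import asyncio

async def main() -> str:
    # Placeholders for the real async store calls, which require a
    # live VectorAI DB server and an OpenAI API key:
    # store = await ActianVectorAIVectorStore.afrom_texts(...)
    # results = await store.asimilarity_search("cat", k=2)
    await asyncio.sleep(0)  # stand-in awaitable
    return "done"

print(asyncio.run(main()))  # done
```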

Async from texts

Use afrom_texts to create a vector store and add texts asynchronously. The returned store supports all async operations, including asimilarity_search.
store = await ActianVectorAIVectorStore.afrom_texts(
    texts=["the cat sat on the mat", "the dog played in the park"],
    embedding=OpenAIEmbeddings(),
    metadatas=[{"source": "book"}, {"source": "article"}],
    collection_name="my_collection",
    url="localhost:50051",
)

# Run an async similarity search on the store.
results = await store.asimilarity_search("cat", k=2)

Async from documents

Use afrom_documents to create a vector store from Document objects asynchronously. Document IDs are preserved when set, and you can use adelete to remove documents by ID.
from langchain_core.documents import Document

docs = [
    Document(page_content="foo", metadata={"baz": "bar"}, id="doc1"),
    Document(page_content="thud", metadata={"bar": "baz"}, id="doc2"),
]
store = await ActianVectorAIVectorStore.afrom_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection_name="my_collection",
    url="localhost:50051",
)

# Search for similar documents and delete by ID.
results = await store.asimilarity_search("foo", k=1)
await store.adelete(ids=["doc1"])

Similarity search

The integration provides several similarity search methods, each available in both sync and async variants. Use similarity_search to return the k most similar documents to a query.
results = store.similarity_search("hello", k=4)

Search with scores

Use similarity_search_with_score to return documents paired with their raw similarity scores. Lower scores indicate closer matches when using cosine distance.
results = store.similarity_search_with_score("hello", k=4)

# Print each document with its similarity score.
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")

Search with relevance scores

Use similarity_search_with_relevance_scores to return documents with scores normalized to a zero-to-one range, where higher values indicate greater relevance.
results = store.similarity_search_with_relevance_scores("hello", k=4)
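
The normalization from raw distance to relevance depends on the distance metric. For cosine distance, LangChain's base vector store conventionally uses relevance = 1 - distance; whether this integration uses exactly that function is an assumption here, not confirmed by its documentation:

```python
def cosine_relevance(distance: float) -> float:
    """Map a cosine distance to a relevance score.

    Sketch of the common LangChain convention (1.0 - distance); the
    exact normalization this integration applies is an assumption.
    """
    return 1.0 - distance

# A distance of 0.0 (an exact directional match) maps to relevance 1.0,
# and larger distances map to lower relevance.
print(cosine_relevance(0.0), cosine_relevance(0.25))  # 1.0 0.75
```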

Async search

All search methods have async variants prefixed with a, for example asimilarity_search.
# Run similarity searches asynchronously.
results = await store.asimilarity_search("hello", k=4)
results = await store.asimilarity_search_with_score("hello", k=4)

Max marginal relevance search

Max marginal relevance (MMR) search optimizes for both similarity to the query and diversity among results. This is useful when you want relevant results that cover different aspects of the query rather than near-duplicate matches. The following parameters control MMR behavior:
  • k: Number of results to return.
  • fetch_k: Number of candidates to fetch before reranking. Higher values give MMR more candidates to select from.
  • lambda_mult: Balance between relevance and diversity. Values closer to 1.0 favor relevance; values closer to 0.0 favor diversity.
The following example runs an MMR search that fetches 20 candidates and returns the 4 most relevant yet diverse results. The lambda_mult value of 0.5 balances relevance and diversity equally.
# Run a sync MMR search.
results = store.max_marginal_relevance_search(
    "machine learning",
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)

# Run an async MMR search.
results = await store.amax_marginal_relevance_search(
    "machine learning", k=4, fetch_k=20, lambda_mult=0.5,
)
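
To make the relevance/diversity trade-off concrete, here is a minimal, framework-free sketch of the greedy MMR selection loop on toy 2-D vectors. This is an illustration of the general algorithm, not the library's implementation:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def mmr(query, candidates, k, lambda_mult):
    """Greedy MMR: at each step pick the candidate maximizing
    lambda_mult * relevance - (1 - lambda_mult) * redundancy,
    where redundancy is similarity to already-selected items."""
    selected, remaining = [], list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_i, best_score = remaining[0], -float("inf")
        for i in remaining:
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        remaining.remove(best_i)
    return selected

query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.14], [0.7, 0.7]]  # exact, near-duplicate, diverse

# High lambda_mult favors relevance and keeps the near-duplicate;
# low lambda_mult favors diversity and swaps in the distinct vector.
print(mmr(query, docs, k=2, lambda_mult=0.9))  # [0, 1]
print(mmr(query, docs, k=2, lambda_mult=0.3))  # [0, 2]
```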

Use as a retriever

You can convert the vector store into a LangChain retriever for use in chains and agents. The search_type parameter accepts "similarity" for standard vector search or "mmr" for Max Marginal Relevance search. Pass additional search parameters through search_kwargs.
# Create a retriever that uses MMR search.
retriever = store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20, "lambda_mult": 0.5},
)

# Retrieve documents synchronously.
docs = retriever.invoke("machine learning")

# Retrieve documents asynchronously.
docs = await retriever.ainvoke("machine learning")

Configuration

You can pass the following parameters when creating an ActianVectorAIVectorStore instance:
  • url (default "localhost:50051"): gRPC address of the VectorAI server.
  • collection_name (default: auto-generated UUID): Name of the collection in VectorAI DB.
  • distance (default "COSINE"): Distance metric: COSINE, EUCLID, or DOT.
  • content_payload_key (default "page_content"): Payload key for document content.
  • metadata_payload_key (default "metadata"): Payload key for document metadata.
  • batch_size (default 64): Batch size for upsert operations.
  • force_recreate (default False): Recreate the collection if it already exists.
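
The batch_size setting caps how many vectors go into each upsert call. A sketch of the chunking behavior (illustrative; the store batches internally, so you do not split inputs yourself):

```python
def batched(items, batch_size=64):
    # Split a sequence into consecutive chunks of at most batch_size
    # items, mirroring how upserts are grouped. A sketch, not the
    # library's internal code.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# 150 items with the default batch_size of 64 yields chunks of 64, 64, 22.
print([len(chunk) for chunk in batched(list(range(150)))])  # [64, 64, 22]
```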

API reference

ActianVectorAIVectorStore exposes the following methods, each with an async counterpart:
  • from_texts() / afrom_texts(): Create a store and collection, and add texts.
  • from_documents() / afrom_documents(): Create a store and collection, and add documents.
  • add_texts() / aadd_texts(): Add texts to an existing store.
  • add_documents() / aadd_documents(): Add documents to an existing store.
  • similarity_search() / asimilarity_search(): Search by query text.
  • similarity_search_with_score() / asimilarity_search_with_score(): Search with raw scores.
  • similarity_search_by_vector() / asimilarity_search_by_vector(): Search by embedding vector.
  • max_marginal_relevance_search() / amax_marginal_relevance_search(): MMR search for diverse results.
  • delete() / adelete(): Delete documents by ID.
  • get_by_ids() / aget_by_ids(): Retrieve documents by ID.

Next steps

Explore related topics to continue building with LangChain and VectorAI DB:
  • OpenAI embeddings — Configure OpenAI as your embedding provider.
  • LlamaIndex — Use VectorAI DB with the LlamaIndex framework.
  • Search — Understand the underlying vector search operations.
  • Filtering — Apply metadata conditions to narrow search results.