Query variation fusion generates slight perturbations of a query vector and searches with each variation. Fusing the results reduces sensitivity to the exact query representation and makes retrieval more robust. This approach is useful when small changes in the query vector produce different result rankings. The example below builds four query vectors (the base vector plus three copies perturbed with Gaussian noise of standard deviation 0.1), searches with each one, and then applies distribution-based score fusion (DBSF) to combine the results into a single ranked list.
from actian_vectorai import VectorAIClient, distribution_based_score_fusion
import random

COLLECTION = "documents"
DIMENSION = 128

def generate_query_variations(base_query_vector, num_variations=3):
    """Return num_variations vectors: the base query plus perturbed copies."""
    variations = [base_query_vector]

    for _ in range(num_variations - 1):
        # Add small random noise to create variations
        variation = [
            x + random.gauss(0, 0.1)
            for x in base_query_vector
        ]
        variations.append(variation)

    return variations

with VectorAIClient("localhost:50051") as client:
    # Base query
    base_query = [random.gauss(0, 1) for _ in range(DIMENSION)]

    # Generate variations
    query_variations = generate_query_variations(base_query, num_variations=4)

    # Search with each variation
    all_results = []
    for i, query in enumerate(query_variations, 1):
        results = client.points.search(
            COLLECTION,
            vector=query,
            limit=10
        )
        print(f"Query variation {i}: {len(results)} results")
        all_results.append(results)

    # Fuse all variations
    final_results = distribution_based_score_fusion(all_results)

    print(f"\nFinal fused results: {len(final_results)}")
    for i, point in enumerate(final_results[:3], 1):
        print(f"{i}. Score: {point.score:.4f}, Payload: {point.payload}")
Each fused result includes these fields:
  • id: The unique identifier of the matching point
  • score: Normalized fused score from distribution-based fusion across all query variations
  • payload: Metadata object from the matching point
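To make the fused score more concrete, here is a minimal sketch of how distribution-based score fusion is commonly described: each result list is normalized using its own mean and standard deviation (with mean ± 3 standard deviations as the bounds), and the normalized scores are summed per point. This is an illustration only. The helper `dbsf_sketch`, its `(point_id, score)` input format, the use of the population standard deviation, and the 0.5 fallback for a list with identical scores are all assumptions. The actual `distribution_based_score_fusion` in the client may differ in any of these details.

```python
from collections import defaultdict
from statistics import mean, pstdev

def dbsf_sketch(result_lists):
    """Fuse lists of (point_id, score) pairs (hypothetical helper, not the library API).

    Returns (point_id, fused_score) pairs sorted by fused score, highest first.
    """
    fused = defaultdict(float)
    for results in result_lists:
        if not results:
            continue  # an empty list contributes nothing
        scores = [score for _, score in results]
        mu = mean(scores)
        sigma = pstdev(scores)  # assumption: population standard deviation
        low, high = mu - 3 * sigma, mu + 3 * sigma
        for point_id, score in results:
            if high == low:
                normalized = 0.5  # assumption: all scores equal, use the midpoint
            else:
                normalized = (score - low) / (high - low)
            # A point found by several variations accumulates score from each list
            fused[point_id] += normalized
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Two toy result lists, as if returned by two query variations
lists = [
    [("a", 0.9), ("b", 0.5)],
    [("b", 0.8), ("a", 0.6), ("c", 0.1)],
]
for point_id, score in dbsf_sketch(lists):
    print(f"{point_id}: {score:.4f}")
```

Because the normalized scores are summed, a point returned by most variations tends to outrank a point that scores highly for only one of them, which is the property that makes the fused ranking less sensitive to any single perturbation.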
Query variation fusion is effective when:
  • Small perturbations in embedding space lead to different result rankings
  • You want rankings that stay stable under small changes to the query vector (seed the random generator if runs must also be reproducible)
  • The query embedding may not perfectly capture the user’s intent
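One way to check whether the first condition holds for your data is to measure how much the top results of the individual variations overlap before fusing them. The sketch below is a hedged illustration, not part of the client library: `perturb`, `top_k_overlap`, and `mean_pairwise_overlap` are hypothetical helpers, and the ID lists stand in for the `id` fields of real search results. It also passes a seeded `random.Random` instance so that the same perturbations are generated on every run, which the example above does not do because it draws from the unseeded global generator.

```python
import random
from itertools import combinations

def perturb(vector, sigma, rng):
    """Return a copy of vector with Gaussian noise (std dev sigma) added per component."""
    return [x + rng.gauss(0, sigma) for x in vector]

def top_k_overlap(ids_a, ids_b):
    """Jaccard overlap of two ID lists: 1.0 means identical sets, 0.0 means disjoint."""
    a, b = set(ids_a), set(ids_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def mean_pairwise_overlap(id_lists):
    """Average Jaccard overlap over every pair of result lists."""
    pairs = list(combinations(id_lists, 2))
    if not pairs:
        return 1.0  # fewer than two lists: nothing to disagree with
    return sum(top_k_overlap(a, b) for a, b in pairs) / len(pairs)

# A seeded generator makes the perturbations repeatable across runs
rng = random.Random(42)
base = [0.0, 1.0, -1.0]
variations = [base] + [perturb(base, 0.1, rng) for _ in range(3)]

# Stand-in ID lists; in practice, collect point.id from each search result
id_lists = [["a", "b", "c"], ["b", "c", "d"], ["a", "b", "c"]]
print(f"Mean pairwise overlap: {mean_pairwise_overlap(id_lists):.2f}")
```

An overlap close to 1.0 suggests the variations already agree and fusion adds little beyond the extra search cost. A noticeably lower value suggests the ranking is sensitive to the query vector, which is the case fusion is meant to address.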