System architecture
The following diagram shows the complete facial recognition pipeline, from raw image or video input through face detection, embedding extraction, and similarity search, to a final identity decision.

Concepts
Facial recognition systems are built as multi-stage pipelines, where each component transforms raw images into searchable representations. The following concepts are foundational to how these systems operate at scale:
Face embeddings as identity representations
Face embeddings are dense, fixed-length vectors (typically 128-512 dimensions) that encode the distinguishing features of a face. Models such as FaceNet and ArcFace are trained so that embeddings of the same individual cluster closely in vector space, while different identities are pushed farther apart.
This transformation enables efficient similarity search in Actian VectorAI DB, replacing expensive image comparisons with fast vector distance calculations.
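To make the distance calculation concrete, the sketch below compares toy 4-dimensional vectors with cosine similarity. Real face embeddings have 128-512 dimensions, but the arithmetic is identical: vectors for the same person point in similar directions and score close to 1.0, while unrelated vectors score lower.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" for illustration only.
same_person = cosine_similarity([0.9, 0.1, 0.3, 0.5], [0.8, 0.2, 0.4, 0.5])
different = cosine_similarity([0.9, 0.1, 0.3, 0.5], [-0.2, 0.9, -0.5, 0.1])
print(same_person > different)  # the same-person pair scores higher
```

This is the comparison a vector database performs at scale: instead of comparing images pixel by pixel, it computes vector distances like the one above across millions of stored embeddings.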
Detection as a prerequisite to recognition
Facial recognition begins with detection, which locates and extracts faces from raw images using bounding boxes. Recognition models assume clean, aligned face inputs, making detection a critical preprocessing step.
Errors at this stage, such as missed or misaligned faces, directly impact downstream embedding quality and overall system accuracy.
Verification vs. large-scale identification
Facial recognition systems typically operate in two modes. Verification (1:1) confirms whether a face matches a claimed identity, while identification (1:N) searches across a database to find the closest match.
Identification introduces additional challenges around scalability and latency, where vector databases like Actian VectorAI DB play a key role in enabling fast nearest-neighbor search across large embedding collections.
Prerequisites
Before following the implementation steps, ensure your environment meets the following requirements:
- Python 3.9 or later is installed.
- Actian VectorAI DB is running and accessible at localhost:50051 (see installation guide).
- A GPU is optional but strongly recommended for DeepFace embedding extraction at production throughput.
- The following system-level dependencies are installed for OpenCV and DeepFace:
  - On Ubuntu/Debian: libgl1-mesa-glx libglib2.0-0.
  - On macOS: Xcode command line tools (xcode-select --install).
- At least one face image per person is available for registration.
Implementation
All code snippets in this guide are designed to be placed in a single Python file and run together using asyncio.run(main()) at the end. The individual asyncio.run(...) calls shown in steps 2 and 3 are only for testing those steps in isolation during development.
Step 1: Install dependencies
Run the following command to install the three packages required across all steps: DeepFace for embedding extraction, OpenCV for image and video processing, and the Actian VectorAI DB Python client. Confirm the exact installable package name and version with your Actian VectorAI DB distribution before running it, as the package name may differ between release channels.
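A sketch of the install command. deepface and opencv-python are the published PyPI names; actian-vectorai is an assumed package name that must be confirmed against your distribution:

```shell
# deepface and opencv-python are published PyPI packages; "actian-vectorai" is
# an assumed name -- confirm it with your Actian VectorAI DB distribution.
pip install deepface opencv-python actian-vectorai
```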
Step 2: Create the face database collection
The following code connects to Actian VectorAI DB and creates a collection named face_database configured for 512-dimensional vectors using cosine similarity. If the collection already exists, the function exits without making changes. All imports used across the full implementation are included at the top so they can be placed once at the top of the file:
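A minimal sketch of the setup step, under stated assumptions: AsyncVectorAIClient, VectorParams, Distance, PointStruct, Filter, and Conditions are assumed exports of actian_vectorai, and the collections API (list/create) is modeled on common vector-database Python clients rather than confirmed against the Actian SDK:

```python
import asyncio
import uuid

import cv2
from deepface import DeepFace
# Assumed exports and signatures -- verify against your installed SDK version.
from actian_vectorai import (
    AsyncVectorAIClient, VectorParams, Distance, PointStruct, Filter, Conditions,
)

db = AsyncVectorAIClient(host="localhost", port=50051)
COLLECTION = "face_database"

async def create_face_collection():
    """Create the 512-dim cosine-similarity collection if it does not already exist."""
    existing = await db.collections.list()  # assumed to return objects with a .name
    if COLLECTION in [c.name for c in existing]:
        print("Face collection already exists")
        return
    await db.collections.create(
        name=COLLECTION,
        vectors=VectorParams(size=512, distance=Distance.COSINE),
    )
    print("Face collection created")

# Only for testing this step in isolation; the full pipeline uses main() instead.
# asyncio.run(create_face_collection())
```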
Verify with your Actian engineering team that AsyncVectorAIClient, VectorParams, Distance, and PointStruct are exported from actian_vectorai with the method signatures used here before running in production.

Step 3: Extract face embeddings
The following two functions extract embeddings from static image files and from live video frames respectively. extract_face_embedding returns None when no face is detected so the caller can skip the registration step. extract_embeddings_from_frame returns an empty list when no faces are detected, and silently drops any face whose bounding box is incomplete to prevent runtime errors downstream:
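A sketch of the two extraction functions. DeepFace.represent() is a real DeepFace API, but its return schema varies across versions, so the "embedding", "facial_area", and "face_confidence" keys used here should be checked against your installed release. Facenet512 is used because it produces the 512-dimensional vectors the collection expects:

```python
def extract_face_embedding(image_path):
    """Return a 512-dim Facenet512 embedding for the first detected face, or None."""
    try:
        results = DeepFace.represent(
            img_path=image_path,
            model_name="Facenet512",
            detector_backend="opencv",
            enforce_detection=True,
        )
    except ValueError:
        return None  # DeepFace raises ValueError when no face is found
    if not results:
        return None
    return results[0]["embedding"]

def extract_embeddings_from_frame(frame):
    """Return (embedding, facial_area, confidence) tuples for a BGR video frame."""
    try:
        results = DeepFace.represent(
            img_path=frame,  # DeepFace accepts numpy arrays as well as file paths
            model_name="Facenet512",
            detector_backend="opencv",
            enforce_detection=False,  # frames frequently contain no face
        )
    except ValueError:
        return []
    faces = []
    for result in results:
        area = result.get("facial_area")
        # Silently drop detections whose bounding box is incomplete.
        if not area or any(k not in area for k in ("x", "y", "w", "h")):
            continue
        faces.append((result["embedding"], area, result.get("face_confidence", 0.0)))
    return faces
```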
The output shape of DeepFace.represent() can vary depending on the detector backend and DeepFace version. Verify that result["embedding"], result.get("facial_area"), and result.get("face_confidence") are present in your installed version before deploying.

Step 4: Register faces in the database
register_face extracts an embedding from a single image and stores it in Actian VectorAI DB as a point with a UUID, the person’s ID and name, the source image path, and any optional metadata. register_multiple_faces calls register_face in a loop to support registering several photos of the same person, which improves recognition accuracy by covering different angles and lighting conditions:
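A sketch of the registration helpers, assuming the db client and extract_face_embedding from the earlier steps, and assuming db.points.upsert() accepts PointStruct entries with a string UUID as the id (the note below flags this as unconfirmed):

```python
async def register_face(person_id, name, image_path, metadata=None):
    """Embed one image and store it as a single point; returns the point ID or None."""
    embedding = extract_face_embedding(image_path)
    if embedding is None:
        print(f"No face detected in {image_path}, skipping")
        return None
    point_id = str(uuid.uuid4())
    await db.points.upsert(  # assumed signature -- verify against your SDK
        collection=COLLECTION,
        points=[
            PointStruct(
                id=point_id,
                vector=embedding,
                payload={
                    "person_id": person_id,
                    "name": name,
                    "image_path": image_path,
                    **(metadata or {}),
                },
            )
        ],
    )
    print(f"Registered face for {name}: {point_id}")
    return point_id

async def register_multiple_faces(person_id, name, image_paths, metadata=None):
    """Register several photos of one person; varied angles improve recognition."""
    registered = 0
    for path in image_paths:
        if await register_face(person_id, name, path, metadata) is not None:
            registered += 1
    print(f"Registered {registered} faces for {name}")
    return registered
```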
Verify with engineering that db.points.upsert() accepts a string UUID as the id field in PointStruct. Some SDK versions may require integer IDs or a different point schema.
search_face converts a query image into an embedding and returns up to limit matching records whose cosine similarity score meets or exceeds threshold. verify_identity performs the same search but restricts the candidate set to faces already registered under claimed_person_id, returning a verification result with the matched name and confidence score:
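A sketch of the two search functions, under the assumptions flagged in the note below: score_threshold is a supported search parameter, r.score is a cosine similarity in [0, 1], and Filter.all([Conditions.match(...)]) is the correct filter constructor:

```python
async def search_face(image_path, limit=5, threshold=0.6):
    """Return up to `limit` matches whose similarity meets or exceeds `threshold`."""
    embedding = extract_face_embedding(image_path)
    if embedding is None:
        return []
    results = await db.points.search(  # assumed signature -- verify against your SDK
        collection=COLLECTION,
        vector=embedding,
        limit=limit,
        score_threshold=threshold,
    )
    return [
        {"name": r.payload["name"], "person_id": r.payload["person_id"], "score": r.score}
        for r in results
    ]

async def verify_identity(image_path, claimed_person_id, threshold=0.7):
    """1:1 verification: search only points registered under claimed_person_id."""
    embedding = extract_face_embedding(image_path)
    if embedding is None:
        return {"verified": False, "reason": "no_face_detected"}
    results = await db.points.search(
        collection=COLLECTION,
        vector=embedding,
        limit=1,
        score_threshold=threshold,
        # Assumed filter constructor -- verify against your SDK version.
        filter=Filter.all([Conditions.match("person_id", claimed_person_id)]),
    )
    if not results:
        return {"verified": False, "reason": "below_threshold"}
    top = results[0]
    return {
        "verified": True,
        "matched_name": top.payload["name"],
        "similarity": top.score,
    }
```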
Verify with engineering that score_threshold is a supported parameter on db.points.search(), that r.score represents a cosine similarity value between 0 and 1, and that Filter.all([Conditions.match(...)]) is the correct filter constructor in your SDK version. If the backend returns a distance metric rather than a similarity score, the threshold values above will need to be adjusted accordingly.

Step 6: Run real-time video recognition
RealTimeFaceRecognition opens a video capture source, processes every fifth frame to extract face embeddings, queries the database for each high-confidence detection, and draws a bounding box with the matched name and similarity score on the frame. Labels are only updated on processed frames; on skipped frames the previous overlay remains visible, which can cause brief flickering during fast movement. This is an intentional tradeoff to reduce embedding computation. Cache the last detection results if a consistent per-frame overlay is required:
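A sketch of the class. The OpenCV calls (VideoCapture, rectangle, putText, imshow, waitKey) are standard; db.points.search remains an assumed SDK call, and the 0.8 detection-confidence cutoff is an assumed value for "high-confidence". As described above, the overlay is drawn only on processed frames; caching the last results and redrawing them every frame would give a consistent overlay:

```python
class RealTimeFaceRecognition:
    """Recognize faces in a live video stream, processing every fifth frame."""

    def __init__(self, video_source=0, threshold=0.6, frame_skip=5):
        self.video_source = video_source
        self.threshold = threshold
        self.frame_skip = frame_skip

    async def run(self):
        cap = cv2.VideoCapture(self.video_source)
        frame_count = 0
        try:
            while cap.isOpened():
                ok, frame = cap.read()
                if not ok:
                    break
                frame_count += 1
                if frame_count % self.frame_skip == 0:
                    for embedding, area, confidence in extract_embeddings_from_frame(frame):
                        if confidence < 0.8:  # assumed high-confidence cutoff
                            continue
                        matches = await db.points.search(  # assumed SDK call
                            collection=COLLECTION,
                            vector=embedding,
                            limit=1,
                            score_threshold=self.threshold,
                        )
                        x, y, w, h = area["x"], area["y"], area["w"], area["h"]
                        if matches:
                            label = f"{matches[0].payload['name']} ({matches[0].score:.2f})"
                            color = (0, 255, 0)  # green box for a matched identity
                        else:
                            label = "Unknown"
                            color = (0, 0, 255)  # red box for an unrecognized face
                        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
                        cv2.putText(frame, label, (x, y - 10),
                                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
                cv2.imshow("Face Recognition", frame)
                if cv2.waitKey(1) & 0xFF == ord("q"):
                    break
        finally:
            cap.release()
            cv2.destroyAllWindows()
```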
Usage example
Place all the functions and the class above in a single Python file. Add the following main() function at the bottom of the file. Running the file calls create_face_collection() to set up the database, registers three photos of a person, runs a similarity search against a query image, verifies an identity, and then opens a live recognition window:
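A sketch of the driver, using the functions defined in the earlier steps. The registration filenames (john_photo1.jpg and so on) are placeholder names for the three JPEG photos, not paths from the original guide:

```python
async def main():
    # Initialize the 512-dim cosine-similarity collection.
    await create_face_collection()

    # Register three photos of the same person (placeholder filenames).
    await register_multiple_faces(
        person_id="person_001",
        name="John Smith",
        image_paths=["john_photo1.jpg", "john_photo2.jpg", "john_photo3.jpg"],
    )

    # 1:N identification against the whole collection.
    for match in await search_face("query_photo.jpg", threshold=0.6):
        print(f"Match: {match['name']} (similarity: {match['score']:.2f})")

    # 1:1 verification against a claimed identity, with a stricter threshold.
    result = await verify_identity("verify_photo.jpg", "person_001", threshold=0.7)
    print(f"Verification: {result}")

    # Live recognition from the default system camera; press q to quit.
    await RealTimeFaceRecognition(video_source=0).run()

if __name__ == "__main__":
    asyncio.run(main())
```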
The main() function orchestrates the complete pipeline in sequence. It first calls create_face_collection() to initialize the face_database collection in Actian VectorAI DB with 512-dimensional cosine-similarity vectors. It then registers three JPEG photos of John Smith (with person_id="person_001") by extracting a Facenet512 embedding from each image and upserting it as a separate point. Next, it runs a similarity search against query_photo.jpg using a cosine similarity threshold of 0.6, printing the name and score for every match that meets the threshold. It then verifies verify_photo.jpg against the registered embeddings for person_001 using the stricter threshold of 0.7 and prints the full verification result dictionary. Finally, it opens the default system camera (video_source=0) and starts the real-time recognition loop.
Expected output
Face collection created confirms that the face_database collection was initialized in Actian VectorAI DB. The three Registered face for John Smith lines each display the UUID assigned to the embedding extracted from one of the three registration photos. Registered 3 faces for John Smith summarizes the total count of successfully stored records for that identity. Match: John Smith (similarity: 0.87) shows the top result returned by the similarity search against query_photo.jpg, where 0.87 is the cosine similarity score exceeding the 0.6 threshold. The final Verification dict shows that verify_photo.jpg was confirmed as person_001 with a similarity of 0.91, which exceeds the 0.7 verification threshold, and returns matched_name as John Smith. After the terminal output, an OpenCV window opens showing the camera feed with bounding boxes and name labels overlaid on detected faces. Green boxes indicate a matched identity with the name and score shown; red boxes indicate an unrecognized face. Press q to close the window and end the process.
Performance optimization
Each of the following techniques targets a different bottleneck in the pipeline. GPU acceleration and batching reduce embedding extraction time, caching reduces redundant computation, ANN indexing reduces search latency at large scale, and quantization reduces storage and memory overhead:
- Use GPU acceleration for face embedding extraction.
- Batch-process multiple faces when possible.
- Cache recent embeddings to reduce repeated computation for the same person across consecutive frames.
- Use approximate nearest neighbor (ANN) indexing for databases containing more than a few thousand faces.
- Consider vector quantization to reduce embedding storage size without significantly reducing accuracy.
Ethical and legal considerations
Facial recognition systems handle sensitive biometric data and require careful attention to legal and ethical obligations before any deployment:
- Ensure compliance with applicable regulations such as GDPR, CCPA, or BIPA.
- Obtain informed consent before collecting or processing facial data.
- Account for bias: all facial recognition models exhibit varying accuracy across demographic groups, which can result in false positives or false negatives at different rates.
- Do not use automated facial recognition as the sole decision mechanism in high-stakes contexts such as law enforcement, hiring, or access control. Always include human review for consequential decisions.
Next steps
The following cards link to related articles and tutorials that extend the concepts covered in this guide:

Vector database fundamentals
Core concepts for collections, points, vectors, and payloads.
Similarity search
Search patterns and techniques.