tests — search

Module: tests-search Cohesion: 0.80 Members: 0

This document gives an overview of the search test suite in the tests/search directory. The suite checks the correctness, performance, and reliability of both the lexical (BM25) and the vector (USearch) search implementations.

The tests are organized into two primary files, each focusing on a distinct search paradigm:

BM25 Search Tests (hybrid-search.test.ts)

This test file validates the BM25Index class and its associated utility functions, which together provide a robust lexical search capability. It ensures that text processing (tokenization, stemming) and document indexing/retrieval work as expected.

Overview

The tests in hybrid-search.test.ts interact directly with the BM25Index class and several helper functions imported from src/search/bm25.js. They cover the entire lifecycle of a BM25 index, from text preprocessing to document management and search queries.

Core Functionality Tests

These tests focus on the fundamental operations of the BM25 search:

  1. Tokenization:

  2. Stemming:

  3. Combined Tokenization and Stemming:

  4. BM25Index Operations:
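To make the pipeline that these tests exercise more concrete, here is a minimal, self-contained sketch of tokenization, stemming, and BM25 scoring. It is not the code in src/search/bm25.js: the names `tokenize`, `stem`, `tokenizeAndStem`, and `TinyBM25` are hypothetical, the stemmer is a deliberately naive suffix stripper, and the scoring uses the common BM25 formula with assumed defaults k1 = 1.2 and b = 0.75.

```typescript
// Hypothetical helpers; the real exports of src/search/bm25.js may differ.
function tokenize(text: string): string[] {
  // Lowercase, then split on anything that is not an ASCII letter or digit.
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Naive suffix stripper standing in for a real stemming algorithm.
function stem(token: string): string {
  for (const suffix of ["ing", "ed", "es", "s"]) {
    if (token.length > suffix.length + 2 && token.endsWith(suffix)) {
      return token.slice(0, token.length - suffix.length);
    }
  }
  return token;
}

function tokenizeAndStem(text: string): string[] {
  return tokenize(text).map(stem);
}

type ScoredDoc = { id: string; score: number };

class TinyBM25 {
  private docs = new Map<string, string[]>();
  private k1 = 1.2; // term-frequency saturation (assumed default)
  private b = 0.75; // document-length normalization (assumed default)

  add(id: string, text: string): void {
    this.docs.set(id, tokenizeAndStem(text));
  }

  remove(id: string): boolean {
    return this.docs.delete(id);
  }

  search(query: string, limit = 10): ScoredDoc[] {
    const all = Array.from(this.docs.entries());
    const n = all.length;
    if (n === 0) return [];
    const avgLen = all.reduce((sum, [, tokens]) => sum + tokens.length, 0) / n;
    const terms = Array.from(new Set(tokenizeAndStem(query)));
    const results: ScoredDoc[] = [];
    for (const [id, tokens] of all) {
      let score = 0;
      for (const term of terms) {
        const tf = tokens.filter((t) => t === term).length;
        if (tf === 0) continue;
        // Document frequency: number of documents containing the term.
        const df = all.filter(([, other]) => other.indexOf(term) !== -1).length;
        const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
        const norm = tf + this.k1 * (1 - this.b + (this.b * tokens.length) / avgLen);
        score += (idf * tf * (this.k1 + 1)) / norm;
      }
      if (score > 0) results.push({ id, score });
    }
    return results.sort((x, y) => y.score - x.score).slice(0, limit);
  }
}
```

A test in this style would add a few documents, run a query, and assert on which ids come back and in what order, rather than on exact score values.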

Index Management Tests

This section validates the singleton pattern and lifecycle management for BM25 indexes.
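The singleton behavior under test can be pictured as a small registry of named instances. The sketch below is an assumption about the general pattern, not the project's API: `FakeIndex`, `getIndex`, `removeIndex`, and `clearAllIndexes` are hypothetical names chosen for illustration.

```typescript
// Stand-in for the real index class; only instance identity matters here.
class FakeIndex {
  readonly docs = new Map<string, string>();
}

// Hypothetical module-level registry keyed by index name.
const registry = new Map<string, FakeIndex>();

// Returns the existing instance for a name, creating it on first use.
function getIndex(name = "default"): FakeIndex {
  let index = registry.get(name);
  if (index === undefined) {
    index = new FakeIndex();
    registry.set(name, index);
  }
  return index;
}

// Drops one named instance; the next getIndex call creates a fresh one.
function removeIndex(name = "default"): boolean {
  return registry.delete(name);
}

// Drops every instance, which is useful between test cases.
function clearAllIndexes(): void {
  registry.clear();
}
```

Tests for such a registry typically assert that repeated lookups return the same object, that different names yield different objects, and that a removed name produces a new, empty instance on the next lookup.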

USearch Vector Index Tests (usearch-index.test.ts)

This test file validates the USearchVectorIndex class, which provides approximate nearest neighbor (ANN) search using the USearch library. It covers vector operations, persistence, and index management.

Overview

The tests in usearch-index.test.ts interact with the USearchVectorIndex class and related functions from src/search/usearch-index.js. They ensure the correct behavior of vector indexing, similarity search, and the persistence mechanisms. The tests also verify event emission for various operations.

Configuration and Initialization

Vector Operations (Add, Search, Remove)

These tests cover the core functionality of managing and querying vectors:

  1. Adding Vectors:

  2. Searching Vectors:

  3. Removing Vectors:

  4. Utility Methods:
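The add, search, and remove operations listed above can be illustrated with an exact, brute-force index. This is only a sketch for intuition: `BruteForceIndex` and its methods are hypothetical, it does not call the USearch library, and it ranks by squared Euclidean distance instead of performing an approximate search. An exact index of this kind can also serve as a reference when checking ANN results on small inputs.

```typescript
type Hit = { id: string; distance: number };

class BruteForceIndex {
  private vectors = new Map<string, number[]>();
  private dimensions: number;

  constructor(dimensions: number) {
    this.dimensions = dimensions;
  }

  // Rejects vectors whose length does not match the configured dimensions.
  private check(vector: number[]): void {
    if (vector.length !== this.dimensions) {
      throw new Error(`expected ${this.dimensions} dimensions, got ${vector.length}`);
    }
  }

  add(id: string, vector: number[]): void {
    this.check(vector);
    this.vectors.set(id, vector.slice()); // copy so later caller edits do not leak in
  }

  remove(id: string): boolean {
    return this.vectors.delete(id);
  }

  has(id: string): boolean {
    return this.vectors.has(id);
  }

  size(): number {
    return this.vectors.size;
  }

  // Returns up to k nearest vectors, closest first.
  search(query: number[], k: number): Hit[] {
    this.check(query);
    const hits: Hit[] = [];
    this.vectors.forEach((vector, id) => {
      let distance = 0;
      for (let i = 0; i < vector.length; i++) {
        const d = vector[i] - query[i];
        distance += d * d;
      }
      hits.push({ id, distance });
    });
    return hits.sort((x, y) => x.distance - y.distance).slice(0, Math.max(0, k));
  }
}
```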

Persistence and Lifecycle

Different Metrics

The tests confirm that USearchVectorIndex operates correctly with different similarity metrics.
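To show why the metric matters, the sketch below implements three common distance functions and is independent of the library. The metric names `cos`, `l2sq`, and `ip` follow the usual USearch naming, but which metrics USearchVectorIndex actually exposes is an assumption here, and the `distance` helper is hypothetical. Inner product is expressed as `1 - dot` so that smaller still means closer.

```typescript
type Metric = "cos" | "l2sq" | "ip";

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Smaller values mean "closer" for every metric in this sketch.
function distance(metric: Metric, a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  switch (metric) {
    case "l2sq": {
      // Squared Euclidean distance.
      let sum = 0;
      for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
      return sum;
    }
    case "ip":
      // Inner product turned into a distance.
      return 1 - dot(a, b);
    case "cos": {
      const norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
      // Treating a zero vector as maximally dissimilar is a choice of this sketch.
      return norms === 0 ? 1 : 1 - dot(a, b) / norms;
    }
  }
}
```

The same data can rank differently under each metric. For the query `[1, 0]`, the candidate `[1, 0]` is closer than `[3, 0]` under `l2sq`, farther under `ip`, and tied under `cos`, which is why each metric needs its own assertions.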

Singleton Management

As with BM25, USearch indexes can be managed as singletons.

Edge Cases

The tests cover a range of edge cases to ensure robustness.

Conclusion

The tests/search module validates both the lexical and the vector search capabilities. Its tests of BM25Index and USearchVectorIndex range from low-level text processing and vector operations to high-level index management and persistence. Developers contributing to the search functionality should consult these tests to understand the expected behavior, and should add new tests for any new feature or bug fix.