docs: improve docstring for RabitQ in Python (#2808)

This PR improves the docstring for `IVF_RQ` (RabitQ) in Python. The
earlier version described it as "residual quantization", which would
mislead future readers of the code.

The TypeScript and Rust codebases already define `IVF_RQ` as RabitQ, so
with this change the comments in all three languages are consistent with
one another.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Prashanth Rao
2025-11-23 21:35:19 -08:00
committed by GitHub
parent 5a2b33581e
commit a250d8e7df

@@ -609,9 +609,19 @@ class IvfPq:
 class IvfRq:
     """Describes an IVF RQ Index
 
-    IVF-RQ (Residual Quantization) stores a compressed copy of each vector using
-    residual quantization and organizes them into IVF partitions. Parameters
-    largely mirror IVF-PQ for consistency.
+    IVF-RQ (RabitQ Quantization) compresses vectors using RabitQ quantization
+    and organizes them into IVF partitions.
+
+    The compression scheme is called RabitQ quantization. Each dimension is
+    quantized into a small number of bits. The parameters `num_bits` and
+    `num_partitions` control this process, providing a tradeoff between
+    index size (and thus search speed) and index accuracy.
+
+    The partitioning process is called IVF and the `num_partitions` parameter
+    controls how many groups to create.
+
+    Note that training an IVF RQ index on a large dataset is a slow operation
+    and currently is also a memory intensive operation.
 
     Attributes
     ----------
@@ -628,7 +638,7 @@ class IvfRq:
         Number of IVF partitions to create.
 
     num_bits: int, default 1
-        Number of bits to encode each dimension.
+        Number of bits to encode each dimension in the RabitQ codebook.
 
     max_iterations: int, default 50
         Max iterations to train kmeans when computing IVF partitions.
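
The size side of the tradeoff described in the new docstring can be estimated with a little arithmetic. The sketch below is not part of this commit or of the library: `estimated_code_bytes` and `float32_bytes` are hypothetical helpers, and the estimate assumes the quantized codes take exactly `num_bits` per dimension, ignoring any per-vector metadata, partition centroids, and other index overhead.

```python
def estimated_code_bytes(num_vectors: int, dim: int, num_bits: int = 1) -> int:
    """Rough size in bytes of the quantized codes alone.

    Assumption: each dimension costs exactly `num_bits` bits and each
    vector's code is padded up to a whole number of bytes.
    """
    if num_vectors < 0 or dim <= 0 or num_bits <= 0:
        raise ValueError("num_vectors must be >= 0; dim and num_bits must be > 0")
    bytes_per_vector = (dim * num_bits + 7) // 8  # integer ceiling of bits / 8
    return num_vectors * bytes_per_vector


def float32_bytes(num_vectors: int, dim: int) -> int:
    """Size in bytes of the same vectors stored uncompressed as float32."""
    return num_vectors * dim * 4


if __name__ == "__main__":
    n, dim = 1_000_000, 768  # illustrative dataset shape, not from the commit
    raw = float32_bytes(n, dim)
    for bits in (1, 2, 4):
        codes = estimated_code_bytes(n, dim, bits)
        print(f"num_bits={bits}: {codes:,} bytes of codes, {raw / codes:.0f}x smaller than float32")
```

Under these assumptions, each extra bit per dimension grows the stored codes linearly, which is the index-size cost paid for the accuracy gained from a larger `num_bits`.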