@tteofili
Contributor

This adds symmetric 4-bit support to DiskBBQ indices.
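
For readers unfamiliar with the idea, here is a minimal sketch of what 4-bit quantization applied to both the indexed vectors and the query could look like. It assumes "symmetric" means both sides use the same 4-bit width, which the description does not spell out. This is illustrative Python, not the code in this PR: all function names are hypothetical, and it omits any corrective terms a real implementation would likely need to approximate float similarity.

```python
def quantize_4bit(vec, lo, hi):
    """Map each float in [lo, hi] to an integer level in 0..15 (values outside are clamped)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = (hi - lo) / 15.0  # 16 levels -> 15 intervals
    return [min(15, max(0, round((x - lo) / scale))) for x in vec]


def pack_nibbles(levels):
    """Pack two 4-bit levels per byte, low nibble first; pads odd lengths with 0."""
    if len(levels) % 2:
        levels = levels + [0]
    return bytes(levels[i] | (levels[i + 1] << 4) for i in range(0, len(levels), 2))


def unpack_nibbles(packed, n):
    """Inverse of pack_nibbles; n is the original number of levels."""
    out = []
    for b in packed:
        out.append(b & 0x0F)
        out.append(b >> 4)
    return out[:n]


def int_dot(a, b):
    """Integer dot product over two lists of quantized levels."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical usage: quantize a document vector and a query the same way.
doc = quantize_4bit([-1.0, 1.0, 0.2], -1.0, 1.0)
query = quantize_4bit([0.2, 0.2, 1.0], -1.0, 1.0)
stored = pack_nibbles(doc)                      # what would be written to disk
score = int_dot(unpack_nibbles(stored, len(doc)), query)
```

Packing two levels per byte is what makes 4-bit storage four times larger than 1-bit storage for the same vector, which is one reason scoring cost grows with the bit width.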

@tteofili
Contributor Author

tteofili commented Oct 29, 2025

this gives us a bump in recall over 1 and 2 bits (recall tops out at 0.95, versus 0.88 for 2-bit and 0.79 for 1-bit); however, it's currently much slower.

1-bit

index_name       index_type  num_docs  doc_add_time(ms)  total_index_time(ms)  force_merge_time(ms)  num_segments
---------------  ----------  --------  ----------------  --------------------  --------------------  ------------  
wiki1024en.docs         ivf    499000               836                 25145                     0             8

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         0.35              0.87           2.49  2857.14    0.74   13043.17                4.00
wiki1024en.docs         ivf                 5.00         0.44              1.43           3.25  2272.73    0.78   52763.14                4.00
wiki1024en.docs         ivf                10.00         1.01              3.43           3.40   990.10    0.79  102704.67                4.00
wiki1024en.docs         ivf                30.00         2.00              6.98           3.49   500.00    0.79  302511.11                4.00
wiki1024en.docs         ivf                50.00         2.71              9.43           3.48   369.00    0.79  501910.95                4.00
wiki1024en.docs         ivf                70.00         3.75             13.02           3.47   266.67    0.79  701296.15                4.00
wiki1024en.docs         ivf               100.00         5.22             18.33           3.51   191.57    0.79  998000.00                4.00

2-bit

index_name       index_type  num_docs  doc_add_time(ms)  total_index_time(ms)  force_merge_time(ms)  num_segments
---------------  ----------  --------  ----------------  --------------------  --------------------  ------------  
wiki1024en.docs         ivf    499000              1153                 26075                     0             8

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         0.99              3.28           3.31  1010.10    0.81   12898.42                1.00
wiki1024en.docs         ivf                 5.00         4.28             11.63           2.72   233.64    0.87   52753.10                1.00
wiki1024en.docs         ivf                10.00         8.24             21.96           2.67   121.36    0.88  102670.69                1.00
wiki1024en.docs         ivf                30.00        23.02             60.50           2.63    43.44    0.88  302414.15                1.00
wiki1024en.docs         ivf                50.00        37.37             96.84           2.59    26.76    0.88  502034.51                1.00
wiki1024en.docs         ivf                70.00        52.31            135.03           2.58    19.12    0.88  701306.98                1.00
wiki1024en.docs         ivf               100.00        76.94            198.21           2.58    13.00    0.88  997999.00                1.00

4-bit

index_name       index_type  num_docs  doc_add_time(ms)  total_index_time(ms)  force_merge_time(ms)  num_segments
---------------  ----------  --------  ----------------  --------------------  --------------------  ------------  
wiki1024en.docs         ivf    499000              1120                 26909                     0             8

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         2.39              7.46           3.12  418.41    0.87   13259.86                1.00
wiki1024en.docs         ivf                 5.00         7.93             22.14           2.79  126.10    0.93   52943.50                1.00
wiki1024en.docs         ivf                10.00        15.46             42.11           2.72   64.68    0.94  102636.98                1.00
wiki1024en.docs         ivf                30.00        45.84            122.04           2.66   21.82    0.95  302375.00                1.00
wiki1024en.docs         ivf                50.00        76.42            201.22           2.63   13.09    0.95  501783.49                1.00
wiki1024en.docs         ivf                70.00       107.32            280.86           2.62    9.32    0.95  701135.30                1.00
wiki1024en.docs         ivf               100.00       164.37            418.40           2.55    6.08    0.95  998000.00                1.00

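the slowness is partially expected because the native computations are not yet implemented for this (which would make sense to do in a separate PR).
from these very initial tests, 4-bit is roughly 2x slower than 2-bit (about 1.9x to 2.4x in latency across visit percentages), while 2-bit is about 3x slower than 1-bit at 1% visit and up to roughly 15x slower at 100%.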
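
The slowdown factors can be recomputed from the latency columns of the three tables above. A minimal sketch in Python; the `latency_ms` dictionary and the `slowdown` helper are hypothetical names used only for illustration, and the numbers are copied from the tables:

```python
# Latency (ms) keyed by visit_percentage, copied from the tables above.
latency_ms = {
    "1-bit": {1: 0.35, 5: 0.44, 10: 1.01, 30: 2.00, 50: 2.71, 70: 3.75, 100: 5.22},
    "2-bit": {1: 0.99, 5: 4.28, 10: 8.24, 30: 23.02, 50: 37.37, 70: 52.31, 100: 76.94},
    "4-bit": {1: 2.39, 5: 7.93, 10: 15.46, 30: 45.84, 50: 76.42, 70: 107.32, 100: 164.37},
}


def slowdown(slow, fast):
    """Latency ratio slow/fast for each visit percentage."""
    return {
        pct: latency_ms[slow][pct] / latency_ms[fast][pct]
        for pct in latency_ms[fast]
    }


for slow, fast in [("2-bit", "1-bit"), ("4-bit", "2-bit")]:
    ratios = slowdown(slow, fast)
    line = ", ".join(f"{pct}%: {r:.2f}x" for pct, r in ratios.items())
    print(f"{slow} vs {fast}: {line}")
```

Note that the 1-bit run reports a different filter_selectivity (4.00) than the other two runs (1.00), so the 1-bit comparison may not be fully like for like.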

@tteofili tteofili marked this pull request as ready for review October 29, 2025 14:50
@tteofili tteofili changed the title from "Add 4-bit quantization to bbq_disk" to "Add 4-bit quantization to DiskBBQ" Oct 29, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label Oct 29, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@tteofili tteofili changed the title from "Add 4-bit quantization to DiskBBQ" to "Add 4-bit quantization to DiskBBQ next" Oct 29, 2025
Contributor

@john-wagster john-wagster left a comment

lgtm

@tteofili tteofili merged commit 1f29688 into elastic:main Oct 30, 2025
34 checks passed
chrisparrinello pushed a commit to chrisparrinello/elasticsearch that referenced this pull request Nov 3, 2025

Labels

>non-issue, :Search Relevance/Vectors, Team:Search Relevance, v9.3.0

3 participants