Iterators

Iterators allow you to paginate through large result sets that exceed the `topk` limit (16,384). Instead of retrieving all results at once, iterators fetch data in configurable batches.
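Conceptually, each iterator wraps repeated paged requests in an async generator. A minimal, SDK-free sketch of the pattern (the `fetchPage` function here is purely illustrative and not part of the Milvus client):

```typescript
// Minimal sketch of batched pagination as an async generator.
// `fetchPage` is a stand-in for a server call, not a real SDK method.
async function* paginate<T>(
  fetchPage: (offset: number, size: number) => Promise<T[]>,
  batchSize: number,
): AsyncGenerator<T[]> {
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, batchSize);
    if (page.length === 0) return; // result set exhausted
    yield page;
    offset += page.length;
  }
}

// Simulate a 250-row result set consumed in batches of 100.
const rows = Array.from({ length: 250 }, (_, i) => i);
const batchSizes: number[] = [];
for await (const batch of paginate(async (o, n) => rows.slice(o, o + n), 100)) {
  batchSizes.push(batch.length);
}
console.log(batchSizes); // [100, 100, 50]
```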

Search Iterator

`searchIterator()` performs a vector similarity search and returns results in batches.

Basic Usage

```typescript
const iterator = await client.searchIterator({
  collection_name: 'my_collection',
  data: [0.1, 0.2, 0.3, ...], // search vector
  batchSize: 100,
  limit: 1000, // total results to return (-1 or omit for no limit)
  output_fields: ['id', 'text', 'score'],
  expr: 'age > 25',
});

for await (const batch of iterator) {
  console.log('Batch size:', batch.length);
  // Process each batch
}
```

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `collection_name` | `string` | Yes | Collection to search |
| `data` | `number[]` or `number[][]` | Yes | Search vector(s) |
| `batchSize` | `number` | Yes | Items per batch (max 16,384) |
| `limit` | `number` | No | Total results limit (-1 for unlimited) |
| `expr` | `string` | No | Filter expression |
| `output_fields` | `string[]` | No | Fields to return |
| `anns_field` | `string` | No | Vector field name (auto-detected if only one) |
| `params` | `object` | No | Search parameters (e.g., `{ nprobe: 10 }`) |
| `external_filter_fn` | `function` | No | Client-side filter function (see below) |

Client-Side Filtering

Use `external_filter_fn` to apply additional filtering on the client side after results are returned from the server:

```typescript
const iterator = await client.searchIterator({
  collection_name: 'my_collection',
  data: [0.1, 0.2, 0.3, ...],
  batchSize: 100,
  external_filter_fn: (row) => {
    // Only keep results where the text length > 50
    return row.text && row.text.length > 50;
  },
});

for await (const batch of iterator) {
  // All items in batch satisfy the external filter
  console.log(batch);
}
```
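Because `external_filter_fn` is plain JavaScript, the predicate can be unit-tested on its own, with no Milvus connection. The `row` shape below simply mirrors the example above:

```typescript
// The same predicate as above, extracted as a standalone function.
const keepLongText = (row: { text?: string }): boolean =>
  Boolean(row.text && row.text.length > 50);

console.log(keepLongText({ text: 'short' }));        // false
console.log(keepLongText({ text: 'x'.repeat(60) })); // true
console.log(keepLongText({}));                       // false
```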

Query Iterator

`queryIterator()` retrieves entities matching a filter expression in batches.

Basic Usage

```typescript
const iterator = await client.queryIterator({
  collection_name: 'my_collection',
  expr: 'age > 30',
  output_fields: ['id', 'text', 'age'],
  batchSize: 100,
  limit: 5000,
});

for await (const batch of iterator) {
  console.log('Batch:', batch.length);
  // Process each batch of query results
}
```

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `collection_name` | `string` | Yes | Collection to query |
| `expr` | `string` | Yes | Filter expression |
| `batchSize` | `number` | Yes | Items per batch |
| `limit` | `number` | No | Total results limit |
| `output_fields` | `string[]` | No | Fields to return |
| `partition_names` | `string[]` | No | Partitions to query |

End-to-End Example: Full Collection Scan

```typescript
import { MilvusClient, DataType } from '@zilliz/milvus2-sdk-node';

const client = new MilvusClient({ address: 'localhost:19530' });

// Create and populate a collection
await client.createCollection({
  collection_name: 'iterator_demo',
  fields: [
    { name: 'id', data_type: DataType.Int64, is_primary_key: true, autoID: true },
    { name: 'category', data_type: DataType.VarChar, max_length: 64 },
    { name: 'vector', data_type: DataType.FloatVector, dim: 4 },
  ],
});

await client.createIndex({
  collection_name: 'iterator_demo',
  field_name: 'vector',
  index_type: 'AUTOINDEX',
  metric_type: 'COSINE',
});

await client.loadCollectionSync({ collection_name: 'iterator_demo' });

// Insert sample data
const data = Array.from({ length: 1000 }, (_, i) => ({
  category: `cat_${i % 10}`,
  vector: Array.from({ length: 4 }, () => Math.random()),
}));
await client.insert({ collection_name: 'iterator_demo', data });

// Scan all entities in category 'cat_5'
let totalCount = 0;
const iterator = await client.queryIterator({
  collection_name: 'iterator_demo',
  expr: 'category == "cat_5"',
  output_fields: ['id', 'category'],
  batchSize: 50,
});

for await (const batch of iterator) {
  totalCount += batch.length;
  console.log(`Fetched ${batch.length} items, total: ${totalCount}`);
}
console.log(`Total matching entities: ${totalCount}`);

// Cleanup
await client.dropCollection({ collection_name: 'iterator_demo' });
```

Best Practices

  1. Batch size tuning — Start with 100-500. Larger batches reduce round trips but increase memory usage per batch. Maximum is 16,384.
  2. Use filters — Apply server-side filters (expr) to reduce data transfer. Use external_filter_fn only for logic that can’t be expressed as a Milvus filter.
  3. Output fields — Only request fields you need to minimize data transfer.
  4. Memory — For very large scans, process and discard each batch promptly rather than accumulating all results in memory.
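The memory advice in point 4 amounts to keeping only running aggregates instead of collecting rows. A sketch, with a mocked generator standing in for the batches a `queryIterator()` would yield:

```typescript
// Mocked iterator standing in for queryIterator() results.
async function* mockBatches(): AsyncGenerator<{ age: number }[]> {
  yield [{ age: 31 }, { age: 45 }];
  yield [{ age: 52 }];
}

let count = 0;
let ageSum = 0;
for await (const batch of mockBatches()) {
  count += batch.length;                          // running aggregate only
  ageSum += batch.reduce((s, r) => s + r.age, 0); // batch is then discarded
}
console.log(count, ageSum); // 3 128
```

Each batch goes out of scope after its iteration, so peak memory stays proportional to `batchSize`, not to the total result count.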

Commit

```shell
git add docs/content/operations/iterators.mdx
git commit --signoff -m "docs: add iterators documentation page"
```