Index Management
Indexes are crucial for optimizing search performance in Milvus. This guide covers creating, managing, and optimizing indexes.
Creating Indexes
Basic Index Creation
Create an index on a vector field:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'HNSW',
  metric_type: 'L2',
});
Index with Parameters
Specify index parameters:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'HNSW',
  metric_type: 'L2',
  params: {
    M: 16,
    efConstruction: 200,
  },
});
Index Types
HNSW (Hierarchical Navigable Small World)
Best for high-dimensional vectors and high recall:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'HNSW',
  metric_type: 'L2',
  params: {
    M: 16, // Maximum number of connections per graph node
    efConstruction: 200, // Candidate list size used while building the graph
  },
});
IVF_FLAT (Inverted File Flat)
Good balance of speed and accuracy:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'IVF_FLAT',
  metric_type: 'L2',
  params: {
    nlist: 1024, // Number of clusters
  },
});
IVF_SQ8 (Inverted File Scalar Quantization)
Memory-efficient version of IVF_FLAT:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'IVF_SQ8',
  metric_type: 'L2',
  params: {
    nlist: 1024,
  },
});
IVF_PQ (Inverted File Product Quantization)
Most memory-efficient, good for large datasets:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'IVF_PQ',
  metric_type: 'L2',
  params: {
    nlist: 1024,
    m: 8, // Number of sub-vectors
    nbits: 8, // Number of bits per sub-vector code
  },
});
FLAT
Exact search, no index (for small datasets):
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'FLAT',
  metric_type: 'L2',
});
AUTOINDEX
Let Milvus choose the best index:
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'AUTOINDEX',
  metric_type: 'L2',
});
MINHASH_LSH
MINHASH_LSH is an index for binary vectors that uses MinHash locality-sensitive hashing. It is useful for approximate similarity search on binary signatures, sets, shingles, or other binary encodings where Jaccard-style similarity is appropriate.
await client.createCollection({
  collection_name: 'binary_docs',
  fields: [
    { name: 'id', data_type: DataType.Int64, is_primary_key: true },
    { name: 'binary_vector', data_type: DataType.BinaryVector, dim: 128 },
  ],
});
await client.createIndex({
  collection_name: 'binary_docs',
  field_name: 'binary_vector',
  index_type: 'MINHASH_LSH',
  metric_type: 'JACCARD',
});
For binary vectors, the dim value must be a multiple of 8, and each inserted vector is passed as dim / 8 bytes (16 bytes for dim: 128).
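The dim / 8 rule is easiest to see by packing bits by hand. The sketch below is illustrative only: packBits is a hypothetical helper, not part of the SDK, and the most-significant-bit-first order is an assumption that should be checked against how your signatures are produced.

```javascript
// Pack an array of 0/1 bits into bytes, most significant bit first.
// packBits is a hypothetical helper; the MSB-first bit order is an assumption.
function packBits(bits) {
  if (bits.length % 8 !== 0) {
    throw new Error(`dim must be a multiple of 8, got ${bits.length}`);
  }
  const bytes = new Array(bits.length / 8).fill(0);
  bits.forEach((bit, i) => {
    // i >> 3 selects the byte, i % 8 selects the bit inside that byte
    if (bit) bytes[i >> 3] |= 0x80 >> (i % 8);
  });
  return bytes;
}

// A 16-bit signature becomes a 2-byte payload
const signature = packBits([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]);
console.log(signature.length); // 2
```

With dim: 128, the same helper would yield the 16-byte payload described above.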
Metric Types
Choose the appropriate metric type for your use case:
- L2: Euclidean distance (most common)
- IP: Inner product
- COSINE: Cosine similarity
- HAMMING: Hamming distance (for binary vectors)
- JACCARD: Jaccard distance (for binary vectors, including MinHash use cases)
// L2 distance
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'HNSW',
  metric_type: 'L2',
});
// Cosine similarity
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'HNSW',
  metric_type: 'COSINE',
});
Index Operations
Describe Index
Get index information:
const indexInfo = await client.describeIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
});
console.log('Index type:', indexInfo.index_type);
console.log('Metric type:', indexInfo.metric_type);
console.log('Parameters:', indexInfo.params);
List Indexes
List all indexes in a collection:
const indexes = await client.listIndexes({
  collection_name: 'my_collection',
});
console.log('Indexes:', indexes.index_descriptions);
Get Index State
Check whether the index has been built:
const state = await client.getIndexState({
  collection_name: 'my_collection',
  field_name: 'vector',
});
// One of: 'IndexStateNone', 'IndexStateUnissued', 'IndexStateInProgress', 'IndexStateFinished', 'IndexStateFailed'
console.log('Index state:', state.state);
Get Index Build Progress
Monitor index build progress:
const progress = await client.getIndexBuildProgress({
  collection_name: 'my_collection',
  field_name: 'vector',
});
console.log('Indexed rows:', progress.indexed_rows);
console.log('Total rows:', progress.total_rows);
Alter Index Properties
Modify index properties:
await client.alterIndexProperties({
  collection_name: 'my_collection',
  field_name: 'vector',
  properties: {
    'index.params.ef': 100,
  },
});
Drop Index Properties
Remove index properties:
await client.dropIndexProperties({
  collection_name: 'my_collection',
  field_name: 'vector',
  property_names: ['index.params.ef'],
});
Drop Index
Delete an index:
await client.dropIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
});
Index Best Practices
Choosing an Index Type
- Small datasets (< 1M vectors): Use FLAT for exact search
- Medium datasets (1M - 10M vectors): Use HNSW for high recall
- Large datasets (> 10M vectors): Use IVF_PQ for memory efficiency
- High-dimensional vectors: Prefer HNSW or IVF_FLAT
- Memory-constrained: Use IVF_SQ8 or IVF_PQ
- Binary signatures / set similarity: Use MINHASH_LSH with JACCARD
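The guidelines above can be sketched as a small decision function. This is a rough encoding of the list, not an SDK feature: suggestIndexType and its option names are hypothetical, and the thresholds are the ones quoted in the bullets.

```javascript
// Suggest an index type from the rules of thumb listed above.
// suggestIndexType, rowCount, memoryConstrained and binarySignatures are hypothetical names.
function suggestIndexType({ rowCount, memoryConstrained = false, binarySignatures = false }) {
  if (binarySignatures) return 'MINHASH_LSH'; // pair with the JACCARD metric
  if (rowCount < 1000000) return 'FLAT'; // small: exact search is affordable
  if (rowCount > 10000000) return 'IVF_PQ'; // large: favor memory efficiency
  return memoryConstrained ? 'IVF_SQ8' : 'HNSW'; // medium: recall unless memory is tight
}

console.log(suggestIndexType({ rowCount: 5000000 })); // HNSW
```

Treat the result as a starting point and benchmark recall and latency on your own data before committing to an index type.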
Index Parameters
HNSW Parameters:
- M: Higher values improve recall but increase memory use (typical: 16-32)
- efConstruction: Higher values improve index quality but slow down the build (typical: 100-500)
IVF Parameters:
- nlist: Number of clusters (typical: sqrt(total_vectors) to total_vectors/10)
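As a worked example of the typical range quoted above, the following sketch computes both bounds for a given row count. nlistRange is a hypothetical helper, and the range is only the rule of thumb from this guide, so tune nlist against measured recall.

```javascript
// Compute the rule-of-thumb nlist range: sqrt(total_vectors) to total_vectors / 10.
// nlistRange is a hypothetical helper, not part of the SDK.
function nlistRange(totalVectors) {
  const min = Math.max(1, Math.round(Math.sqrt(totalVectors)));
  // For very small collections total / 10 falls below sqrt(total), so keep max >= min
  const max = Math.max(min, Math.floor(totalVectors / 10));
  return { min, max };
}

console.log(nlistRange(1000000)); // { min: 1000, max: 100000 }
```

The nlist: 1024 used in the examples on this page sits at the low end of that range for a collection of one million vectors.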
IVF_PQ Parameters:
- m: Number of sub-vectors (typical: 8-16)
- nbits: Bits per sub-vector (typical: 8)
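To see why IVF_PQ saves memory, compare the size of a product-quantized code with the raw vector. The sketch below is back-of-the-envelope arithmetic under stated assumptions: pqFootprint is a hypothetical helper, raw vectors are taken to be 32-bit floats, and codebooks, IDs, and other index overhead are ignored. Product quantization splits each vector into m equal sub-vectors, so dim has to be divisible by m.

```javascript
// Estimate per-vector storage for product quantization versus raw float32 vectors.
// pqFootprint is a hypothetical helper; codebook and ID overhead are not counted.
function pqFootprint(dim, m, nbits) {
  if (dim % m !== 0) {
    throw new Error(`dim (${dim}) must be divisible by m (${m})`);
  }
  const codeBytes = (m * nbits) / 8; // one nbits-wide code per sub-vector
  const rawBytes = dim * 4; // float32 = 4 bytes per dimension
  return { codeBytes, rawBytes, ratio: rawBytes / codeBytes };
}

console.log(pqFootprint(128, 8, 8)); // { codeBytes: 8, rawBytes: 512, ratio: 64 }
```

Raising m increases the code size and usually improves recall, which is the trade-off behind the typical range listed above.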
Metric Type Selection
- L2: Use for embeddings trained with L2 distance
- IP: Use for embeddings trained with inner product
- COSINE: Use when only vector direction matters; on unit-normalized embeddings it ranks results the same way as IP
- HAMMING/JACCARD: Use for binary vectors
- JACCARD + MINHASH_LSH: Use for binary signatures and set-similarity workloads
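Cosine similarity is the inner product of unit-length vectors, which is why COSINE and IP agree on normalized embeddings. The sketch below checks this numerically; dot, normalize, and cosine are hypothetical helper names written for this illustration.

```javascript
// Show that the inner product of normalized vectors equals cosine similarity.
// dot, normalize and cosine are hypothetical helpers for illustration.
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function normalize(v) {
  const norm = Math.sqrt(dot(v, v));
  if (norm === 0) throw new Error('cannot normalize a zero vector');
  return v.map((x) => x / norm);
}

function cosine(a, b) {
  return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
}

const a = [3, 4];
const b = [4, 3];
const ipOfNormalized = dot(normalize(a), normalize(b));
// The two values match up to floating-point rounding
console.log(Math.abs(cosine(a, b) - ipOfNormalized) < 1e-12); // true
```

If your embeddings are already unit length, either metric gives the same ranking; if they are not, IP also rewards vector magnitude.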
Index Building
- Build after insertion: Create index after inserting data
- Monitor progress: Check build progress for large datasets
- Load after indexing: Load collection after index is built
- One index per vector field: Each vector field can have one index
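The "monitor progress" and "load after indexing" steps can be combined into one polling helper. This is a minimal sketch, not an SDK feature: waitForIndex, intervalMs, and timeoutMs are hypothetical names, and the state strings are the ones listed under Get Index State, so adjust them if your server reports different values.

```javascript
// Poll getIndexState until the build finishes, fails, or the timeout elapses.
// waitForIndex, intervalMs and timeoutMs are hypothetical names for this sketch.
async function waitForIndex(client, params, { intervalMs = 1000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const { state } = await client.getIndexState(params);
    if (state === 'IndexStateFinished') return state;
    if (state === 'IndexStateFailed') {
      throw new Error(`Index build failed for ${params.collection_name}`);
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for index on ${params.collection_name}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage, given an existing client:
// await waitForIndex(client, { collection_name: 'my_collection', field_name: 'vector' });
// await client.loadCollectionSync({ collection_name: 'my_collection' });
```

Failing fast on a failed build and bounding the wait avoids an endless loop when the index never reaches the finished state.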
Complete Example
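// Assumes the @zilliz/milvus2-sdk-node package and a Milvus server at localhost:19530
import { MilvusClient, DataType } from '@zilliz/milvus2-sdk-node';
const client = new MilvusClient({ address: 'localhost:19530' });
// Create collection
await client.createCollection({
  collection_name: 'my_collection',
  fields: [
    {
      name: 'id',
      data_type: DataType.Int64,
      is_primary_key: true,
      autoID: true,
    },
    {
      name: 'vector',
      data_type: DataType.FloatVector,
      dim: 128,
    },
  ],
});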
// Insert data
await client.insert({
  collection_name: 'my_collection',
  data: [ /* ... */ ],
});
// Create index
await client.createIndex({
  collection_name: 'my_collection',
  field_name: 'vector',
  index_type: 'HNSW',
  metric_type: 'L2',
  params: {
    M: 16,
    efConstruction: 200,
  },
});
// Wait for the index to be built; stop early if the build fails
let state;
do {
  state = await client.getIndexState({
    collection_name: 'my_collection',
    field_name: 'vector',
  });
  if (state.state === 'IndexStateFailed') {
    throw new Error('Index build failed');
  }
  await new Promise(resolve => setTimeout(resolve, 1000));
} while (state.state !== 'IndexStateFinished');
// Load collection
await client.loadCollectionSync({
  collection_name: 'my_collection',
});
// Now you can search
const results = await client.search({
  collection_name: 'my_collection',
  data: [ /* vector */ ],
  limit: 10,
});
Next Steps
- Learn about Data Operations
- Explore Search Operations
- Check out Best Practices