
Data Operations - Insert & Update

This guide covers inserting and updating data in Milvus collections.

Inserting Data

Basic Insert

Insert data into a collection:

await client.insert({
  collection_name: 'my_collection',
  data: [
    {
      id: 1,
      vector: [0.1, 0.2, 0.3, 0.4],
      text: 'Hello Milvus',
    },
    {
      id: 2,
      vector: [0.5, 0.6, 0.7, 0.8],
      text: 'Vector database',
    },
  ],
});

Insert with AutoID

If the collection's primary key field has autoID enabled, omit that field from each row and let Milvus generate the IDs:

await client.insert({
  collection_name: 'my_collection',
  data: [
    {
      vector: [0.1, 0.2, 0.3, 0.4],
      text: 'Hello Milvus',
    },
    {
      vector: [0.5, 0.6, 0.7, 0.8],
      text: 'Vector database',
    },
  ],
});

Insert into Specific Partition

Insert data into a specific partition:

await client.insert({
  collection_name: 'my_collection',
  partition_name: 'partition_1',
  data: [
    {
      vector: [0.1, 0.2, 0.3, 0.4],
      text: 'Data for partition 1',
    },
  ],
});

Batch Insertion

For large datasets, split the data into fixed-size batches and insert them one at a time:

const batchSize = 1000;
const allData = [/* large array of data */];

for (let i = 0; i < allData.length; i += batchSize) {
  const batch = allData.slice(i, i + batchSize);
  await client.insert({
    collection_name: 'my_collection',
    data: batch,
  });
}
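The batching loop can be wrapped in a reusable function. The following is a minimal sketch, not part of the SDK: `insertInBatches` is a hypothetical helper name, and it assumes only that `client` exposes an async `insert({ collection_name, data })` method as in the examples above.

```javascript
// Hypothetical helper (not provided by the SDK): inserts `rows` in
// sequential batches and collects the result of each insert call.
async function insertInBatches(client, collectionName, rows, batchSize = 1000) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const results = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    // Await each call so that only one request is in flight at a time.
    const result = await client.insert({
      collection_name: collectionName,
      data: batch,
    });
    results.push(result);
  }
  return results;
}
```

Calling `await insertInBatches(client, 'my_collection', allData, 1000)` with 2500 rows would issue three insert calls of 1000, 1000, and 500 rows. Sequential awaiting keeps memory use and server load predictable; whether parallel batches are faster depends on the deployment.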

Data Format

Vector Data Formats

FloatVector

{
  vector: [0.1, 0.2, 0.3, 0.4, /* ... */], // Array of numbers
}

BinaryVector

{
  binary_vector: [1, 0, 1, 0, /* ... */], // Array of byte values, 8 dimensions per byte; length = dim / 8
}
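The `dim / 8` length rule implies that each array element carries eight dimensions. If your source data is one bit per dimension, it has to be packed first. The sketch below shows one way to do that; `packBits` is a hypothetical helper, and the bit order (first bit of each group is the most significant) is an assumption you should verify against your own data pipeline.

```javascript
// Hypothetical helper (not part of the SDK): packs an array of bits
// (each 0 or 1) into byte values, 8 bits per byte.
// Assumption: the first bit of each group of 8 is the most significant bit.
function packBits(bits) {
  if (bits.length % 8 !== 0) {
    throw new RangeError('number of bits must be a multiple of 8');
  }
  const bytes = [];
  for (let i = 0; i < bits.length; i += 8) {
    let byte = 0;
    for (let j = 0; j < 8; j++) {
      const bit = bits[i + j];
      if (bit !== 0 && bit !== 1) {
        throw new TypeError('bits must be 0 or 1');
      }
      // Shift the accumulated bits left and append the next bit.
      byte = (byte << 1) | bit;
    }
    bytes.push(byte);
  }
  return bytes;
}
```

For a 16-dimensional field, `packBits([1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1])` returns `[160, 1]`, an array of length 16 / 8 = 2.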

SparseFloatVector

Sparse vectors can be supplied in any of the following formats:

// Array format
{
  sparse_vector: [1.0, undefined, undefined, 2.5, undefined],
}

// Dictionary format
{
  sparse_vector: { '0': 1.0, '3': 2.5 },
}

// CSR format
{
  sparse_vector: {
    indices: [0, 3],
    values: [1.0, 2.5],
  },
}

// COO format
{
  sparse_vector: [
    { index: 0, value: 1.0 },
    { index: 3, value: 2.5 },
  ],
}
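All four formats describe the same vector, so converting between them is mechanical. The helpers below are a hypothetical sketch (they are not SDK functions) that may be useful when upstream code produces one shape and you prefer to store or log another.

```javascript
// Hypothetical converters between the sparse vector shapes shown above.

// Dictionary format -> CSR format, with indices sorted ascending.
function dictToCsr(dict) {
  const indices = Object.keys(dict)
    .map(Number)
    .sort((a, b) => a - b);
  return { indices, values: indices.map((i) => dict[i]) };
}

// CSR format -> COO format (one { index, value } object per entry).
function csrToCoo({ indices, values }) {
  return indices.map((index, k) => ({ index, value: values[k] }));
}

// COO format -> array format of the given length; gaps stay undefined.
function cooToArray(coo, length) {
  const dense = new Array(length).fill(undefined);
  for (const { index, value } of coo) {
    dense[index] = value;
  }
  return dense;
}
```

For example, `dictToCsr({ '0': 1.0, '3': 2.5 })` yields `{ indices: [0, 3], values: [1, 2.5] }`, which matches the CSR example above.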

Float16Vector / BFloat16Vector

{
  vector: [0.1, 0.2, 0.3, 0.4], // Array of numbers (will be converted)
}

Scalar Data Formats

{
  id: 123,          // Int64
  age: 25,          // Int64
  score: 98.5,      // Float
  is_active: true,  // Bool
  name: 'John Doe', // VarChar
  metadata: {       // JSON
    category: 'tech',
    tags: ['ai', 'ml'],
  },
}

Upserting Data

Upsert updates an entity if its primary key already exists and inserts it as a new entity otherwise:

await client.upsert({
  collection_name: 'my_collection',
  data: [
    {
      id: 1,
      vector: [0.9, 0.8, 0.7, 0.6], // Update existing entity
      text: 'Updated text',
    },
    {
      id: 999,
      vector: [0.1, 0.2, 0.3, 0.4], // Insert new entity
      text: 'New text',
    },
  ],
});

Partial Update

Update only specific fields:

await client.upsert({
  collection_name: 'my_collection',
  partial_update: true,
  data: [
    {
      id: 1,
      text: 'Updated text only', // Only update text field
      // vector field not included, will remain unchanged
    },
  ],
});

Insert Transformers

For Float16 and BFloat16 vectors, use transformers:

import { f32ArrayToF16Bytes } from '@zilliz/milvus2-sdk-node';

await client.insert({
  collection_name: 'my_collection',
  data: [
    {
      vector: [0.1, 0.2, 0.3, 0.4], // f32 array
    },
  ],
  transformers: {
    vector: (data) => f32ArrayToF16Bytes(data), // Convert to f16 bytes
  },
});

Insert Response

The insert operation returns mutation results:

const result = await client.insert({
  collection_name: 'my_collection',
  data: [/* ... */],
});

console.log('IDs:', result.IDs); // Array of inserted IDs
console.log('Insert count:', result.insert_cnt);
console.log('Delete count:', result.delete_cnt);
console.log('Upsert count:', result.upsert_cnt);
console.log('Timestamp:', result.timestamp);

Data Validation

The SDK validates data before insertion:

  • Field existence: All fields must exist in the collection schema
  • Data types: Values must match field data types
  • Vector dimensions: Vector length must match field dimension
  • Binary vectors: Length must be dimension / 8
  • VarChar length: String length must not exceed max_length
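The checks listed above can also be run in application code before calling insert, so that bad rows are reported together rather than one failed request at a time. The following is a hypothetical sketch, not the SDK's own validator: `validateRow` and the plain-object `schema` shape (`name`, `type`, `dim`, `max_length`) are illustrative assumptions, and the VarChar check counts JavaScript string length, which may differ from how the server measures `max_length`.

```javascript
// Hypothetical pre-flight check (not the SDK's validator).
// `schema` is an illustrative array of { name, type, dim?, max_length? }.
function validateRow(row, schema) {
  const errors = [];
  const known = new Set(schema.map((field) => field.name));

  // Field existence: every key in the row must be declared in the schema.
  for (const key of Object.keys(row)) {
    if (!known.has(key)) errors.push(`unknown field: ${key}`);
  }

  for (const field of schema) {
    const value = row[field.name];
    if (value === undefined) continue; // e.g. an autoID primary key

    if (field.type === 'FloatVector' && value.length !== field.dim) {
      errors.push(`${field.name}: expected ${field.dim} dimensions, got ${value.length}`);
    }
    if (field.type === 'BinaryVector' && value.length !== field.dim / 8) {
      errors.push(`${field.name}: expected length ${field.dim / 8}, got ${value.length}`);
    }
    // Assumption: compares string length in UTF-16 code units.
    if (field.type === 'VarChar' && value.length > field.max_length) {
      errors.push(`${field.name}: length ${value.length} exceeds max_length ${field.max_length}`);
    }
  }
  return errors;
}
```

A row passes when the returned array is empty; otherwise each entry names the offending field, which makes it straightforward to filter or log invalid rows before batching.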

Error Handling

Handle insertion errors:

try {
  await client.insert({
    collection_name: 'my_collection',
    data: [/* ... */],
  });
} catch (error) {
  if (error.message.includes('field does not exist')) {
    console.error('Invalid field name');
  } else if (error.message.includes('dimension')) {
    console.error('Vector dimension mismatch');
  } else {
    console.error('Insert failed:', error);
  }
}

Best Practices

  1. Batch inserts: Insert data in batches (1000-10000 entities per batch)
  2. Flush after insertion: Call flush() after inserting data to ensure persistence
  3. Use autoID: Let Milvus generate IDs unless you have specific requirements
  4. Validate data: Validate data before insertion to avoid errors
  5. Handle errors: Always wrap insertions in try-catch blocks
  6. Use upsert for updates: Use upsert() instead of delete + insert for updates

Complete Example

import { MilvusClient, DataType } from '@zilliz/milvus2-sdk-node';

const client = new MilvusClient({
  address: 'localhost:19530',
});
await client.connectPromise;

// Create collection
await client.createCollection({
  collection_name: 'my_collection',
  fields: [
    {
      name: 'id',
      data_type: DataType.Int64,
      is_primary_key: true,
      autoID: true,
    },
    {
      name: 'vector',
      data_type: DataType.FloatVector,
      dim: 128,
    },
    {
      name: 'text',
      data_type: DataType.VarChar,
      max_length: 256,
    },
  ],
});

// Insert data
const result = await client.insert({
  collection_name: 'my_collection',
  data: [
    {
      vector: Array.from({ length: 128 }, () => Math.random()),
      text: 'First document',
    },
    {
      vector: Array.from({ length: 128 }, () => Math.random()),
      text: 'Second document',
    },
  ],
});
console.log('Inserted IDs:', result.IDs);

// Flush to ensure data is persisted
await client.flush({
  collection_names: ['my_collection'],
});
