Data Management

This guide covers data management operations including flushing, compaction, bulk import, and segment management.

Flush seals a collection's growing segments and persists their data to storage.

Flush a collection asynchronously:

const result = await client.flush({
  collection_names: ['my_collection'],
});
console.log('Flush segment IDs:', result.coll_segIDs);

Wait until flush completes:

const result = await client.flushSync({
  collection_names: ['my_collection'],
});
console.log('Flush completed:', result.flushed);

Check if segments are flushed:

const state = await client.getFlushState({
  segmentIDs: [12345, 12346],
});
console.log('Flushed:', state.flushed);
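
If you only have the segment IDs from an asynchronous flush, one option is to poll getFlushState() yourself. The sketch below assumes the response carries a boolean flushed field, as in the example above. The names waitForFlush, intervalMs, and timeoutMs are hypothetical and not part of the SDK.

```javascript
// Hypothetical helper: polls getFlushState() until the segments report
// flushed, or until the timeout elapses. `client` is any object that
// exposes getFlushState() as shown above.
async function waitForFlush(client, segmentIDs, { intervalMs = 500, timeoutMs = 30000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const state = await client.getFlushState({ segmentIDs });
    if (state.flushed) return true;
    // Stop if the next wait would run past the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}
```

The helper returns false on timeout instead of throwing, so the caller can decide whether to retry or fail.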

Flush all collections in the cluster, then check whether the flush-all operation has completed:

const flushAll = await client.flushAll();
const state = await client.getFlushAllState({
  flush_all_tss: flushAll.flush_all_tss,
});
console.log('Flush-all completed:', state.flushed);

Use flushAllSync() to trigger flush-all and poll until completion:

const result = await client.flushAllSync();
console.log('Flush-all completed:', result.flushed);

flushAll() and getFlushAllState() also accept the common request options such as timeout. The legacy db_name and flush_all_ts fields are still accepted for compatibility, but flush_all_tss is preferred.

Compaction merges small segments to improve query performance.

const result = await client.compact({
  collection_name: 'my_collection',
});
console.log('Compaction ID:', result.compactionID);

Pass advanced compaction options when you need to target specific partitions, channels, or segments:

const result = await client.compact({
  collection_name: 'my_collection',
  timetravel: 0,
  majorCompaction: true,
  partition_id: '12345',
  channel: 'by-dev-rootcoord-dml_0',
  segment_ids: [111, 222],
  l0Compaction: false,
  target_size: '536870912', // bytes
});

Common advanced fields:

| Field | Description |
| --- | --- |
| timetravel | Timestamp boundary for compaction |
| majorCompaction | Request major compaction |
| partition_id | Target a specific partition |
| channel | Target a specific DML channel |
| segment_ids | Compact specific segments |
| l0Compaction | Request L0 compaction |
| target_size | Target segment size in bytes |
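
Because target_size is expressed in bytes, a small conversion helper can make the intended size easier to read. The sketch below is a hypothetical helper (mibToByteString is not an SDK function) and assumes the value should be passed as a decimal string, as in the example above.

```javascript
// Hypothetical helper: converts a size in MiB to the byte-count string
// used for target_size in the example above.
function mibToByteString(mib) {
  if (!Number.isInteger(mib) || mib <= 0) {
    throw new RangeError('mib must be a positive integer');
  }
  // BigInt keeps the result exact even for very large sizes.
  return (BigInt(mib) * 1024n * 1024n).toString();
}

console.log(mibToByteString(512)); // prints 536870912
```

With this helper, the value in the advanced example could be written as target_size: mibToByteString(512).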

Check compaction status:

const state = await client.getCompactionState({
  compactionID: 12345,
});
console.log('State:', state.state);

Get detailed compaction information:

const plans = await client.getCompactionStateWithPlans({
  compactionID: 12345,
});
console.log('Plans:', plans.mergeInfos);
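
Compaction runs in the background, so a caller that needs to wait for it can poll getCompactionState(). The following is a minimal sketch, not the SDK's own mechanism. It assumes a finished compaction reports the state 'Completed' (or the numeric enum value 2), which you should verify against your SDK version. The names waitForCompaction, intervalMs, and maxPolls are hypothetical.

```javascript
// Hypothetical helper: polls getCompactionState() a bounded number of times.
// Assumption: a finished compaction reports state 'Completed' (string) or
// 2 (numeric enum value); check this against the SDK version you use.
async function waitForCompaction(client, compactionID, { intervalMs = 1000, maxPolls = 60 } = {}) {
  for (let attempt = 1; attempt <= maxPolls; attempt++) {
    const { state } = await client.getCompactionState({ compactionID });
    if (state === 'Completed' || state === 2) {
      return { done: true, attempts: attempt };
    }
    // Sleep between polls, but not after the final attempt.
    if (attempt < maxPolls) {
      await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
  }
  return { done: false, attempts: maxPolls };
}
```

A typical call would pass the compactionID returned by compact(), for example waitForCompaction(client, result.compactionID).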

Import data from files.

const result = await client.bulkInsert({
  collection_name: 'my_collection',
  files: ['/path/to/data.json', '/path/to/data2.json'],
});
console.log('Task ID:', result.tasks[0].taskID);

Check import task status:

const tasks = await client.listImportTasks({
  collection_name: 'my_collection',
});
tasks.tasks.forEach(task => {
  console.log('Task ID:', task.taskID);
  console.log('State:', task.state);
  console.log('Progress:', task.progress);
});
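
When a collection has many import tasks, grouping them by state gives a quicker overview than logging each one. The sketch below uses only the taskID and state fields shown above. groupTasksByState is a hypothetical helper, and it makes no assumption about which state values the server reports.

```javascript
// Hypothetical helper: groups import task IDs by their state field.
// Only taskID and state from the example above are used.
function groupTasksByState(tasks) {
  const groups = new Map();
  for (const task of tasks) {
    const key = String(task.state);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(task.taskID);
  }
  return groups;
}
```

Calling groupTasksByState(tasks.tasks) on the response above yields a Map from each state to the list of task IDs in that state.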

Balance data across nodes:

await client.loadBalance({
  src_nodeID: 1,
  dst_nodeIDs: [2, 3],
  sealed_segmentIDs: [12345, 12346],
});

Get information about query segments:

const info = await client.getQuerySegmentInfo({
  collectionName: 'my_collection',
  dbName: 'default',
});
info.infos.forEach(segment => {
  console.log('Segment ID:', segment.segmentID);
  console.log('State:', segment.state);
  console.log('Level:', segment.level);
});

Note that the segment info APIs use the proto field names collectionName and dbName rather than collection_name and db_name.

Get information about persistent segments:

const info = await client.getPersistentSegmentInfo({
  collectionName: 'my_collection',
  dbName: 'default',
});
info.infos.forEach(segment => {
  console.log('Segment ID:', segment.segmentID);
  console.log('State:', segment.state);
  console.log('Level:', segment.level);
});
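
Comparing the two responses can show which persisted segments are not currently reported by the query nodes. The sketch below relies only on the segmentID field. findUnloadedSegments is a hypothetical helper, and it compares IDs as strings on the assumption that 64-bit IDs may arrive as either numbers or strings.

```javascript
// Hypothetical helper: returns the IDs of persistent segments that do not
// appear in the query segment list. IDs are normalized to strings because
// 64-bit IDs may be delivered as numbers or strings (an assumption).
function findUnloadedSegments(persistentInfos, queryInfos) {
  const loaded = new Set(queryInfos.map(segment => String(segment.segmentID)));
  return persistentInfos
    .map(segment => String(segment.segmentID))
    .filter(id => !loaded.has(id));
}
```

Pass the infos arrays from getPersistentSegmentInfo() and getQuerySegmentInfo() as the first and second arguments.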

Get system metrics:

const metrics = await client.getMetric({
  request: {
    metric_type: 'system_info',
  },
});
console.log('Metrics:', metrics);

Best practices:

1. Flush regularly: flush after bulk insertions.
2. Monitor compaction: check the compaction state for large collections.
3. Use bulk import: use bulk import for large datasets.
4. Balance load: use load balancing in distributed deployments.
5. Monitor segments: check segment information to find optimization opportunities.
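
As an illustration of the first practice, the sketch below inserts several batches and flushes once at the end instead of after every batch. insertBatchesThenFlush is a hypothetical helper. insert() is not covered in this guide, and the sketch assumes it accepts collection_name and data. The flushed field is read as in the flushSync() example above.

```javascript
// Hypothetical helper: inserts all batches, then issues a single flush.
// Assumption: client.insert() accepts { collection_name, data }.
async function insertBatchesThenFlush(client, collectionName, batches) {
  let submitted = 0;
  for (const data of batches) {
    await client.insert({ collection_name: collectionName, data });
    submitted += data.length; // rows handed to insert(), not rows confirmed
  }
  // One flush for the whole load instead of one per batch.
  const result = await client.flushSync({ collection_names: [collectionName] });
  return { submitted, flushed: Boolean(result.flushed) };
}
```

The returned submitted count reflects rows passed to insert(). It does not verify what the server actually stored.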