Data Types & Schemas
Milvus supports scalar, vector, and complex data types. Choosing the right type for each field is the basis of an effective collection schema.
Supported Data Types
Scalar Types
Integer Types
- Int8: 8-bit signed integer (-128 to 127)
- Int16: 16-bit signed integer (-32,768 to 32,767)
- Int32: 32-bit signed integer
- Int64: 64-bit signed integer
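The ranges above can guide the choice of integer type for a field. The following is a minimal sketch, assuming the values arrive as ordinary JavaScript numbers; `smallestIntegerType` is a hypothetical helper written for illustration, not an SDK function.

```typescript
// Hypothetical helper (not part of the Milvus SDK): pick the narrowest
// signed integer type whose range contains a given JavaScript number.
type IntegerTypeName = 'Int8' | 'Int16' | 'Int32' | 'Int64';

// Largest integer a JS number can represent exactly (2^53 - 1).
const MAX_SAFE = 9007199254740991;

function smallestIntegerType(value: number): IntegerTypeName {
  // Reject NaN, fractions, infinities, and integers that a JS number
  // cannot represent exactly.
  if (Math.floor(value) !== value || Math.abs(value) > MAX_SAFE) {
    throw new RangeError(value + ' is not a safe integer');
  }
  if (value >= -128 && value <= 127) return 'Int8';
  if (value >= -32768 && value <= 32767) return 'Int16';
  if (value >= -2147483648 && value <= 2147483647) return 'Int32';
  return 'Int64';
}

console.log(smallestIntegerType(100));   // Int8
console.log(smallestIntegerType(40000)); // Int32
```

A JavaScript number cannot represent every Int64 value exactly, which is why the sketch rejects anything beyond the safe-integer range instead of guessing.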
import { DataType } from '@zilliz/milvus2-sdk-node';

{
  name: 'age',
  data_type: DataType.Int64,
}

Floating Point Types
- Float: 32-bit floating point number
- Double: 64-bit floating point number

{
  name: 'score',
  data_type: DataType.Float,
}

Boolean Type
- Bool: Boolean value (true/false)

{
  name: 'is_active',
  data_type: DataType.Bool,
}

String Types
- VarChar: Variable-length string (requires max_length)

{
  name: 'title',
  data_type: DataType.VarChar,
  max_length: 256,
}

JSON Type
- JSON: JSON object

{
  name: 'metadata',
  data_type: DataType.JSON,
}

Geometry
Geospatial data stored in WKT (Well-Known Text) format.

{
  name: 'location',
  data_type: DataType.Geometry,
}

// Insert as WKT string
{ location: 'POINT(121.47 31.23)' }

Timestamptz
Timestamp with timezone information.

{
  name: 'created_at',
  data_type: DataType.Timestamptz,
}

Struct
Nested structured data type for complex objects.

{
  name: 'metadata',
  data_type: DataType.Struct,
}

Vector Types
FloatVector
Dense float vector (most common):

{
  name: 'embedding',
  data_type: DataType.FloatVector,
  dim: 128, // Required: dimension of the vector
}

BinaryVector
Binary vector (for binary embeddings):

{
  name: 'binary_embedding',
  data_type: DataType.BinaryVector,
  dim: 128, // Must be multiple of 8
}

SparseFloatVector
Sparse vectors store only non-zero values with their indices. They are commonly used for BM25 full-text search and keyword-based retrieval.

{
  name: 'sparse_vector',
  data_type: DataType.SparseFloatVector,
}

Sparse vectors can be inserted in two formats:

Dictionary format (recommended):

// Keys are indices, values are float weights
{ 10: 0.5, 100: 0.3, 500: 0.8, 1200: 0.1 }

Array of tuples format:

// [index, value] pairs
[
  [10, 0.5],
  [100, 0.3],
  [500, 0.8],
  [1200, 0.1],
];

Note: Dimension is determined automatically from the maximum index across all vectors.
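The two sparse formats carry the same information, so converting between them is mechanical. The sketch below illustrates this with hypothetical helpers (`dictToPairs`, `pairsToDict`, and `maxIndex` are not SDK functions) and also finds the largest index across a batch, which is the quantity the note above refers to.

```typescript
// Hypothetical helpers for illustration; none of these are SDK functions.
type SparseDict = { [index: number]: number };
type SparsePairs = Array<[number, number]>;

// Dictionary format -> [index, value] pairs, sorted by index.
function dictToPairs(dict: SparseDict): SparsePairs {
  return Object.keys(dict)
    .map((key): [number, number] => [Number(key), dict[Number(key)]])
    .sort((a, b) => a[0] - b[0]);
}

// [index, value] pairs -> dictionary format.
function pairsToDict(pairs: SparsePairs): SparseDict {
  const dict: SparseDict = {};
  for (const pair of pairs) {
    dict[pair[0]] = pair[1];
  }
  return dict;
}

// Largest index used by any vector in the batch (-1 if all are empty).
function maxIndex(vectors: SparseDict[]): number {
  let max = -1;
  for (const vector of vectors) {
    for (const key of Object.keys(vector)) {
      max = Math.max(max, Number(key));
    }
  }
  return max;
}

const pairs = dictToPairs({ 10: 0.5, 100: 0.3, 500: 0.8, 1200: 0.1 });
console.log(pairs);                    // the four [index, value] pairs, ordered by index
console.log(maxIndex([{ 10: 0.5 }, { 1200: 0.1 }])); // 1200
```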
Float16Vector
16-bit float vector:

{
  name: 'f16_embedding',
  data_type: DataType.Float16Vector,
  dim: 128,
}

BFloat16Vector
BFloat16 vector:

{
  name: 'bf16_embedding',
  data_type: DataType.BFloat16Vector,
  dim: 128,
}

Int8Vector
8-bit integer vector:

{
  name: 'int8_embedding',
  data_type: DataType.Int8Vector,
  dim: 128,
}

Nullable Vector Fields
Vector fields can be marked as nullable. Insert or upsert null for rows that do not have an embedding yet; query and search results return null for those fields.

{
  name: 'optional_embedding',
  data_type: DataType.FloatVector,
  dim: 128,
  nullable: true,
}

await client.insert({
  collection_name: 'my_collection',
  data: [
    { id: 1, optional_embedding: [/* 128 floats */] },
    { id: 2, optional_embedding: null },
  ],
});

Nullable vector payloads are supported for FloatVector, BinaryVector, Float16Vector, BFloat16Vector, SparseFloatVector, and Int8Vector.
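When source records only sometimes carry an embedding, the insert payload can be built by mapping missing embeddings to an explicit null. This is a rough sketch under stated assumptions: `SourceRecord`, `InsertRow`, and `toInsertRows` are hypothetical names, and the field name `optional_embedding` is taken from the example above.

```typescript
// Hypothetical shapes for illustration; only the field names `id` and
// `optional_embedding` come from the example above.
interface SourceRecord {
  id: number;
  embedding?: number[];
}

interface InsertRow {
  id: number;
  optional_embedding: number[] | null;
}

// Map records to insert rows: a missing embedding becomes an explicit null,
// and a present one is checked against the field's dimension.
function toInsertRows(records: SourceRecord[], dim: number): InsertRow[] {
  return records.map((record): InsertRow => {
    if (record.embedding === undefined) {
      return { id: record.id, optional_embedding: null };
    }
    if (record.embedding.length !== dim) {
      throw new Error(
        'Record ' + record.id + ': expected ' + dim + ' values, got ' + record.embedding.length
      );
    }
    return { id: record.id, optional_embedding: record.embedding };
  });
}

// A 2-dimensional field keeps the example short; the field above uses dim 128.
const rows = toInsertRows([{ id: 1, embedding: [0.1, 0.2] }, { id: 2 }], 2);
console.log(rows[1].optional_embedding); // null
```

The resulting array has the same shape as the `data` argument in the insert call shown above.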
Complex Types
Array of scalar values:

{
  name: 'tags',
  data_type: DataType.Array,
  element_type: DataType.VarChar,
  max_capacity: 100,
}

ArrayOfVector
DataType.ArrayOfVector stores multiple vectors in a single field value. This is useful for documents split into chunks, multi-vector embeddings, or struct array payloads that contain vector arrays.

{
  name: 'chunk_embeddings',
  data_type: DataType.ArrayOfVector,
  element_type: DataType.FloatVector,
  dim: 128,
  max_capacity: 32,
}

await client.insert({
  collection_name: 'my_collection',
  data: [
    {
      id: 1,
      chunk_embeddings: [
        [/* first 128-d vector */],
        [/* second 128-d vector */],
      ],
    },
  ],
});

Supported element vector types include dense, binary, float16, bfloat16, sparse, and int8 vector payloads where supported by Milvus.
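Before inserting, it can help to check a vector array against the field definition locally. The sketch below assumes that `dim` applies to every element vector and that `max_capacity` caps the number of vectors per value; `checkVectorArray` is a hypothetical helper, and the SDK and server perform their own validation.

```typescript
// Hypothetical pre-insert check for a dense ArrayOfVector value.
// Assumes `dim` applies to each element and `maxCapacity` limits the count.
function checkVectorArray(
  vectors: number[][],
  dim: number,
  maxCapacity: number
): string[] {
  const problems: string[] = [];
  if (vectors.length > maxCapacity) {
    problems.push('has ' + vectors.length + ' vectors, max_capacity is ' + maxCapacity);
  }
  vectors.forEach((vector, i) => {
    if (vector.length !== dim) {
      problems.push('vector ' + i + ' has ' + vector.length + ' values, dim is ' + dim);
    }
  });
  return problems;
}

// Two 3-dimensional chunks, but the second one is truncated.
console.log(checkVectorArray([[0.1, 0.2, 0.3], [0.4, 0.5]], 3, 32));
```

Returning a list of problems instead of throwing on the first one makes it easier to report every issue in a row at once.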
Struct
Nested structure:

{
  name: 'user_info',
  data_type: DataType.Struct,
  element_type: {
    name: 'name',
    data_type: DataType.VarChar,
    max_length: 100,
  },
}

Field Schema Definition
Each field in a collection schema is defined by the following properties. Only name and data_type are always required; the others are optional or depend on the data type:

{
  name: 'field_name',               // Required: field name
  data_type: DataType.Int64,        // Required: data type
  description: 'Field description', // Optional
  is_primary_key: false,            // Optional: primary key flag
  autoID: false,                    // Optional: auto-generate IDs
  max_length: 256,                  // Required for VarChar
  dim: 128,                         // Required for vector types (not used by SparseFloatVector)
  nullable: false,                  // Optional: allow null values
  default_value: undefined,         // Optional scalar default value
  external_field: undefined,        // Optional: source field name for external collections
}

Primary Key Fields
Every collection must have exactly one primary key field:

{
  name: 'id',
  data_type: DataType.Int64,
  is_primary_key: true,
  autoID: true, // Let Milvus generate IDs automatically
}

Or with manual IDs:

{
  name: 'user_id',
  data_type: DataType.Int64,
  is_primary_key: true,
  autoID: false, // You provide IDs
}

Collection Schema Creation
Section titled “Collection Schema Creation”Basic Schema
import { MilvusClient, DataType } from '@zilliz/milvus2-sdk-node';

// Assumed address of a local Milvus instance; change it for your deployment.
const client = new MilvusClient({ address: 'localhost:19530' });

const schema = [
  {
    name: 'id',
    data_type: DataType.Int64,
    is_primary_key: true,
    autoID: true,
  },
  {
    name: 'vector',
    data_type: DataType.FloatVector,
    dim: 128,
  },
  {
    name: 'text',
    data_type: DataType.VarChar,
    max_length: 256,
  },
];

await client.createCollection({
  collection_name: 'my_collection',
  fields: schema,
});

Schema with Multiple Vector Fields
const schema = [
  {
    name: 'id',
    data_type: DataType.Int64,
    is_primary_key: true,
    autoID: true,
  },
  {
    name: 'text_vector',
    data_type: DataType.FloatVector,
    dim: 768,
  },
  {
    name: 'image_vector',
    data_type: DataType.FloatVector,
    dim: 512,
  },
  {
    name: 'metadata',
    data_type: DataType.JSON,
  },
];

Schema with Consistency Level
await client.createCollection({
  collection_name: 'my_collection',
  fields: schema,
  consistency_level: 'Bounded', // 'Strong', 'Session', 'Bounded', 'Eventually'
});

Dynamic Schema
Enable dynamic schema to add fields without redefining the schema:

await client.createCollection({
  collection_name: 'my_collection',
  fields: [
    {
      name: 'id',
      data_type: DataType.Int64,
      is_primary_key: true,
      autoID: true,
    },
    {
      name: 'vector',
      data_type: DataType.FloatVector,
      dim: 128,
    },
  ],
  enable_dynamic_field: true, // Enable dynamic fields
});

With dynamic schema enabled, you can insert data with additional fields:

await client.insert({
  collection_name: 'my_collection',
  data: [
    {
      vector: [ /* ... */ ],
      dynamic_field_1: 'value1', // Automatically added
      dynamic_field_2: 123,      // Automatically added
    },
  ],
});

Schema Validation
The SDK validates schemas before creating collections. Common validation errors:
- Missing primary key field
- Multiple primary key fields
- Missing dimension for vector fields
- Missing max_length for VarChar fields
- Invalid dimension for BinaryVector (must be multiple of 8)
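The rules above can be approximated with a small local check, which is handy for catching mistakes in tests before any call reaches Milvus. This is a minimal sketch, not the SDK's actual validation logic: `FieldSketch`, `validateSchema`, and the string type names are hypothetical stand-ins chosen so the example runs without the SDK installed.

```typescript
// Stand-in field shape using string type names; real code would use
// DataType from '@zilliz/milvus2-sdk-node' instead.
interface FieldSketch {
  name: string;
  data_type: string;
  is_primary_key?: boolean;
  dim?: number;
  max_length?: number;
}

// Vector types that take a `dim` in the examples on this page.
const TYPES_WITH_DIM = [
  'FloatVector', 'BinaryVector', 'Float16Vector', 'BFloat16Vector', 'Int8Vector',
];

// Returns a list of human-readable problems; an empty list means the
// schema passed every check in this sketch.
function validateSchema(fields: FieldSketch[]): string[] {
  const errors: string[] = [];
  const primaryKeys = fields.filter((f) => f.is_primary_key === true);
  if (primaryKeys.length === 0) errors.push('Missing primary key field');
  if (primaryKeys.length > 1) errors.push('Multiple primary key fields');

  for (const field of fields) {
    const needsDim = TYPES_WITH_DIM.indexOf(field.data_type) !== -1;
    if (needsDim && field.dim === undefined) {
      errors.push(field.name + ': missing dim');
    }
    if (field.data_type === 'VarChar' && field.max_length === undefined) {
      errors.push(field.name + ': missing max_length');
    }
    if (field.data_type === 'BinaryVector' && field.dim !== undefined && field.dim % 8 !== 0) {
      errors.push(field.name + ': dim must be a multiple of 8');
    }
  }
  return errors;
}

console.log(validateSchema([
  { name: 'id', data_type: 'Int64' },
  { name: 'title', data_type: 'VarChar' },
  { name: 'bits', data_type: 'BinaryVector', dim: 12 },
]));
```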
Schema Best Practices
- Choose appropriate data types: Use Int64 for IDs, FloatVector for embeddings
- Set reasonable max_length: For VarChar fields, set max_length based on expected content
- Use autoID: Let Milvus generate IDs unless you have specific requirements
- Enable dynamic schema: For flexible schemas that may change over time
- Document fields: Use descriptions to document field purposes
Next Steps
- Learn about Collection Management
- Explore Data Operations
- Check out Best Practices