API reference

Client

Constructor

Constructor              Description
Milvus()                 Milvus client.

Methods

API                      Description
create_collection()      Create a collection.
has_collection()         Check if a collection exists.
get_collection_info()    Obtain collection information.
count_entities()         Obtain the number of entities in a collection.
list_collections()       Get the list of collections.
get_collection_stats()   Obtain collection statistics.
load_collection()        Load a collection from disk to memory.
drop_collection()        Drop a collection.
insert()                 Insert entities into a specified collection.
get_entity_by_id()       Obtain entities by providing entity IDs.
list_id_in_segment()     Obtain the list of IDs in a specified segment.
create_index()           Create an index on a specified field.
drop_index()             Drop the index on a specified field.
create_partition()       Create a partition under a specified collection.
has_partition()          Check if a specified partition exists under a collection.
list_partitions()        Obtain the list of partitions under a collection.
drop_partition()         Drop a specified partition under a collection.
search()                 Search approximate nearest entities.
search_in_segment()      Search approximate nearest entities in specified segments.
delete_entity_by_id()    Delete entities by providing entity IDs.
flush()                  Flush collection data from memory to storage.
compact()                Compact a specified collection.

APIs

class milvus.Milvus(host=None, port=None, handler='GRPC', pool='SingletonThread', **kwargs)

compact(collection_name, threshold=0.2, timeout=None, **kwargs)

Compacts a specified collection. After deleting some data in a segment, you can call compact to free up the disk space occupied by the deleted data. Calling compact also deletes empty segments, but does not merge segments.

Parameters
  • collection_name (str) -- The name of the collection to compact.

  • threshold -- The threshold for compaction. When the percentage of deleted entities in a segment is below the threshold, the server skips this segment when compacting the collection. The default value is 0.2; the valid range is [0, 1].

Returns

Status of the compact request. The request still executes successfully if the server skips some segments; in this case, the returned status differs. Note that in the current version this is an EXPERIMENTAL function.

Return type

Status.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
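The delete-then-compact workflow described for compact() can be sketched as follows. This is a minimal illustration, not part of the library: the helper name `delete_and_compact` is hypothetical, `client` is assumed to be a connected `Milvus` instance, and flushing before compacting is an assumed ordering rather than something this reference requires.

```python
def delete_and_compact(client, collection_name, ids, threshold=0.2):
    """Delete entities by ID, then ask the server to reclaim their disk space.

    `client` is assumed to be a connected milvus.Milvus instance
    (hypothetical helper, not part of the library).
    """
    # The reference gives [0, 1] as the valid range for threshold.
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be in the range [0, 1]")

    client.delete_entity_by_id(collection_name, ids)
    # Assumption: flush first so the deletion is persisted before compacting.
    client.flush([collection_name])
    # Segments whose share of deleted entities is below threshold are skipped.
    return client.compact(collection_name, threshold=threshold)
```

Checking the threshold on the client side only gives an earlier, clearer error; the server presumably still validates it.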

count_entities(collection_name, timeout=30)

Returns the number of entities in a specified collection.

Parameters

collection_name (str) -- The name of the collection to count entities of.

Returns

The number of entities

Return type

int

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

create_collection(collection_name, fields, timeout=30)

Creates a collection.

Parameters

  • collection_name (str) -- The name of the collection. A collection name can only include numbers, letters, and underscores, and must not begin with a number.

  • fields (str) -- Field parameters.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
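The naming rule for collection_name can be checked on the client side before calling create_collection(). The sketch below is only an illustration: `is_valid_collection_name` is a hypothetical helper, and it assumes that "letters" means ASCII letters, which this reference does not state explicitly.

```python
import re

# Letters, digits, and underscores only; the first character must not be a digit.
# Assumption: "letters" is read as ASCII letters.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_collection_name(name):
    """Return True if `name` follows the documented collection naming rule."""
    return isinstance(name, str) and _NAME_PATTERN.fullmatch(name) is not None
```

For instance, "demo_1" passes the check, while "1demo" (starts with a number) and "my-collection" (contains a hyphen) do not.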

create_index(collection_name, field_name, params, timeout=None, **kwargs)

Creates an index for a field in a specified collection. Milvus does not support creating multiple indexes for a field. In a scenario where the field already has an index, if you create another one that is equivalent (in terms of type and parameters) to the existing one, the server returns this index to the client; otherwise, the server replaces the existing index with the new one.

Parameters
  • collection_name (str) -- The name of the collection to create the field index in.

  • field_name (str) -- The name of the field to create an index for.

  • params (dict) -- Indexing parameters.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
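The one-index-per-field rule described for create_index() can be pictured with a small toy model. This is not server code: `resolve_index` is a hypothetical function, and the dictionary keys used to describe an index are illustrative rather than the actual params format.

```python
def resolve_index(existing, requested):
    """Toy model of the documented rule for a field that may already have an index.

    Returns a tuple (index_in_effect, replaced).
    """
    if existing is None:
        # No index yet: the requested one is simply created.
        return requested, False
    if existing == requested:
        # Equivalent type and parameters: the existing index is returned as-is.
        return existing, False
    # Different type or parameters: the new index replaces the old one.
    return requested, True
```

For example, requesting an index identical to the current one leaves it untouched, while changing any parameter yields a replacement.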

create_partition(collection_name, partition_tag, timeout=30)

Creates a partition in a specified collection. You only need to pass the partition_tag parameter to create a partition. A collection cannot hold two partitions with the same tag, but the same tag can be used in different collections.

Parameters
  • collection_name (str) -- The name of the collection to create partitions in.

  • partition_tag (str) -- The tag name of the partition.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
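Because a collection cannot hold two partitions with the same tag, a caller may want to check for the tag before creating it. The following is a minimal sketch using has_partition() and create_partition() as documented in this reference; `ensure_partition` is a hypothetical helper and `client` is assumed to be a connected `Milvus` instance.

```python
def ensure_partition(client, collection_name, partition_tag):
    """Create the partition only if its tag is not already used in the collection.

    Returns True if a partition was created, False if it already existed.
    `client` is assumed to be a connected milvus.Milvus instance.
    """
    if client.has_partition(collection_name, partition_tag):
        return False
    client.create_partition(collection_name, partition_tag)
    return True
```

Note that the check and the creation are two separate requests, so another client could still create the same tag in between; this sketch does not guard against that.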

delete_entity_by_id(collection_name, ids, timeout=None)

Deletes the entities specified by a given list of IDs.

Parameters
  • collection_name (str) -- The name of the collection to remove entities from.

  • ids (list[int]) -- A list of IDs of the entities to delete.

Returns

Status of the delete request. The request still executes successfully if some of the IDs do not exist in the specified collection; in this case, the returned status differs. Note that in the current version this is an EXPERIMENTAL function.

Return type

Status.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

drop_collection(collection_name, timeout=30)

Deletes a specified collection.

Parameters

collection_name (str) -- The name of the collection to delete.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

drop_index(collection_name, field_name, timeout=30)

Removes the index of a field in a specified collection.

Parameters
  • collection_name (str) -- The name of the collection to remove the field index from.

  • field_name (str) -- The name of the field to remove the index of.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

drop_partition(collection_name, partition_tag, timeout=30)

Deletes a specified partition in a collection.

Parameters
  • collection_name (str) -- The name of the collection to delete partitions from.

  • partition_tag (str) -- The tag name of the partition to delete.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

flush(collection_name_array=None, timeout=None, **kwargs)

Flushes data in the specified collections from memory to disk. When you insert or delete data, the server stores the data in the memory temporarily and then flushes it to the disk at fixed intervals. Calling flush ensures that the newly inserted data is visible and the deleted data is no longer recoverable.

Parameters

collection_name_array (list[str]) -- An array of names of the collections to flush.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
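Since flush() is what makes newly inserted data visible, an insert is often followed by an explicit flush before counting or searching. The sketch below strings together insert(), flush(), and count_entities() as documented here; `insert_and_flush` is a hypothetical helper, `client` is assumed to be a connected `Milvus` instance, and the layout of `entities` is passed through unchanged because this reference does not describe it.

```python
def insert_and_flush(client, collection_name, entities, partition_tag=None):
    """Insert entities, flush the collection, and return (ids, entity_count).

    `client` is assumed to be a connected milvus.Milvus instance.
    """
    ids = client.insert(collection_name, entities, partition_tag=partition_tag)
    # Without this call the server flushes on its own at fixed intervals,
    # so the new entities might not be visible yet.
    client.flush([collection_name])
    return ids, client.count_entities(collection_name)
```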

get_collection_info(collection_name, timeout=30)

Returns information of a specified collection, including field information of the collection and index information of fields.

Parameters

collection_name (str) -- The name of the collection to describe.

Returns

Information about the specified collection.

Return type

dict

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

get_collection_stats(collection_name, timeout=30)

Returns statistical information about a specified collection, including the number of entities and the storage size of each segment of the collection.

Parameters

collection_name (str) -- The name of the collection to get statistics about.

Returns

The collection stats.

Return type

dict

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

get_entity_by_id(collection_name, ids, fields=None, timeout=None)

Returns the entities specified by given IDs.

Parameters
  • collection_name (str) -- The name of the collection to retrieve entities from.

  • ids (list[int]) -- A list of IDs of the entities to retrieve.

Returns

The entities specified by given IDs.

Return type

Entities

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

has_collection(collection_name, timeout=30)

Checks whether a specified collection exists.

Parameters

collection_name (str) -- The name of the collection to check.

Returns

Whether the specified collection exists.

Return type

bool

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

has_partition(collection_name, partition_tag, timeout=30)

Checks if a specified partition exists in a collection.

Parameters
  • collection_name (str) -- The name of the collection to find the partition in.

  • partition_tag (str) -- The tag name of the partition to check

Returns

Whether a specified partition exists in a collection.

Return type

bool

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

insert(collection_name, entities, ids=None, partition_tag=None, params=None, timeout=None, **kwargs)

Inserts entities in a specified collection.

Parameters
  • collection_name (str) -- The name of the collection to insert entities in.

  • entities (list) -- The entities to insert.

  • ids (list[int]) -- The list of ids corresponding to the inserted entities.

  • partition_tag (str) -- The name of the partition to insert entities in. The default value is None. The server stores entities in the “_default” partition by default.

Returns

List of IDs of the inserted entities.

Return type

list[int]

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

list_collections(timeout=30)

Returns a list of all collection names.

Returns

List of collection names, returned when the operation is successful.

Return type

list[str]

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

list_id_in_segment(collection_name, segment_id, timeout=None)

Returns all entity IDs in a specified segment.

Parameters
  • collection_name (str) -- The name of the collection that contains the specified segment

  • segment_id (int) -- The ID of the segment. You can get segment IDs by calling the get_collection_stats() method.

Returns

List of IDs in a specified segment.

Return type

list[int]

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
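When several segments are of interest, list_id_in_segment() can be called once per segment. The sketch below assumes the segment IDs have already been obtained (for example from get_collection_stats(), whose result layout is not described in this reference and is therefore not parsed here). `ids_by_segment` is a hypothetical helper and `client` is assumed to be a connected `Milvus` instance.

```python
def ids_by_segment(client, collection_name, segment_ids):
    """Return a dict mapping each segment ID to the entity IDs it contains.

    `client` is assumed to be a connected milvus.Milvus instance.
    """
    return {
        segment_id: client.list_id_in_segment(collection_name, segment_id)
        for segment_id in segment_ids
    }
```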

list_partitions(collection_name, timeout=30)

Returns a list of all partition tags in a specified collection.

Parameters

collection_name (str) -- The name of the collection to retrieve partition tags from.

Returns

A list of all partition tags in specified collection.

Return type

list[str]

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

load_collection(collection_name, timeout=None)

Loads a specified collection from disk to memory.

Parameters

collection_name (str) -- The name of the collection to load.

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.

search(collection_name, dsl, partition_tags=None, fields=None, timeout=None, **kwargs)

Searches a collection based on the given DSL clauses and returns query results.

Parameters
  • collection_name (str) -- The name of the collection to search.

  • dsl (dict) -- The DSL that defines the query.

  • partition_tags (list[str]) -- The tags of partitions to search.

  • fields (list[str]) -- The fields to return in the search result

Returns

Query result.

Return type

QueryResult

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.
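As one possible way to use the partition_tags parameter of search(), the sketch below restricts a search to the requested partitions that actually exist in the collection. It relies only on list_partitions() and search() as documented here. `search_existing_partitions` is a hypothetical helper, `client` is assumed to be a connected `Milvus` instance, and the DSL dictionary is passed through untouched because its structure is not covered in this reference. Returning None when no requested tag exists is a design choice of this sketch, not library behavior.

```python
def search_existing_partitions(client, collection_name, dsl, wanted_tags, fields=None):
    """Search only the requested partitions that exist in the collection.

    Returns the query result, or None if none of the requested tags exist.
    `client` is assumed to be a connected milvus.Milvus instance.
    """
    available = set(client.list_partitions(collection_name))
    # Keep the caller's order and drop tags the collection does not have.
    tags = [tag for tag in wanted_tags if tag in available]
    if not tags:
        return None
    return client.search(collection_name, dsl, partition_tags=tags, fields=fields)
```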

search_in_segment(collection_name, segment_ids, dsl, fields=None, timeout=None, **kwargs)

Searches in the specified segments of a collection.

The Milvus server stores entity data in multiple files. Searching for entities in specific files is a method used in Mishards. For more details about Mishards, see https://github.com/milvus-io/milvus/tree/master/shards.

Parameters
  • collection_name (str) -- The name of the collection to search.

  • segment_ids (list[int]) -- The IDs of the segments to search.

  • dsl (dict) -- The DSL that defines the query.

  • fields (list[str]) -- The fields to return in the search result.

Returns

Query result.

Return type

QueryResult

Raises

  • RpcError -- If gRPC encounters an error.

  • ParamError -- If parameters are invalid.

  • BaseException -- If the result returned from the server is not OK.