
Deployment Guide

This guide gives an overview of deploying applications that use the Milvus Node.js SDK to serverless platforms, Docker containers, and traditional servers.

The Milvus SDK provides two clients:

  • MilvusClient (gRPC): Default client, best for traditional server deployments
  • HttpClient: Recommended for serverless platforms

Use HttpClient when deploying to:

  • Serverless platforms (Vercel, AWS Lambda, Cloudflare Workers)
  • Edge computing environments
  • Firewall-restricted environments
  • HTTP-only infrastructures

Use MilvusClient (gRPC) for:

  • Traditional servers (Node.js, Express, NestJS)
  • Long-lived connections
  • Maximum performance requirements
  • Dedicated infrastructure
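
The selection rules above can be condensed into a small helper. This is only an illustrative sketch: `pickClientKind` and the platform labels are hypothetical names introduced here, not part of the SDK.

```javascript
// Hypothetical helper: maps a deployment target to the client this guide recommends.
// The platform labels below are illustrative strings, not SDK constants.
const HTTP_PLATFORMS = new Set(['vercel', 'aws-lambda', 'cloudflare-workers', 'edge']);

function pickClientKind(platform, { httpOnly = false } = {}) {
  // Serverless and edge targets, or networks limited to HTTP, use the HTTP client.
  if (httpOnly || HTTP_PLATFORMS.has(platform)) {
    return 'HttpClient';
  }
  // Long-lived servers on dedicated infrastructure default to gRPC.
  return 'MilvusClient';
}

console.log(pickClientKind('vercel'));
console.log(pickClientKind('express'));
console.log(pickClientKind('express', { httpOnly: true }));
```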

Deploy to Vercel serverless functions with minimal configuration.

Quick Start:

api/search.js
import { HttpClient } from '@zilliz/milvus2-sdk-node';

// Cache the client at module scope so warm invocations reuse it.
let client;

function getClient() {
  if (!client) {
    client = new HttpClient({
      baseURL: process.env.MILVUS_ENDPOINT,
      token: process.env.MILVUS_TOKEN,
      timeout: 10000,
    });
  }
  return client;
}

export default async function handler(req, res) {
  const milvusClient = getClient();
  const results = await milvusClient.search({
    collection_name: 'my_collection',
    data: req.body.vectors,
    limit: 10,
  });
  res.status(200).json({ results });
}
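
The handler above passes `req.body.vectors` to `search` without checking it. A minimal sketch of validating the payload first is shown below; `validateVectors` and its `expectedDim` parameter are hypothetical and not part of the SDK, and the dimension would have to match your own collection.

```javascript
// Hypothetical validation helper; not part of the Milvus SDK.
// Returns { ok: true, vectors } or { ok: false, message }.
function validateVectors(body, expectedDim) {
  const vectors = body && body.vectors;
  if (!Array.isArray(vectors) || vectors.length === 0) {
    return { ok: false, message: 'vectors must be a non-empty array' };
  }
  for (const v of vectors) {
    const isNumeric =
      Array.isArray(v) && v.every((x) => typeof x === 'number' && Number.isFinite(x));
    if (!isNumeric || v.length !== expectedDim) {
      return { ok: false, message: `each vector must contain ${expectedDim} finite numbers` };
    }
  }
  return { ok: true, vectors };
}

console.log(validateVectors({ vectors: [[0.1, 0.2]] }, 2).ok);
console.log(validateVectors({}, 2).message);
```

In a handler, a failed check could be answered with `res.status(400).json({ error: check.message })` before any request is sent to Milvus.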

Learn more: Vercel Deployment Guide

Deploy to Cloudflare Workers for edge computing.

Quick Start:

worker.js
import { HttpClient } from '@zilliz/milvus2-sdk-node';

export default {
  async fetch(request, env) {
    // Environment bindings arrive through `env`, so the client is built inside the handler.
    const client = new HttpClient({
      baseURL: env.MILVUS_ENDPOINT,
      token: env.MILVUS_TOKEN,
      timeout: 25000,
    });
    const results = await client.search({
      collection_name: 'my_collection',
      data: [/* vectors */],
      limit: 10,
    });
    return new Response(JSON.stringify({ results }), {
      headers: { 'Content-Type': 'application/json' },
    });
  },
};
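
The quick start leaves the query vectors as a placeholder. One possible way to read them from the incoming request is sketched below; `readVectors` is a hypothetical helper, and it assumes the caller sends a JSON body of the form `{ "vectors": [...] }` in a POST request.

```javascript
// Hypothetical helper for a Worker: extracts vectors from a POST request body.
// `request` only needs `method` and an async `json()` method, which the
// Fetch API Request object provides.
async function readVectors(request) {
  if (request.method !== 'POST') {
    return { status: 405, error: 'Use POST' };
  }
  let body;
  try {
    body = await request.json();
  } catch {
    return { status: 400, error: 'Body must be valid JSON' };
  }
  if (!body || !Array.isArray(body.vectors)) {
    return { status: 400, error: 'Missing "vectors" array' };
  }
  return { status: 200, vectors: body.vectors };
}
```

Inside `fetch`, the result's `vectors` could be passed as `data` to `client.search`, and any other status returned to the caller as an error response.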

Learn more: Cloudflare Workers Deployment Guide

Deploy to AWS Lambda for scalable serverless functions.

Quick Start:

index.js
import { HttpClient } from '@zilliz/milvus2-sdk-node';

// Declared outside the handler so warm invocations reuse the same client.
let client;

function getClient() {
  if (!client) {
    client = new HttpClient({
      baseURL: process.env.MILVUS_ENDPOINT,
      token: process.env.MILVUS_TOKEN,
      timeout: 25000,
    });
  }
  return client;
}

export const handler = async (event) => {
  const milvusClient = getClient();
  const results = await milvusClient.search({
    collection_name: 'my_collection',
    data: JSON.parse(event.body).vectors,
    limit: 10,
  });
  return {
    statusCode: 200,
    body: JSON.stringify({ results }),
  };
};
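
`JSON.parse(event.body)` in the handler above throws when the body is missing or malformed. The sketch below shows one way to parse it defensively. `parseEventBody` is a hypothetical helper; `isBase64Encoded` is the flag that API Gateway proxy events set when the body is base64-encoded, and other event sources may shape the event differently.

```javascript
// Hypothetical helper; not part of the Milvus SDK or the AWS SDK.
// Returns the parsed body, or null when it is missing or not valid JSON.
function parseEventBody(event) {
  if (!event || event.body == null) {
    return null;
  }
  // Direct invocations may pass an already-parsed object instead of a string.
  if (typeof event.body !== 'string') {
    return event.body;
  }
  const raw = event.isBase64Encoded
    ? Buffer.from(event.body, 'base64').toString('utf8')
    : event.body;
  try {
    return JSON.parse(raw);
  } catch {
    return null; // malformed JSON
  }
}

console.log(parseEventBody({ body: '{"vectors":[[1,2]]}' }));
console.log(parseEventBody({ body: 'not json' }));
```

A handler could return `{ statusCode: 400, ... }` when the helper yields `null` rather than letting the exception surface as a 500.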

Learn more: AWS Lambda Deployment Guide

For traditional Node.js servers, you can use either client:

// Using gRPC Client (Recommended)
import { MilvusClient } from '@zilliz/milvus2-sdk-node';

const client = new MilvusClient({
  address: 'localhost:19530',
  token: 'your-token',
});
// Wait until the gRPC connection is established before issuing requests.
await client.connectPromise;

Alternatively, with the HTTP client (use one snippet or the other, since both declare `client`):

// Using HTTP Client
import { HttpClient } from '@zilliz/milvus2-sdk-node';

const client = new HttpClient({
  baseURL: 'http://localhost:19530',
  token: 'your-token',
});

Deploy in Docker containers:

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "index.js"]

Always use environment variables for sensitive configuration:

.env
MILVUS_ENDPOINT=https://your-instance.zillizcloud.com
MILVUS_TOKEN=your-api-token

Then read the values in your code:

const client = new HttpClient({
  baseURL: process.env.MILVUS_ENDPOINT,
  token: process.env.MILVUS_TOKEN,
});
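
A missing variable otherwise only shows up when the first request fails, so it can help to check the environment at startup. The sketch below assumes the two variable names from the `.env` example; `loadMilvusConfig` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical startup check; the variable names match the .env example in this guide.
function loadMilvusConfig(env) {
  const missing = ['MILVUS_ENDPOINT', 'MILVUS_TOKEN'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  // Shape matches the options passed to HttpClient elsewhere in this guide.
  return { baseURL: env.MILVUS_ENDPOINT, token: env.MILVUS_TOKEN };
}

const config = loadMilvusConfig({
  MILVUS_ENDPOINT: 'https://your-instance.zillizcloud.com',
  MILVUS_TOKEN: 'your-api-token',
});
console.log(config.baseURL);
```

In an application, the result of `loadMilvusConfig(process.env)` could be passed directly to `new HttpClient(...)`.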
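
Platform comparison:

| Platform           | Client Type  | Timeout Limit               | Best For                            |
|--------------------|--------------|-----------------------------|-------------------------------------|
| Vercel             | HTTP         | 10s (Hobby), 60s (Pro)      | Next.js apps, static sites          |
| Cloudflare Workers | HTTP         | 30s (free), 15min (paid)    | Edge computing, global distribution |
| AWS Lambda         | HTTP         | Up to 15 minutes            | Enterprise serverless               |
| Traditional Server | gRPC or HTTP | No limit                    | Long-lived applications             |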

Reuse client instances to improve performance:

let client;

function getClient() {
  if (!client) {
    client = new HttpClient({
      baseURL: process.env.MILVUS_ENDPOINT,
      token: process.env.MILVUS_TOKEN,
    });
  }
  return client;
}

Always handle errors appropriately:

try {
  const results = await client.search({ /* ... */ });
  return { success: true, results };
} catch (error) {
  console.error('Milvus error:', error);
  return {
    success: false,
    error: error.message,
    status: error.status || 500,
  };
}
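
To avoid repeating this try/catch around every call, the pattern can be factored into a wrapper. The sketch below is one possible shape; `runSafely` is a hypothetical name, and the stand-in operations replace real Milvus calls so the example runs on its own.

```javascript
// Hypothetical wrapper: runs any async operation and normalizes the outcome
// into the { success, ... } shape used in this guide.
async function runSafely(operation) {
  try {
    const results = await operation();
    return { success: true, results };
  } catch (error) {
    console.error('Milvus error:', error);
    return {
      success: false,
      error: error instanceof Error ? error.message : String(error),
      status: (error && error.status) || 500,
    };
  }
}

// Stand-in operations; in an application these would be calls such as client.search(...).
runSafely(async () => ['hit-1', 'hit-2']).then((outcome) => console.log(outcome.success));
runSafely(async () => {
  throw new Error('connection refused');
}).then((outcome) => console.log(outcome.status));
```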

Set timeouts based on platform limits:

const client = new HttpClient({
  baseURL: process.env.MILVUS_ENDPOINT,
  token: process.env.MILVUS_TOKEN,
  timeout: 10000, // Adjust based on platform
});
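
One way to derive the value is to start from the platform limit and leave some headroom so the function can still send an error response before it is terminated. In the sketch below the limits are taken from the platform comparison in this guide; the keys, the 1-second headroom, and the 30-second cap are illustrative assumptions, and `clientTimeout` is a hypothetical helper.

```javascript
// Limits in milliseconds, taken from the platform comparison in this guide.
// The keys are illustrative labels, not SDK or platform constants.
const PLATFORM_LIMIT_MS = {
  'vercel-hobby': 10000,
  'vercel-pro': 60000,
  'aws-lambda': 900000,
};

function clientTimeout(platform, headroomMs = 1000, capMs = 30000) {
  const limit = PLATFORM_LIMIT_MS[platform];
  if (limit === undefined) {
    throw new Error(`Unknown platform: ${platform}`);
  }
  // Stay below the platform limit, and cap the value so a single slow
  // query cannot hold the function for minutes.
  return Math.max(0, Math.min(limit - headroomMs, capMs));
}

console.log(clientTimeout('vercel-hobby')); // 9000
console.log(clientTimeout('aws-lambda')); // 30000
```

The returned number could then be passed as the `timeout` option when constructing the client.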