Milvus Vector Search

High-performance vector database optimized for billion-scale vector search with IBM watsonx integration.

Overview

Milvus is an open-source vector database built for AI applications, offering high-performance similarity search and analytics over embedding vectors. This building block provides a complete FastAPI service for ingesting documents from IBM Cloud Object Storage (COS) into Milvus, using Docling-based parsing and IBM watsonx.ai embeddings.


IBM Products Used

This building block leverages the following IBM products and services:

  • watsonx.data: Data lakehouse platform with integrated Milvus vector database support
  • watsonx.ai: Foundation models and embedding services for document vectorization
  • IBM Cloud Object Storage (COS): Scalable object storage for document repositories
  • Milvus: Open-source vector database for semantic search (integrated with watsonx.data)

Features

Data Ingestion Service

  • FastAPI-based REST API for document ingestion
  • Docling-based document parsing and processing
  • IBM watsonx.ai embedding generation
  • Automatic vector storage and indexing in Milvus
  • Interactive Swagger UI for API testing

Document Processing

  • Support for multiple document formats (PDF, HTML, JSON, Markdown)
  • Intelligent chunking strategies:
      • DOCLING_DOCS: Structure-aware chunking based on document layout
      • MARKDOWN: Preserves markdown formatting during chunking
      • RECURSIVE: Hierarchical text splitting
  • Metadata extraction and preservation

Vector Operations

  • Automatic collection creation and schema management
  • Efficient vector upsert operations
  • Configurable embedding dimensions
  • Index optimization for fast similarity search

Architecture

IBM COS → FastAPI Service → Docling Parser → Watsonx Embeddings → Milvus DB

The service pulls documents from COS, parses them with Docling, generates embeddings with watsonx.ai, and stores the vectors in Milvus for semantic search.
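The flow above can be sketched as a minimal Python pipeline. The helper names below are hypothetical stand-ins for the service's internal stages, with each external call stubbed so the sketch runs offline:

```python
# Minimal sketch of the ingestion pipeline (hypothetical helper names,
# not the service's actual functions; external calls are stubbed).

def fetch_documents(bucket_name):
    # Would list and download objects from the IBM COS bucket; stubbed here.
    return [{"name": "guide.pdf", "text": "Milvus stores embedding vectors."}]

def parse_and_chunk(doc, chunk_type="DOCLING_DOCS"):
    # Docling would parse the file and chunk it by layout; stubbed as one chunk.
    return [{"text": doc["text"], "source": doc["name"]}]

def embed(chunks):
    # watsonx.ai would return one embedding per chunk; stubbed with zero vectors.
    return [[0.0] * 384 for _ in chunks]

def upsert(collection_name, chunks, vectors):
    # pymilvus would upsert (vector, metadata) rows into the collection.
    return len(vectors)

def ingest(bucket_name, collection_name, chunk_type):
    total = 0
    for doc in fetch_documents(bucket_name):
        chunks = parse_and_chunk(doc, chunk_type)
        vectors = embed(chunks)
        total += upsert(collection_name, chunks, vectors)
    return total

print(ingest("my-bucket", "my_collection", "DOCLING_DOCS"))  # → 1
```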


Getting Started

Prerequisites

  1. watsonx.data environment with Milvus database configured (see the Setup Guide)
  2. Python 3.13 installed locally
  3. git installed locally
  4. Milvus credentials (host, port, username, password)
  5. IBM COS credentials (API key, endpoint, service instance ID)

Installation

  1. Clone the repository:

    git clone https://github.com/ibm-self-serve-assets/building-blocks.git
    cd building-blocks/data-for-ai/vector-search/milvus/assets/data-ingestion-asset/
    

  2. Create a Python virtual environment:

    python3 -m venv virtual-env
    source virtual-env/bin/activate
    pip3 install -r requirements.txt
    

  3. Configure environment variables:

    cp .env.example .env
    

  4. Update .env with your credentials:

Milvus Credentials:

  • WXD_MILVUS_HOST: Milvus host URL from the watsonx.data UI
  • WXD_MILVUS_PORT: Milvus port from the watsonx.data UI
  • WXD_MILVUS_USER: Set to 'ibmlhapikey'
  • WXD_MILVUS_PASSWORD: IBM Cloud API key for the Milvus service account

IBM COS Credentials:

  • IBM_CLOUD_API_KEY: IBM Cloud API key for COS access
  • COS_ENDPOINT: Service endpoint URL for your COS instance
  • COS_SERVICE_INSTANCE_ID: CRN of the COS instance

API Security:

  • REST_API_KEY: Set a unique value for API authentication
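Assuming the variable names above, a filled-in .env might look like this (all values are placeholders to be replaced with your own):

```
# Milvus (watsonx.data)
WXD_MILVUS_HOST=<milvus-host>
WXD_MILVUS_PORT=<milvus-port>
WXD_MILVUS_USER=ibmlhapikey
WXD_MILVUS_PASSWORD=<ibm-cloud-api-key>

# IBM Cloud Object Storage
IBM_CLOUD_API_KEY=<ibm-cloud-api-key>
COS_ENDPOINT=<cos-endpoint-url>
COS_SERVICE_INSTANCE_ID=<cos-instance-crn>

# API security
REST_API_KEY=<your-secret>
```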

Starting the Application

Start the application locally:

python3 main.py

Or using Uvicorn:

uvicorn app.main:app --host 127.0.0.1 --port 4050 --reload

Access Swagger UI at: http://127.0.0.1:4050/docs


API Usage

Ingestion Endpoint

Endpoint: POST /ingest-files

Request Body:

{
    "bucket_name": "<cos-bucket>",
    "collection_name": "<milvus-collection>",
    "chunk_type": "DOCLING_DOCS"
}

Parameters:

  • bucket_name: Name of the S3/COS bucket containing documents
  • collection_name: Target Milvus collection to create or upsert into
  • chunk_type: Chunking strategy (DOCLING_DOCS, MARKDOWN, RECURSIVE)
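For client-side validation, the request body can be modeled as a small dataclass (a hypothetical helper for callers, not part of the service itself):

```python
from dataclasses import dataclass

# Strategies accepted by the /ingest-files endpoint.
ALLOWED_CHUNK_TYPES = {"DOCLING_DOCS", "MARKDOWN", "RECURSIVE"}

@dataclass
class IngestRequest:
    bucket_name: str
    collection_name: str
    chunk_type: str = "DOCLING_DOCS"

    def __post_init__(self):
        # Reject unknown strategies before the request is ever sent.
        if self.chunk_type not in ALLOWED_CHUNK_TYPES:
            raise ValueError(
                f"chunk_type must be one of {sorted(ALLOWED_CHUNK_TYPES)}"
            )

req = IngestRequest("my-bucket", "my_collection", "RECURSIVE")
print(req.chunk_type)  # → RECURSIVE
```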

Headers:

REST_API_KEY: <your-secret>
Content-Type: application/json

Example using Python:

import requests

url = "http://127.0.0.1:4050/ingest-files"

payload = {
    "bucket_name": "<cos-bucket>",
    "collection_name": "<milvus-collection>",
    "chunk_type": "DOCLING_DOCS",
}

headers = {"REST_API_KEY": "<your-secret>"}

# json= serializes the payload and sets Content-Type: application/json.
response = requests.post(url, headers=headers, json=payload)
print(response.text)

Testing via Swagger UI

  1. Navigate to http://127.0.0.1:4050/docs
  2. Expand POST /ingest-files
  3. Click Try it out
  4. Fill in bucket_name, collection_name, and chunk_type
  5. Click Execute
  6. Verify the 200 response and review ingestion statistics

Use Cases

  • Semantic Search: Find documents based on meaning, not just keywords
  • RAG Pipelines: Retrieval-augmented generation for LLMs
  • Knowledge Bases: Build searchable knowledge repositories
  • Document Discovery: Find similar documents across large collections
  • Question Answering: Retrieve relevant context for Q&A systems
  • Content Recommendation: Suggest similar content based on embeddings

Chunking Strategies

DOCLING_DOCS

  • Structure-aware chunking based on document layout
  • Preserves document hierarchy (headings, sections, paragraphs)
  • Optimal for well-structured documents
  • Best for maintaining context across document sections

MARKDOWN

  • Preserves markdown formatting during chunking
  • Respects markdown structure (headers, lists, code blocks)
  • Ideal for markdown-formatted documentation
  • Maintains formatting for better readability

RECURSIVE

  • Hierarchical text splitting with configurable chunk size
  • Splits on multiple separators (paragraphs, sentences, words)
  • Flexible for various document types
  • Good for general-purpose chunking
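As an illustration of the RECURSIVE strategy, here is a simplified sketch of hierarchical splitting (a stand-in for the service's actual splitter; separator order and merge behavior are assumptions):

```python
def recursive_split(text, max_len=60, separators=("\n\n", ". ", " ")):
    """Hierarchical splitter: try the coarsest separator first, recurse
    with finer ones, then greedily re-merge pieces up to max_len."""
    if len(text) <= max_len:
        return [text]
    if not separators:
        # No separators left: hard-split at max_len.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    pieces = []
    for part in text.split(sep):
        if len(part) > max_len:
            pieces.extend(recursive_split(part, max_len, rest))
        else:
            pieces.append(part)
    # Greedily merge adjacent pieces while the result still fits max_len.
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_len:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

doc = "Alpha beta gamma delta.\n\nEpsilon zeta eta theta iota kappa lambda mu nu xi."
for chunk in recursive_split(doc, max_len=40):
    print(repr(chunk))
```

Every returned chunk respects the length budget, while paragraph boundaries are kept wherever they fit.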

Performance Considerations

Optimization Guidelines

  • Batch Processing: Process multiple documents in parallel for faster ingestion
  • Chunk Size: Balance between context preservation and retrieval precision
  • Embedding Dimensions: Higher dimensions can capture more information but increase storage and slow down search
  • Index Type: Choose appropriate index type (IVF_FLAT, HNSW) based on use case
  • Collection Sharding: Distribute data across multiple shards for scalability
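As a sketch of the index-type choice, the parameters below show how an HNSW index might be declared with pymilvus. The field name, metric, and tuning values are assumptions; the create_index call is commented out so the snippet runs without a Milvus connection:

```python
# Hypothetical HNSW index parameters for a Milvus vector field.
index_params = {
    "index_type": "HNSW",
    "metric_type": "COSINE",    # or "L2" / "IP", matching the embedding model
    "params": {
        "M": 16,                # graph connectivity: higher = better recall, more memory
        "efConstruction": 200,  # build-time search width: higher = better index, slower build
    },
}

# With pymilvus connected to your watsonx.data Milvus instance, this would be:
# from pymilvus import Collection
# Collection("<milvus-collection>").create_index("vector", index_params)

print(index_params["index_type"])  # → HNSW
```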

Coming Soon

Upcoming Features

  • .png and .jpg VLM (Vision Language Model) support
  • Additional Docling processing functions:
      • Image annotation
      • Table exports
  • Enhanced error logging with structured logs
  • Performance optimization for large-scale ingestion

Team

Created and Architected By: Anand Das, Anindya Neogi, Joseph Kim, Shivam Solanki


Support

For issues or questions, please refer to the GitHub repository or open an issue.