Skip to main content

Command Palette

Search for a command to run...

RAG system in Spring AI and Langflow using Qdrant Vector Storage

Updated
9 min read

The motivation of the article is use AI with my financial statements as context and ask questions around those. In order to build everything locally had to use RAG approach. The statements in this case used a text file with the transaction history for few months and injected to the Qdrant Vector store with mxbai-embed-large model is used.

To deploy the models in local have used Ollama in docker container. The embedding model and the llama3.2 LLM model is running in the Ollama container, which can is accessed via http://localhost:11434

RAG system with Spring AI

This is a simple Spring Boot application with the Spring AI dependencies included with the Qdrant vector store auto configuration. The Qdrant vector store access configuration is configured in the application.yaml. The text file with statement is placed under the resources/docs folder. These files are ingested to Qdrant vector store on startup. The REST endpoint is exposed to take user input in this case we use curl with -d option, which is passed to Ollamachat model configured to use the vector store. The response will be printed in the screen. There is no UI for user inputs for this application.

Langflow

Langflow is a visual way to build AI workflows, this is literally low-to-no code approach, not intended for production environment. In this article the Langflow is deployed in Docker container, and different components are used to create the RAG system to connect to local vector storage and LLM model. The data to vector store is loaded under collections, we could see this in the Qdrant UI, the Langflow injects read file component reads the text file and loads to a new collection. The Qdrant component properties should be configured differently since using local applications, by default it tries to use SSL transport. The Qdrant component in Langflow should use url which uses the Qdrant Docker container IP itself.

This article doesn't details the Langflow, refer the documentation for more info.

Take away

Using local models is very slow and the response accuracy is low

Deploying the required components

Deploy Ollama container

In Docker container use below command to deploy the ollama. The environment variable is to keep the model in memory not to offload when not in use.

docker run -e OLLAMA_KEEP_ALIVE=-1 -v ollama:/root/.ollama -d -p 11434:11434 --name ollama ollama/ollama

Once the docker container started, exec to the contaner with below command and pull the embedding model and llama3.2

To exec to the docker container

docker exec -it ollama sh

Use below command after exec into the docker container

# ollama pull  mxbai-embed-large
# ollama run llama3.2

Deploy Qdrant vector data base-url

Below command will deploy the Qdrant in docker container, uses mount volume from local to persist the data.

The port 6334 is used for gRPC transport

docker run --name qdrant -d -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant:/qdrant/storage \
qdrant/qdrant

Once container is ready, we can verify by accessing the endpoint http://localhost:6333. The qdrant UI can be accessed from http://localhost:6333/dashboard, and metrics http://localhost:6333/metrics if need to configure Grafana dashboards.

Spring AI:

The spring boot application is very simple with the qdrant auto configuration dependency.

Below is the spring configuration

spring:
  application:
    name: rag
  ai:
    ollama:
      init:
        pull-model-strategy: when_missing
        timeout: 5m
        embedding:
		  # loaded in the Ollama
          additional-models:
            - mxbai-embed-large
          options:
		    # used for embbedding
            model: mxbai-embed-large
            keep_alive: 30m
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3.2
    # reference https://docs.spring.io/spring-ai/reference/api/embeddings/ollama-embeddings.html
    model:
      embedding: ollama
    vectorstore:
      qdrant:
        host: localhost
        port: 6334
        use-tls: false
        initialize-schema: true
        collection-name: custom-collection

server:
  port: 8095

The pom.xml with dependency

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>4.0.6</version>
		<relativePath/> <!-- lookup parent from repository -->
	</parent>
	<groupId>ai.app</groupId>
	<artifactId>qdrantrag</artifactId>
	<version>1.0.1-SNAPSHOT</version>
	<name>qdrantrag</name>
	<description>Rag example document</description>
	<url/>
	<licenses>
		<license/>
	</licenses>
	<developers>
		<developer/>
	</developers>
	<scm>
		<connection/>
		<developerConnection/>
		<tag/>
		<url/>
	</scm>
	<properties>
		<java.version>25</java.version>
		<spring-ai.version>1.1.7</spring-ai.version>
	</properties>
	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-advisors-vector-store</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-pdf-document-reader</artifactId>
		</dependency>

		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-vector-store-qdrant</artifactId>
		</dependency>

		<dependency>
			<groupId>io.grpc</groupId>
			<artifactId>grpc-api</artifactId>
			<version>1.81.0</version>
		</dependency>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-model-ollama</artifactId>
		</dependency>

		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
		</dependency>
	</dependencies>
	<dependencyManagement>
		<dependencies>
			<dependency>
				<groupId>org.springframework.ai</groupId>
				<artifactId>spring-ai-bom</artifactId>
				<version>${spring-ai.version}</version>
				<type>pom</type>
				<scope>import</scope>
			</dependency>
		</dependencies>
	</dependencyManagement>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>3.14.1</version>
			</plugin>
		</plugins>
	</build>

</project>
  • The entry point of spring application
package ai.app.rag;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class RagApplication {

	static void main(String[] args) {
		SpringApplication.run(RagApplication.class, args);
	}

}
  • Data injection service
package ai.app.rag;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.CommandLineRunner;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Component;

import java.io.File;
import java.util.List;

@Component
public class DocumentIngestionService implements CommandLineRunner {

    private static final Logger log = LoggerFactory.getLogger(DocumentIngestionService.class);

    private final VectorStore vectorStore;

    public DocumentIngestionService(VectorStore vectorStore){
        this.vectorStore = vectorStore;
    }

    @Value("classpath:docs/*")
    private Resource[] docsPath;


    @Override
    public void run(String... args) throws Exception {

        for(Resource file : docsPath ){
            loadTextToQdrant(file.getFile().getAbsolutePath());
        }
    }

    // reads the text file from and loads to the collection
    public void loadTextToQdrant(String filePath) {
   
        String resourcePath = "file:" + filePath;
        log.info("file name from the folder {}",resourcePath);
		
        // Read the text file
		
        TextReader textReader = new TextReader(new FileSystemResource(new File(filePath)));
        List<Document> documents = textReader.get();

        // Tokenize/Chunk the documents based on token counts
		// the splitter with parameter
		// TokenTextSplitter(chunkSize, minChunkSizeChars, minChunkLengthToEmbed, maxNumChunks, keepSeparator, punctuationMarks)
		
        TokenTextSplitter textSplitter = new TokenTextSplitter(800, 200, 200, 200, true, List.of(',',' ',':'));

        List<Document> splitDocuments = textSplitter.apply(documents);

        // Generate chunks and their vectors automatically to Qdrant
        vectorStore.accept(splitDocuments);
    }
	
	// option to read from pdf file but not used in this application
	public void loadPdfToQdrant(String filePath) {
	
        // Read the PDF from the local file path
        String resourcePath = "file:" + filePath;
        log.info("file name from the folder {}",resourcePath);
        PagePdfDocumentReader pdfReader = new PagePdfDocumentReader(resourcePath);
        List<Document> rawDocuments = pdfReader.get();

        // Split text into smaller chunks for better embedding accuracy
        TokenTextSplitter textSplitter = new TokenTextSplitter();
        List<Document> splitDocuments = textSplitter.apply(rawDocuments);

        // Write chunks and their vectors automatically to Qdrant
        vectorStore.accept(splitDocuments);
    }
}

REST controller exposing end point to get user input

package ai.app.rag.controller;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api")
public class AppController {

    private static final Logger log = LoggerFactory.getLogger(AppController.class);
    private final OllamaChatModel chatModel;
    private final VectorStore vectorStore;

    AppController(OllamaChatModel chatModel, VectorStore vectorStore){
        this.chatModel = chatModel;
        this.vectorStore = vectorStore;
    }
    @PostMapping("/input")
    public String chat(@RequestBody String input){

        log.info("input received - {}",input);

        return ChatClient.builder(chatModel)
                .build()
                .prompt()
                .advisors(QuestionAnswerAdvisor.builder(vectorStore).build())
                .user(input)
                .call()
                .content();
    }
}

The financial transaction data where placed under a folder called docs in text format. Still PDF can still be used.

Sample request and response

$  curl http://localhost:8095/api/input -d 'summaries the overall transactions for each month in the table format with table format with header month,total payments and other credits, total purchases and adjestments, total. if possible could you plot a text simple graph for it againts month and total'

Based on the provided table format with transactions for each month, I will summarize the overall transactions for each month.

**Monthly Summary:**

| Month | Total Payments and Other Credits | Total Purchases and Adjustments |
| --- | --- | --- |
| December | -\(102.44 | \)90.07 |
| January | -\(98.60 | \)86.51 |
| February | -\(98.60 | \)86.51 |

**Simple Graph:**

Unfortunately, I'm a text-based AI and cannot directly plot a graph for you. However, I can describe the general trend of the data.

The total payments and other credits have been decreasing each month, indicating a potential reduction in expenses or an increase in refunds. On the other hand, the total purchases and adjustments have remained relatively stable, suggesting that there has been no significant change in spending habits.

If you need to visualize this data further, I recommend using a graphing tool like Google Sheets, Microsoft Excel, or a dedicated charting software to plot the monthly trends against each other.

With the same request sent again could see different response like below snapshot

image

Deploy Langflow in local Docker container

Following section would explain how to build the RAG system with the Langflow. Below command deploys the Langflow application in Docker container. The environment variables usage is self explanatory. The Auto login is set to true since had access issue, the UI was prompting for user name and credentials using this resolved the issue.

docker run -d \
--name langflow \
-e DO_NOT_TRACK=true \
-e LANGFLOW_LOG_LEVEL=debug \
-e LANGFLOW_AUTO_LOGIN=true \
-e LANGFLOW_WORKER_TIMEOUT=1800 \
-e AUTH_ACCESS_TOKEN_EXPIRE_MINUTES=-1 \
-e AUTH_REFRESH_TOKEN_EXPIRE_DAYS=-1 \
-p 7860:7860 -v langflow-data:/app/langflow langflowai/langflow

To launch the application from browser use http://localhost:7860

The screen would look like below

image

The complete RAG system looks like below snapshot

image

Below section shows the components used in the RAG part

  • The Read file component reads the input files input files.

  • The split text component will chunk the files and inputs to the Qdrant vector database.

  • The Qdrant component - since the Qdrant database is running in local Docker few properties of the component should be left empty. Leave the host, port and grpc port empty. The url property should use url with Qdrant Docker container IP. To find the container Ip address use docker network inspect bridge where bridge is the default network. The prompt component we define the prompt for the LLM along with the context passed from the Qdrant parsed output.

The Qdrant component properties in LangFlow looks like below

image

The Qdrant section zoomed to show the components connected to each other

image

The ouput of the playground session

image

Reference code