Spring AI Concepts Tutorial With Examples

by devs5003 | Last updated: March 7, 2025

In the evolving world of artificial intelligence, integrating AI capabilities into applications has become increasingly important. The Spring community addresses this need with Spring AI, a module designed to streamline the incorporation of AI functionality into Spring-based applications. Spring AI provides a set of abstractions and interfaces that enable seamless interaction with various AI models, allowing developers to enhance their applications with AI-driven features without unnecessary complexity. In this article, we will explore Spring AI concepts with examples.

Why Is Learning Spring AI Important?

The Spring AI module is crucial for developers who want to integrate AI capabilities into their Java applications without the complexity of manually handling AI models, APIs, and data processing. It provides a structured and standardized approach to working with AI models, making it easier to develop, deploy, and manage AI-driven applications. Here are the key reasons why learning Spring AI is important:

- Smooth AI Integration in Java Applications: Traditionally, integrating AI models in Java required manual API calls, response handling, and configuration management. Spring AI simplifies this process by offering pre-built connectors for AI services such as OpenAI, Hugging Face, and Google Vertex AI.
- Familiarity with the Spring Ecosystem: Developers already working with Spring Boot can easily adopt Spring AI since it follows the same programming model. This reduces the learning curve compared to other AI integration approaches.
- Support for Multiple AI Providers: Spring AI allows switching between different AI models (e.g., OpenAI's GPT, Google Gemini, Llama via Ollama, DeepSeek) without significant code changes. This flexibility helps in choosing the most cost-effective and efficient AI provider for the requirement at hand.
- Scalability and Deployment Flexibility: Spring AI supports both cloud-based AI models and on-premise deployments, ensuring that organizations can retain control over sensitive AI workloads if needed.
- Security and Configuration Management: Spring AI provides built-in support for managing API keys, authentication, and secure data handling when working with AI models.
- Standardized API Usage: Instead of writing custom code for each AI provider, Spring AI offers a common abstraction layer that allows developers to call AI models through a unified API, as the short sketch below illustrates.
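To make that last point concrete, here is a minimal sketch of what the unified API looks like in practice. The service name and prompt text are illustrative, and the ChatModel injection assumes a provider starter is configured on the classpath (as covered later in this tutorial):

```java
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.stereotype.Service;

// A hypothetical service: the ChatModel bean is auto-configured by whichever
// provider starter (OpenAI, Ollama, etc.) is on the classpath, so this code
// does not change when the provider does.
@Service
public class SummaryService {

    private final ChatModel chatModel;

    public SummaryService(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    public String summarize(String text) {
        // The convenience call(String) overload returns the response text directly.
        return chatModel.call("Summarize in one sentence: " + text);
    }
}
```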
Spring AI Features

- Broad AI Model Support – Works with major AI providers like OpenAI, Microsoft, Amazon, Google, Anthropic, and Ollama.
- Multiple AI Capabilities – Supports Chat Completion, Embeddings, Text-to-Image, Audio Transcription, Text-to-Speech, and Moderation.
- Portable API – Offers a unified API for both synchronous and streaming AI model interactions, with access to provider-specific features.
- Structured Output Handling – Maps AI-generated responses directly to Plain Old Java Objects (POJOs) for easier data management.
- Vector Database Integration – Compatible with major vector stores like Cassandra, Pinecone, Redis, MongoDB Atlas, PostgreSQL/PGVector, Milvus, and more.
- Advanced Filtering – Provides a SQL-like metadata filter API that works across vector store providers.
- Function Calling & Tool Invocation – Allows AI models to trigger real-time client-side functions to fetch dynamic data.
- Observability & Monitoring – Includes tools for tracking AI operations, evaluating models, and guarding against hallucinations.
- AI-powered Chat & Conversation Management – Supports the ChatClient API, conversational memory, and Retrieval-Augmented Generation (RAG).
- Spring Boot Integration – Comes with auto-configuration and starters for AI models and vector stores, making setup easy via Spring Initializr (start.spring.io).

Advantages of Spring AI Integration with AI Models

- Easy Integration: Spring AI provides ready-to-use APIs to connect with AI models.
- Consistent Development: Uses the familiar Spring Boot ecosystem for smooth adoption.
- Scalability: Supports cloud-based and on-premise AI models for different business needs.
- Security & Configuration Management: Built-in support for secure API calls and configurable properties.
- Multi-Model Support: Works with multiple AI providers like OpenAI, Hugging Face, and Vertex AI (see the configuration sketch after the table below).

Real-World Use Cases with Problems & Their Solutions Using Spring AI

Spring AI can be used in various industries, including software development, finance, healthcare, e-commerce, customer support, etc. Below are some real-world scenarios where Spring AI provides practical solutions:

| Use Case | Problem | Solution Using Spring AI |
|---|---|---|
| Chatbots for Customer Support | Businesses need AI-powered chatbots to handle common customer queries efficiently, reducing human workload. | Spring AI integrates with OpenAI's GPT models to process user queries, generate intelligent responses, and automate customer interactions via APIs. |
| Automated Document Summarization | Reading long reports and research papers is time-consuming, and users need quick summaries. | Spring AI connects with NLP models (like OpenAI's GPT or Hugging Face Transformers) to extract key points and summarize documents efficiently. |
| AI-Powered Code Review | Developers need assistance in identifying bugs, security vulnerabilities, and code optimizations. | Spring AI integrates with AI-powered code analyzers, allowing Java developers to get automated suggestions for improving code quality. |
| Fraud Detection in Financial Transactions | Banks struggle to identify fraudulent transactions in real time due to complex patterns. | Spring AI connects with machine learning fraud detection models, analyzing transaction behaviors and flagging anomalies instantly. |
| Personalized Product Recommendations | E-commerce websites need AI-driven recommendations to increase sales and improve user experience. | Spring AI integrates with recommendation models, analyzing user behavior, past purchases, and preferences to suggest relevant products. |
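As a rough illustration of the multi-model support above: switching providers is largely a matter of swapping the starter dependency and its configuration, while the ChatModel-based code stays unchanged. The property keys below follow the Spring AI starters, but treat them as assumptions to verify against the version in use:

```properties
# With spring-ai-openai-spring-boot-starter on the classpath:
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-4o

# With spring-ai-ollama-spring-boot-starter instead (assumed keys):
# spring.ai.ollama.base-url=http://localhost:11434
# spring.ai.ollama.chat.options.model=llama3
```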
Prerequisites for Using Spring AI

Before diving into the implementation, ensure you have the following:

- Spring Development Understanding: A working knowledge of the Spring Framework and Spring Boot is essential.
- AI Model Access: Obtain access credentials (such as API keys) for the AI model you intend to use (e.g., OpenAI, Azure OpenAI).
- Familiarity with Prompts: Understanding how to write effective prompts is important for interacting with generative AI models.

Prompts & Prompt Templates in Generative AI

A prompt in generative AI is the input or query provided to a language model, directing it to generate a desired response. Writing clear and specific prompts is essential, as the quality of the output depends considerably on the input provided. Prompts can range from simple questions to complex instructions, depending on the application's requirements.

Prompt templates are predefined structures that standardize the format of prompts, ensuring consistency and efficiency in interactions with AI models. These templates include placeholders for dynamic content, allowing developers to customize prompts as needed. For example, observe the below prompt template:

Translate the following English text into Spanish: "{text}"

In this template, {text} is a placeholder that is replaced with the actual text to be translated. Spring AI uses the StringTemplate library to manage such templates effectively (a short sketch using Spring AI's PromptTemplate class follows the ChatModel Interface section below).

Important Terminologies

Token
The smallest unit of text that the AI model processes. It can be a word, character, or subword. For example, "Hello, Java" may be split into three tokens: "Hello", the comma, and "Java" (the exact split depends on the tokenizer).

Temperature
Controls the randomness of the output/response. Lower values (e.g., 0.2) make the output more deterministic and focused, while higher values (e.g., 0.8) make it more creative and diverse.

The Spring AI Chat Model API

The Spring AI Chat Model API is designed to provide a unified and flexible interface for integrating AI-powered chat functionality into applications. It abstracts the complexities of interacting with various AI models and allows developers to focus on building features. The API works by sending a question or part of a conversation to the AI model. The AI then responds by continuing the conversation based on what it has learned from its training. The response is sent back to the application, which can show it to the user or use it for further processing.

The Spring AI Chat Model API provides an easy way for developers to use different AI models without making big changes to their code. It follows Spring's approach of being modular and flexible, allowing smooth switching between AI models as needed.

Key Components of the Spring AI Chat Model API

This section explains the Spring AI Chat Model API interfaces and associated classes.

ChatModel Interface

The ChatModel interface defines the primary method for sending prompts to the AI model and receiving responses. It extends the Model interface with Prompt as the input and ChatResponse as the output.

```java
public interface ChatModel extends Model<Prompt, ChatResponse> {

    default String call(String message) {
        // Implementation details
    }

    @Override
    ChatResponse call(Prompt prompt);
}
```

The call(String message) method offers a simplified way to interact with the model using plain text, while the call(Prompt prompt) method provides more control and flexibility by allowing detailed prompt configurations.
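As referenced above, here is a minimal sketch of a prompt template in action, assuming a Spring-managed ChatModel bean is available; the class name and template text are illustrative:

```java
import java.util.Map;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;

// A hypothetical example class; chatModel is assumed to come from a provider starter.
public class TranslationExample {

    public String translate(ChatModel chatModel, String text) {
        // The {text} placeholder is filled in from the variable map.
        PromptTemplate template = new PromptTemplate(
                "Translate the following English text into Spanish: \"{text}\"");
        Prompt prompt = template.create(Map.of("text", text));

        // call(Prompt) returns a ChatResponse rather than a plain String.
        ChatResponse response = chatModel.call(prompt);
        return response.getResult().getOutput().getText();
    }
}
```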
StreamingChatModel Interface

For applications requiring real-time or incremental responses, the StreamingChatModel interface extends StreamingModel and supports streaming output using Project Reactor's Flux.

```java
public interface StreamingChatModel extends StreamingModel<Prompt, ChatResponse> {

    default Flux<String> stream(String message) {
        // Implementation details
    }

    @Override
    Flux<ChatResponse> stream(Prompt prompt);
}
```

This interface is particularly useful for applications like live chat systems, where responses are processed and displayed as they are generated.

Prompt Class

The Prompt class encapsulates the input to be sent to the AI model, consisting of a list of Message objects and optional ChatOptions.

```java
public class Prompt implements ModelRequest<List<Message>> {

    private final List<Message> messages;
    private ChatOptions modelOptions;

    @Override
    public ChatOptions getOptions() {
        // Implementation details
    }

    @Override
    public List<Message> getInstructions() {
        // Implementation details
    }

    // Constructors and utility methods omitted
}
```

This structure allows complex conversational contexts to be constructed and sent to the AI model, enabling more nuanced and context-aware responses.

Message Interface

The Message interface represents individual messages within a conversation, each categorized by a MessageType (e.g., system, user, assistant).

```java
public interface Content {
    String getContent();
    Map<String, Object> getMetadata();
}

public interface Message extends Content {
    MessageType getMessageType();
}
```

Implementations of this interface, such as UserMessage and AssistantMessage, define the role of each message in the conversation, providing context that the AI model can use to generate appropriate responses.

ChatOptions Interface

ChatOptions defines configurable parameters that influence the AI model's responses, such as the model type, temperature, and maximum tokens.

```java
public interface ChatOptions extends ModelOptions {
    String getModel();
    Float getFrequencyPenalty();
    Integer getMaxTokens();
    Float getPresencePenalty();
    List<String> getStopSequences();
    Float getTemperature();
    Integer getTopK();
    Float getTopP();
    ChatOptions copy();
}
```

These options allow developers to fine-tune the behavior of the AI model to suit specific application needs.

ChatResponse Class

The ChatResponse class encapsulates the AI model's output, including the generated messages and associated metadata.

```java
public class ChatResponse implements ModelResponse<Generation> {

    private final ChatResponseMetadata chatResponseMetadata;
    private final List<Generation> generations;

    @Override
    public ChatResponseMetadata getMetadata() {
        // Implementation details
    }

    @Override
    public List<Generation> getResults() {
        // Implementation details
    }

    // Other methods omitted
}
```

The ChatResponse class carries the complete response returned by the AI model as one or more Generation results, allowing developers to extract and utilize the AI's output efficiently. The metadata included in the response can contain additional information, such as token usage and latency, which can be useful for optimizing performance.
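Putting these pieces together, here is a rough sketch of constructing a Prompt from messages and options and unpacking the ChatResponse. It reuses the OpenAiChatOptions builder that appears later in this article; the class name and message texts are illustrative, and method names may vary slightly across Spring AI versions:

```java
import java.util.List;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.model.Generation;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatOptions;

// A hypothetical example class; chatModel is assumed to come from a provider starter.
public class PromptExample {

    public String ask(ChatModel chatModel) {
        // A system message sets the assistant's role; a user message carries the query.
        Prompt prompt = new Prompt(
                List.of(new SystemMessage("You are a concise technical assistant."),
                        new UserMessage("Explain what a vector store is in two sentences.")),
                OpenAiChatOptions.builder()
                        .temperature(0.2)   // low temperature => more deterministic output
                        .maxTokens(120)
                        .build());

        ChatResponse response = chatModel.call(prompt);
        Generation generation = response.getResult();   // first Generation in the response
        return generation.getOutput().getText();        // the AssistantMessage content
    }
}
```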
Generation Class

Finally, the Generation class implements the ModelResult<AssistantMessage> interface, meaning it provides an AI-generated response (AssistantMessage) and related metadata.

```java
public class Generation implements ModelResult<AssistantMessage> {

    private final AssistantMessage assistantMessage;
    private ChatGenerationMetadata chatGenerationMetadata;

    @Override
    public AssistantMessage getOutput() {...}

    @Override
    public ChatGenerationMetadata getMetadata() {...}

    // Other methods omitted
}
```

It has two fields: assistantMessage, which stores the AI's response, and chatGenerationMetadata, which holds additional information about the response. The class includes methods like getOutput() to retrieve the AI-generated message and getMetadata() to fetch metadata about the generation process.

Step-by-Step Guide to Implement a POC Using the Spring AI Chat Model API

Step#1: Create a Spring Boot Project

Use Spring Initializr to generate a new Spring Boot project:

- Spring Boot Version: 3.x
- Dependencies: Spring Web, OpenAI, Spring Reactive Web (WebFlux for streaming responses)

Adding Dependencies Manually (Alternative Approach)

Alternatively, add the below dependencies to pom.xml:

```xml
<dependencies>
    <!-- Spring AI starter for OpenAI -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>
    <!-- Spring Boot Web starter for the REST API -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Spring Boot WebFlux starter for reactive streaming -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
</dependencies>
```

Additionally, add the below Bill of Materials (BOM) in <dependencyManagement>:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```

The spring-ai-bom manages the versions of Spring AI dependencies consistently across the project. Here's what each part does:

- dependencyManagement section – Ensures that all Spring AI-related dependencies use the same version (${spring-ai.version}), avoiding conflicts and compatibility issues.
- Bill of Materials (BOM) – The spring-ai-bom acts as a centralized version manager, so you don't need to specify the version for each Spring AI dependency separately.
- scope=import – Tells Maven to use the BOM's dependency versions as defaults, simplifying dependency management while keeping flexibility for overrides if needed.

Without the BOM:

```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0</version>
</dependency>
```

With the BOM (version managed centrally):

```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
```

This makes project maintenance easier and cleaner, and avoids version mismatches. If we use Spring Initializr or an IDE to add dependencies, the BOM is added automatically.

Step#2: Configure OpenAI API in application.properties

Add your OpenAI API key and model settings in src/main/resources/application.properties:

```properties
spring.ai.openai.api-key=your_openai_api_key_here
spring.ai.openai.chat.options.model=gpt-4o
```

Note: Replace your_openai_api_key_here with your actual OpenAI API key.
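As a small hardening step, you can avoid committing the key to source control by referencing an environment variable instead (this assumes an OPENAI_API_KEY variable is set in the runtime environment):

```properties
# Resolved from the OPENAI_API_KEY environment variable at startup
spring.ai.openai.api-key=${OPENAI_API_KEY}
```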
Step#3: Create a Chat Service

Now, let's create a ChatService class that will use the Spring AI Chat Model API.

```java
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

@Service
public class ChatService {

    private final ChatModel chatModel;

    public ChatService(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Simple plain-text interaction
    public String getChatResponse(String userMessage) {
        return chatModel.call(userMessage);
    }

    // Prompt-based interaction with per-request options
    public String getPromptResponse(String prompt) {
        ChatResponse response = chatModel.call(
                new Prompt(
                        prompt,
                        OpenAiChatOptions.builder()
                                .model("gpt-4o")
                                .maxTokens(100)
                                .temperature(0.5)
                                .build()));
        return response.getResult().getOutput().getText();
    }
}
```

If you want to stream your responses, add the below code as a separate StreamingChatService class (the controller below injects it under this name; keeping it in its own class also avoids a second constructor on ChatService, which would break constructor injection):

```java
import org.springframework.ai.chat.model.StreamingChatModel;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

@Service
public class StreamingChatService {

    private final StreamingChatModel streamingChatModel;

    public StreamingChatService(StreamingChatModel streamingChatModel) {
        this.streamingChatModel = streamingChatModel;
    }

    public Flux<String> streamChatResponse(String userMessage) {
        return streamingChatModel.stream(userMessage);
        // The commented code below is useful if you are passing a Prompt object
        // in place of a String message:
        // Prompt prompt = new Prompt(userMessage);
        // return streamingChatModel.stream(prompt)
        //         .map(response -> response.getResult().getOutput().getText());
    }
}
```

Step#4: Create a REST Controller

Now, create a REST API to accept user input and return the AI-generated response. Here is the implementation of several commonly useful endpoints.

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import com.example.spring.ai.chat.model.service.ChatService;
import com.example.spring.ai.chat.model.service.StreamingChatService;

import reactor.core.publisher.Flux;

@RestController
@RequestMapping("/api/chat")
public class ChatController {

    private final ChatService chatService;
    private final StreamingChatService streamingChatService;

    // A single constructor so Spring can inject both services unambiguously
    public ChatController(ChatService chatService, StreamingChatService streamingChatService) {
        this.chatService = chatService;
        this.streamingChatService = streamingChatService;
    }

    @GetMapping("ask-me")
    public String getResponse(@RequestParam String userMessage) {
        return chatService.getChatResponse(userMessage);
    }

    @GetMapping("ask-by-prompt")
    public String getPromptResponse(@RequestParam String prompt) {
        return chatService.getPromptResponse(prompt);
    }

    @GetMapping("ask-by-prompt/stream")
    public Flux<String> getStreamedResponse(@RequestParam String userInput) {
        return streamingChatService.streamChatResponse(userInput);
    }

    @GetMapping("chat-response")
    public ResponseEntity<String> getModelResponse(@RequestParam String input) {
        String response = chatService.getChatResponse(input);
        return ResponseEntity.ok(response);
    }
}
```
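To try the streaming endpoint from the client side, here is a rough sketch using WebFlux's WebClient (already on the classpath via the webflux starter); the port, class name, and query text are illustrative:

```java
import org.springframework.web.reactive.function.client.WebClient;

// A hypothetical standalone client for the streaming endpoint defined above.
public class StreamClientExample {

    public static void main(String[] args) throws InterruptedException {
        WebClient client = WebClient.create("http://localhost:8080");

        client.get()
              .uri(uriBuilder -> uriBuilder
                      .path("/api/chat/ask-by-prompt/stream")
                      .queryParam("userInput", "Tell me about Spring AI")
                      .build())
              .retrieve()
              .bodyToFlux(String.class)     // consume the response chunk by chunk
              .subscribe(System.out::print);

        Thread.sleep(30_000); // keep the JVM alive while the stream completes (demo only)
    }
}
```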
Step#5: Error Handling and Logging

To improve robustness, add a global exception handler:

```java
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.*;

// @RestControllerAdvice (rather than @ControllerAdvice) ensures the returned
// String is written to the response body instead of being resolved as a view name.
@RestControllerAdvice
public class GlobalExceptionHandler {

    @ExceptionHandler(Exception.class)
    @ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
    public String handleException(Exception ex) {
        return "Error: " + ex.getMessage();
    }
}
```

Run and Test the Application

Start the Spring Boot application and test the results.

Test Using Postman

- Method: GET
- URL: http://localhost:8080/api/chat/ask-me?userMessage=How to integrate Spring AI with DeepSeek?

You may also go through other articles on Spring AI with examples.

References:
- https://docs.spring.io/spring-ai/reference/index.html
- https://docs.spring.io/spring-ai/reference/api/chatmodel.html

You can find more about available implementations in the Available Implementations section, as well as a detailed comparison in the Chat Models Comparison section.