Building LLM Applications with Spring AI

Posted by 臧莞然 on 2025-10-5 17:17:04

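1. Background

This post covers LLM techniques at the application layer. Suppose you already have access to an LLM API, whether it follows the OpenAI specification or is a vendor-specific interface such as Gemini, DeepSeek, Tongyi Qianwen (通义千问), or ERNIE Bot (文心一言). To build applications on top of these models, you can choose between the following approaches: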

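[*]Use a low-code LLM application platform such as Coze, Dify, or FastGPT. These platforms come with workflow orchestration and knowledge bases, connect easily to a variety of LLMs, embedding models, and vector stores, and some also offer monitoring dashboards or plugin marketplaces. They cover most needs.
[*]Build the application in code. This lets you stay on your company's existing technology stack, gives you more flexibility, and makes it easier to integrate with existing systems. At a higher level, it also lets you define your own LLM application conventions, build your own application platform, and connect it to other platforms. The low-code platforms above can also be treated as one part of a self-built LLM application stack: use code where flexibility matters and the platform where speed matters.
The rest of this post uses the second approach, building on an existing framework. In Python the main choice is LangChain and its related tooling; in Java there is the corresponding LangChain4j, as well as Spring AI.
These frameworks do a lot of work for us: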

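[*]Wrap model calls in a common interface that hides the differences between vendors' LLM APIs
[*]Manage sessions and context; a session links the user's earlier questions to the current one
[*]Structured output: convert the model's text output into structured objects that a program can use, for example JSON objects
[*]Tool/function calling
[*]Observability
[*]Evaluation of model output quality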
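The session-management item in the list can be pictured as a per-conversation message buffer that is replayed with each request. Below is a minimal sketch, assuming a fixed-size window of recent messages; `ChatMemorySketch`, `Message`, and `WindowChatMemory` are hypothetical names chosen for illustration, not Spring AI's own classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ChatMemorySketch {

    // Hypothetical message type: a role ("user" or "assistant") plus the text.
    record Message(String role, String text) {}

    // Hypothetical windowed memory: keeps the last maxMessages per conversation.
    static class WindowChatMemory {
        private final int maxMessages;
        private final Map<String, Deque<Message>> conversations = new HashMap<>();

        WindowChatMemory(int maxMessages) {
            this.maxMessages = maxMessages;
        }

        // Appends a message and evicts the oldest ones once the window is full.
        void add(String conversationId, Message message) {
            Deque<Message> history =
                    conversations.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
            history.addLast(message);
            while (history.size() > maxMessages) {
                history.removeFirst();
            }
        }

        // Returns a copy of the stored turns, oldest first, to send with the next request.
        List<Message> history(String conversationId) {
            return new ArrayList<>(conversations.getOrDefault(conversationId, new ArrayDeque<>()));
        }
    }

    public static void main(String[] args) {
        WindowChatMemory memory = new WindowChatMemory(4);
        memory.add("c1", new Message("user", "My name is Li."));
        memory.add("c1", new Message("assistant", "Nice to meet you, Li."));
        memory.add("c1", new Message("user", "What is my name?"));
        // The third request carries the earlier turns, so the model can answer from context.
        memory.history("c1").forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

A fixed window is only one possible policy; a framework may instead trim by token count or summarize older turns.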
There is also LiteLLM, a framework dedicated to adapting different LLM APIs to the OpenAI format, which is another way to hide these differences.
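To make "hiding the differences" concrete, here is a minimal sketch of the adapter idea: application code talks to one interface, and each adapter produces the request body its backend expects. The interface and class names (`ModelAdapterSketch`, `RequestAdapter`) are hypothetical, and the two payloads are simplified forms of an OpenAI-style chat request and an Ollama `/api/generate` request. No network call is made.

```java
import java.util.Map;

class ModelAdapterSketch {

    // Hypothetical provider-neutral interface; application code depends only on this.
    interface RequestAdapter {
        String toRequestBody(String model, String userText);
    }

    // Minimal JSON string escaping: handles only backslashes and double quotes.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // OpenAI-style chat payload: a list of role-tagged messages.
    static final RequestAdapter OPENAI_STYLE = (model, text) ->
            "{\"model\":\"" + escape(model) + "\",\"messages\":[{\"role\":\"user\",\"content\":\""
                    + escape(text) + "\"}]}";

    // Ollama /api/generate payload: a single prompt string.
    static final RequestAdapter OLLAMA_GENERATE = (model, text) ->
            "{\"model\":\"" + escape(model) + "\",\"prompt\":\"" + escape(text) + "\"}";

    // Adapters keyed by provider name, so switching providers is a configuration change.
    static final Map<String, RequestAdapter> ADAPTERS = Map.of(
            "openai", OPENAI_STYLE,
            "ollama", OLLAMA_GENERATE);

    public static void main(String[] args) {
        for (String provider : new String[] {"openai", "ollama"}) {
            String body = ADAPTERS.get(provider).toRequestBody("qwen3:8b", "hello");
            System.out.println(provider + " -> " + body);
        }
    }
}
```

A framework applies the same idea to responses, streaming, and options as well; this sketch covers only the request side.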
2. Architecture

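[Figure: overall architecture diagram]
The overall architecture is shown above. The sections below cover only part of the implementation: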

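[*]Model framework layer: Spring AI, a Java framework for LLM application development
[*]Model inference: Ollama, which runs inference with existing pretrained models on the local CPU or GPU. This is convenient for development but should be replaced in production
[*]Monitoring and tracing: LangFuse, which traces LLM application requests, including inputs and outputs, latency, and token usage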
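For the tracing layer, the application has to tell its OpenTelemetry exporter where LangFuse is and how to authenticate. The original post does not show this step, so what follows is an assumption based on LangFuse's OpenTelemetry documentation as I understand it: traces go to the `/api/public/otel` path on the LangFuse host, authenticated with HTTP Basic using the project's public and secret keys. The keys, host, and port below are placeholders; check them against your own deployment.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class LangfuseOtlpAuthSketch {

    // Builds an HTTP Basic "Authorization" header value: Basic base64(publicKey:secretKey).
    static String basicAuthHeader(String publicKey, String secretKey) {
        String credentials = publicKey + ":" + secretKey;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Placeholder keys; real ones come from the LangFuse project settings.
        String header = basicAuthHeader("pk-lf-example", "sk-lf-example");
        // Assumed local LangFuse address; adjust host and port to your deployment.
        System.out.println("OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:3000/api/public/otel");
        System.out.println("OTEL_EXPORTER_OTLP_HEADERS=Authorization=" + header);
    }
}
```

The two printed lines are standard OpenTelemetry exporter environment variables and can be set before starting the application.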
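3. Implementation

3.1 Installation

Install and start Ollama and LangFuse by following their documentation.
Spring AI is imported through the pom. In addition, the pom pulls in the Spring Boot Ollama starter and the OpenTelemetry packages; the latter are there to integrate with LangFuse.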
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.jd.jt</groupId>
<artifactId>ai-base</artifactId>
<version>1.0-SNAPSHOT</version>
<name>Archetype - ai-base</name>
<url>http://maven.apache.org</url>
<parent>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-parent</artifactId>
   <version>3.4.3</version>
   <relativePath/>
</parent>
<properties>
   <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
   <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
   <maven.compiler.source>21</maven.compiler.source>
   <maven.compiler.target>21</maven.compiler.target>
</properties>
<dependencyManagement>
   <dependencies>
         <dependency>
            <groupId>io.opentelemetry.instrumentation</groupId>
            <artifactId>opentelemetry-instrumentation-bom</artifactId>
            <version>2.17.0</version>
            <type>pom</type>
            <scope>import</scope>
         </dependency>
         <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.1</version>
            <type>pom</type>
            <scope>import</scope>
         </dependency>
   </dependencies>
</dependencyManagement>

<dependencies>
   <dependency>
         <groupId>org.springframework.boot</groupId>
         <artifactId>spring-boot-starter</artifactId>
   </dependency>
   <dependency>
         <groupId>org.springframework.ai</groupId>
         <artifactId>spring-ai-starter-model-ollama</artifactId>
   </dependency>

   <dependency>
         <groupId>org.springframework.boot</groupId>
         <artifactId>spring-boot-starter-web</artifactId>
   </dependency>
   <dependency>
         <groupId>io.opentelemetry.instrumentation</groupId>
         <artifactId>opentelemetry-spring-boot-starter</artifactId>
   </dependency>

   <dependency>
         <groupId>org.springframework.boot</groupId>
         <artifactId>spring-boot-starter-actuator</artifactId>
   </dependency>

   <dependency>
         <groupId>io.micrometer</groupId>
         <artifactId>micrometer-tracing-bridge-otel</artifactId>
   </dependency>

   <dependency>
         <groupId>io.opentelemetry</groupId>
         <artifactId>opentelemetry-exporter-otlp</artifactId>
   </dependency>
   <dependency>
         <groupId>org.projectlombok</groupId>
         <artifactId>lombok</artifactId>
         <version>1.18.38</version>
   </dependency>
</dependencies>
</project>

3.2 Getting-started example

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
@Slf4j
class MyController {

    private final ChatClient chatClient;

    // The ChatClient.Builder is injected by Spring; the model behind it is set in application.properties.
    public MyController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    // Streams the reply as server-sent events while the model generates it.
    @GetMapping(value = "/ai-stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    Flux<String> generationStream(@RequestParam("userInput") String userInput) {
        return this.chatClient.prompt()
                .user(userInput)
                .stream()
                .content();
    }

    // Waits for the full reply and returns it as a single string.
    @GetMapping("/ai")
    String generation(@RequestParam("userInput") String userInput) {
        return this.chatClient.prompt()
                .user(userInput)
                .call()
                .content();
    }
}

application.properties
spring.ai.ollama.chat.enabled=true
spring.ai.model.chat=ollama
spring.ai.ollama.chat.options.model=qwen3:8b
spring.ai.chat.observations.include-prompt=true
spring.ai.chat.observations.include-completion=true
management.tracing.sampling.probability=1.0

Run the command:
curl --location 'http://localhost:8080/ai?userInput=%E4%BD%A0%E5%A5%BD'

Response (the model's reasoning followed by its reply; the final sentence means "Hello! Is there anything I can help you with?"):
嗯，用户发来“你好”，我需要以自然的方式回应。首先，应该用中文回复，保持友好和亲切的语气。可以简单问候，比如“你好！有什么我可以帮助你的吗？”这样既回应了对方的问候，又主动询问是否需要帮助，符合我的设计原则。同时，要避免使用复杂或生硬的表达，让对话显得轻松。另外，考虑到用户可能有各种需求，保持开放式的提问可以引导他们进一步说明具体问题，这样我才能更好地提供帮助。确保回复简洁明了，符合日常交流的习惯。你好！有什么我可以帮助你的吗？
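The `userInput` value in the curl command is the percent-encoded UTF-8 form of 你好. When calling the endpoint from Java rather than curl, the same URL can be built as in this minimal sketch; `AiRequestUriSketch` and `aiUri` are hypothetical names, and the base URL assumes the application runs locally on port 8080 as in the command.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class AiRequestUriSketch {

    // Builds the /ai request URI, percent-encoding the user input as UTF-8.
    static URI aiUri(String baseUrl, String userInput) {
        String encoded = URLEncoder.encode(userInput, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai?userInput=" + encoded);
    }

    public static void main(String[] args) {
        // "\u4F60\u597D" is the greeting used in the curl example, written with Unicode escapes.
        System.out.println(aiUri("http://localhost:8080", "\u4F60\u597D"));
        // prints http://localhost:8080/ai?userInput=%E4%BD%A0%E5%A5%BD
    }
}
```

The resulting `URI` can be passed to any HTTP client, for example `java.net.http.HttpClient`. Note that `URLEncoder` uses form encoding, so spaces become `+`.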