Source code for the Spring AI Alibaba RAG Example project:
https://github.com/springaialibaba/spring-ai-alibaba-examples/tree/main/spring-ai-alibaba-rag-example
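Learn next-generation RAG techniques and build efficient AI applications. Key topics: 1. RAG application architecture and core components; 2. Environment preparation and starting the Ollama service; 3. Downloading models and deploying Elasticsearch.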
RAG Application Architecture Overview
Cloud Native
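Ollama: a local runtime engine for large models, the "Docker of the LLM era", which makes it quick to deploy and try out large models.
Spring AI Alibaba: an enhancement of Spring AI that integrates the DashScope model platform for quickly building LLM applications.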
Environment Preparation
services:
  ollama:
    container_name: ollama
    image: ollama/ollama:latest
    ports:
      - 11434:11434
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - 3005:8080
    environment:
      - 'OLLAMA_BASE_URL=http://host.docker.internal:11434'
    # Allow the container to reach the host network
    extra_hosts:
      - host.docker.internal:host-gateway
docker exec -it ollama ollama pull deepseek-r1:8b
docker exec -it ollama ollama pull nomic-embed-text:latest
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.16.1
    container_name: elasticsearch
    privileged: true
    environment:
      - "cluster.name=elasticsearch"
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms512m -Xmx1096m"
      - bootstrap.memory_lock=true
    volumes:
      - ./config/es.yaml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - "9200:9200"
      - "9300:9300"
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 1000M
        reservations:
          memory: 200M
cluster.name: docker-es
node.name: es-node-1
network.host: 0.0.0.0
network.publish_host: 0.0.0.0
http.port: 9200
http.cors.enabled: true
http.cors.allow-origin: "*"
bootstrap.memory_lock: true
# Disable authentication and authorization (enabled by default in ES 8.x)
xpack.security.enabled: false
Project Configuration
<!-- Spring Boot Web Starter -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>3.3.4</version>
</dependency>
<!-- Spring AI Ollama Starter -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M5</version>
</dependency>
<!-- Vector store -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-elasticsearch-store</artifactId>
    <version>1.0.0-M5</version>
</dependency>
<!-- PDF parsing -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
    <version>1.0.0-M5</version>
</dependency>
spring:
  ai:
    # Ollama configuration
    ollama:
      base-url: http://127.0.0.1:11434
      chat:
        model: deepseek-r1:8b
      embedding:
        model: nomic-embed-text:latest
    # Vector store configuration
    vectorstore:
      elasticsearch:
        index-name: ollama-rag-embedding-index
        similarity: cosine
        dimensions: 768
  elasticsearch:
    uris: http://127.0.0.1:9200
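With `similarity: cosine` and `dimensions: 768`, each document chunk and each query is stored as a 768-dimensional embedding and compared by cosine similarity. The following is a minimal plain-Java sketch of that comparison, not code from the example project: the class name `CosineSimilaritySketch` is hypothetical, and returning 0 for a zero vector is a convention chosen here only to avoid dividing by zero.

```java
/** Illustrative helper (hypothetical name); not part of the example project. */
final class CosineSimilaritySketch {

    /** Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|). */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            // Assumption made for this sketch: a zero vector is similar to nothing.
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score -1. The length check mirrors why the configured `dimensions` value has to match the output size of the embedding model.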
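You are a macOS expert. Answer based on the following context:
---------------------
{question_answer_context}
---------------------
Combine the given context with the provided history and answer in Chinese, in Markdown format. If the answer is not in the context, say so explicitly.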
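The `{question_answer_context}` placeholder in this template is where the retrieved document chunks end up before the prompt is sent to the model. In the project that substitution is handled by the advisor; the sketch below only illustrates the idea with plain string replacement. The class name `PromptTemplateSketch` and the newline join are assumptions made for illustration.

```java
import java.util.List;

/** Illustrative helper (hypothetical name); not part of the example project. */
final class PromptTemplateSketch {

    static final String PLACEHOLDER = "{question_answer_context}";

    /** Joins the retrieved chunks and substitutes them for the placeholder. */
    static String fill(String template, List<String> retrievedChunks) {
        String context = String.join("\n", retrievedChunks);
        return template.replace(PLACEHOLDER, context);
    }
}
```

Calling `PromptTemplateSketch.fill(template, chunks)` with two chunks yields a prompt in which both chunks sit between the two separator lines, so the model sees the retrieved text alongside its instructions.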
Core Implementation
public class KnowledgeInitializer implements ApplicationRunner {

    // Injected VectorStore instance, responsible for adding and querying vectorized data
    private final VectorStore vectorStore;

    // Vector database client; Elasticsearch is used here
    private final ElasticsearchClient elasticsearchClient;

    // .....

    @Override
    public void run(ApplicationArguments args) {
        // 1. Load the PDF resources.
        List<Resource> pdfResources = loadPdfResources();
        // 2. Parse the PDF resources into Documents.
        List<Document> documents = parsePdfResource(pdfResources);
        // 3. Import them into Elasticsearch.
        importToES(documents);
    }

    private List<Document> parsePdfResource(List<Resource> pdfResources) {
        // Split the text with the chosen strategy and convert it into Document objects
        List<Document> resList = new ArrayList<>();
        for (Resource springAiResource : pdfResources) {
            // 1. Parse the document
            DocumentReader reader = new PagePdfDocumentReader(springAiResource);
            List<Document> documents = reader.get();
            logger.info("{} documents loaded", documents.size());
            // 2. Split into chunks
            List<Document> splitDocuments = new TokenTextSplitter().apply(documents);
            logger.info("{} documents split", splitDocuments.size());
            // 3. Add to the result list
            resList.addAll(splitDocuments);
        }
        return resList;
    }

    // ......
}
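The splitting step in `parsePdfResource` turns each parsed page into smaller chunks before they are embedded. As a rough illustration of what chunking does, here is a simplified stand-in that splits on whitespace-separated words. This is not how `TokenTextSplitter` works internally (it operates on model tokens), and the class name `ChunkingSketch` and the `maxWords` parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative word-based chunker (hypothetical name); a stand-in, not TokenTextSplitter. */
final class ChunkingSketch {

    /** Splits text into consecutive chunks of at most maxWords words each. */
    static List<String> split(String text, int maxWords) {
        if (maxWords <= 0) {
            throw new IllegalArgumentException("maxWords must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks; // nothing to split
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += maxWords) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }
}
```

Smaller chunks tend to make each embedding more focused, at the cost of producing more vectors to store and search; the right size depends on the documents and the embedding model.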
public class AIRagService {

    // System prompt template
    @Value("classpath:/prompts/system-qa.st")
    private Resource systemResource;

    // Injected beans
    private final ChatModel ragChatModel;
    private final VectorStore vectorStore;

    // Text field used for filtering, to improve vector retrieval precision
    private static final String textField = "content";

    // ......

    public Flux<String> retrieve(String prompt) {
        // Load the prompt template
        String promptTemplate = getPromptTemplate(systemResource);

        // Enable hybrid search, covering both embedding and full-text search
        SearchRequest searchRequest = SearchRequest.builder()
                .topK(4)
                .similarityThresholdAll()
                .build();

        // Build the ChatClient and call the model service.
        return ChatClient.builder(ragChatModel)
                .build().prompt()
                .advisors(new QuestionAnswerAdvisor(
                        vectorStore,
                        searchRequest,
                        promptTemplate))
                .user(prompt)
                .stream()
                .content();
    }
}
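In the search request above, `topK(4)` caps the result at four documents and `similarityThresholdAll()` accepts every similarity score. The selection logic can be sketched in plain Java as follows. This is an illustration of the idea, not the vector store's implementation; `TopKSketch` and `Scored` are hypothetical names, and in the real system the scores come from Elasticsearch.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative top-K selection (hypothetical names); not the vector store's code. */
final class TopKSketch {

    /** A candidate document id paired with its similarity score. */
    record Scored(String id, double score) {}

    /** Keeps candidates scoring at least threshold, best first, at most topK of them. */
    static List<Scored> select(List<Scored> candidates, int topK, double threshold) {
        return candidates.stream()
                .filter(c -> c.score() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }
}
```

A threshold of 0.0 corresponds to accepting everything, so only the `topK` limit trims the list. Raising the threshold trades recall for precision, because weakly related chunks no longer reach the prompt.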
public class AIRagController {

    public AIRagService aiRagService;

    public Flux<String> chat(
            String prompt,
            HttpServletResponse response
    ) {
        // Fix garbled response text in stream mode.
        response.setCharacterEncoding("UTF-8");

        if (!StringUtils.hasText(prompt)) {
            return Flux.just("prompt is null.");
        }

        return aiRagService.retrieve(prompt);
    }
}
Request Demo
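Take the question "I'm new to the Mac and want to configure the trackpad so that it works better for me. Do you have any suggestions?" as an example. The answer obtained by calling the model directly is fairly generic and of little practical use.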
RAG Optimization
spring:
  application:
    name: ollama-rag
  ai:
    dashscope:
      api-key: ${AI_DASHSCOPE_API_KEY}
      chat:
        options:
          model: deepseek-r1
      embedding:
        enabled: false
    ollama:
      base-url: http://127.0.0.1:11434
      chat:
        model: deepseek-r1:8b
        enabled: false
      embedding:
        model: nomic-embed-text:latest
    vectorstore:
      elasticsearch:
        index-name: ollama-rag-embedding-index
        similarity: cosine
        dimensions: 768
  elasticsearch:
    uris: http://127.0.0.1:9200
<!-- Spring AI Alibaba DashScope -->
<dependency>
    <groupId>com.alibaba.cloud.ai</groupId>
    <artifactId>spring-ai-alibaba-starter</artifactId>
    <version>1.0.0-M6.1</version>
</dependency>
public Flux<String> retrieve(String prompt) {
    // Get the vector store prompt template.
    String promptTemplate = getPromptTemplate(systemResource);

    // Enable hybrid search, both embedding and full-text search
    SearchRequest searchRequest = SearchRequest.builder()
            .topK(4)
            .similarityThresholdAll()
            .build();

    // Build ChatClient with retrieval rerank advisor:
    ChatClient runtimeChatClient = ChatClient.builder(chatModel)
            .defaultAdvisors(new RetrievalRerankAdvisor(
                    vectorStore,
                    rerankModel,
                    searchRequest,
                    promptTemplate,
                    0.1))
            .build();

    // Spring AI RetrievalAugmentationAdvisor
    Advisor retrievalAugmentationAdvisor = RetrievalAugmentationAdvisor.builder()
            .queryTransformers(RewriteQueryTransformer.builder()
                    .chatClientBuilder(ChatClient.builder(ragChatModel).build().mutate())
                    .build())
            .documentRetriever(VectorStoreDocumentRetriever.builder()
                    .similarityThreshold(0.50)
                    .vectorStore(vectorStore)
                    .build())
            .build();

    // Retrieve and generate with the LLM
    return ragClient.prompt()
            .advisors(retrievalAugmentationAdvisor)
            .user(prompt)
            .stream()
            .content();
}
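The rerank step adds a second pass over the retrieved candidates: a separate model rescores each one against the query, and low-scoring candidates are dropped before the prompt is built. The sketch below shows that two-stage idea in plain Java. It is not the `RetrievalRerankAdvisor` implementation; `RerankSketch`, `Ranked`, and the `scorer` function are hypothetical, and reading the `0.1` argument above as a minimum relevance score is an assumption.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

/** Illustrative rerank pass (hypothetical names); not the advisor's implementation. */
final class RerankSketch {

    /** A candidate text paired with the score assigned by the reranker. */
    record Ranked(String text, double score) {}

    /**
     * Rescores first-stage candidates with the given scorer, drops those below
     * minScore, and returns the rest ordered from most to least relevant.
     */
    static List<Ranked> rerank(List<String> candidates,
                               ToDoubleFunction<String> scorer,
                               double minScore) {
        return candidates.stream()
                .map(text -> new Ranked(text, scorer.applyAsDouble(text)))
                .filter(r -> r.score() >= minScore)
                .sorted(Comparator.comparingDouble(Ranked::score).reversed())
                .collect(Collectors.toList());
    }
}
```

In a real pipeline the scorer would call a rerank model with the query and the candidate text. The first-stage vector search is cheap and broad, while the rerank pass is slower but better at ordering the few candidates that survive.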
https://java2ai.com/docs/1.0.0-M5.1/tutorials/rag/
Troubleshooting
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
</repositories>
Summary