GraphRAG connects an Agent directly to a knowledge graph so it can locate the exact data it needs on its own, and combining it with LangGraph pushes agent workflows a step further. This article covers:
1. How GraphRAG lets an Agent look up the data that supports an answer directly in the graph, rather than by fuzzy similarity
2. How the Agent executes tasks and manages workflows precisely within a multi-agent system
3. A hands-on walkthrough of environment setup and knowledge-graph construction
Given a task, the Agent can explore the knowledge graph, or any other database, on its own and retrieve the data that directly supports the answer. There is no fuzzy similarity matching, and nothing is made up.
# Open Windows System Properties (typically used here to set environment variables such as JAVA_HOME for Neo4j)
sysdm.cpl
# Start a local Neo4j server in console mode
neo4j console
# Create and activate a dedicated Python 3.10 environment, install the dependencies, then start JupyterLab
conda create -n agent python=3.10
conda activate agent
pip install -r requirements.txt
jupyter lab
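The contents of requirements.txt are not shown in the article. A plausible minimal dependency list, assuming the LangChain + Neo4j + OpenAI stack used in the code below (package names are illustrative; pin versions to match your environment):

langchain
langchain-community
langchain-openai
langchain-text-splitters
neo4j
pypdf
tiktoken
jupyterlab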
# Connect to your local Neo4j instance
import os
from langchain_community.graphs import Neo4jGraph

os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "password"

graph = Neo4jGraph(refresh_schema=False)
graph.query("CREATE CONSTRAINT IF NOT EXISTS FOR (c:Chunk) REQUIRE c.id IS UNIQUE")
graph.query("CREATE CONSTRAINT IF NOT EXISTS FOR (c:AtomicFact) REQUIRE c.id IS UNIQUE")
graph.query("CREATE CONSTRAINT IF NOT EXISTS FOR (c:KeyElement) REQUIRE c.id IS UNIQUE")
graph.query("CREATE CONSTRAINT IF NOT EXISTS FOR (d:Document) REQUIRE d.id IS UNIQUE")
# Configure a PyPDFDirectoryLoader instance
from langchain_community.document_loaders import PyPDFDirectoryLoader

loader = PyPDFDirectoryLoader(
    path="E:\\neo4j000\\data",
    glob="**/[!.]*.pdf",
    silent_errors=False,
    load_hidden=False,
    recursive=False,
    extract_images=False,
    password=None,
    mode="page",
    headers=None,
    extraction_mode="plain",
    # extraction_kwargs=None,
)
# Load the PDF files
documents = loader.load()

# Print the loaded documents
for doc in documents:
    print(doc)
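The process_document call further down takes a single string text, which the excerpt never constructs. A minimal sketch, assuming you simply concatenate the page contents of the loaded documents (the original article may build text differently):

# Hypothetical glue code: join all loaded pages into one string for chunking
text = "\n".join(doc.page_content for doc in documents)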
# Prompt for extracting the knowledge graph's key elements and atomic facts
construction_system = """
You are now an intelligent assistant tasked with meticulously extracting both key elements and
atomic facts from a long text.
1. Key Elements: The essential nouns (e.g., characters, times, events, places, numbers), verbs (e.g.,
actions), and adjectives (e.g., states, feelings) that are pivotal to the text’s narrative.
2. Atomic Facts: The smallest, indivisible facts, presented as concise sentences. These include
propositions, theories, existences, concepts, and implicit elements like logic, causality, event
sequences, interpersonal relationships, timelines, etc.
Requirements:
#####
1. Ensure that all identified key elements are reflected within the corresponding atomic facts.
2. You should extract key elements and atomic facts comprehensively, especially those that are
important and potentially query-worthy and do not leave out details.
3. Whenever applicable, replace pronouns with their specific noun counterparts (e.g., change I, He,
She to actual names).
4. Ensure that the key elements and atomic facts you extract are presented in the same language as
the original text (e.g., English or Chinese).
"""
from langchain_core.prompts import ChatPromptTemplate

construction_human = """Use the given format to extract information from the
following input: {input}"""

construction_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", construction_system),
        ("human", construction_human),
    ]
)
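The construction_chain invoked in process_document below is not defined anywhere in the excerpt. A minimal sketch of what it likely looks like, assuming an OpenAI chat model with structured output; the class and field names (AtomicFact, Extraction, key_elements, atomic_fact, atomic_facts) are chosen to match how the results are consumed later, while the specific model and field descriptions are assumptions:

from typing import List
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class AtomicFact(BaseModel):
    key_elements: List[str] = Field(description="Key elements reflected in this atomic fact")
    atomic_fact: str = Field(description="The smallest, indivisible fact, stated as a concise sentence")

class Extraction(BaseModel):
    atomic_facts: List[AtomicFact] = Field(description="All atomic facts extracted from the chunk")

# Hypothetical model choice; any chat model that supports structured output works here
model = ChatOpenAI(model="gpt-4o", temperature=0)
construction_chain = construction_prompt | model.with_structured_output(Extraction)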
In plain terms, the system prompt makes the model act as an assistant that carefully extracts two things from long text: key elements, the core nouns (people, times, events, places, numbers), verbs (actions) and adjectives (states, feelings) that drive the narrative; and atomic facts, the smallest indivisible facts stated as concise sentences, covering propositions, theories, existences, concepts, and implicit relations such as logic, causality, event order, interpersonal relationships and timelines. It also requires that every key element appears in some atomic fact, that extraction is comprehensive and does not drop query-worthy details, that pronouns are replaced with the specific names they refer to, and that the output stays in the same language as the source text.
import asyncio
from datetime import datetime
from langchain_text_splitters import TokenTextSplitter

# Split the text into 2k-token chunks, extract atomic facts with the LLM, then import everything into Neo4j
async def process_document(text, document_name, chunk_size=2000, chunk_overlap=200):
    start = datetime.now()
    print(f"Started extraction at: {start}")
    text_splitter = TokenTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    texts = text_splitter.split_text(text)
    print(f"Total text chunks: {len(texts)}")
    tasks = [
        asyncio.create_task(construction_chain.ainvoke({"input": chunk_text}))
        for index, chunk_text in enumerate(texts)
    ]
    results = await asyncio.gather(*tasks)
    print(f"Finished LLM extraction after: {datetime.now() - start}")
    docs = [el.dict() for el in results]
    for index, doc in enumerate(docs):
        doc["chunk_id"] = encode_md5(texts[index])
        doc["chunk_text"] = texts[index]
        doc["index"] = index
        for af in doc["atomic_facts"]:
            af["id"] = encode_md5(af["atomic_fact"])
    # Import chunks / atomic facts / key elements
    graph.query(import_query, params={"data": docs, "document_name": document_name})
    # Create NEXT relationships between consecutive chunks
    graph.query(
        """
        MATCH (c:Chunk)<-[:HAS_CHUNK]-(d:Document)
        WHERE d.id = $document_name
        WITH c ORDER BY c.index
        WITH collect(c) AS nodes
        UNWIND range(0, size(nodes) - 2) AS index
        WITH nodes[index] AS start, nodes[index + 1] AS end
        MERGE (start)-[:NEXT]->(end)
        """,
        params={"document_name": document_name},
    )
    print(f"Finished import at: {datetime.now() - start}")
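Two names used above, encode_md5 and import_query, are not defined anywhere in the excerpt. A minimal sketch of what they could look like, consistent with the uniqueness constraints created earlier (the exact import Cypher in the original article may differ):

from hashlib import md5

def encode_md5(text):
    # Stable id for chunks and atomic facts
    return md5(text.encode("utf-8")).hexdigest()

# Illustrative import query: one Document per call, linked to Chunk, AtomicFact and KeyElement nodes
import_query = """
MERGE (d:Document {id: $document_name})
WITH d
UNWIND $data AS row
MERGE (c:Chunk {id: row.chunk_id})
SET c.text = row.chunk_text, c.index = row.index
MERGE (d)-[:HAS_CHUNK]->(c)
WITH c, row
UNWIND row.atomic_facts AS af
MERGE (a:AtomicFact {id: af.id})
SET a.text = af.atomic_fact
MERGE (c)-[:HAS_ATOMIC_FACT]->(a)
WITH a, af
UNWIND af.key_elements AS ke
MERGE (k:KeyElement {id: ke})
MERGE (a)-[:HAS_KEY_ELEMENT]->(k)
"""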
await process_document(text, "wse2", chunk_size=2000, chunk_overlap=100)