Follow my CSDN blog: https://spike.blog.csdn.net/
Article URL: https://spike.blog.csdn.net/article/details/142938982
Disclaimer: this article draws on personal knowledge and publicly available material and is intended for academic exchange only. Discussion is welcome; reposting is not permitted.

Neo4j is a high-performance graph database that stores and retrieves data as a graph, a form well suited to complex relationships and network structures. It is widely used for its strength in handling relationships between data, especially in social networks, recommendation systems, and network analysis.

To build the GraphRAG knowledge graph itself, see the earlier post "Configuring GraphRAG + Ollama to Build a Chinese Knowledge Graph: Tutorial (Pitfall Notes)".

Doc: https://neo4j.com/docs/apoc/current/

1. Configuring the Neo4j service
Prepare Docker (see Docker - Neo4j):
docker pull neo4j:5.24.1
Start the container (this launches the Neo4j service directly):
docker run --network=host --gpus all --rm --name neo4j-apoc \
  -e NEO4J_apoc_export_file_enabled=true \
  -e NEO4J_apoc_import_file_enabled=true \
  -e NEO4J_apoc_import_file_use__neo4j__config=true \
  -e NEO4J_PLUGINS=\[\"apoc\"\] \
  --volume=[your folder]:[your folder] \
  neo4j:5.24.1
Alternatively, enter the container first and then start the service:
docker run --network=host --gpus all -it --name neo4j-apoc -e NEO4J_apoc_export_file_enabled=true -e NEO4J_apoc_import_file_enabled=true -e NEO4J_apoc_import_file_use__neo4j__config=true -e NEO4J_PLUGINS=\[\"apoc\"\] --volume=[your folder]:[your folder] neo4j:5.24.1 /bin/bash
bin/neo4j start
Note: use the Docker setup with Neo4j APOC enabled. APOC (Awesome Procedures on Cypher) is a plugin for the Neo4j graph database that provides a set of powerful procedures and functions extending the Cypher query language. See "Neo4J and APOC".
Log:
Installing Plugin apoc from /var/lib/neo4j/labs/apoc-*-core.jar to /var/lib/neo4j/plugins/apoc.jar
Applying default values for plugin apoc to neo4j.conf
2024-10-15 01:40:54.429+0000 INFO Logging config in use: File /var/lib/neo4j/conf/user-logs.xml
2024-10-15 01:40:54.443+0000 INFO Starting...
2024-10-15 01:40:55.191+0000 INFO This instance is ServerId{0350f51a} (0350f51a-ef80-414f-b82f-8e4b38fc369f)
2024-10-15 01:40:56.078+0000 INFO Neo4j 5.24.1
2024-10-15 01:40:58.875+0000 INFO Anonymous Usage Data is being sent to Neo4j, see https://neo4j.com/docs/usage-data/
2024-10-15 01:40:58.910+0000 INFO Bolt enabled on 0.0.0.0:7687.
2024-10-15 01:40:59.325+0000 INFO HTTP enabled on 0.0.0.0:7474.
2024-10-15 01:40:59.326+0000 INFO Remote interface available at http://localhost:7474/
2024-10-15 01:40:59.328+0000 INFO id: 3C118963730B6744966FCB5FC5D9D5795B11AD1F791A4DDC113D02D1F926441F
2024-10-15 01:40:59.329+0000 INFO name: system
2024-10-15 01:40:59.329+0000 INFO creationDate: 2024-10-15T01:40:57.342Z
2024-10-15 01:40:59.329+0000 INFO Started.
Open the service at http://[your ip]:7474/browser/. The default username and password are both neo4j. On first login you are required to set a new password xxxxxx, for example neo4j123 (choose your own).
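Once the server reports that it has started, a quick check from Python can confirm that the Bolt port accepts the new password and that the APOC plugin was actually loaded. The following is only a minimal sketch and not part of the original setup: `check_apoc` is a hypothetical helper name, and it assumes a driver object created with the official `neo4j` Python package.

```python
def check_apoc(driver, database="neo4j"):
    """Return the APOC version string reported by the server.

    `driver` is assumed to be a neo4j.Driver (or any object exposing the
    same verify_connectivity() and execute_query() methods).
    """
    # Raises an exception if the server is unreachable or the credentials are wrong.
    driver.verify_connectivity()
    # apoc.version() only resolves when the APOC plugin was loaded at startup.
    records, _summary, _keys = driver.execute_query(
        "RETURN apoc.version() AS version",
        database_=database,
    )
    return records[0]["version"]


# Example usage (URI and password are placeholders):
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "xxxxxx"))
#   print("APOC version:", check_apoc(driver))
```

If the call fails with an unknown-function error, the plugin was most likely not installed, and the `Installing Plugin apoc` line should be missing from the startup log as well.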
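On the start page, note that both entities and relationships are still empty.

2. Importing the knowledge graph data
The data is stored in /var/lib/neo4j/data/databases/neo4j, where neo4j is the database name.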
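The import statements later in this section all MERGE nodes on a key property (`id` or `community`) and match entities by `name`. Without an index, each of those lookups scans every node with that label, so creating constraints before importing usually helps on larger graphs. This step is not in the original walkthrough; the sketch below is one possible way to do it, with hypothetical helper and constraint names, assuming a `driver` like the one created later in this section.

```python
# Label/property pairs that the import statements in this section MERGE on.
UNIQUE_KEYS = {
    "chunk_id": ("__Chunk__", "id"),
    "entity_id": ("__Entity__", "id"),
    "community_id": ("__Community__", "community"),
}


def constraint_statements():
    """Build idempotent Cypher statements: uniqueness constraints plus a name index."""
    statements = [
        f"CREATE CONSTRAINT {name} IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
        for name, (label, prop) in UNIQUE_KEYS.items()
    ]
    # Relationships are matched by entity name, so a plain index on name helps too.
    statements.append(
        "CREATE INDEX entity_name IF NOT EXISTS FOR (n:__Entity__) ON (n.name)"
    )
    return statements


def create_constraints(driver, database="neo4j"):
    """Run every statement; `driver` is assumed to be a neo4j.Driver."""
    for statement in constraint_statements():
        driver.execute_query(statement, database_=database)
```

Because every statement uses `IF NOT EXISTS`, running `create_constraints(driver)` more than once should be harmless.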
Read the GraphRAG knowledge graph data as follows:
import os
import pandas as pd

rag_dir = "[your folder]/llm/graphrag/ragtest/output/"
entities = pd.read_parquet(os.path.join(rag_dir, "create_final_entities.parquet"))
relationships = pd.read_parquet(os.path.join(rag_dir, "create_final_relationships.parquet"))
text_units = pd.read_parquet(os.path.join(rag_dir, "create_final_text_units.parquet"))
communities = pd.read_parquet(os.path.join(rag_dir, "create_final_communities.parquet"))
community_reports = pd.read_parquet(os.path.join(rag_dir, "create_final_community_reports.parquet"))
Inspect the data:
entities.head(2)
relationships.head(2)
text_units.head(2)
communities.head(2)
community_reports.head(2)
Connect to the server:
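from neo4j import GraphDatabase

NEO4J_URI = "neo4j://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "xxxxxx"  # the password set earlier
NEO4J_DATABASE = "neo4j"  # default database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
Note: the Community Edition cannot create a new database (the command would be CREATE DATABASE my-database), so only the default neo4j database can be used.
Data import function: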
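def import_data(cypher, df, batch_size=1000):
    # Send the DataFrame in batches; each row reaches the Cypher statement as `value`.
    for i in range(0, len(df), batch_size):
        batch = df.iloc[i: min(i + batch_size, len(df))]
        result = driver.execute_query(
            "UNWIND $rows AS value " + cypher,
            rows=batch.to_dict("records"),
            database_=NEO4J_DATABASE,
        )
        # Print the update counters (nodes created, properties set, ...) per batch.
        print(result.summary.counters)
Statement for importing text_units: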
# Import text_units
cypher_text_units = """
MERGE (c:__Chunk__ {id:value.id})
SET c += value {.text, .n_tokens}
WITH c, value
UNWIND value.document_ids AS document
MATCH (d:__Document__ {id:document})
MERGE (c)-[:PART_OF]->(d)
"""
import_data(cypher_text_units, text_units)
Log from a successful run:
{'_contains_updates': True, 'labels_added': 99, 'relationships_created': 235, 'nodes_created': 99, 'properties_set': 396}
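Each Cypher statement in this section reads specific columns from its DataFrame through `value`. In Cypher, reading a key that is absent from a map yields null rather than an error, so a renamed or missing column tends to show up only as missing properties or relationships, not as a failed import. Column names may also differ between GraphRAG versions. The sketch below is one way to check up front; `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names, and the column lists are simply those referenced by the statements in this post.

```python
# Columns referenced by the Cypher statements in this post, per table.
REQUIRED_COLUMNS = {
    "text_units": ["id", "text", "n_tokens", "document_ids"],
    "entities": ["id", "human_readable_id", "description", "name", "type",
                 "description_embedding", "text_unit_ids"],
    "relationships": ["id", "source", "target", "rank", "weight",
                      "human_readable_id", "description", "text_unit_ids"],
    "communities": ["id", "level", "title", "relationship_ids"],
    "community_reports": ["community", "level", "title", "rank",
                          "rank_explanation", "full_content", "summary",
                          "findings"],
}


def missing_columns(table_name, columns):
    """Return the required columns of `table_name` that are absent from `columns`.

    `columns` can be any iterable of names, for example `df.columns`.
    """
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS[table_name] if name not in present]
```

For example, `missing_columns("entities", entities.columns)` should return an empty list before `import_data` is called on `entities`.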
Statement for importing entities:
# Import entities
cypher_entities = """
MERGE (e:__Entity__ {id:value.id})
SET e += value {.human_readable_id, .description, name:replace(value.name,'"','')}
WITH e, value
CALL db.create.setNodeVectorProperty(e, "description_embedding", value.description_embedding)
CALL apoc.create.addLabels(e, case when coalesce(value.type,"") = "" then [] else [apoc.text.upperCamelCase(replace(value.type,'"',''))] end) yield node
UNWIND value.text_unit_ids AS text_unit
MATCH (c:__Chunk__ {id:text_unit})
MERGE (c)-[:HAS_ENTITY]->(e)
"""
import_data(cypher_entities, entities)
Statement for importing relationships:
# Import relationships
cypher_relationships = """
MATCH (source:__Entity__ {name:replace(value.source,'"','')})
MATCH (target:__Entity__ {name:replace(value.target,'"','')})
// not necessary to merge on id as there is only one relationship per pair
MERGE (source)-[rel:RELATED {id: value.id}]->(target)
SET rel += value {.rank, .weight, .human_readable_id, .description, .text_unit_ids}
RETURN count(*) as createdRels
"""
import_data(cypher_relationships, relationships)
Statement for importing communities:
# Import communities
cypher_communities = """
MERGE (c:__Community__ {community:value.id})
SET c += value {.level, .title}
/*
UNWIND value.text_unit_ids as text_unit_id
MATCH (t:__Chunk__ {id:text_unit_id})
MERGE (c)-[:HAS_CHUNK]->(t)
WITH distinct c, value
*/
WITH *
UNWIND value.relationship_ids as rel_id
MATCH (start:__Entity__)-[:RELATED {id:rel_id}]->(end:__Entity__)
MERGE (start)-[:IN_COMMUNITY]->(c)
MERGE (end)-[:IN_COMMUNITY]->(c)
RETURN count(distinct c) as createdCommunities
"""
import_data(cypher_communities, communities)
Statement for importing community_reports:
# Import community_reports
cypher_community_reports = """
MATCH (c:__Community__ {community: value.community})
SET c += value {.level, .title, .rank, .rank_explanation, .full_content, .summary}
WITH c, value
UNWIND range(0, size(value.findings)-1) AS finding_idx
WITH c, value, finding_idx, value.findings[finding_idx] as finding
MERGE (c)-[:HAS_FINDING]->(f:Finding {id: finding_idx})
SET f += finding
"""
import_data(cypher_community_reports, community_reports)

3. Testing the result
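Open the Neo4j page to visualize the knowledge graph, including features such as Node labels and Relationship types. For visualizing other knowledge graph elements, see the Neo4j documentation.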
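Besides the browser view, the same Node labels and Relationship types can be counted from Python, which makes it easy to compare against the counters printed during the import. This is a minimal sketch, not part of the original post: `graph_summary` is a hypothetical helper, and it assumes the `driver` created in section 2.

```python
def graph_summary(driver, database="neo4j"):
    """Return ({label: node count}, {relationship type: relationship count}).

    `driver` is assumed to be a neo4j.Driver.
    """
    # A node with several labels is counted once under each of its labels.
    label_records, _, _ = driver.execute_query(
        "MATCH (n) UNWIND labels(n) AS label "
        "RETURN label, count(*) AS n_nodes ORDER BY label",
        database_=database,
    )
    # The directed pattern visits every relationship exactly once.
    rel_records, _, _ = driver.execute_query(
        "MATCH ()-[r]->() "
        "RETURN type(r) AS rel_type, count(*) AS n_rels ORDER BY rel_type",
        database_=database,
    )
    labels = {record["label"]: record["n_nodes"] for record in label_records}
    rel_types = {record["rel_type"]: record["n_rels"] for record in rel_records}
    return labels, rel_types


# Example usage:
#   labels, rel_types = graph_summary(driver)
#   print(labels)
#   print(rel_types)
```

An empty result for a label such as `__Community__` would suggest that the corresponding import step matched nothing and is worth rerunning.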