Neo4j 向量索引
Neo4j 是一个开源图数据库,集成了向量相似性搜索的支持
它支持:
- 近似最近邻搜索
- 欧几里得相似性和余弦相似性
- 结合向量和关键词搜索的混合搜索
本笔记本展示了如何使用 Neo4j 向量索引 (Neo4jVector
)。
请参阅 安装说明。
# Pip install necessary package
%pip install --upgrade --quiet neo4j
%pip install --upgrade --quiet langchain-openai langchain-community
%pip install --upgrade --quiet tiktoken
我们想使用 OpenAIEmbeddings
,所以我们必须获取 OpenAI API 密钥。
import getpass
import os
if "OPENAI_API_KEY" not in os.environ:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
OpenAI API Key: ········
<!--IMPORTS:[{"imported": "TextLoader", "source": "langchain_community.document_loaders", "docs": "https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.text.TextLoader.html", "title": "Neo4j Vector Index"}, {"imported": "Neo4jVector", "source": "langchain_community.vectorstores", "docs": "https://python.langchain.com/api_reference/community/vectorstores/langchain_community.vectorstores.neo4j_vector.Neo4jVector.html", "title": "Neo4j Vector Index"}, {"imported": "Document", "source": "langchain_core.documents", "docs": "https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html", "title": "Neo4j Vector Index"}, {"imported": "OpenAIEmbeddings", "source": "langchain_openai", "docs": "https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html", "title": "Neo4j Vector Index"}, {"imported": "CharacterTextSplitter", "source": "langchain_text_splitters", "docs": "https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.CharacterTextSplitter.html", "title": "Neo4j Vector Index"}]-->
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Neo4jVector
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
# Neo4jVector requires the Neo4j database credentials
url = "bolt://localhost:7687"
username = "neo4j"
password = "password"
# You can also use environment variables instead of directly passing named parameters
# os.environ["NEO4J_URI"] = "bolt://localhost:7687"
# os.environ["NEO4J_USERNAME"] = "neo4j"
# os.environ["NEO4J_PASSWORD"] = "pleaseletmein"