✅ Module composition overview

PDF / web page / document
  ↓
DocumentLoader
  ↓
TextSplitter
  ↓
Embeddings (OpenAI, etc.)
  ↓
VectorStore (FAISS, etc.)
  ↓
.as_retriever()
  ↓
Used by RetrievalQA / ConversationalRetrievalChain
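To make the data flow above concrete, here is a minimal, dependency-free sketch of the same stages. It does not use LangChain: every name in it (`load_documents`, `split_documents`, `embed`, `ToyVectorStore`) is hypothetical, the sample text is made up, and a bag-of-words count vector stands in for a real embedding model. It only illustrates how each stage feeds the next.

```python
import math
from collections import Counter


def load_documents():
    # Stand-in for a DocumentLoader: returns raw text instead of reading a PDF.
    return [
        "Revenue grew in 2023. Operating costs fell slightly.",
        "The company opened two new offices. Headcount rose.",
    ]


def split_documents(docs):
    # Stand-in for a TextSplitter: one chunk per sentence.
    chunks = []
    for doc in docs:
        for sentence in doc.split(". "):
            sentence = sentence.strip().rstrip(".")
            if sentence:
                chunks.append(sentence)
    return chunks


def embed(text):
    # Stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    # Stand-in for a VectorStore: keeps (chunk, vector) pairs in a list.
    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def as_retriever(self, k=1):
        # Returns a function that maps a query to its k most similar chunks.
        def retrieve(query):
            q = embed(query)
            ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
            return [chunk for chunk, _ in ranked[:k]]
        return retrieve


chunks = split_documents(load_documents())
retriever = ToyVectorStore(chunks).as_retriever(k=1)
top_chunks = retriever("how many new offices opened")
print(top_chunks)
```

In a real pipeline, the chunks returned by the retriever are what a chain such as RetrievalQA passes to the language model as context.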
 
 

🧪 Sample code that runs the whole pipeline in one go (a canonical RAG example):

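```python
# Note: recent LangChain releases moved these classes into separate packages
# (langchain_community, langchain_openai, langchain_text_splitters);
# adjust the imports to match your installed version.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# 1. Load the document
loader = PyPDFLoader("10-K.pdf")
docs = loader.load()

# 2. Split it into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 3. Embed the chunks and build the vector store
embedding = OpenAIEmbeddings()  # expects OPENAI_API_KEY in the environment
vectorstore = FAISS.from_documents(chunks, embedding)

# 4. Build the retriever
retriever = vectorstore.as_retriever()
```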
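The `chunk_size=500, chunk_overlap=50` settings in step 2 are easier to reason about with a simplified model. The sketch below is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on separators such as paragraphs and newlines); it is a plain fixed-size sliding window, and `sliding_chunks` is a hypothetical helper written only to show what the two numbers mean: each chunk holds at most `chunk_size` characters, and neighbouring chunks share `chunk_overlap` characters.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# A 1200-character dummy string stands in for the text of a loaded document.
sample = "".join(str(i % 10) for i in range(1200))
pieces = sliding_chunks(sample, chunk_size=500, chunk_overlap=50)
print([len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.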