Text Generation
PDF Q&A Application
Let users upload any PDF and ask natural-language questions about its contents. The app combines document chunking, embeddings, and LLM inference on OneInfer.
OneInfer Chat API · OneInfer Embeddings API · LangChain · Python · FAISS
Architecture
1. Parse PDF + chunk text
2. Embed chunks via OneInfer
3. Query → retrieve → answer
Step-by-step guide
1. Install dependencies

```bash
pip install openai langchain faiss-cpu pypdf
```

Note that LangChain's `PyPDFLoader` depends on the `pypdf` package.

2. Parse and chunk your PDF
```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = PyPDFLoader("document.pdf")
pages = loader.load()

# 512-character chunks with 64 characters of overlap between neighbors
splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)
chunks = splitter.split_documents(pages)
```
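To build intuition for what the splitter produces, here is a simplified standalone sketch of fixed-window chunking with overlap. This is an illustration only, not LangChain's actual implementation (which recursively splits on separators like paragraphs and sentences before falling back to character windows):

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> list[str]:
    """Naive fixed-window chunker: each chunk repeats the last
    `chunk_overlap` characters of the previous one."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# A 1000-character sample yields three overlapping chunks
sample = "".join(chr(97 + i % 26) for i in range(1000))
pieces = chunk_text(sample)
```

The overlap means a sentence cut off at a chunk boundary still appears intact in the neighboring chunk, which helps retrieval quality.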
3. Embed chunks using OneInfer
```python
import openai

# OpenAI-compatible client pointed at the OneInfer endpoint
client = openai.OpenAI(
    api_key="your-oneinfer-api-key",
    base_url="https://api.oneinfer.ai/v1",
)

def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    return [d.embedding for d in response.data]

# Embed all chunks
texts = [c.page_content for c in chunks]
vectors = embed(texts)
```
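For large PDFs, note that embedding providers commonly cap the number of inputs per request; the exact OneInfer limit is not stated here, so the batch size below is an assumption to check against your provider's docs. A generic wrapper that feeds `embed()` in slices:

```python
from typing import Callable

def embed_in_batches(
    texts: list[str],
    embed_fn: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,  # assumed request cap; verify with your provider
) -> list[list[float]]:
    """Call embed_fn on successive slices of texts and concatenate results."""
    vectors: list[list[float]] = []
    for i in range(0, len(texts), batch_size):
        vectors.extend(embed_fn(texts[i:i + batch_size]))
    return vectors

# Usage with a stand-in embedder (swap in the embed() defined above):
fake_embed = lambda batch: [[float(len(t))] for t in batch]
result = embed_in_batches([str(i) * i for i in range(1, 6)], fake_embed, batch_size=2)
```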
4. Build a FAISS index and query it
```python
import numpy as np
import faiss

# Exact (brute-force) index over L2 distance
dim = len(vectors[0])
index = faiss.IndexFlatL2(dim)
index.add(np.array(vectors, dtype="float32"))

def answer_question(question: str) -> str:
    # Embed the question and retrieve the 4 closest chunks as context
    q_vec = np.array(embed([question]), dtype="float32")
    _, indices = index.search(q_vec, 4)
    context = "\n\n".join(texts[i] for i in indices[0])
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=[
            {"role": "system", "content": "Answer questions based only on the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer_question("What are the key findings in chapter 3?"))
```
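A note on the distance metric: `IndexFlatL2` ranks neighbors by Euclidean distance. OpenAI-style embedding models typically return unit-length vectors, in which case L2 and cosine ranking agree; whether OneInfer's embeddings are normalized is an assumption worth verifying. If they are not, normalize rows before indexing, e.g.:

```python
import numpy as np

def l2_normalize(vectors: list[list[float]]) -> np.ndarray:
    """Scale each row to unit length so L2 and cosine rankings coincide."""
    arr = np.asarray(vectors, dtype="float32")
    norms = np.linalg.norm(arr, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows
    return arr / np.clip(norms, 1e-12, None)

unit = l2_normalize([[3.0, 4.0], [0.0, 2.0]])
```

With normalized vectors you could equivalently use `faiss.IndexFlatIP` (inner product), where a larger score means a closer match.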