# pdf-to-kcf

A Python CLI tool that uses AI agents to parse PDF documents and extract structured insights.

Built with `pydantic-ai`, this tool creates an intelligent agent that autonomously analyzes documents, requesting additional pages as needed to form complete insights.

## Features

- **Autonomous Document Analysis**: AI agent decides how much of the document to read
- **Structured Insight Extraction**: Classifies content as facts, opinions, or comments
- **Rich Metadata**: Adds attributes like source, confidence, dates, and more
- **Multiple AI Models**: Supports OpenAI and other compatible models
- **JSON Output**: Exports insights in a structured, machine-readable format

## Installation

This project uses [uv](https://github.com/astral-sh/uv) for dependency management:

```bash
# Install dependencies
uv sync
```

## Setup

1. Copy the environment template:

   ```bash
   cp .env.example .env
   ```

2. Add your OpenRouter API key to `.env`:

   ```bash
   OPENROUTER_API_KEY=your_openrouter_api_key_here
   ```

3. Get your API key from [OpenRouter](https://openrouter.ai/) (free tier available)

## Usage

```bash
# Basic usage (uses OpenRouter with Claude 3.5 Sonnet by default)
uv run pdf-to-kcf document.pdf

# Specify custom output file
uv run pdf-to-kcf document.pdf -o insights.json

# Start from a specific page (0-indexed)
uv run pdf-to-kcf document.pdf -s 3

# Use a different AI model from OpenRouter
uv run pdf-to-kcf document.pdf -m meta-llama/llama-3.1-70b-instruct
uv run pdf-to-kcf document.pdf -m google/gemini-pro-1.5
```

### Options

- `--output, -o`: Output JSON file path (default: `_insights.json`)
- `--start-page, -s`: Starting page number, 0-indexed (default: 0)
- `--model, -m`: AI model to use via OpenRouter (default: `anthropic/claude-3.5-sonnet`)

### Available Models

When using OpenRouter, you can specify any model by its `provider/model-name` identifier:

- `anthropic/claude-3.5-sonnet` (default, recommended)
- `anthropic/claude-3-opus`
- `openai/gpt-4o`
- `meta-llama/llama-3.1-70b-instruct`
- `google/gemini-pro-1.5`
- See [OpenRouter models](https://openrouter.ai/models) for full list

## Output Format

The tool generates JSON files with structured insights:

```json
{
  "insights": [
    {
      "type": "fact",
      "insight": "Global temperatures have risen 1.1°C since pre-industrial times",
      "content": "According to the IPCC, global temperatures have risen approximately 1.1°C...",
      "attributes": [
        {"attribute": "source", "value": "IPCC Report"},
        {"attribute": "confidence", "value": "high"},
        {"attribute": "year", "value": "2023"}
      ]
    },
    {
      "type": "opinion",
      "insight": "The author believes immediate action is required",
      "content": "We must act now to prevent catastrophic consequences...",
      "attributes": [
        {"attribute": "sentiment", "value": "urgent"},
        {"attribute": "section", "value": "conclusion"}
      ]
    }
  ]
}
```

## How It Works

1. **PDF Loading**: Extracts text content from PDF using pypdf
2. **Agent Initialization**: Creates a pydantic-ai agent with the specified model
3. **Autonomous Analysis**: Agent analyzes content and can request additional pages
4. **Insight Extraction**: Classifies and structures insights with metadata
5. **JSON Export**: Saves all insights to a JSON file

## Requirements

- Python 3.12 or higher
- OpenRouter API key (set as `OPENROUTER_API_KEY` environment variable)
  - Get your free API key at [OpenRouter](https://openrouter.ai/)
  - Supports all major AI models (Claude, GPT-4, Gemini, Llama, etc.)
  - Alternatively, use `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or other provider keys

## Architecture

The tool follows the agentic document parsing format with these core components:

- **models.py**: Data structures (ContentInsight, PageContentAnalysis, etc.)
- **pdf_reader.py**: PDF text extraction (PDFDocument class)
- **agent.py**: AI agent with autonomous page reading capability
- **cli.py**: Command-line interface

See `CLAUDE.md` for detailed architecture documentation.

## License

MIT
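
## Example: Consuming the Output

The JSON structure shown under Output Format stores each insight's metadata as a list of attribute/value pairs, which is easier to work with once flattened. The sketch below is not part of the tool: `group_by_type` and `attributes_as_dict` are hypothetical helper names, and the inline `SAMPLE` string is an abbreviated stand-in for a real output file, assuming the file matches the documented shape.

```python
import json
from collections import defaultdict

# Abbreviated stand-in for a file written by the CLI (assumption: real
# output follows the shape documented under "Output Format").
SAMPLE = """
{
  "insights": [
    {
      "type": "fact",
      "insight": "Global temperatures have risen since pre-industrial times",
      "content": "According to the IPCC, ...",
      "attributes": [
        {"attribute": "source", "value": "IPCC Report"},
        {"attribute": "confidence", "value": "high"}
      ]
    },
    {
      "type": "opinion",
      "insight": "The author believes immediate action is required",
      "content": "We must act now ...",
      "attributes": [
        {"attribute": "sentiment", "value": "urgent"}
      ]
    }
  ]
}
"""


def group_by_type(data):
    """Group insight records by their "type" field (hypothetical helper)."""
    grouped = defaultdict(list)
    for item in data.get("insights", []):
        grouped[item["type"]].append(item)
    return dict(grouped)


def attributes_as_dict(insight):
    """Flatten the attribute/value pair list into a plain dict."""
    return {a["attribute"]: a["value"] for a in insight.get("attributes", [])}


data = json.loads(SAMPLE)
grouped = group_by_type(data)

# Print one line per insight: its type, summary, and flattened metadata.
for kind, items in grouped.items():
    for item in items:
        print(kind, "|", item["insight"], "|", attributes_as_dict(item))
```

To run this against a real result, replace `json.loads(SAMPLE)` with `json.load(open(path))`, where `path` is whatever was passed to `--output`.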