Init
# pdf-to-kcf

A Python CLI tool that uses an AI agent to parse PDF documents and extract structured insights. Built with `pydantic-ai`, the agent analyzes a document autonomously, requesting additional pages as needed to form complete insights.

## Features

- **Autonomous Document Analysis**: The AI agent decides how much of the document to read
- **Structured Insight Extraction**: Classifies content as facts, opinions, or comments
- **Rich Metadata**: Adds attributes such as source, confidence, and dates
- **Multiple AI Models**: Works with models available through OpenRouter (Claude, GPT-4o, Gemini, Llama, and others)
- **JSON Output**: Exports insights in a structured, machine-readable format

## Installation

This project uses [uv](https://github.com/astral-sh/uv) for dependency management:

```bash
# Install dependencies
uv sync
```

## Setup

1. Copy the environment template:
```bash
cp .env.example .env
```

2. Get an API key from [OpenRouter](https://openrouter.ai/) (free tier available)

3. Add your OpenRouter API key to `.env`:
```bash
OPENROUTER_API_KEY=your_openrouter_api_key_here
```

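As a quick sanity check before running the tool, a script can confirm that the key is actually present. The sketch below is only an illustration: the helper name `get_api_key` is hypothetical, and it assumes the value from `.env` has already been loaded into the process environment (how the tool itself loads `.env` is not covered here).

```python
import os
from typing import Mapping

# Placeholder value shipped in the Setup instructions above.
PLACEHOLDER = "your_openrouter_api_key_here"


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenRouter key, or raise if it is missing or still the placeholder.

    Hypothetical helper for illustration; not part of pdf-to-kcf.
    """
    key = env.get("OPENROUTER_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("OPENROUTER_API_KEY is not set; add it to .env first")
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(get_api_key({"OPENROUTER_API_KEY": "example-key"}))
```

Passing the environment as a parameter keeps the check easy to exercise without touching real credentials.
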
## Usage

```bash
# Basic usage (uses OpenRouter with Claude 3.5 Sonnet by default)
uv run pdf-to-kcf document.pdf

# Specify custom output file
uv run pdf-to-kcf document.pdf -o insights.json

# Start from a specific page (0-indexed)
uv run pdf-to-kcf document.pdf -s 3

# Use a different AI model from OpenRouter
uv run pdf-to-kcf document.pdf -m meta-llama/llama-3.1-70b-instruct
uv run pdf-to-kcf document.pdf -m google/gemini-pro-1.5
```

### Options

- `--output, -o`: Output JSON file path (default: `<pdf_name>_insights.json`)
- `--start-page, -s`: Starting page number, 0-indexed (default: 0)
- `--model, -m`: AI model to use via OpenRouter (default: `anthropic/claude-3.5-sonnet`)

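To make the defaults concrete, here is a minimal sketch of how these options could be parsed with the standard library's `argparse`. It is not the tool's actual `cli.py`, which may use a different CLI library; `build_parser` and `resolve_output` are hypothetical names, and writing the default output file next to the PDF is an assumption about how `<pdf_name>_insights.json` is resolved.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="pdf-to-kcf")
    parser.add_argument("pdf", type=Path, help="PDF file to analyze")
    parser.add_argument("-o", "--output", type=Path, default=None,
                        help="Output JSON file path")
    parser.add_argument("-s", "--start-page", type=int, default=0,
                        help="Starting page number, 0-indexed")
    parser.add_argument("-m", "--model", default="anthropic/claude-3.5-sonnet",
                        help="AI model to use via OpenRouter")
    return parser


def resolve_output(pdf: Path, output: Optional[Path]) -> Path:
    # Assumption: the default is the PDF's stem plus "_insights.json",
    # placed in the same directory as the PDF.
    if output is not None:
        return output
    return pdf.with_name(f"{pdf.stem}_insights.json")


args = build_parser().parse_args(["document.pdf", "-s", "3"])
print(args.start_page, args.model, resolve_output(args.pdf, args.output))
```
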
### Available Models

When using OpenRouter, you can specify any model using the format `<provider>/<model-name>`:

- `anthropic/claude-3.5-sonnet` (default, recommended)
- `anthropic/claude-3-opus`
- `openai/gpt-4o`
- `meta-llama/llama-3.1-70b-instruct`
- `google/gemini-pro-1.5`
- See [OpenRouter models](https://openrouter.ai/models) for full list

## Output Format

The tool generates JSON files with structured insights:

```json
{
  "insights": [
    {
      "type": "fact",
      "insight": "Global temperatures have risen 1.1°C since pre-industrial times",
      "content": "According to the IPCC, global temperatures have risen approximately 1.1°C...",
      "attributes": [
        {"attribute": "source", "value": "IPCC Report"},
        {"attribute": "confidence", "value": "high"},
        {"attribute": "year", "value": "2023"}
      ]
    },
    {
      "type": "opinion",
      "insight": "The author believes immediate action is required",
      "content": "We must act now to prevent catastrophic consequences...",
      "attributes": [
        {"attribute": "sentiment", "value": "urgent"},
        {"attribute": "section", "value": "conclusion"}
      ]
    }
  ]
}
```

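Because the output is plain JSON, it can be post-processed with the standard library alone. The sketch below groups insights by `type` and flattens each `attributes` list into a dictionary. The helper names `group_by_type` and `attributes_as_dict` are hypothetical, and the inline sample is an abbreviated copy of the example above rather than real tool output.

```python
import json
from collections import defaultdict

# Abbreviated sample in the documented shape; a real run would instead
# read the generated file, e.g. json.load(open("document_insights.json")).
SAMPLE = """
{
  "insights": [
    {"type": "fact", "insight": "A", "content": "...",
     "attributes": [{"attribute": "source", "value": "IPCC Report"},
                    {"attribute": "confidence", "value": "high"}]},
    {"type": "opinion", "insight": "B", "content": "...",
     "attributes": [{"attribute": "sentiment", "value": "urgent"}]}
  ]
}
"""


def group_by_type(document: dict) -> dict:
    """Map each insight type ("fact", "opinion", ...) to its list of insights."""
    grouped = defaultdict(list)
    for item in document.get("insights", []):
        grouped[item["type"]].append(item)
    return dict(grouped)


def attributes_as_dict(insight: dict) -> dict:
    """Turn [{"attribute": k, "value": v}, ...] into {k: v, ...}."""
    return {a["attribute"]: a["value"] for a in insight.get("attributes", [])}


document = json.loads(SAMPLE)
for kind, items in group_by_type(document).items():
    for item in items:
        print(kind, item["insight"], attributes_as_dict(item))
```
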
## How It Works

1. **PDF Loading**: Extracts text content from the PDF using `pypdf`
2. **Agent Initialization**: Creates a `pydantic-ai` agent with the specified model
3. **Autonomous Analysis**: The agent analyzes content and can request additional pages
4. **Insight Extraction**: Classifies and structures insights with metadata
5. **JSON Export**: Saves all insights to a JSON file

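The control flow in steps 3 and 4 can be pictured as a loop that keeps feeding pages to the agent until it stops asking for more. The sketch below is a simplified stand-in, not the tool's implementation: `PageResult`, `analyze`, and `stub_reader` are hypothetical names, and the stub replaces the real model call with a trivial rule so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PageResult:
    """What the (hypothetical) agent returns after reading one page."""
    insights: List[dict] = field(default_factory=list)
    needs_next_page: bool = False


def analyze(pages: List[str],
            read_page: Callable[[str], PageResult],
            start_page: int = 0) -> List[dict]:
    """Read pages from start_page until the reader stops requesting more."""
    insights: List[dict] = []
    index = start_page
    while index < len(pages):
        result = read_page(pages[index])
        insights.extend(result.insights)
        if not result.needs_next_page:
            break
        index += 1
    return insights


def stub_reader(text: str) -> PageResult:
    # Stand-in for the model: emit one "comment" insight per page and ask
    # for the next page whenever the text stops mid-sentence.
    return PageResult(
        insights=[{"type": "comment", "insight": text.strip()}],
        needs_next_page=not text.rstrip().endswith("."),
    )


pages = ["The report continues", "on the next page.", "Appendix."]
print(len(analyze(pages, stub_reader)))
```

In this run the stub reads the first two pages and then stops, so the appendix page is never requested.
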
## Requirements

- Python 3.12 or higher
- OpenRouter API key (set as the `OPENROUTER_API_KEY` environment variable)
  - Get your free API key at [OpenRouter](https://openrouter.ai/)
  - Supports all major AI models (Claude, GPT-4, Gemini, Llama, etc.)
- Alternatively, use `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or other provider keys

## Architecture

The tool follows the agentic document parsing format with these core components:

- **models.py**: Data structures (ContentInsight, PageContentAnalysis, etc.)
- **pdf_reader.py**: PDF text extraction (PDFDocument class)
- **agent.py**: AI agent with autonomous page reading capability
- **cli.py**: Command-line interface

See `CLAUDE.md` for detailed architecture documentation.

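For readers who want a feel for the data structures, here is a rough sketch of what a `ContentInsight` could look like, with fields inferred from the JSON in the Output Format section. It uses standard-library dataclasses so it runs without extra dependencies; the real `models.py` presumably defines pydantic models, and `InsightAttribute` and `to_json` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class InsightAttribute:  # hypothetical name for one attribute/value pair
    attribute: str
    value: str


@dataclass
class ContentInsight:  # fields inferred from the documented JSON output
    type: str          # "fact", "opinion", or "comment"
    insight: str
    content: str
    attributes: List[InsightAttribute] = field(default_factory=list)


def to_json(insights: List[ContentInsight]) -> str:
    """Serialize insights into the {"insights": [...]} shape shown above."""
    payload = {"insights": [asdict(i) for i in insights]}
    return json.dumps(payload, indent=2, ensure_ascii=False)


example = ContentInsight(
    type="opinion",
    insight="The author believes immediate action is required",
    content="We must act now to prevent catastrophic consequences...",
    attributes=[InsightAttribute("sentiment", "urgent")],
)
print(to_json([example]))
```
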
## License

MIT