RAG pipeline for semantic search over personal archives

Eric Furst 2026-02-26 16:45:23 -05:00
commit 57d4f062ef
11 changed files with 1768 additions and 0 deletions

.gitignore (new file)

@@ -0,0 +1,36 @@
# Python
.venv/
__pycache__/
*.pyc

# HuggingFace cached models (large, ~2 GB)
models/

# Vector stores (large, rebuild with build scripts)
store/
storage_clippings/

# Data (symlinks to private files)
data
clippings

# Generated file lists
ocr_needed.txt

# IDE and OS
.DS_Store
.vscode/
.idea/

# Jupyter checkpoints
.ipynb_checkpoints/

# Secrets
.env
API_key_temp

# Query log
query.log

# Duplicate of CLAUDE.md
claude.md