Microsoft Agent Framework agent for reading bank statements.

Ryan Goeken 8bb8277c4a Update python		2026-06-17 23:33:48 -05:00

spending-agent-ai

A personal Streamlit chat app for querying bank statements in natural language. Drop your PDFs in files/, run the app, and ask questions about your spending. Answers stream token-by-token, follow-up questions work, and a sidebar button reloads statements without restarting the server.

Prerequisites

Python 3.14 — managed via mise (mise install)
uv — for the virtualenv and deps (mise install or pip install uv)
GitHub Copilot — active subscription + CLI auth (gh auth login with Copilot access)

Setup

# 1. Clone and enter the repo
git clone <repo-url> spending-agent-ai
cd spending-agent-ai

# 2. Create virtualenv and install dependencies
uv sync

# 3. Authenticate GitHub Copilot (if not already done)
gh auth login

# 4. Add your bank statement PDFs
mkdir -p files
cp ~/Downloads/*.pdf files/

Run

streamlit run app.py

Open http://localhost:8501 in your browser. The sidebar shows how many statements were loaded and the date range. Start typing in the chat input at the bottom.

To stop the server: Ctrl+C.

DevUI (development server)

A lightweight HTTP API server for querying the agent directly — no browser required. Useful for scripted testing, debugging context injection, or driving the agent from other tools.

Prerequisites: files/ must be populated with PDF bank statements (same as the Streamlit app).

Start:

uv run python devui_server.py

Open http://127.0.0.1:9898 in your browser for the DevUI interface, or query the API directly (see below).

Restarting during development:
The devui CLI entry point accepts a --reload flag but does not forward it to uvicorn.run(), so the flag is a no-op. File-watch hot-reload is not supported in any case: the in-memory entities (the BM25 index and loaded chunks) have no source files to watch. The dev loop is therefore to restart the process after each edit:

uv run python devui_server.py

The server starts in ~3 seconds (BM25 index: ~1 s, uvicorn bind: ~2 s), making a manual restart the practical approach.
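
For scripted testing, it can help to block until the restarted server is accepting requests before sending a query. The following is a minimal sketch using only the standard library. It polls the /v1/entities endpoint that the API examples in this README use; the function names (`http_probe`, `wait_until_ready`) and the timeout values are assumptions for illustration, not part of the repo.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=2) as reply:
            return reply.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(
    url: str = "http://127.0.0.1:9898/v1/entities",
    timeout: float = 15.0,
    interval: float = 0.5,
    probe: Callable[[str], bool] = http_probe,
) -> bool:
    """Poll until probe(url) succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The `probe` parameter exists only so the polling loop can be exercised without a running server; in normal use, calling `wait_until_ready()` with no arguments is enough.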

Direct API querying (two-step pattern):

# Step 1: get the entity ID
ENTITY_ID=$(curl -sf http://127.0.0.1:9898/v1/entities \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['entities'][0]['id'])")

# Step 2: send a spending query
curl -sf http://127.0.0.1:9898/v1/responses \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"spending-analyst\",\"input\":\"What did I spend the most on in November?\",\"metadata\":{\"entity_id\":\"$ENTITY_ID\"}}" \
  | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['output'][0]['content'][0]['text'])"

Note on response shape: The DevUI API nests the answer text at output[0].content[0].text, not a top-level output_text key.
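
The same two-step pattern can be sketched in Python with only the standard library. The endpoint paths, the metadata.entity_id field, and the response shape are taken from the curl commands above; the helper names (`build_payload`, `extract_answer`, `ask`) are illustrative and not part of the repo.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:9898"  # DevUI address used in the curl examples


def build_payload(entity_id: str, question: str) -> dict:
    """Build the request body sent in step 2 of the curl example."""
    return {
        "model": "spending-analyst",
        "input": question,
        "metadata": {"entity_id": entity_id},
    }


def extract_answer(response: dict) -> str:
    """Pull the answer text from output[0].content[0].text."""
    return response["output"][0]["content"][0]["text"]


def _request_json(url: str, body: Optional[dict] = None) -> dict:
    """GET when body is None, otherwise POST the body as JSON."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as reply:
        return json.load(reply)


def ask(question: str, base_url: str = BASE_URL) -> str:
    """Step 1: look up the first entity ID. Step 2: send the query."""
    entities = _request_json(f"{base_url}/v1/entities")
    entity_id = entities["entities"][0]["id"]
    payload = build_payload(entity_id, question)
    return extract_answer(_request_json(f"{base_url}/v1/responses", payload))


# Usage (requires the DevUI server to be running):
#   print(ask("What did I spend the most on in November?"))
```

Building the body with json.dumps avoids the shell-escaped quotes needed in the curl version.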

Diagnostic check (at import time, no server needed):

# Confirm BM25CopilotAgent is wired and index is populated
PYTHONPATH=. python -c "from devui_server import agent, bm25_index, chunks; print('agent:', type(agent).__name__, '| index:', type(bm25_index).__name__, '| chunks:', len(chunks))"

If index is NoneType or chunks is 0, BM25 context injection is silently skipped on every query — check that PDFs are present in files/.
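
For scripted smoke tests, the two failure conditions above can be wrapped in a small helper. This is a sketch: `retrieval_status` is a hypothetical name, and it assumes only what the one-liner relies on, namely that the index may be None and that chunks supports len().

```python
from typing import Optional, Sized


def retrieval_status(index: Optional[object], chunks: Sized) -> str:
    """Report whether BM25 context injection can run, based on the two
    conditions named above: a missing index or an empty chunk list."""
    if index is None:
        return "skipped: index is None (are PDFs present in files/?)"
    if len(chunks) == 0:
        return "skipped: 0 chunks loaded (are PDFs present in files/?)"
    return f"ready: {type(index).__name__} with {len(chunks)} chunks"


# Usage, run from the project root:
#   from devui_server import bm25_index, chunks
#   print(retrieval_status(bm25_index, chunks))
```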

Example queries

Top spending

What did I spend the most on?

Returns a ranked table of payees with totals across all loaded statements.

Monthly breakdown

Give me a full breakdown of my spending for each month.

Returns a month-by-month summary with totals and notable payees.

Follow-up (conversation memory)

What did I spend the most on?
→ [agent answers with ranked list]

What about just November?
→ [agent scopes previous answer to November only]

And how does that compare to December?
→ [agent compares the two months using prior context]

Specific vendor or category

How much did I spend at grocery stores total?
How many times did I pay rent and what was the total?
What recurring subscriptions do I have?

Date-range queries

What did I spend in Q4 2025?
Show me my spending for October and November combined.
Which month was most expensive overall?

Anomaly detection

Are there any unusually large transactions I should know about?
What's the biggest single purchase in each month?
Are there any charges that look like duplicates?

Reload statements

Click 🔄 Reload statements in the sidebar to re-parse the files/ directory and reset the conversation — useful after adding new monthly PDFs without restarting the server.

Project layout

app.py              — Streamlit chat UI
agent.py            — GitHub Copilot agent wrapper (build, session, stream)
devui_server.py     — DevUI entry point (dev server — BM25 context injection wired)
loader.py           — PDF text extraction (pdfplumber)
retrieval/          — BM25 chunker + index (used by both app.py and devui_server.py)
files/              — Bank statement PDFs (gitignored — add your own)
pyproject.toml      — Dependencies
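
For readers unfamiliar with BM25, the ranking idea behind retrieval/ can be sketched in a few lines of plain Python. This is not the repo's implementation, since its chunker, tokenizer, and parameters are not shown in this README. The whitespace tokenizer, the `k1` and `b` defaults, and the function name `bm25_scores` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates via k1 and is normalized by length via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Usage: pick the chunk most relevant to a question.
chunks = [
    "rent payment november 1200",
    "grocery store november 54",
    "coffee shop december 4",
]
scores = bm25_scores("grocery november", chunks)
best_chunk = chunks[scores.index(max(scores))]
```

In a retrieval setup like the one described here, the top-scoring chunks would be prepended to the prompt so the agent answers from the relevant statement text rather than from every PDF at once.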

Diagnostics

# Is the server healthy?
curl http://localhost:8501/_stcore/health   # → "ok"

# Test PDF extraction independently
python loader.py files/

# Test the agent independently (no Streamlit needed)
python agent.py

If the sidebar shows "No statements loaded", check that PDFs are present in files/ and that streamlit run app.py was launched from the project root directory.

If the chat input is disabled with "Agent not ready", run gh auth status to check the Copilot CLI session, which may have expired.

Docker

Build the image (required once, or after code changes):

mise run docker:build

Copy .env.example to .env and set your token:

cp .env.example .env
# Edit .env and set GITHUB_TOKEN=<your_token>
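
Before starting the container, it can be worth confirming that the placeholder was actually replaced. The sketch below parses simple KEY=VALUE lines and checks GITHUB_TOKEN. The helper names (`parse_env`, `token_is_set`) are hypothetical, and the parser is an assumption that covers only plain assignments, not the full syntax Docker Compose accepts.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def token_is_set(env: dict[str, str]) -> bool:
    """True when GITHUB_TOKEN is present, non-empty, and not the placeholder."""
    token = env.get("GITHUB_TOKEN", "")
    return bool(token) and token != "<your_token>"


# Usage, run from the project root:
#   from pathlib import Path
#   print(token_is_set(parse_env(Path(".env").read_text())))
```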

Start the container (detached):

mise run docker:up

The Streamlit UI is available at http://localhost:8501. Bank statement PDFs in the files/ directory are bind-mounted into the container — the sidebar will show parsed statements automatically.

Tail logs:

mise run docker:logs

Stop the container:

mise run docker:down