The board of directors wanted an AI agent that could write grants. That's how the engagement started. Pivotal IQ, a consulting firm I advise on the technical side, had a pro-bono client — the S.S. Jeremiah O'Brien, a WWII Liberty Ship museum in San Francisco. The board had heard enough about AI to ask the question, which put them ahead of most nonprofits. But "can AI write our grants?" is a different question than "should we point AI at grant writing right now, and if so, where?"

The full vision would have been substantial. Hands-on interaction with the museum's staff to learn their language. Working against paper documents, probably years of prior submissions and internal reports. Tailoring the output to match how the Jeremiah O'Brien actually talks about itself, not how an LLM thinks a museum should. That's a real engagement, months of iterative work, and the museum didn't have the resources for it.

But before you can write grants, you need to know who to write them to. That's a different problem, and it turned out to be tractable on its own.

What We Were Actually Solving

Over a hundred thousand US foundations exist in the IRS database. The museum needed to figure out which ones might fund them. Traditional grant databases cost thousands a year and rely on keyword matching, which is terrible for an organization like this. The Jeremiah O'Brien is a ship, a museum, a war memorial, and an educational institution simultaneously. Search for "maritime" and you get noise. Search for "museum" and you get a different kind of noise. The multi-domain identity makes keyword discovery actively misleading.

The better signal was in public tax filings. Every private foundation files a 990-PF listing every grant it made that year. If a foundation already funded the USS Intrepid Museum and the National WWII Museum, that tells you more about its priorities than its mission statement does. An LLM can read a grantee list and recognize that "Swords to Plowshares" is a veteran services org. Keyword search can't do that.

Three peer categories emerged: maritime and naval, military and veteran, museum and heritage. Match on who foundations have actually funded, not what they say they fund.
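
To make that concrete, here's a minimal sketch of the peer-salience idea in Python. The category seed lists, the helper, and the example grantees are illustrative, not the actual lists from the engagement.

```python
# Sketch of peer-salience scoring: rate a foundation by how many of its past
# grantees resemble organizations in the museum's three peer categories.
# Seed lists and the example grantee list are illustrative placeholders.

PEER_CATEGORIES = {
    "maritime_naval": {"uss intrepid museum", "san francisco maritime national park association"},
    "military_veteran": {"national wwii museum", "swords to plowshares"},
    "museum_heritage": {"exploratorium", "california historical society"},
}

def peer_salience(grantees: list[str]) -> dict[str, float]:
    """Fraction of a foundation's grantees that fall in each peer category."""
    names = [g.strip().lower() for g in grantees]
    scores = {}
    for category, peers in PEER_CATEGORIES.items():
        hits = sum(1 for name in names if name in peers)
        scores[category] = hits / len(names) if names else 0.0
    return scores

# A foundation that already funds the Intrepid and the National WWII Museum
# scores high on two categories, regardless of what its mission statement says.
print(peer_salience(["USS Intrepid Museum", "National WWII Museum", "Local Food Bank"]))
```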

The Technical Problem Under the AI Problem

From the tech advisory side, I was thinking about three things.

First, how to minimize AI spend. The museum had zero budget, but even for Pivotal IQ's future clients, per-query API costs add up fast when you're processing thousands of foundations. We built a BAML-to-CLI adapter that routes all LLM inference through an Anthropic Max subscription instead of the API. Hundreds of dollars in would-be API costs became zero. That routing logic is modular — it's a reusable piece of infrastructure, not a one-off hack.
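
A minimal sketch of the routing idea, assuming a locally authenticated claude CLI with a non-interactive print mode; the flag and the function boundary here are illustrative, not the actual BAML adapter.

```python
# Sketch of the zero-cost routing idea: instead of calling the Anthropic API
# directly, shell out to a locally authenticated `claude` CLI, which bills
# against a Max subscription rather than per-token API pricing.
# The exact CLI flag is an assumption; check your installed CLI's help output.
import subprocess

def run_inference(prompt: str, timeout_s: int = 120) -> str:
    """Send one prompt through the CLI and return its text output."""
    result = subprocess.run(
        ["claude", "-p", prompt],   # assumed non-interactive "print" mode
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout.strip()

# Everything upstream (BAML functions, classification prompts, scoring prompts)
# calls run_inference() instead of an API client, so swapping the backend later
# means changing this one function.
```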

Second, assembling enough data for AI-driven analysis to even make sense. You can't just point Claude at "foundations" and get useful output. We had to discover and integrate six public data sources — IRS EO Master File, Statistics of Income, 990-PF filings, Grants.gov, IMLS grants, NPS Maritime Heritage — collect and merge them by EIN, enrich with geo scoring and confidence tagging, then export a normalized 34-column CSV as the integration contract between the data pipeline and the agent.
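
Stripped down, the merge-and-export step looks something like this. The file paths and column handling are placeholders standing in for the real 34-column contract schema.

```python
# Sketch of the merge-and-export step: each public source is normalized to a
# frame keyed by EIN, outer-merged, then written as one CSV that serves as the
# contract between the Python pipeline and the TypeScript agent.
# File names and columns here are placeholders, not the actual schema.
import pandas as pd

def load_source(path: str, ein_col: str = "EIN") -> pd.DataFrame:
    df = pd.read_csv(path, dtype={ein_col: str})
    df[ein_col] = df[ein_col].str.replace("-", "").str.zfill(9)  # normalize EINs
    return df

sources = {
    "eo_master": load_source("data/irs_eo_master.csv"),
    "soi": load_source("data/irs_soi.csv"),
    "form_990pf": load_source("data/990pf_grants.csv"),
}

merged = None
for name, df in sources.items():
    df = df.add_prefix(f"{name}_").rename(columns={f"{name}_EIN": "EIN"})
    merged = df if merged is None else merged.merge(df, on="EIN", how="outer")

# Enrichment (geo scoring, confidence tagging) would slot in here before export.
merged.to_csv("foundation_candidates.csv", index=False)
```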

Third, exploring what in the data is actually worth throwing AI at. The cheap filtering — local vector embeddings with Nomic 1.5, running on a single machine — narrows a hundred thousand foundations to about five hundred candidates. Only then does the LLM touch it: classifying grantees, scoring mission fit, analyzing gaps, drafting proposal sections. Expensive analysis on cheap-filtered candidates. With more time, we probably would have gone deeper into the museum's own paper archives, but the public data alone produced useful results.
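
The filtering stage, sketched below. The model-loading details follow Nomic's published usage for sentence-transformers but should be checked against current docs, and the profile strings are placeholders for the roughly hundred thousand summaries the pipeline builds.

```python
# Sketch of the cheap-filter stage: embed foundation text profiles and the
# museum's mission with a local Nomic model, then keep only the closest
# candidates for the expensive LLM pass. The "search_query:" / "search_document:"
# prefixes and trust_remote_code flag follow Nomic's documented usage.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)

mission = ("search_query: WWII Liberty Ship museum, war memorial, and maritime "
           "education nonprofit in San Francisco")
foundation_profiles = [  # placeholders; the real run embeds ~100k of these
    "search_document: Example Maritime Trust; funds naval museums and ship preservation",
    "search_document: Example Arts Fund; funds regional orchestras and galleries",
]

query_vec = model.encode(mission, normalize_embeddings=True)
doc_vecs = model.encode(foundation_profiles, normalize_embeddings=True, batch_size=256)

scores = util.cos_sim(query_vec, doc_vecs)[0]
keep = scores.argsort(descending=True)[:500]  # indices of the best-matching candidates
```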

Looking at the architecture diagram now, it's a lot of moving pieces. But in the moment, reaching for vectors and embedding models to do the initial filtering felt like second nature. Not because I'd studied this academically — because I'd been immersed in the SF AI/ML startup ecosystem for long enough that the techniques were familiar from watching people demo them, attending launches, having founders pitch me on their tools. A year earlier, "doing AI/ML on IRS form data" would have sounded intimidating. But after enough exposure to the people building these tools — and enough learning by doing on my own projects — the pattern matching is just there. You see a problem shaped like "narrow a huge dataset by similarity" and you know that's embeddings. You see "extract structured meaning from messy text" and you know that's an LLM call. It's the kind of fluency you pick up by proximity, by being within biking distance of a hive of devrel spending and startup energy. Not everyone has that access, which is part of why I write these up.

Three Days, Ninety-Nine Tickets

I ran the build through lisa — ninety-nine tickets across nine dependency waves, two parallel Claude agent sessions. The data pipeline is Python. The grant discovery agent is TypeScript and React. A file-based integration contract sits between them: the pipeline exports, the agent imports, neither knows about the other's internals.
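
On the pipeline side, keeping that contract honest is a cheap check before handoff. A sketch, with illustrative column names rather than the real schema:

```python
# Sketch of the handoff check: before the TypeScript agent ever reads the file,
# the pipeline asserts the contract CSV carries the columns the agent expects.
# The column names here are illustrative, not the real 34-column schema.
import pandas as pd

REQUIRED_COLUMNS = {"EIN", "foundation_name", "total_giving", "geo_score", "confidence"}

def validate_contract(path: str) -> None:
    header = set(pd.read_csv(path, nrows=0).columns)
    missing = REQUIRED_COLUMNS - header
    if missing:
        raise ValueError(f"contract file {path} is missing columns: {sorted(missing)}")

validate_contract("foundation_candidates.csv")
```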

The results weren't spectacular. Ninety prospects out of a hundred thousand, twenty with draft proposal language. Useful, not transformative. The museum got a board-ready briefing document organized by tier, delivered at no cost when they couldn't proceed with the full engagement. That's a decent outcome for three days of work on a pro-bono client.

What Actually Came Out of It

The more interesting outcome is what's reusable. The peer salience approach, the zero-cost inference routing, the data pipeline — none of it is specific to the Jeremiah O'Brien. Swap in a different nonprofit's mission and peer categories and the same architecture produces a tailored analysis. Pivotal IQ can run this for other nonprofits, or share the technique with organizations that want to do their own discovery.

The board wanted an agent that writes grants. We didn't build that — it would have required a longer, more hands-on engagement than anyone had budget for. What we built was the precursory tooling: figure out who to write to, understand what they've funded before, and surface the gaps worth addressing. That turned out to be valuable on its own.

The full case study and live demo walk through the architecture, the pipeline stages, and the cost breakdown in detail. If you're interested in the workflow that made it possible to ship this in a weekend — concurrent agents, structured phases, auditable artifacts — that's the lisa story.