Problem
San Francisco's legacy business data was trapped in PDF application forms, making it difficult to discover and explore the city's historic businesses.
Solution
Data Pipeline
- Used LlamaParse to extract structured data from PDF applications
- Cleaned and normalized business information
- Built searchable database with rich metadata
User Experience
- Created engaging visual interface for browsing businesses
- Added filtering by neighborhood, business type, and year established
- Implemented search functionality
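
As a rough illustration of the filtering and search described above, the following sketch applies the three filters plus a simple name search to an in-memory list. The record keys, the `filter_businesses` function, and the sample rows are all made up for the example; the project's real query layer is not described in the source.

```python
from typing import Optional

# Placeholder rows for illustration only; not real registry data.
SAMPLE = [
    {"name": "Example Bakery", "neighborhood": "Mission",
     "business_type": "bakery", "year_established": 1948},
    {"name": "Sample Books", "neighborhood": "North Beach",
     "business_type": "bookstore", "year_established": 1953},
    {"name": "Placeholder Hardware", "neighborhood": "Mission",
     "business_type": "hardware", "year_established": None},
]


def filter_businesses(
    records: list,
    neighborhood: Optional[str] = None,
    business_type: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    query: Optional[str] = None,
) -> list:
    """Return records matching every filter that was supplied."""
    results = []
    for rec in records:
        if neighborhood and rec["neighborhood"].casefold() != neighborhood.casefold():
            continue
        if business_type and rec["business_type"].casefold() != business_type.casefold():
            continue
        year = rec.get("year_established")
        # A record with an unknown year cannot satisfy a year bound.
        if year_from is not None and (year is None or year < year_from):
            continue
        if year_to is not None and (year is None or year > year_to):
            continue
        # Case-insensitive substring match on the business name.
        if query and query.casefold() not in rec["name"].casefold():
            continue
        results.append(rec)
    return results
```

For example, `filter_businesses(SAMPLE, neighborhood="mission", year_to=1950)` keeps only the first placeholder row, since the third has no known year.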
Impact
- Made the city's legacy business records searchable and browsable instead of locked in PDFs
- Demonstrated practical LLM applications for document processing
- Created reusable pipeline for similar civic tech projects
Tech Stack
- LlamaParse for PDF extraction
- Vector search for semantic queries
- Modern web framework for UI
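
To show the general shape of vector search, here is a minimal, dependency-free sketch that ranks descriptions by cosine similarity to a query. It is a stand-in, not the project's implementation: the source does not say which embedding model or vector store was used, so `embed` here is a plain word-count vector, which only matches shared words. A real semantic setup would swap `embed` for a learned embedding model while keeping the same ranking logic. The function names and sample descriptions are hypothetical.

```python
import math
import re
from collections import Counter

# Placeholder descriptions for illustration only.
DOCS = [
    "family owned bakery serving sourdough bread",
    "independent bookstore with rare books",
    "hardware store and locksmith",
]


def embed(text: str) -> Counter:
    """Toy embedding: a sparse vector of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: list, top_k: int = 3) -> list:
    """Return up to top_k (score, document) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    # Drop documents with no overlap so unrelated text is never returned.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Calling `search("bread bakery", DOCS)` ranks the bakery description first, and a query that shares no words with any description returns an empty list.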