SF Legacy Business Registry - Civic Tech

Full Stack Developer @ Hackathon Project 2025 (8 hours)

Built a PDF-to-structured-data pipeline with LlamaParse that turns San Francisco legacy business applications into an engaging, searchable database.

LlamaParse · PDF Processing · Civic Tech · Vector Search · Full Stack

Problem

San Francisco's legacy business data was trapped in PDF application forms, making it difficult to discover and explore the city's historic businesses.

Solution

Data Pipeline

  • Used LlamaParse to extract structured data from PDF applications
  • Cleaned and normalized business information
  • Built searchable database with rich metadata
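
The cleaning and normalization step might look roughly like the sketch below. It assumes the extraction stage yields one dictionary of raw strings per application. The field names (`name`, `neighborhood`, `business_type`, `year_established`), the `BusinessRecord` class, and the sample values are hypothetical illustrations, not the project's actual schema or data.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class BusinessRecord:
    """Clean record shape (hypothetical; the real schema is not documented here)."""
    name: str
    neighborhood: str
    business_type: str
    year_established: Optional[int]


def normalize_text(value: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def parse_year(value: str) -> Optional[int]:
    """Return the first four-digit year between 1800 and 2099, or None."""
    match = re.search(r"\b(1[89]\d{2}|20\d{2})\b", value)
    return int(match.group(1)) if match else None


def clean_record(raw: dict) -> BusinessRecord:
    """Map one raw extracted record (assumed field names) to a clean record."""
    return BusinessRecord(
        name=normalize_text(raw.get("name", "")),
        neighborhood=normalize_text(raw.get("neighborhood", "")).title(),
        business_type=normalize_text(raw.get("business_type", "")).lower(),
        year_established=parse_year(raw.get("year_established", "")),
    )


# Made-up input, not a row from the real registry.
raw = {
    "name": "  Example   Bakery ",
    "neighborhood": "north beach",
    "business_type": "Bakery",
    "year_established": "Est. 1956",
}
record = clean_record(raw)
```

Keeping the year as `Optional[int]` lets records with an unreadable date stay in the database instead of being dropped or given a guessed value.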

User Experience

  • Created engaging visual interface for browsing businesses
  • Added filtering by neighborhood, business type, and year established
  • Implemented search functionality
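
As an illustration of the filtering and search described above, here is a minimal in-memory sketch. The `Business` class, the `filter_businesses` function, its parameter names, and the sample rows are all assumptions made for the example. The actual project may well have filtered in a database or search index instead.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Business:
    name: str
    neighborhood: str
    business_type: str
    year_established: int


def filter_businesses(
    businesses: Iterable[Business],
    neighborhood: Optional[str] = None,
    business_type: Optional[str] = None,
    established_before: Optional[int] = None,
    query: Optional[str] = None,
) -> List[Business]:
    """Return the businesses that match every filter that was supplied."""
    results = []
    for b in businesses:
        if neighborhood and b.neighborhood.lower() != neighborhood.lower():
            continue
        if business_type and b.business_type.lower() != business_type.lower():
            continue
        if established_before is not None and b.year_established >= established_before:
            continue
        # Plain substring match on the name; a stand-in for real search.
        if query and query.lower() not in b.name.lower():
            continue
        results.append(b)
    return results


# Made-up sample rows, not entries from the real registry.
SAMPLE = [
    Business("Example Bakery", "North Beach", "bakery", 1956),
    Business("Sample Books", "Mission", "bookstore", 1972),
    Business("Demo Diner", "North Beach", "restaurant", 1981),
]

north_beach = filter_businesses(SAMPLE, neighborhood="north beach")
older = filter_businesses(SAMPLE, established_before=1975)
matches = filter_businesses(SAMPLE, query="book")
```

Filters left as `None` are skipped, so the same function serves a single dropdown selection or several filters combined.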

Impact

  • Made legacy business records, previously locked in PDF forms, searchable and browsable
  • Demonstrated practical LLM applications for document processing
  • Created reusable pipeline for similar civic tech projects

Tech Stack

  • LlamaParse for PDF extraction
  • Vector search for semantic queries
  • Modern web framework for UI
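
To show the ranking idea behind vector search, the sketch below scores documents against a query by cosine similarity. The `embed` function here is a deliberately simple bag-of-words stand-in, not the embedding model the project used (which is not named above), so it only matches shared words rather than meaning. All names and descriptions are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, index: Dict[str, Counter], top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank indexed entries by similarity to the query, highest first."""
    query_vec = embed(query)
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up descriptions, not text from the real registry.
DESCRIPTIONS = {
    "Example Bakery": "family bakery selling bread and pastries",
    "Sample Books": "independent bookstore with used books",
    "Demo Diner": "neighborhood diner serving breakfast",
}
INDEX = {name: embed(text) for name, text in DESCRIPTIONS.items()}

results = search("fresh bread bakery", INDEX)
```

Swapping `embed` for a learned embedding model would keep the ranking code unchanged while letting queries match on meaning instead of exact words.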