Python, TF-IDF, FastAPI, SQLite, Search

InsightIndex Search Engine

Index PDFs, HTML, and code locally—TF-IDF ranking and a dashboard, no hosted search tax.

Problem

  • Finding information across personal docs + code shouldn’t require a hosted search stack.
  • You want relevance ranking and observability, not just keyword grep.

Build

  • Local crawler/indexer building an inverted index in SQLite.
  • TF-IDF ranking + FastAPI endpoints + dashboard for stats and highlighted results.

Outcome

  • Fast local search over mixed file types with explainable ranking signals.
  • A dashboard to understand index coverage and query behavior.

The thread

InsightIndex is a personal search engine that crawls your local files—code, documents, web pages, and PDFs—and converts them into a structured inverted index stored in SQLite.
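As a rough illustration of what an inverted index in SQLite can look like, here is a minimal sketch. The schema, the table names (`documents`, `postings`), and the helpers `tokenize`, `create_index`, `add_document`, and `lookup` are all assumptions made for this example, not InsightIndex's actual code.

```python
import re
import sqlite3
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def create_index(conn):
    """Create a hypothetical two-table schema: documents plus term postings."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id     INTEGER PRIMARY KEY,
            path   TEXT UNIQUE NOT NULL,
            length INTEGER NOT NULL          -- number of tokens in the document
        );
        CREATE TABLE IF NOT EXISTS postings (
            term   TEXT NOT NULL,
            doc_id INTEGER NOT NULL REFERENCES documents(id),
            freq   INTEGER NOT NULL,         -- occurrences of term in the document
            PRIMARY KEY (term, doc_id)
        );
        """
    )


def add_document(conn, path, text):
    """Tokenize one document and store a posting row per distinct term."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    cur = conn.execute(
        "INSERT INTO documents (path, length) VALUES (?, ?)",
        (path, len(tokens)),
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO postings (term, doc_id, freq) VALUES (?, ?, ?)",
        [(term, doc_id, freq) for term, freq in counts.items()],
    )
    conn.commit()
    return doc_id


def lookup(conn, term):
    """Return (path, freq) pairs for every document containing the term."""
    rows = conn.execute(
        "SELECT d.path, p.freq FROM postings AS p "
        "JOIN documents AS d ON d.id = p.doc_id "
        "WHERE p.term = ? ORDER BY d.path",
        (term.lower(),),
    )
    return rows.fetchall()
```

The composite primary key on `(term, doc_id)` means a term lookup is an index scan rather than a full table scan, which is the property that makes the index "inverted": you go from term to documents, not the other way around.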

The engine ranks results with plain TF-IDF (Term Frequency–Inverse Document Frequency) scoring to surface the most relevant documents and snippets for each query.
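To make the ranking idea concrete, here is a small sketch of TF-IDF scoring over in-memory token lists. The page does not say which TF-IDF variant InsightIndex uses, so the weighting below (term frequency normalized by document length, and `log(N / df)` for inverse document frequency) is an assumption, as are the function names `tf_idf_scores` and `rank`.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, docs):
    """Score each document against the query.

    docs maps a document name to its list of tokens.
    Assumed weighting: tf = count / doc length, idf = log(N / df).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue  # term absent from this document contributes nothing
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df[term])
            score += tf * idf
        scores[name] = score
    return scores


def rank(query_terms, docs):
    """Return document names ordered from highest to lowest score."""
    scores = tf_idf_scores(query_terms, docs)
    return sorted(scores, key=lambda name: scores[name], reverse=True)
```

With this variant, a term that appears in every document gets an IDF of zero and so cannot influence the ranking, while a term concentrated in a few short documents scores highest. Because each query term contributes its own `tf * idf` product, the per-term contributions can be shown alongside a result to explain why it ranked where it did.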

On top of the core engine, a FastAPI backend and a dark glassmorphism dashboard visualize index statistics, TF-IDF rankings, and search results with highlighted context.
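One plausible way to produce highlighted context for a result is to cut a window around the first match and wrap each matching term in a marker. The sketch below assumes a hypothetical helper `highlight_snippet` with a `**` marker; the dashboard's real highlighting logic and markup are not described on this page.

```python
import re


def highlight_snippet(text, terms, width=40, marker="**"):
    """Return a snippet around the first matching term, with matches wrapped in marker.

    Falls back to the start of the text when there are no terms or no match.
    """
    if not terms:
        return text[: 2 * width]

    # Match whole words only, case-insensitively; escape terms so they are literal.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    if match is None:
        return text[: 2 * width]

    # Take `width` characters of context on each side of the first match.
    # The window may cut a word in half at its edges; a fuller version
    # would snap the boundaries to whitespace.
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    snippet = text[start:end]

    highlighted = pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", snippet)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + highlighted + suffix
```

A frontend could swap the marker for an HTML tag such as `<mark>`, provided the surrounding text is escaped first.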

Architecture Overview

  • Python indexing pipeline building an SQLite-based inverted index
  • TF-IDF scoring for ranking documents and snippets
  • FastAPI + Uvicorn backend exposing JSON search endpoints
  • Vanilla JS/HTML/CSS dashboard with analytics and charts