agebenchmark-og-image.png

This is a detailed article about my project AgeBenchmark, which can be found on GitHub here. The live version is at agebenchmark.com.

You’re me. You’re 32. You just got a promotion you worked hard for. You’re feeling pretty good about yourself. Then someone at dinner drops: “Did you know Mozart had composed over 600 works by the time he died at 35?”

And just like that, your promotion feels… smaller.

We’ve all had that moment. The idle Wikipedia spiral where you realize Alexander the Great had conquered most of the known world by 30, or that Toni Morrison didn’t publish her first novel until she was 39. Depending on which direction you look, you either feel like a total underachiever or like you’ve got all the time in the world.

I kept having this experience and thought it would be fun to make a tool that helps you learn about notable figures this way. Enter your birthday, see where you stand. Get humbled or get inspired. Maybe both.

That’s AgeBenchmark.

Value Proposition

AgeBenchmark is a web app where you enter your birthday and instantly see how your current age stacks up against the ages at which historical figures hit their biggest milestones. The core of the experience lives in three modes:

  • Reality Check shows people who accomplished major things before your current age.
  • Confidence Boost shows people who accomplished major things after your current age.
  • Age Twins shows what people accomplished at exactly your age.

You can filter by category (Sports, Science, Arts, Business, and more) and shuffle for new results. Each card links out to Wikipedia so you can fall down the rabbit hole if you want.
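The three modes boil down to a simple comparison between the user's age and each achievement's age. Here's a minimal sketch of that logic (the function and mode identifiers are mine for illustration, not from the app's source):

```python
def mode_for(user_age: int, achievement_age: int) -> str:
    """Bucket an achievement into one of the three display modes."""
    if achievement_age < user_age:
        return "reality_check"      # accomplished before your current age
    if achievement_age > user_age:
        return "confidence_boost"   # accomplished after your current age
    return "age_twin"               # accomplished at exactly your age
```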

Beyond the product itself, this project gave me the chance to work with a tech stack I’d been wanting to try. After building Auction AId with React and FastAPI, I wanted to move up a level of abstraction with Next.js, build a real data pipeline that uses an LLM for structured extraction, and deploy something that people outside my fantasy football league could actually use.

Requirements

AgeBenchmark is a consumer-facing web app, so the requirements skew toward user experience over complex business logic. I kept the scope intentionally tight for a first release.

Functional Requirements

User Interface

  • Enter a birthday via a date picker and see your calculated age
  • Toggle between Reality Check, Confidence Boost, and Age Twins modes
  • Filter results by category (Arts & Entertainment, Sports, Science & Innovation, Business & Entrepreneurship, Politics & Government, Military & Exploration, Social Impact)
  • Shuffle for new random results within the current filters
  • Click through to Wikipedia for any figure
  • Share your results on social media
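Calculating the displayed age from the birthday is the usual "has the birthday passed yet this year" comparison. The app does this client-side, but the logic is the same in any language; a sketch:

```python
from datetime import date

def age_on(birthday: date, today: date) -> int:
    """Full years elapsed, minus one if this year's birthday hasn't happened yet."""
    before_birthday = (today.month, today.day) < (birthday.month, birthday.day)
    return today.year - birthday.year - before_birthday
```

The tuple comparison sidesteps leap-year edge cases that naive day-count division gets wrong.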

Data Pipeline

  • Ingest and merge data from three source datasets
  • Deduplicate people across datasets with different naming conventions
  • Enrich figures with Wikipedia biographical data
  • Extract structured achievements using an LLM
  • Export processed data to the application’s database

Nonfunctional Requirements

Performance

  • Initial page load under 2 seconds
  • Filter and mode changes under 500ms
  • Shuffle results under 300ms

Cost

  • Free to host and run (Vercel free tier for frontend, Supabase free tier for database)

Project Overview

High-level Architecture

agebenchmark-architecture.svg

AgeBenchmark has two major pieces: a Python data pipeline that processes and enriches the dataset, and a Next.js frontend that serves the user experience.

The pipeline ingests CSVs from three public datasets, merges and deduplicates them, enriches each figure with Wikipedia data, uses Mistral’s LLM to extract structured achievements from bios, and exports everything to a Supabase PostgreSQL database.

The frontend is a Next.js app deployed on Vercel. It queries Supabase directly from the client using Row Level Security (read-only for anonymous users, writes only through the pipeline’s service role). The UI is built with shadcn/ui components on top of Tailwind CSS.

Data Sources

The dataset is built from three public sources:

  • Birthday Twins (Kaggle) — a dataset of famous birthdays with names, dates, and professions
  • Famous Birthdays (Kaggle) — another famous birthdays dataset with a different set of people and naming conventions
  • A Brief History of Human Time (Cross Verified, Sciences Po) — an academic dataset of over 2 million notable people throughout history, with visibility scores and occupational categories

Each source has different strengths. Birthday Twins and Famous Birthdays provide clean birthdates and names for well-known figures. Brief History of Human Time provides the scale (2.2M people) along with visibility scores that help figure out who’s actually worth showing to users.

Key Design Decisions and Tradeoffs

Why Next.js

When I built Auction AId, I chose React specifically to build a foundational understanding of the framework before adding another layer of abstraction. That was the right call, and I’m glad I did it that way. But working with plain React also showed me where I was spending time on things that a framework like Next.js handles out of the box: routing, image optimization, server-side rendering, and deployment.

Static JSON to Supabase

AgeBenchmark started with the simplest possible data approach: a static figures.json file generated by the pipeline and bundled into the Next.js build. The frontend just loaded the JSON and filtered client-side.

This worked great early on. But two problems showed up as the dataset grew:

  1. The JSON file got large. With thousands of figures and multiple achievements per person, the browser had to download and hold the whole thing in memory. Not catastrophic, but noticeable on slower connections.
  2. Updating data meant redeploying. Every time the pipeline produced new figures, I had to rebuild and redeploy the frontend.

Supabase solved both problems. The free tier gives you a full PostgreSQL database, which is more than enough for this use case. The frontend queries only the data it needs for each view (bucketed by age range, filtered by category), so the client never loads the full dataset. The pipeline can export new figures to Supabase without touching the frontend at all.

Mistral for Achievement Extraction

The core challenge of AgeBenchmark’s data pipeline is turning biographical information into structured achievements: “Marie Curie won the Nobel Prize in Physics at age 36.” The source datasets don’t have this data. They have names and birthdates, but not what the person did or when they did it.

The pipeline solves this by pulling Wikipedia bios and feeding them to Mistral’s LLM for structured extraction. Why Mistral? The free tier gives you 40 requests per minute and 400 per day. That’s enough to process the dataset incrementally over time without spending a dime.

The extraction uses Pydantic schemas to validate the LLM’s output:

from typing import Optional

from pydantic import BaseModel, Field

class Achievement(BaseModel):
    """Schema for a single achievement."""
    description: str = Field(..., max_length=200)  # short, display-ready text
    year: Optional[int] = None  # calendar year, if the bio pins it down
    age: Optional[int] = None   # the figure's age at the time

class ExtractionResult(BaseModel):
    """Schema for LLM extraction output."""
    achievements: list[Achievement] = Field(..., max_length=5)  # cap per figure

Every LLM response gets validated against this schema before it’s accepted. If the model returns something that doesn’t fit, we skip that figure and move on rather than polluting the dataset. Results are cached to disk with a 90-day expiration, so re-running the pipeline doesn’t burn through the rate limit on figures we’ve already processed.
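A disk cache with a TTL is simple to roll by hand. This is a sketch of the idea, not the pipeline's actual cache code; the file layout and entry format are my assumptions:

```python
import json
import time
from pathlib import Path

CACHE_TTL = 90 * 24 * 3600  # 90 days, in seconds

def cache_get(cache_dir: Path, key: str):
    """Return a cached value, or None if missing or older than the TTL."""
    path = cache_dir / f"{key}.json"
    if not path.exists():
        return None
    entry = json.loads(path.read_text())
    if time.time() - entry["ts"] > CACHE_TTL:
        return None  # expired; caller re-runs the LLM extraction
    return entry["value"]

def cache_set(cache_dir: Path, key: str, value) -> None:
    """Write a value alongside the timestamp used for expiry checks."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / f"{key}.json").write_text(
        json.dumps({"ts": time.time(), "value": value})
    )
```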

Implementation Deep-dives

Data Pipeline

The pipeline is a Python script that orchestrates six steps. It’s designed to run incrementally, so you can process the full dataset in batches over multiple runs rather than trying to do everything at once (important when you’re rate-limited to 400 LLM calls per day).

Step 1: Merge

  1. Build a master list from Birthday Twins + Famous Birthdays (~8k people)
  2. Fuzzy deduplicate between these two datasets using rapidfuzz
  3. Add all Brief History names not already in the master list
  4. O(1) hash-based lookup for Brief History metadata enrichment (visibility scores, gender)

The fuzzy matching uses token_sort_ratio from rapidfuzz, which handles name ordering differences (“Smith, John” vs. “John Smith”) well. The threshold is set at 85%, which I found struck the right balance between catching real duplicates and avoiding false matches.

from rapidfuzz import fuzz, process

# Find the best fuzzy match for `name` among existing candidates;
# `normalize_name` pre-cleans both sides before scoring, and matches
# below the 85% threshold are rejected.
result = process.extractOne(
    name,
    candidates,
    scorer=fuzz.token_sort_ratio,
    processor=normalize_name,
    score_cutoff=85,
)

Names go through a normalization step first: lowercase, remove punctuation, replace hyphens with spaces, collapse whitespace. This catches the easy cases before fuzzy matching needs to kick in.
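The `normalize_name` processor isn't shown above; a plausible implementation of the steps just described (the exact regex is my guess, and a real pipeline would likely also handle accented characters):

```python
import re

def normalize_name(name: str) -> str:
    """Lowercase, hyphens to spaces, strip punctuation, collapse whitespace."""
    name = name.lower().replace("-", " ")
    name = re.sub(r"[^a-z0-9\s]", "", name)  # drop punctuation (ASCII only here)
    return re.sub(r"\s+", " ", name).strip()
```

Normalization catches the mechanical differences; token_sort_ratio then handles word-order differences like "Smith, John" vs. "John Smith".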

Step 2: Filter

Validates birthdates and filters out records that don’t have enough data to be useful.

Step 3: Wikipedia Enrichment

For each figure, the pipeline hits the Wikipedia API to pull a biographical summary and then queries Wikidata for a precise birthdate (some source datasets only have year-level precision). Results are cached to disk, and the pipeline respects a conservative rate limit of 2 requests per second to avoid getting blocked.
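A monotonic-clock throttle is enough to enforce the 2-requests-per-second cap. This is a sketch; the actual pipeline may use a library or a simpler fixed sleep:

```python
import time

class Throttle:
    """Block so successive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval  # 0.5 for 2 requests/second
        self._last = 0.0

    def wait(self) -> None:
        remaining = self.min_interval - (time.monotonic() - self._last)
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()
```

Calling `throttle.wait()` before each Wikipedia request keeps the pipeline under the self-imposed limit regardless of how fast the rest of the loop runs.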

Step 4: LLM Achievement Extraction

Each figure’s Wikipedia bio gets sent to Mistral with a prompt asking for their most notable achievements, the year each happened, and their age at the time. The response is validated against the Pydantic schema described earlier.

The pipeline handles rate limiting by sleeping between requests and backing off exponentially if it hits a 429. The disk cache means we only call the LLM once per figure (until the cache expires after 90 days).
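The retry-with-exponential-backoff pattern looks roughly like this (a sketch; the real error type and detection logic depend on the Mistral client, so the `is_rate_limited` predicate here is a stand-in):

```python
import time

def with_backoff(call, retries=5, base=1.0,
                 is_rate_limited=lambda exc: "429" in str(exc)):
    """Retry `call` with exponentially growing sleeps on rate-limit errors."""
    for attempt in range(retries):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limited(exc) or attempt == retries - 1:
                raise  # not a 429, or out of retries: give up
            time.sleep(base * (2 ** attempt))  # 1s, 2s, 4s, ...
```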

Step 5: Categorization

Figures are assigned to one or more of seven categories (Arts & Entertainment, Sports, Science & Innovation, etc.). For figures that come from Brief History of Human Time, the occupational classifications in the source data inform the categorization. For others, the LLM handles it during the extraction step.
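For the Brief History figures, the occupation-to-category step can be as simple as a lookup table. The mapping below is illustrative; the real pipeline's keys come from the dataset's own occupational labels:

```python
# Illustrative mapping from occupation labels to AgeBenchmark's categories.
OCCUPATION_TO_CATEGORY = {
    "painter": "Arts & Entertainment",
    "composer": "Arts & Entertainment",
    "footballer": "Sports",
    "physicist": "Science & Innovation",
    "entrepreneur": "Business & Entrepreneurship",
    "politician": "Politics & Government",
    "explorer": "Military & Exploration",
    "activist": "Social Impact",
}

def categorize(occupations: list[str]) -> list[str]:
    """Map a figure's occupations to categories, deduplicated, order-preserving."""
    seen = []
    for occ in occupations:
        cat = OCCUPATION_TO_CATEGORY.get(occ.lower())
        if cat and cat not in seen:
            seen.append(cat)
    return seen
```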

Step 6: Export

The final step transforms the processed DataFrame into the achievements schema and upserts to Supabase. It also maintains the legacy figures.json export for backwards compatibility. Each figure gets a deterministic ID (MD5 hash of name + birthdate), so re-running the pipeline updates existing records rather than creating duplicates.
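The deterministic ID is a one-liner. The article only says the hash covers name + birthdate, so the separator and encoding here are my assumptions:

```python
import hashlib

def figure_id(name: str, birthdate: str) -> str:
    """Stable ID so pipeline re-runs upsert rather than duplicate.
    The '|' separator is an assumption; the source only says name + birthdate."""
    return hashlib.md5(f"{name}|{birthdate}".encode("utf-8")).hexdigest()
```

Because the input is stable, the ID is stable across runs, which is what makes the Supabase upsert idempotent.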

The Frontend

The compare page is where users spend most of their time. It uses a split timeline layout: Reality Check on the left (or as a tab on mobile), Confidence Boost on the right, with Age Twins at the top.

Bucket-based Sampling

One of the trickier UI problems was making sure results felt diverse. If you just query “all achievements where age < user’s age” and sort by rank, you’d see a cluster of achievements right below your age and nothing from people who achieved things as children or teenagers.

The solution was bucketed sampling. The frontend defines age-range buckets (e.g., 0-5 years from your age, 5-10 years, 10-20 years, 20+ years) and queries each bucket separately from Supabase. Then it applies stratified sampling client-side to pull proportionally from each bucket. This means a 30-year-old sees achievements from ages 25-29 and ages 5-10, giving a much more interesting spread.
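The client-side sampling step can be sketched in a few lines. (The frontend does this in TypeScript; this Python version shows the same idea, with bucket keys and the even-split policy as my assumptions.)

```python
import random

def stratified_sample(buckets: dict[str, list], k: int, rng: random.Random) -> list:
    """Pull roughly k results spread across age-range buckets, so the
    bucket nearest the user's age doesn't dominate the page."""
    non_empty = [items for items in buckets.values() if items]
    if not non_empty:
        return []
    per_bucket = max(1, k // len(non_empty))
    picked = []
    for items in non_empty:
        picked.extend(rng.sample(items, min(per_bucket, len(items))))
    return picked[:k]
```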

Social Sharing

Each results page generates a shareable “Your Age Story” summary that users can post to Twitter/X or Facebook. The share text stays within platform character limits and pulls a few achievements from the user’s results.
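Staying within the character limit is a truncation problem at heart; a minimal sketch (the wording and the 280-character default are illustrative, not the app's actual copy):

```python
def build_share_text(achievements: list[str], limit: int = 280) -> str:
    """Compose a share blurb that never exceeds the platform's character limit."""
    text = "My Age Story on AgeBenchmark: " + "; ".join(achievements)
    if len(text) > limit:
        text = text[: limit - 1].rstrip() + "…"  # reserve one char for ellipsis
    return text
```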

Key Results

AgeBenchmark is live at agebenchmark.com.

I’m happy with where it landed. More importantly, the way it’s built means I can keep improving the dataset and the experience without ripping anything apart.