Media Intelligence, Sports Data, and Predictive Product Systems
F1 Intelligence Hub
A motorsport intelligence system that treats Formula 1 as a structured data-and-decision problem, not just a content feed.
Context
Formula 1 creates a uniquely fragmented information environment. Official schedules, FIA rulings, session results, constructor form, technical narratives, and media speculation all move at different speeds. The product opportunity is not just better summarization. It is to build a system that can model race-week state, preserve provenance, and expose separate surfaces for facts, analysis, and forward-looking predictions.
Problem
Most F1 products either show raw results or publish commentary. Very few connect official race state, source quality, editorial review, and predictive intelligence in one coherent experience. Users end up switching between timing pages, standings tables, media stories, and social posts to understand what changed and what is likely to happen next.
What I Built
- A season dashboard that unifies standings, race calendar, completed weekend results, team watch, article feed, performance charts, and next-race context
- An editorial intelligence layer that separates `official_update`, `race_result`, `analysis`, and `prediction` content instead of flattening all surfaces together
- A dedicated prediction analytics page with contender rankings, win and podium probabilities, feature importance, scenario analysis, constructor outlook, and methodology notes
- A hybrid prediction service that combines live in-app season context with historical race and qualifying data and explainable feature engineering
- A sync architecture with quick race-data refresh, full sync with article ingestion and editorial refresh, and resilient fallback behavior when external data sources fail
Notes
Project overview
F1 Intelligence Hub is designed as a real sports intelligence product, not a summarization feed.
The core shift is that the system now has three distinct layers:
Live race data + source ingestion
-> editorial and intelligence modeling
-> dashboard and prediction products
That distinction matters. The app models race-week context, preserves source boundaries, and serves a dedicated prediction experience that explains why the model likes one driver over another.
Application shape
Today the system is split into a Next.js frontend and a FastAPI backend.
The frontend now has three core experiences:
- the main season dashboard
- the editorial/admin workbench
- the dedicated prediction analytics page
The backend is responsible for:
- assembling dashboard payloads
- syncing OpenF1 session and standings data
- ingesting articles and rebuilding editorial clusters
- serving the prediction model output
- preserving fallback behavior when external calls fail
This gives the project a clean service boundary: the frontend renders product surfaces, and the backend owns state derivation, orchestration, and model output.
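The fallback responsibility in the list above can be illustrated with a small last-known-good cache. This is a minimal sketch, not the project's actual code: `FallbackCache`, `FetchResult`, and the `"live"`/`"cache"` labels are hypothetical names, and the loader stands in for whatever function calls OpenF1 or an article source.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FetchResult:
    data: Any
    source: str        # "live" or "cache" (hypothetical labels)
    fetched_at: float  # when the payload was originally fetched


class FallbackCache:
    """Keeps the last good payload per key so a failed sync can still serve data."""

    def __init__(self) -> None:
        self._store: dict[str, FetchResult] = {}

    def fetch(self, key: str, loader: Callable[[], Any]) -> FetchResult:
        try:
            data = loader()
        except Exception:
            cached = self._store.get(key)
            if cached is None:
                raise  # nothing to fall back to, so surface the failure
            # Serve the stale payload, but label it so the UI can say so.
            return FetchResult(cached.data, "cache", cached.fetched_at)
        result = FetchResult(data, "live", time.time())
        self._store[key] = result
        return result
```

Returning the `source` and original `fetched_at` alongside the data keeps provenance intact: the dashboard can show that standings are stale instead of silently presenting old data as live.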
Prediction system
Instead of returning a static podium guess, the backend now constructs a hybrid winner model that blends:
- championship form
- constructor strength and car package quality
- season wins and podium conversion
- qualifying sharpness
- recent race form
- career wins and senior-level experience
- junior-series success
- track-fit assumptions
- tyre-management fit
- expected strategy alignment
- reliability and weather adaptability
The implementation supports two modes:
Fallback mode:
in-app season data -> engineered feature scoring -> lightweight calibrated model
ML mode:
in-app season data + historical race/qualifying data -> hybrid scoring -> contender probabilities
When the ML stack is available, the system trains from Ergast-compatible historical race and qualifying data and uses FastF1 for event-context enrichment. That gives the app a stronger training base without losing the explainability of the feature-engineered layer.
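The fallback path (in-app season data, engineered feature scoring, then a lightweight probability step) could look roughly like the sketch below. The feature names, weights, and the softmax temperature are illustrative assumptions, not the values used in the project, and a temperature-scaled softmax is only one simple way to turn scores into probabilities.

```python
import math

# Hypothetical weights for a subset of the engineered features.
# Each feature value is assumed to be normalized to the range [0, 1].
FEATURE_WEIGHTS = {
    "championship_form": 0.30,
    "constructor_strength": 0.30,
    "qualifying_sharpness": 0.20,
    "recent_race_form": 0.20,
}


def score(features: dict[str, float]) -> float:
    """Weighted sum of the engineered features; missing features count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in FEATURE_WEIGHTS.items())


def win_probabilities(
    grid: dict[str, dict[str, float]], temperature: float = 0.1
) -> dict[str, float]:
    """Softmax over driver scores; a lower temperature sharpens the favorite."""
    scores = {driver: score(features) for driver, features in grid.items()}
    top = max(scores.values())
    # Subtract the max score before exponentiating for numerical stability.
    exps = {d: math.exp((s - top) / temperature) for d, s in scores.items()}
    total = sum(exps.values())
    return {d: e / total for d, e in exps.items()}
```

Because every probability comes from a visible weighted sum, the same structure can be reused to explain a forecast, which is what keeps the feature-engineered layer inspectable even when a learned model is added on top.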
Why the predictions page matters
The dedicated predictions page is not just a visual add-on. It changes the product meaningfully because it makes the model inspectable.
The page shows:
- ranked contenders
- win and podium probabilities
- feature breakdown for the favorite
- feature-importance summaries
- constructor outlook
- scenario-matrix explanations
- methodology and source notes
This is important because prediction in sports products is usually either too shallow or too opaque. Here, the user can see both the forecast and the reasoning structure behind it.
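One way to produce the "feature breakdown for the favorite" surface is to report each feature's weighted contribution and its share of the total score. The sketch below assumes a linear scoring layer; `feature_breakdown` and its inputs are hypothetical, and a learned model would need a different attribution method.

```python
def feature_breakdown(
    features: dict[str, float], weights: dict[str, float]
) -> list[tuple[str, float, float]]:
    """Return (feature, contribution, share_of_total) rows, largest first."""
    contributions = {
        name: weight * features.get(name, 0.0) for name, weight in weights.items()
    }
    total = sum(contributions.values())
    rows = [
        (name, value, value / total if total else 0.0)
        for name, value in contributions.items()
    ]
    # Sort so the page can lead with the strongest driver of the forecast.
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Rendering these rows next to the probability lets a user see that, for example, constructor strength rather than recent form is carrying the forecast.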
Editorial and trust model
The editorial model remains a core part of the product and becomes even more important once predictions are introduced.
The system still enforces four distinct summary classes:
- `official_update`
- `race_result`
- `analysis`
- `prediction`
That separation gives the product a trust hierarchy:
- official and result surfaces anchor the known state
- analysis surfaces add interpretation
- prediction surfaces are explicitly speculative
This is the right structure for a sports intelligence app because it prevents the product from blending authoritative information and model output into one ambiguous narrative.
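The trust hierarchy can be made explicit in code so that a summary cannot be published without a class. The four class values come from the product; the tier names (`known_state`, `interpretation`, `speculative`) and the `group_by_tier` helper are assumptions made for this sketch.

```python
from enum import Enum


class SummaryClass(str, Enum):
    OFFICIAL_UPDATE = "official_update"
    RACE_RESULT = "race_result"
    ANALYSIS = "analysis"
    PREDICTION = "prediction"


# Hypothetical tier labels mirroring the hierarchy described above.
TRUST_TIER = {
    SummaryClass.OFFICIAL_UPDATE: "known_state",
    SummaryClass.RACE_RESULT: "known_state",
    SummaryClass.ANALYSIS: "interpretation",
    SummaryClass.PREDICTION: "speculative",
}


def group_by_tier(summaries: list[dict]) -> dict[str, list[dict]]:
    """Route summaries to separate surfaces; unknown classes raise ValueError."""
    surfaces: dict[str, list[dict]] = {
        "known_state": [],
        "interpretation": [],
        "speculative": [],
    }
    for summary in summaries:
        tier = TRUST_TIER[SummaryClass(summary["summary_class"])]
        surfaces[tier].append(summary)
    return surfaces
```

Rejecting unknown classes at the boundary means unclassified content fails loudly instead of leaking into an authoritative surface.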
Data model
The system is already shaped around explicit domain objects:
- `sources`
- `documents`
- `entities`
- `clusters`
- `summary_outputs`
- `distribution_assets`
- season state, standings, weekends, and sync metadata
That model is what makes the project extensible. Because the objects are explicit, the same system can support:
- fan-facing dashboard experiences
- editorial review workflows
- prediction features
- future profile pages for drivers, teams, and circuits
This is a large part of the project's production value: the app is built on reusable domain structure, not page-specific glue code.
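To show why explicit domain objects help, here is a minimal sketch of four of them and a provenance lookup that walks from a summary back to its sources. The object names follow the list above, but every field and the `provenance` helper are hypothetical; the real schema may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    id: str
    name: str
    is_official: bool


@dataclass(frozen=True)
class Document:
    id: str
    source_id: str
    title: str


@dataclass
class Cluster:
    id: str
    document_ids: list[str] = field(default_factory=list)


@dataclass
class SummaryOutput:
    id: str
    cluster_id: str
    summary_class: str
    text: str


def provenance(
    summary: SummaryOutput,
    clusters: dict[str, Cluster],
    documents: dict[str, Document],
    sources: dict[str, Source],
) -> list[str]:
    """Return the sorted, de-duplicated source names behind a summary."""
    cluster = clusters[summary.cluster_id]
    names = {
        sources[documents[doc_id].source_id].name
        for doc_id in cluster.document_ids
    }
    return sorted(names)
```

The same lookup serves the dashboard (showing where a story came from), the editorial workbench (reviewing source quality), and any future profile pages, which is the reuse the explicit model is meant to enable.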
Production end state
The current codebase is still an MVP in some operational areas, but the target end state is now much clearer.
The production version of F1 Intelligence Hub should look like this:
Next.js frontend
-> CDN / managed frontend hosting
FastAPI backend
-> reverse proxy + structured logging + metrics
Postgres
-> source records, editorial state, weekend snapshots, model features, prediction history
Workers / schedulers
-> OpenF1 sync, article ingestion, feature materialization, model refresh
Model layer
-> versioned feature sets, trained classifiers, cached inference outputs, evaluation tracking
In that end state:
- Postgres replaces the JSON runtime cache as the system of record
- historical snapshots support backtesting and model evaluation
- syncs run in background jobs instead of request-triggered code paths
- prediction outputs can be persisted and compared over time
- editorial actions can be audited and role-gated
- source health and fetch failures can be monitored operationally
That is the right long-term shape because sports intelligence is really a stateful systems problem. Once the project handles live race data, editorial workflows, and model output together, a durable database and job architecture stop being “nice to have” and become part of the product itself.
Postgres as the backbone
Postgres is the most important production upgrade because it unlocks several capabilities at once:
- durable runtime state
- multi-user editorial workflows
- historical weekend snapshots
- feature-store style tables for model inputs
- prediction history for evaluation and calibration
- reliable deployment across multiple backend instances
A clean production schema would likely separate:
- raw source and document ingestion tables
- normalized entities and clusters
- editorial summaries and approval history
- weekend/session/race-state snapshots
- model-feature materializations
- prediction outputs and evaluation results
That would turn the current MVP into a real operational intelligence platform.
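As a rough illustration of the snapshot and prediction-history tables, the sketch below uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs without a server. Table names, column names, and the helper functions are hypothetical, and a real Postgres schema would use native types such as `timestamptz` and `jsonb`.

```python
import sqlite3

# Hypothetical schema: one row per captured weekend state, and one row per
# (snapshot, model version, driver) prediction so history can be compared.
SCHEMA = """
CREATE TABLE weekend_snapshots (
    id INTEGER PRIMARY KEY,
    season INTEGER NOT NULL,
    round_number INTEGER NOT NULL,
    captured_at TEXT NOT NULL,
    state_json TEXT NOT NULL
);
CREATE TABLE prediction_outputs (
    id INTEGER PRIMARY KEY,
    snapshot_id INTEGER NOT NULL REFERENCES weekend_snapshots(id),
    model_version TEXT NOT NULL,
    driver TEXT NOT NULL,
    win_probability REAL NOT NULL CHECK (win_probability BETWEEN 0 AND 1)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


def save_snapshot(conn, season: int, round_number: int, captured_at: str, state_json: str) -> int:
    cur = conn.execute(
        "INSERT INTO weekend_snapshots (season, round_number, captured_at, state_json) "
        "VALUES (?, ?, ?, ?)",
        (season, round_number, captured_at, state_json),
    )
    return cur.lastrowid


def save_prediction(conn, snapshot_id: int, model_version: str, driver: str, win_probability: float) -> None:
    conn.execute(
        "INSERT INTO prediction_outputs (snapshot_id, model_version, driver, win_probability) "
        "VALUES (?, ?, ?, ?)",
        (snapshot_id, model_version, driver, win_probability),
    )


def prediction_history(conn, driver: str) -> list[tuple]:
    """Return (round_number, model_version, win_probability) rows in round order."""
    return conn.execute(
        "SELECT s.round_number, p.model_version, p.win_probability "
        "FROM prediction_outputs p JOIN weekend_snapshots s ON s.id = p.snapshot_id "
        "WHERE p.driver = ? ORDER BY s.round_number",
        (driver,),
    ).fetchall()
```

Tying each prediction to a snapshot row is what makes backtesting possible later: the inputs the model saw and the output it gave are stored together.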
Model operations end state
The prediction system should eventually mature beyond on-request training or fallback scoring.
The production model workflow should look like:
Historical race + qualifying data
+ current season state
+ telemetry-derived features
-> feature pipeline
-> offline training
-> evaluation and calibration
-> versioned model artifact
-> API inference service
-> saved prediction outputs
That would support:
- reproducible retraining
- calibration tracking
- versioned model rollouts
- backtesting on prior weekends
- clearer comparisons between heuristic and learned models
In other words, the current product introduces the prediction surface, and the production end state turns it into a proper model-serving system.
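Backtesting and the heuristic-versus-learned comparison both need a scoring rule over saved predictions. The Brier score (mean squared error between predicted probability and the 0/1 outcome) is one common choice; the sketch below assumes saved rows of `(model_version, win_probability, outcome)`, which is a hypothetical shape rather than the project's stored format.

```python
from collections import defaultdict


def brier_score(predictions: list[tuple[float, int]]) -> float:
    """Mean of (probability - outcome)^2; lower is better, 0.0 is perfect."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum((p - outcome) ** 2 for p, outcome in predictions) / len(predictions)


def brier_by_version(rows: list[tuple[str, float, int]]) -> dict[str, float]:
    """Group saved (model_version, probability, outcome) rows and score each version."""
    grouped: dict[str, list[tuple[float, int]]] = defaultdict(list)
    for version, probability, outcome in rows:
        grouped[version].append((probability, outcome))
    return {version: brier_score(preds) for version, preds in grouped.items()}
```

Running this over prior weekends for each model version gives a single comparable number per version, which is enough to decide whether a learned model actually beats the heuristic before rolling it out.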
Why this project has stronger production value now
What makes the project valuable is that the system ties together:
- live race-week state
- editorial trust boundaries
- reusable data structures
- explainable prediction outputs
- a realistic production architecture path
That makes it interesting both as a fan product and as a systems design project. It demonstrates product judgment, information architecture, backend orchestration, and the beginnings of an actual model-driven feature set.
Final state
The best way to describe the product is:
F1 Intelligence Hub is a source-aware Formula 1 intelligence platform that combines race-week aggregation, editorial separation, and predictive analytics into a single explainable product system.
It starts as a dashboard, but the real end state is broader:
- a trustworthy fan intelligence layer
- a reusable editorial operating surface
- a prediction product with transparent methodology
- a production-ready data platform backed by Postgres and scheduled jobs
That is the version of the project that feels durable, defensible, and expandable.