Knowledge Graph and Discovery Product

YourStory Startup Graph

Built a graph-based startup intelligence system to map founders, startups, investors, domains, and funding relationships.

Structured discovery · Graph-backed product

Context

This was a three-month internship at YourStory and one of the earliest projects that changed how I thought about systems. I worked directly with the CEO on a startup intelligence product that treated startup coverage as structured discovery data rather than static articles. It was also the period that pushed me fully into coding and product-building as a real discipline.

Problem

Startup media is useful to read but hard to navigate systematically. An article archive does not readily answer who invested where, which founders cluster together, or which adjacent companies exist within a domain.

What I Built

  • A graph database in Neo4j for startups, founders, investors, domains, and funding rounds
  • Relationship-driven discovery logic that made investor-style exploration more useful than flat search
  • Java service logic for ingestion, graph writes, and repeatable structured updates
  • An Inshorts-style daily product concept for startup news that generated compact updates using funding stage, amount raised, and related startup signals

Notes

Overview

During a three-month internship at YourStory, I worked on a problem that still feels current: how do you turn a stream of startup stories into something users can explore, not just read?

The answer I pursued was a graph-backed product. Rather than treating funding news as isolated text, I modeled startups, founders, investors, rounds, domains, and stories as connected entities. That shift turned the product from search into discovery.

Why a graph made sense

The startup ecosystem is naturally relational. Investors participate in rounds. Founders connect companies over time. Domains create clusters. Stories mention multiple entities at once. Once I framed the product that way, a graph was the cleanest representation of the actual problem.

At a practical level, the system needed to support questions like:

  • which investors are most active in a specific domain?
  • which founders or companies cluster around this startup?
  • which rounds or entities connect to a recent story?

Those are relationship questions, not just text-search questions.
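
To make the first question concrete, here is a minimal sketch of "which investors are most active in a domain". In Neo4j this would be a pattern match written in Cypher; the version below uses plain Java collections so it runs without a database. The class name, the `Participation` record, and the sample data shape are all hypothetical and are not the original service code. It assumes Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the graph; names are illustrative only.
final class DomainActivity {
    // One funding edge: an investor took part in a round for a startup in a domain.
    record Participation(String investor, String startup, String domain) {}

    // Count participations per investor within one domain, most active first.
    // Ties keep alphabetical order because the TreeMap is sorted and List.sort is stable.
    static List<Map.Entry<String, Long>> mostActiveInvestors(
            List<Participation> edges, String domain) {
        Map<String, Long> counts = new TreeMap<>();
        for (Participation p : edges) {
            if (p.domain().equals(domain)) {
                counts.merge(p.investor(), 1L, Long::sum);
            }
        }
        List<Map.Entry<String, Long>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return ranked;
    }
}
```

The point of the sketch is that the answer comes from counting edges, not from matching words in article text.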

System shape

The pipeline conceptually looked like this:

Story input -> entity normalization -> graph upsert -> traversal/query layer -> editorial or discovery surface

That middle step, normalization, mattered more than anything else. Without canonical entity handling, a graph quietly becomes misleading: different spellings of the same company or investor create duplicate nodes, split that entity's relationships across them, and distort the network.

The engineering lesson was that the graph itself was not the hard part. Trustworthy entity resolution was.
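
As an illustration of the simplest layer of that problem, the sketch below reduces a raw name to a canonical key and upserts by that key, so spelling variants land on one node. This is an assumption-laden example, not the original implementation: the class name, the suffix list, and the in-memory map (standing in for a uniqueness-constrained node lookup) are all hypothetical. Real entity resolution would likely also need alias tables, fuzzy matching, and manual review.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: canonical keys plus an upsert keyed on them.
final class EntityNormalizer {
    // Canonical key -> node id. Stands in for a graph lookup on a unique property.
    private final Map<String, Integer> idsByKey = new HashMap<>();

    // Strip accents, lowercase, turn punctuation runs into single spaces,
    // then drop one trailing legal suffix (an illustrative, incomplete list).
    static String canonicalKey(String rawName) {
        String s = Normalizer.normalize(rawName, Normalizer.Form.NFKD)
                .replaceAll("\\p{M}", "")
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", " ")
                .trim();
        return s.replaceAll("\\s+(pvt ltd|private limited|ltd|inc|llp)$", "");
    }

    // Return the existing node id for this entity, or create a new one.
    int upsert(String rawName) {
        String key = canonicalKey(rawName);
        Integer existing = idsByKey.get(key);
        if (existing != null) {
            return existing;
        }
        int id = idsByKey.size();
        idsByKey.put(key, id);
        return id;
    }
}
```

With this, "Acme Labs Pvt. Ltd." and "ACME Labs, Inc" resolve to the same key, so a second story about the company updates the existing node instead of creating a duplicate.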

Product surface

The graph enabled a more investor-style product surface. A user could move from one company to its founders, to the investors in a round, and on to other companies that shared those investors or sat in the same domain. That kind of traversal feels obvious once it exists, but it is hard to fake with a traditional article archive.
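
A rough sketch of one such hop, from a startup to its investors and on to other startups those investors backed, is shown below. It uses an in-memory adjacency map rather than Neo4j so that it is self-contained; the class name, method name, and map shape are hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical two-hop traversal over an in-memory adjacency map.
final class CoInvestorTraversal {
    // startup -> investors -> other startups backed by those investors.
    // Returns each related startup with the investors it shares with the seed.
    static Map<String, Set<String>> relatedByInvestor(
            Map<String, Set<String>> investorsByStartup, String seed) {
        Map<String, Set<String>> related = new TreeMap<>();
        Set<String> seedInvestors = investorsByStartup.getOrDefault(seed, Set.of());
        for (Map.Entry<String, Set<String>> e : investorsByStartup.entrySet()) {
            if (e.getKey().equals(seed)) {
                continue; // the seed is not related to itself
            }
            for (String investor : e.getValue()) {
                if (seedInvestors.contains(investor)) {
                    related.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).add(investor);
                }
            }
        }
        return related;
    }
}
```

Returning the shared investors alongside each related startup keeps the result explainable: the surface can say why two companies are connected, not only that they are.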

Alongside the graph work, I also built toward a short-form startup updates product inspired by compact news formats. The interesting part there was not just templating text. It was using structured fields like stage and amount to make daily updates fast, legible, and consistent.
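
To show what "structured fields instead of free text" can look like, here is a small sketch that builds a one-line update from stage and amount. Everything in it is an assumption for illustration: the class and method names, the sentence template, the lead-investor field, and the choice of US dollars as the currency are not taken from the original product.

```java
import java.util.Locale;

// Hypothetical formatter for a compact funding update.
final class FundingUpdate {
    // Compact an amount (assumed to be USD): 2_500_000 -> "$2.5M", 40_000_000 -> "$40M".
    static String compactUsd(long amount) {
        if (amount >= 1_000_000_000L) return "$" + oneDecimal(amount / 1_000_000_000.0) + "B";
        if (amount >= 1_000_000L) return "$" + oneDecimal(amount / 1_000_000.0) + "M";
        if (amount >= 1_000L) return "$" + oneDecimal(amount / 1_000.0) + "K";
        return "$" + amount;
    }

    // Format to one decimal place and drop a trailing ".0".
    private static String oneDecimal(double value) {
        String s = String.format(Locale.ROOT, "%.1f", value);
        return s.endsWith(".0") ? s.substring(0, s.length() - 2) : s;
    }

    // Fill a fixed template from structured fields so every update reads the same way.
    static String headline(String startup, String stage, long amountUsd, String leadInvestor) {
        return String.format(Locale.ROOT, "%s raises %s in %s round led by %s",
                startup, compactUsd(amountUsd), stage, leadInvestor);
    }
}
```

Because the sentence is assembled from fields rather than written by hand, consistency comes for free and the same fields can drive filtering or grouping by stage.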

What this project taught me

This was one of the first projects where I was not just coding a component. I was thinking about the full chain:

  • how the world should be modeled
  • how data should be normalized
  • how queries reflect user intent
  • how editorial output can be powered by structure rather than manual repetition

That combination of knowledge modeling, infrastructure, and user-facing product thinking has stayed with me ever since.

It also made the direction feel personal. This was the point where coding stopped feeling abstract and started feeling like the way I wanted to think and build.

Why it belongs on the site

The YourStory project is a useful bridge between backend and product engineering. It shows graph modeling, data discipline, query design, and a concrete product outcome rather than a purely internal system. It also reflects an early version of a theme that still shows up in my work: when information is modeled well, the product becomes more useful and more explainable.