Methodology

How the rankings are built

This page documents how the Top 100 list is constructed, what the data contains, and what is deliberately left out.

Data sources

Source | What it gives | Limitations
arXiv (math.NT, math.CO) | Preprint-level: titles, abstracts, authors, dates, co-author graph | Biased toward people who post preprints. Senior figures who publish only in journals are undercounted.
OpenAlex | Author-level: paper count, citations, affiliations, country | Concept tagging is noisy in math; surname-only matching can misidentify.
zbMATH Open | Curated math review database using MSC classes 11N05, 11N35, 11N36 | Coverage of older non-Western mathematicians is the best of the three sources.
Math Genealogy Project | Advisor-student trees | Dissertation-era affiliations only; gaps for some non-Western mathematicians.

Pipeline

  1. arXiv pull: 17 search terms restricted to the math.NT and math.CO categories. Title-weighting: a match in the paper title counts at full weight, abstract-only at half.
  2. OpenAlex pull: phrase queries, author cap of 10 per work to remove physics megapapers. Same title-weighting as in step 1.
  3. zbMATH pull: documents tagged with the MSC classes 11N05, 11N35, 11N36.
  4. Merge and scoring: each researcher's three ranks are sorted and combined as a weighted order statistic, with weights 70/20/10 on the best, middle, and worst rank respectively. A lower combined score ranks higher.
  5. Estimating a missing rank: missing ranks are interpolated from the nearest ranked neighbours and shown in [square brackets].
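
As an illustration of steps 1 and 4, here is a minimal Python sketch of the title-weighting rule and the 70/20/10 combination. It is not the code behind the list. The function names (match_weight, combined_score) are hypothetical, and case-insensitive substring matching is an assumption standing in for whatever query matching the real pulls use. Only the weights themselves (full for a title match, half for abstract-only, 70/20/10 on best, middle, and worst rank) come from the description above.

```python
from typing import Sequence

# Weights taken from the pipeline description.
TITLE_WEIGHT = 1.0               # search term appears in the paper title
ABSTRACT_WEIGHT = 0.5            # search term appears in the abstract only
RANK_WEIGHTS = (0.7, 0.2, 0.1)   # applied to best, middle, worst rank


def match_weight(term: str, title: str, abstract: str) -> float:
    """Weight one paper for one search term (hypothetical helper).

    Assumption: a "match" is a case-insensitive substring hit.
    """
    needle = term.lower()
    if needle in title.lower():
        return TITLE_WEIGHT
    if needle in abstract.lower():
        return ABSTRACT_WEIGHT
    return 0.0


def combined_score(ranks: Sequence[float]) -> float:
    """Weighted order statistic over a researcher's three per-source ranks.

    Ranks are sorted ascending, so the best (numerically lowest) rank
    receives the 0.7 weight. A lower combined score ranks higher.
    """
    if len(ranks) != len(RANK_WEIGHTS):
        raise ValueError("expected exactly three ranks")
    return sum(w * r for w, r in zip(RANK_WEIGHTS, sorted(ranks)))
```

With made-up ranks, a researcher ranked 5, 12, and 40 in the three sources scores roughly 0.7*5 + 0.2*12 + 0.1*40 = 9.9, while one ranked 8, 9, and 10 scores roughly 8.4 and therefore places higher. The weighting rewards a strong best rank, but a very poor worst rank still costs something.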

What's not in this list