Choosing an ID Format at The New York Times

The Analytic Hierarchy Process is used to choose between four candidate identifier formats for the New York Times identity platform.

The New York Times Building. Photo by Ajay Suresh / CC BY 2.0.

Choosing an identifier format for a database is a consequential decision: once IDs are minted and propagated, changing the format takes significant effort. So when an engineering team at The New York Times set out to design a new centralised identity platform, they decided that picking the ID format was worth doing carefully.

In October 2022 NYT Open, the publication's engineering blog, described how that decision was made [1]. The Times Identity team had recently encountered the Analytic Hierarchy Process (AHP) in a StaffPlus NYC talk by Comcast Fellow John Riviello. Afterwards, they decided to trial the technique on a small but real engineering question: what identifier format should canonical Times reader IDs use? The team considered four candidates and ran a 90-minute group session using Comcast's open-source AHP webapp to evaluate them.

Luckily, the author published every input pairwise judgment as CSV alongside the resulting weights and scores. That gives us a complete reproduction target.

The Decision Model

The Goal

Choose an ID format for NYT user identifiers.

The Identity team needed a canonical format for the user IDs minted by their new identity platform. Once chosen, the format would be hard to change: every downstream service, every exported dataset, every analytics pipeline would carry the format forward. The decision is small in scope but large in shadow.

The Hierarchy

Decision hierarchy: the goal "Choose an ID format for NYT user identifiers" at the top, five criteria (Database Support, Developer UX, Distributed Uniqueness, Ordering, Randomness), and the four alternatives (UUID, Snowflake, Nano ID, XID) at the bottom.

The decision uses a flat two-level hierarchy. Five criteria sit directly under the goal, with the four alternatives compared against each criterion in turn. There are no subcriteria.

The Five Criteria

Database Support: whether database engines typically support the format natively, either as an explicit data type or via efficient binary representation.
Developer UX: how nice the format is to live with as a developer. Easy to copy and paste (double-clickable), broad library support across languages, manageable length. Not the same as "looks good" — the team explicitly broadened this criterion mid-discussion to include library and language support, which proved consequential.
Distributed Uniqueness: the ability to generate IDs concurrently across many nodes without collisions and without per-node configuration.
Ordering: whether IDs sort meaningfully, usually by time. Useful for range partitioning and chronological queries, less so for hash partitioning.
Randomness: how non-sequential the IDs are. A user whose ID is 93823 should not be able to guess that 93824 and 93825 are also valid IDs. Pure randomness defeats this kind of enumeration attack.

The five criteria pull in different directions. Distributed Uniqueness wants randomness, since random IDs collide vanishingly rarely. Randomness defeats Ordering, which wants a time component. Database Support tends to favour fixed-size binary formats. Developer UX favours short, copyable strings. There is no single format that wins on all five, which is exactly why the AHP is useful here.
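To put a number on "vanishingly rarely", here is a minimal sketch using the standard birthday-bound approximation. The bit counts are our assumptions rather than figures from the article: a v4 UUID has 122 random bits, and a default Nano ID has 21 characters of 6 bits each.

```python
import math

def collision_probability(num_ids: int, random_bits: int) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** random_bits
    exponent = -num_ids * (num_ids - 1) / (2 * space)
    # -expm1(x) evaluates 1 - exp(x) without losing precision for tiny x.
    return -math.expm1(exponent)

# Assumed bit counts: UUID v4 = 122 random bits (6 of the 128 are fixed
# version/variant bits); Nano ID = 21 characters * 6 bits = 126 bits.
for name, bits in [("UUID v4", 122), ("Nano ID", 126)]:
    p = collision_probability(10**9, bits)
    print(f"{name}: P(any collision among 1e9 IDs) ~ {p:.2e}")
```

Even at a billion IDs the probability is far below anything operationally relevant, which is why purely random formats need no coordination between nodes.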

The Four Alternatives

Alternative	Description
UUID	RFC 4122 universally unique identifier, typically v4 (random). 128 bits, conventionally rendered as `3c893566-c125-4741-a68a-33e91410b7e2`.
Snowflake	Twitter-originated 64-bit ID composed of a timestamp, a node ID, and a sequence counter. Time-ordered. Rendered as a decimal integer, e.g. `1542305793516605440`.
Nano ID	Compact URL-safe random string (default 21 characters from a 64-character alphabet), e.g. `V1StGXR8_Z5jdHi6B-myT`.
XID	A 12-byte / 20-character ID inspired by MongoDB's ObjectID, with a time component and a per-process counter, e.g. `9m4e2mr0ui3e8a215n4g`.

These are not close substitutes. UUID and Nano ID are essentially random; Snowflake and XID embed time. UUID is universally supported but ugly by default; Snowflake is fast and short but needs configuration; Nano ID is friendly but rare in databases; XID is none of the things any of the others is.
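To make "embeds time" and "needs configuration" concrete, the sketch below packs and unpacks a Snowflake. The bit layout (41-bit millisecond timestamp, 10-bit node ID, 12-bit sequence) and the custom epoch are taken from Twitter's original design, not from the article, and the helper names are our own.

```python
from datetime import datetime, timezone

# Assumptions (Twitter's original Snowflake layout, not stated in the article):
# 41-bit millisecond timestamp, 10-bit node ID, 12-bit sequence counter,
# counted from a custom epoch of 2010-11-04T01:42:54.657Z.
TWITTER_EPOCH_MS = 1288834974657
TIME_BITS, NODE_BITS, SEQ_BITS = 41, 10, 12

def pack(timestamp_ms: int, node_id: int, sequence: int) -> int:
    """Combine the three fields into one 64-bit integer."""
    elapsed = timestamp_ms - TWITTER_EPOCH_MS
    return (elapsed << (NODE_BITS + SEQ_BITS)) | (node_id << SEQ_BITS) | sequence

def unpack(snowflake: int) -> tuple[int, int, int]:
    """Split a Snowflake back into (timestamp_ms, node_id, sequence)."""
    sequence = snowflake & ((1 << SEQ_BITS) - 1)
    node_id = (snowflake >> SEQ_BITS) & ((1 << NODE_BITS) - 1)
    timestamp_ms = (snowflake >> (NODE_BITS + SEQ_BITS)) + TWITTER_EPOCH_MS
    return timestamp_ms, node_id, sequence

def to_utc(timestamp_ms: int) -> datetime:
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

# Decode the example ID from the table above.
ts, node, seq = unpack(1542305793516605440)
print(to_utc(ts), "node", node, "sequence", seq)

# The timestamp field runs out 2**41 ms (roughly 69.7 years) after the epoch.
print("timestamp field exhausted:", to_utc(TWITTER_EPOCH_MS + 2**TIME_BITS))
```

Anyone holding a Snowflake can read off when it was minted, and two nodes that share a node ID can mint the same value in the same millisecond. Both properties matter in the comparisons that follow.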

The Pairwise Comparisons

How the Comparisons Were Derived

Both the criteria comparisons and the alternative comparisons were arrived at by consensus. The Times Identity team gathered for around 90 minutes and worked through every pair in a single session, using Comcast's AHP webapp to record judgments and compute the resulting priorities live. Saaty's standard 1-9 intensity scale was used throughout, including even-numbered intermediate values (2, 4, 6) where the team felt a comparison sat between two named intensities.

There are 40 pairwise comparisons across 6 scorecards in this decision: one scorecard comparing the five criteria with each other (10 comparisons), and one per criterion comparing the four alternatives (6 comparisons each, 30 in total).
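The count follows from the fact that n items produce n(n-1)/2 distinct pairs. A quick check:

```python
from math import comb

criteria = ["Database Support", "Developer UX", "Distributed Uniqueness",
            "Ordering", "Randomness"]
alternatives = ["UUID", "Snowflake", "Nano ID", "XID"]

criteria_pairs = comb(len(criteria), 2)            # the single criteria scorecard
pairs_per_criterion = comb(len(alternatives), 2)   # one scorecard per criterion
total = criteria_pairs + len(criteria) * pairs_per_criterion

print(criteria_pairs, pairs_per_criterion, total)  # prints: 10 6 40
```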

Criteria Comparisons: What Matters Most?

The team's first task was to weigh the five criteria against each other. Ten pairwise comparisons in total, with the following resulting priorities (decisionpoint.io, eigenvector method):

Criterion	Weight
Distributed Uniqueness	0.593
Developer UX	0.191
Randomness	0.084
Database Support	0.076
Ordering	0.056
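For readers unfamiliar with the eigenvector method, the sketch below shows the general idea using power iteration. The 3×3 matrix is a hypothetical illustration, not the team's judgments, and we make no claim that decisionpoint.io computes its eigenvector this way.

```python
def eigenvector_priorities(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration,
    normalised so the priorities sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights

# Hypothetical judgments for illustration only (NOT the NYT data):
# A vs B = 3, A vs C = 5, B vs C = 2; the lower triangle holds reciprocals.
example = [
    [1,     3,     5],
    [1 / 3, 1,     2],
    [1 / 5, 1 / 2, 1],
]

weights = eigenvector_priorities(example)
print([round(w, 3) for w in weights])
```

The same procedure applied to the team's 5×5 criteria matrix from the published CSV is what a reproduction of the table above would run.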

Two things stand out.

Distributed Uniqueness dominates. The team scored it strongly or very strongly more important than every other criterion. The new platform was designed to be distributed across many nodes from day one, and an ID format that cannot guarantee uniqueness in that environment is unusable. The end result is that nearly 60% of the decision weight rides on this single criterion.

Developer UX is a distant second. This was the surprise that the team's own write-up highlights. Their original notes about Developer UX focused on whether the format is double-clickable and easy to copy-paste; in the session itself, the criterion was expanded to include library and language support, which moved its weight up sharply. That broader interpretation eventually became the deciding factor for the runner-up race.

The remaining three criteria — Randomness, Database Support and Ordering — share the residual ~22% of the weight roughly equally. None is decisive on its own.

Alternative Comparisons

For each criterion, the four candidates were compared pairwise, producing five 4×4 matrices and 30 comparisons in total.

Database Support. Snowflake and UUID lead, since both have well-established native or binary-friendly representations in major databases. XID is in the middle. Nano ID lags: its 21-character variable-alphabet string has no native database type and must be stored as a generic varchar, which limits indexing and partitioning options.

Developer UX. UUID dominates: it is strongly preferred over Snowflake, and very strongly preferred over Nano ID and XID. The team's reasoning was telling: UUID's default string representation is awful (the dashes break double-click selection), but its library and language support is universal in a way none of the others can match. Once the criterion was broadened to include support breadth, UUID's lead became unassailable.

Distributed Uniqueness. UUID wins this criterion too, with Nano ID a strong second. Snowflake suffers here despite being designed for distributed environments, because its uniqueness guarantee depends on each node having a unique configured node ID, and on the algorithm not yet having reached its 2080-or-so expiration date. Both are real operational liabilities; UUID and Nano ID have neither.

Ordering. Snowflake leads, with XID close behind, as both embed a time component. UUID and Nano ID, both essentially random, tie at the back. This is the only criterion on which UUID is genuinely weak.

Randomness. UUID wins, Nano ID is second, Snowflake and XID tie behind them. This is the inverse of Ordering: anything time-encoded is partially predictable, so Snowflake and XID lose ground.

Consistency

The overall consistency ratio across all six matrices is 0.05, rated "Excellent" by decisionpoint.io. The criteria matrix has a CR of 0.09, and the Developer UX alternative matrix likewise has 0.09 — both inside the conventional 0.10 "consistent" threshold but at the upper end of it. The remaining four alternative matrices range from 0.02 to 0.04. The slight tension at 0.09 is a healthy sign: the team made independent judgments rather than mechanically reproducing transitive ratios, which would have given an artificial CR of zero.
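As a reminder of what a CR measures, here is a minimal sketch of the standard calculation: CI = (λmax − n) / (n − 1), divided by a random index RI for that matrix size. The RI values are Saaty's commonly quoted ones; individual tools may use slightly different tables, so treat them as an assumption. Both example matrices are hypothetical.

```python
# Saaty's random consistency index (RI) by matrix size. Tools differ slightly
# in the table they use, so treat these values as an assumption.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}

def consistency_ratio(matrix, iterations=200):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    w = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        # w sums to 1, so the sum of A*w converges to the principal eigenvalue.
        lambda_max = sum(product)
        w = [value / lambda_max for value in product]
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]

# Hypothetical matrices for illustration only (NOT the NYT data).
transitive = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]  # 2 * 2 = 4 exactly
circular = [[1, 3, 1 / 2], [1 / 3, 1, 4], [2, 1 / 4, 1]]    # A > B > C > A

print(abs(round(consistency_ratio(transitive), 3)))
print(round(consistency_ratio(circular), 3))
```

The transitive matrix scores a CR of zero because every judgment is implied by the other two. The circular one lands far above the 0.10 threshold.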

Results

Overall ranking bar chart: UUID first at 0.419, Nano ID second at 0.249, Snowflake third at 0.175, XID fourth at 0.156.

Overall Ranking

Rank	Alternative	decisionpoint.io Score	Article Score	Difference
1st	UUID	0.419	0.414	+0.005
2nd	Nano ID	0.249	0.244	+0.005
3rd	Snowflake	0.175	0.181	−0.006
4th	XID	0.156	0.161	−0.005

The ranking matches the article exactly, with all four scores within ±0.006. UUID wins comfortably. Nano ID takes second by a clear margin over Snowflake. XID finishes last.

The article highlights the Nano ID vs Snowflake order as a genuine surprise: the team had expected UUID and Snowflake to be the top two, and were not sure which would win. The actual result was clearer than that and went a different way. We can see why in the cross-tab below.

What Drove the Result?

The full results table shows where each alternative's score came from:

Alternative	Database Support	Developer UX	Distributed Uniqueness	Ordering	Randomness	Total
UUID	0.025	0.095	0.260	0.006	0.033	0.419
Snowflake	0.032	0.041	0.061	0.027	0.014	0.175
Nano ID	0.007	0.029	0.184	0.006	0.023	0.249
XID	0.011	0.027	0.087	0.018	0.014	0.156
Totals	0.076	0.191	0.593	0.056	0.084	1.000
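Each cell in this table is a criterion weight multiplied by the alternative's local priority on that criterion, and each Total is the sum across the row. The sketch below redoes that sum from the rounded cell values, so a total can differ from the published figure in the third decimal place.

```python
criteria = ["Database Support", "Developer UX", "Distributed Uniqueness",
            "Ordering", "Randomness"]

# Per-criterion contributions copied from the results table (rounded values).
contributions = {
    "UUID":      [0.025, 0.095, 0.260, 0.006, 0.033],
    "Snowflake": [0.032, 0.041, 0.061, 0.027, 0.014],
    "Nano ID":   [0.007, 0.029, 0.184, 0.006, 0.023],
    "XID":       [0.011, 0.027, 0.087, 0.018, 0.014],
}

totals = {name: sum(cells) for name, cells in contributions.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    # Show which criterion contributes most to each alternative's total.
    best = max(range(len(criteria)), key=lambda i: contributions[name][i])
    print(f"{name:<10} {totals[name]:.3f}  largest contribution: {criteria[best]}")
```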

UUID's win is driven by a single number: 0.260 on Distributed Uniqueness, by far the largest cell in the table. UUID is the strongest performer on the criterion that carries 59% of the decision weight, which is a near-decisive head start. It also leads on Developer UX (0.095), once that criterion is broadened to include library support. It is poor on Ordering (0.006), but Ordering is only 6% of the weight.

Snowflake's third place is the article's central surprise. Snowflake leads on Ordering (0.027, the largest cell in that column) and on Database Support (0.032 against UUID's 0.025), but it has the lowest score of all four on Distributed Uniqueness (0.061, a third of Nano ID's 0.184). The node-configuration and expiration-date concerns penalised it on the criterion that mattered most, and that was enough to drop it below Nano ID.

Nano ID's second place comes from a single strong score — 0.184 on Distributed Uniqueness — that the team's other ratings were not strong enough to overturn. It is weakest on Database Support, but Database Support is small enough not to matter much.

XID has no criterion where it leads, which is why it finishes last.

Sensitivity Analysis

The result is robust at the top, contested in the middle.

The chart above tracks Distributed Uniqueness, the dominant criterion at 59%. UUID's line stays on top across the entire 0%-100% range; there is no weight at which any other alternative overtakes it. The three crossover circles are all rank changes lower down: Nano ID overtakes XID at low DU weight (around 7%), Nano ID overtakes Snowflake at moderate DU weight (around 37%), and Snowflake drops below XID at high DU weight (around 72%). The actual weight of 59% sits cleanly in the middle band, where the ranking reads UUID, Nano ID, Snowflake, XID.

We confirmed this analytically as well. Setting the Distributed Uniqueness weight to zero and rescaling the other four criteria (together 41%) proportionally so that they sum to 100%, UUID still wins (~0.39 vs Snowflake ~0.28 vs XID ~0.17 vs Nano ID ~0.16). UUID leads on three of the five criteria and is second on Database Support; only Ordering pushes back, and Ordering carries only 6% weight. There is no realistic reweighting that flips first place.
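The reweighting can be reproduced approximately from the rounded results table. The sketch below assumes proportional rescaling of the other four criteria. Because it works from three-decimal cell values, crossover points can land a point or so away from the chart's.

```python
from itertools import combinations

DU = "Distributed Uniqueness"
weights = {"Database Support": 0.076, "Developer UX": 0.191, DU: 0.593,
           "Ordering": 0.056, "Randomness": 0.084}

# Contributions from the results table, in the same column order as `weights`.
contributions = {
    "UUID":      [0.025, 0.095, 0.260, 0.006, 0.033],
    "Snowflake": [0.032, 0.041, 0.061, 0.027, 0.014],
    "Nano ID":   [0.007, 0.029, 0.184, 0.006, 0.023],
    "XID":       [0.011, 0.027, 0.087, 0.018, 0.014],
}

def split(name):
    """Return (local priority on DU, score on the other criteria when
    their weights are rescaled proportionally to sum to 1)."""
    cells = dict(zip(weights, contributions[name]))
    du_local = cells[DU] / weights[DU]
    rest = sum(v for c, v in cells.items() if c != DU) / (1 - weights[DU])
    return du_local, rest

def scores(du_weight):
    """Overall scores when DU carries `du_weight` of the total."""
    result = {}
    for name in contributions:
        du_local, rest = split(name)
        result[name] = du_weight * du_local + (1 - du_weight) * rest
    return result

def crossover(a, b):
    """DU weight at which a and b score the same, or None if the lines are parallel."""
    (da, ra), (db, rb) = split(a), split(b)
    denominator = (da - db) + (rb - ra)
    return (rb - ra) / denominator if denominator else None

print({name: round(s, 2) for name, s in scores(0.0).items()})
for a, b in combinations(contributions, 2):
    x = crossover(a, b)
    if x is not None and 0 < x < 1:
        print(f"{a} and {b} swap places near DU = {x:.0%}")
```

No pair involving UUID produces a crossover inside the 0%-100% range, which is the analytic counterpart of UUID's line staying on top.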

Stability by Criterion

Criterion	Weight	Critical Δ	Stability
Distributed Uniqueness	59%	±13%	Moderate
Developer UX	19%	±44%	Stable
Randomness	8%	—	Stable
Database Support	8%	±7%	Sensitive
Ordering	6%	±16%	Moderate

A dash in the Critical Δ column means there is no weight at which any rank changes; the criterion is fully aligned with the result.

What This Tells Us

UUID is uncrossable. None of the Critical Δs in the table refers to a flip at the top. They are all rank shuffles among the lower three.

Database Support is the only Sensitive criterion, but its impact is limited. Critical Δ is ±7%, and the criterion's actual weight is only 8%. The flip it triggers is between Snowflake and Nano ID for second place, because Snowflake's local priority on Database Support (0.426) is far higher than Nano ID's (0.095). Push the Database Support weight up by 7 points and Snowflake catches Nano ID; push it further and Snowflake passes.

Distributed Uniqueness is only moderately stable, despite being aligned with the winner. That is because the criterion is large (59%), so even a 13-point reweighting produces a meaningful shift in the lower ranks. The nearest flip is upward: at around 72% DU weight, Snowflake drops below XID. The Nano ID vs Snowflake flip sits further away in the other direction, at around 37%: lower the DU weight that far and Nano ID's DU advantage fades, letting Snowflake catch up.

Developer UX is stable. ±44% before any rank changes, the largest buffer in the model. UUID dominates Developer UX with a local priority of 0.496, so increasing that criterion's weight further only widens UUID's lead.

The Practical Implication

For practical purposes, the team's decision to adopt UUIDs is supported by the model with high confidence. It is the rare AHP outcome where the winning alternative leads on the three criteria that together carry 87% of the weight (Distributed Uniqueness, Developer UX and Randomness), and its only weak showing is on Ordering, a criterion the team explicitly chose to weight low.

What Happened Next

Two outcomes followed directly from this exercise.

The Times committed to UUIDs for canonical user IDs. The article notes that the team is free to choose a more compact display representation. For example, Base62-encoding a 16-byte UUID gives a 22-character string like ifWIoI9ZU00gOqkgNrmE5B, which is double-clickable and not much longer than a Nano ID.
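A minimal sketch of such an encoding follows. The alphabet ordering (digits, then uppercase, then lowercase) and the zero-padding are our assumptions; the article does not say which Base62 variant the team had in mind, so this will not necessarily reproduce the example string above.

```python
import uuid

# Assumed alphabet ordering: digits, uppercase, lowercase.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def uuid_to_base62(value: uuid.UUID) -> str:
    """Encode the UUID's 128-bit integer value as a fixed-width Base62 string."""
    number = value.int
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    # 62**22 > 2**128 > 62**21, so 22 characters always suffice; pad to keep
    # every encoded ID the same length.
    return "".join(reversed(digits)).rjust(22, ALPHABET[0])

def base62_to_uuid(text: str) -> uuid.UUID:
    """Inverse of uuid_to_base62."""
    number = 0
    for character in text:
        number = number * 62 + ALPHABET.index(character)
    return uuid.UUID(int=number)

original = uuid.UUID("3c893566-c125-4741-a68a-33e91410b7e2")
compact = uuid_to_base62(original)
print(compact, len(compact))
print(base62_to_uuid(compact) == original)
```

The stored value stays a standard 16-byte UUID; only the display form changes, so database support and library support are unaffected.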

The team committed to using AHP again for the next major decision facing the platform: which distributed database to adopt. The author estimates that decision will involve more criteria, more options and more stakeholders, and may take a full day rather than 90 minutes, but reports that the team felt the diligence and consensus the AHP produces are worth the time investment. As the article puts it: "The team enjoyed the process, as it gave us a chance to closely examine the criteria and options, to better understand the features and trade-offs of each option, and to see our fine-grained analysis nicely captured by the algorithm."

What we do not know is whether the larger AHP exercise was eventually run, or what the database choice ultimately was. The article was published in October 2022 and no follow-up has appeared on NYT Open. If a follow-up does land, it would be a third instalment of unusual transparency.

Try It Yourself

This decision is published as a public, interactive model on decisionpoint.io. You can:

Explore the full decision: view the hierarchy, all 40 comparisons, and the detailed results.
Run your own sensitivity analysis: adjust any criterion weight and watch the ranking respond.
Create your own AHP decision: use the same tool to make your own multi-criteria decisions.

Methodology Notes

Eigenvector vs Geometric Mean

The Comcast AHP webapp the NYT team used computes priorities by taking the row geometric means of each pairwise comparison matrix and normalising them to sum to 1. decisionpoint.io uses Saaty's principal eigenvector method instead. The two are mathematically distinct but converge on the same answer for matrices that are perfectly consistent, and typically produce values within thousandths of each other for matrices with mild inconsistency.
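A minimal sketch of the row geometric mean calculation, as described above, is shown below. It is our own illustration rather than code from the Comcast webapp. The test matrix is built from known weights, so it is perfectly consistent and its principal eigenvector is that same weight vector.

```python
import math

def geometric_mean_priorities(matrix):
    """Row geometric means, normalised to sum to 1."""
    row_means = [math.prod(row) ** (1 / len(row)) for row in matrix]
    total = sum(row_means)
    return [mean / total for mean in row_means]

# A perfectly consistent matrix built from known weights: a[i][j] = w[i] / w[j].
# Hypothetical values for illustration only (NOT the NYT data).
known = [0.5, 0.3, 0.2]
consistent = [[wi / wj for wj in known] for wi in known]

recovered = geometric_mean_priorities(consistent)
print([round(w, 3) for w in recovered])  # prints: [0.5, 0.3, 0.2]
```

Once the judgments stop being perfectly transitive, the two methods no longer have to agree, which is where the drift examined below comes from.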

This study is the first case study in which we have access to both an independently-published reference set of weights and the underlying input matrices, and the results are interesting. The drift at the criterion level is more visible than usual:

Criterion	Article (geometric mean)	decisionpoint.io (eigenvector)	Δ
Distributed Uniqueness	0.558	0.593	+0.035
Developer UX	0.207	0.191	−0.016
Randomness	0.092	0.084	−0.008
Database Support	0.083	0.076	−0.007
Ordering	0.060	0.056	−0.004

The two methods agree on order, but the eigenvector pulls weight toward the dominant row (Distributed Uniqueness) by 3.5 percentage points more than the geometric mean does. That is the expected direction: the principal eigenvector amplifies dominance in matrices with mild inconsistency, while the geometric mean treats each row's evidence symmetrically. Neither is "wrong"; they answer slightly different mathematical questions.

What is more striking is that the alternative-level drift is much smaller than the criterion-level drift:

Rank	Alternative	Article	decisionpoint.io	Δ
1st	UUID	0.414	0.419	+0.005
2nd	Nano ID	0.244	0.249	+0.005
3rd	Snowflake	0.181	0.175	−0.006
4th	XID	0.161	0.156	−0.005

Drift at the criterion level (±0.035) is roughly six times the drift at the alternative level (±0.006). The reason is that the rank orderings within each alternative matrix are stable across both methods, so when the criterion weights drift one way, the alternative scores drift the other way and the totals stay close. The model is more robust than the eigenvector-vs-geometric-mean disagreement at the top suggests.

For the Brazilian Navy and Wikipedia studies, criterion-level drift was negligible (≤0.001 and ≤0.003 respectively), so this comparison is the first time the difference between the two methods has had any visible effect on our reproductions. It does not change any of the rankings, but it is a good empirical note: when comparing AHP scores from different tools, the order is almost always reproducible; the precise magnitudes can differ by a few hundredths.

Consistency at the Upper Threshold

Two of the six matrices in this model — the Criteria matrix and the Developer UX alternative matrix — sit at CR = 0.09, just below the conventional 0.10 cutoff. Judgments at the upper edge of the conventional threshold are part of how genuine human consensus tends to look. A perfectly consistent matrix is often a sign that the inputs were derived (e.g. from numerical attribute data) rather than independently judged.

References

Wheeler, D.E. (2022, October 13). "Collective Decision-Making with AHP." NYT Open. https://open.nytimes.com/collective-decision-making-with-ahp-3ef819e5bc2a
Wheeler, D.E. (2022). AHP CSV data for the NYT user ID format decision. GitHub Gist. https://gist.github.com/theory/bdb77913e85f3e097d49e8e9155f91c9
Riviello, J. (2022). Collective Decision-Making with the Analytic Hierarchy Process. StaffPlus NYC. https://leaddev.com/talks/collective-decision-making-with-the-analytic-hierarchy-process
Saaty, T.L. (1990). "How to make a decision: The Analytic Hierarchy Process." European Journal of Operational Research, 48(1), 9-26.