Product-Market Fit Quantified: A Composite Score Using Retention Curves, NPS Decomposition, and Usage Depth
'You'll know product-market fit when you feel it' is advice that has burned through billions in venture capital. Here's a quantitative framework that replaces gut feeling with a composite score — and it starts with retention curves, not surveys.
TL;DR: A composite PMF Score (0-100) combining retention curve analysis, NPS decomposition, and usage depth scoring replaces gut feeling with quantitative measurement. Any company with six months of user data can compute it, and it is more reliable than Sean Ellis's 40% rule, which captures only one dimension of the product-market relationship.
The Most Expensive Vague Concept in Business
Marc Andreessen wrote the blog post that launched a thousand pitch decks. In 2007, he described product-market fit as the moment when a product is in a good market and the product satisfies that market. His diagnostic was visceral: when PMF is happening, customers are pulling the product out of your hands. When it is not, everything feels heavy.
This description is directionally correct and operationally useless.
The problem is not that Andreessen was wrong. The problem is that a generation of founders, investors, and board members adopted a feeling-based definition for the most consequential binary in startup life: do we have product-market fit, or do we not? The answer to this question determines whether a company should hire aggressively or conserve cash, whether it should raise at aggressive valuations or accept down rounds, whether the founder should double down on the current product or pivot entirely. And bridging the gap between strategy and execution depends entirely on getting this diagnosis right.
And the standard operating procedure for answering it has been, roughly: ask the CEO how things feel.
This is not acceptable. Product-market fit is a measurable state. It has quantitative signatures across multiple dimensions — retention behavior, satisfaction distribution, usage patterns, and engagement depth. These signatures can be combined into a composite score that does not replace judgment but gives judgment something to work with other than intuition.
The framework proposed here synthesizes three measurement approaches: retention curve analysis, NPS decomposition, and usage depth scoring. Each captures a different facet of the product-market relationship. Together, they produce a composite PMF Score that ranges from 0 to 100 and has diagnostic properties that no single metric can provide.
The Core Thesis
Product-market fit is not a feeling. It is a measurable composite of retention behavior, satisfaction distribution, and engagement depth. Any company with six months of user data has enough information to compute a PMF Score that is more reliable than the founder's gut or the board's pattern matching. The same data, sliced by acquisition cohort, reveals whether PMF is strengthening or weakening over time -- a signal that aggregate metrics routinely obscure.
Sean Ellis's 40% Rule: Useful Starting Point, Terrible Endpoint
In 2010, Sean Ellis proposed what became the most widely cited PMF heuristic: survey your users with the question "How would you feel if you could no longer use this product?" and if 40% or more answer "very disappointed," you have product-market fit.
The heuristic has several merits. It is dead simple to implement. It captures something real — the intensity of user attachment. And the 40% threshold, derived from Ellis's analysis of companies he had worked with at LogMeIn, Dropbox, and Eventbrite, proved to be a reasonable discriminator between products that succeeded and products that struggled.
But the 40% rule has accumulated enough mileage to reveal its limitations.
Limitation 1: Survey bias is structural. The users who respond to in-product surveys are disproportionately engaged. They are the users most likely to say "very disappointed" because they are the users who actually use the product regularly. The 40% threshold may therefore be measuring responder engagement rather than true product-market fit. A product with high engagement among a tiny niche and complete indifference from the broader market can score above 40%.
Limitation 2: The threshold is context-independent. Is 40% the right bar for a consumer social product, an enterprise security tool, and a B2B marketplace? The denominator problem alone makes this suspect. A product with 50 users where 25 say "very disappointed" is fundamentally different from a product with 50,000 users where 20,000 say the same thing, even though both clear the 40% threshold.
Limitation 3: It measures declared intent, not revealed preference. What users say they would feel and what they actually do when alternatives exist are different things. The gap between stated and revealed preference is well-documented in behavioral economics. A user who claims they would be "very disappointed" might churn the moment a competitor offers a 15% discount.
Limitation 4: It is a snapshot, not a trajectory. The 40% score tells you where you are. It says nothing about whether you are moving toward or away from PMF. A company at 38% and accelerating is in a fundamentally different position from one at 42% and decelerating.
Table 1: Sean Ellis Survey vs. Composite PMF Score — Structural Comparison
| Dimension | Sean Ellis Survey | Composite PMF Score |
|---|---|---|
| Data source | Self-reported survey | Behavioral + survey + usage data |
| Sample bias | High (responder bias) | Low (uses full user base behavior) |
| Temporal resolution | Point-in-time snapshot | Continuous trajectory tracking |
| Context sensitivity | One threshold for all products | Weighted by business model |
| Leading vs. lagging | Lagging indicator | Mix of leading and lagging |
| Diagnostic depth | Binary (above/below 40%) | Multi-dimensional with component scores |
| Cost to implement | Low (single survey question) | Moderate (requires analytics infrastructure) |
The Ellis survey remains a useful screening tool. If you are below 40%, you almost certainly do not have PMF. But the converse is not reliable. Being above 40% is necessary but not sufficient. A proper measurement framework must go deeper.
Retention Curves: The Strongest PMF Signal
If you could measure only one thing to assess product-market fit, measure retention.
The retention curve — the percentage of users who remain active over time after their first use — is the purest behavioral signal of whether a product is delivering enough value to sustain engagement. Survival analysis provides the rigorous statistical framework for modeling these curves, including the ability to handle censored observations and estimate hazard rates over time. Unlike surveys, retention cannot be gamed by enthusiasm or skewed by responder bias. It measures what users actually do, repeatedly, over weeks and months.
The critical diagnostic is the shape of the curve, not its starting level.
The retention curve for a cohort starting at time t = 0 can be modeled as a shifted exponential decay with an asymptote:

R(t) = (R0 − R∞) · e^(−λt) + R∞

where R0 is the initial retention (typically 100% at t = 0), R∞ is the asymptotic retention rate (the long-run steady state), and λ is the decay rate. The key diagnostic: if R∞ > 0, the curve flattens and PMF exists within that cohort.
A flattening curve indicates PMF. When the retention curve bends and stabilizes at some asymptote — even if that asymptote is 15% or 20% — it means a cohort of users has found enough value to make the product a persistent part of their behavior. These users are not leaving. The product has found its audience.
A continuously decaying curve indicates absence of PMF. When retention drops steadily toward zero with no flattening (R∞ ≈ 0), no cohort of users has found sufficient value. Everyone is eventually leaving. The product may generate initial excitement but fails to sustain it.
Figure 1: Retention Curve Shapes — Four PMF States (User Cohort Retention %)
The Strong PMF curve flattens around 35% — meaning roughly one in three users who try the product become persistent users. The No PMF curve decays exponentially toward zero. But the diagnostic power lies in the middle cases. The Moderate PMF curve flattens at a lower asymptote, and the Weak PMF curve has a barely perceptible inflection before continuing its decline.
Here is a Python implementation that fits the retention curve model, extracts PMF diagnostics, and computes the retention component score:
```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def retention_model(t, R0, R_inf, lam):
    """Exponential decay with asymptote."""
    return (R0 - R_inf) * np.exp(-lam * t) + R_inf


def analyze_retention_curve(cohort_retention: pd.Series) -> dict:
    """Fit retention model and extract PMF diagnostics.

    Args:
        cohort_retention: Series indexed by week number (0, 1, 2, ...)
            with values as retention percentage (0-100).

    Returns:
        Dictionary with asymptotic retention, decay rate,
        time-to-flattening, and retention component score.
    """
    t = cohort_retention.index.values.astype(float)
    y = cohort_retention.values

    # Fit the model
    popt, _ = curve_fit(
        retention_model, t, y,
        p0=[100.0, 10.0, 0.3],
        bounds=([0, 0, 0.01], [100, 100, 5.0]),
        maxfev=5000,
    )
    R0, R_inf, lam = popt

    # Time to flattening: when curve is within 5% of asymptote
    if lam > 0:
        ttf = -np.log(0.05) / lam  # weeks
    else:
        ttf = float('inf')

    # Retention component score: ARR (0-40) + TTF (0-30).
    # The remaining 30 points (cohort improvement) require
    # multiple cohorts and are scored separately.
    arr_score = min(R_inf / 35 * 40, 40)    # 35% benchmark
    ttf_score = max(30 - (ttf - 4) * 3, 0)  # penalize slow flattening

    score = arr_score + ttf_score
    return {
        "asymptotic_retention": round(R_inf, 1),
        "decay_rate": round(lam, 3),
        "time_to_flattening_weeks": round(ttf, 1),
        "retention_component_score": round(min(score, 70), 1),
    }


# Example usage
cohort = pd.Series(
    [100, 68, 52, 45, 41, 38, 36, 35, 35],
    index=range(9),
)
print(analyze_retention_curve(cohort))
```

Three specific retention metrics feed into the composite score:
1. Asymptotic retention rate (ARR). The level at which the retention curve stabilizes. For consumer products, an ARR above 25% is strong. For SaaS, above 85% annual gross retention (i.e., about 15% annual churn) is the benchmark.
2. Time to flattening (TTF). How many weeks or months until the curve inflects and begins to stabilize. A shorter TTF means users discover the product's core value faster. A long TTF (say, 12+ weeks) suggests the value proposition is real but hard to access — an onboarding problem masquerading as a PMF problem.
3. Cohort-over-cohort improvement. Are newer user cohorts retaining better than older ones? If so, the product is improving its fit with the market over time. If newer cohorts retain worse despite a growing user base, the product may be expanding beyond its core audience prematurely.
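The cohort-over-cohort trend can be scored with a short helper. This is a sketch under stated assumptions: the function name, the week-8 comparison point, and the mapping of slope onto the 0-30 point range are illustrative choices, not part of the framework's definition.

```python
import numpy as np
import pandas as pd

def cohort_improvement_score(cohorts: pd.DataFrame, week: int = 8) -> dict:
    """Score cohort-over-cohort retention improvement (0-30 points).

    Args:
        cohorts: one column per acquisition cohort, ordered oldest to
            newest; rows indexed by week number; values are retention
            percentages (0-100).
        week: the week at which cohorts are compared.
    """
    # Retention of each cohort at the comparison week, oldest first
    snapshot = cohorts.loc[week].values.astype(float)
    # Slope of a linear fit: percentage points gained per cohort
    x = np.arange(len(snapshot))
    slope = np.polyfit(x, snapshot, 1)[0]
    # Map slope to 0-30: flat = 15, +1pp/cohort = 30, -1pp/cohort = 0
    # (this linear mapping is an assumption for illustration)
    score = float(np.clip(15 + slope * 15, 0, 30))
    return {"slope_pp_per_cohort": round(slope, 2),
            "cohort_improvement_score": round(score, 1)}

cohorts = pd.DataFrame(
    {"2024-Q1": [100, 60, 45, 38, 34, 32, 31, 30, 30],
     "2024-Q2": [100, 62, 48, 41, 37, 35, 33, 32, 32],
     "2024-Q3": [100, 65, 51, 44, 40, 38, 36, 35, 34]},
    index=range(9),
)
print(cohort_improvement_score(cohorts))
```

A positive slope says newer cohorts retain better at the comparison week; a negative slope, despite a growing aggregate user base, is the premature-expansion signal described above.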
The Retention Ceiling Trap
A retention curve that flattens at 5% is technically showing PMF — a small group of users is sticking around. But 5% asymptotic retention means 95% of users try the product and leave permanently. This may indicate a niche product with a narrow audience, not a scalable business. The retention curve tells you whether PMF exists. It does not tell you whether the addressable market within that PMF is large enough to build a business on.
NPS Decomposition: Going Beyond the Top-Line Score
Net Promoter Score has become the default satisfaction metric for technology companies. The calculation is well-known: subtract the percentage of detractors (scores 0-6) from the percentage of promoters (scores 9-10). The resulting number, ranging from -100 to 100, is supposed to indicate customer loyalty.
The standard NPS calculation collapses rich distributional information into a single scalar:

NPS = % promoters (scores 9-10) − % detractors (scores 0-6)
The top-line NPS is nearly useless for PMF assessment. A company with 50% promoters and 40% detractors has an NPS of 10. A company with 50% promoters and 5% detractors has an NPS of 45. These are categorically different situations, but both are reduced to a single number that hides the distribution.
PMF assessment requires decomposing NPS into its constituent parts.
The promoter percentage indicates intensity of positive sentiment. For PMF measurement, the absolute percentage of promoters matters more than the NPS itself. A product with 60% promoters has strong pull regardless of how many detractors exist — the detractors may represent users outside the core use case rather than evidence of product failure.
The detractor percentage indicates the breadth of dissatisfaction. High detractor rates combined with high promoter rates suggest a polarizing product — one that excels for its target audience but fails for everyone else. This pattern often indicates PMF within a segment but not across the broader market.
The passive percentage (scores 7-8) is the overlooked middle. A high passive rate is the most ambiguous signal. These users are not dissatisfied enough to leave but not satisfied enough to recommend. They are the swing cohort — the users most likely to churn when a competitor offers something marginally better.
The distribution shape matters more than the mean. A bimodal distribution (peaks at both 9-10 and 0-3) tells a different story from a unimodal distribution centered at 7. The bimodal shape says the product is polarizing — it serves some users extraordinarily well and others poorly. The unimodal shape at 7 says the product is adequate for everyone but extraordinary for no one.
Figure 2: NPS Distribution Comparison — Three Companies with Similar NPS Scores (25, 15, 20)
Company A has an NPS of 25 (55 minus 30). Company B has an NPS of 15 (35 minus 20). Company C has an NPS of 20 (45 minus 25). By top-line NPS, Company A appears strongest. But Company B has the highest passive cohort at 45% — these users are convertible to promoters with product improvements, and the detractor base is the smallest. Company A's high promoter count is offset by a high detractor count, suggesting a polarization problem.
For the composite PMF Score, we extract four sub-metrics from NPS data:
- Promoter intensity — percentage scoring 9-10.
- Detractor severity — percentage scoring 0-3 (the extreme detractors, not just 0-6).
- Passive convertibility — the percentage of passives who increased their score over time (a leading indicator).
- Segment consistency — whether NPS scores vary by user segment, use case, or acquisition channel. High variance suggests PMF in some segments but not others.
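A minimal sketch of this decomposition from raw 0-10 responses (the function name and the example distribution are hypothetical; passive convertibility and segment consistency need longitudinal and segmented data, so they are omitted here):

```python
import numpy as np

def decompose_nps(scores: list) -> dict:
    """Decompose raw 0-10 NPS responses into the sub-metrics
    used by the composite framework."""
    arr = np.asarray(scores)
    n = len(arr)
    promoters = (arr >= 9).sum() / n * 100
    passives = ((arr >= 7) & (arr <= 8)).sum() / n * 100
    detractors = (arr <= 6).sum() / n * 100
    # Severity looks at extreme detractors (0-3), not the full 0-6 band
    extreme_detractors = (arr <= 3).sum() / n * 100
    return {
        "nps": round(promoters - detractors, 1),
        "promoter_intensity": round(promoters, 1),
        "detractor_severity": round(extreme_detractors, 1),
        "passive_share": round(passives, 1),
    }

# A polarized distribution: heavy at 9-10 and at 0-3, thin middle
sample = [10] * 40 + [9] * 15 + [7] * 10 + [5] * 5 + [2] * 20 + [0] * 10
print(decompose_nps(sample))
```

Note how the top-line NPS of this sample hides its bimodal shape; the sub-metrics expose it.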
Usage Depth Metrics: DAU/MAU, Feature Adoption, Session Frequency
Retention tells you whether users come back. NPS tells you how they feel about it. Usage depth tells you what they actually do when they are there.
Three usage metrics provide the foundation for depth assessment:
DAU/MAU Ratio. The ratio of daily active users to monthly active users measures the intensity of habitual engagement. A DAU/MAU of 50% means the average monthly user engages on roughly half the days in a month — a product woven into daily routine. A DAU/MAU of 10% means monthly users visit about three days per month — occasional engagement.
Benchmarks vary by category. Social media products aim for DAU/MAU above 50% (Facebook historically maintained 65%+). Productivity tools target 40-60%. B2B SaaS tools with weekly use cases may be healthy at 25-30%. The absolute number is less informative than the comparison against category-appropriate benchmarks.
Feature adoption breadth. How many of the product's core features does the average user engage with? A user who interacts with one feature out of ten has a narrow relationship with the product. A user who interacts with six out of ten has a deep one. Breadth of feature adoption correlates with switching costs — users who depend on multiple features are harder to dislodge.
The metric is typically expressed as the percentage of users engaging with more than N core features within a 30-day window. Products with strong PMF generally see 60%+ of active users engaging with three or more core features. Products without PMF often see usage concentrated in a single feature, with the rest ignored.
Session frequency and duration. How often users visit and how long they stay per visit. Frequency indicates habitual pull. Duration indicates depth of engagement per session. The product of frequency and duration — total time spent per user per period — is a proxy for the product's share of the user's attention budget.
The combination matters. High session frequency with low feature breadth suggests a utility product — used often for one thing. This can indicate PMF for a narrow use case. High feature breadth with low frequency suggests a comprehensive tool that users visit only when needed. The composite scoring weights both dimensions.
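The three usage sub-metrics can be folded into a single 0-100 usage depth component. A sketch, assuming linear scaling against category benchmarks (the benchmark defaults, target values, and function name are illustrative; the 30/40/30 point split follows the component breakdown given below):

```python
def usage_depth_score(dau_mau: float, feature_breadth_pct: float,
                      weekly_minutes: float,
                      category_dau_mau: float = 0.50,
                      target_minutes: float = 60.0) -> float:
    """Usage depth component (0-100): DAU/MAU (0-30 pts),
    feature adoption breadth (0-40 pts), frequency x duration
    (0-30 pts). Benchmarks are parameters because they vary
    by product category."""
    dau_mau_pts = min(dau_mau / category_dau_mau, 1.0) * 30
    # feature_breadth_pct: % of active users engaging 3+ core
    # features in 30 days; 60% earns full marks (per the text)
    breadth_pts = min(feature_breadth_pct / 60.0, 1.0) * 40
    time_pts = min(weekly_minutes / target_minutes, 1.0) * 30
    return round(dau_mau_pts + breadth_pts + time_pts, 1)

print(usage_depth_score(dau_mau=0.35, feature_breadth_pct=45,
                        weekly_minutes=40))
```

Linear scaling with a cap is the simplest defensible choice; a concave curve (diminishing returns above the benchmark) would be a reasonable refinement.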
Building the Composite PMF Score
The composite PMF Score combines the three measurement domains — retention, NPS decomposition, and usage depth — into a single number between 0 and 100. The score is not an average. It is a weighted composite that accounts for the different signal qualities of each component.
The formula:

PMF Score = 0.45 · S_R + 0.25 · S_N + 0.30 · S_U

where S_R is the retention component score (0-100), S_N is the NPS decomposition component score (0-100), and S_U is the usage depth component score (0-100). These default weights correspond to the SaaS calibration in Table 3; other business models adjust them.
Retention receives the highest weight because it is behavioral, not self-reported, and because it is the hardest metric to manipulate. Usage depth receives the second-highest weight because it captures the intensity of the product relationship beyond mere return visits. NPS receives the lowest weight because it is survey-based and subject to the biases discussed earlier — but it still contributes because it captures subjective value perception that pure behavioral data misses.
Scoring each component (0-100 scale):
Retention Component:
- Asymptotic retention rate: 0-40 points (scaled against category benchmark)
- Time to flattening: 0-30 points (shorter = better)
- Cohort improvement trend: 0-30 points (improving = better)
NPS Component:
- Promoter intensity: 0-35 points
- Detractor severity (inverse): 0-25 points
- Passive convertibility: 0-20 points
- Segment consistency: 0-20 points
Usage Depth Component:
- DAU/MAU ratio: 0-30 points (scaled against category)
- Feature adoption breadth: 0-40 points
- Session frequency x duration: 0-30 points
Figure 3: Composite PMF Score — Five Company Profiles (Component and Composite Scores, 0-100)
The chart reveals patterns that a single metric would miss. Startup A scores low across all dimensions — a clear absence of PMF. Startup C scores high across all dimensions — clear PMF. But the interesting cases are in the middle.
Scale-up D shows strong retention and usage depth but weak NPS — users keep using the product out of necessity (perhaps it has become embedded in their workflow) but do not feel positively about it. This is "PMF by inertia" — the product fits the market structurally but not emotionally. It is vulnerable to a competitor that provides equivalent functionality with a better experience.
Enterprise E shows strong retention with moderate NPS and good usage depth. The moderate NPS likely reflects variation across user segments — the product has strong PMF within its core segment but weaker fit as it expands. The component-level analysis reveals this; the composite alone would not.
Interpreting the Composite Score
0-25: Pre-PMF. The product has not found its fit. Focus on discovery over growth.

25-50: Approaching PMF. Signals are mixed. Identify which segments show strongest component scores and concentrate there.

50-70: PMF established but not dominant. The product fits some of the market well. The question is whether the segment with PMF is large enough.

70-100: Strong PMF. The product and market are pulling toward each other. This is where growth investment is warranted.
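The weighting and the interpretation bands can be sketched together (the default weights follow the SaaS calibration row of Table 3; the function name and signature are illustrative):

```python
def composite_pmf_score(retention: float, nps: float, usage: float,
                        weights=(0.45, 0.25, 0.30)) -> dict:
    """Combine the three component scores (each 0-100) into the
    composite PMF Score. Weights are a parameter so they can be
    recalibrated by business model."""
    w_r, w_n, w_u = weights
    score = w_r * retention + w_n * nps + w_u * usage
    if score < 25:
        band = "Pre-PMF"
    elif score < 50:
        band = "Approaching PMF"
    elif score < 70:
        band = "PMF established but not dominant"
    else:
        band = "Strong PMF"
    return {"pmf_score": round(score, 1), "band": band}

print(composite_pmf_score(retention=70, nps=40, usage=60))
```

Because the composite is a weighted sum, a high score can mask one weak component; the component-level scores should always be reported alongside it.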
The PMF Measurement Framework: Leading Indicators
The composite score measures where you are. Leading indicators tell you where you are going.
Most PMF metrics are lagging — they report what already happened. Retention curves require weeks of data to form. NPS requires time for user sentiment to stabilize. Usage depth metrics need enough sessions to represent a pattern rather than noise.
A complete PMF measurement framework must include leading indicators that anticipate changes in the composite score before those changes manifest in the lagging metrics.
Leading Indicator 1: Activation rate trajectory. The percentage of new users who complete a defined activation milestone (the "aha moment") within their first session or first week. If this rate is increasing cohort-over-cohort, PMF is likely strengthening even before retention data confirms it.
Leading Indicator 2: Organic acquisition ratio. The percentage of new users acquired through organic channels — word of mouth, direct traffic, organic search — versus paid channels. A rising organic ratio indicates that existing users are pulling new users in, which is one of the most reliable forward signals of PMF.
Leading Indicator 3: Time-to-value compression. How quickly new users reach their first value-generating action. If this time is decreasing across cohorts (because of product improvements, better onboarding, or clearer positioning), PMF is likely improving.
Leading Indicator 4: Feature request convergence. In early-stage products, feature requests are scattered across many directions — a sign that users want different things, which often means the product has not found its center of gravity. As PMF approaches, feature requests begin to cluster around a few themes. This convergence is a qualitative leading indicator that the market is telling you what it wants.
Leading Indicator 5: Expansion revenue velocity (for SaaS). The rate at which existing customers increase their spending — through seat expansion, plan upgrades, or additional module adoption. Rising expansion velocity is a leading indicator because it measures the depth of PMF: not just that users stay, but that they increase their commitment over time.
The framework operates at two temporal horizons:
Weekly pulse: Activation rate, time-to-value, organic acquisition ratio. These are fast-moving indicators that signal directional changes within days or weeks.
Monthly review: Full composite score calculation, cohort retention analysis, NPS decomposition, usage depth assessment. These are the structural metrics that confirm or refute the signals from the weekly pulse.
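The weekly pulse can be computed from a per-user signup table. A sketch, assuming a hypothetical schema (the column names, function name, and example data are illustrative):

```python
import pandas as pd

def weekly_pulse(signups: pd.DataFrame) -> pd.DataFrame:
    """Fast-moving leading indicators per signup week.

    Assumed columns in `signups` (one row per new user):
        signup_week: int cohort label
        activated: bool, hit the aha milestone in week one
        organic: bool, acquired via a non-paid channel
        hours_to_value: float, NaN if value never reached
    """
    grouped = signups.groupby("signup_week")
    return pd.DataFrame({
        "activation_rate": grouped["activated"].mean(),
        "organic_ratio": grouped["organic"].mean(),
        "median_hours_to_value": grouped["hours_to_value"].median(),
    })

signups = pd.DataFrame({
    "signup_week": [1, 1, 1, 1, 2, 2, 2, 2],
    "activated": [True, False, False, True, True, True, False, True],
    "organic": [False, False, True, False, True, True, False, True],
    "hours_to_value": [5.0, None, 30.0, 8.0, 4.0, 3.0, None, 6.0],
})
print(weekly_pulse(signups))
```

Rising activation and organic ratios alongside falling time-to-value across signup weeks suggest PMF is strengthening before the lagging retention data can confirm it.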
Rahul Vohra and the Superhuman PMF Engine
Rahul Vohra's approach at Superhuman represents the most rigorous public example of systematic PMF measurement in practice. His methodology, published in 2018, operationalized the Ellis survey into a continuous feedback engine that did more than measure PMF — it directed product development toward PMF.
The Superhuman process worked as follows. The team surveyed users with the Ellis question ("How would you feel if you could no longer use Superhuman?") and segmented the responses. Rather than simply tracking the 40% threshold, Vohra's team analyzed which users said "very disappointed" and what they valued most. Separately, they analyzed which users said "somewhat disappointed" or "not disappointed" and what those users were missing.
The insight was to focus product development on the gap between these two groups. The "very disappointed" users defined the core value proposition. The "somewhat disappointed" users indicated where that value proposition could extend. Product roadmap priority went to features that would convert "somewhat disappointed" users to "very disappointed" without alienating the existing promoters.
This approach is a precursor to the composite framework proposed here, with one critical addition: Vohra's method was survey-centric. It relied on what users said. The composite score supplements this with what users do — retention behavior and usage depth — to reduce the gap between declared and revealed preference.
Where Vohra's method excels is in its diagnostic specificity. By asking follow-up questions ("What type of people do you think would most benefit from Superhuman?" and "How can we improve Superhuman for you?"), the team extracted not just a PMF score but a product development roadmap derived from the score's components.
The limitation is scalability. The Superhuman method works well for products with hundreds or low thousands of users where each survey response can be read and categorized by a human. At scale — tens of thousands of users and above — the behavioral components of the composite score become necessary because they do not require active user participation.
Vohra's Contribution, Properly Scoped
Rahul Vohra's PMF engine is not a replacement for the composite score — it is the NPS decomposition component executed with unusual rigor. His method demonstrates that the distribution behind the score matters more than the score itself. The composite framework extends this principle from survey data to behavioral and usage data, creating a measurement system that scales beyond the survey-dependent stage.
When PMF Metrics Disagree With Each Other
The most instructive moments in PMF measurement are not when all metrics align. They are when the metrics diverge. Each pattern of disagreement has a specific diagnostic meaning.
Pattern 1: High retention, low NPS. Users keep coming back but do not feel good about it. This is the "necessary evil" pattern — common in enterprise software where switching costs are high and alternatives are limited. The product fits the market structurally (users need it) but not experientially (users do not enjoy it). This state is stable in the short term but vulnerable to disruption. Any competitor that matches the functionality with a better experience will attract the dissatisfied retained users rapidly.
Pattern 2: High NPS, low retention. Users love the product when they use it but do not use it often enough. This is the "love-but-forget" pattern — common in products that solve intermittent problems (travel planning, tax preparation, event management). The product has genuine PMF within the usage occasion but lacks a mechanism for habitual engagement. The strategic response is not to improve the product but to increase usage frequency through expansion into adjacent use cases or creation of ongoing engagement loops.
Pattern 3: High usage depth, low NPS and low retention. Users who remain engage deeply, but most users leave and satisfaction is low. This is the "power user island" pattern — the product has found a small group of highly engaged users while failing the broader market. The question is whether the power user segment is large enough to sustain a business, or whether the product needs to broaden its appeal.
Pattern 4: Strong metrics in one segment, weak in others. This is the most common pattern and the most frequently misinterpreted. Aggregate metrics hide segment-level variation. A product with a composite score of 50 might be a 75 in its core segment and a 25 everywhere else. The strategic response is not to improve the average — it is to double down on the strong segment and stop acquiring users in the weak ones.
Pattern 5: Improving retention, declining NPS. Paradoxical but real. The product is becoming more embedded in user workflows (improving retention) while user satisfaction declines. This typically signals that the product is becoming indispensable through integration and data lock-in, but the user experience is not keeping pace. The classic "too painful to leave, too frustrating to love" dynamic.
Each disagreement pattern points to a specific strategic response. This diagnostic power is the primary advantage of a multi-dimensional measurement approach over any single metric.
PMF Across Business Models: Marketplace vs. SaaS vs. Consumer
Product-market fit manifests differently across business models, and the composite score must be calibrated accordingly. A DAU/MAU ratio of 20% might signal strong PMF for a B2B procurement marketplace and alarming weakness for a consumer social product.
SaaS Products. Retention is measured as logo retention (do customers renew?) and net dollar retention (do they spend more over time?). The benchmarks are well-established: net dollar retention above 120% is considered strong PMF. NPS is typically surveyed at the account level. Usage depth is measured by seat utilization rate, feature adoption per seat, and the breadth of use cases served within each account. SaaS PMF is relatively straightforward to measure because the relationship is contractual and the usage data is comprehensive.
Consumer Products. Retention is measured as DAU or WAU cohort retention — the percentage of a signup cohort that remains active after N days or weeks. Consumer retention is more volatile than SaaS retention because there is no contract. NPS is harder to gather at scale because consumer users have less patience for surveys. Usage depth — session frequency, feature breadth, and time spent — becomes the dominant signal. Consumer PMF is harder to measure but also harder to fake.
Marketplace Products. PMF must be measured on both sides of the marketplace — and the two sides may be in different PMF states. A marketplace can have strong supply-side PMF (sellers love the platform and retain well) with weak demand-side PMF (buyers try it once and leave), or vice versa. The composite score should be calculated separately for each side and then combined with a weighting that reflects the relative strategic importance of each side at the current stage.
Additionally, marketplace PMF includes a dimension absent from SaaS and consumer: liquidity. A marketplace may have strong retention and usage depth on both sides but fail to achieve PMF because the matching frequency is too low — there are not enough transactions to satisfy both sides. Liquidity metrics (time to match, match success rate, transaction frequency) should supplement the composite score for marketplace contexts.
Table 3: PMF Measurement Calibration by Business Model
| Metric | SaaS | Consumer | Marketplace |
|---|---|---|---|
| Primary retention measure | Net dollar retention | DAU/WAU cohort retention | Both-side cohort retention + liquidity |
| NPS collection method | Account-level quarterly survey | In-app micro-survey | Separate surveys per side |
| Usage depth emphasis | Seat utilization, feature adoption | Session frequency, time spent | Transaction frequency, match quality |
| Key PMF threshold | NDR > 120% | D30 retention > 25% | Liquidity ratio > 60% |
| Leading indicator | Expansion revenue velocity | Organic acquisition ratio | Supply-side retention + demand repeat rate |
| Common false positive | High retention from lock-in | High DAU from notifications | High GMV from subsidized transactions |
| Composite score calibration | Retention 45%, NPS 25%, Usage 30% | Retention 35%, NPS 20%, Usage 45% | Retention 35%, NPS 20%, Usage 25%, Liquidity 20% |
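The calibration row of Table 3 can be expressed directly as per-model weight sets, with the marketplace case carrying the fourth liquidity component (a sketch; the dictionary keys and function name are illustrative):

```python
# Component weights per business model, from Table 3.
# Each row sums to 1.0; marketplace adds a liquidity component.
CALIBRATION = {
    "saas":        {"retention": 0.45, "nps": 0.25, "usage": 0.30},
    "consumer":    {"retention": 0.35, "nps": 0.20, "usage": 0.45},
    "marketplace": {"retention": 0.35, "nps": 0.20, "usage": 0.25,
                    "liquidity": 0.20},
}

def calibrated_pmf_score(model: str, components: dict) -> float:
    """Weighted composite for the given business model.
    `components` maps component name to its 0-100 score."""
    weights = CALIBRATION[model]
    return round(sum(weights[k] * components[k] for k in weights), 1)

print(calibrated_pmf_score("marketplace", {
    "retention": 60, "nps": 40, "usage": 50, "liquidity": 50,
}))
```

Keeping the weights in data rather than code makes the calibration auditable: the board can debate the weights without debating the implementation.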
PMF Decay: When You Had It and Lost It
PMF is not permanent. This is perhaps the most underappreciated fact in startup strategy. A company can achieve strong product-market fit and then lose it — gradually, and often without recognizing the loss until it has become severe.
PMF decay has five common causes:
1. Market evolution. The market moves and the product does not move with it. What users needed three years ago is not what they need today. Evernote had clear PMF in 2012 as a note-taking tool. By 2018, the market had shifted toward collaborative productivity, and Evernote's single-player model no longer fit the market's center of gravity.
2. Segment expansion beyond the core. A company with strong PMF in its initial segment expands to adjacent segments where the fit is weaker. The aggregate metrics decline, but the decline is attributed to "growing pains" rather than recognized as dilution of PMF. This is especially common post-Series B, when growth pressure pushes companies to expand their addressable market before they have expanded their product-market fit.
3. Product complexity creep. Features accumulate. The product that once solved a clear problem now tries to solve fifteen problems. New user activation drops because the product's value proposition becomes harder to grasp. Existing users are retained by inertia and sunk cost, but new cohort retention deteriorates.
4. Competitive displacement. A competitor enters with a product that better fits the market's current needs. PMF is relative, not absolute — a product that fits the market well can lose that fit when a better-fitting alternative appears. The market did not change; the competitive landscape did.
5. Pricing misalignment. The product's value delivery remains strong, but pricing moves out of alignment with willingness to pay. This can happen through price increases, through the commoditization of the product's core value (making users less willing to pay the same price), or through economic shifts that change the customer's budget constraints.
Figure 4: PMF Decay Over Time — Composite Score and Component Trajectory
The decay pattern in this example shows NPS declining first, followed by usage depth, with retention declining last. This sequence is typical. User sentiment deteriorates before behavior changes — people feel less positive about the product before they actually use it less, and they use it less before they actually leave. This lag structure means NPS decomposition serves as an early warning system for retention-level decay.
The composite score captures this decay when no single metric would trigger an alarm. In the chart, the company drops from a score of 72 (strong PMF) to 42 (marginal PMF) over six quarters. A company tracking only retention might not notice the problem until Q3 of Year 2, when retention drops below 60%. A company tracking the composite score would flag the decline starting in Q4 of Year 1 — four quarters earlier.
The PMF Decay Warning
PMF decay is a trailing phenomenon with a leading signal. NPS declines before usage, and usage declines before retention. A company that monitors only retention is looking at the last metric to move — by the time retention drops, the underlying fit has already eroded substantially. The composite score detects decay earlier because it aggregates across the full signal chain.
Using the Composite Score for Fundraising and Resource Allocation
The composite PMF Score has two primary operational applications: communicating product progress to investors and guiding internal resource allocation.
For fundraising, the composite score provides a vocabulary that is more precise than narrative and more nuanced than a single metric. Rather than telling a Series A investor "we think we have product-market fit," a founder can present a composite score with component breakdowns, trend lines, and segment-level analysis.
Investors pattern-match against heuristics. A composite score gives them a structured pattern to evaluate. More importantly, the score's trajectory — improving, stable, or declining — communicates something that a single-point metric cannot: momentum. An investor comparing a company at a PMF Score of 55 and improving with one at 65 and flat reaches a fundamentally different assessment than the point values alone would support.
The component breakdown also helps investors understand risk. A company with strong retention and usage depth but moderate NPS has different risk characteristics from a company with strong NPS but moderate retention. The former is more durable (behavioral fit exists); the latter is more fragile (emotional fit exists but behavioral stickiness does not).
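One simple way to quantify the momentum argument is to fit a least-squares slope to recent quarterly scores and report level and trajectory together. The two company series below are hypothetical, chosen to match the 55-and-improving versus 65-and-flat comparison in the text.

```python
# Quantifying score momentum as a least-squares slope (points/quarter).
# Company data is hypothetical, for illustration only.

def quarterly_slope(scores: list[float]) -> float:
    """Least-squares slope of score against quarter index."""
    n = len(scores)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(scores) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, scores))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

company_a = [45, 48, 51, 55]  # currently 55, improving
company_b = [66, 65, 66, 65]  # currently 65, flat

quarterly_slope(company_a)  # -> 3.3 points/quarter
quarterly_slope(company_b)  # -> -0.2, essentially flat
```

Reporting (level, slope) pairs rather than level alone is the minimal upgrade: it turns the qualitative "improving versus flat" comparison into a number an investor can benchmark across a portfolio.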
For resource allocation, the composite score provides decision rules for the perennial startup question: should we invest in growth or invest in product?
When the composite score is below 50, allocating resources to growth is premature. The product does not fit the market well enough for acquired users to stick. Growth spending will accelerate the top of the funnel while the bottom leaks. The rational allocation is: 80% product, 20% distribution.
When the composite score is between 50 and 70, the product fits part of the market. The question is which part. Resource allocation should focus on understanding segment-level variation in the composite score and deepening fit within the strongest segments before broadening to new ones. The allocation shifts to: 50% product, 50% distribution.
When the composite score exceeds 70, the product has established fit. Growth spending has high leverage because acquired users retain and expand. The allocation shifts to: 30% product, 70% distribution — with the product investment focused on maintaining and strengthening PMF rather than searching for it.
Table 4: Resource Allocation Framework by PMF Score Range
| PMF Score Range | Stage | Product Investment | Growth Investment | Primary Objective |
|---|---|---|---|---|
| 0-25 | Pre-PMF | 90% | 10% | Discovery: find the core value proposition |
| 25-50 | Approaching PMF | 75% | 25% | Convergence: narrow the target and deepen the fit |
| 50-70 | PMF Established (Segment) | 50% | 50% | Expansion: extend fit to adjacent segments |
| 70-85 | Strong PMF | 30% | 70% | Scale: invest in growth with confidence in retention |
| 85-100 | Dominant PMF | 25% | 75% | Defend: maintain fit while scaling aggressively |
The framework is not prescriptive in its exact percentages — these will vary by company stage, burn rate, and competitive context. But the directional logic is consistent: investment in growth should scale with evidence of product-market fit, and the composite score provides that evidence with more reliability than any single metric or executive intuition.
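The directional logic of Table 4 reduces to a banded lookup. The sketch below encodes the table's boundaries and percentages verbatim; as the text notes, these are defaults to recalibrate per company, not fixed rules.

```python
# Banded lookup implementing the directional rules of Table 4.
# Boundaries and allocations mirror the table; treat them as
# defaults to recalibrate, not prescriptions.

BANDS = [
    # (exclusive upper bound, stage, product share, growth share)
    (25,  "Pre-PMF",                   0.90, 0.10),
    (50,  "Approaching PMF",           0.75, 0.25),
    (70,  "PMF Established (Segment)", 0.50, 0.50),
    (85,  "Strong PMF",                0.30, 0.70),
    (101, "Dominant PMF",              0.25, 0.75),  # 101 so that 100 matches
]

def allocation(pmf_score: float) -> tuple[str, float, float]:
    """Map a composite PMF score to (stage, product share, growth share)."""
    if not 0 <= pmf_score <= 100:
        raise ValueError("PMF score must be in [0, 100]")
    for upper, stage, product, growth in BANDS:
        if pmf_score < upper:
            return stage, product, growth
    raise AssertionError("unreachable: the 101 sentinel covers score 100")

allocation(42)   # -> ("Approaching PMF", 0.75, 0.25)
allocation(100)  # -> ("Dominant PMF", 0.25, 0.75)
```

Boundary scores resolve upward here (a score of exactly 70 falls in the Strong PMF band), which is one defensible convention; the table itself leaves the tie-break unspecified.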
One additional application merits mention: the composite score can be used to evaluate M&A targets. An acquiring company can compute the composite PMF Score for a target's product to assess whether the reported growth is sustainable (backed by strong PMF) or fragile (driven by spending in the absence of PMF). Several growth-stage acquisitions in the 2021 era would have looked less attractive under composite PMF analysis — the growth was real but the underlying fit metrics were deteriorating.
Conclusion
Product-market fit is the most important concept in startup strategy and one of the least rigorously measured. The industry has relied on a combination of gut feeling, a single survey question, and narrative pattern matching to make decisions that determine whether companies live or die.
The composite PMF Score does not solve the measurement problem perfectly. No single number can capture the full complexity of the relationship between a product and its market. But it solves the problem far better than the alternatives by combining behavioral data (retention curves), subjective data (NPS decomposition), and engagement data (usage depth) into a structured framework that is both diagnostic and directional.
The framework's value lies not in the final number but in the decomposition. A PMF Score of 55 tells you less than the fact that retention is strong at 72, NPS is weak at 38, and the weakness is concentrated in a specific user segment. The components tell the story; the composite provides the summary.
Three principles should govern its application. First, measure continuously rather than episodically. PMF is a dynamic state, not a static achievement. Second, decompose always — never report the composite without the components. Third, calibrate for context — the same score means different things for a consumer social product, an enterprise SaaS platform, and a two-sided marketplace.
The companies that measure PMF rigorously will make better decisions about when to hire, when to spend, when to raise, and when to pivot. The companies that rely on feel will occasionally be right and will have no way of knowing whether their rightness was skill or luck.
Measurement does not guarantee success. But it does guarantee that failure, when it comes, will at least be instructive.
Further Reading
- Sean Ellis on Wikipedia — Creator of the 40% rule
- Superhuman PMF Engine — Rahul Vohra's methodology
- Retention Curves — The strongest PMF signal
References
- Andreessen, M. (2007). "The only thing that matters." Blog post, Pmarchive.
- Ellis, S. (2010). "The Startup Pyramid." Blog post, Startup-Marketing.com.
- Vohra, R. (2018). "How Superhuman Built an Engine to Find Product/Market Fit." First Round Review.
- Chen, A. (2021). The Cold Start Problem: How to Start and Scale Network Effects. Harper Business.
- Reichheld, F. (2003). "The One Number You Need to Grow." Harvard Business Review, 81(12), 46-54.
- Olsen, D. (2015). The Lean Product Playbook: How to Innovate with Minimum Viable Products and Rapid Customer Feedback. Wiley.
- Cagan, M. (2018). Inspired: How to Create Tech Products Customers Love. 2nd Edition, Wiley.
- Reeves, M. and Deimler, M. (2011). "Adaptability: The New Competitive Advantage." Harvard Business Review, 89(7/8), 134-141.
- Ries, E. (2011). The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses. Crown Business.
- Rachitsky, L. (2023). "What is good retention?" Newsletter analysis of retention benchmarks across product categories.
- Winters, C. (2019). "Retention is King." Blog post on retention curve analysis and PMF assessment.
- Balfour, B. (2018). "Product Market Fit is Not Enough." Reforge essays on the relationship between PMF, channel fit, and model fit.

Founder, Product Philosophy
Murat Ova writes at the intersection of behavioral economics, marketing engineering, and data-driven strategy. He founded Product Philosophy to publish research-grade analysis for practitioners who build products and grow businesses — without the hand-waving.