Value-Based Pricing Operationalized: A Measurement Framework

TL;DR: Value-based pricing is talked about more than it is done, because the gap between the slogan and the operational workflow is wide. Doing it properly requires a research stack (conjoint analysis for trade-off structure, Van Westendorp price-sensitivity meter for absolute price boundaries, in-product willingness-to-pay tests for revealed preference) plus a pricing operating cadence (initial survey, in-market validation, ongoing iteration). Survey-based and revealed-preference methods produce systematically different numbers; the gap is informative, not a failure of either method. Value-based pricing fails predictably in commodity categories, regulated industries, and two-sided markets, and pretending otherwise produces pricing decisions that look rigorous and act like guesses.

A note on named sources and companies. Stephan Liozu, Madhavan Ramanujam, Klaus Schmidt, and McKinsey are cited from their public writing and research. Operating examples (Adobe, Atlassian, Slack, Salesforce) appear as well-known archetype illustrations. Quantitative figures from "advisory work" come from anonymized partner engagements with mid-market SaaS and consumer-subscription operators in the same archetypes, not from the named companies.

The Phrase Versus the Practice

Most teams that say they do value-based pricing do not, in any operationally meaningful sense. They have read Ramanujam and Tacke's Monetizing Innovation (2016), they have heard Stephan Liozu present on value-based pricing transformations at industry conferences, and they have internalized the slogan: price to the value the customer gets, not to the cost of production or to a competitor's number. The slogan is correct. The execution gap is large.

In advisory work, the gap typically looks like this. The product team has agreed that value-based pricing is the right framework. The pricing page lists tiers with feature differentiation. The sales team has discount authority. The pricing decision was made roughly two years ago in a quarter-long pricing project that produced a deck, a tier structure, and a set of "pricing principles." Since then, the pricing has not changed materially, the willingness-to-pay assumptions have not been re-measured, no Van Westendorp study has been run, no choice-based conjoint has been fielded, and the pricing decisions that come up (a new feature tier, a competitive response, a renewal-pricing call) are made by gut. This is not value-based pricing. It is a one-time pricing project that became a stable price list, with a value-based label retrofitted onto it.

The honest pattern is more demanding. Value-based pricing as an operational practice means measuring willingness to pay continuously, segmenting by value-driver, testing price levels in market, and updating the price list on a cadence that reflects the changing competitive and product landscape. It is closer to a pricing function than a pricing project. The companies that do this well (the Adobes, the Atlassians, the few mid-market SaaS operators that have built pricing capacity into the product organization rather than treating it as a one-off marketing exercise) have made it look easy. What is hidden underneath the appearance of ease is several measurement workflows, an operating cadence, and a willingness to act on imperfect data.

The Research Stack: Conjoint, Van Westendorp, and Revealed Preference

The three measurement workflows that, together, constitute a credible value-based pricing operation are choice-based conjoint analysis, the Van Westendorp price-sensitivity meter, and in-market revealed-preference testing. None of the three is sufficient on its own; together they triangulate a pricing decision that is grounded in something other than the pricing committee's intuition.

Choice-based conjoint analysis (CBC) is the workhorse for measuring trade-off structure. Customers are presented with sets of product configurations, each defined by levels on several attributes (feature bundle, support level, price), and asked to choose their preferred configuration from each set. The resulting choice data is analyzed (typically with a multinomial logit or hierarchical Bayes model) to estimate the relative utility weight customers place on each attribute level. The output is a utility function for each respondent that can be used to simulate market share under any candidate price-and-feature configuration, including configurations that were not directly tested.
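
A minimal sketch of the simulation step, assuming part-worth utilities have already been estimated (the estimation itself, whether multinomial logit or hierarchical Bayes, is not shown). The attribute names, levels, and utility values are hypothetical, and the share rule is the plain logit rule rather than any particular vendor's simulator.

```python
import math

# Hypothetical part-worth utilities for one respondent (illustrative values,
# not taken from any study). Price is coded as one utility per tested level.
PART_WORTHS = {
    "support": {"email": 0.0, "priority": 0.6},
    "price": {99: 0.9, 129: 0.4, 149: 0.0},
}


def total_utility(config, part_worths):
    """Sum the part-worths of the attribute levels in one configuration."""
    return sum(part_worths[attr][level] for attr, level in config.items())


def logit_shares(configs, part_worths, none_utility=None):
    """Logit share of preference across competing configurations.

    If none_utility is given, a no-purchase option competes with the products
    and its share is returned as the last element.
    """
    utilities = [total_utility(c, part_worths) for c in configs]
    if none_utility is not None:
        utilities.append(none_utility)
    top = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_market(configs, respondents, none_utility=None):
    """Average the per-respondent shares across the sample."""
    per_person = [logit_shares(configs, pw, none_utility) for pw in respondents]
    return [sum(col) / len(per_person) for col in zip(*per_person)]


# Two hypothetical offers competing for the same respondent.
scenario = [
    {"support": "priority", "price": 149},
    {"support": "email", "price": 99},
]
shares = simulate_market(scenario, [PART_WORTHS])
```

Passing a none_utility lets a no-purchase option absorb share, which matters once the simulated prices are high enough that some buyers would walk away.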

The Green and Wind paper that popularized conjoint analysis in marketing research, New Way to Measure Consumers' Judgments (Harvard Business Review, 1975), is the most-cited entry point, but the modern best-practice reference is Orme's Getting Started with Conjoint Analysis (now in its fourth edition), and Sawtooth Software's documentation for CBC remains the practical implementation guide most pricing practitioners actually use. The technical details (number of choice sets, design efficiency, number of levels per attribute) matter, but the conceptual contribution of CBC is that it forces respondents to trade off rather than to rate, which produces preference data that is closer to actual purchase behavior than rating-scale methods.

The Van Westendorp price-sensitivity meter (PSM) is the workhorse for measuring absolute price boundaries. Respondents are asked four questions about a defined product:

At what price would this product be so expensive you would not consider buying it (Too Expensive)?
At what price would this product be expensive but you would still consider it (Expensive)?
At what price would this product be a bargain, a great buy for the money (Cheap)?
At what price would this product be so cheap you would question its quality (Too Cheap)?

The four resulting price distributions cross at characteristic points. The intersection of "Too Cheap" and "Expensive" defines the Point of Marginal Cheapness (PMC). The intersection of "Too Expensive" and "Cheap" defines the Point of Marginal Expensiveness (PME). The intersection of "Cheap" and "Expensive" defines the Indifference Price Point (IPP). The intersection of "Too Cheap" and "Too Expensive" defines the Optimal Price Point (OPP). The acceptable price range is the band between PMC and PME, with the IPP and OPP as internal reference points within it.
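
The four intersections can be computed directly from the raw answers. The sketch below is one simple way to do it, assuming each respondent gave four prices and that reading the crossings off a coarse price grid is precise enough. The response data and function names are hypothetical, and the curve pairings follow the definitions in the paragraph above.

```python
def share_at_or_above(answers, price):
    """Falling curve: share of respondents whose threshold is >= price."""
    return sum(a >= price for a in answers) / len(answers)


def share_at_or_below(answers, price):
    """Rising curve: share of respondents whose threshold is <= price."""
    return sum(a <= price for a in answers) / len(answers)


def first_crossing(grid, falling, rising):
    """First grid price where the rising curve meets or passes the falling one."""
    for price in grid:
        if share_at_or_below(rising, price) >= share_at_or_above(falling, price):
            return price
    return None  # curves never cross on this grid


def psm_points(too_cheap, cheap, expensive, too_expensive, grid):
    """Return the four Van Westendorp reference prices."""
    return {
        "PMC": first_crossing(grid, too_cheap, expensive),
        "OPP": first_crossing(grid, too_cheap, too_expensive),
        "IPP": first_crossing(grid, cheap, expensive),
        "PME": first_crossing(grid, cheap, too_expensive),
    }


# Hypothetical answers from eight respondents (one price per question each).
too_cheap = [40, 50, 60, 70, 80, 90, 100, 110]
cheap = [70, 80, 90, 100, 110, 120, 130, 140]
expensive = [90, 100, 110, 120, 130, 140, 150, 160]
too_expensive = [120, 130, 140, 150, 160, 170, 180, 190]

points = psm_points(too_cheap, cheap, expensive, too_expensive, range(40, 200, 10))
acceptable_range = (points["PMC"], points["PME"])
```

A finer grid or linear interpolation between grid points gives smoother estimates, but with typical PSM sample sizes the extra precision is mostly cosmetic.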

The PSM is technically simple, which is both its strength and its weakness. It is fast to field, easy to analyze, and produces an interpretable price-range output. It is also vulnerable to several known biases: the price thresholds respondents report are sensitive to the order of the questions, the framing of the product description, the respondent's familiarity with the category, and the gap between stated and actual purchase intent. Klaus Schmidt's work on willingness-to-pay measurement, including his contributions to the methodology literature on PSM, has documented these biases in detail, and the practical guidance is to treat PSM results as a starting range rather than a precise number, and to validate against either CBC or revealed-preference data.

Revealed-preference testing (in-market A/B testing of price levels) is the workhorse for validating the survey-based findings against actual purchase behavior. A subset of visitors sees a candidate price (the PSM's optimal price point, or the CBC-derived optimum), another subset sees the current price, and the resulting conversion and revenue differences are measured. This is the gold-standard methodology because it captures actual purchase behavior rather than stated intent, but it has constraints (sample-size requirements, the operational complexity of running different prices simultaneously, the legal and brand-perception risk of being seen to price-discriminate) that make it impossible in some categories.
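
The sample-size constraint is easy to underestimate, so a rough calculation before committing to a test is worthwhile. The sketch below uses the textbook normal-approximation formula for comparing two conversion rates. The conversion rates and traffic figures are hypothetical, and it covers conversion only; revenue per visitor is usually noisier and needs more traffic.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_arm(base_rate, variant_rate, alpha=0.05, power=0.8):
    """Approximate visitors needed per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    pooled = (base_rate + variant_rate) / 2
    numerator = (
        z_alpha * sqrt(2 * pooled * (1 - pooled))
        + z_beta * sqrt(
            base_rate * (1 - base_rate) + variant_rate * (1 - variant_rate)
        )
    ) ** 2
    return ceil(numerator / (base_rate - variant_rate) ** 2)


def weeks_needed(per_arm, weekly_visitors_in_test):
    """Weeks to fill both arms when test traffic is split evenly between them."""
    return ceil(2 * per_arm / weekly_visitors_in_test)


# Hypothetical case: 4.0% conversion at the current price, and the team wants
# to detect a drop to 3.4% at the higher candidate price.
n = visitors_per_arm(0.040, 0.034)
weeks = weeks_needed(n, weekly_visitors_in_test=5000)
```

Small expected differences in conversion drive the required sample up quickly, which is why low-traffic B2B funnels often cannot run a clean price test at all.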

Value-Based Pricing Research Methods, Comparative Properties (Advisory Partner Engagements, 2022-2025)

Method	Captures	Output	Cost (rough)	Time to Run	Common Pitfalls
Choice-based conjoint (CBC)	Trade-off structure, feature-price utility	Per-respondent utility weights, market simulator	$58K-$142K typical (B2B), $24K-$72K (B2C)	8-12 weeks	Attribute selection bias, sample-frame issues
Van Westendorp PSM	Acceptable price range, optimal price point	PMC, PME, IPP, OPP price levels	$14K-$48K (panel cost dominates)	4-7 weeks	Order effects, product-description framing
Gabor-Granger	Demand curve at discrete price points	Price-volume relationship	$11K-$38K	4-6 weeks	Single-price-question bias, no trade-off captured
In-market A/B price test	Revealed preference at tested prices	Conversion lift, revenue per visitor	Engineering + analyst time, no panel cost	9-14 weeks for adequate power	Sample size, ethics/brand risk
Sales-conversation pricing	Anchoring and negotiation pattern	Discount distribution, win rate by price	Embedded in sales cost (roughly $0.8K-$2.6K per deal analyzed)	Continuous	Selection bias toward closable deals

The methods in the table have systematic biases that point in different directions. Survey methods (CBC, PSM, Gabor-Granger) tend to overstate willingness to pay relative to actual behavior, because the survey context removes the budget pressure and the competing alternatives that constrain actual purchase decisions. Revealed-preference tests capture the right behavior but are typically run on a narrow set of price levels and do not generalize easily to unexplored regions of the price space. Sales-conversation data captures negotiation dynamics but is biased by the selection of which deals reach the negotiation stage. Used together, the methods triangulate; used individually, each is misleading.

The Survey Versus Revealed-Preference Gap

The systematic gap between stated and revealed willingness to pay is one of the most-documented phenomena in pricing research. It has been measured across categories (consumer goods, software, services, regulated products), across methods (PSM vs A/B test, CBC vs A/B test, survey-based purchase intent vs actual purchase), and across decades (the original work goes back to the 1970s contingent-valuation literature in environmental economics).

The size of the gap is large enough to matter. In several published meta-analyses, the average stated willingness to pay overstates revealed willingness to pay by 30-60% across categories. The gap is larger for products with higher hedonic content (luxury goods, entertainment) and smaller for utilitarian goods (commodities, infrastructure). The gap is also larger when the survey context is divorced from purchase-decision context and smaller when the survey context closely mimics the purchase environment.

One of these, the Murphy et al. meta-analysis, covers environmental economics studies, but the underlying mechanism (the hypothetical bias in contingent valuation) generalizes to pricing research because the cognitive operation is the same: respondents are asked to value something they are not actually purchasing at the moment of the survey, which removes a class of constraints that operate in actual purchase decisions.

The implications for the pricing operating cadence are concrete. Survey-based numbers from PSM and CBC should be treated as upper bounds on willingness to pay, not as point estimates. The right way to use them is to define a candidate price range from the survey work and then test the range in market, with the in-market result treated as the operational price input. The Adobe and Atlassian pricing organizations are known to operate this way, with the conjoint and PSM work setting the candidate range and the in-market tests confirming the operating price.

Stated vs Revealed Willingness to Pay, Composite Across Product Categories (Advisory Partner Operators, 2022-2025 Sample)

The composite indexes revealed values to stated values of 100 across categories. The pattern that matters is the differential by category: charity and hedonic categories show the largest gaps (revealed WTP is roughly 34-57% of stated), utilitarian and commodity categories show the smallest (revealed is 82-91% of stated). For a value-based pricing project in a hedonic-leaning category (premium consumer subscription, lifestyle brands), discounting stated WTP by 40-50% before setting an operating price is consistent with the published meta-analytic patterns.

Contrary to the Conventional View

Conventional view

More sophisticated survey methods (CBC instead of PSM) eliminate the stated-versus-revealed gap.

What the evidence shows

The CBC methodology reduces hypothetical bias somewhat because the choice-based format mimics purchase decisions more closely than direct-rating methods do. It does not eliminate the bias. Published comparisons of CBC-derived WTP against in-market price test results consistently show that CBC overstates revealed WTP by approximately 15 to 30 percent across categories, which is smaller than the 30 to 60 percent gap typical of direct-rating methods but still material. The right operational use of CBC is to take the model output and apply a category-specific deflator before using the number for pricing decisions, with the deflator calibrated against past in-market validation work. Treating CBC output as a point estimate is a common and expensive mistake.
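
A deflator of this kind is simple to maintain once a few validation rounds exist. The sketch below assumes the team has kept pairs of survey-derived and in-market prices from earlier rounds. The figures are hypothetical, and averaging the ratios is one reasonable calibration choice among several.

```python
from statistics import mean

# Hypothetical history for one category: (survey-derived WTP, price that
# held up in the in-market test). Not data from any named company.
PAST_ROUNDS = [(120.0, 96.0), (150.0, 117.0), (80.0, 66.0)]


def calibrate_deflator(rounds):
    """Mean ratio of revealed to stated WTP across past validation rounds."""
    return mean(revealed / stated for stated, revealed in rounds)


def deflate(stated_wtp, deflator):
    """Scale a new survey-derived number down before using it as a price input."""
    return stated_wtp * deflator


deflator = calibrate_deflator(PAST_ROUNDS)
adjusted = deflate(140.0, deflator)  # new CBC output of $140, deflated
```

With only a handful of past rounds the deflator is itself a rough estimate, so the deflated number is still a candidate for in-market testing rather than a final price.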

A Concrete Operating Workflow

The abstract description of "use CBC plus PSM plus in-market validation" is not actionable. The concrete operating workflow that I have seen work in mid-market SaaS and consumer-subscription contexts looks roughly as follows.

The pricing study runs at a roughly quarterly cadence for early-stage products and at an annual cadence for mature ones. The deliverable of each round is not "the new price" but "the candidate price range with confidence intervals and assumptions."

1. Define the pricing decision and the product configuration. This sounds trivial. It is not. Most failed pricing projects fail at this step because the team has not agreed on what is being priced (the base product, the upgrade tier, the bundle, the annual versus monthly plan), who the target buyer is, or what the comparison set is. The clearest deliverable from this step is a one-page brief that names the product configuration, the buyer persona, the competitor set, and the decision the pricing study is intended to inform.
2. Run the qualitative pre-work. Before fielding any survey instrument, the team interviews 8-15 prospective or current customers to understand the value drivers, the competitive comparison set the customer actually uses (which is usually different from the one the product team assumed), and the willingness-to-pay anchors that customers reference. The qualitative phase informs the attribute selection for the CBC and the product-description framing for the PSM.
3. Field the survey work. This means a CBC with 6-10 attributes (price always one of them, the others being feature bundles, support level, integration depth, contract length, and 1-2 product-specific dimensions) and a PSM on the chosen product configuration. The CBC requires 200-400 respondents for B2B and 500-1000 for B2C to produce stable utility estimates. The PSM can run with 100-300 respondents depending on the desired precision.
4. Analyze and produce the candidate range. The CBC simulator produces market-share projections under different price-and-feature configurations. The PSM produces the PMC-to-PME range with the IPP and OPP markers. The two methods are reconciled into a candidate price range (typically wider than either method alone suggests), with assumptions and known biases documented explicitly.
5. Validate in market. A subset of traffic sees a price within the candidate range (often the PSM's OPP or the CBC's revenue-maximizing point, deflated by 20-30% for the hypothetical bias). The validation runs long enough to reach statistical significance on conversion and revenue per visitor (typically 8-16 weeks). The in-market result is treated as the operating price.
6. Operate and revisit. The price is set, the operating cadence (quarterly review for early-stage, annual for mature) is scheduled, and the willingness-to-pay measurement is rerun on cadence to detect drift. Most value-based pricing operations fail at this step because the team treats the initial study as the end of the work rather than as the establishment of the operating capability.
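
The hand-off from the analysis step to the validation step can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: it takes the envelope of the PSM and CBC ranges as the candidate range (which is why the result is wider than either alone) and applies the 20-30% deflation to a survey-derived anchor such as the OPP. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PriceRange:
    low: float
    high: float


def candidate_range(psm: PriceRange, cbc: PriceRange) -> PriceRange:
    """Envelope of the two survey ranges, so neither method's signal is dropped."""
    return PriceRange(min(psm.low, cbc.low), max(psm.high, cbc.high))


def validation_band(anchor: float, deflation=(0.20, 0.30)) -> PriceRange:
    """Deflate a survey-derived anchor by the assumed hypothetical-bias range."""
    light, heavy = deflation
    return PriceRange(anchor * (1 - heavy), anchor * (1 - light))


# Hypothetical survey outputs for one product configuration.
psm = PriceRange(low=80.0, high=150.0)   # PMC to PME
cbc = PriceRange(low=110.0, high=160.0)  # plausible range from the simulator
candidate = candidate_range(psm, cbc)
band = validation_band(anchor=125.0)     # e.g. the PSM's OPP
```

If the deflated band falls outside the candidate range, that is worth documenting as an assumption conflict before the in-market test is designed.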

The Value-Based Pricing Operating Loop

The workflow described above is what differentiates pricing as a function from pricing as a project. Most teams do steps 1-4 once, ship the result, and consider the work done. The teams that do value-based pricing operationally do all six steps, on cadence, and treat the in-market validation as the only data that matters for the actual operating decision.

From Experience

Advising a mid-market SaaS operator on its pricing-research program, 2024-2025.

The team had previously commissioned a single CBC study from a research vendor roughly two years earlier, at a cost of around $87K all-in, and had set its pricing tiers based on the output. By the time we started, the pricing felt off (win rates were trending down at the same price point, and the competitive set had shifted), but no fresh measurement work had been done. We re-fielded a smaller PSM and a streamlined CBC, this time internally rather than through a vendor, and the cost dropped to roughly $22K-$28K per round at a much faster turnaround. The discipline we put in place was a roughly six-monthly re-measurement, with the survey work informing a candidate range and an in-product A/B test confirming the operating price. The first three cycles produced one upward price move (good), one downward move on a stalled tier (also good), and one no-change confirmation that the existing price was within the operating range (also useful, because it ruled out the team's instinct to discount). The discipline mattered more than any single measurement; what was missing before was not the research method but the operating loop.

Where Value-Based Pricing Breaks Down

The honest part of the value-based pricing literature is the acknowledgement that the framework does not apply universally. Several categories are structurally hostile to value-based pricing, and forcing it produces decisions that look rigorous and behave badly.

The first category is commodities. A pure commodity (raw materials, fungible inputs, transactional infrastructure where switching costs are minimal) has a price determined by the marginal cost of the marginal producer (the highest-cost supplier still needed to meet demand), not by the value the customer extracts. Trying to apply value-based pricing to a commodity means either pricing above the market-clearing level and losing the sale, or pricing at the market level and discovering that the value-based framework was a slogan for what was actually a market-following strategy. The honest answer for commodities is that the framework does not apply, and the pricing decision is about positioning (where your operation sits on the cost curve) rather than about customer value capture.

The second category is two-sided markets and marketplaces. In a marketplace (Uber, Airbnb, eBay, payment networks), the price on one side is structurally tied to the volume and price on the other side, and unilateral value-based pricing on one side ignores the cross-side network effects that determine the optimal pricing structure. Marketplace pricing typically uses asymmetric subsidy structures (subsidize the side with the higher price elasticity to grow liquidity, capture value on the other side) and dynamic balancing rather than value-based capture. The pricing literature on two-sided markets (Rochet and Tirole's 2003 paper, Evans and Schmalensee's work) addresses this explicitly. Trying to value-base price one side of a marketplace without modeling the cross-side effects produces pricing that destroys network liquidity.

The third category is regulated industries. Healthcare pricing (in many countries), utility pricing, pharmaceutical pricing in most developed markets, and financial-services pricing under various regulatory regimes all have pricing constraints that are imposed externally and that override what value-based pricing would set. The right framework for these categories is regulatory-compliance pricing within the range the regulator allows, with value-based methods used (if at all) to choose a price within that range rather than to price to the underlying willingness to pay.

The fourth category is products where the value to the customer is highly uncertain and only revealed after extended use. Enterprise software products with deployment cycles measured in years, professional services where the deliverable is co-created with the client, and products with strong learning-curve effects all share the property that the customer cannot credibly estimate the value at the time of purchase, which means survey-based WTP measurement captures a guess rather than a value estimate. The pricing approach for these categories is often outcome-based (price as a share of measured outcomes rather than as a flat fee) or staged (low entry price with usage-based or expansion-based pricing as the value is revealed), which has the value-based spirit but uses different operational mechanics.

When Value-Based Pricing Applies and When It Does Not

Category Type	Value-Based Pricing Fit	Operational Pattern	Better Alternative
Differentiated B2C subscription	Strong fit	CBC + PSM + in-market test	None
Mid-market B2B SaaS	Strong fit	Full research stack + cadence	None
Enterprise B2B with deployment cycles	Moderate fit	Value-engineering models + outcome-based contracts	Hybrid value + outcome
Commodity goods/services	Poor fit	Market-following pricing	Cost-curve positioning
Two-sided marketplaces	Poor fit	Asymmetric subsidy + dynamic balancing	Network-effect pricing
Regulated industries (healthcare, utilities)	Constrained fit	Within regulatory range only	Compliance + within-range optimization
Heavily competitive saturated category	Limited fit	Competitive benchmarking + differentiation premium	Competition-anchored with premium math
Bundled offerings	Moderate fit, complex	CBC on bundles, choice-modeling	Bundle-architecture optimization

The Practical Math: Conjoint Output to Pricing Decision

The bridge from a conjoint analysis output to an actual pricing decision is the part of the workflow that is most often skipped. The output of a CBC is a per-respondent utility function. The pricing decision needs to translate that into a recommended price. The translation step matters, and it is not automatic.

The standard approach uses the CBC's market simulator to project market share under each candidate price level, holding other attribute levels fixed. The simulation generates a demand curve (estimated market share as a function of price), and from the demand curve the analyst computes revenue (price times projected share times market size) and contribution (price minus unit variable cost, times projected share, times market size). The recommended price is typically the contribution-maximizing point on the curve, with adjustments for strategic considerations (penetration pricing, competitive response, brand positioning).

The math is straightforward but has several places where it goes wrong in practice. The simulator's projections of share depend on the assumption that the choice set in the simulator matches the choice set the actual buyer faces in the market, which is rarely fully true. The contribution maximum on the modeled curve is often a sharp peak that does not survive sensitivity analysis (small changes in assumed costs or competitor prices move the peak by 10-20%). The output is also typically presented as a point estimate when it should be presented as a range with explicit confidence intervals.
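
Both the calculation and its fragility are easy to see in a small sketch. The share projections, market size, and unit costs below are hypothetical, chosen only to show how a modest change in the assumed variable cost can move the contribution peak from one price point to the next.

```python
# Hypothetical simulator output: projected share of the addressable market at
# each candidate monthly price. Illustrative numbers, not engagement data.
SHARE_BY_PRICE = {99: 0.140, 129: 0.120, 149: 0.099, 179: 0.074}
MARKET_SIZE = 10_000  # addressable accounts (assumed)


def revenue_curve(share_by_price, market_size):
    """Revenue at each price: price x projected share x market size."""
    return {p: p * s * market_size for p, s in share_by_price.items()}


def contribution_curve(share_by_price, market_size, unit_cost):
    """Contribution at each price: (price - unit cost) x share x market size."""
    return {
        p: (p - unit_cost) * s * market_size for p, s in share_by_price.items()
    }


def peak_price(curve):
    """Price level with the highest value on the curve."""
    return max(curve, key=curve.get)


def peak_by_cost(share_by_price, market_size, unit_costs):
    """Sensitivity check: where the contribution peak sits under each cost."""
    return {
        cost: peak_price(contribution_curve(share_by_price, market_size, cost))
        for cost in unit_costs
    }


base_peak = peak_price(contribution_curve(SHARE_BY_PRICE, MARKET_SIZE, 30.0))
sensitivity = peak_by_cost(SHARE_BY_PRICE, MARKET_SIZE, [20.0, 30.0, 40.0])
```

In this made-up example the peak sits at one price under the base cost assumption and shifts to the next price up when the assumed unit cost rises, which is the argument for reporting a range rather than a single point.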

CBC Simulator Output: Demand and Contribution Curves from a Mid-Market SaaS Engagement (2024)

The chart shows the typical shape of a CBC-derived demand and contribution curve from a 2024 advisory engagement. The contribution peak in this example sits at the $129 price point, with $149 a close second. The naive recommendation from a CBC simulator is to price at $129. The operationally-aware recommendation is to define a range ($129 to $149) and validate in market, because the simulator's confidence in distinguishing those two points is typically lower than the simulator's point output suggests, and the in-market behavior may differ from the modeled output by more than the gap between the two points.

A second pricing-decision-from-CBC consideration is the structure of the price ladder. A single price point answer ignores the segmentation that the CBC utility data actually supports. Most pricing decisions are really about how to design a tiered price ladder (three to five tiers spanning different feature bundles and price points) rather than a single price. The CBC data can support a tier-design exercise that identifies which feature-and-price combinations create the most efficient revenue capture across the customer base, and the tier design is often more consequential than the within-tier price level.

The conjoint output is the input to a pricing decision, not the decision itself. The teams that get value out of conjoint analysis are the ones that treat the simulator output as the starting point of a discussion about ladder design, segment targeting, and strategic positioning, not the ones that treat the contribution peak as the answer.

The Pricing Function Versus the Pricing Project

The deepest distinction in the value-based pricing literature, well-articulated in Stephan Liozu's work on pricing organization design, is the distinction between pricing as a project (a one-time engagement that produces a price decision) and pricing as a function (an ongoing organizational capability that produces price decisions continuously). The slogan version of value-based pricing is usually a pricing project. The operational version is a pricing function.

The differences are organizational. A pricing function has dedicated headcount (a pricing manager or pricing team, depending on company size), a defined cadence (quarterly or annual price reviews, with measurement cadence underneath), an instrumentation layer (the systems that track WTP, win rates, discount distributions, competitive intelligence), and a decision-making authority (who can change a price, and on what evidence). A pricing project produces a deck and a price list and goes away.

The McKinsey pricing benchmark research, which has been published in various forms over the last decade (2010 McKinsey & Company report on Pricing, recent updates in the McKinsey Quarterly on pricing as a margin-improvement lever), consistently finds that companies with dedicated pricing functions deliver materially higher revenue growth and margin expansion than companies that treat pricing as a project. The published numbers vary, but the patterns are consistent across industries: dedicated pricing functions correlate with 1 to 5 percentage points of margin improvement over multi-year periods, with the wider distributions in industries where pricing complexity is higher.

The organizational design choice (when to invest in a pricing function rather than running pricing as a project) is itself a pricing decision in disguise. The investment in a pricing function pays off in proportion to the company's pricing complexity (more SKUs, more buyer segments, more competitive dynamics, more frequent product changes) and the company's revenue base (the absolute dollars at stake from a one-percent margin improvement). For a $5M ARR company with a single-tier product, a pricing project annually is probably sufficient. For a $100M ARR company with three product lines and a global customer base, a pricing function is almost certainly underweight at one or two FTE and probably justifies a small team.
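
The scale argument reduces to back-of-envelope arithmetic. The sketch below assumes the margin improvement applies to the full revenue base and uses a hypothetical fully loaded cost per pricing hire; both are simplifications, and the output is a ceiling on supportable headcount, not a staffing recommendation.

```python
LOADED_COST_PER_FTE = 200_000  # assumed fully loaded annual cost, hypothetical


def margin_dollars(annual_revenue, margin_points):
    """Annual dollars from a margin improvement given in percentage points."""
    return annual_revenue * margin_points / 100


def supportable_fte(annual_revenue, margin_points, cost_per_fte=LOADED_COST_PER_FTE):
    """Headcount the margin gain would pay for at break-even."""
    return margin_dollars(annual_revenue, margin_points) / cost_per_fte


small = supportable_fte(5_000_000, margin_points=1)    # $5M ARR company
large = supportable_fte(100_000_000, margin_points=1)  # $100M ARR company
```

At one point of margin the smaller company cannot fund even one dedicated hire from the gain, while the larger one can fund a small team, which is the asymmetry the paragraph above describes.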

What To Do If You Are Starting Now

The practical recommendation for a team starting from scratch (no current pricing-research capability, no dedicated pricing headcount, a pricing decision that needs to be made in the next two quarters) is more pragmatic than the full operating workflow above.

Run a streamlined PSM first. A PSM with 150-300 respondents from a representative customer panel can be fielded in 4-6 weeks for roughly $14K-$24K (panel cost is the dominant line item) and will give a candidate price range that is materially better than the gut-feel alternative. Treat the output as a range, not a point estimate, and apply a 20-30% deflator to the OPP for the hypothetical bias.

Validate the PSM-derived candidate price in market with an A/B test on a subset of traffic. If the volume is too low for a clean A/B (which is common for B2B SaaS with under 10,000 monthly trial starts), run a phased rollout instead (price A for the first month, price B for the second month, with caveats about the time-series confound). The validation phase is where the survey-versus-revealed gap gets reconciled.

If the PSM-and-validation round produces a confident price, set it and move on. If the decision is too complex for that (multiple tiers, a complex bundle, segment-dependent pricing), commission a CBC. The CBC is expensive (often $58K-$142K all-in including vendor and analyst time) and slow (8-12 weeks), and it should be commissioned only when the simpler methods do not produce a confident answer. Do not commission a CBC as the default; commission it as the escalation.

Schedule the next cycle. Whatever the initial price decision, put a calendar entry for the next pricing review six to twelve months out, with a defined trigger (the next PSM round, or an in-market test, or a competitive event). The single biggest predictor of whether value-based pricing actually delivers margin improvement is whether the team revisits the price on a cadence rather than treating the first decision as final.

The full research stack (CBC plus PSM plus in-market validation, on cadence, with a dedicated pricing function) is the gold standard. It is also expensive, slow, and beyond the resources of most mid-market operators. The streamlined version (PSM plus A/B validation, repeated on cadence) captures most of the value at a fraction of the cost. The most expensive pricing decision is the one made by gut with no measurement at all, and the second most expensive is the one made by a sophisticated research method and then never revisited.

Key Takeaways

Value-based pricing is talked about more than it is operationalized. The gap is not in the research methods (which are mature) but in the operating cadence and the willingness to act on measurement-driven price changes.
The credible research stack is choice-based conjoint (CBC) for trade-off structure, Van Westendorp PSM for absolute price boundaries, and in-market A/B testing for revealed-preference validation. None of the three is sufficient on its own.
Stated willingness to pay systematically overstates revealed WTP by 30-60% across categories. The gap is the "hypothetical bias" documented in the contingent-valuation literature and is informative, not a methodological failure.
The operational workflow is six steps: define the decision, qualitative pre-work, fielded survey, candidate range, in-market validation, operating cadence. Most teams do the first four and skip the last two.
Value-based pricing fails predictably in commodities (no value capture above marginal cost), two-sided markets (cross-side network effects dominate), regulated industries (price set externally), and uncertain-value products (customer cannot estimate value at purchase).
The CBC output (utility function and market simulator) is the input to a pricing decision, not the decision itself. The output is best used to define tier structure and candidate price ranges rather than to identify point-estimate price recommendations.
The McKinsey pricing benchmark research consistently associates dedicated pricing functions (capability) with materially better margin outcomes than pricing as a project (deliverable). The organizational investment is itself a pricing decision.
For teams starting from scratch, the streamlined workflow (PSM plus A/B validation, on cadence) captures most of the value of the full research stack at a fraction of the cost. The full stack (CBC plus PSM plus validation, with a dedicated pricing function) is the gold standard but is justified only at meaningful scale.
The most expensive pricing decision is the one made by gut with no measurement. The second most expensive is the one made by sophisticated measurement and then never revisited. The discipline is the operating loop, not the method.