TL;DR: Value-based pricing is talked about more than it is done, because the gap between the slogan and the operational workflow is wide. Doing it properly requires a research stack (conjoint analysis for trade-off structure, Van Westendorp price-sensitivity meter for absolute price boundaries, in-product willingness-to-pay tests for revealed preference) plus a pricing operating cadence (initial survey, in-market validation, ongoing iteration). Survey-based and revealed-preference methods produce systematically different numbers; the gap is informative, not a failure of either method. Value-based pricing fails predictably in commodity categories, regulated industries, and two-sided markets, and pretending otherwise produces pricing decisions that look rigorous and act like guesses.
A note on named sources and companies. Stephan Liozu, Madhavan Ramanujam, Klaus Schmidt, and McKinsey are cited from their public writing and research. Operating examples (Adobe, Atlassian, Slack, Salesforce) appear as well-known archetype illustrations. Quantitative figures from "advisory work" come from anonymized partner engagements with mid-market SaaS and consumer-subscription operators in the same archetypes, not from the named companies.
The Phrase Versus the Practice
Most teams that say they do value-based pricing do not, in any operationally meaningful sense. They have read Ramanujam and Tacke's Monetizing Innovation (2016), they have heard Stephan Liozu present on value-based pricing transformations at industry conferences, and they have internalized the slogan: price to the value the customer gets, not to the cost of production or to a competitor's number. The slogan is correct. The execution gap is large.
In advisory work, the gap typically looks like this. The product team has agreed that value-based pricing is the right framework. The pricing page lists tiers with feature differentiation. The sales team has discount authority. The pricing decision was made roughly two years ago in a quarter-long pricing project that produced a deck, a tier structure, and a set of "pricing principles." Since then, the pricing has not changed materially, the willingness-to-pay assumptions have not been re-measured, no Van Westendorp study has been run, no choice-based conjoint has been fielded, and the pricing decisions that come up (a new feature tier, a competitive response, a renewal-pricing call) are made by gut. This is not value-based pricing. It is a one-time pricing project that became a stable price list, with a value-based label retrofitted onto it.
The honest pattern is more demanding. Value-based pricing as an operational practice means measuring willingness to pay continuously, segmenting by value-driver, testing price levels in market, and updating the price list on a cadence that reflects the changing competitive and product landscape. It is closer to a pricing function than a pricing project. The companies that do this well (the Adobes, the Atlassians, the few mid-market SaaS operators that have built pricing capacity into the product organization rather than treating it as a one-off marketing exercise) have made it look easy. What is hidden underneath the appearance of ease is several measurement workflows, an operating cadence, and a willingness to act on imperfect data.
The Research Stack: Conjoint, Van Westendorp, and Revealed Preference
The three measurement workflows that, together, constitute a credible value-based pricing operation are choice-based conjoint analysis, the Van Westendorp price-sensitivity meter, and in-market revealed-preference testing. None of the three is sufficient on its own; together they triangulate a pricing decision that is grounded in something other than the pricing committee's intuition.
Choice-based conjoint analysis (CBC) is the workhorse for measuring trade-off structure. Customers are presented with sets of product configurations, each defined by levels on several attributes (feature bundle, support level, price), and asked to choose their preferred configuration from each set. The resulting choice data is analyzed (typically with a multinomial logit or hierarchical Bayes model) to estimate the relative utility weight customers place on each attribute level. The output is a utility function for each respondent that can be used to simulate market share under any candidate price-and-feature configuration, including configurations that were not directly tested.
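The simulation step can be sketched in a few lines. This is a minimal illustration that assumes a single aggregate multinomial logit model; the part-worth values, attribute names, and the `logit_shares` helper are all hypothetical, not output from any real study. A hierarchical Bayes study would apply the same calculation to each respondent's utilities and average the resulting shares.

```python
import math

# Hypothetical part-worth utilities, for illustration only.
PART_WORTHS = {
    "support": {"email": 0.0, "priority": 0.6},
    "price": {99: 0.0, 129: -0.5, 149: -0.9},
}

def utility(config):
    """Total utility of a configuration: the sum of its level part-worths."""
    return sum(PART_WORTHS[attr][level] for attr, level in config.items())

def logit_shares(configs, none_utility=0.0):
    """Multinomial logit shares, with a 'choose none' option in the denominator."""
    weights = [math.exp(utility(c)) for c in configs]
    denom = sum(weights) + math.exp(none_utility)
    return [w / denom for w in weights]

ours = {"support": "priority", "price": 129}
rival = {"support": "email", "price": 99}
shares = logit_shares([ours, rival])
print({"ours": round(shares[0], 3), "rival": round(shares[1], 3)})
```

Changing the price level in `ours` and re-running the function is the basic move behind a simulated demand curve. The "choose none" option matters: without it, every price increase only shifts share to the rival rather than out of the market.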
The Green and Wind paper that popularized conjoint analysis in marketing research, New Way to Measure Consumers' Judgments (Harvard Business Review, 1975), is the most-cited entry point, but the modern best-practice reference is Orme's Getting Started with Conjoint Analysis (now in its fourth edition), and Sawtooth Software's documentation for CBC remains the practical implementation guide most pricing practitioners actually use. The technical details (number of choice sets, design efficiency, number of levels per attribute) matter, but the conceptual contribution of CBC is that it forces respondents to trade off rather than to rate, which produces preference data that is closer to actual purchase behavior than rating-scale methods.
The Van Westendorp price-sensitivity meter (PSM) is the workhorse for measuring absolute price boundaries. Respondents are asked four questions about a defined product:
- At what price would this product be so expensive you would not consider buying it (Too Expensive)?
- At what price would this product be expensive but you would still consider it (Expensive)?
- At what price would this product be a bargain, a great buy for the money (Cheap)?
- At what price would this product be so cheap you would question its quality (Too Cheap)?
The four resulting price distributions cross at characteristic points. The intersection of "Too Cheap" and "Expensive" defines the Point of Marginal Cheapness (PMC). The intersection of "Too Expensive" and "Cheap" defines the Point of Marginal Expensiveness (PME). The intersection of "Cheap" and "Expensive" defines the Indifference Price Point (IPP). The intersection of "Too Cheap" and "Too Expensive" defines the Optimal Price Point (OPP). The acceptable price range is the band between PMC and PME, with the IPP and OPP as internal reference points within it.
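A rough sketch of how the four points fall out of raw responses, assuming the intersection definitions given above. The ten responses, the $5 price grid, and the function names are hypothetical; production analyses usually interpolate between grid points rather than taking the first grid price at which the curves cross.

```python
# Hypothetical responses: (too_cheap, cheap, expensive, too_expensive) per respondent.
RESPONSES = [
    (15, 30, 50, 70), (20, 40, 60, 80), (30, 45, 65, 90), (30, 50, 70, 100),
    (40, 60, 90, 120), (50, 70, 100, 130), (50, 75, 95, 125), (60, 80, 110, 150),
    (70, 90, 120, 160), (90, 110, 130, 170),
]

def share_at_or_above(values, price):
    """Falling curve: share of respondents whose threshold is at or above price."""
    return sum(v >= price for v in values) / len(values)

def share_at_or_below(values, price):
    """Rising curve: share of respondents whose threshold is at or below price."""
    return sum(v <= price for v in values) / len(values)

def first_crossing(grid, rising, falling):
    """First grid price at which the rising curve meets or passes the falling one."""
    for price in grid:
        if rising(price) >= falling(price):
            return price
    return None

def psm_points(responses, grid):
    too_cheap, cheap, expensive, too_expensive = zip(*responses)
    tc = lambda p: share_at_or_above(too_cheap, p)
    ch = lambda p: share_at_or_above(cheap, p)
    ex = lambda p: share_at_or_below(expensive, p)
    te = lambda p: share_at_or_below(too_expensive, p)
    return {
        "PMC": first_crossing(grid, ex, tc),  # Too Cheap x Expensive
        "OPP": first_crossing(grid, te, tc),  # Too Cheap x Too Expensive
        "IPP": first_crossing(grid, ex, ch),  # Cheap x Expensive
        "PME": first_crossing(grid, te, ch),  # Cheap x Too Expensive
    }

points = psm_points(RESPONSES, range(15, 175, 5))
print(points)
```

On this toy data the acceptable range runs from $65 (PMC) to $85 (PME), with both the OPP and the IPP at $75. With only ten respondents the curves are coarse step functions, which is one reason real studies field a much larger sample.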
The PSM is technically simple, which is both its strength and its weakness. It is fast to field, easy to analyze, and produces an interpretable price-range output. It is also vulnerable to several known biases: the price thresholds respondents report are sensitive to the order of the questions, the framing of the product description, the respondent's familiarity with the category, and the gap between stated and actual purchase intent. Klaus Schmidt's work on willingness-to-pay measurement, including his contributions to the methodology literature on PSM, has documented these biases in detail, and the practical guidance is to treat PSM results as a starting range rather than a precise number, and to validate against either CBC or revealed-preference data.
Revealed-preference testing (in-market A/B testing of price levels) is the workhorse for validating the survey-based findings against actual purchase behavior. A subset of visitors sees a candidate price (the PSM's optimal price point, or a CBC-simulated optimum), another subset sees the current price, and the resulting conversion and revenue differences are measured. This is the gold-standard methodology because it captures actual purchase behavior rather than stated intent, but it has constraints (sample-size requirements, the operational complexity of running different prices simultaneously, the legal and brand-perception risk of being seen to price-discriminate) that make it impossible in some categories.
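As an illustration of the readout from such a test, the sketch below compares conversion with a pooled two-proportion z-test and reports revenue per visitor for each arm. The traffic counts and prices are invented, and the z-test is one common choice of significance test, not necessarily the one any particular team uses.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def revenue_per_visitor(price, conversions, visitors):
    return price * conversions / visitors

# Hypothetical arms: control at $99, test at $129, 5,000 visitors each.
control = {"price": 99, "conversions": 250, "visitors": 5000}
variant = {"price": 129, "conversions": 215, "visitors": 5000}

z, p_value = two_proportion_z(control["conversions"], control["visitors"],
                              variant["conversions"], variant["visitors"])
rpv_control = revenue_per_visitor(**control)
rpv_variant = revenue_per_visitor(**variant)
print(f"z={z:.2f}, p={p_value:.3f}, RPV {rpv_control:.2f} -> {rpv_variant:.2f}")
```

In this made-up case the higher price converts worse but earns more per visitor, and the conversion difference does not clear a 5% significance threshold at 5,000 visitors per arm. That combination is typical of underpowered price tests and is why the sample-size constraint tends to bind first.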
Value-Based Pricing Research Methods, Comparative Properties
| Method | Captures | Output | Cost (rough) | Time to Run | Common Pitfalls |
|---|---|---|---|---|---|
| Choice-based conjoint (CBC) | Trade-off structure, feature-price utility | Per-respondent utility weights, market simulator | $50K to $150K for B2B, $20K to $80K for B2C | 8 to 12 weeks | Attribute selection bias, sample-frame issues |
| Van Westendorp PSM | Acceptable price range, optimal price point | PMC, PME, IPP, OPP price levels | $15K to $50K | 4 to 8 weeks | Order effects, product-description framing |
| Gabor-Granger | Demand curve at discrete price points | Price-volume relationship | $10K to $40K | 4 to 6 weeks | Single-price-question bias, no trade-off captured |
| In-market A/B price test | Revealed preference at tested prices | Conversion lift, revenue per visitor | Operational complexity | 8 to 16 weeks for adequate power | Sample size, ethics/brand risk |
| Sales-conversation pricing | Anchoring and negotiation pattern | Discount distribution, win rate by price | Embedded in sales cost | Continuous | Selection bias toward closable deals |
The methods in the table have systematic biases that point in different directions. Survey methods (CBC, PSM, Gabor-Granger) tend to overstate willingness to pay relative to actual behavior, because the survey context removes the budget pressure and the alternative-considerations that constrain actual purchase decisions. Revealed-preference tests capture the right behavior but are typically run on a narrow set of price levels and do not generalize easily to unexplored regions of the price space. Sales-conversation data captures negotiation dynamics but is biased by the selection of which deals get to the negotiation stage. Used together, the methods triangulate; used individually, each is misleading.
The Survey Versus Revealed-Preference Gap
The systematic gap between stated and revealed willingness to pay is one of the most-documented phenomena in pricing research. It has been measured across categories (consumer goods, software, services, regulated products), across methods (PSM vs A/B test, CBC vs A/B test, hypothetical purchase intent vs actual purchase), and across decades (the original work goes back to the 1970s contingent-valuation literature in environmental economics).
The size of the gap is large enough to matter. In several published meta-analyses, the average stated willingness to pay overstates revealed willingness to pay by 30-60% across categories. The gap is larger for products with higher hedonic content (luxury goods, entertainment) and smaller for utilitarian goods (commodities, infrastructure). The gap is also larger when the survey context is divorced from purchase-decision context and smaller when the survey context closely mimics the purchase environment.
One of those meta-analyses, by Murphy et al., covers environmental economics studies, but the underlying mechanism (the hypothetical bias in contingent valuation) generalizes to pricing research because the cognitive operation is the same: respondents are being asked to value something they are not actually purchasing in the moment of the survey, which removes a class of constraints that operate in actual purchase decisions.
The implications for the pricing operating cadence are concrete. Survey-based numbers from PSM and CBC should be treated as upper bounds on willingness to pay, not as point estimates. The right way to use them is to define a candidate price range from the survey work and then test the range in market, with the in-market result treated as the operational price input. The Adobe and Atlassian pricing organizations are known to operate this way, with the conjoint and PSM work setting the candidate-range and the in-market tests confirming the operating price.
The composite chart is illustrative, with revealed values indexed to stated values of 100 across categories. The pattern that matters is the differential by category: charity and hedonic categories show the largest gaps (revealed WTP is 35-58% of stated), utilitarian and commodity categories show the smallest (revealed is 80-92% of stated). For a value-based pricing project in a hedonic-leaning category (premium consumer subscription, lifestyle brands), discounting stated WTP by 40-50% before setting an operating price is consistent with the published meta-analytic patterns.
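The overstatement figures and the discount percentages in this section are two views of the same ratio, and they are easy to conflate. A small sketch of the conversion, assuming "overstates by x" means stated WTP equals revealed WTP times (1 + x); the function names are illustrative.

```python
def deflator_from_overstatement(overstatement):
    """If stated WTP = revealed WTP * (1 + overstatement), return the share to
    remove from stated WTP to recover revealed WTP."""
    return overstatement / (1.0 + overstatement)

def deflate(stated_price, overstatement):
    """Revealed-WTP estimate implied by a stated price and an overstatement rate."""
    return stated_price * (1.0 - deflator_from_overstatement(overstatement))

for over in (0.30, 0.60):
    d = deflator_from_overstatement(over)
    print(f"overstated by {over:.0%} -> deflate stated WTP by {d:.1%}")
```

Under that reading, a 30-60% overstatement corresponds to removing roughly 23-38% from the stated figure, not 30-60%. Which convention a given source uses should be checked before a deflator is applied.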
A Concrete Operating Workflow
The abstract description of "use CBC plus PSM plus in-market validation" is not actionable. The concrete operating workflow that I have seen work in mid-market SaaS and consumer-subscription contexts looks roughly as follows.
The pricing study repeats on a roughly quarterly cadence for early-stage products and on an annual cadence for mature ones. The deliverable is not "the new price" but "the candidate price range with confidence intervals and assumptions."
1. Define the pricing decision and the product configuration. This sounds trivial. It is not. Most failed pricing projects fail at this step because the team has not agreed on what is being priced (the base product, the upgrade tier, the bundle, the annual versus monthly plan), who the target buyer is, or what the comparison set is. The clearest deliverable from this step is a one-page brief that names the product configuration, the buyer persona, the competitor set, and the decision the pricing study is intended to inform.
2. Run the qualitative pre-work. Before fielding any survey instrument, the team interviews 8-15 prospective or current customers to understand the value drivers, the competitive comparison set the customer actually uses (which is usually different from the one the product team assumed), and the willingness-to-pay anchors that customers reference. The qualitative phase informs the attribute selection for the CBC and the product-description framing for the PSM.
3. Field the survey work. A CBC with 6-10 attributes (price always one of them, the others being feature bundles, support level, integration depth, contract length, and 1-2 product-specific dimensions) and a PSM on the chosen product configuration. The CBC requires 200-400 respondents for B2B and 500-1000 for B2C to produce stable utility estimates. The PSM can run with 100-300 respondents depending on the desired precision.
4. Analyze and produce the candidate range. The CBC simulator produces market-share projections under different price-and-feature configurations. The PSM produces the PMC-to-PME range with the IPP and OPP markers. The two methods are reconciled into a candidate price range (typically wider than either method alone suggests), with assumptions and known biases documented explicitly.
5. Validate in market. A subset of traffic sees a price within the candidate range (often the PSM's OPP or the CBC's revenue-maximizing point, deflated by 20-30% for the hypothetical bias). The validation runs for long enough to reach statistical significance on conversion and revenue per visitor (typically 8-16 weeks). The in-market result is treated as the operating price.
6. Operate and revisit. The price is set, the operating cadence (quarterly review for early-stage, annual for mature) is scheduled, and the willingness-to-pay measurement is rerun on cadence to detect drift. Most value-based pricing operations fail at this step because the team treats the initial study as the end of the work rather than as the establishment of the operating capability.
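The hand-off from the candidate range to the in-market test can be sketched as follows. This is one possible reconciliation rule (take the union of the two method ranges, then deflate a survey optimum before testing it), stated as an assumption; the bands, the optimum, and the function names are hypothetical.

```python
def candidate_range(psm_range, cbc_range):
    """Union of the two method ranges, so the result is at least as wide as either."""
    return (min(psm_range[0], cbc_range[0]), max(psm_range[1], cbc_range[1]))

def in_market_price(survey_optimum, deflator):
    """Deflate a survey-derived optimum for hypothetical bias before testing it."""
    return survey_optimum * (1.0 - deflator)

psm_band = (49, 99)    # hypothetical PMC-to-PME band
cbc_band = (69, 109)   # hypothetical near-optimal band from the CBC simulator
survey_opp = 89        # hypothetical PSM optimal price point

low, high = candidate_range(psm_band, cbc_band)
prices_to_test = [round(in_market_price(survey_opp, d), 2) for d in (0.20, 0.30)]
print((low, high), prices_to_test)
```

Writing the rule down, even one this simple, forces the assumptions (which band, which deflator) into the open, which is the point of documenting them explicitly alongside the range.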
The Value-Based Pricing Operating Loop
The workflow described above is what differentiates pricing as a function from pricing as a project. Most teams do steps 1-4 once, ship the result, and consider the work done. The teams that do value-based pricing operationally do all six steps, on cadence, and treat the in-market validation as the only data that matters for the actual operating decision.
Where Value-Based Pricing Breaks Down
The honest part of the value-based pricing literature is the acknowledgement that the framework does not apply universally. Several categories are structurally hostile to value-based pricing, and forcing it produces decisions that look rigorous and behave badly.
The first category is commodities. A pure commodity (raw materials, fungible inputs, transactional infrastructure where switching costs are minimal) has a price set by the marginal cost of the marginal producer (the highest-cost supplier still needed to meet demand), not by the value the customer extracts. Trying to value-base price a commodity means either pricing above the market-clearing level and losing the sale, or pricing at the market level and discovering that the value-based framework was a slogan for what was actually a market-following strategy. The honest answer for commodities is that the framework does not apply, and the pricing decision is about positioning (where on the cost curve is your operation) rather than about customer value capture.
The second category is two-sided markets and marketplaces. In a marketplace (Uber, Airbnb, eBay, payment networks), the price on one side is structurally tied to the volume and price on the other side, and unilateral value-based pricing on one side ignores the cross-side network effects that determine the optimal pricing structure. Marketplace pricing typically uses asymmetric subsidy structures (subsidize the side with the higher price elasticity to grow liquidity, capture value on the other side) and dynamic balancing rather than value-based capture. The pricing literature on two-sided markets (Rochet and Tirole's 2003 paper, Evans and Schmalensee's work) addresses this explicitly. Trying to value-base price one side of a marketplace without modeling the cross-side effects produces pricing that destroys network liquidity.
The third category is regulated industries. Healthcare pricing (in many countries), utility pricing, pharmaceutical pricing in most developed markets, financial-services pricing under various regulatory regimes, all have pricing constraints that are imposed externally and that override what value-based pricing would set. The right framework for these categories is regulatory-compliance pricing within the constrained range that the regulator allows, with value-based methods used (if at all) to identify the price within the allowed range rather than the underlying willingness to pay.
The fourth category is products where the value to the customer is highly uncertain and only revealed after extended use. Enterprise software products with deployment cycles measured in years, professional services where the deliverable is co-created with the client, products with strong learning-curve effects, all of these have the property that the customer cannot credibly estimate the value at the time of purchase, which means survey-based WTP measurement captures a guess rather than a value estimate. The pricing approach for these categories is often outcome-based (price as a share of measured outcomes rather than as a flat fee) or staged (low entry price with usage-based or expansion-based pricing as the value is revealed), which has the value-based spirit but uses different operational mechanics.
When Value-Based Pricing Applies and When It Does Not
| Category Type | Value-Based Pricing Fit | Operational Pattern | Better Alternative |
|---|---|---|---|
| Differentiated B2C subscription | Strong fit | CBC + PSM + in-market test | None |
| Mid-market B2B SaaS | Strong fit | Full research stack + cadence | None |
| Enterprise B2B with deployment cycles | Moderate fit | Value-engineering models + outcome-based contracts | Hybrid value + outcome |
| Commodity goods/services | Poor fit | Market-following pricing | Cost-curve positioning |
| Two-sided marketplaces | Poor fit | Asymmetric subsidy + dynamic balancing | Network-effect pricing |
| Regulated industries (healthcare, utilities) | Constrained fit | Within regulatory range only | Compliance + within-range optimization |
| Heavily competitive saturated category | Limited fit | Competitive benchmarking + differentiation premium | Competition-anchored with premium math |
| Bundled offerings | Moderate fit, complex | CBC on bundles, choice-modeling | Bundle-architecture optimization |
The Practical Math: Conjoint Output to Pricing Decision
The bridge from a conjoint analysis output to an actual pricing decision is the part of the workflow that is most often skipped. The output of a CBC is a per-respondent utility function. The pricing decision needs to translate that into a recommended price. The translation step matters, and it is not automatic.
The standard approach uses the CBC's market simulator to project market share under each candidate price level, holding other attribute levels fixed. The simulation generates a demand curve (estimated market share as a function of price), and from the demand curve the analyst computes revenue (price times projected share times market size) and contribution (price minus unit variable cost, times projected share, times market size). The recommended price is typically the contribution-maximizing point on the curve, with adjustments for strategic considerations (penetration pricing, competitive response, brand positioning).
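A minimal sketch of that calculation, taking contribution as price minus unit variable cost, times share, times market size. The demand curve, market size, and unit cost below are invented for illustration and are not simulator output from any real study.

```python
# Hypothetical simulator output: price -> projected market share.
DEMAND = {79: 0.30, 99: 0.26, 129: 0.21, 149: 0.17, 179: 0.12, 199: 0.09}
MARKET_SIZE = 10_000        # assumed addressable buyers per period
UNIT_VARIABLE_COST = 30.0   # assumed variable cost per customer per period

def revenue(price, share, market_size=MARKET_SIZE):
    """Revenue = price x projected share x market size."""
    return price * share * market_size

def contribution(price, share, unit_cost=UNIT_VARIABLE_COST, market_size=MARKET_SIZE):
    """Contribution = (price - unit variable cost) x projected share x market size."""
    return (price - unit_cost) * share * market_size

# Rank candidate prices from highest to lowest contribution.
ranked = sorted(DEMAND, key=lambda p: contribution(p, DEMAND[p]), reverse=True)
for p in sorted(DEMAND):
    print(p, round(revenue(p, DEMAND[p])), round(contribution(p, DEMAND[p])))
print("contribution-maximizing price:", ranked[0], "runner-up:", ranked[1])
```

With these assumed inputs the contribution peak is at $129 with $149 as the runner-up, and the two are within about 3% of each other.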
The math is straightforward but has several places where it goes wrong in practice. The simulator's projections of share depend on the assumption that the choice set in the simulator matches the choice set the actual buyer faces in the market, which is rarely fully true. The contribution maximum on the simulated curve is often a sharp peak that does not survive sensitivity analysis (small changes in assumed costs or competitor prices move the peak by 10-20%). The output is also typically presented as a point estimate when it should be presented as a range with explicit confidence intervals.
The chart illustrates the typical shape of a CBC-derived demand and contribution curve. The contribution peak in this illustrative example is at the $129 price point, with $149 a close second. The naive recommendation from a CBC simulator is to price at $129. The operationally-aware recommendation is to define a range ($129 to $149) and validate in market, because the simulator's confidence in distinguishing those two points is typically lower than the simulator's point output suggests, and the in-market behavior may differ from the simulated behavior by more than the gap between the two points.
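The fragility of a single peak can be checked directly. The sketch below uses a hypothetical demand curve and two assumed unit costs; it reports the contribution-maximizing price under each cost assumption and the set of prices that land within 3% of the maximum. The 3% tolerance is an arbitrary illustrative choice.

```python
# Hypothetical simulator output: price -> projected market share.
DEMAND = {79: 0.30, 99: 0.26, 129: 0.21, 149: 0.17, 179: 0.12, 199: 0.09}

def contribution_per_buyer(price, unit_cost):
    """Contribution per potential buyer in the market: unit margin times share."""
    return (price - unit_cost) * DEMAND[price]

def peak_price(unit_cost):
    """Price with the highest contribution under a given unit-cost assumption."""
    return max(DEMAND, key=lambda p: contribution_per_buyer(p, unit_cost))

def near_peak_prices(unit_cost, tolerance=0.03):
    """Prices whose contribution is within `tolerance` of the maximum."""
    best = contribution_per_buyer(peak_price(unit_cost), unit_cost)
    return sorted(p for p in DEMAND
                  if contribution_per_buyer(p, unit_cost) >= (1 - tolerance) * best)

for cost in (30.0, 45.0):
    print(cost, peak_price(cost), near_peak_prices(cost))
```

On these invented numbers, moving the assumed unit cost from $30 to $45 shifts the peak from $129 to $149, while the near-peak set stays at $129 to $149 in both cases. Reporting the set rather than the single peak is what turns a point estimate into a testable range.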
A second pricing-decision-from-CBC consideration is the structure of the price ladder. A single price point answer ignores the segmentation that the CBC utility data actually supports. Most pricing decisions are really about how to design a tiered price ladder (three to five tiers spanning different feature bundles and price points) rather than a single price. The CBC data can support a tier-design exercise that identifies which feature-and-price combinations create the most efficient revenue capture across the customer base, and the tier design is often more consequential than the within-tier price level.
The conjoint output is the input to a pricing decision, not the decision itself. The teams that get value out of conjoint analysis are the ones that treat the simulator output as the starting point of a discussion about ladder design, segment targeting, and strategic positioning, not the ones that treat the contribution peak as the answer.
The Pricing Function Versus the Pricing Project
The deepest distinction in the value-based pricing literature, well-articulated in Stephan Liozu's work on pricing organization design, is the distinction between pricing as a project (a one-time engagement that produces a price decision) and pricing as a function (an ongoing organizational capability that produces price decisions continuously). The slogan version of value-based pricing is usually a pricing project. The operational version is a pricing function.
The differences are organizational. A pricing function has dedicated headcount (a pricing manager or pricing team, depending on company size), a defined cadence (quarterly or annual price reviews, with measurement cadence underneath), an instrumentation layer (the systems that track WTP, win rates, discount distributions, competitive intelligence), and a decision-making authority (who can change a price, and on what evidence). A pricing project produces a deck and a price list and goes away.
The McKinsey pricing benchmark research, which has been published in various forms over the last decade (2010 McKinsey & Company report on Pricing, recent updates in the McKinsey Quarterly on pricing as a margin-improvement lever), consistently finds that companies with dedicated pricing functions deliver materially higher revenue growth and margin expansion than companies that treat pricing as a project. The published numbers vary, but the patterns are consistent across industries: dedicated pricing functions correlate with 1 to 5 percentage points of margin improvement over multi-year periods, with the wider distributions in industries where pricing complexity is higher.
The organizational design choice (when to invest in a pricing function rather than running pricing as a project) is itself a pricing decision in disguise. The investment in a pricing function pays off in proportion to the company's pricing complexity (more SKUs, more buyer segments, more competitive dynamics, more frequent product changes) and the company's revenue base (the absolute dollars at stake from a one-percent margin improvement). For a $5M ARR company with a single-tier product, an annual pricing project is probably sufficient. For a $100M ARR company with three product lines and a global customer base, one or two FTEs is almost certainly too thin a pricing function, and a small team is probably justified.
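The revenue-base half of that argument is simple arithmetic. In the sketch below, the $400K team cost is a hypothetical placeholder for two fully loaded pricing FTEs, and a single point of margin is used as the illustrative gain.

```python
def dollars_per_margin_point(annual_revenue, points=1.0):
    """Annual dollars represented by `points` percentage points of margin."""
    return annual_revenue * points / 100.0

def payback_multiple(annual_revenue, points, team_cost):
    """Margin dollars gained per dollar spent on the pricing team."""
    return dollars_per_margin_point(annual_revenue, points) / team_cost

TEAM_COST = 400_000  # hypothetical fully loaded annual cost of two pricing FTEs

for arr in (5_000_000, 100_000_000):
    gain = dollars_per_margin_point(arr)
    multiple = payback_multiple(arr, 1.0, TEAM_COST)
    print(f"ARR ${arr:,}: 1 point = ${gain:,.0f}, payback {multiple:.3f}x")
```

Under these assumptions, one point of margin is worth $50,000 a year at $5M ARR and $1,000,000 a year at $100M ARR, which is why the same team is unaffordable in the first case and easy to justify in the second.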
What To Do If You Are Starting Now
The practical recommendation for a team starting from scratch (no current pricing-research capability, no dedicated pricing headcount, a pricing decision that needs to be made in the next two quarters) is more pragmatic than the full operating workflow above.
Run a streamlined PSM first. A PSM with 150-300 respondents from a representative customer panel can be fielded in 4-6 weeks for $15K-$25K and will give a candidate price range that is materially better than the gut-feel alternative. Treat the output as a range, not a point estimate, and apply a 20-30% deflator to the OPP for the hypothetical bias.
Validate the PSM-derived candidate price in market with an A/B test on a subset of traffic. If the volume is too low for a clean A/B (which is common for B2B SaaS with under 10,000 monthly trial starts), run a phased rollout instead (price A for the first month, price B for the second month, with caveats about the time-series confound). The validation phase is where the survey-versus-revealed gap gets reconciled.
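Whether the volume is "too low" can be estimated before the test starts. The sketch below uses the standard normal-approximation sample-size formula for comparing two proportions; the baseline and expected conversion rates and the weekly traffic figure are hypothetical inputs.

```python
from math import ceil
from statistics import NormalDist

def visitors_per_arm(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate visitors needed per arm to detect p_control vs p_test
    with a two-sided test at the given significance level and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_test) ** 2)

# Hypothetical scenario: 5.0% conversion at the current price,
# 4.3% expected at the higher candidate price.
n = visitors_per_arm(0.050, 0.043)
weekly_visitors_per_arm = 1_250  # hypothetical traffic allocated to each arm
weeks = ceil(n / weekly_visitors_per_arm)
print(n, "visitors per arm, about", weeks, "weeks")
```

With these inputs the test needs on the order of 14,000 visitors per arm, or about 12 weeks at the assumed traffic. If the same calculation returns a duration measured in years, the phased rollout is the realistic option.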
If the PSM and validation together produce a confident price, set it and move on. If the decision is too complex for that (multiple tiers, a complex bundle, segment-dependent pricing), commission a CBC. The CBC is expensive (often $50K-$150K including vendor and analyst time) and slow (8-12 weeks), and it should be commissioned only when the simpler methods do not produce a confident answer. Do not commission a CBC as the default; commission it as the escalation.
Schedule the next cycle. Whatever the initial price decision, put a calendar entry for the next pricing review six to twelve months out, with a defined trigger (the next PSM round, or an in-market test, or a competitive event). The single biggest predictor of whether value-based pricing actually delivers margin improvement is whether the team revisits the price on a cadence rather than treating the first decision as final.
The full research stack (CBC plus PSM plus in-market validation, on cadence, with a dedicated pricing function) is the gold standard. It is also expensive, slow, and beyond the resources of most mid-market operators. The streamlined version (PSM plus A/B validation, repeated on cadence) captures most of the value at a fraction of the cost. The most expensive pricing decision is the one made by gut with no measurement at all, and the second most expensive is the one made by a sophisticated research method and then never revisited.
Key Takeaways
- Value-based pricing is talked about more than it is operationalized. The gap is not in the research methods (which are mature) but in the operating cadence and the willingness to act on measurement-driven price changes.
- The credible research stack is choice-based conjoint (CBC) for trade-off structure, Van Westendorp PSM for absolute price boundaries, and in-market A/B testing for revealed-preference validation. None of the three is sufficient on its own.
- Stated willingness to pay systematically overstates revealed WTP by 30-60% across categories. The gap is the "hypothetical bias" documented in the contingent-valuation literature and is informative, not a methodological failure.
- The operational workflow is six steps: define the decision, qualitative pre-work, fielded survey, candidate range, in-market validation, operating cadence. Most teams do the first four and skip the last two.
- Value-based pricing fails predictably in commodities (no value capture above marginal cost), two-sided markets (cross-side network effects dominate), regulated industries (price set externally), and uncertain-value products (customer cannot estimate value at purchase).
- The CBC output (utility function and market simulator) is the input to a pricing decision, not the decision itself. The output is best used to define tier structure and candidate price ranges rather than to identify point-estimate price recommendations.
- The McKinsey pricing benchmark research consistently associates dedicated pricing functions (capability) with materially better margin outcomes than pricing as a project (deliverable). The organizational investment is itself a pricing decision.
- For teams starting from scratch, the streamlined workflow (PSM plus A/B validation, on cadence) captures most of the value of the full research stack at a fraction of the cost. The full stack (CBC plus PSM plus validation, with a dedicated pricing function) is the gold standard but is justified only at meaningful scale.
- The most expensive pricing decision is the one made by gut with no measurement. The second most expensive is the one made by sophisticated measurement and then never revisited. The discipline is the operating loop, not the method.