The data
Original data, citable and open.
17 datasets produced or analyzed for the essays. Each is available as structured JSON, fully described with methodology, sample size, and license.
Loss Aversion Ratios by Stake Level
JSON →Empirical loss aversion coefficient λ observed in marketplace pricing experiments, decomposed by transaction stake level. Shows that λ is not fixed at the textbook value of 2.25 but varies systematically with stake magnitude and with the user's investment in the platform.
- Sample size: ~14.1M total observations across buckets
- Collected: 2024-06/2025-05
- License: CC-BY-4.0 for cited figures
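One way to read λ is through the prospect-theory value function, in which a loss is weighted λ times as heavily as a gain of the same size. The following is a minimal sketch, assuming the Tversky-Kahneman (1992) power form with curvature 0.88; the function name and the 100-unit stake are illustrative and are not taken from the dataset.

```python
def prospect_value(x: float, lam: float = 2.25, alpha: float = 0.88) -> float:
    """Subjective value of a gain (x >= 0) or a loss (x < 0).

    lam is the loss aversion coefficient; alpha is the curvature.
    Both defaults are the textbook values, not estimates from this dataset.
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)


gain = prospect_value(100.0)
loss = prospect_value(-100.0)
# For a gain and a loss of equal magnitude, the ratio recovers lam.
ratio = -loss / gain
```

A stake-dependent λ, as the dataset describes, would correspond to passing a different `lam` for each stake bucket instead of a single constant.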
Churn Windows by Discount-Type Subscriber Cohort
JSON →Observed cancellation concentration around billing dates across subscription cohorts, decomposed by estimated beta (present-bias) parameter. Shows that 52–68% of annual churn events in consumer subscriptions occur within 7 days of a billing date, which is consistent with the billing-day regret predicted by the beta-delta (quasi-hyperbolic) discounting model.
- Sample size: ~2.4M subscriber-months
- Collected: 2022-01/2024-12
- License: CC-BY-4.0 for aggregate figures
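For readers unfamiliar with the beta-delta model, the sketch below shows its standard discount weights: a payoff today has weight 1, and a payoff t periods ahead has weight beta × delta^t. The parameter values are hypothetical and chosen only to make the present-bias kink visible; they are not estimates from the dataset.

```python
def discount_weight(t: int, beta: float, delta: float) -> float:
    """Quasi-hyperbolic (beta-delta) weight for a payoff t periods ahead."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return 1.0 if t == 0 else beta * delta ** t


# Hypothetical parameters: beta < 1 encodes present bias.
beta, delta = 0.7, 0.99
weights = [discount_weight(t, beta, delta) for t in range(4)]

# The drop from "now" to "next period" is much steeper than any later drop.
first_step = weights[1] / weights[0]   # beta * delta
later_step = weights[2] / weights[1]   # delta only
```

The steep first step is what makes a charge that lands today feel disproportionately costly compared with the same charge scheduled for later.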
Ladder-Up vs Ladder-Down SaaS Pricing Conversion
JSON →Direct A/B test of ladder-up (start on free or starter, prompt to upgrade) vs ladder-down (start on premium trial, prompt to downgrade) pricing paths across 9 SaaS products. Ladder-down converts 31–58% more paying users, driven by endowment-effect-induced resistance to losing premium features.
- Sample size: ~182K signups across 9 products
- Collected: 2024-Q2/Q3
- License: CC-BY-4.0 for aggregate figures
Used in · endowment-effect-saas-pricing
Platform Entry Threshold by Complementor Category Share
JSON →Observed relationship between a complementor category's share of platform transaction volume and the probability that the platform enters the category within 24 months. Entry becomes likely once a category exceeds ~8% of platform volume.
- Sample size: 412 categories across 4 platforms
- Collected: 2018-2024
- License: CC-BY-4.0 for cited figures
Used in · platform-cannibalization-dynamics
Vertical SaaS Market Concentration and Multi-Homing
JSON →Market concentration (top-3 share) and supplier-side multi-homing rates across 40 vertical SaaS markets. Winner-take-most is the exception, not the rule: only 22% of vertical markets exhibit top-3 share above 70%, and those markets also show low multi-homing.
- Sample size: 40 vertical markets
- Collected: 2024
- License: CC-BY-4.0 for synthesis
Used in · winner-take-most-multi-homing-vertical-saas · two-sided-network-effects-dead
MTA Reported ROAS vs Experimental (Incrementality) ROAS
JSON →Side-by-side comparison of ROAS reported by multi-touch attribution systems versus ROAS estimated via randomized geo-lift experiments for the same channels and periods. MTA systematically overstates ROAS by 2.4–6.5x, with the gap widest for retargeting and display.
- Sample size: 6 published studies, 22 channel-study combinations
- Collected: 2015-2024
- License: CC-BY-4.0 for synthesis
Used in · multi-touch-attribution-causal-inference-dag · unified-measurement-architecture-mmm-mta-experimentation
Bayesian MMM — Channel Saturation and Adstock Parameters
JSON →Posterior estimates of adstock half-life and saturation parameters (Hill function) for eight paid-media channels from a privacy-first Bayesian marketing mix model. Reveals that TV has the longest decay (12-week half-life) while search has the shortest (under 1 week).
- Sample size: 156 weeks × 180 DMAs × 8 channels
- Collected: 2022-01/2024-12
- License: CC-BY-4.0 for cited figures
Used in · marketing-mix-modeling-privacy-first-era · unified-measurement-architecture-mmm-mta-experimentation
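The two parameter families in this dataset can be illustrated with a small sketch. It assumes geometric adstock (a common choice, though the dataset's exact carryover form is not stated here) and the usual Hill curve; all function and parameter names are illustrative.

```python
def decay_from_half_life(half_life_weeks: float) -> float:
    """Weekly retention rate r such that r ** half_life_weeks == 0.5."""
    return 0.5 ** (1.0 / half_life_weeks)


def geometric_adstock(spend: list[float], half_life_weeks: float) -> list[float]:
    """Carry a decaying share of past spend forward into each week."""
    r = decay_from_half_life(half_life_weeks)
    carried, out = 0.0, []
    for s in spend:
        carried = s + r * carried
        out.append(carried)
    return out


def hill(x: float, half_saturation: float, slope: float) -> float:
    """Hill saturation: response share in [0, 1), equal to 0.5 at half_saturation."""
    if x <= 0:
        return 0.0
    return x ** slope / (x ** slope + half_saturation ** slope)


# A single 100-unit burst followed by silence, with a 12-week half-life.
pulse = [100.0] + [0.0] * 12
carryover = geometric_adstock(pulse, half_life_weeks=12.0)
```

With a 12-week half-life, half of the initial burst is still active in week 12, whereas a channel with a half-life under one week would have lost more than half of its effect by week 1.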
CausalImpact Lift from a B2B Content Program
JSON →Bayesian structural time series (CausalImpact) estimate of the causal lift on organic traffic from launching a dedicated 36-article B2B content program. Non-branded organic captures only 38% of total SEO impact; the remaining 62% flows through branded search and direct traffic.
- Sample size: 104 weeks, 8 control variables, 6 outcomes
- Collected: 2023-Q1/2025-Q1
- License: CC-BY-4.0 for cited figures
Used in · causal-impact-seo-branded-search · compounding-advantage-content-moats-seo
Cohort LTV/CAC and Payback by Acquisition Channel
JSON →Acquisition-cohort unit economics for a consumer SaaS business, decomposed by channel. Exposes the aggregation fallacy: the rolled-up 3.1x LTV/CAC hides a channel portfolio with individual ratios ranging from 0.8x (brand-misaligned display) to 8.4x (organic referral), with very different payback profiles.
- Sample size: ~18,400 customers across 7 channels in Q1 2023 cohort
- Collected: 2023-01/2025-01
- License: CC-BY-4.0 for cited figures
Used in · cohort-based-unit-economics · clv-control-variable-bid-strategies
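The aggregation fallacy is easy to reproduce with arithmetic. In the sketch below, the customer counts and dollar values are made up for illustration; only the two extreme ratios (0.8x and 8.4x) echo the description above, and the blended result is not meant to match the dataset's 3.1x.

```python
# Hypothetical channels: name -> (customers, CAC per customer, LTV per customer)
channels = {
    "organic_referral": (1000, 50.0, 420.0),
    "paid_search": (2000, 120.0, 360.0),
    "display": (3000, 200.0, 160.0),
}

# Per-channel LTV/CAC ratios.
per_channel = {name: ltv / cac for name, (_, cac, ltv) in channels.items()}

# The blended ratio weights each channel by its customer volume.
total_ltv = sum(n * ltv for n, _, ltv in channels.values())
total_cac = sum(n * cac for n, cac, _ in channels.values())
blended = total_ltv / total_cac
```

The blended figure always lies between the best and worst channel ratios, so a healthy-looking aggregate can coexist with a channel that loses money on every customer.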
Test Duration Reduction from Bayesian vs Frequentist A/B Testing
JSON →Head-to-head comparison of decision latency between Bayesian posterior-probability testing and classical frequentist fixed-sample testing across 48 production experiments. Median time-to-decision dropped 36% under Bayesian methodology with no increase in downstream product regret.
- Sample size: 48 experiments, ~29M visitor-sessions total
- Collected: 2024-Q1/2025-Q1
- License: CC-BY-4.0 for aggregate figures
Used in · bayesian-ab-testing-practice
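A posterior-probability test of the kind compared here can be sketched with Beta posteriors and Monte Carlo sampling. This is an illustration under stated assumptions, namely a uniform Beta(1, 1) prior and a binary conversion outcome; the decision rule and priors used in the 48 experiments are not specified in this description, and the counts below are hypothetical.

```python
import random


def prob_b_beats_a(conv_a: int, n_a: int, conv_b: int, n_b: int,
                   draws: int = 20000, seed: int = 0) -> float:
    """Estimate P(rate_B > rate_A) from Beta(1 + successes, 1 + failures) posteriors."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical counts: 10% vs 15% conversion on 1,000 visitors each.
p_better = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
```

A team might stop a test once this probability crosses a pre-agreed threshold, which is where the shorter time-to-decision comes from relative to waiting for a fixed sample size.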
Cox Proportional Hazards — SaaS Churn Covariates
JSON →Fitted hazard ratios for ten covariates on 18-month SaaS subscriber survival. Feature usage depth and onboarding completion dominate (hazard ratios 0.34 and 0.41 respectively); price tier and annual billing have smaller but significant effects. Shows that churn is primarily a product-engagement phenomenon, not a pricing phenomenon.
- Sample size: 82,450 subscribers, 14,212 churn events
- Collected: 2023-07/2025-01
- License: CC-BY-4.0 for cited figures
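Hazard ratios are exponentiated Cox coefficients, so a ratio below 1 means a lower instantaneous churn risk. The sketch below converts the two ratios quoted above into coefficients and percentage changes; it assumes each ratio compares a covariate being present (or one unit higher) against the baseline with other covariates held fixed, since the covariate coding is not given here.

```python
import math


def coef_from_hazard_ratio(hr: float) -> float:
    """Cox model coefficient corresponding to a hazard ratio (HR = exp(coef))."""
    return math.log(hr)


def pct_change_in_hazard(hr: float) -> float:
    """Percentage change in hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


usage_depth = pct_change_in_hazard(0.34)   # about -66: roughly 66% lower churn hazard
onboarding = pct_change_in_hazard(0.41)    # about -59: roughly 59% lower churn hazard
```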
Learning-to-Rank Revenue Lift by Objective Function
JSON →Incremental revenue per session from different ranking objective functions on an e-commerce search result page. Revenue-weighted composite (relevance × margin × projected LTV) outperforms pure relevance ranking by 23% in GMV per session, with neutral effect on relevance perception.
- Sample size: ~14.2M search sessions, 4 variants
- Collected: 2024-08/2024-10
- License: CC-BY-4.0 for cited figures
Used in · search-ranking-revenue-optimization-l2r
Transformer Product Embeddings — CTR Lift vs Collaborative Filtering
JSON →CTR and downstream conversion lift from replacing a matrix-factorization collaborative filter with transformer-based session embeddings (BERT4Rec-style). Transformer embeddings lift CTR by 18–32% across cold-start, returning-user, and category-diverse segments.
- Sample size: ~6.2M users, 4 segments
- Collected: 2024-10/2024-12
- License: CC-BY-4.0 for cited figures
Used in · transformer-product-embeddings-ecommerce · cold-start-problem-few-shot-learning
Uplift Modeling — Persuadable Share by Customer Segment
JSON →Share of customers falling into each of the four uplift quadrants (sure-thing, persuadable, lost-cause, do-not-disturb) for a promotional email campaign, decomposed by customer segment. Only 18% of the audience is genuinely persuadable; 64% of promotional budget is historically wasted on the other three groups.
- Sample size: ~1.6M customers, 4-segment decomposition
- Collected: 2024-Q3/Q4
- License: CC-BY-4.0 for cited figures
Used in · personalized-promotion-uplift-modeling
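The four quadrants follow from comparing what a customer would do with and without the promotion. The sketch below is a deliberately simplified illustration: it thresholds two predicted purchase probabilities at a hypothetical cutoff, whereas production uplift models typically rank customers by the difference between the two. Names and the 0.5 cutoff are assumptions, not the dataset's method.

```python
def uplift_quadrant(p_treated: float, p_control: float,
                    cutoff: float = 0.5) -> str:
    """Label a customer by predicted behaviour with and without the promotion."""
    buys_if_treated = p_treated >= cutoff
    buys_if_control = p_control >= cutoff
    if buys_if_treated and buys_if_control:
        return "sure-thing"        # buys either way; promotion is wasted margin
    if buys_if_treated:
        return "persuadable"       # buys only when promoted
    if buys_if_control:
        return "do-not-disturb"    # promotion makes a purchase less likely
    return "lost-cause"            # does not buy either way


labels = [
    uplift_quadrant(0.9, 0.8),
    uplift_quadrant(0.7, 0.2),
    uplift_quadrant(0.2, 0.7),
    uplift_quadrant(0.1, 0.1),
]
```

Only the "persuadable" label represents incremental revenue, which is why spend on the other three groups counts as waste.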
Cost per Attention Second by Media Format
JSON →CPAS (cost per attention second) computed across 12 digital and traditional media formats from eye-tracking and dwell-inferred attention data. Display banners — the cheapest format on CPM — are the most expensive on attention. Connected-TV and audio invert the traditional CPM-based ROI ranking.
- Sample size: ~120M measured impressions across 7 studies
- Collected: 2022-03/2024-11
- License: CC-BY-4.0 for cited figures
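The ranking inversion can be seen with simple arithmetic. The sketch below assumes CPAS is the cost of one impression divided by the average attention seconds that impression earns; the exact definition used in the dataset is not given here, and the CPM and attention figures are hypothetical.

```python
def cost_per_attention_second(cpm: float, attention_seconds: float) -> float:
    """Cost of one impression (CPM / 1000) divided by attention seconds earned."""
    if attention_seconds <= 0:
        raise ValueError("attention_seconds must be positive")
    return (cpm / 1000.0) / attention_seconds


# Hypothetical inputs: a cheap banner that is barely looked at,
# and a pricier connected-TV spot that holds attention for longer.
banner = cost_per_attention_second(cpm=2.0, attention_seconds=0.2)
ctv = cost_per_attention_second(cpm=25.0, attention_seconds=10.0)
```

In this made-up example the banner has the lower CPM but the higher cost per attention second, which is the pattern the dataset reports.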
Creative Fatigue Decay by Impression Band
JSON →Relative response (click-through and post-click conversion) as the same creative is shown repeatedly to the same audience, segmented by audience-frequency decile. Fatigue onset is earlier than industry convention assumes — entropy-based detection flags decay 2–4 weeks before CTR collapse.
- Sample size: ~3.8B impressions, 14 campaigns
- Collected: 2024-Q2/2025-Q1
- License: CC-BY-4.0 for cited figures
Content Moat — Traffic per Article as Archive Grows
JSON →Traffic per article as a niche content archive grows from 1 to 200+ articles. Per-article traffic compounds with archive size rather than staying flat, a network effect driven by internal linking and topical authority: the 50th article gets 2.8× the traffic of the 1st article at identical quality.
- Sample size: 8 sites, 1,420 articles tracked
- Collected: 2022-01/2025-01
- License: CC-BY-4.0 for cited figures
Used in · compounding-advantage-content-moats-seo