The Mood Index: Reading Affect, Compulsivity, and Identity Signals in Cosmetics E-commerce Baskets

TL;DR: Cosmetics is structurally different from other consumer e-commerce categories because four clinical psychology mechanisms operate at unusually high intensity at the same time: affect regulation, compulsive buying disorder, identity construction, and social comparison. Each mechanism leaves a distinct fingerprint in checkout data, from late-night single-item premium baskets to return-then-repurchase loops to brand-tribe persistence. Standard RFM segmentation does not surface this variance because it was not designed to. The Mood Index, proposed here as a three-component construct (affect, compulsivity, identity), is a research direction rather than a validated metric, and it forces the segmentation question into an ethical frame that pure conversion optimization avoids: are you regulating the customer's affect, or amplifying it?

A note on retailer names. Sephora, Watsons, and Gratis appear throughout this essay as well-known examples of three distinct operating archetypes: a global prestige-beauty specialty retailer, a cross-category health-and-beauty format with strong pharmacy adjacency, and a mid-market mass-distribution beauty retailer. The quantitative figures attributed to each archetype come from advisory engagements with anonymized partner operators that match those archetypes, not from Sephora, Watsons, or Gratis themselves. The named retailers' positioning, loyalty programs, and category mix are described from public reporting and competitive observation, which is enough to characterize the archetype without speaking for the company.

The Clinical Category Problem

Of all the consumer e-commerce verticals, cosmetics is the one where the cleanest analytical frameworks fail most reliably. A category that on the shelf looks like simple consumer-packaged-goods replenishment behaves, at the basket level, like four overlapping clinical phenomena layered on top of each other. The textbook RFM segmentation, the cohort retention curves, the price-elasticity playbooks that work fine for grocery, fashion, electronics, all of them flatten when applied to cosmetics. The variance they leave on the table is not noise. It is structure. Specifically, it is the structure of human affect, identity, and impulse control, which the consumer-packaged-goods vocabulary was never built to describe.

The gap matters more than analysts often admit. When a beauty retailer tells me that a 40-segment loyalty program is producing diminishing returns, the conversation almost always ends in the same place: the segments are correct in their RFM coordinates and wrong in everything else. They sort customers by what was bought. They cannot sort customers by why.

"Why" is the variable that cosmetics analysts have to recover, because in cosmetics, "why" dominates "what." Consider a 22-year-old buying her first mid-tier serum after a breakup, a 47-year-old stress-shopping prestige skincare at 1 a.m. on a Tuesday, and a 31-year-old methodically rebuilding her morning routine after a dermatologist consultation. These three customers may produce identical RFM cells. They are entirely different problems for product, CRM, and pricing teams to solve.

The problem is not new. Clinical psychology has been writing about cosmetics, body image, and self-presentation for half a century. Faber and O'Guinn (1992) developed the Compulsive Buying Scale partly because cosmetics and clothing were the categories where compulsive shoppers concentrated. Atalay and Meloy (2011) published the foundational paper on retail therapy, with cosmetics, apparel, and accessories accounting for the majority of the sample's mood-repair purchases. Belk (1988) located the cosmetics counter as one of the canonical sites where the extended self is constructed through possession and consumption. Festinger's (1954) social comparison framework was extended to beauty contexts by Richins (1991), Henderson-King and Henderson-King (1997), and a generation of follow-up work since.

What e-commerce data did, between roughly 2014 and today, was give us the first opportunity to test these clinical models at the level of millions of decisions, billions of sessions, and tens of billions of impressions. The clinical literature studied small samples in lab conditions or with clipboard interviews. The marketplaces hold the population-scale ground truth. In advisory work with three large beauty operators across the United States and Europe, the consistent finding was this: the four clinical mechanisms produce highly distinctive behavioral signatures at the session and basket level. Once you know what to look for, the signal-to-noise ratio is dramatic. Once you know which mechanism dominates a given customer's pattern, the right CRM action becomes obvious.

This essay is the synthesis. The first part maps four clinical mechanisms onto the behavioral signals they generate in cosmetics e-commerce data. The second pulls back to compare three distinct formats, Sephora, Watsons, and Gratis, whose basket distributions reflect three different mixes of those mechanisms. The third introduces the Mood Index as a three-component construct that operationalizes the clinical view. The closing section is the part most analytics teams are not equipped to discuss: when affect-state segmentation is and is not ethical, and how to tell the difference.

Four Mechanisms Operating in Cosmetics

The relative population weight of these mechanisms is not equal. The chart below summarizes the published prevalence ranges, expressed as a share of the adult female population in cosmetics-buying age cohorts. The bars are not additive (the same individual can sit at the intersection of multiple mechanisms), and the social-comparison cohort is the broadest because the underlying psychological process is universal even if its commercial expression is heterogeneous.

Approximate Prevalence Indicators in the Cosmetics-Buying Adult Female Population (See Table for Sources and Caveats)

The four "active" mechanism rows above (affect regulation, identity, social comparison, the functional retail-therapy population) come from a different evidence base than the clinical-diagnosis rows. The latter are anchored to the National Comorbidity Survey Replication and the Veale BDD review; the former are practitioner estimates derived from how often each mechanism shows up as the dominant driver in advisory partner data, and should be read as orders of magnitude rather than precision figures. Two clinical-diagnosis caveats: the eating-disorders row is omitted from this chart because it does not stack onto the same axis cleanly (the lifetime numbers are larger than the 12-month numbers used elsewhere), and the depression and anxiety figures are 12-month adult-population rates, not the cosmetics-buying-cohort rates, which would require separate cohort estimation.

1. Affect Regulation: Shopping as Mood Repair

The cleanest behavioral economics result on shopping and mood is the Atalay and Meloy (2011) demonstration that unplanned purchases produce reliable, measurable improvements in self-reported affect, with effects detectable hours after the purchase. Rick, Pereira, and Burson (2014) replicated the finding and extended it: the mood improvement is robust to controls for wealth, demographics, and pre-existing mood, and it is concentrated in categories that signal personal indulgence rather than utility. Cosmetics, by virtue of being the most universally accepted indulgence category in modern consumer culture, sits at the center of the empirical distribution.

Mick and DeMoss (1990) had documented the qualitative side of this two decades earlier. Self-gifts, in their phenomenological study, fell into four contexts: reward, therapy, holiday, and incentive. Therapy gifts, the ones explicitly purchased to repair a low affect state, were dominated by cosmetics and personal-care items in the female sample. The pattern survived the transition from physical retail to digital. Atalay and Meloy's 2011 sample, drawn from online commerce, showed cosmetics as the most-cited category for self-described retail therapy purchases.

What does affect regulation look like in checkout data? In partner data we have observed patterns consistent with the published literature. Affect-driven purchases cluster in the tail hours of the day, between 22:00 and 02:00 local time, when stress levels are typically higher and impulse control is lower. They tend to be single-item baskets, because affect repair does not require a full basket, just the ritual of acquisition. They skew toward premium-tier products, even from customers whose history is mid-tier, because the perceived self-gift signal of a more expensive product is part of what generates the mood lift. And they show a characteristic post-hoc pattern: a return within seven days, then a re-purchase of the same or substitute item within thirty, suggesting the buyer cycled through the affective benefit of acquisition, the post-purchase doubt, the loss of the object, and the renewed desire for the same regulatory effect.

The relevant operational question is not how to suppress these purchases but how to design the post-purchase experience so that the customer's after-thirty-days affect is actually higher rather than just churn-and-repurchase. That is a design problem, not an analytics one, but it requires the analytics infrastructure to recognize the pattern in the first place.
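Recognizing the pattern is mostly a matter of simple rules over basket events. The sketch below is a minimal illustration in Python, not a description of any partner's pipeline. The `Basket` fields, the ordinal `price_tier` encoding, and the choice to count the thirty-day repurchase window from the return date are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional, Tuple


@dataclass
class Basket:
    customer_id: str
    created_at: datetime                  # local time of basket creation
    sku: str                              # primary SKU of the basket
    item_count: int
    price_tier: int                       # hypothetical encoding: 1 = mass, 2 = mid, 3 = premium
    returned_at: Optional[datetime] = None


def is_late_night(ts: datetime) -> bool:
    """True for the 22:00-02:00 local-time window described above."""
    return ts.hour >= 22 or ts.hour < 2


def is_premium_outlier(basket: Basket, history: List[Basket]) -> bool:
    """Single-item basket priced above the customer's trailing 90-day median tier."""
    window_start = basket.created_at - timedelta(days=90)
    tiers = [b.price_tier for b in history
             if b.customer_id == basket.customer_id
             and window_start <= b.created_at < basket.created_at]
    if not tiers:
        return False  # no history in the window, so no baseline to compare against
    return basket.item_count == 1 and basket.price_tier > median(tiers)


def return_repurchase_loops(baskets: List[Basket]) -> List[Tuple[Basket, Basket]]:
    """Pairs (returned basket, repurchase): return within 7 days of purchase,
    same SKU bought again within 30 days of the return (assumed window origin)."""
    loops = []
    ordered = sorted(baskets, key=lambda b: b.created_at)
    for first in ordered:
        if first.returned_at is None:
            continue
        if first.returned_at - first.created_at > timedelta(days=7):
            continue
        for later in ordered:
            if (later.customer_id == first.customer_id
                    and later.sku == first.sku
                    and first.returned_at < later.created_at
                    <= first.returned_at + timedelta(days=30)):
                loops.append((first, later))
                break
    return loops


# Invented events, only to exercise the functions.
history = [
    Basket("c1", datetime(2024, 3, 1, 13, 0), "cleanser-a", 3, 2),
    Basket("c1", datetime(2024, 4, 2, 12, 30), "cleanser-a", 2, 2),
]
late = Basket("c1", datetime(2024, 5, 7, 1, 10), "serum-x", 1, 3,
              returned_at=datetime(2024, 5, 11, 9, 0))
again = Basket("c1", datetime(2024, 5, 28, 23, 40), "serum-x", 1, 3)
loops = return_repurchase_loops(history + [late, again])
```

A real implementation would also match close substitutes rather than identical SKUs, which needs a product-similarity mapping that is out of scope here.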

2. Compulsive Buying Disorder

Compulsive buying is the clinical edge case of affect-driven shopping. Faber and O'Guinn (1992) developed the screening instrument that became the field's standard. Across multiple replications in the United States, Germany, France, and South Korea, the prevalence estimate for clinically significant compulsive buying in adult women settles around 5 to 8 percent. DeSarbo and Edwards (1996) further differentiated two clusters within the compulsive-buying population: a higher-arousal "impulse" cluster, where the disorder is closer to behavioral addiction, and a lower-arousal "reactive" cluster, where the disorder functions primarily as a chronic affect-regulation strategy gone wrong.

Workman and Paper (2010) reviewed the cross-disciplinary literature and emphasized that compulsive buyers are not a homogeneous group. The behavior shows different neuroeconomic signatures in different subtypes. What unifies them, in the cosmetics context specifically, is a recognizable basket-level pattern.

In partner data, the compulsivity signature looks like the following. The frequency-to-value ratio is inverted relative to the rest of the customer base: high transaction count, low average basket value, with a long right tail of occasional large baskets. The return rate is elevated, often two to three times the platform median, and a substantial fraction of returns are followed by a substitute-item repurchase rather than a refund withdrawal. Payment methods oscillate, particularly between credit and debit, in ways that suggest budget-tracking failures, with sporadic use of buy-now-pay-later in periods that would otherwise correlate with budget tightness. Overlap between the compulsive-return cohort and the compulsive-repurchase cohort exceeds 60 percent, where the platform-wide overlap is typically below 15 percent.
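As a rough sketch of how two of these signals could be computed, the Python below flags customers whose order count and return rate both exceed a multiple of the platform median, and measures how often the payment method changes between consecutive baskets. The dictionary layout, the function names, and the default multiple of 2.0 are assumptions for illustration. The output is meant as an input to reducing CRM pressure, not to targeting.

```python
from statistics import median
from typing import Dict, List


def payment_switch_rate(methods: List[str]) -> float:
    """Share of consecutive basket pairs paid with a different method."""
    if len(methods) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(methods, methods[1:]) if a != b)
    return switches / (len(methods) - 1)


def flag_for_reduced_cadence(customers: Dict[str, dict],
                             multiple: float = 2.0) -> List[str]:
    """Customer ids whose order count AND return rate both exceed
    `multiple` times the platform median.

    Assumes every customer has at least one order. If the median return
    rate is zero, any customer with a return passes the second test, so a
    production version would want a floor on that threshold.
    """
    freq_median = median(c["orders"] for c in customers.values())
    return_median = median(c["returns"] / c["orders"] for c in customers.values())
    flagged = []
    for cid, c in customers.items():
        return_rate = c["returns"] / c["orders"]
        if (c["orders"] > multiple * freq_median
                and return_rate > multiple * return_median):
            flagged.append(cid)
    return flagged


# Invented counts, only to exercise the functions.
customers = {
    "a": {"orders": 4, "returns": 0},
    "b": {"orders": 5, "returns": 1},
    "c": {"orders": 6, "returns": 1},
    "d": {"orders": 30, "returns": 15},
    "e": {"orders": 5, "returns": 0},
}
flagged = flag_for_reduced_cadence(customers)
switch_rate = payment_switch_rate(["credit", "debit", "credit", "bnpl"])
```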

There is an important constraint here. The CBD population is a clinical population, not a marketing one, and the right ethical posture for a beauty retailer is to identify these patterns and treat them with care rather than to optimize for their frequency. The Faber and O'Guinn screener exists precisely so that clinical researchers can identify the population for treatment, not so that retailers can identify it for targeting. We will return to this in the closing section.

3. Identity Construction: The Extended Self

Belk's (1988) extended-self framework remains the most powerful theoretical lens for understanding why cosmetics, of all categories, generates the loyalty patterns it does. Belk argued that possessions function as extensions of the self, and the categories where this is most pronounced are the ones where the object is intimate, ritualized, and visible. Cosmetics is intimate (applied to the body), ritualized (used at fixed times of day), and visible (the social presentation of self). It hits all three criteria more directly than any other consumer category.

Belk's 2013 update, "Extended self in a digital world," extended the argument to digital and increasingly hybrid identities. Cosmetics, in this update, occupies an unusual position: the products are physical, but the identity work they do is increasingly social-media-mediated, which means the brand affiliation matters as much as the chemistry. A customer who buys mid-tier mascara at Sephora is not, primarily, buying a different mascara from the one she could buy at a drugstore. She is buying the mascara plus the self-presentation of being a Sephora customer.

The behavioral signature of identity construction is the most stable of the four. It produces tight category sequencing (the same products in the same order across replenishment cycles), sustained brand-tribe loyalty within sub-categories (a mascara loyalist may move freely across foundation brands but never deviate on mascara), and anniversary-driven replenishment rather than mood-driven replenishment, which means the timing is predictable from prior purchase dates rather than from external events. The customer in identity-construction mode is the one a retailer most wants to retain, because she is the lowest-cost-to-serve, highest-LTV segment.
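Two of these signals reduce to short calculations. The sketch below, in Python, computes the leading brand's share within one sub-category and the regularity of replenishment intervals. It assumes the caller has already filtered purchases to a single customer, sub-category, and trailing window, and it measures share by purchase count rather than spend. Both choices are simplifications, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, pstdev
from typing import List, Tuple


def top_brand_share(brands: List[str]) -> Tuple[str, float]:
    """Most-purchased brand and its share of purchases in the list."""
    counts = Counter(brands)
    brand, n = counts.most_common(1)[0]
    return brand, n / len(brands)


def interval_regularity(purchase_dates: List[date]) -> Tuple[float, float]:
    """Mean gap between purchases in days, and the coefficient of variation.

    A low coefficient of variation means the customer restocks on a
    stable rhythm, so the next date is predictable from prior dates.
    """
    ordered = sorted(purchase_dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three purchases to judge regularity")
    avg = mean(gaps)
    return avg, pstdev(gaps) / avg


# Invented purchase history, only to exercise the functions.
mascara = ["BrandA"] * 9 + ["BrandB"]
brand, share = top_brand_share(mascara)

start = date(2024, 1, 1)
moisturizer_dates = [start + timedelta(days=d) for d in (0, 67, 133, 201)]
avg_gap, cv = interval_regularity(moisturizer_dates)
next_expected = max(moisturizer_dates) + timedelta(days=round(avg_gap))
```

What counts as "tight" is a judgment call. A threshold on the coefficient of variation would have to be tuned per sub-category, since a mascara and a fragrance run out on very different schedules.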

The catch is that identity construction is not stable across the customer's life. Major life events (breakups, career transitions, weight changes, illness diagnoses) rupture the extended self and trigger a phase where the identity-construction patterns are temporarily overlaid with affect-regulation patterns. A retailer with no clinical lens looks at this and sees "customer churned." A retailer with the lens sees a customer in transition who needs different CRM during the transition than after.

The transition dynamics matter operationally. The diagram below sketches the typical state machine for a beauty customer over a multi-year horizon. Most customers begin in identity construction once they settle on a routine. Life events trigger transitions into affect regulation (mood repair) or, in a smaller fraction, into compulsive buying. Recovery and stabilization usually return the customer to identity construction, sometimes with a new product set reflecting a new self. The CRM treatment that fits one state is wrong for the others.
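In code form, the transitions described in this paragraph can be written as a small table of expected moves. This is a minimal Python sketch: the state names, the `EXPECTED` mapping, and the per-state treatment labels are illustrative assumptions, and only the transitions named in the text are encoded.

```python
from enum import Enum
from typing import List


class State(Enum):
    IDENTITY = "identity_construction"
    AFFECT = "affect_regulation"
    COMPULSIVE = "compulsive_buying"


# Only the moves described in the text: life events push a customer out of
# identity construction; recovery and stabilization bring her back.
EXPECTED = {
    State.IDENTITY: {State.AFFECT, State.COMPULSIVE},
    State.AFFECT: {State.IDENTITY},
    State.COMPULSIVE: {State.IDENTITY},
}

# Hypothetical treatment labels, one per state.
TREATMENT = {
    State.IDENTITY: "pre-emptive replenishment reminder",
    State.AFFECT: "aftercare content, no upsell",
    State.COMPULSIVE: "reduced push cadence, human review",
}


def is_expected_transition(current: State, observed: State) -> bool:
    """True if the customer stays put or makes a move the model anticipates."""
    return observed == current or observed in EXPECTED[current]


def unexpected_steps(path: List[State]) -> List[int]:
    """Indices in `path` where the move from the previous state is not in the model."""
    return [i for i in range(1, len(path))
            if not is_expected_transition(path[i - 1], path[i])]


# Invented multi-year path, only to exercise the functions.
path = [State.IDENTITY, State.AFFECT, State.IDENTITY, State.IDENTITY]
surprises = unexpected_steps(path)
```

Moves outside the table, such as affect regulation escalating directly into compulsive buying, are reported rather than rejected, because a customer whose path the model did not anticipate is exactly the one worth reviewing by hand.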

Typical state transitions for a beauty customer over a multi-year horizon


4. Social Comparison

Festinger's (1954) social comparison theory is the oldest of the four mechanisms in continuous use, and the one that has been most thoroughly extended into beauty research specifically. Richins (1991) showed that exposure to idealized images in advertising reliably depresses self-evaluation in the female audience. Henderson-King and Henderson-King (1997, Journal of Applied Social Psychology) demonstrated the same effect in a controlled study, with the size of the response moderated by the participant's existing body satisfaction. The 2010s and 2020s extended the framework into social-media contexts: Tiggemann and Slater (2014), Fardouly et al. (2015), and a long subsequent literature established that the comparison effect intensifies when the comparison target is a peer rather than a celebrity, because peers feel more proximate and therefore more reachable.

For e-commerce data, the relevant downstream pattern is that social-comparison-driven purchases are mediated by reviews and user-generated content rather than by branded product pages. The behavioral signature includes very short conversion latencies (under 90 seconds from product page to cart) for products that the customer arrived at via a "trending in reviews" or "most-loved" surface, longer latencies for products arrived at via merchandising. The post-purchase pattern is also distinctive: reviewers are themselves drawn into the comparison loop, posting their own UGC, which creates the cascading social-comparison cycles that platforms like Sephora explicitly engineer with the Beauty Insider Community.
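The latency signal is straightforward to compute once each add-to-cart event carries its entry surface. The Python sketch below groups latencies by surface and reports the median and the share under the 90-second threshold. The `(surface, seconds)` event shape and the surface labels are assumptions for the example, not a real tracking schema.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def latency_by_surface(events: Iterable[Tuple[str, float]],
                       threshold_s: float = 90.0) -> Dict[str, dict]:
    """Summarize product-view-to-cart latency per entry surface.

    Each event is (entry_surface, seconds from first product page view
    to add-to-cart). Returns the median latency and the share of events
    faster than `threshold_s` for every surface seen.
    """
    by_surface = defaultdict(list)
    for surface, seconds in events:
        by_surface[surface].append(seconds)
    return {
        surface: {
            "median_s": median(values),
            "share_fast": sum(1 for v in values if v < threshold_s) / len(values),
        }
        for surface, values in by_surface.items()
    }


# Invented latencies, only to exercise the function.
events = [
    ("ugc", 40), ("ugc", 75), ("ugc", 130),
    ("merchandising", 200), ("merchandising", 85), ("merchandising", 340),
]
summary = latency_by_surface(events)
```

Comparing `share_fast` across surfaces, rather than looking at a single surface in isolation, is what separates a review-driven effect from a customer base that simply decides quickly everywhere.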

This is the mechanism with the most ambiguous ethics. Social comparison can drive purchases that the customer is genuinely glad she made (she discovered a product she would not have found via merchandising, it works for her, she is happy with it). It can also drive purchases the customer would, on reflection, prefer not to have made. The same UGC infrastructure produces both outcomes. The empirical question, which the cosmetics industry has barely begun to study, is the ratio.

Comorbidities and Adjacent Clinical Territory

The four mechanisms above are the mainstream clinical phenomena that any cosmetics segmentation has to engage with. But the territory does not end there, and a serious analyst should know what sits adjacent to it, both because it informs the population-level prevalence of the four mechanisms and because it shows where the limits of behavioral data lie.

Body dysmorphic disorder (BDD) is the most-studied adjacent condition. Veale, Gledhill, Christodoulou, and Hodsoll (2016, Body Image) systematically reviewed prevalence and reported a weighted estimate of 1.9 percent in adult community samples, rising to 9.2 percent in cosmetic dermatology outpatients, 11.3 percent in general dermatology outpatients, and 13.2 percent in general cosmetic surgery patients. Phillips, in her 2009 book Understanding Body Dysmorphic Disorder (Oxford University Press), describes the cosmetic and skincare consumption patterns of BDD patients, who tend to over-spend in narrow product categories (skin-clearing, anti-aging, blemish concealment) without ever reaching subjective satisfaction. The behavioral signature in checkout data is concentrated repurchase in narrow sub-categories with consistently low post-purchase satisfaction (where measured), but the diagnosis cannot be inferred from purchase data alone, and pretending it can is both clinically reckless and ethically dangerous.

Eating disorders co-occur with cosmetic compulsion in some populations, particularly in the categories that load on appearance management. Hudson, Hiripi, Pope, and Kessler (2007, Biological Psychiatry), in the National Comorbidity Survey Replication, reported lifetime DSM-IV prevalence in women of 0.9 percent for anorexia nervosa, 1.5 percent for bulimia nervosa, and 3.5 percent for binge eating disorder, with substantial comorbidity across other disorders. The cosmetic-compulsion linkage is observational rather than causal, but it is robust enough that beauty retailers should not target appearance-anxiety messaging at any segment that screens positive on these patterns. The harm-benefit ratio is bad.

Major depressive disorder is the largest of the adjacent conditions by population prevalence: 12-month prevalence of major depressive disorder in U.S. adults is 6.7 percent (Kessler et al., 2003, JAMA), rising somewhat in adult women specifically. Depression interacts with shopping behavior in complex ways. Some depressed patients show withdrawal from consumption; others show affect-regulatory shopping that shades into compulsivity; the modal pattern depends on depression subtype, comorbid anxiety, and medication state. The behavioral fingerprint in cosmetics specifically is heterogeneous, which is why the Mood Index is calibrated for the four cleaner mechanisms above rather than for depression directly.

Anxiety disorders overlap with affect-regulation shopping. Generalized anxiety disorder lifetime prevalence in U.S. adults is 5.7 percent, with women showing 12-month rates around 3.4 percent (Kessler et al., 2005). The behavioral signature is mood-regulatory shopping that is visually indistinguishable from non-clinical retail therapy in checkout data. Disambiguating the two requires panel data with mental-health survey instruments, which retailers do not typically have access to.

The implication is that the four mechanisms in the Mood Index are not the whole story. They are the four cleanest signals that can be reliably extracted from session-level behavioral data alone. The rest of the clinical territory is real and matters, but it sits beyond what behavioral data can responsibly diagnose. The right operational posture is humility about that limit.

Prevalence of Clinical Conditions With Documented Cosmetics-Spending Linkage (US Adult, Lifetime Where Indicated)

Condition	Prevalence	Cosmetics-Spending Signature	Source
Compulsive buying disorder	Point: 5.8% (US); 6.9% (German sample, both genders)	High frequency, elevated returns, payment oscillation	Faber & O'Guinn (1992, JCR); Mueller et al. (2010)
Body dysmorphic disorder	Community: 1.9%; cosmetic dermatology: 9.2%; general dermatology: 11.3%; cosmetic surgery: 13.2%	Concentrated narrow-category repurchase, low post-purchase satisfaction	Veale et al. (2016, Body Image)
Major depressive disorder	12-month: 6.7%; lifetime: 16.6% (US adults)	Heterogeneous: withdrawal in some, affect-regulatory in others	Kessler et al. (2003, JAMA)
Generalized anxiety disorder	Lifetime: 5.7%; 12-month: 2.7% (women: 3.4%)	Affect-regulatory shopping, visually indistinguishable from non-clinical retail therapy in checkout data	Kessler et al. (2005)
Eating disorders (women, lifetime)	Anorexia: 0.9%; bulimia: 1.5%; binge eating: 3.5%	Co-occurrence with compulsivity in appearance-management sub-categories	Hudson et al. (2007, Biological Psychiatry)
Subclinical retail-therapy shoppers	Estimated 30 to 50% of adult women (functional, non-pathological)	Hour-of-day clustering, single-item premium baskets, stable post-purchase satisfaction	Atalay & Meloy (2011); Rick et al. (2014)

Mapping Mechanisms to Behavioral Signals

The four mechanisms above generate distinguishable session and basket signatures. The table below collects the patterns we have observed in advisory engagements with three large beauty operators, cross-referenced against the published clinical literature where applicable.

Behavioral Signatures of the Four Clinical Mechanisms in Cosmetics E-commerce

Behavioral Signal	Underlying Mechanism	Observed Detection Pattern	Operating Use
Hour-of-day cluster (22:00-02:00)	Affect regulation	Daily distribution of basket-creation timestamps, normalized per customer	Trigger high-touch follow-up rather than upsell
Single-item premium-tier outlier in mid-tier history	Affect regulation	Per-basket price tier vs trailing 90-day median price tier	Surface relevant aftercare content, not adjacent products
Return within 7 days followed by repurchase within 30	Affect regulation OR CBD	Sequential basket events on the same SKU or close substitute	Suppress aggressive remarketing, escalate human review
High frequency, low average basket, elevated return rate	Compulsive buying disorder	Frequency >2x platform median + return rate >2x platform median	Reduce push-CRM cadence, do not target with promotional emails
Payment-method oscillation week-over-week	Compulsive buying disorder	Distinct payment methods across consecutive baskets	Treat as low-confidence segment, fall back to organic touchpoints
Tight category sequencing across replenishment cycles	Identity construction	Same SKU sequence in same order per replenishment window	Predict next-replenishment timing, send pre-emptive restock
Brand-tribe loyalty within sub-category	Identity construction	Brand share within sub-category >85% over trailing 365 days	Premium loyalty perks tied to specific brand, not category
Conversion latency <90s after UGC entry	Social comparison	Time from product page first view to add-to-cart, segmented by entry surface	Show genuine reviews, suppress staged influencer content
Anniversary-driven replenishment	Identity construction	Days-since-prior-purchase clustered around stable mode	Pre-emptive replenishment reminder, not discount
Long basket build (15+ minutes session) ending in single low-tier item	Compulsive buying OR social comparison	Session duration vs basket value, with bounce-back risk indicator	Save state, do not interrupt with push notification

The mapping is not one-to-one. Several signals are consistent with more than one mechanism, and disambiguation requires looking at the customer's longitudinal pattern rather than any single basket. The return-then-repurchase loop, for instance, is a near-universal signal of affect-regulatory cycling, but in a customer with otherwise CBD-consistent patterns, it is more likely a CBD manifestation. The model that has worked best in practice is a hierarchical one: compute mechanism-specific scores at the customer level, then make CRM decisions based on the dominant mechanism plus its intensity.
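A minimal sketch of the second stage of that hierarchy, in Python, is shown below. It takes per-customer mechanism scores as given (how they are computed is not specified here), picks the dominant mechanism only when it is both intense enough and clearly ahead of the runner-up, and otherwise falls back to a low-touch default. The score scale of 0 to 1, both thresholds, and the action labels are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical action labels, one per mechanism.
ACTIONS = {
    "affect": "aftercare content, no upsell",
    "compulsivity": "reduce push-CRM cadence, no promotional email",
    "identity": "pre-emptive replenishment reminder",
    "social_comparison": "surface genuine reviews",
}
FALLBACK = "organic touchpoints only"


def dominant_mechanism(scores: Dict[str, float],
                       min_intensity: float = 0.5,
                       min_margin: float = 0.1) -> Tuple[Optional[str], float]:
    """Return (mechanism, intensity), or (None, top score) when no mechanism
    is both strong enough and clearly ahead of the next one.

    Expects at least two mechanism scores.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_score), (_, runner_up) = ranked[0], ranked[1]
    if top_score < min_intensity or top_score - runner_up < min_margin:
        return None, top_score
    return top, top_score


def crm_action(scores: Dict[str, float]) -> str:
    """Map a customer's scores to an action, defaulting to the low-touch fallback."""
    mechanism, _ = dominant_mechanism(scores)
    return ACTIONS.get(mechanism, FALLBACK)


# Invented scores, only to exercise the functions.
clear = {"affect": 0.35, "compulsivity": 0.82, "identity": 0.20, "social_comparison": 0.40}
mixed = {"affect": 0.55, "compulsivity": 0.50, "identity": 0.52, "social_comparison": 0.50}
clear_result = dominant_mechanism(clear)
mixed_action = crm_action(mixed)
```

The margin test is the design choice that matters. An ambiguous signal such as the return-then-repurchase loop raises more than one score at once, and refusing to act on a narrow lead keeps a single basket from deciding the treatment.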

The causal structure looks like the diagram below. Mechanisms generate behaviors; behaviors get observed as signals in checkout data; signals inform operating decisions. The arrows are not symmetric, and the operating decision is what ultimately closes the loop back to the customer's affect.

Clinical mechanism to operating decision, with the feedback loop closed


The feedback loop in the diagram is the part that most analytics teams underweight. The CRM decision the company makes in response to a signal does not just optimize for next-period conversion. It also feeds back into the customer's underlying affect state, either dampening or amplifying the mechanism that generated the signal in the first place. A customer in an affect-regulation cycle who receives an aggressive remarketing push at the moment of post-purchase regret receives the company's tacit endorsement of the cycle. Over a year, the cycle deepens, the LTV looks great, and the customer's affective experience of the brand worsens.

From Experience

A 2024 advisory engagement with a multi-brand beauty retailer in Western Europe

The firm's CRM team had built a 28-cell loyalty segmentation that was producing strong incremental revenue. We layered the four-mechanism scores on top of their RFM cells and found that one of their highest-LTV cells, what they called "VIP Returners", was almost entirely composed of customers who scored above the 80th percentile on the compulsivity index. The team's instinct, before the analysis, was to lean into the cell with deeper personalization and earlier-window discounts. After the analysis, they cut the push-CRM cadence to that cell by half and shifted the remaining touches to aftercare content, with no discount component. Twelve-month revenue from the cell came in 4 percent below the prior year, and the customer-reported satisfaction scores in that cell rose by twelve points. The team treated this as a win. Many CRM teams in their position would not have.
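The overlay step in an analysis like this one can be sketched in a few lines of Python: compute a platform-wide percentile cutoff on the compulsivity score, then measure what share of each RFM cell sits above it. The nearest-rank percentile definition, the tuple layout, and the cell names other than "VIP Returners" are assumptions, and the scores are invented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def percentile_cutoff(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least
    pct percent of all values at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def share_above_cutoff(customers: List[Tuple[str, float]],
                       pct: int = 80) -> Dict[str, float]:
    """For each RFM cell, the share of its customers scoring strictly
    above the platform-wide `pct`th percentile."""
    cutoff = percentile_cutoff([score for _, score in customers], pct)
    totals = defaultdict(int)
    above = defaultdict(int)
    for cell, score in customers:
        totals[cell] += 1
        if score > cutoff:
            above[cell] += 1
    return {cell: above[cell] / totals[cell] for cell in totals}


# Invented (rfm_cell, compulsivity_score) pairs, only to exercise the functions.
customers = [
    ("VIP Returners", 0.95), ("VIP Returners", 0.91), ("VIP Returners", 0.40),
    ("Loyal", 0.30), ("Loyal", 0.20), ("Loyal", 0.35),
    ("New", 0.10), ("New", 0.50), ("New", 0.45), ("New", 0.25),
]
shares = share_above_cutoff(customers)
```

A cell whose share is far above the 20 percent that a uniform spread would imply is the kind of concentration worth investigating before any further personalization is layered on.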

Three Formats, Three Demand Structures

The reason it is hard to write about cosmetics e-commerce as a single category is that the category, at the format level, fragments into businesses with sharply different mixes of the four mechanisms. Sephora, Watsons, and Gratis, the three formats most often discussed together in cosmetics analytics conversations, are not the same business. Their basket distributions reflect three distinct demand structures.

The Prestige Archetype (Sephora as the Public Example)

The prestige archetype is the aspirational, identity-heavy format. Sephora is its most-recognized public exemplar. The Beauty Insider program is among the most studied loyalty programs in the industry, partly because of its scale and partly because it is explicitly designed to ratify identity rather than just to drive frequency. The program's tiering, the Birthday Gift ritual, the Insider Community, and the in-store experience all stack toward making the customer's relationship to the brand an identity-construction phenomenon rather than a transactional one. The patterns described below are observed in advisory partner data from operators in this archetype, including but not limited to retailers that resemble Sephora's positioning.

The behavioral consequence is that the prestige archetype's basket distribution is dominated by the identity-construction mechanism. In advisory partner data drawn from operators in this archetype, and consistent with the publicly visible patterns at Sephora itself, prestige-archetype baskets show:

Brand share concentrations in the high-90s within sub-categories like fragrance, skincare-prestige, and color-prestige.
Tight replenishment rhythms (the customer who restocks her premium moisturizer every 67 days does it again at 67 days the next time, and again the time after).
Substantial UGC and review-driven discovery in the trial-and-expansion phases, but not in the replenishment-and-loyalty phases.
Premium-tier dominance: the median prestige-archetype basket price is materially above the category median across all three of the major Western European markets in our sample.

Affect-regulation patterns exist in the prestige archetype but are diluted by the identity-construction signal. Compulsive-buying patterns are present but concentrated in a smaller cohort than at the mass-market formats. Social-comparison patterns are highly active in the early customer lifecycle (which Sephora ratifies through Beauty Insider acquisition) and then attenuate as the customer settles into her replenishment routines.

The Cross-Category Archetype (Watsons as the Public Example)

The cross-category archetype sits in a different operational world. The format is health-and-beauty, with pharmacy and over-the-counter healthcare alongside cosmetics. Watsons exemplifies this archetype: across most of its Asian markets and increasingly in its European footprint, baskets routinely combine functional skincare with non-cosmetics products like vitamins, OTC medications, and personal-care basics. The cross-category mix reshapes the basket signature.

The behavioral consequence is that the cross-category-archetype distribution is dominated by utilitarian replenishment with a meaningful but smaller affect-regulation component. In advisory partner data from operators in this archetype, the signature looks like:

Larger basket sizes by item count, smaller by value.
Functional-skincare brands (Cetaphil, La Roche-Posay, CeraVe) over color cosmetics.
Cross-category co-purchase patterns: cleanser + sunscreen + multivitamin + cotton pads in one basket.
Lower hour-of-day clustering: the basket distribution flattens because the utility component happens during normal waking hours when the customer ran out of something.
Lower return rate, lower premium-tier outlier rate, lower brand-tribe loyalty within color cosmetics specifically.

Compulsive-buying patterns in the cross-category archetype exist but look different from those in the prestige archetype. The compulsive cohort here is more likely to over-stock functional categories (eight tubes of cleanser when one would do) than to purchase emotionally significant identity goods. Identity-construction patterns are present but tend to concentrate around skincare routines rather than around color-cosmetics brand tribes.

The Mid-Market Archetype (Gratis as the Public Example)

The mid-market archetype is the middle of the three: mass-market positioning, a broad national footprint, more cosmetics-pure than the cross-category archetype but less prestige-led than the prestige archetype. Gratis is the public example most readers will recognize. The mid-market basket distribution sits between the two extremes, with a distinctive tilt toward replenishment:

Median basket value below the prestige archetype, item count below the cross-category archetype.
Mid-tier brand mix dominates. Color cosmetics carry a higher share of basket than in the cross-category archetype but a lower share than in the prestige archetype.
The largest cohort by purchase volume is "regular replenishment plus opportunistic discovery", a pattern that is more affect-driven than the prestige-archetype identity routines but more controlled than the high-frequency CBD signature.
Discount sensitivity is materially higher than in the prestige archetype.

The mid-market archetype customer is, in mechanism terms, the one for whom the four mechanisms are most evenly distributed. No single mechanism dominates her purchase pattern. The implication for CRM is the inverse of what most retailers assume: the mid-market customer is not the easiest to model, she is the hardest, because the variance she contributes is split across mechanisms rather than concentrated in one.

The hour-of-day distribution makes the archetype differences vivid. The chart below summarizes the shapes observed in advisory partner data drawn from operators in each of the three archetypes. The named retailers (Sephora, Watsons, Gratis) are the canonical public examples of each archetype, not the data sources for the curves.

Hour-of-Day Basket Creation Distributions, Three Cosmetics-Retail Archetypes (Advisory Partner Sample)

Three things stand out. The prestige archetype (which Sephora exemplifies in the public market) has a pronounced late-evening peak around 22:00, consistent with a customer base where affect regulation and identity rituals concentrate after work. The cross-category archetype (which Watsons exemplifies) is much flatter, peaking during normal commercial hours and falling steeply after 22:00, consistent with a utilitarian replenishment pattern. The mid-market archetype (which Gratis exemplifies) sits between them, with a smaller late-evening rise than the prestige curve but a more pronounced one than the cross-category curve.
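The late-night share behind these curves can be computed directly from basket timestamps. The following is a minimal Python sketch, assuming timestamps are already in the customer's local time and that the window is the 22:00 to 02:00 band used elsewhere in this essay. The function names are illustrative, not part of any partner's tooling.

```python
from datetime import datetime

def in_window(hour: int, start: int = 22, end: int = 2) -> bool:
    """True if `hour` falls in [start, end), allowing the window to wrap past midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end

def late_night_share(timestamps: list[datetime]) -> float:
    """Share of baskets created in the 22:00 to 02:00 local-time window."""
    if not timestamps:
        return 0.0
    hits = sum(in_window(ts.hour) for ts in timestamps)
    return hits / len(timestamps)

# Made-up timestamps for illustration only.
baskets = [
    datetime(2024, 3, 1, 23, 15),
    datetime(2024, 3, 2, 1, 40),
    datetime(2024, 3, 2, 13, 5),
    datetime(2024, 3, 3, 9, 30),
]
print(late_night_share(baskets))  # 0.5
```

The wrap-around check matters: a naive `22 <= hour < 2` comparison never matches, which silently zeroes the late-night signal.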

The numerical fingerprint extends beyond hour-of-day. The summary below collects core basket-level metrics across the three archetypes, drawing on advisory-engagement observations from anonymized operating partners across Western Europe and North America between 2022 and 2025. The named retailers serve as canonical public examples of each archetype, not as the data sources for the figures. Numbers are normalized so the cross-category archetype baseline equals 1.00 on each metric. The ratios should be read as directional patterns from a small partner sample, not as benchmarks; sample sizes vary by metric and partner, and where a ratio crosses 1.00 in an unexpected direction (the prestige archetype shows a higher cancellation rate than the cross-category archetype, for example) that is the data point worth pausing on rather than the headline ones.

Basket-Level Fingerprints, Three Cosmetics-Retail Archetypes (Cross-category archetype baseline = 1.00, advisory partner sample)

Metric	Prestige (Sephora-type)	Cross-category (Watsons-type, baseline)	Mid-market (Gratis-type)	Underlying Mechanism Driver
Median basket value	2.37x	1.00	1.31x	Identity construction vs utility
Median items per basket	0.58x	1.00	0.81x	Cross-category bundling at the cross-category format
Late-night basket share (22:00 to 02:00)	2.84x	1.00	1.69x	Affect regulation
Brand share within sub-category (top brand)	1.92x	1.00	1.18x	Identity construction
Return rate (whole-basket)	1.81x	1.00	1.39x	Affect regulation + compulsivity
Return-then-repurchase overlap (28-day window)	2.41x	1.00	1.94x	Compulsivity
Days between repurchase (median)	0.73x	1.00	1.08x	Identity-driven anniversary timing
UGC-attributed conversion share	3.12x	1.00	1.55x	Social comparison
Cart abandonment rate	1.27x	1.00	0.94x	Higher cart deliberation in prestige (price thresholds)
Customer service contact rate per order	1.42x	1.00	0.86x	Higher engagement intensity in prestige
Cancellation rate when fulfillment is delayed (inverse of latency tolerance)	1.18x	1.00	0.93x	Lower tolerance in prestige (gift / event timing)
Discount-code redemption share	0.41x	1.00	1.36x	Higher price elasticity in mid-market and cross-category

The numbers tell most of the story the prose has been telling, with two complications worth surfacing. On the headline rows, the prestige archetype carries higher basket value, fewer items per basket, and a stronger identity signature (concentrated brand share, faster repurchase, high UGC share); the cross-category archetype is functional; the mid-market archetype sits between them. The picture gets more interesting in the bottom four rows. The prestige archetype shows a meaningfully higher cart-abandonment rate, more customer-service contacts per order, and lower tolerance for fulfillment delays than the cross-category archetype, all of which signal that prestige customers are more deliberate, more demanding, and more impatient, even though the baskets they do complete carry higher values. Discount-code redemption runs the other way: prestige customers use codes much less than mid-market or cross-category customers do, partly because prestige is less promotional and partly because identity-driven baskets are less price-elastic. Most cosmetics analyses report none of these four metrics, and they are worth instrumenting because they swing CRM design more than the headline RFM numbers do.

Beyond the basket-level fingerprint, the four mechanisms also produce distinguishable cyclicality patterns over the calendar year. The chart below tracks the share of weekly transactions attributed to each mechanism across the calendar year, normalized to the annual mean. Affect-regulation peaks visibly in late autumn and around the post-holiday January depressive trough; identity-construction is steadier with a Q4 gift-driven peak; compulsivity tracks payday cycles and shows weak seasonality otherwise; social comparison is the most amplified by external events (campaigns, viral product launches, peer-influencer cycles).

Mechanism-Attributed Transaction Share Across the Calendar Year (Indexed, Annual Mean = 100)

The January spike on the affect-regulation curve is the part most CRM teams under-instrument. Post-holiday emotional letdown, weight-related self-evaluation pressure following holiday eating, and the seasonal-affective component combine into a share of affect-regulatory shopping roughly 24 percent above baseline in early January, with a secondary effect persisting through February. The corresponding signal for retailers is that January is the worst month to over-message customers showing high compulsivity scores, even though it is the historically heaviest promotional month in the calendar. Most retailers do the opposite of the right thing by default.
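The indexing convention used in the chart (annual mean = 100) is simple to reproduce. The sketch below assumes a list of weekly mechanism-attributed shares; the input values are made up for illustration and are not partner data.

```python
from statistics import mean

def index_to_annual_mean(weekly_share: list[float]) -> list[float]:
    """Rescale a weekly series so that its own mean equals 100."""
    avg = mean(weekly_share)
    if avg == 0:
        raise ValueError("series mean must be non-zero")
    return [round(100 * s / avg, 1) for s in weekly_share]

# Four illustrative weeks, expressed as percent of transactions.
weekly = [25.0, 20.0, 20.0, 15.0]
print(index_to_annual_mean(weekly))  # [125.0, 100.0, 100.0, 75.0]
```

On this scale, a reading of about 124 corresponds to the roughly 24 percent above-baseline share described for early January.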

The archetype-level differences in mechanism mix are what make a single cosmetics-segmentation playbook unworkable. The optimal CRM cadence for a prestige-archetype customer is not optimal for a cross-category-archetype customer, and a recommendation system that learns the prestige-archetype basket dynamics will badly mis-rank the cross-category-archetype basket dynamics.

The lipstick effect is the cleanest macro-level evidence that cosmetics demand is not driven by utility. If the category were utility-driven, it would shrink in recessions like other discretionary categories. Hill et al.'s 2012 paper formalized what Estée Lauder's Leonard Lauder had reportedly observed during the 2001 recession: the category co-moves negatively with consumer sentiment, because the affect-regulation and self-presentation functions of cosmetics matter more, not less, when other affect-regulation channels are constrained.

A Proposed Framework: The Mood Index

The Mood Index is a three-component construct proposed here to make the four-mechanism analysis operationally tractable. It has not been validated against an external mental-health instrument or against a longitudinal customer-outcomes panel; what follows is a framework specification, not a measurement claim. The construct is meant to summarize, at the customer level, the dominant clinical mechanism driving the customer's behavior over a defined trailing window, typically 180 days. The empirical question of how much variance the construct actually recovers, against what comparators, is a research direction rather than a published finding.

The Three Components

The affect score captures the degree to which the customer's purchases pattern as mood-repair events rather than as planned acquisitions. Inputs include: share of baskets created in the 22:00 to 02:00 window relative to the customer's overall hour-of-day distribution, share of baskets that are single-item, share of baskets in which the price tier is materially above the customer's trailing-90-day median, and the rate of return-then-repurchase events. Each input is normalized to a 0-to-1 scale by reference to the platform-level distribution, then averaged with weights informed by the strength of each input's correlation with self-reported retail-therapy behavior in customer-research panels.
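A minimal sketch of that computation follows. It assumes percentile rank against the platform distribution as the 0-to-1 normalization, which is one reasonable reading of "by reference to the platform-level distribution", and it uses placeholder weights. The weights, input names, and function names are hypothetical; the panel-calibrated weights are not published here.

```python
from bisect import bisect_right

def percentile_rank(value: float, platform_values: list[float]) -> float:
    """Fraction of platform customers at or below `value`, in [0, 1]."""
    ordered = sorted(platform_values)
    return bisect_right(ordered, value) / len(ordered)

# Hypothetical weights for illustration only.
AFFECT_WEIGHTS = {
    "late_night_share": 0.35,
    "single_item_share": 0.20,
    "premium_outlier_share": 0.25,
    "return_repurchase_rate": 0.20,
}

def affect_score(customer: dict[str, float], platform: dict[str, list[float]]) -> float:
    """Weighted average of percentile-ranked inputs, in [0, 1]."""
    total = sum(AFFECT_WEIGHTS.values())
    weighted = sum(
        weight * percentile_rank(customer[name], platform[name])
        for name, weight in AFFECT_WEIGHTS.items()
    )
    return weighted / total

# Toy platform with four customers per input; the scored customer sits at 0.2 on each.
platform = {name: [0.0, 0.1, 0.2, 0.4] for name in AFFECT_WEIGHTS}
customer = {name: 0.2 for name in AFFECT_WEIGHTS}
print(round(affect_score(customer, platform), 2))  # 0.75
```

Dividing by the weight total keeps the score on a 0-to-1 scale even if the weights are later re-estimated and no longer sum to one.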

The compulsivity score captures the degree to which the customer's purchases pattern in ways consistent with compulsive buying disorder. Inputs include: trailing 180-day frequency relative to the customer-base distribution, return rate relative to the customer-base distribution, payment-method oscillation count, and the overlap fraction between the customer's return events and that same customer's repurchase events on the same or a substitute SKU. A compulsivity score above the 80th percentile is the trigger that tells the CRM system to suppress aggressive remarketing, not to maximize it.

The identity score captures the degree to which the customer's purchases pattern as extended-self construction. Inputs include: brand share concentration within sub-categories, replenishment timing regularity, the smoothness of the category sequence across replenishment cycles, and the rate of organic (non-promoted) discovery. The identity score, unlike the other two, is the score retailers most want to maximize, because identity-construction customers are the highest-LTV, lowest-CAC, and most-stable cohort.
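Two of the identity inputs can be sketched with the standard library. The specific formulas here are assumptions chosen for illustration: top-brand share as the concentration measure, and one minus the coefficient of variation of inter-purchase gaps as the regularity measure. Other operationalizations (a Herfindahl index, for example) would serve the same purpose.

```python
from collections import Counter
from statistics import mean, pstdev

def top_brand_share(brands: list[str]) -> float:
    """Share of purchases in a sub-category that go to the most-bought brand."""
    if not brands:
        return 0.0
    top_count = Counter(brands).most_common(1)[0][1]
    return top_count / len(brands)

def replenishment_regularity(gaps_days: list[float]) -> float:
    """1.0 for perfectly even gaps between repurchases, falling toward 0.0
    as the gaps become erratic (1 minus coefficient of variation, floored at 0)."""
    if len(gaps_days) < 2:
        return 0.0
    avg = mean(gaps_days)
    if avg <= 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(gaps_days) / avg)

print(top_brand_share(["A", "A", "A", "B"]))        # 0.75
print(replenishment_regularity([30, 30, 30]))       # 1.0
print(round(replenishment_regularity([10, 50]), 2)) # 0.33
```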

A fourth component, a social-comparison score, was experimented with in early formulations but did not stabilize across markets. Social-comparison signals are highly entry-surface dependent, and the same UGC behavior can indicate either healthy peer-driven discovery or unhealthy peer-driven dissatisfaction. The current Mood Index treats social comparison as a moderator that gets applied to the other three scores rather than as a fourth standalone score.

Operationalization

The minimum viable implementation needs three input tables: a basket-level table with timestamps, item count, basket value, and price tier; a returns table with the same identifiers; and an entry-surface table that records how each session reached the product page. Most beauty operators have all three already. The Mood Index then computes per-customer trailing-180-day scores for the three components and refreshes them weekly.
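As a rough illustration of those inputs, the sketch below models the three tables as dataclasses and applies the trailing-180-day filter. All field and function names are hypothetical; real schemas will differ by operator.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Basket:
    basket_id: str
    customer_id: str
    created_at: datetime  # assumed to be local time
    item_count: int
    value: float
    price_tier: str

@dataclass(frozen=True)
class Return:
    return_id: str
    basket_id: str        # links back to the originating basket
    customer_id: str
    returned_at: datetime

@dataclass(frozen=True)
class EntrySurface:
    session_id: str
    basket_id: str
    surface: str          # e.g. "search", "ugc", "email"

def trailing_window(baskets: list[Basket], as_of: datetime, days: int = 180) -> list[Basket]:
    """Baskets created in the `days` days up to and including `as_of`."""
    start = as_of - timedelta(days=days)
    return [b for b in baskets if start < b.created_at <= as_of]

history = [
    Basket("b1", "c1", datetime(2024, 6, 1, 22, 30), 1, 64.0, "premium"),
    Basket("b2", "c1", datetime(2023, 12, 1, 14, 0), 4, 38.0, "mid"),
]
recent = trailing_window(history, as_of=datetime(2024, 7, 1))
print([b.basket_id for b in recent])  # ['b1']
```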

Two practical decisions matter more than they look. The first is the choice of normalization population. Platform-wide normalization makes the scores comparable across customers; cohort-specific normalization (by age, market, or tenure) makes them more accurate within cohorts but less comparable across them. The implementation that has worked best in practice is platform-wide, with cohort-specific decile bins surfaced as additional metadata for the CRM team.
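The platform-wide-plus-cohort-metadata arrangement can be sketched as follows. The example assumes percentile rank as the platform-wide normalization and a 1-to-10 decile bin computed within the customer's own cohort. The function names and data are illustrative.

```python
from bisect import bisect_right

def percentile_rank(value: float, population: list[float]) -> float:
    """Fraction of the population at or below `value`."""
    ordered = sorted(population)
    return bisect_right(ordered, value) / len(ordered)

def decile_bin(value: float, population: list[float]) -> int:
    """Decile (1 to 10) of `value` within the population, via integer ceiling division."""
    ordered = sorted(population)
    at_or_below = bisect_right(ordered, value)
    return max(1, -(-at_or_below * 10 // len(ordered)))

platform_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
cohort_scores = [0.1, 0.2, 0.4, 0.8]  # e.g. customers in one tenure band

score = 0.4
record = {
    "platform_percentile": percentile_rank(score, platform_scores),  # comparable across customers
    "cohort_decile": decile_bin(score, cohort_scores),               # metadata for the CRM team
}
print(record)  # {'platform_percentile': 0.4, 'cohort_decile': 8}
```

The same raw score lands at the 40th percentile platform-wide but in the eighth decile of its cohort, which is exactly the discrepancy the cohort metadata is meant to surface.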

The second is the threshold above which a CRM action is triggered. Mood Index components above the 80th percentile have produced the cleanest behavioral signal in our experience. Above the 90th percentile, the signal is so strong that the CRM logic should default to suppression rather than activation. The 80-to-90 band is the part where the most commercial value sits, because it is large enough to be meaningful and below the threshold where the appropriate response is to step back.
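The banding described above reduces to a small function. This is a sketch of the generic rule only; how the exact boundary values (80 and 90) are treated is an assumption, and the action taken in the middle band depends on which component is being scored.

```python
def crm_band(percentile: float) -> str:
    """Map a component percentile (0 to 100) to a CRM posture."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile > 90:
        return "suppress"    # signal strong enough that the default is to step back
    if percentile > 80:
        return "act"         # the 80-to-90 band, where a component-specific action applies
    return "no_action"

for p in (50, 85, 95):
    print(p, crm_band(p))
# 50 no_action
# 85 act
# 95 suppress
```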

What the Mood Index Does Not Capture

The construct is calibrated for the four mechanisms in cosmetics. It is not a general-purpose mental-health screen. It does not detect body dysmorphic disorder, which is documented in the cosmetics-and-skincare-spending literature but requires different inputs (typically image-based and self-report measures that retailers do not have access to). It does not detect eating disorders, which co-occur with cosmetic compulsion in some populations but require clinical evaluation. It is not a substitute for the Faber and O'Guinn Compulsive Buying Scale in a clinical setting.

It is what it claims to be: a behaviorally grounded summary of the dominant clinical mechanism in a customer's commercial relationship with a beauty retailer. Used as designed, it can improve CRM decisions and customer satisfaction. Used outside its design envelope, it produces false confidence about clinical states it cannot measure.

A Decision Framework for Product and CRM Teams

The Mood Index is only as useful as the operating discipline that surrounds it. The framework below has worked in three advisory engagements as a starting structure for product and CRM teams adopting the construct.

The rollout has seven steps and assumes a beauty retailer with existing RFM segmentation, basket-level event data, and a CRM system that supports differentiated cadence by segment.

Validate input data. Confirm that basket-level timestamps are local (not UTC) and complete to the second. Confirm that returns events have a foreign key to the originating basket. Confirm that entry-surface attribution covers at least 85 percent of sessions. Without these three, the Mood Index will produce noise, not signal.
Compute baseline platform distributions. For each of the inputs to the three component scores, compute the platform-wide distribution. These become the normalization reference for individual customer scores. Refresh quarterly.
Score the existing customer base. Compute trailing-180-day affect, compulsivity, and identity scores for every customer with at least three baskets in the window. Customers below the threshold get assigned a baseline score equal to the platform median.
Cross-tabulate against current segmentation. Map the new scores against the existing RFM cells. The cells where the new scores show high variance are the cells where the existing segmentation is leaving variance on the table.
Define CRM treatment by score, not by RFM. Replace the existing CRM cadence-by-RFM-cell with a cadence-by-Mood-Index-component-score. Compulsivity score above 80th percentile triggers suppression. Affect score above 80th triggers high-touch aftercare. Identity score above 80th triggers loyalty perks.
Hold-out test for 90 days. Run a randomized 80-20 split where the 20 percent control group continues on the existing RFM-based cadence. Track revenue, return rate, and customer-reported satisfaction in both groups during the 90-day test, and keep following both groups out to twelve months. Do not optimize for revenue alone.
Refine the suppression logic. The suppression decisions are where the most ethical and commercial value lives. Calibrate the suppression threshold quarterly based on the joint trajectory of revenue and customer-reported satisfaction. The right threshold is the one where the satisfaction curve continues rising even when the revenue curve flattens.
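Two of the steps above are mechanical enough to sketch: the three-basket eligibility rule with a median fallback (step three) and the 80-20 hold-out assignment (step six). The sketch swaps true randomization for a salted hash of the customer ID, which is deterministic and therefore reproducible across refreshes. The salt and function names are hypothetical, and taking the median over eligible customers only is an assumption.

```python
import hashlib
from statistics import median

MIN_BASKETS = 3  # eligibility threshold from step three

def scores_with_fallback(
    raw_scores: dict[str, float], basket_counts: dict[str, int]
) -> dict[str, float]:
    """Keep scores for eligible customers; assign everyone else the median
    of the eligible scores."""
    eligible = {
        c: s for c, s in raw_scores.items() if basket_counts.get(c, 0) >= MIN_BASKETS
    }
    if not eligible:
        raise ValueError("no customer meets the basket threshold")
    fallback = median(eligible.values())
    return {c: eligible.get(c, fallback) for c in basket_counts}

def holdout_arm(customer_id: str, salt: str = "mood-index-holdout", control_pct: int = 20) -> str:
    """Deterministic pseudo-random split: about `control_pct` percent land in control."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "control" if bucket < control_pct else "treatment"

raw = {"c1": 0.2, "c2": 0.6, "c3": 0.9, "c4": 0.99}
counts = {"c1": 5, "c2": 3, "c3": 4, "c4": 1}
print(scores_with_fallback(raw, counts))
# {'c1': 0.2, 'c2': 0.6, 'c3': 0.9, 'c4': 0.6}
```

Changing the salt reshuffles the assignment, which is useful when a second, independent hold-out is needed for a later test.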

The framework's most counter-intuitive element is the suppression logic in step five. Most CRM teams optimize for activation. The Mood Index forces a posture where some customers receive less marketing rather than more, because their purchase patterns indicate that more marketing is not in their long-term interest. The teams that have implemented this discipline have, in our experience, seen higher customer-reported satisfaction at the cost of slightly lower short-term revenue. They have also reported lower churn and lower complaint rates two years out, which suggests the long-term LTV math comes out ahead.

The Ethical Line

The Mood Index, like any segmentation construct that uses affect-state proxies, lives at the intersection of legitimate personalization and exploitation. The empirical literature does not settle which side a beauty retailer should optimize for. The choice is the company's, and it is made every time the CRM system makes an automated decision about which customer to message at which time with which offer.

Contrary to the Conventional View

Conventional view

A/B-validated CRM lifts are wins for both the company and the customer

What the evidence shows

In longitudinal follow-up of advisory-partner CRM tests across cosmetics operators, a non-trivial share of treatments that produce statistically significant short-term revenue lifts in standard A/B tests show signs that look like compulsivity amplification when followed up over twelve months. The signs include rising return rates within the treated cohort, declining customer-reported satisfaction scores, and elevated cancellation rates in the loyalty program. The signal is not yet quantified with a published, retailer-blinded estimate; the partner samples are small, the follow-up windows are short, and the satisfaction instruments vary across operators. The qualitative pattern is consistent enough that it should be tested rigorously, but a precise headline percentage would over-promise what the data currently supports. The treatments that produce cleaner wins, where revenue rises and satisfaction does not fall, are almost always treatments aimed at the identity-construction segment rather than at the affect-regulation or compulsivity segments. The lift, in those cases, comes from making the loyalist's existing routine slightly easier rather than from inducing additional purchases.

The implication is uncomfortable even without a sharp percentage. If a CRM team optimizes purely for incremental revenue under standard A/B-test methodology, the team risks systematically over-treating the customers who are most behaviorally vulnerable, because those customers respond most reliably to marketing pressure. The same A/B-test discipline that the data-driven marketing literature praises as the antidote to opinion-driven decision-making is, in cosmetics specifically, capable of producing marginal commercial wins that come at the customer's affective expense. Whether that effect explains a quarter of CRM lifts or a half is an open empirical question.

The corrective is not to abandon A/B testing. It is to extend the measurement window beyond the canonical two-week lift period, to include customer-reported satisfaction as a co-equal endpoint with revenue, and to monitor the distributional impact of CRM treatments across Mood Index segments rather than just the average impact. Treatments that win on revenue at the population average but lose on satisfaction in the high-compulsivity tail are not wins. They are revenue extracted from the most behaviorally vulnerable customers, surfaced as growth.
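Monitoring distributional impact can start with something as plain as per-segment differences in means. The sketch below assumes each customer record carries a Mood Index segment label, a test arm, a revenue figure, and a satisfaction score. It flags segments where revenue rises while satisfaction falls. It deliberately omits significance testing, which any real analysis would need, and all names and numbers are illustrative.

```python
from collections import defaultdict
from statistics import mean

Row = tuple[str, str, float, float]  # (segment, arm, revenue, satisfaction)

def segment_effects(rows: list[Row]) -> dict[str, tuple[float, float]]:
    """Per segment: (revenue difference, satisfaction difference), treatment minus control."""
    buckets = defaultdict(lambda: {"treatment": [], "control": []})
    for segment, arm, revenue, satisfaction in rows:
        buckets[segment][arm].append((revenue, satisfaction))
    effects = {}
    for segment, arms in buckets.items():
        if not arms["treatment"] or not arms["control"]:
            continue  # cannot compare without both arms
        rev = mean(r for r, _ in arms["treatment"]) - mean(r for r, _ in arms["control"])
        sat = mean(s for _, s in arms["treatment"]) - mean(s for _, s in arms["control"])
        effects[segment] = (rev, sat)
    return effects

def extractive_segments(effects: dict[str, tuple[float, float]]) -> list[str]:
    """Segments where the treatment wins on revenue but loses on satisfaction."""
    return [seg for seg, (rev, sat) in effects.items() if rev > 0 and sat < 0]

rows = [
    ("high_compulsivity", "treatment", 120, 6),
    ("high_compulsivity", "treatment", 140, 5),
    ("high_compulsivity", "control", 100, 8),
    ("high_compulsivity", "control", 100, 7),
    ("identity", "treatment", 110, 9),
    ("identity", "control", 100, 8),
]
effects = segment_effects(rows)
print(extractive_segments(effects))  # ['high_compulsivity']
```

A population-average readout of the same rows would show a revenue win; only the per-segment view exposes that the win in the high-compulsivity segment is paid for in satisfaction.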

The deeper question is whether the segmentation infrastructure should exist at all. Critics of behavioral marketing argue that any system capable of identifying affect-vulnerable customers is, by construction, capable of exploiting them. The counter-argument, which I find more persuasive, is that the alternative is not no segmentation. It is segmentation that is RFM-based, that does not see the affect dynamics, and that therefore optimizes blindly across both the customers it is helping and the ones it is harming. A clinical-mechanism-aware segmentation is not less ethically charged than an RFM-only segmentation. It is more transparent about the choices being made.

The question retailers actually face is this: now that the Mood Index, or constructs like it, are technically feasible, what is the operating philosophy that determines how they get used. The data does not answer that question. The retailer's leadership does. Most leadership teams have not yet had the conversation, because the analytics infrastructure that would force the conversation has not been deployed. Once it is, the conversation becomes unavoidable.

Key Takeaways

Cosmetics is the consumer e-commerce category where four clinical psychology mechanisms operate at unusually high intensity simultaneously: affect regulation, compulsive buying disorder, extended-self identity construction, and social comparison.
Each mechanism produces a distinct behavioral signature in checkout data. Affect regulation shows up as late-night single-item premium baskets; compulsivity as high-frequency low-control patterns with payment oscillation; identity construction as tight category sequencing and brand-tribe loyalty; social comparison as sub-90-second conversion latency after UGC entry.
Three operating archetypes (the prestige, cross-category, and mid-market formats, exemplified publicly by Sephora, Watsons, and Gratis respectively) reflect three distinct mixes of the four mechanisms and require three different CRM playbooks.
The Mood Index is a proposed three-component framework (affect, compulsivity, identity) for thinking about the variance standard RFM segmentation does not surface. It has not been formally validated; the operational thresholds and weights described here are starting points for testing, not benchmarks.
The longitudinal-follow-up evidence from advisory work suggests, qualitatively, that a meaningful share of A/B-validated cosmetics CRM lifts behave as compulsivity amplifiers when followed past the canonical short-window endpoint. The correction is not to abandon experimentation but to extend the measurement window, add customer-reported satisfaction as a co-equal endpoint, and monitor distributional impact across mechanism-driven segments rather than population averages.