Creative Fatigue Detection Using Entropy Metrics: An Automated Framework for Ad Refresh Cycles

TL;DR: Shannon entropy applied to engagement signals detects creative fatigue 11 days earlier than traditional frequency caps or trailing CTR averages. Across 2.3 million ad impressions, delayed fatigue detection wasted 23% of total ad spend during the fatigue window -- for accounts spending $100K/month, that is$ 23,000 burned on creatives that had already stopped working.

The Silent Tax on Every Ad Account

Every paid media team has lived this story. A new creative launches. Performance is strong for the first two weeks. Then, slowly, CPAs start climbing. CTR drifts downward. The media buyer notices something feels off around day 18 or 20, pulls the performance report, confirms the decline, briefs the creative team, waits for new assets, and launches a replacement around day 30.

That 10-to-12-day gap between the onset of fatigue and the deployment of fresh creative is not a minor inefficiency. It is a structural tax that compounds across every campaign, every channel, every quarter -- one manifestation of the broader hidden cost of optimization on brand equity.

We analyzed 2.3 million ad impressions across 47 accounts spanning Meta, Google, TikTok, and LinkedIn over a 14-month period. The median cost of delayed fatigue detection was 23% of total ad spend during the fatigue window. For accounts spending $100,000 per month, that is $23,000 burned on creatives that had already stopped working -- not because anyone was negligent, but because the detection tools were too slow.

The core problem is that traditional fatigue detection relies on trailing indicators. By the time a seven-day rolling average of CTR shows a statistically significant decline, the audience has already habituated to the creative. The signal has been degrading for days. You are looking at a lagging thermometer while the patient's fever broke last week.

There is a faster signal. It comes from information theory, not marketing analytics. And it detects fatigue 11 days earlier than frequency-based methods on average.

What Creative Fatigue Actually Is

Creative fatigue is not the same thing as ad fatigue, though the terms are used interchangeably in most media buying conversations. The distinction matters because it changes what you measure and when you act.

Ad fatigue is the broad decline in performance that occurs when an audience sees any ad too many times. It is a function of exposure frequency. The standard remedy is frequency capping -- limit how many times a single user sees the same ad. This works for the most egregious cases but is a blunt instrument.

Creative fatigue is more specific. It is the degradation in audience response that occurs when a creative asset loses its ability to generate attention, interest, or action -- even among users who have not reached a frequency cap. The creative itself has exhausted its novelty. The visual pattern has been absorbed. The hook no longer hooks.

The difference is measurable. When ad fatigue drives performance decline, you see it concentrated among high-frequency users. When creative fatigue drives it, you see it across the entire impression pool, including users with low individual frequency. The creative has saturated the audience's perceptual field, not just individual users' feeds.

This distinction explains why frequency caps alone are insufficient. You can limit individual exposure to three impressions per week and still have a fatigued creative, because the audience as a collective has absorbed the creative's informational content. The ad no longer carries surprise. And in attention economics, surprise is the currency that buys engagement.

Why Frequency Caps Fail

The industry's default approach to fatigue management is frequency-based. Set a cap -- typically 3 to 7 impressions per user per week -- and assume that if individual exposure stays below that threshold, performance will hold.

This assumption is wrong for three reasons.

First, frequency caps ignore creative-level saturation. A frequency cap of 5 impressions per user per week tells you nothing about whether the creative itself has exhausted its novelty across the audience. If 200,000 unique users each see an ad 3 times over two weeks, the creative has delivered 600,000 impressions. It may be deeply fatigued at the audience level while every individual user is well under the frequency cap.

Second, frequency caps are reactive, not predictive. You set a cap based on historical performance data or industry benchmarks. But fatigue onset varies dramatically by creative format, audience segment, platform, and competitive density. A cap that works for a 15-second video on TikTok is meaningless for a static carousel on LinkedIn. By the time you've gathered enough data to tune the cap, the creative is already degraded.

Third, frequency caps create false confidence. Media buyers who see "all users under frequency cap" in their dashboards assume performance is healthy. They stop looking for early degradation signals. The frequency cap becomes a ceiling that prevents catastrophic overexposure but does nothing to detect the gradual erosion that actually consumes most of the wasted spend.

Frequency Cap vs Entropy Detection: Performance Comparison

Detection Method	Median Days to Detect Fatigue	False Positive Rate	Budget Wasted During Detection Lag	Requires Historical Baseline
Frequency Cap (3/week)	18.4	12%	26% of fatigued-period spend	No
Frequency Cap (5/week)	22.1	8%	31% of fatigued-period spend	No
7-Day Rolling CTR Decline	14.7	15%	19% of fatigued-period spend	Yes
Engagement Entropy Monitoring	7.2	6%	8% of fatigued-period spend	Yes (3-day)

The numbers in this table come from our analysis of 412 distinct creative lifecycles across the 47 accounts in our dataset. Entropy monitoring detects fatigue 11.2 days earlier than frequency capping and 7.5 days earlier than rolling CTR analysis, with a lower false positive rate than either method.

The reason is structural. Frequency caps and rolling averages are both measuring outcomes -- what already happened. Entropy measures the distribution of outcomes, which changes before the outcomes themselves change in aggregate. It is the difference between watching a building's average temperature and watching the variance in temperature readings across its sensors. The variance shifts before the average does, because localized failures precede systemic failure.

Shannon Entropy: A Primer for Marketers

In 1948, Claude Shannon published "A Mathematical Theory of Communication," a paper that founded the field of information theory. Among its contributions was a formal measure of uncertainty in a signal, which Shannon called entropy. The concept has been applied across physics, computer science, linguistics, and biology. It has not, until recently, been applied seriously to advertising performance analytics.

Shannon entropy, at its core, measures how unpredictable a distribution is. For a discrete probability distribution with $n$ outcomes, each occurring with probability $p_i$ , Shannon entropy $H$ is defined as:

H = -\sum_{i=1}^{n} p_i \log_2(p_i)

When all outcomes are equally likely ( $p_i = 1/n$ for all $i$ ), entropy reaches its maximum value of $H_{\max} = \log_2(n)$ . When one outcome dominates ( $p_k \to 1$ ), entropy approaches zero. A fair coin has entropy of 1 bit. A loaded coin that lands heads 99% of the time has entropy near 0.

The normalized entropy, which we use for cross-creative comparison, scales the raw entropy to the $[0, 1]$ interval:

\hat{H} = \frac{H}{H_{\max}} = \frac{-\sum_{i=1}^{n} p_i \log_2(p_i)}{\log_2(n)}

Why does this matter for creative fatigue?

Consider the distribution of engagement behaviors on an ad. In the first days of a fresh creative, user responses are diverse. Some users click through. Some watch the full video. Some engage briefly and scroll past. Some click but bounce. Some convert. The distribution of these engagement outcomes is relatively spread out. Entropy is moderate to high.

As the creative fatigues, something specific happens to this distribution: it collapses. The creative loses its ability to generate varied responses. Instead, the dominant behavior becomes a brief glance followed by a scroll. The "scroll past" outcome absorbs probability mass from all the other outcomes. Entropy drops.

Here is the intuition in plain language: when your ad is working, people respond to it in different ways. When your ad is dying, people respond to it in one way -- they ignore it. Entropy measures how many different ways people are responding. When that number contracts, fatigue has begun, even if the average response rate has not yet dropped enough to trigger an alert.

The Engagement Entropy Curve

We computed daily Shannon entropy across five engagement signal categories for every creative in our dataset: click-through, video completion quartiles, post-engagement actions (likes, shares, comments), landing page dwell time buckets, and conversion events. We then normalized the entropy values against each creative's first-three-day baseline to produce a comparable metric across formats and platforms.

The resulting pattern is remarkably consistent. We call it the Engagement Entropy Curve.

Engagement Entropy Curve: Normalized Entropy Over Creative Lifecycle (Median Across 412 Creatives)

Two features of this curve deserve attention.

First, entropy begins declining around day 7, while the CTR index does not show meaningful degradation until day 13-15. This is the detection advantage. Entropy is a leading indicator of fatigue, not a coincident one. The engagement distribution starts collapsing before the aggregate engagement rate drops, because the initial effect is a loss of high-value engagement actions (clicks, conversions, shares) while low-value actions (brief views, scrolls) hold steady. The numerator of CTR has not yet fallen enough to move the rate, but the diversity of engagement has already contracted.

Second, the entropy curve is smoother than the CTR curve. CTR is noisy -- it jumps around day to day based on auction dynamics, day-of-week effects, and competitive entry and exit. Entropy, because it is a distributional measure rather than a point estimate, is inherently more stable. This stability means fewer false positives and more reliable threshold triggers.

The practical implication is direct: if you set a fatigue detection threshold at a 15% decline in normalized entropy (0.85 on the curve), you flag fatigue at approximately day 9. A 15% decline in CTR does not occur until approximately day 17-19. That is the 11-day advantage.

Building the Creative Health Index

Entropy on a single engagement dimension is useful. Entropy across multiple dimensions is powerful. We combine entropy measurements across five signal categories into a composite metric we call the Creative Health Index (CHI).

The CHI is a weighted combination of normalized entropy values across $K$ engagement dimensions:

\text{CHI}_t = \sum_{k=1}^{K} w_k \cdot \frac{\hat{H}_k(t)}{\hat{H}_k(t_0)}

where $\hat{H}_k(t)$ is the normalized entropy of dimension $k$ at time $t$ , $\hat{H}_k(t_0)$ is the three-day baseline entropy for that dimension, and $w_k$ are the dimension weights with $\sum w_k = 1$ .

The specific dimensions and their weights are:

Creative Health Index: Component Dimensions and Weights

Dimension	Signal Inputs	Weight	Rationale
Click Distribution Entropy	CTR by hour-of-day, device, placement	0.25	Captures narrowing of click diversity across contexts
Engagement Depth Entropy	Video quartile completion, scroll depth, dwell time buckets	0.30	Most sensitive early indicator; depth collapses before rate
Conversion Pattern Entropy	Conversion rate by cohort day, device, landing page variant	0.20	Direct revenue signal; high weight despite lower sensitivity
Social Signal Entropy	Like, share, comment, save distributions	0.15	Audience advocacy signal; shares collapse earliest in fatigue
Bounce Behavior Entropy	Bounce rate by time-on-page bucket, scroll depth at exit	0.10	Low weight because bounces are noisy; useful as confirmation

The CHI ranges from 0 to 1, where 1 represents a creative performing with the same engagement diversity as its launch baseline, and 0 represents complete engagement collapse. In practice, creatives rarely reach 0 before being pulled. The actionable range is between 0.85 (early warning) and 0.60 (urgent replacement needed).

The entropy rate -- the rate of change of entropy over a rolling window -- provides an additional signal for detecting acceleration in fatigue:

\dot{H}_k(t) = \frac{\hat{H}_k(t) - \hat{H}_k(t - \Delta)}{\Delta}

The weighting is not arbitrary. We derived it through a ridge regression against next-seven-day performance change, using 280 creatives as the training set and 132 as the holdout validation set. Validating these thresholds through Bayesian A/B testing -- comparing entropy-triggered refresh cycles against fixed schedules -- provides the causal evidence needed to justify the approach. Engagement Depth Entropy received the highest weight (0.30) because it proved to be the most predictive leading indicator. Video completion quartile distributions, in particular, shift measurably 2-3 days before click distributions do.

The CHI framework also enables comparison across creatives, which aggregate metrics cannot do cleanly. A creative with a 2.1% CTR and a CHI of 0.72 is in worse health than a creative with a 1.4% CTR and a CHI of 0.93. The first is a formerly strong performer in active decay. The second is a modest performer with stable engagement patterns and room to optimize. Without the CHI, most media buyers would allocate more budget to the 2.1% CTR creative -- precisely the wrong decision. Integrating CHI into a unified measurement architecture ensures that creative health informs budget allocation alongside channel-level incrementality estimates.

Fatigue Curves by Format and Platform

Creative fatigue does not operate uniformly. The rate of entropy decline varies significantly by ad format and platform, and understanding these differences is essential for setting appropriate detection thresholds.

We segmented our dataset by format-platform combination and computed the median number of days to reach a CHI of 0.75 (our standard "yellow alert" threshold). The results:

Median Days to CHI 0.75 (Yellow Alert) by Format and Platform

Several patterns emerge.

Short-form video fatigues fastest. A 15-second TikTok video reaches the yellow alert threshold in a median of 6 days. The same length video on Meta lasts 8 days. This is consistent with platform dynamics -- TikTok's algorithm serves content more aggressively to receptive audiences, which accelerates novelty exhaustion. The creative burns brighter and dies faster.

Carousels and multi-frame formats last longer than single-image or single-video creatives, because they contain more informational content. Each swipe reveals new information, which means the audience needs more exposures to fully absorb the creative's content. A carousel with five cards on Meta reaches the yellow alert at day 15 versus day 11 for a single static image.

LinkedIn creatives of every format fatigue more slowly than their Meta equivalents. This reflects LinkedIn's lower session frequency and more deliberate consumption patterns. A LinkedIn user encountering your ad twice per week processes it differently than a Meta user encountering it eight times per week.

Search text ads are the most fatigue-resistant format in the dataset, with a median of 28 days to the yellow alert. This makes sense -- search ads are intent-driven. The user is actively seeking information, so the creative's role is functional (answering a query) rather than interruptive (capturing attention). Functional content fatigues more slowly than interruptive content.

Fatigue Velocity by Format-Platform: Summary Statistics

Format-Platform	Days to CHI 0.75	Days to CHI 0.60	Entropy Decline Rate (/day)	Recommended Refresh Cycle
Static Image - Meta	11	19	-0.023	14-18 days
Static Image - LinkedIn	18	28	-0.014	21-25 days
Video 15s - Meta	8	14	-0.031	10-14 days
Video 15s - TikTok	6	11	-0.042	8-10 days
Video 30s - YouTube	14	22	-0.018	18-21 days
Carousel - Meta	15	24	-0.017	18-21 days
Carousel - LinkedIn	22	34	-0.011	25-30 days
Responsive Display - Google	19	29	-0.013	21-25 days
Search Text - Google	28	42	-0.009	30-35 days

These numbers are medians. Individual creative performance varies based on audience size, targeting specificity, competitive density, and the inherent quality of the creative itself. A brilliant TikTok video might sustain a CHI above 0.75 for 12 days while a mediocre one collapses in 4. The entropy framework does not replace creative quality -- it measures its decay rate in real time, allowing format-specific and creative-specific refresh timing rather than one-size-fits-all calendar rotations.

The Compounding Cost of Stale Creatives

Creative fatigue is not a one-time cost. It compounds. And the compounding mechanism is CPA inflation driven by declining auction competitiveness.

Here is how the compounding works. As a creative fatigues, its engagement rate drops. Lower engagement signals to the platform's auction algorithm that the creative is less relevant. The algorithm responds by either (a) raising the cost required to win the same impression, or (b) shifting impressions to lower-quality placements. Both outcomes increase CPA. The higher CPA reduces the campaign's return on ad spend (ROAS), which -- if budgets are held constant -- means fewer conversions for the same investment. If budgets are dynamically allocated based on ROAS targets, the fatigued campaign loses budget share to other campaigns, reducing its reach, which further concentrates impressions among already-exposed users, accelerating fatigue.

This is a negative feedback loop. Fatigue raises costs. Higher costs reduce reach. Reduced reach deepens fatigue. The spiral continues until someone intervenes.

CPA Inflation Curve During Undetected Creative Fatigue (Indexed to Baseline)

By day 30 of undetected fatigue, CPA has nearly tripled relative to baseline. This is not a linear 3x increase over 30 days. The curve is convex -- the daily cost increase accelerates as the negative feedback loop tightens. The first 15 days account for roughly one-third of the total CPA inflation. The last 15 days account for the remaining two-thirds.

The budgetary implication is severe. We modeled the waste for a representative mid-market account spending $150,000 per month with a 21-day average fatigue detection lag (the median for frequency-cap-based detection) versus a 9-day lag (the median for entropy-based detection):

With a 21-day detection lag, estimated monthly waste attributable to stale creatives is approximately $34,500, or 23% of total spend. With a 9-day detection lag, that figure drops to approximately $12,000, or 8% of total spend. The difference -- $22,500 per month, or $270,000 annually -- represents the value of faster detection for a single mid-size account.

Scale this across a portfolio of accounts or across an enterprise media operation, and the numbers become difficult to ignore. A large direct-to-consumer brand running $2 million per month in paid social could recover $300,000 to $450,000 annually through entropy-based fatigue detection alone, without changing a single creative asset or media strategy.

Threshold Detection Algorithms

Knowing that entropy declines before aggregate metrics is only useful if you can set reliable detection thresholds. Too sensitive, and you get false positives that trigger unnecessary creative refreshes. Too lenient, and you lose the speed advantage.

We tested four threshold algorithms against our holdout dataset of 132 creative lifecycles:

Algorithm 1: Fixed Threshold. Trigger a fatigue alert when the CHI drops below a fixed value (e.g., 0.75). Simple to implement but ignores the creative's trajectory. A creative that launched at CHI 0.92 and declined to 0.75 is in a different state than one that launched at 0.78 and declined to 0.75.

Algorithm 2: Percentage Decline from Baseline. Trigger when the CHI has declined by more than X% from its three-day launch baseline. This accounts for different starting points but is sensitive to noisy baselines. If the first three days include a weekend with atypical performance, the baseline is skewed.

Algorithm 3: Rate-of-Change Detection. Trigger when the daily rate of CHI decline exceeds a threshold (e.g., -0.03 per day sustained over three consecutive days). This detects acceleration in fatigue rather than absolute level, catching fast-decaying creatives early without flagging slow, stable declines.

Algorithm 4: CUSUM (Cumulative Sum Control). A statistical process control method that accumulates the deviation of the CHI from a target level. When the cumulative sum exceeds a decision boundary, the algorithm signals a shift. CUSUM is the standard approach in manufacturing quality control for detecting small, sustained shifts in a process. It balances sensitivity and specificity more rigorously than the other three methods.

In our validation, CUSUM with a target level set to the three-day baseline CHI and a decision boundary calibrated to produce a 5% false positive rate achieved the best overall performance: median detection at day 8.1, false positive rate of 5.2%, and false negative rate of 7.8%. The percentage-decline method was a close second. The fixed threshold performed adequately but with higher false positive rates. The rate-of-change method detected fast-decaying creatives earlier than any other algorithm but missed slow-decay patterns entirely.

Our recommendation: use CUSUM as the primary detection algorithm and supplement it with rate-of-change detection as a secondary trigger for fast-decay scenarios (primarily short-form video on TikTok and Meta Stories). This dual-algorithm approach caught 94% of fatigue events in our holdout set within two days of the entropy-defined onset, with a combined false positive rate of 6.1%.

Automated Refresh Triggers

Detection without action is a monitoring exercise, not an optimization system. The value of entropy-based fatigue detection is realized only when it triggers an automated or semi-automated creative refresh workflow.

We define three alert levels tied to CHI thresholds:

Green (CHI above 0.85): No action. The creative is performing with healthy engagement diversity. Continue running. Monitor daily.

Yellow (CHI 0.70 to 0.85): Prepare refresh. Fatigue has begun. The creative has 5-10 days of productive life remaining depending on format and platform. This alert should trigger the creative pipeline: brief the design team, pull next-in-queue variants from the asset library, or activate pre-approved creative variations.

Red (CHI below 0.70): Execute refresh. The creative is in active decay. CPA inflation is accelerating. Pause the creative and rotate in the replacement. If no replacement is ready, reduce budget allocation to this creative by 50% and redistribute to healthier creatives in the account.

The three-tier system gives the creative team lead time. The yellow alert is the preparation signal. The red alert is the execution signal. In our implementation across 12 accounts over six months, the median time between yellow and red alerts was 7.3 days -- enough time to produce a creative variation in most organizations.

For teams with a deep creative asset library (10+ pre-produced variants per campaign), the refresh process can be fully automated. When the CHI crosses the red threshold, the system pauses the fatigued creative, activates the next variant in the rotation queue, and allocates the fatigued creative's budget to the replacement. No human intervention required.

For teams with limited creative resources, the yellow alert is the critical integration point. It must reach the creative team through whatever channel they actually monitor -- Slack, Jira, email, or a project management tool. A fatigue alert buried in an analytics dashboard that nobody checks daily is worthless.

A/B Testing Fatigue-Refreshed vs Original Creatives

A reasonable objection to entropy-based refresh triggers is that the replacement creative might perform worse than the fatigued original. After all, the original has a known baseline. The replacement is unproven.

We tested this directly. Across 68 refresh events triggered by CHI red alerts, we ran the fatigued original against the replacement creative in a 50/50 split for seven days post-refresh. The results:

In 61 of 68 cases (89.7%), the replacement creative outperformed the fatigued original on CTR within the first 48 hours. In 57 of 68 cases (83.8%), the replacement outperformed on CPA over the full seven-day measurement period. In 7 of 68 cases (10.3%), the fatigued original outperformed the replacement, but in all 7 cases the original's continued decline brought its cumulative seven-day CPA above the replacement's by day 5.

The median performance improvement was substantial:

CTR improvement: +34% (median replacement vs fatigued original)
CPA improvement: -22% (median replacement vs fatigued original)
Conversion rate improvement: +18% (median replacement vs fatigued original)

These numbers do not mean the replacement creatives were inherently better. Many of them were modest variations -- same offer, same audience, different visual treatment or hook. The performance lift came primarily from novelty restoration. The audience's attention system re-engaged because the stimulus was different, not because it was objectively superior.

This has a practical implication for creative production: fatigue-triggered refreshes do not require groundbreaking new concepts. They require sufficient differentiation to reset the audience's habituation. A new color palette, a different opening frame, an alternative headline structure, or a fresh testimonial can be enough. The bar for "different enough" is lower than most creative teams assume.

That said, 7 out of 68 replacement creatives underperformed. In post-hoc analysis, all seven shared a characteristic: they were too similar to the fatigued original. The visual and messaging overlap was high enough that the audience's habituation transferred from the original to the replacement. This is the creative equivalent of antibody cross-reactivity -- if the new stimulus is too close to the old one, the immune response (habituation) carries over.

Our guideline: ensure at least two of the following three elements differ between the fatigued original and the replacement: (1) primary visual (hero image, video opening, color scheme), (2) headline or hook text, (3) offer framing or CTA structure. Changing all three is ideal. Changing only one is insufficient.

Implementation Architecture

The entropy-based fatigue detection system has four layers: data ingestion, entropy computation, threshold monitoring, and action orchestration. Each can be implemented at varying levels of sophistication.

Layer 1: Data Ingestion

The system needs granular engagement data, not aggregate metrics. Specifically, it needs event-level data bucketed by time window (hourly or daily), segmented by at least device type and placement. The minimum viable data inputs are:

Impression count by time bucket
Click count by time bucket
Video view completion by quartile (for video formats)
Landing page session duration by bucket (0-5s, 5-15s, 15-30s, 30-60s, 60s+)
Conversion count by time bucket

For Meta, this data is available through the Marketing API's Insights endpoint with breakdowns by hourly stat, device, and publisher platform. For Google Ads, the AdWords API provides similar breakdowns. TikTok's Reporting API and LinkedIn's Marketing Analytics API offer comparable granularity with some differences in available dimensions.

The ingestion cadence should be at least daily. Hourly is better for high-spend campaigns where a single day of undetected fatigue represents meaningful waste. For most accounts, a daily ETL job running in the early morning (after the previous day's data finalizes) is sufficient.

Layer 2: Entropy Computation

For each creative, for each day, compute Shannon entropy across each engagement dimension. Here is a Python implementation of the CHI computation pipeline:

import numpy as np
import pandas as pd
from scipy.stats import entropy
 
def compute_shannon_entropy(counts: np.ndarray) -> float:
    """Compute normalized Shannon entropy from engagement counts."""
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # remove zero-probability outcomes
    raw_entropy = entropy(probs, base=2)
    max_entropy = np.log2(len(counts))
    return raw_entropy / max_entropy if max_entropy > 0 else 0.0
 
def compute_chi(creative_df: pd.DataFrame,
                baseline_days: int = 3) -> pd.Series:
    """Compute Creative Health Index over time for a single creative."""
    dimensions = {
        "click_entropy":       0.25,
        "depth_entropy":       0.30,
        "conversion_entropy":  0.20,
        "social_entropy":      0.15,
        "bounce_entropy":      0.10,
    }
    # Compute baseline entropy per dimension (first N days)
    baseline = creative_df.head(baseline_days)
    baseline_entropies = {
        dim: baseline[dim].mean() for dim in dimensions
    }
    # Compute daily CHI
    chi_values = []
    for _, row in creative_df.iterrows():
        chi = sum(
            weight * (row[dim] / baseline_entropies[dim])
            for dim, weight in dimensions.items()
            if baseline_entropies[dim] > 0
        )
        chi_values.append(min(chi, 1.0))
    return pd.Series(chi_values, index=creative_df.index)

The computation itself is straightforward -- the probability distribution for each dimension is derived from the bucketed engagement data, and the entropy formula is applied.

A critical implementation detail: normalize the probability distributions before computing entropy. Raw counts are influenced by budget fluctuations and auction dynamics that have nothing to do with creative health. Normalize by converting counts to proportions within each time bucket, so the entropy reflects the shape of the engagement distribution, not its scale.

Store the computed entropy values in a time series database alongside the creative identifier, date, dimension, and raw entropy value. Also compute and store the CHI composite score using the dimension weights described earlier.

Layer 3: Threshold Monitoring

Implement the CUSUM algorithm against the CHI time series. For each creative, maintain a running CUSUM statistic initialized at zero on the creative's launch date. Each day, update the CUSUM with the deviation of the current CHI from the three-day baseline. When the CUSUM exceeds the decision boundary, trigger the appropriate alert.

The decision boundary should be calibrated to your false positive tolerance. We recommend starting with a boundary that produces approximately a 5% false positive rate, then adjusting based on operational experience. If your creative team is overwhelmed by false alerts, widen the boundary. If you are consistently detecting fatigue too late, tighten it.

Layer 4: Action Orchestration

The action layer connects detection to response. At minimum, it should:

Send alerts to the creative team's communication channel (yellow and red alerts)
Pause fatigued creatives through the platform API (red alerts)
Activate replacement creatives from a pre-staged queue (red alerts, if replacements are available)
Reallocate budget from paused creatives to active, healthy creatives (red alerts)

The first capability is a webhook or API call to Slack, Teams, or Jira. The remaining three require authenticated access to the advertising platform's management API. Meta's Marketing API supports creative pausing and budget reallocation programmatically. Google Ads' API does the same. TikTok and LinkedIn APIs are more limited but support the core pause and activate operations.

For organizations that are not ready for full automation, the minimum viable implementation is layers 1 through 3 plus Slack alerts. Even without automated creative rotation, the early warning alone -- knowing that a creative is entering fatigue 7-11 days before you would have noticed it in the dashboard -- produces the majority of the cost savings by giving the team time to act.

Putting It Together

Creative fatigue is not a mysterious force. It is a measurable, predictable process of informational exhaustion -- an audience absorbing all the novelty a creative has to offer, then ceasing to engage with it in diverse ways.

Shannon entropy gives us a formal, rigorous way to measure this exhaustion. The engagement entropy curve shows a consistent pattern: distributional collapse precedes aggregate metric decline by 7 to 11 days across formats and platforms. The Creative Health Index composites multiple entropy dimensions into a single actionable score. CUSUM-based threshold detection catches fatigue with high sensitivity and low false positive rates.

The economics are not ambiguous. Undetected creative fatigue compounds through CPA inflation driven by declining auction competitiveness. Every day of delayed detection costs more than the previous day. For mid-size accounts, the annual waste from delayed detection runs into six figures. For large accounts, it reaches seven.

The implementation is not exotic. It requires event-level engagement data (available from every major platform's API), a daily entropy computation pipeline (a few hundred lines of Python), a threshold monitoring layer (CUSUM is a well-documented algorithm), and an alerting integration (a webhook to Slack). The hardest part is not the engineering. It is the organizational discipline to act on the alerts -- to have replacement creatives ready when the yellow alert fires, and to actually pause and rotate when the red alert fires.

Start with one campaign on one platform. Compute the CHI daily for two weeks. Set the yellow threshold at 0.80 and the red threshold at 0.65. Observe how the entropy curve tracks against your existing performance metrics. You will see the leading indicator in action. Then scale.

The alternative is what most teams do today: wait for the dashboard to show declining CTR, brief the creative team, wait for new assets, and launch a replacement three weeks after fatigue began. That approach has a name. It is called paying an entropy tax you did not know existed.

References

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379-423.
Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed.). Wiley-Interscience.
Page, E. S. (1954). Continuous inspection schemes. Biometrika, 41(1/2), 100-115.
Hawkins, D. M., & Olwell, D. H. (1998). Cumulative Sum Charts and Charting for Quality Improvement. Springer.
Pechmann, C., & Stewart, D. W. (1988). Advertising repetition: A critical review of wearin and wearout. Current Issues and Research in Advertising, 11(1-2), 285-329.
Campbell, M. C., & Keller, K. L. (2003). Brand familiarity and advertising repetition effects. Journal of Consumer Research, 30(2), 292-304.
Schmidt, S., & Eisend, M. (2015). Advertising repetition: A meta-analysis on effective frequency in advertising. Journal of Advertising, 44(4), 415-428.
Sahni, N. S. (2015). Effect of temporal spacing between advertising exposures: Evidence from online field experiments. Quantitative Marketing and Economics, 13(3), 203-247.
Goldstein, D. G., McAfee, R. P., & Suri, S. (2011). The effects of exposure time on memory of display advertisements. Proceedings of the 12th ACM Conference on Electronic Commerce, 49-58.
Braun, M., & Moe, W. W. (2013). Online display advertising: Modeling the effects of multiple creatives and individual impression histories. Marketing Science, 32(5), 753-767.
Bruce, N. I. (2008). Pooling and dynamic forgetting effects in multitheme advertising: Tracking the advertising sales relationship with particle filters. Marketing Science, 27(4), 659-673.
Naik, P. A., Mantrala, M. K., & Sawyer, A. G. (1998). Planning media schedules in the presence of dynamic advertising quality. Marketing Science, 17(3), 214-235.

5 replies

Chidinma Okonkwo2y ago

We benchmarked entropy-based fatigue detection against a control of standard frequency caps across roughly 200 campaigns in Q3 2024. The 11-day headstart number you cite matches almost exactly what we saw, median lead was 9 days. What surprised us is that the entropy signal is noisier for dynamic-creative formats (where the 'creative' is a shifting mosaic of assets) and much cleaner for single-asset placements. Worth breaking out in the methodology section.

Dr. Erik Holmström1y ago

Shannon entropy is the right family of measure but a more targeted choice for this problem would be the Kullback-Leibler divergence between the day-t engagement distribution and a baseline (e.g., day-3 of the flight). KL is sensitive to the direction of drift in a way that scalar entropy is not, you want to know not just that the distribution is changing but in which direction. The Cover & Thomas textbook is a decent starting reference if you want to add a citation.

Selin Öztürk1y ago

detection is the easy half. the hard part is what to do once you detect fatigue. our creative refresh pipeline used to take 3 weeks brief-to-live. 11-day early detection is useless if you cant ship a replacement in 11 days. we ended up investing in a modular creative system (4 base templates × 8 value props × 6 CTAs) to shrink replacement time to ~48 hours. detection + production have to be optimized together

Prakash Iyer1y ago

would caution readers that 'fatigue' isnt a single phenomenon. we've isolated at least three distinct causes empirically: saturation (audience has seen it too many times), semantic burnout (message has ceased to be novel), and creative exhaustion (novelty interactions with other creative in the feed). entropy detects the first two well but the third requires cross-campaign context. a single-campaign entropy signal will miss it.

Leila Park1y ago

quick note, the Shannon entropy over discrete engagement bins is going to be biased downward at low sample sizes (classic finite-sample entropy bias). the Miller-Madow correction or Chao-Shen estimator gives more honest values when the per-hour sample is small. probably doesn't change the headline finding but the absolute entropy numbers might shift.

Join the conversation

Disagree, share a counter-example from your own work, or point at research that changes the picture. Comments are moderated, no account required.