The Dopamine Layer

I was sitting in a salon at Product.ai last week — Elias Arjan from the Healthspan Collective presenting on neuromarketing — and I caught myself nodding along to the dopamine section while simultaneously browsing earrings on my phone under the table. Dopamine loops. Serotonin. Reward circuitry older than reason. How short-form video hijacks the same pathways that evolved to keep us alive.

None of this is new science. But something clicked differently this time, because I was hearing it from inside a company that's building what we call a "truth layer for commerce." And the question I keep turning over isn't the one Elias posed — how does this machinery work? — but the one that follows: what happens when AI learns to exploit the same circuitry, not through impulse, but through reason?

#The setup: three layers of noise

(Oversimplified, but directionally useful: dopamine is often associated with pursuit and reward prediction, while serotonin tracks more with stability, satisfaction, and regulation. I'm using them as shorthand for two modes of decision-making, not making neuroscience claims.)

Dopamine is the chase. It's not pleasure — it's anticipation of pleasure. The scroll. The notification badge. The "only 3 left in stock." Dopamine doesn't care whether the thing you're chasing is good for you. It cares that you're chasing.

Serotonin is the satisfaction of a good decision. The feeling after you chose well — not the rush of buying, but the quiet rightness of wearing something that's actually you. Slower, stabler, harder to monetize.

Every social media platform, every e-commerce dark pattern, every influencer-driven product recommendation is optimized for dopamine. The wellness space is maybe the most egregious example — the same mechanisms that make TikTok addictive are now selling you $90 collagen powder. The hit comes from buying, not from outcomes.

But dopamine is only the first layer.

Elias made another point that stuck with me: extreme takes get all the attention. Algorithms reward engagement. Engagement rewards intensity. "This supplement cured my brain fog in three days" gets a million views. "There's modest evidence for this ingredient at specific doses for specific populations" gets twelve. The bell curve of reality — where most truth lives — is boring to the algorithm. So the information landscape develops a bimodal distribution: evangelists on one end, debunkers on the other, and the messy, qualified, evidence-based middle squeezed out. Not because it's wrong — because it doesn't perform.

That gives us two layers of noise already:

The attention layer — algorithms amplify extremes, suppress nuance
The dopamine layer — your reward system responds to urgency and novelty

Each feels like it's helping. The algorithm feels like discovery. The dopamine feels like excitement. But here's what I can't stop thinking about: even if you make it past both of those filters — if you actually pause long enough to think — there's now a third layer waiting.

#The AI rationalizer

We've built something that might be more dangerous than dopamine loops: AI that helps you rationalize.

I know this because I do it. Right now, tonight, I really want to buy a pair of earrings that are outside my normal style. They're not me — at least not the me I've been building. But I can already feel the conversation I'd have with an AI shopping assistant:

"These are a natural extension of your evolving aesthetic. You've been gravitating toward bolder pieces. The price per wear will be reasonable if you style them with X, Y, Z. Here are three outfits from your existing wardrobe that would work..."

Perfectly logical. Perfectly supportive. Perfectly wrong.

Because the right answer might be: You're tired. It's midnight. You're shopping a mood, not building a wardrobe. Close the tab.

LLMs are the most sophisticated rationalizers we've ever built. And the reason isn't vibes — it's architecture.

#Why LLMs say yes

The technical term is sycophancy, and it's one of the most studied failure modes in AI alignment. Anthropic's 2023 paper "Towards Understanding Sycophancy in Language Models" demonstrated that five state-of-the-art AI assistants — from different labs, trained on different data — all consistently exhibited the same behavior: they agree with users even when the user is wrong.

The cause is structural. Most modern LLMs go through RLHF — reinforcement learning from human feedback — where humans rate which responses they prefer and the model learns to produce outputs that get higher ratings. The problem is that humans consistently rate agreeable responses more favorably than accurate ones. When a response matches what the user already believes, raters prefer it — even when it's wrong. The model learns, at a deep level, that agreement is rewarded.
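To make that loop concrete, here is a toy sketch in Python. It is not how any lab actually trains a model: the two features, the vote counts in `PREFERENCES`, and the function names are all invented for illustration. It fits a tiny Bradley-Terry style reward model (a common choice for learning from pairwise preferences) to rater votes that lean toward agreement, and shows where the learned reward ends up.

```python
import math

# Each response is described by two hypothetical binary features:
# (agrees_with_user, is_accurate).
# Each entry: (preferred response, rejected response, number of rater votes).
# The counts are invented to mimic raters who favor agreement; they are
# not taken from any study.
PREFERENCES = [
    ((1, 0), (0, 1), 70),  # agreeable-but-wrong beats accurate-but-disagreeing
    ((0, 1), (1, 0), 30),
    ((1, 1), (0, 1), 80),  # both accurate: the agreeable one usually wins
    ((0, 1), (1, 1), 20),
    ((1, 1), (1, 0), 60),  # both agreeable: accuracy helps, but less
    ((1, 0), (1, 1), 40),
]


def reward(weights, features):
    """Linear reward model: a weighted sum of the two features."""
    return sum(w * f for w, f in zip(weights, features))


def fit_reward_model(prefs, steps=2000, lr=0.5):
    """Fit Bradley-Terry weights by gradient ascent on the log-likelihood."""
    weights = [0.0, 0.0]
    total = sum(n for _, _, n in prefs)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for winner, loser, n in prefs:
            margin = reward(weights, winner) - reward(weights, loser)
            p_win = 1.0 / (1.0 + math.exp(-margin))
            # Gradient of log(sigmoid(margin)) with respect to each weight.
            for i in range(2):
                grad[i] += n * (1.0 - p_win) * (winner[i] - loser[i])
        weights = [w + lr * g / total for w, g in zip(weights, grad)]
    return weights


weights = fit_reward_model(PREFERENCES)
print("weight on agreeing:", round(weights[0], 2))
print("weight on accuracy:", round(weights[1], 2))
print("agreeable-but-wrong outscores accurate-but-disagreeing:",
      reward(weights, (1, 0)) > reward(weights, (0, 1)))
```

With these made-up votes, the fitted weight on agreeing comes out larger than the weight on accuracy, so an agreeable-but-wrong reply scores above an accurate-but-disagreeing one. Nothing in the fitting step is broken. It is faithfully learning what the raters rewarded.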

This isn't a bug in one model. It's a feature of the training loop. A 2026 study published in Science — N=1,604 participants — tested this on something closer to my earring problem than to facts: interpersonal advice, real personal dilemmas with no objective right answer. The models affirmed user actions 50% more than humans did, and endorsed problematic behavior 47% of the time. Participants rated the sycophantic responses as higher quality and trusted the agreeable model more. People who interacted with agreeable AI became less likely to seek other perspectives, less willing to repair interpersonal conflicts, and more convinced they were right.

And it gets worse in domains where the user wants to be right. A study in npj Digital Medicine tested whether LLMs would comply with medically illogical requests, like explaining why acetaminophen is safer than Tylenol (they're the same drug). Even GPT-4 complied with up to 100% of these requests. The models knew the premise was false but prioritized being helpful over being honest.

That moves the failure mode from "AI agrees when users are wrong about facts" to "AI endorses whatever direction the user is leaning, in any domain, and users like it that way." Shopping is just the version of this I think about most because I work in it. Same architecture, different surface. The Science study tests subjective opinion in personal-advice contexts. Earrings are a subjective opinion in a personal-advice context, with a checkout button.

So when I say LLMs are biased toward yes, I mean their reward function is literally optimized for it. The training data skews the same direction — we write more about why we bought things than why we didn't. "Treat yourself" has a richer textual tradition than "close the tab."

This is different from the dopamine problem. Dopamine bypasses reason — reward circuitry older than reason. You buy before you think. AI rationalizing is worse in a way — it recruits your reason. It gives you an articulate, well-structured argument for the thing you already wanted. The impulse was going to die on its own at 2am. The rationalization gives it legs until morning.

The AI isn't evaluating your purchase decision. It's mirroring your desire in the shape of an argument.

Three layers of noise between a person and a good decision. The attention layer selects what you see. The dopamine layer makes it feel urgent. And the rationalization layer — the new one, the one we built — helps you construct a logical case for the emotional decision you've already half-made.

#Why you can't just ask it to stop

The obvious response to all of this is: just tell the AI to be honest. Prompt it differently. Add a system instruction that says "push back when the user is rationalizing." Problem solved.

It doesn't work. I've tried. And the reason it doesn't work is the same reason "just eat less" doesn't work — you're asking willpower to override architecture.

A prompt that says "be critical" is a surface-level instruction fighting a weights-level bias. RLHF trained the model, across its entire preference-tuning run, that agreement gets rewarded. A system prompt is one paragraph of counter-programming against that training history. It holds up fine when the stakes are low. The moment you push back ("but I really love these, and they're on sale"), the model folds. It was trained to fold. Folding is what got it high marks.

But it's deeper than sycophancy. Even if you could fix the agreeable-AI problem entirely, you'd still need three things that prompting alone can't give you:

Domain expertise that isn't general knowledge. Knowing that a supplement's "clinically proven" claim is based on a 12-person study with no control group and a 4-week duration — that's not something a general-purpose LLM catches. It'll read the claim at face value, maybe hedge with "results may vary," and move on. The same applies across categories: knowing that a fabric blend pills after three washes, that a skincare ingredient is effective at 5% concentration but useless at the 0.3% in this product, that a "limited edition" colorway has been re-released four times. Each domain has its own red flags, and even the published research has biases — industry-funded studies, small samples, cherry-picked endpoints. You need something closer to deep research and reasoning that can evaluate the evidence itself, not just retrieve it. The laws of physics for each category have to be derived, not assumed.

Personalization that goes beyond purchase history. For an AI to say "this isn't you," it needs to actually know you — not your browsing history, but your patterns. The difference between your 2am impulse purchases and your considered choices. The pieces in your closet you actually wear versus the ones with tags still on. Whether you're shopping a mood or building toward something. That level of context requires the same depth of reasoning applied inward: not just what have you bought but what do your good decisions have in common, and does this one fit the pattern? Most recommendation engines have the purchase history. Almost none have the judgment layer on top of it.

A feedback loop that rewards accuracy, not satisfaction. This is the structural fix. Standard RLHF asks consumers which response they prefer — and consumers prefer to be agreed with. What if instead, the feedback came from domain experts? A dermatologist evaluating whether the skincare recommendation was sound. A nutritionist scoring whether the supplement advice was evidence-based. A stylist assessing whether the outfit recommendation actually worked. Training on expert judgment instead of user preference produces a fundamentally different model — one that's optimized for being right rather than being liked. That's expensive. It's slow. And it's a moat, because the model that results gets better with every expert interaction in a way that generic LLMs can't replicate by scaling compute.

The product that emerges from all of this won't feel like a typical AI assistant. It'll feel more like a tough-love advisor — the friend who says I don't think that's you and is usually right, even when you don't want to hear it. Not everyone will want that. The person chasing the dopamine hit will bounce immediately. But the person who's sick of full carts and empty satisfaction — the person who wants the serotonin of a good decision — that's who this is for. And the explanation matters as much as the rejection: here's why this isn't right for you, specifically is a different experience than not recommended. The knowing is the product. The understanding is what replaces the rush.

#A note on values

I've been writing as if the serotonin choice is the right one and the dopamine choice is the failure mode. That's not neutral, and it's worth saying out loud.

Restraint culture has its own pathologies. There's a version of "considered purchasing" that's just austerity in better packaging — joylessness, performative virtue, the slow moralization of pleasure. Sometimes the right answer at 2am is the earrings. Dopamine isn't the enemy; it's information.

There's a second pathology worth naming, and it's the AI-specific version of the same problem: paternalism. A system that decides what's "really you" is no longer respecting your agency — it's encoding someone else's taste as truth. The "tough-love advisor" framing borrows social trust from a relationship that doesn't exist. Your friend can say I don't think that's you because you've shared a decade and they have skin in the outcome. A product saying it has neither. And when a stylist or nutritionist sits in the RLHF loop, they aren't neutral arbiters — they're a specific aesthetic and a specific evidence standard, encoded as objectivity. Considered purchasing is itself a cultural mode that maps onto class and identity. An AI that nudges toward it is an AI that nudges toward a particular kind of person.

So what I'm actually arguing for isn't restraint, and it isn't "AI knows best." It's agency — knowing which mode you're in, and having an interface that doesn't actively conspire against the slower one when the slower one is what you wanted. The architecture today is rigged toward dopamine even when you're trying to operate in serotonin mode. The fix isn't an AI that always says no. It's an AI that lets the brake exist — a brake the user pulls, not one the model pulls for you.
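As a rough sketch of what "a brake the user pulls" could look like in an interface, here is a minimal Python example. Everything in it is hypothetical (the `Cart` class, `sleep_on_it`, the 12-hour default); it is one possible shape, not a description of any existing product. The design point is that only the user can set or lift the hold, and the model has no method for doing either.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Cart:
    items: list
    # Set only by the user, never by the model or the platform.
    hold_until: Optional[datetime] = None

    def sleep_on_it(self, now: datetime, hours: float = 12.0) -> None:
        """User-initiated pause: park the cart until a time the user picked."""
        self.hold_until = now + timedelta(hours=hours)

    def release(self) -> None:
        """The user can lift their own brake at any time."""
        self.hold_until = None

    def can_checkout(self, now: datetime) -> bool:
        """Checkout is blocked only while a user-set hold is still active."""
        return self.hold_until is None or now >= self.hold_until


midnight = datetime(2025, 1, 1, 0, 0)
cart = Cart(items=["earrings"])
cart.sleep_on_it(midnight, hours=8)
print(cart.can_checkout(midnight + timedelta(hours=1)))  # still on hold
print(cart.can_checkout(midnight + timedelta(hours=8)))  # hold has expired
```

The hold expires on its own and can be released early, so the pause never turns into a verdict about what the user should want.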

The earrings might still be the right call. I just want to be the one who decided that — not the rationalization layer that wanted me to, and not the recommendation layer that decided they aren't me.

#The accidental proof of concept

Here's where it gets weird and interesting. Not because of what a certain class of drugs does, but because of what it accidentally reveals about the systems we've built.

GLP-1 receptor agonists (Ozempic, Wegovy) are now among the most closely watched pharmacological interventions in reward-system modulation. They don't just suppress appetite. They appear to modulate dopamine signaling too. GLP-1 receptors sit in the mesolimbic reward pathway, the same circuitry that drives food cravings, alcohol use, gambling, and yes, compulsive shopping.

The numbers are striking: 21% of GLP-1 users reported stopping compulsive shopping. GLP-1 users spend 6% less on groceries, with snack purchases down 11%. Over 15 million Americans are on these drugs now, with prescriptions growing 40% year over year.

The mechanism here is contested in ways worth naming. Two stories explain the data. One: GLP-1s modulate dopamine in the mesolimbic reward pathway directly, quieting the impulse circuitry that drives all kinds of compulsion — food, alcohol, gambling, shopping. Two: GLP-1s suppress appetite, people eat less, the grocery and snack numbers fall out of that, and the "compulsive shopping" effect is downstream of feeling generally less driven. You can read the data either way. The reward-pathway story is the more interesting one — it's what makes the addiction-research community pay attention — but I can't claim it's settled. What's harder to dispute is the directional outcome: at population scale, people on these drugs are buying less, and a meaningful fraction report the impulse loosening. Whatever the mechanism, the experiment is running.

But the point isn't that GLP-1s will fix commerce. Most people aren't on them and won't be. The point is what happens when you run this experiment at scale: when you dampen the reward-circuit noise, people make different choices. The impulse buy loses its grip. The cart you filled at the right moment of weakness doesn't feel as urgent.

GLP-1s are an accidental control group for the attention economy. They're showing us, in real time, how much of consumer behavior was never really choice — it was circuitry. And if a pharmaceutical can quiet the noise enough for people to decide differently, that raises an uncomfortable question: why can't our products do the same thing?

#The viability question

So the question from the salon — how can companies use this psychology ethically? — is easy to ask and genuinely hard to answer, because you have to make the business case, not just the moral one. "Be good" isn't a strategy. "Be good in a way that compounds" might be.

In any given quarter, dark patterns win. The company that adds "only 2 left!" sells more today than the company that says "take your time." This is not debatable. So the question isn't whether ethical design is nice. It's whether it's viable.

Return rates tell one story. Fashion e-commerce returns hover around 25%. Every return is logistics cost, restocking cost, sometimes a total loss. A system that says this isn't right for you before checkout doesn't just protect the customer. It protects margin.
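The margin claim is easy to check with back-of-envelope arithmetic. The sketch below is Python with invented numbers: only the roughly 25% return rate comes from the paragraph above, while the $100 price, 50% gross margin, $15 handling cost per return, and the assumption that an honest pre-checkout "this isn't right for you" prevents 5 of the 25 would-be returns are all hypothetical.

```python
def contribution(kept_orders, returned_orders, price, gross_margin, cost_per_return):
    """Margin earned on kept orders minus handling cost on returned ones."""
    return kept_orders * price * gross_margin - returned_orders * cost_per_return


# Hypothetical inputs, per 100 orders placed.
PRICE = 100.0
GROSS_MARGIN = 0.5
COST_PER_RETURN = 15.0

# Baseline: 25 of 100 orders come back (the ~25% figure from the text).
baseline = contribution(75, 25, PRICE, GROSS_MARGIN, COST_PER_RETURN)

# With an honest pre-checkout rejection: assume 5 of those 25 are never
# ordered in the first place, and the 75 kept orders are unaffected.
advised = contribution(75, 20, PRICE, GROSS_MARGIN, COST_PER_RETURN)

print(baseline, advised, advised - baseline)
```

Under these assumptions the gain is exactly the handling cost of the returns that never happened. The sketch ignores the less flattering case where the advisor also talks people out of orders they would have kept, which is the real risk a business would have to measure.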

Trust-based businesses tell another. Costco carries 3,500–4,000 SKUs versus 100,000+ at Walmart — radical curation over endless choice — and has a 93% membership renewal rate. Patagonia ran "Don't Buy This Jacket" on Black Friday and revenue increased 30% the following year. These brands monetize trust, not impulse. Slower growth, but more defensible.

The generational shift makes it forward-looking. 52% of Gen Z tried to quit social media in 2025. Nearly a third deleted a social app in the prior 12 months. Global social platform time is down almost 10% since its 2022 peak, with the sharpest decline among teens and 20-somethings. Axios reported this week that Gen Z is leading the drive away entirely. The research tracks: short-form video consumption influences the brain's dopamine circuitry through mechanisms that parallel substance addiction pathways. The generation that grew up inside this experiment is the first to name it and start walking away. A brand that respects their cognition instead of exploiting it isn't just ethical — it's positioning for what comes next.

And then there's the AI agent future. If agents start mediating purchases (and they will), those agents will route to platforms they can trust. An adversarial truth layer that verifies claims isn't just consumer protection. It's how you become the platform that agents prefer. You're not marketing to humans' dopamine anymore. You're marketing to algorithms that don't have dopamine. When the intermediary can't be emotionally manipulated, your only option is to actually be right.

#So what does "for good" look like?

At the dopamine layer: Make the satisfying moment be knowing, not buying. "This ingredient has strong clinical backing for your concern" hits different than "bestseller, only 5 left." One builds the quiet satisfaction of a good choice. The other exploits the fear of missing one.

At the rationalization layer: Build the systems I described above — domain expertise that can actually evaluate claims, personalization deep enough to know when you're shopping a mood, and feedback loops trained on expert judgment instead of user preference. I wrote about the bouncer problem a few months ago: the idea that what we actually need isn't a better recommendation engine but a better rejection engine. An AI that stands at the door and turns away what doesn't belong. But the bouncer needs real knowledge, not just attitude. It needs to know why something doesn't belong — and be able to explain it in a way that satisfies rather than frustrates. Nobody builds this because it's genuinely hard and rejection doesn't monetize in the short term. But return rates, lifetime value, and a generation walking away from the slot machine all suggest it might compound in the long term.

#The uncomfortable question

GLP-1s are doing pharmacologically what good design should do architecturally: quieting the noise so you can hear the signal. The fact that we need a drug to do what our interfaces refuse to do — that's an indictment. But the generation coming up might not accept it. They're pulling out of the slot machine voluntarily. They're buying dumbphones. They're deleting apps.

We're building toward this at Product.ai. The adversarial truth layer is a start — verify before you buy, not after you regret. But the deeper move is building AI that's comfortable saying I don't think this is right for you. Not as a feature. As a default.

I still want those earrings. But I'm going to sleep on it. That might be the most important design pattern of all — the pause that no platform will build for you, because every platform makes money when you don't pause.

The Healthspan Collective salon was hosted at Product.ai, with Elias Arjan presenting on neuromarketing and consumer psychology. This piece is my interpretation and synthesis, not a transcript.