Béarnaise sauce test
Béarnaise under the spoon
We took 15 supermarket béarnaise sauces, gave them to a blind taste panel, and asked the only question that matters: which jar actually tastes good — and why do the rest fall short?
How to read this report
Four simple things were measured. Here's what each one means in plain English — no stats degree needed.
Liking score (1–7)
How much people liked it. 1 = hated it, 7 = loved it. 4 is the middle. Higher is better.
CATA — "tick the words"
Tasters ticked every word that fit the sauce (creamy, sour, watery…). It tells us what each sauce tastes like.
JAR — "just about right?"
For things like saltiness, was there too little, just right, or too much? It shows what to turn up or down.
Penalty — what helps or hurts
How much each flavour word pushes liking up or down. "Creamy" lifts a sauce; "artificial" sinks it.
The headline
The sauces did not score alike, and the gaps are real, not random: the differences between sauces are statistically significant on every measure. Four things sum up the whole study.
The ranking
Average liking score (1–7). Coop wins, and it's the only sauce to come first on looks, smell, taste and texture all at once. At the bottom, Felix trails well behind everyone else.
In plain words: the longer and greener the bar, the more people liked it. Anything around 5 is well-liked; anything near 3 struggled.
Looks, smell, taste & texture
Almost every sauce looks fine — the eye is forgiving. Taste is where they split hardest, and taste is what decides the overall winner.
In plain words: these show the top 5 sauces for each quality. Notice how the "Looks" scores are all high and close, while "Taste" spreads out — that's where the real differences are.
What makes or breaks a sauce
This links each taste word to how much it changed people's liking. The pattern is remarkably consistent: a few words reward almost every sauce, and a few punish almost every sauce.
In plain words: when a sauce was called "creamy," people liked it more. When it was called "artificial," they liked it a lot less. The bars show how big each push was, in liking points.
These lift liking
Average liking gained when tasters ticked this word
These sink liking
Average liking lost when tasters ticked this word
“Artificial” is the kiss of death.
In six different sauces, tasting "artificial" (Swedish: konstgjord) was the most damaging thing of all — costing between 1.3 and 1.8 liking points wherever people noticed it. No other off-note is this common or this harmful. If a béarnaise tastes fake, nothing else can save it.
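For readers who want to see the sum behind a "lift" or a "sink", here is a minimal Python sketch. It assumes the push is simply the average liking among tasters who ticked a word minus the average among those who did not; the report does not spell out its exact formula, so treat this as one common way to do it. The function name `cata_penalty` and the ratings below are made up for illustration and are not study data.

```python
from statistics import mean

def cata_penalty(ratings, word):
    """Average liking when `word` was ticked minus average liking when it was not.

    Positive = the word lifts liking, negative = it sinks liking.
    Returns None when everyone or no one ticked the word (nothing to compare).
    """
    ticked = [r["liking"] for r in ratings if word in r["words"]]
    not_ticked = [r["liking"] for r in ratings if word not in r["words"]]
    if not ticked or not not_ticked:
        return None
    return mean(ticked) - mean(not_ticked)

# Hypothetical ratings for a single sauce (illustration only).
ratings = [
    {"liking": 6, "words": {"creamy"}},
    {"liking": 5, "words": {"creamy", "sour"}},
    {"liking": 3, "words": {"artificial"}},
    {"liking": 2, "words": {"artificial", "sour"}},
]

for word in ("creamy", "artificial", "sour"):
    print(word, cata_penalty(ratings, word))
```

A result of about -1.5 for "artificial" on real data would correspond to the kind of loss described above.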
Too little, just right, or too much?
Liking tells you which sauces work. JAR tells you how to fix the ones that don't. For each flavour, tasters said whether there was not enough, just about right, or too much.
In plain words: you want a fat green middle (most people happy). A big orange chunk means "needs more of this"; a big pink chunk means "too much of this."
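The three chunks can be worked out from the raw 5-point answers. The sketch below assumes the usual coding, where 1 and 2 mean "too little", 3 means "just right", and 4 and 5 mean "too much"; the report does not state its coding, so that mapping, the function name `jar_shares` and the sample answers are all assumptions.

```python
from collections import Counter

def jar_shares(answers):
    """Collapse 5-point JAR answers into three shares that add up to 1."""
    if not answers:
        raise ValueError("need at least one answer")

    def bucket(answer):
        # Assumed coding: 1-2 too little, 3 just right, 4-5 too much.
        if answer <= 2:
            return "too little"
        if answer == 3:
            return "just right"
        return "too much"

    counts = Counter(bucket(a) for a in answers)
    total = len(answers)
    return {name: counts[name] / total
            for name in ("too little", "just right", "too much")}

# Hypothetical saltiness answers from ten tasters (illustration only).
saltiness = [1, 2, 3, 3, 3, 3, 4, 5, 3, 3]
print(jar_shares(saltiness))
```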
The priority fix list
A flavour problem only matters if it's both bad and common. Weighted penalty does that maths for you: it multiplies how much a problem hurts liking by how many people noticed it, then ranks the fixes. The top item is the change that would help a sauce the most.
In plain words: this is each sauce's "fix this first" list. A big number = a problem that's both painful and widespread. Coop's list is short and small (little to fix); Felix's and Caj P's are long and large.
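The multiply-then-rank step can be sketched in a few lines of Python. This version assumes "how much it hurts" is the average liking of the just-right group minus the average liking of the too-little or too-much group, and "how many noticed" is that group's share of all tasters who answered; the report may define either part differently. The function name `weighted_penalties` and the numbers are hypothetical.

```python
from statistics import mean

def weighted_penalties(rows):
    """rows: (flavour, bucket, liking) tuples, bucket being
    'too little', 'just right' or 'too much'.

    Returns (flavour, bucket, weighted_penalty) tuples, biggest problem first.
    """
    by_flavour = {}
    for flavour, bucket, liking in rows:
        by_flavour.setdefault(flavour, {}).setdefault(bucket, []).append(liking)

    results = []
    for flavour, groups in by_flavour.items():
        just_right = groups.get("just right")
        if not just_right:
            continue  # no reference group to compare against
        total = sum(len(scores) for scores in groups.values())
        for side in ("too little", "too much"):
            scores = groups.get(side)
            if not scores:
                continue
            drop = mean(just_right) - mean(scores)   # how much it hurts
            share = len(scores) / total              # how many noticed
            results.append((flavour, side, drop * share))

    results.sort(key=lambda item: item[2], reverse=True)
    return results

# Hypothetical answers for one sauce (illustration only).
rows = [
    ("saltiness", "just right", 6), ("saltiness", "just right", 6),
    ("saltiness", "too much", 3), ("saltiness", "too much", 3),
    ("sourness", "just right", 6), ("sourness", "just right", 6),
    ("sourness", "just right", 6), ("sourness", "too little", 4),
]
for flavour, side, score in weighted_penalties(rows):
    print(f"{flavour}: {side} -> {score:.2f}")
```

In this made-up case, "too much" saltiness ranks first because it both costs more liking and affects more tasters.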
What each sauce tastes like
The share of tasters who ticked each word, by sauce. First, the big picture of which words show up most — and whether they're a good or bad thing to be known for.
Now the full map. Read a row to get one sauce's character; read a column to see which sauces own a flavour. Darker = more tasters ticked that word.
Every sauce, verdict by verdict
All 15 sauces, ranked best to worst — each with its character, what's working, what's hurting, the one thing to fix first, and a clear buy-or-skip call.
How the test worked
The tasting
A blind test: each person tasted some (not all) of the 15 sauces, with about 198–211 ratings per sauce and 3,040 ratings in total. Nobody knew which brand they were tasting.
What was asked
Liking on a 1–7 scale (looks, smell, taste, texture, overall); 21 taste words to tick (CATA); 7 "just-about-right" flavour checks on a 5-point scale; and optional written comments.
The maths
Differences were checked with ANOVA (are the gaps real, or chance?) and Tukey HSD (which sauces truly differ). Penalty and weighted-penalty analyses link taste words and JAR answers to liking.
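As a rough picture of what ANOVA does, the sketch below computes the F statistic for a plain one-way layout: the spread between sauce averages divided by the spread within each sauce. A large F suggests the gaps are more than chance. It is a simplification under stated assumptions: the study's actual model is not described here and may, for example, also account for differences between tasters. The function name `one_way_anova_f` and the scores are made up.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA.

    groups: one list of liking scores per sauce.
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of sauces
    n = len(all_scores)    # number of ratings

    # Variation of sauce averages around the overall average.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own sauce average.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical liking scores for three sauces (illustration only).
groups = [[5, 6, 7], [4, 5, 6], [2, 3, 4]]
print(round(one_way_anova_f(groups), 3))
```

Turning F into a p-value, and running Tukey HSD to see which pairs differ, is normally left to a statistics library.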
Worth knowing
Because each taster only tried some sauces, certain word-by-word comparisons use proportion tests rather than stricter paired tests. Swedish flavour words have been translated into everyday English.
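A proportion test asks whether a word was ticked noticeably more often for one sauce than for another. The report does not say which proportion test was used; the sketch below shows one standard choice, a two-sided two-proportion z-test with a pooled standard error, with a hypothetical function name and made-up counts.

```python
import math

def two_proportion_z_test(ticks_a, n_a, ticks_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value). Uses the pooled proportion for the standard error.
    """
    share_a = ticks_a / n_a
    share_b = ticks_b / n_b
    pooled = (ticks_a + ticks_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        raise ValueError("no variation: the word was ticked by no one or everyone")
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (share_a - share_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: "creamy" ticked by 120 of 200 tasters for one sauce
# and 80 of 200 for another (illustration only).
z, p = two_proportion_z_test(120, 200, 80, 200)
print(f"z = {z:.2f}, p = {p:.5f}")
```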


In the tasters' own words
Three sauces also had written comments — each a different way to disappoint. The free text matches the numbers almost exactly.