Tripadvisor's AI called a hotel "spotless" while recent guests reported roaches and raw chicken. The platform's defense is the problem.
A Which? investigation released this week found that Tripadvisor's headline AI summaries repeatedly described hotels carrying serious safety complaints as "spotless", "popular with many travellers", or earning "rave reviews", language that recent guest reviews on the same pages directly contradict. The UK consumer watchdog's worked example was the Riu Palace Santa Maria in Cape Verde, where the AI summary described the property as "spotless" with "diverse restaurants" earning "rave reviews". Recent reviews on the same listing reported raw chicken, flies and birds in the buffet, "dead little roaches", "no basic cleaning or hygiene standards", and food described as "awful, bland, unsafe and inedible". Which?'s investigation covered more than hygiene: the summaries also smoothed over reports of sexual harassment at listed properties.
Multiple outlets corroborated the Which? findings within 24 to 48 hours, including the Guardian, Euronews, Metro, and Let's Data Science. That gave the story multi-source confirmation rather than leaving it a single-source claim. Which? is a UK body with a UK mandate, but the AI summary product itself is rolled out globally on Tripadvisor, so the gap is product-wide even though the watchdog's authority is regional.
A Tripadvisor spokesperson told Which? that users can simply look at the underlying reviews, "eliminating any need to blindly trust AI-generated content". That answer is not a defense. It is a concession. The spokesperson is saying the AI summary is not authoritative, and the real signal lives below. That is the exact problem under scrutiny. The summary was put in the trust slot. The reviews were always there; the AI summary is the new layer that frames them. Telling users to scroll past the new layer to reach the old one does not refute the critique. It admits that the new layer now sits between the reader and the reviews.
The "just scroll" defense is also brittle in practice. Travelers use Tripadvisor to compare dozens of properties in a session. They skim. The AI summary is the skimmable layer. A reader who would otherwise have read five properties' worth of reviews in detail will read five summaries and zero underlying reviews. The summary is auxiliary in name only. For many users, it is the page.
The mechanism behind the smooth-positive summaries is a property of how large language models handle mixed-signal corpora. When a model is asked to summarize reviews that include one-star food poisoning complaints and five-star vacation praise in roughly equal volumes, short summaries tend toward positive framing as a default. The complaints get averaged into a comfortable middle. The LLM's uncertainty is not surfaced to the user. There is no "confidence" line, no "based on mixed reviews" caveat. There is a confident sentence in the place where travelers form a first impression. A Hacker News discussion of the Which? findings added the sharper critique: companies are now shoehorning AI summaries into trust-bearing surfaces without disclosing the error mode. The summaries are not factually false in the strict sense; they are often technically supported by the corpus. They are misleading because they compress a polarized signal set into a confident single sentence and place that sentence where a reader forms a booking decision.
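To make the missing "based on mixed reviews" caveat concrete, here is a minimal sketch of how a polarized rating set could be detected and flagged before a summary is shown. It is illustrative only and says nothing about how Tripadvisor's system works: the function names, the 1-to-5 star input, and the 25% threshold are all assumptions chosen for the example.

```python
def polarization(ratings, low_max=2, high_min=4):
    """Return (low_share, high_share) for a list of 1-5 star ratings.

    low_max and high_min are assumed cut-offs, not values from any real system.
    """
    if not ratings:
        return 0.0, 0.0
    n = len(ratings)
    low = sum(1 for r in ratings if r <= low_max)
    high = sum(1 for r in ratings if r >= high_min)
    return low / n, high / n


def caveated_summary(summary, ratings, min_share=0.25):
    """Prefix the summary with a caveat when the ratings are polarized.

    A rating set counts as polarized here when both the low-star and the
    high-star groups make up at least min_share of all ratings
    (a hypothetical threshold).
    """
    low_share, high_share = polarization(ratings)
    if low_share >= min_share and high_share >= min_share:
        return (
            f"Based on mixed reviews ({low_share:.0%} rated 1-2 stars, "
            f"{high_share:.0%} rated 4-5 stars): {summary}"
        )
    return summary


# Hypothetical usage: half the ratings are one-star, half are five-star.
print(caveated_summary("Guests praise the restaurants.", [1, 1, 5, 5]))
```

The point of the sketch is narrow: the polarization signal is cheap to compute from the ratings alone, so surfacing it next to the summary is a design choice rather than a technical obstacle.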
The design question is not whether AI summaries are wrong. They will sometimes be wrong. The question is whether the summary belongs in the slot it has been given. Tripadvisor's spokesperson, in defending the product by pointing past it, answered that question for the company. The same design question applies wherever a trust slot gets handed to a summarizer without an error mode.