The Carwash Test — Virtual Intelligence in Action

Virtual Intelligence

Can ChatGPT, Claude, and Gemini solve a simple logic problem? I tested AI systems to find out

Christopher Horrocks

Apr 09, 2026

This episode is a reading of “The Carwash Test — Virtual Intelligence in Action,” which tests whether AI systems can hold the logical object of a simple problem when surface features generate statistical pressure in the wrong direction. One question, twelve systems, twenty-seven runs.

Read the full essay with results matrix:

Addendum — April 2026: Meta Muse Spark

Meta’s Muse Spark (codename “Avocado”), released to the public on April 8, 2026, was tested in both available modes. Both returned Verbose results: correct on the logic but unable to resist the surface misdirection that defines the test’s diagnostic. The updated tally: Pass 7, Verbose 11, Fail 10 — and one special case.
