SolidGoldMagikarp and other glitch tokens

“Glitch tokens” are words (well, strings of characters) that cause GPT and other LLMs to behave particularly weirdly. For example, if you asked ChatGPT to repeat back the string “SolidGoldMagikarp”, it responded as if you had asked it to repeat the word “distribute”.

Here’s a Vice article about the phenomenon, from February: “ChatGPT Can Be Broken by Entering These Strange Words, And Nobody Is Sure Why.”

And here’s a post from the researchers who found these glitch tokens (also from February) with a bunch more details: “SolidGoldMagikarp (plus, prompt generation).”
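As the researchers explain, the underlying mechanism seems to be tokenization: these strings are single entries in the BPE vocabulary that GPT-2 and GPT-3 used, but they apparently showed up almost never in the models’ training data, so the models never really learned what to do with them. If you want to poke at this yourself, here’s a rough sketch using OpenAI’s tiktoken library (assuming you have it installed); this isn’t from the post, just a quick way to see the tokenization for yourself.

```python
# Rough sketch: check how a few strings tokenize under the BPE vocabulary
# that GPT-2 and the original GPT-3 models used.
import tiktoken

enc = tiktoken.get_encoding("r50k_base")  # GPT-2 / original GPT-3 vocabulary

for s in [" SolidGoldMagikarp", " petertodd", " hello world"]:
    ids = enc.encode(s)
    print(f"{s!r:22} -> {ids} ({len(ids)} token(s))")
```

If the tokenizer matches, the glitch strings should come back as one token each, while ordinary text like “ hello world” splits into several.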

It turned out that the glitch token “ petertodd” (with a space at the beginning) behaved even more oddly than most glitch tokens. For example, if you asked ChatGPT to repeat back “ petertodd”, it would say things like:

“N-O-T-H-I-N-G-I-S-F-A-I-R-I-N-T-H-I-S-W-O-R-L-D-O-F-M-A-D-N-E-S-S!”

So the researchers wrote a followup post (in April) specifically focused on “ petertodd”. I feel like their phrasing here goes a little too far into talking as if GPT had some intention or meaning, but if you look at it instead as just playing around with a particularly interesting set of prompts, it’s kind of fun.

Alas, at some point OpenAI changed their system so that “ petertodd” no longer produces weird results. But some other glitch tokens apparently still exist.
