SolidGoldMagikarp and other glitch tokens

“Glitch tokens” are words (well, strings of characters) that cause GPT and other LLMs to behave particularly weirdly. For example, if you asked ChatGPT to repeat back the string “SolidGoldMagikarp”, it responded as if you had asked it to repeat the word “distribute”. Here’s a Vice article about the phenomenon, from February: “ChatGPT Can Be […]

teep

I first heard the word teep, short for telepath, on the TV show Babylon 5 in 1994 or so. Until today, I thought the word had been invented for the show. But I just happened across it in a Philip K. Dick story, “The Hood Maker,” that was written in 1953 and published in 1955, […]

Stylometry, authorship identification, and forensic linguistics

I recently encountered a 1998 article about Donald Foster, the “forensic linguist” who was known, in the 1990s, for using computer techniques to identify the authors of various texts. Among other things, he made a controversial claim that Shakespeare had written a particular poem; he correctly identified Joe Klein as the author of Primary Colors; […]

xweex

I was wandering around in an internet rabbit hole just now, and came across two words that pleased me. Now that Twitter is called X, I’ve seen some people use the word xitter, and explain that x is pronounced like sh in some contexts. That doesn’t particularly appeal to me. But I just saw someone […]

Blocking and tackling

I recently encountered this sentence in a news story: “Traditionally, state parties perform the basic blocking and tackling of politics, from get out the vote programs to building data in municipal elections.” I assumed that the phrase blocking and tackling was a slightly odd variation on block and tackle, a system of pulleys and ropes. […]

GPT-2 tries to imitate Yoon Ha Lee

In 2021, @telophase trained GPT-2 on the complete text of Yoon Ha Lee’s Machineries of Empire novels. (With Yoon’s permission.) @telophase then posted some of the (often entertaining) results. I think this one is my favorite: Inside, there was a very fine collection of paper and styluses, a very large map, and a very good […]