The difficulty of text processing
There's a billboard I pass on the way to work these days. It's from a series of Tide ads (semi-ineffective, in that I know they're all for the same product but I often can't quite remember which) that describe messy situations with no further comment except for the product's yellow-and-orange-bullseye logo.
The one that particularly caught my attention reads:
Nice Day.
Top Down.
Bad Pigeon.
And it occurred to me that there's a large array of cultural information that you have to have to make any sense out of that. You have to know that there's a kind of car where you can open the roof, and that opening the roof is known as taking the top down. You have to know that people like driving with the top down when the weather is favorable, and that favorable weather is known as a nice day. You have to know that pigeons leave droppings, and that people don't like pigeon droppings to get on their things, and that with the top down, pigeon droppings can get onto things inside the car (such as the clothing you're wearing). You have to know that the colored pattern used as a background for the billboard is the logo of a detergent, and that detergents are used to clean clothing, and that if someone has a pigeon dropping on their clothing they probably want to clean it. And, of course, you have to know about advertising and how it works and what it's for, which also presupposes a certain understanding of at least the surface-level workings of capitalism.
All of which suggests that it would be awfully difficult for a computer to make any sense out of this billboard. I imagine that many of the above-mentioned background facts are included in that big database of common-sense information that some AI researchers developed, but I'd be very surprised if all of them were in there. So partly I'm just saying AI is hard.
But I think this is also a useful object lesson for writers of speculative fiction. It's often tempting to write stories in which an alien who doesn't speak English very well makes funny mistakes due to a total lack of cultural knowledge, but my feeling is that it's impossible to achieve real fluency in a language without some cultural knowledge. There's enough in common among human cultures that a human can get by in another language even without knowing much about the culture behind it. For an alien with absolutely no understanding of human cultures at all, though, I'm pretty skeptical that they'd be able to put together coherent sentences in a human language.
(Yes, most of the time this is done in stories intended to be funny, and I know that this level of logical rigor shouldn't be applied to comedic stories. It could even be argued that this is yet another of those genre conventions I'm always talking about, probably borrowed from an older genre convention in movies. So perhaps I ought to just label it a pet peeve, and note that I have a hard time seeing past what feels to me like a basic logic hole in this case, unless the story strikes me as so funny that I forget to think about it.)
(And just to be clear, I'm not talking about the kind of thing where an alien doesn't understand a few obscure words, as in "What is this 'zeugma' you speak of, human?" I'm talking about the kind of thing where, for example, an alien speaks in normal, colloquial, idiom-filled English but fails to recognize that English words can have more than one meaning. I might make a grudging exception in the case where that only-one-meaning idea is explicitly a major plot element, like in "Spice Pogrom".)
I suppose another way of putting all this is that before an alien can even say "Take me to your leader," they have to know (and/or assume, and/or share) a fair bit of background about human cultures and political systems.