Occasional Google misses

A few days ago, I tried to use Google to search for a particular Waldrop quote. I was pretty sure that I'd posted it somewhere online, but I wasn't sure where.

Google couldn't find the quote. I tried a bunch of variations on words I thought were in the quote, but no dice. I searched my hard drive, but couldn't find it there either.

Eventually, it occurred to me to use my journal's search function to see whether I'd made some oblique reference to the line. And there it was—Waldrop's bit about Godard and Truffaut not understanding English very well was right there in my journal.

Further Google searches revealed that the entries immediately before and after that one were both indexed, and both of those pages linked to the page with the Waldrop quote, but the Waldrop-quoting page itself wasn't indexed.

I dropped a note to Google about it, but just got back a generic form response saying that they can't guarantee that they'll index all pages on a site, and pointing me to their webmaster info for details on how to build a Google-friendly site. Which is all well and good, but none of the info there addresses the case where Google is indexing all of the pages for database entries except one. (I checked, and the page validates as XHTML 1.0 Transitional, and it doesn't have any untoward meta tags, so there's nothing I can see that would make Google ignore it.) The email from Google also recommended posting to google.public.support.general; I went and poked through that newsgroup and did find a fair bit of interesting stuff, but no answers to my specific issue, except that sometimes Google just glitches and misses something, and that submitting a specific page via their Add URL form can sometimes help. I thought about posting to the newsgroup, but I suspect it's a frequently asked question.
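For anyone who wants to run the same sort of meta-tag check by script rather than by eyeballing the source, here's a minimal Python sketch. It's an illustration, not what was actually done for this entry: the class and function names are made up for the example, and it assumes the only tags worth flagging are `robots` or `googlebot` meta tags whose content includes `noindex` or `none`.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing XHTML tags (<meta ... />) are routed here too,
        # via HTMLParser's default handle_startendtag.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            # Directives are comma-separated, e.g. "noindex, nofollow".
            for part in attr_map.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.append(part)


def blocking_directives(html_text):
    """Return the directives in html_text that ask crawlers not to index the page."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    # Assumption for this sketch: only "noindex" and "none" count as blocking.
    return [d for d in finder.directives if d in ("noindex", "none")]
```

An empty list from `blocking_directives` only means no such meta tag was found; it says nothing about other reasons a page might be skipped, such as robots.txt rules or plain crawler glitches.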

I've seen something similar happen with boingboing, which uses Google search for its site search engine. (And which, btw, seems to have been redesigned; it looks like the redesign just went live this evening. Everyone's doing it!) F'rinstance, if I used their search box to search for the term "Jed," it used to give me only two or three hits, not showing quite a few items that I knew they'd posted with my name attached. (It gives more results now, but I don't know if it gives all of them.)

I imagine this just means that, like everything, Google isn't quite perfect. It's interesting to me that this bothers me; with most tools, I'd shrug and write it off as another imperfection in an imperfect world, but I'm so used to getting good results with Google that I'm always a little shocked on the rare occasions when I don't. It's like finding a bug in a compiler or a typo in a dictionary—it's not supposed to happen! :)

3 Responses to “Occasional Google misses”

  1. Vera Nazarian

    Jed,

    I think it is hard spaces. Sometimes it acts really weird if you have leading or following hard space characters…

  2. Jed

    Hmm. You mean the nonbreaking space entity? (&nbsp;) If so, there are no more or fewer nonbreaking spaces on that particular journal-entry page than on any of the other pages, so I don’t think that’s what’s going on here.

  3. Kenny

    Wish I had a better answer for you too, but it’s a big web out there. We’ve only got 4 billion pages in the index. 🙂

    Seriously though, I forwarded your comments to some of the quality gurus around here. I don’t know if they’ll look at it or not, but I do what I can. 🙂
