{"id":2339,"date":"2004-10-10T21:45:59","date_gmt":"2004-10-11T04:45:59","guid":{"rendered":"http:\/\/www.kith.org\/journals\/jed\/2004\/10\/10\/2339.html"},"modified":"2004-10-10T21:45:59","modified_gmt":"2004-10-11T04:45:59","slug":"another-gender-guesser","status":"publish","type":"post","link":"https:\/\/www.kith.org\/jed\/2004\/10\/10\/another-gender-guesser\/","title":{"rendered":"Another gender guesser"},"content":{"rendered":"\n<p><a href=\"http:\/\/cgi.sfu.ca\/~gpeters\/cgi-bin\/pear\/gender.php\">Geoff's Gender Guesser<\/a> attempts to determine whether a given name is more likely a male name or a female name, by analyzing Google results.<\/p>\n<p>(And I'm tickled to see a note on his comments page from the Google team saying that they're increasing the daily query limit for his software: \"We wouldn't want anyone who's trying to guess their gender to run out of queries.\")<\/p>\n<p>It uses a simple algorithm: it compares the number of results for \"Mr. <span class=\"variable\">name<\/span>\" to the number of combined results for \"Mrs. <span class=\"variable\">name<\/span>\" or \"Ms. <span class=\"variable\">name<\/span>\" or \"Miss <span class=\"variable\">name<\/span>\".<\/p>\n<p>This approach is pretty good at guessing the gender of common strongly gendered American names, but it does produce some odd results.  For example, the system considers <span class=\"word-as-word\">Coffee<\/span> to be a fairly common boy's name&#8212;Mr. Coffee, donchaknow.<\/p>\n<p>And of course this algorithm results in a lot of last names being counted.  It doesn't know that <span class=\"word-as-word\">James<\/span> is pretty much an exclusively male name, because if you search for \"Mrs. James,\" the results include \"Mrs. James Devereux\" (a woman married to a man named James Devereux) and \"Mrs. James' birth\" (where James is her last name).  Searching for \"Ms. 
James\" is unlikely to end up with a husband's name, but <span class=\"word-as-word\">MS<\/span> can also stand for \"Master of Science\" or \"manuscript\" or just someone's initials.<\/p>\n<p>But for such a simple algorithm, it does seem to give reasonable results a fair bit of the time.  Anyone have any thoughts on other purely textual gender markers (in English or other languages) that one could use in a very efficient Google-based gender guesser?  Note that gendered pronouns aren't necessarily of any use, 'cause they generally won't appear next to a name.<\/p>\n<p>I went to the Guesser's <a href=\"http:\/\/cgi.sfu.ca\/~gpeters\/cgi-bin\/pear\/gender.php?report=male\">most masculine names<\/a> page to find out which names are strongly male.  It turns out that <span class=\"word-as-word\">Yashwant<\/span> is, by this algorithm, the most masculine name on the web, well over 1000 times as likely to be a male name as to be a female name.  Also on the list: <span class=\"word-as-word\">Splodge<\/span>, at #4.  (For those unfamiliar with the word, it's a mostly British term meaning \"splotch.\")<\/p>\n<p>I find it interesting that on the \"most masculine names\" list, none of the top 50 names are common Western male names (while quite a few of the top 50 \"most feminine names\" are relatively common Western female names).  
I also find it interesting that the top ten most common names (regardless of gender) are all common Western male names; and that many common Western male names have a relatively low gender factor, meaning that many of them appear prefixed by a female title nearly as often (within an order of magnitude, say) as with a male title.<\/p>\n\n","protected":false},"excerpt":{"rendered":"<p>Geoff&#8217;s Gender Guesser attempts to determine whether a given name is more likely a male name or a female name,&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[47,12],"tags":[],"class_list":["post-2339","post","type-post","status-publish","format-standard","hentry","category-gender","category-language"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/posts\/2339","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/comments?post=2339"}],"version-history":[{"count":0,"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/posts\/2339\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/media?parent=2339"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/categories?post=2339"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kith.org\/jed\/wp-json\/wp\/v2\/tags?post=2339"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}