Gendered language
Jean pointed to The Gender Genie, where you enter some text and it applies the Koppel-Argamon algorithm (or perhaps that should be "Koppel's and Argamon's algorithm") to determine whether the author is male or female. (This is the same software discussed in an article I pointed to last month, when Meredith noted in a comment that their claimed 80% accuracy isn't really all that impressive.)
I'd heard the algorithm was simple, but I hadn't realized it was quite as simple as it is. (The Nature article kind of explained it, but I thought there was a lot more to it than there turns out to be.) As far as I can tell, what it mostly comes down to is that men (it claims) use the words "the," "a," and "some" frequently, while women use "with" and possessives frequently. (There are adjustments to that for various factors, but that seems to me to be the main thrust of the algorithm.) I suspect the "the" part is intended to apply largely to noun modifiers. If I'm right, the idea (their idea?) is that men are more likely to simply refer to "the boat" or "a boat" or "some boats," while women are more likely to specify whose boat it is, or refer to it as belonging to someone or something. Does this suggest that women are more likely to be propertarians than men?
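To make that concrete, here is a minimal Python sketch of a marker-word tally along the lines described above. It is not the actual Koppel-Argamon algorithm: the word lists, the equal weighting, and the name `guess_gender` are all assumptions made up for illustration, and the real thing reportedly adjusts for other factors.

```python
import re
from collections import Counter

# Hypothetical marker lists, based only on the description above.
# The real algorithm uses more features and tuned weights.
MALE_MARKERS = {"the", "a", "some"}
FEMALE_MARKERS = {"with", "my", "your", "his", "her", "its", "our", "their"}


def guess_gender(text):
    """Tally marker words and return (label, male_count, female_count)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    male = sum(counts[w] for w in MALE_MARKERS)
    female = sum(counts[w] for w in FEMALE_MARKERS)
    if male > female:
        label = "male"
    elif female > male:
        label = "female"
    else:
        label = "undetermined"
    return label, male, female


print(guess_gender("The boat hit a rock near some boats."))
print(guess_gender("I went sailing with my sister on her boat."))
```

Note that a plain word list can't tell the possessive "her" ("her boat") from the personal pronoun "her" ("I saw her"), so even this toy version blurs the two.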
(Note that in the Genie's numerical breakdown/analysis, it says "personal pronoun" where I think it should say "possessive pronoun," though the Nature and New York Times articles describing the algorithm appear to disagree on this point.)
And the other half of the equation is that "with" is a female word, perhaps suggesting (I'm speculating wildly here) that women are nurturing and cooperating types, while men are rugged solo individualists.
As you may perhaps have figured out from the above, I'm mighty dubious about the whole endeavor. (Their whole endeavor?) But it's true that one of my journal entries came out as solidly male, while one of Mary Anne's came out solidly female. On the other hand, the test thinks Jean's male. Given that the odds of success when choosing randomly would be 50/50, so far it's not instilling a lot of confidence in me. But this is much too small a sample to be useful, so you shouldn't take these numbers seriously. (Besides, citing specific numbers is apparently a male trait.)
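For the numerically inclined, here is a rough Python sketch of just how little three samples say. It assumes the three tries count as two hits and one miss (Jean's being the miss), and that random guessing is right half the time; `prob_at_least` is a made-up helper name.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Two of the three entries tried above were classified correctly.
print(prob_at_least(2, 3))  # 0.5
```

In other words, a coin flip would do at least this well half the time.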