
Curly quotes


There continues to be tremendous confusion among some SH submitters over the distinction between curved (or slanted) quotation marks and what I usually call "straight up-and-down quotation marks."

And so, although this issue is now irrelevant for anyone who uses the new RTF submission system (which converts quotation marks automatically), I think it's still worth explaining.

This entry got really long; sorry about that. I'm hoping it'll be interesting and clear, especially to non-computer people, but if you couldn't care less about the fascinating history of the double-quotation-mark character, then I recommend skipping most of the text in this entry; instead, just scroll about halfway down to see the entertaining examples of the gibberish that curved quotation marks can turn into, then read the brief bullet list near the end of the entry.

I'll start with some background information and history:

In typeset/published American prose, opening quotation marks look different from closing quotation marks: the opening mark curves in one direction, while the closing mark curves in the other direction. (Throughout this entry, when I say "quotation marks" I'm referring to American-style ones, with two little curved or straight lines right next to each other as part of a single punctuation mark, sometimes known as "double quotes.") There's some variation among typefaces; in some typefaces, the marks are slanted instead of curved, but the opening and closing marks still slant in opposite directions.

Typewriters didn't provide opening and closing quotation marks. Instead, they used a single symbol for both: a pair of short straight vertical lines, like typeset quotation marks that had been ironed flat.

Back in the 1960s, the organization then known as the American Standards Association developed a character encoding they called ASCII (pronounced like "ASK-ee"): The American Standard Code for Information Interchange. It assigned a numeric value to each lowercase letter, uppercase letter, and numeric digit, as well as to various items of punctuation, for use in computers. In keeping with the typewriter keyboards of the time, they chose to include the straight quotation mark character but not the curved ones.

ASCII became the standard character encoding on computers, at least in the US. In the past 15 years or so, there's been steady progress toward replacing ASCII with new standard character encodings that can represent accented characters and characters from non-Latin alphabets, plus punctuation like curved quotation marks. But even now, the only characters you can be pretty sure will be represented the same way across all computers are the ones in the ASCII set.

Before the standardization efforts for non-ASCII characters came along, the people who developed a given operating system or computer would decide how or whether they wanted to represent non-ASCII characters. In particular, Apple chose one set of encodings for such characters, and Microsoft (or maybe IBM? I'm not sure) chose another set.

(The following paragraphs will oversimplify a little for ease of explanation.)

Each character is represented inside the computer by a number. The capital letter A, for example, is represented in ASCII by the decimal number 65. (Character codes are usually discussed in terms of hexadecimal numbers, but let's ignore that.) If a Mac and a Windows computer both see a document that contains the number 65 in a particular context, both computers know to display that number as a capital letter A.

Similarly, the straight-double-quote character is ASCII 34 (decimal); both Mac and Windows know how to display that.

But on the Mac, the opening curved-double-quote character is represented internally by the decimal number 210. And on Windows, at least in some applications, it's represented by the number 147.

So if you create a document in Windows that uses curved quotation marks, and you send me that document, and I view it on my Mac, the Mac translates the number 147 into a different character altogether. It doesn't look like a quotation mark at all. (See illustrations below.) And vice versa; if I create a document that uses curved quotes on the Mac and you view it on a Windows machine, you'll see some other non-quotation-mark character.
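To make the mismatch concrete, here is a small Python sketch. It assumes that Python's built-in `cp1252` and `mac_roman` codecs are reasonable stand-ins for the Windows and classic Mac character sets described above; the sample text is made up for illustration, and the garbling it prints is not necessarily what any particular mail program would show.

```python
# Bytes as a Windows program might store the text "Hi" wrapped in curved quotes:
# 147 = opening quote, 72 = H, 105 = i, 148 = closing quote.
windows_bytes = bytes([147, 72, 105, 148])

# Read back using the Windows table, the quotes come out as intended.
as_windows = windows_bytes.decode("cp1252")

# Read using the classic Mac table, the same numbers map to accented letters.
as_mac = windows_bytes.decode("mac_roman")

print(as_windows)  # “Hi”
print(as_mac)      # ìHiî

# Going the other way: the Mac stores its opening curved quote as byte 210.
mac_bytes = "\u201cHi\u201d".encode("mac_roman")
print(list(mac_bytes))             # [210, 72, 105, 211]
print(mac_bytes.decode("cp1252"))  # ÒHiÓ
```

The letters H and i survive in both directions because they are plain ASCII; only the characters outside ASCII change.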

Now, there are ways around this. A lot of modern software on the Mac can interpret Windows curved quotation marks and display them as quotation marks. (Which is why I can now read most web pages created by Windows users, which used to be a big pain.) And more and more software is moving toward using the new international standard, Unicode, in various forms, which avoids the cross-platform incompatibility issue.

Unfortunately, the mail software I use, Eudora, doesn't understand about Windows quotation marks yet. It's a great piece of software--it does all sorts of powerful stuff that Apple's Mail application doesn't do yet--but Eudora on the Mac is way behind the times in terms of handling anything other than ASCII and the extended Mac character set.

So when someone on a Windows machine sends me email that contains curly quotation marks, I see something like this:

[Image: What Windows curly quotes look like on the Mac, version 1]

Or sometimes this:

[Image: What Windows curly quotes look like on the Mac, version 2]

When what the original sender intended was more like this:

[Image: What Windows curly quotes look like in Windows]

As you can see, the garbled version can be pretty hard to read. (So can the melodrama, but that's just this particular example, which is part of an unpublished work by Jane Zloty, used with her kind permission.)

(Note that it's not just quotation marks, by the way. Em dashes and apostrophes and suspension points have similar issues. But the quotation marks and apostrophes are the most common.)

So when we at Strange Horizons ask that stories submitted to us use straight quotation marks, we're not just being arbitrary; we're asking for stories that we can read.

By the way, roughly 1 in 7 of the stories we receive use Windows-style curved quotation marks, and therefore look like gibberish to me. And there are usually quotation marks and/or apostrophes in pretty much every paragraph.

(Susan also uses Eudora on the Mac, so she has the same problem I do. Karen doesn't use a Mac, but that means that when an author with a Mac submits a story that uses curly quotes, it's harder for Karen to read.)

When I receive such a story, I copy the text into a new document in BBEdit, my favorite Macintosh text editor, and then I run what BBEdit calls a "text factory"--a saved set of search-and-replace commands--to convert the problem punctuation into ASCII. It only takes a few seconds. But it's still a pain. It interrupts my workflow, and it changes the interface I use for scrolling down a page. It's not awful, but it's an inconvenience, and it means that the story starts out on the wrong foot.
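For anyone who would rather script that kind of cleanup, here is a rough Python equivalent. It is not the actual BBEdit text factory, just a sketch of the same idea; the function name `to_ascii_punctuation` and the particular replacements (two hyphens for an em dash, three periods for suspension points) are assumptions made for illustration.

```python
# Map common "smart" punctuation to plain ASCII stand-ins.
REPLACEMENTS = {
    "\u201c": '"',    # opening curved double quote
    "\u201d": '"',    # closing curved double quote
    "\u2018": "'",    # opening curved single quote
    "\u2019": "'",    # closing curved single quote, also used as an apostrophe
    "\u2014": "--",   # em dash
    "\u2026": "...",  # single-character ellipsis (suspension points)
}

# str.maketrans accepts single-character keys and replacement strings of any length.
TABLE = str.maketrans(REPLACEMENTS)

def to_ascii_punctuation(text: str) -> str:
    """Return text with curved quotes, em dashes, and ellipses replaced."""
    return text.translate(TABLE)

sample = "\u201cDon\u2019t go\u2014not yet\u2026\u201d"
print(to_ascii_punctuation(sample))  # "Don't go--not yet..."
```

This only works once the text has been decoded correctly; if the curved quotes have already turned into other characters, the table would need to list those instead.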

I'm leaving out some of the complexity of the issue. For example, IIRC, many mail transmission systems used to be unable to handle anything but ASCII; if you try to send curly quotes through some older mail software, it'll show up garbled on the far end regardless of what OS you use. But I'm not sure whether that kind of mail software is still in use.

It's also worth noting that in some fonts, the curved-quotation-mark character looks the same as the straight up-and-down quotation mark character, even though inside the computer they're different. If you use Arial, for example, you won't be able to tell whether your quotes are straight or not. But by the time an email reaches us, there's no font information in it--the font you use for email is (more or less) a local thing on your computer, and doesn't affect what font we see your message in. So if you use curved quotes in Arial, even though you won't be able to tell they're curved, we'll still see them as gibberish characters. So if you normally use Arial in your email program and/or word processor, and you want to submit to us, you should switch to Times or Courier for long enough to be sure your quotation marks are straight.
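Since the font can hide the difference, another way to check is to look at the characters themselves rather than at how they're drawn. Here is a minimal Python sketch, assuming the text is already in a Python string; the helper name `find_non_ascii` is hypothetical.

```python
import unicodedata

def find_non_ascii(text: str) -> list:
    """List (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127  # ASCII covers 0 through 127
    ]

for pos, ch, name in find_non_ascii("She said, \u201cNo.\u201d"):
    print(pos, name)
# 10 LEFT DOUBLE QUOTATION MARK
# 14 RIGHT DOUBLE QUOTATION MARK
```

A story that uses only straight quotes and other ASCII punctuation would produce an empty list.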

There's one more complication. Most computer keyboards still have only a straight-quotes key, rather than two different keys for typing opening and closing quotes. You can generally type curved quotation marks using a particular key sequence (which varies across operating systems), but that's a pain. So to make things easier for you, most word processors now automatically guess, when you press the straight-quotation-mark key, whether it should be an open-quote or a close-quote. That feature is often known as "Smart Quotes," because it's "smart" about which way each quotation mark should curve.

The algorithm that the word processor uses to make that guess is a very simple one, though; it depends entirely on what character precedes the quotation mark. This frequently leads to documents containing mistakes in certain semi-obscure situations; for example, if there's a dash immediately before a line of dialogue, the dialogue often ends up starting with a close-quote rather than an open-quote. So you really ought to be paying close attention to which way your quotation marks curve, if you use your word processor's smart-quotes feature.
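To see why the dash case goes wrong, here is a toy version of that kind of guess in Python. It is an assumption about how such a feature might work, not any particular word processor's actual code: it treats a quotation mark as opening only when it follows whitespace, an opening bracket, or the start of the text, and as closing otherwise. The function name `smarten` is hypothetical.

```python
OPEN, CLOSE = "\u201c", "\u201d"

def smarten(text: str) -> str:
    """Replace each straight double quote based only on the preceding character."""
    out = []
    for i, ch in enumerate(text):
        if ch == '"':
            # Treat the start of the text as if it followed a space.
            prev = text[i - 1] if i > 0 else " "
            # Opening mark after whitespace or an opening bracket; closing otherwise.
            out.append(OPEN if prev.isspace() or prev in "([{" else CLOSE)
        else:
            out.append(ch)
    return "".join(out)

print(smarten('He said, "Stop."'))    # He said, “Stop.”
print(smarten('He paused--"Stop."'))  # He paused--”Stop.”
```

In the second line, the dialogue starts with a closing mark because the character before it is a hyphen rather than a space.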

I'm actually kind of surprised that writers leave Smart Quotes turned on. The standard manuscript format that I learned used straight quotation marks. But then, that was based on typewriters, where you didn't have a choice. I honestly have no idea whether modern editors of printed books and magazines prefer to receive manuscripts with straight quotes or curved ones. I know that some of them prefer manuscripts in Times, as opposed to the old standard of using Courier, so perhaps the days of the typewriter-based standard are dead.

At any rate, the point is that if you've written a story with smart quotes turned on, then all your quotation marks and apostrophes are curved, and if you want to submit your story to us, we'd like you to change that before submitting.

There are three approaches you can take if you find yourself in this situation:

  • You can do a series of search-and-replace operations by hand, replacing curved quotation marks with straight ones and so on. Detailed instructions for how to do that are in our formatting guidelines.
  • If you're using Microsoft Word, you can use something I created a couple years ago: a macro for converting a Word document to plain text. It still comes with no warranty, but I think various people have been using it successfully for two or three years now. Sadly, it doesn't work with other word processors.
  • You can use our new experimental RTF submission system, which converts your curved quotation marks into the kind we can read.

Whew. That was a lot more than I intended to write about this. Let me know if you have any questions.

(By the way, the Wikipedia entry for Quotation mark could use some cleanup. It abruptly dives into the deep end in some places, uses confusing examples in other places, and makes firm statements about some issues that are matters of stylistic choice. I don't have the energy to tackle the article, but I'd love to see someone who knows what they're talking about do a cleanup pass. Shmuel, any interest?)

2 Comments

I’m actually kind of surprised that writers leave Smart Quotes turned on. The standard manuscript format that I learned used straight quotation marks. But then, that was based on typewriters, where you didn’t have a choice.

Yup. My impression has been that straight quotes, like putting two spaces after a period, are relics of the typewriter age and ought to be abandoned when possible. I more often find myself doing a search-and-replace to get rid of the straight ones, not the curly ones. On the other hand, as you point out, this doesn't apply when dealing with systems that require plain text. And one needs to make sure that the proper mark is used in cases that aren't handled correctly automatically, such as the quote-after-a-dash example you give, or an apostrophe indicating a contraction at the start of a word (e.g., 'tis or '06).

To complicate things further, there are (or were) multiple systems of encoding for non-basic characters on IBM-compatible PCs; most commonly Code page 437 in DOS, vs. ANSI in Windows. But as DOS has gone by the wayside, I doubt that's much of an issue these days.

As for Wikipedia, I'm not sure I have the energy for revising the Quotation mark article either, especially as I'm long overdue for an overhaul I promised to make on the Yinglish article. But I'll keep it in mind.


I'm looking forward to the not-too-distant future when I will no longer have to run DOS in order to process my credit cards. (Yes, there are other software packages that run under more current OSes; no, my bank does not support them.)

Hawaiian names are also mostly done wrong under automated straight-to-curly conversions, since the glottal stop is supposed to be written as a single open quote mark, not a single close quote mark, but is most often in the middle of a word.

