
Save as text


We've had a fair number of complaints in recent months that the SH fiction department's formatting requirements are too stringent. It's possible that at some point in the future we'll allow RTF; that's mostly contingent on my finding or writing a Perl script that strips RTF of its formatting and produces a plain text file formatted the way we want it. Shouldn't be too hard to write, but I'd rather just use someone else's; unfortunately, the most likely name for such a utility (rtf2txt.pl) shows up as part of a commercial application suite.

So that's on hold for now, but in the meantime I've got something that may solve most of the problem: a Word macro that cleans up formatting and saves as text.

There are a couple of problems with it. The biggest is that (I suspect) it'll work only in recent versions of MS Word. It was created using Word:mac v.X, and tested only there and in Word 2002 (part of Office XP) under Windows XP. I'm guessing that older versions of Word on all platforms may choke on it.

It should also probably be modified to set line wrap to about 65 characters, but I'm not gonna do that right now.
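For anyone who wants the wrapping now, here's a rough sketch of it done outside Word, in Python rather than in the macro itself. The function name wrap_paragraphs is made up for illustration, and it assumes paragraphs in the saved text file are separated by blank lines, which may not match what the macro actually produces.

```python
import textwrap


def wrap_paragraphs(text: str, width: int = 65) -> str:
    """Wrap each paragraph to `width` columns, keeping one blank line between paragraphs.

    Assumption: paragraphs are separated by blank lines in the input.
    """
    wrapped = []
    for paragraph in text.split("\n\n"):
        # Collapse existing line breaks and runs of spaces inside the paragraph.
        flat = " ".join(paragraph.split())
        # Leave long words (URLs, for instance) and hyphenated words unbroken,
        # so a line can exceed `width` only when a single word is too long.
        wrapped.append(
            textwrap.fill(
                flat,
                width=width,
                break_long_words=False,
                break_on_hyphens=False,
            )
        )
    return "\n\n".join(wrapped)
```

Run it on the contents of the text file after the macro has saved it; 65 is just the default, so pass a different width if you'd rather.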

It hasn't had nearly enough testing to make it an official SH recommendation yet, so here's the scoop: if any of you want to try it out, you're welcome to, and I'd love to hear how it works for you. However, this is totally at your own risk. I don't think it should do anything nasty; it certainly shouldn't contain any macro viruses or anything like that. But I can't guarantee anything. It's vaguely conceivable that it might do all sorts of horrible stuff—I can't imagine why it would, but the power of Word macros scares me. If you're not comfortable taking that risk, don't try this macro. I honestly don't know that I would try it if someone else as inexperienced as I am with Word macros had created it.

If you're an expert on Word macros, I would be thrilled to get your feedback and/or improvements. (Mostly I just recorded some search-and-replaces; I made a few small tweaks to the code, but didn't write any of it from scratch.)
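To give a sense of what those search-and-replaces are aiming for, here's a minimal sketch in Python of the two conversions: straight quotes in place of curly ones, and underscores around underlined text. This is not the macro's code. The (text, underlined) pairs are a hypothetical stand-in for however a document represents formatted runs, and the function names are invented for the example.

```python
# Curly punctuation mapped to its straight ASCII equivalent.
STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark, also used as an apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotation marks and apostrophes with straight ones."""
    return text.translate(STRAIGHT)


def mark_underlines(runs) -> str:
    """Join one paragraph's runs, wrapping underlined runs in underscores.

    `runs` is a list of (text, underlined) pairs: a hypothetical representation,
    not anything Word exposes in this form.
    """
    parts = []
    for text, underlined in runs:
        core = text.strip()
        if underlined and core:
            # Keep surrounding whitespace outside the underscores.
            lead = text[: len(text) - len(text.lstrip())]
            trail = text[len(text.rstrip()):]
            parts.append(f"{lead}_{core}_{trail}")
        else:
            parts.append(text)
    return "".join(parts)
```

Handling one paragraph at a time means every underlined stretch gets closed before the paragraph ends, at the cost of a separate pair of underscores in each paragraph.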

Okay, if anyone still wants to try it, here's how:

  1. Very important: back up the file that you're going to try converting to text. Make absolutely 100% sure that you have more than one copy of the file, just in case. In fact, ideally you would run the converter on a new copy of the file, just to be certain you don't mess up the original.
  2. Download it.
  3. Unzip it. (On Windows, use whatever unzipping utility you normally use. On Mac, StuffIt Expander should do the trick.) The result should be a file called SHMacros.doc.
  4. Open that file in Word.
  5. Now here's a scary bit: copy the macro into your standard template. (If anyone knows a less scary way to make the macro accessible to other documents that isn't too difficult, let me know.) To do this:
    • Choose Tools > Macro > Macros (or equivalent command in your version of Word).
    • Click the Organizer button.
    • Select SHMacros in the "SHMacros.doc" column, and click Copy to copy it to Normal.dot or Normal.doc or whatever your system uses as the standard name for the Normal template.
    • Click the Close button at the lower right to dismiss the dialog box.
  6. Open the document that you want to convert to text—or, better, open a copy of it.
  7. In that document, choose Tools > Macro > Macros again. In the list of macros should be one named ConvertToText (or SHMacros.ConvertToText, or something similar).
  8. Here's the other scary bit: click Run.
  9. Theoretically, it should do the conversions and then save the file as a text file named Textfile.txt.
  10. Open Textfile.txt in a plain text editor (Notepad, WordPad, BBEdit, TextEdit, whatever) and check two things: whether the underlined words now have underscores around them, and whether the curly quotation marks and apostrophes have been changed to straight ones.
  11. Let me know how it goes.
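If you'd rather not eyeball a long file for step 10, the check can be roughly automated. Here's a small Python sketch; check_converted is a made-up name, and it assumes you've already read Textfile.txt into a string using whatever encoding your version of Word saved it in (I don't know which that is).

```python
import re

# The four curly quotation marks the conversion is supposed to remove.
CURLY = "\u2018\u2019\u201c\u201d"


def check_converted(text: str):
    """Return (line numbers still containing curly quotes, underscore-wrapped phrases)."""
    leftover = [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(ch in CURLY for ch in line)
    ]
    # Anything between a pair of underscores on a single line counts as marked.
    marked = re.findall(r"_([^_\n]+)_", text)
    return leftover, marked
```

An empty first list means no curly quotes survived; the second list shows what ended up between underscores, which you can compare against what was underlined in the original.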

Hope it works. All feedback welcome.

3 Comments

This works really well! The only thing I had to fix was a place where I had three lines (separated by para breaks) all in italics. It got confused over where to end the italics mark, but other than that, great!

You don't have to add the macro to Normal.dot -- it is slightly less scary to just add it to the currently open .doc file (using the Organizer). This works in Office 2000, but I know the Organizer has changed a lot in the new version of Word for XP.

Thanks for the share :)


Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sem enim, consequat in aliquet non, mattis ut lectus. In vel lacus at ante varius consequat. Aliquam iaculis eros ipsum, ut rutrum risus. In convallis, orci non luctus congue, nisi felis sollicitudin purus, nec ullamcorper lectus justo vel lacus. Nam ut feugiat massa. Quisque malesuada dui et tortor pharetra adipiscing. Etiam nibh turpis, condimentum ut sodales eu, egestas eu felis. Vestibulum odio nisl, cursus a suscipit eget, rhoncus quis felis


Naomi: belated thanks for the comment (five years late!)

Anonymous: Hmm. You seem to have correctly identified my blog title, which is the only reason I'm not marking your comment as spam.

