
More on MT and OpenID


In between reading stories tonight, I played around some more with the forthcoming Movable Type version of my journal.

I read up on OpenID (thanks much, Nao!). It's pretty cool; the upshot is that I could configure my journal so that anyone who has a LiveJournal account and is logged into it could leave an authenticated comment on my journal by entering the URL of their LJ account. And my journal's comments section would show the LJ username and an LJ icon and a link to the LJ. Nifty.

Unfortunately, none of that works (at least at the moment) unless I set up the journal so there's a separate static HTML page for every entry. Which would mean thousands of extra HTML pages sitting around on my server, instead of generating them on the fly. No thanks.

Also, it's kind of a pain to have to enter your LJ's URL every time you want to leave a comment. And that would leave out people who don't have LJ accounts. (Though I also installed an OpenID server at kith, so I could probably provide authentication for friends; I'm a little unclear on how that works.)

So I'm not gonna use the OpenID system or any other kind of comment-authentication system, at least not for now. Instead, I've switched to leaving comments entirely open. Anyone can comment on anything and it'll show up immediately, just like in my current journal. MT 3.2 has a bunch of spamfighting stuff built in; it'll be interesting to see how well that works.

I also made some cosmetic tweaks—more space between paragraphs, better spacing of comments, a "liquid" layout for the individual-entries page (to adjust better to browser window width), etc. Go take a look, and leave me comments if you have further ideas or thoughts or suggestions or complaints. I'm particularly curious about whether it looks good on a Windows system. I suspect it looks terrible in old browsers (like NS4 in particular, which might crash). I should also look into accessibility (is it usable with a screen reader? It seems to look okay in lynx, so that's a good sign), and I should validate the XHTML code, and I should decide whether I want dates built into the URLs, and I should figure out how to get old URLs and email addresses from old comments to import/display properly.

But overall, I think it's pretty close to being ready for me to switch over. Maybe this coming weekend, maybe not.

13 Comments

You know, before I commented on your comments over on my LJ, it would've been sensible to come read your journal first. Ah, well.

I think you made a good choice on the authentication question on your comments.


Jed, would it be possible to switch over the DesiLit blog to MT 3.2? It's drowning in spam... Or is that not allowed under the free license?


Unfortunately, none of that works (at least at the moment) unless I set up the journal so there's a separate static HTML page for every entry. Which would mean thousands of extra HTML pages sitting around on my server, instead of generating them on the fly. No thanks.

I didn't even know MT had an option to generate the pages on the fly. Are you sure you want to do that? I think the fact that MT generates static HTML is one of its best features. Disk space is cheap, HTML pages are small, and serving static pages is much less taxing than generating dynamic ones. Plus if anything happens to your database, or you want to retire the site, or archive it in any way, you've got the files right there.


M: Not sure. I think it would be possible to upgrade, but I'm not sure when I'll have time to look at it. Maybe next weekend. Another option would be to install MTBlacklist, a plugin that's good at blocking spam and works in pre-3.2 versions.

David: Fascinating; I was just thinking "Surely nobody actually wants to use static pages anymore, do they?" MT has had the option to do dynamic pages for a couple versions now, but I forget which version you're using. Anyway, I strongly prefer dynamic pages. It seems very natural to me: you've got a database full of content, you want to view pieces of that content, so you have a single page template and you build pages based on that template by drawing from the database. That's how my journal system has always worked. I don't like the idea of having a separate static copy of every piece of the database; it feels redundant to me.
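A minimal sketch of that single-template-plus-database idea (the table, template, and function names here are illustrative, not from my actual journal system):

```python
import sqlite3

# One page template for every entry; the database holds the content.
TEMPLATE = "<html><head><title>{title}</title></head><body>{body}</body></html>"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
db.execute("INSERT INTO entries VALUES (816, 'French Scenes', '<p>...</p>')")

def render_entry(entry_id):
    """Build the page on the fly from the database row -- no static copy kept."""
    row = db.execute(
        "SELECT title, body FROM entries WHERE id = ?", (entry_id,)
    ).fetchone()
    return TEMPLATE.format(title=row[0], body=row[1])

print(render_entry(816))
```

Change the template once and every page changes, with no rebuild step.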

And if I build on the fly, I can instantly change anything about the site. Static building also lets you change one place and have the change appear across the site, but it's slow and gets slower as the site gets bigger. When I switched to static to test the OpenID thing, it took quite a while for MT to generate all 2500+ pages, and that'll just get worse over time.

Presumably I won't be making sitewide changes very often, so arguably the speed of rebuilding is irrelevant once my design settles down. Certainly at SH we made the decision that since published content is unlikely to change, we might as well create static pages rather than going to the trouble of storing all the content in a database. But (a) the average SH piece is a lot longer than the average entry of mine, (b) my entries are already in a database, so keeping them there is no extra trouble, and (c) there's more uniformity of format in my entries than in SH's content (I don't have to worry about spanning multiple weeks, for instance).

(Also, whenever someone posts a comment on an old entry, doesn't that page have to be rebuilt? I suppose that's still less of a load on the CPU than if the page has to be rebuilt every time, but really, I can't imagine that page-building is a significant problem for the processor.)

So I guess my counterargument is: CPU is cheap, dynamic building is elegant (much as using one central CSS file is more elegant than including all your CSS in every page), and it's the way the web has been headed for a while; more and more of the web is dynamically generated these days. I'm sure static pages will never go away, and I don't object to other people using static builds of pages from databases, but the idea of doing it myself rubs me the wrong way. Which is really what it comes down to; I'm not really making a logical argument here, more trying to justify a gut feeling.

"if anything happens to your database"—that's why I back up my databases regularly. And something could just as easily happen to my site.

If I want to retire the site, it's (marginally) easier/quicker to remove the database than to remove thousands of static pages. Or by "retire" did you just mean "stop posting"? That seems equally easy either way.

If I want to archive it, I would much rather archive the database and the template files than the static pages. That would let me later generate it in any form I wanted (including the static form), rather than having to painfully extract the content from the static pages before I could do anything else with it.


If you can point me to instructions on installing MT Blacklist, that'd be appreciated, then...


Well, Jed, bear in mind that you're talking to somebody who's two hundred percent burnt out on configuring software, and to whom the idea of writing his own journaling application is at this point actually marginally more attractive than figuring out how to upgrade, back up, and/or otherwise feel fully comfortable with the MT system he's got. :)


P.S. And that goes double for the box said MT system's running on.


BTW: The downside to using dynamic page generation is that Google can't index the contents. That means that your wise words are unfindable via search engines, and that if your machine went down, or you were hit by a car, etc., your wisdom would be lost to the ages.

Not that you necessarily mind, just saying. Gary Price calls the phenomenon the "Invisible Web"...the subject(s) of which are being batted about in the library/archivist world.


David: well, if you want my homegrown system, I'd be happy to give you a copy. :) But that doesn't really address the issue. (For that matter, if you want a blog at kith, in my MT3.2 installation, you're welcome to one. But I suspect that doesn't help either.)

Brandon: Actually, it's not true that Google can't index the contents of dynamic pages. For example, if you Google for [waldrop godard truffaut riot], the first result is my journal entry about Waldrop's "French Scenes."

And the new Google Blog Search is specifically focused on blogs, many of which are dynamically generated at this point.

The term invisible web refers to pages that a search engine can't get to, either because there are no existing links to those pages or because you need to log in to get to them. If the only way to get to a particular page is by typing something (like a search term for a searchable database), then a search engine can't index that page. But in an online journal, all publicly viewable pages are linked to by other pages (in a chain of links going back to the journal's main page), so all such pages are reachable by search engines, including Google. Any page that you can reach by following a link (and that isn't password-protected, and that you didn't explicitly mark as not to be indexed) is indexable by search engines, including dynamic pages.

And btw, I know this is a nitpick but I think it's important not to be too alarmist about this: if my machine went down, my web hosting company (Pair) would replace it, and I would restore the content from the backups I keep. If Pair went out of business (very unlikely), I would host my site somewhere else (using the backups I keep). This is all independent of Google; if a static site goes down for an extended period, Google doesn't keep showing the content indefinitely.

The whole hit-by-a-car thing is another different issue. I would hope that someone would keep my site going for a while (certainly it would stay up by itself until the money I've pre-paid Pair ran out), but again, this is independent of static vs dynamic, and independent of Google.


I appreciate the offer, Jed, but then it’d just be your software that I’d have to learn to install, configure, and administer. :)

On dynamic page generation, where it’s really a problem (and it is really a problem) is where the same URL can't be counted on to get you the same content, or where large amounts of content all share the same URL. It happens a lot with sites that use too much POST and not enough GET.


Jed -

It's not entirely untrue, either: the URL that you referenced in your query is not a persistent URL, such as a static web page would have. What Dave mentioned above is exactly the issue at hand.

Not to say that dynamic page generation is horrible or anything...just that there are some issues with relying on them.


Brandon -- I have to disagree. I don't think a URL that ends with "show-entry.php?Entry_ID=816" is inherently more unstable or less persistent than one that ends with "entryID_816.html". Either one depends on Jed maintaining the same naming convention on his server. As Dave says, the problem is only when a given URL (and URL here means the entire URL, including the querystring at the end) doesn't uniquely identify a page. Which is a failure of the programmer, and is not the case with Jed's homegrown system or with MT.

As Dave alludes to, the method used to tell a server which dynamic page you're looking for should always be GET (a querystring appended to the URL) rather than POST (form contents submitted in the request body, not in the URL). POST is only meant to be used for entering information into a database or otherwise submitting information that changes the state of a server. Submitting this comment, for example, uses POST. Submitting a Google search query uses GET.
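To make the distinction concrete, here's a small sketch in Python (the entry ID and script name are just illustrative): the same parameters either become part of the URL (GET) or travel in the request body, leaving the URL bare (POST).

```python
from urllib.parse import urlencode, parse_qs

params = {"Entry_ID": "816"}

# GET: the parameters are encoded into the URL itself, so the full URL
# uniquely identifies the page -- linkable and indexable.
get_url = "/show-entry.php?" + urlencode(params)
print(get_url)  # /show-entry.php?Entry_ID=816

# POST: the same parameters would be sent as the request body instead,
# which is why a POST-only page can't be bookmarked or crawled by URL alone.
post_body = urlencode(params)
print(parse_qs(post_body))  # {'Entry_ID': ['816']}
```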

Sorry to get technical.

Let me also point out that snazzy web servers like Apache let you transform a URL from one form to another, so Jed could set up his site such that you would enter a URL ending with, say, "/showEntry/EntryID_816.html" and the server would serve up "/showEntry.php?Entry_ID=816". It would still be a dynamic page, but would look like an HTML page based on the URL.
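For the curious, that kind of Apache rewrite might look roughly like this (a sketch assuming mod_rewrite is enabled; the script name and URL pattern are the hypothetical ones above):

```apache
# e.g. in .htaccess: map the static-looking URL onto the dynamic script
RewriteEngine On
# /showEntry/EntryID_816.html  ->  /showEntry.php?Entry_ID=816
RewriteRule ^showEntry/EntryID_([0-9]+)\.html$ /showEntry.php?Entry_ID=$1 [L]
```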


Jacob - No worries - you haven't gotten too technical at all!

Anyhow...not worth debating about as you two already know about the subject, and that was the only point I had!