Indexing, Tagging, and Other Locating or Scanning Devices

I recently had the (ahem) extreme joy of going through my manuscript for Show Sold Separately: Promos, Spoilers, and Other Media Paratexts to make the index (that’s not it above — that’s a screenshot from Google Books’ copy of Watching with The Simpsons). It’s a bittersweet moment in book publishing, since it’s the last thing you need to do before you then see it as a tangible object a few months later … yet it’s not a fun task (though one can introduce very small elements of fun: check out the Dharma Initiative entry in my index to Television Entertainment for an example, or the Bill O’Reilly one in Satire TV). You could pay someone else to do it, but then you won’t see a dime of proceeds from the book, and while I’m not foolish enough to think I’ll make much money off my books, it’s nice to get at least something out of it, even if that something equates simply to a load of groceries or a nice dinner.
Indexing’s also a complex act, since you must wrestle with who you’re doing this for – yourself or others – and if you answered “others” to that question, you then need to try to predict what categories will make sense to these hypothetical readers and their interests. I thought I’d reflect a bit on that act here, while discussing other kinds of memory and locating devices.
An index is an interesting paratext, since it’s an attempt to provide an analog version of a web browser for your book. Of course, in a Google Books era, one could see that as a redundant act, but I’m sure indexes will still prove important, since they can also be really helpful to a casual book browser deciding whether the book’s worth it. The index can be a place to see what sort of terms and what sort of individuals hold sway over a book (which can prove all the more helpful if, like SSS, the book uses endnotes rather than a bibliography). Who and what are the book’s “sponsors”?
Along those lines, I wish modern books would include Wordles, as a much more pleasing way to look at word frequency. I’ll publish more when the book’s out, but for example, here’s a Wordle of Chapter 3, made by wordle.net:

Ultimately, though, I also find myself wracked by concern that the words I’m using aren’t the words that others might be searching for, even if they’re close correlates. So, for example, I write a lot about paratexts as frames, yet someone interested in “framing” might be disappointed to find “frames” missing from the index. Similarly, since I mobilize the terms “synergy,” “hype,” “paratexts,” “peripherals,” “extratexts,” “licensed merchandise,” and a variety of others, I had to wrestle with when to index each of these, without making the index a pp. 1 – 250 kind of affair. But what if the term I chose not to index is what someone’s looking for? And, since I invoked Google Books, what if I’m not using the word “synergy,” for instance, yet am discussing exactly that, when someone is Googling that phrase and finding nothing?
Such questions may seem woefully vain, concerned only with how people will find me and my book, but if we move beyond the case of simply my book, they’re cause for concern. After all, I face similar issues when I need to decide how to tag each of these blog entries. As does everyone indexing or tagging. And thus if we add all those gaps and human errors of indexes and tagging together, we have an archive that can be remarkably hard to access. I’m inspired here by a recent (as yet unpublished) piece I read by one of my dept’s fantastic grad students, Megan Sapnar Ankerson (who, by the way, is on the job market, so hire her if you can), about doing Internet historiography and about the Net’s many, many, many missing pages and records. As we all start to use database searches more and more, and as the world – and hence those databases – fills with ever more publications, we can often fool ourselves into thinking that they’re giving us access to everything. But it all rests with the accuracy of the indexing, the tagging, and whatever system a database uses.
So I’m left wondering: are there better systems in place? And if so, what are they?
Finally, I should note that this concern is personal in other ways too. The more that I read and the more that my head fills, the more I’ll need my indexes and other search mechanisms to access all the research I’ve already done. In grad school, I was often driven by a fear, when researching, that I’d forget all that I’d read, or that years from then I’d overlook the perfect quotation or idea, simply because I’d forgotten about it. Especially if that perfect idea was an aside in a book about something else. So I indexed my notes, quite painstakingly. But just as my thoughts on topics change, so do the categories into which I put them, and my vocabulary for them; thus, looking back on my index for everything I read as a PhD student, I’m confused as to why I classified some things in one category and not another, and so I know I’m likely to forget about a great many of these things. What strategies do others have to wrestle their memory into submission?
Behind on my blog reading, so I just read this. A couple of things come to mind: you didn’t mention one of the primary ways that indexes are used for academic titles – a short-cut to see the theoretical influences and models used in a book. I remember in grad school, this is one of the main ways I mapped the field, seeing what work was mostly Marxist vs. Foucauldian vs. pomo, etc. (And of course now that we’re published authors ourselves, the index is an analog equivalent to self-Googling!)
Another thing that comes to mind is that although it may not be monetarily advisable to pay for an indexer, it is really eye-opening to see how someone else categorizes your book. I hired a student to index my recent book, and it was quite interesting to see how she categorized the content. Maybe not worth the money if you don’t have a funding source, but it does shine a light on your own writing in a way that neither the author nor Wordle can.
Ahem, I wrote “The index can be a place to see what sort of terms and what sort of individuals hold sway over a book [....] Who and what are the book’s ‘sponsors’?”, so yeah, I agree with your first paragraph that this is a really important use for indexes. As I wrote that bit about “sponsors,” I thought of Sesame Street episodes and the letters or numbers that sponsored them.
And I agree it would be fascinating to see how different an index by someone else would look. Indeed, I’d love to see three or four competing indexes to illustrate the point. Alas, there would go my semblance of royalties.