<EPIGRAPH><CIT><Q><LG TYPE="quotation">
<L>Since I can do no good because a woman</L>
<L>Reach constantly at something that is near it.</L>
</LG></Q><BIBL>--Beaumont and Fletcher.</BIBL></CIT>
</EPIGRAPH>
<EPIGRAPH><Q><L>Mein Herz, mein Herz ist traurig.</L></Q></EPIGRAPH>
<P>Then our heroine crossed the dangerous
<PB>
<FIGURE><HEAD>Our Heroine Crossing the Dangerous
River.</HEAD></FIGURE>
<PB>
river.</P>
If an illustration should occur between paragraphs, you must enclose the <FIGURE> element within <P> tags:
<P>Then our heroine crossed the dangerous river.</P>
<PB>
<P><FIGURE><HEAD>Our Heroine Crossing the Dangerous
River.</HEAD></FIGURE></P>
<PB>
<P>She noticed, however...
Finally, if an illustration should occur at the beginning of a chapter, you should enclose it within <HEAD> tags:
<DIV1 TYPE="chapter">
<PB>
<HEAD TYPE="figure"><FIGURE><HEAD>Caption
[if any]</HEAD></FIGURE></HEAD>
<HEAD>Chapter II.</HEAD>
<P>Chapter begins here...
And for you, my noble lady, take my blessing on your
head,
it should be encoded as a single line of verse, without the line break:
<L>And for you, my noble lady, take my blessing on your head,</L>
act | advert | book | castlist | chapter | colophon | concluding | contents | corrigenda | dedication | endnotes | epigraph | frontis | index | letter | part | poem | preface | scene | section | undetermined
For those sections called "chapters" use TYPE="chapter". Larger divisions are called "parts" if they have no designation of their own, and smaller divisions (within chapters, poems, etc.) are called "sections" if they have no designation of their own.
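A quick way to catch a mistyped TYPE value is to check every division tag against the list above. The following is a minimal sketch, not part of the project's tooling: the names ALLOWED_TYPES, DIV_TAG and find_bad_div_types are hypothetical, and it assumes that division tags look like the <DIV1 TYPE="chapter"> example shown earlier, with TYPE as the first attribute and a single digit after DIV.

```python
import re

# TYPE values copied from the list in these guidelines.
ALLOWED_TYPES = {
    "act", "advert", "book", "castlist", "chapter", "colophon", "concluding",
    "contents", "corrigenda", "dedication", "endnotes", "epigraph", "frontis",
    "index", "letter", "part", "poem", "preface", "scene", "section",
    "undetermined",
}

# Assumption: division tags are written like <DIV1 TYPE="chapter">,
# with TYPE as the first attribute and one digit after DIV.
DIV_TAG = re.compile(r'<DIV(\d)\s+TYPE="([^"]*)"', re.IGNORECASE)


def find_bad_div_types(text):
    """Return (line_number, div_level, type_value) for each TYPE not in the list."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for level, value in DIV_TAG.findall(line):
            if value not in ALLOWED_TYPES:
                problems.append((line_no, int(level), value))
    return problems


if __name__ == "__main__":
    sample = '<DIV1 TYPE="chapter">\n<DIV2 TYPE="chaptr">\n<DIV1 TYPE="poem">'
    for line_no, level, value in find_bad_div_types(sample):
        print(f'line {line_no}: DIV{level} has unlisted TYPE="{value}"')
```

The check is case-sensitive on the value itself, so TYPE="Chapter" would also be reported; loosen that comparison if your files mix cases.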
Proofreading is the most important part of this whole project, and the most difficult. It requires a great deal of concentration and a different kind of reading than you're probably used to. You'll want to record the text exactly as it appears on the page, so take the time necessary to see the page as it is printed. The best method is generally to read a line of text on the page, then read the same line on screen. When you find an error, fix it and reread the line to see if there are other errors in the same line. You'll also want to impress upon your staff the importance of accurate proofreading.
I've tried to provide the most accurately OCR'ed pages possible. In fact, I've been OCR'ing most of the pages twice, using different settings, and choosing the more accurate page. But in some cases the printing is too faint or too dark, or has too much bleed-through or too many random speckles, and the OCR comes out completely garbled.
The best thing to do in these cases is to retype the page, rather than try to correct the errors in the OCR'ed text--you'll figure out pretty quickly which pages these are. The more errors a page has, the more difficult it is to proofread, because you may fix one error, but miss others in the process. A certain number of titles simply cannot be OCR'ed, and we'll have to consider having these titles retyped by a vendor. Contact me if you think your text has too many errors to proceed.
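If you happen to have two OCR passes of the same page, comparing them can point you to the lines most likely to contain errors and give a rough sense of whether the page is worth correcting at all. This is only a sketch under that assumption, using Python's standard difflib module. It is not the procedure used to prepare the pages for this project, and the function names are hypothetical.

```python
import difflib


def agreement(pass_a, pass_b):
    """Return a similarity score between 0.0 and 1.0 for two OCR passes of one page."""
    return difflib.SequenceMatcher(None, pass_a, pass_b).ratio()


def disagreeing_lines(pass_a, pass_b):
    """Return (line_number, text_a, text_b) for each line where the passes differ.

    Lines are paired by position, so this assumes both passes broke the
    page into lines in the same way; extra lines in the longer pass are ignored.
    """
    flagged = []
    pairs = zip(pass_a.splitlines(), pass_b.splitlines())
    for number, (a, b) in enumerate(pairs, start=1):
        if a != b:
            flagged.append((number, a, b))
    return flagged


if __name__ == "__main__":
    first = "Then our heroine crossed\nthe dangerous river."
    second = "Then our heroine crossed\nthe dangcrous river."
    print(f"agreement: {agreement(first, second):.2f}")
    for number, a, b in disagreeing_lines(first, second):
        print(f"line {number}: {a!r} vs {b!r}")
```

A low agreement score suggests that at least one pass is badly garbled, but a high score does not mean the page is clean: both passes can make the same mistake, so the line-by-line proofreading described above is still needed.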
Some common errors you'll find from the OCR include:
URL: http://www.letrs.indiana.edu/wright/guidelines.html