Wednesday, August 31, 2016

Photoquotes

Def:    A quotation that is not transcribed but, rather, photographed and embedded in the flow of prose where normally a block quotation or quotation marks would be used.  A photoquote is not an illustration of what the scholar is referring to. It is not even just supporting evidence.  It becomes an integral part of the argument.
Quotation is a fundamental tool of historical and literary scholarship--and it is tricky. Anyone who has vetted a manuscript, be it an essay assignment in school or an article or book manuscript, or anyone who, in reviewing a book, has checked the quotations against the originals, knows how easy it is to find typographical errors.  But more important, it is easy to find cases where the quotation is not from the alleged source, but rather from some knockoff reprint.  It is also easy to identify cases where the quotation was truncated strategically or taken out of context.  
Photoquotes make such errors and deceptions more difficult if not impossible.  Scholars in search of facts and sound argument would welcome that difficulty.
Perhaps the first remarkable use of photoquotes was in descriptive bibliography when the quasi-facsimile transcription of title pages gave way to photographic reproductions.  The quasi-facsimile transcription was an attempt to reproduce with codes, symbols, and strategic deployment of white space the salient characteristics of a title page in order to give users of the descriptive bibliography a way of identifying copies of a book.  The title-page photograph not only gave a more detailed rendering of the original, it eliminated the human error virtually inevitable in transcriptions.
Skepticism and a will to believe form a basic tension in scholarship.  Scholars always want to know where information comes from.  Footnotes and lists of works cited do not just witness the range of a scholar's frame of reference, they do not just bolster the arguments with an array of authorities.  They are primarily the identification of the sources of one's information.  And yet, the most scrupulous attention to annotation only gives readers a place to go and check the accuracy and fairness of one's use of sources.  Scholarship's report about where information comes from does not change the fact that quotations and reports of information are, to the reader, hearsay--reports at second hand.
Thomas Paine explained his reluctance to believe in God and an afterlife on the grounds that he had no firsthand evidence.   He said he did not deny the possibility of revelation; he only denied ever having had a revelation himself.  He did not deny that other people might have had a revelation from God; but to him, when these revelations were reported, they were secondhand.  He wanted evidence.  Presumably he was interested and would have welcomed evidence of God and the afterlife.  He was not stubborn, just skeptical.  He would not bank his life on secondhand information.    That is perhaps extreme if applied to all knowledge.  But scholarship is about identifying and evaluating sources and replicating arguments.   Accepting on faith what one reads is not scholarship.
Photoquotation may be susceptible to manipulation, but a photo taken from a properly identified source document, with sufficient inclusion of margins and context, would give readers a sense of authenticity far more convincing than a mere footnote to a block quotation.   For example, the following is taken from an essay on the composition and publication of Virginia Woolf's To the Lighthouse.  Imagine it with transcribed block quotations, rather than photoquotes.

Readers familiar with To the Lighthouse might be reading the following passages for the first time, since they appear in very few places--and not in any trade edition of the novel.  The question before us is, Do the cut passages add to or change one’s notion of James and his relation to his father?  All the cut passages are in the mind of James, who does not elsewhere in the novel think or express these thoughts.  In the text as it was published on both sides of the Atlantic, James thinks about his father’s nearing presence.
Fig. 2  (p 286 British Edition)
(Grateful acknowledgement is made to The Society of Authors as the Literary
Representative of the Estate of Virginia Woolf for permission to photo-quote
from Woolf's original documents.)

Originally, the proof version that Woolf thought was the final form was rather different.  One must imagine the next picture of proof, pages 286-87, without the black pen alterations, which were copied by an editor at Harcourt Brace in New York onto this original set of proofs only after Woolf had altered and sent in the second copy:

Fig. 3. Two pages of 1st proofs with Harcourt Brace editor's alterations
(copied from 2nd proofs, see Fig 4).
Given a second directive from the printer to cut additional material from the novel, Woolf produced the following:
Fig. 4.  Second copy of proofs with VW's inked alterations.
The HB editor accurately recopied onto the first copy of proofs the alterations Woolf had made on the second set of proofs.  No problem there, but the major deletion seems to me very significant, removing from James's imagination the horror of thinking of his mother with his father near.  Readers can make of that what they will; it is not explicit elsewhere in the novel.

Friday, August 12, 2016

Computers, Wood, and Textual Studies



     Of course nothing I know about textual studies required that I learn it from computer programming or woodworking, but some of us are slow learners or learn inadvertently from seemingly unrelated events.
     In 1976 I managed to get funding from the NEH for an edition of William Makepeace Thackeray’s The History of Pendennis, an 800-plus-page serialized Victorian novel.  Tucked into the grant proposal was a request for funding the development of a computer collation program.  It was to be built on a prototype that had been developed by Susan Follet, a computer science MA student, under the joint direction of Edwin Ellis, professor of computer science, and Miriam Shillingsburg, professor of English.  Their work was based on earlier prototypes by Margaret S. Cabannis and Penny Gilbert.  So, in 1976, I began working with Russell Kegley, another MA student in computer science, meeting with him all day every Friday for nearly an academic year.  The end result, CASE (Computer Assisted Scholarly Editing), was a suite of nine programs that enabled collation, manipulation and merging of collation results, manipulation of diplomatic transcriptions, printing of alterations in manuscripts, textual apparatus tables, and computer typeset edited texts.  But the process, not the product, was what taught me what I had failed to learn in the normal course of training and practice in textual studies.  The lesson was that in my line of work it was easy to fool oneself into believing one was being careful, thorough, and conclusive.  When I had to explain to a computer (by way of a programmer who writes the explanation as instructions in a programming language so that a machine will do what is required) what needed to be done, and with such clarity that the instructions could be written in a yes/no, off/on, if/else form, I discovered that in the humanities there is normally nothing to stop us from fudging the little stuff, allowing guesstimates and plausible speculations to influence our results.  When I applied the unforgiving precision of computing to my process, the result of every fudge and every speculative guess was a jammed program or an endless loop.
Computer science--or at least programming--teaches attention to detail and the need to discover that which would otherwise be swept under the rug.
     The real result of that year of working with Russell Kegley was not the suite of programs, CASE, which went on to be used in preparing print editions of Thackeray, Irving, Carlyle, Dickens, Hardy, and others, using forms of CASE developed for Univacs, Dec10, IBM mainframes, Macs, PCs, and laptops in a variety of languages, including PL1, EBCDIC, Pascal, TurboPascal, and C, for various operating systems including Linux.  That was pretty satisfying, but not the main deal for me.  Nor was it the scholarly edition of Pendennis, which I was able to use to support grant proposals that led to the production of nine other volumes in the Thackeray edition, which was also satisfying.  The main product of that initial grant was my comeuppance in discovering that my hatred of error and my attempts to pay attention to detail could be thwarted by a computer.  My first reaction, of course, was to blame the idiot machine that would do only what it was instructed to do.  That, of course, gave way to the realization that it was an idiot that was instructing the computer to do idiotic things.  Humbling and, eventually, enabling.  The lesson underlies the most useful thing I have done as a result of having undertaken textual studies: the writing and publishing of three editions of Scholarly Editing in the Computing Age (SECA, for short).   I have written other books, but SECA is the only one that has led perfect strangers to come up to me and tell me how helpful that book was to their own journey in textual studies.
    That brings me to woodworking, which I have taken up in a more or less serious way since I retired in 2013.   I already had a forest of hardwood trees in the steep mountains of western North Carolina, and I had acquired chainsaws, a tractor with a logging winch, a heavy duty pickup truck, and a variety of woodworking tools.  Weather causes trees to fall in the woods, where they rot, unless someone cuts them up for firewood or drags them out and takes them to a mill to be made into lumber.   That kind of work is so rough that one has to make a big mistake for it to be noticed.  If the mistake is big enough, one does not live to tell the tale.   Once the boards are cut, and air dried, and kiln dried, and planed, they can be made into floors, or tables and benches, or cabinets, or bat houses, or anything you can think of that is made of hardwoods, like maple, oak, cherry, or hickory.   My son raised my woodworking game to a new level when he bought me some new tools in exchange for giving his son a week of carpentry camp.  Table saws, radial arm saws, band saws, planers, drill presses and lathes are great for concentrating the attention of 9 and 10 year old children, but I have also found that my grandchildren’s parents and even former graduate students love carpentry camp.  Not to be outdone, another son gave me a jointer in exchange for a promise of a new desk made of hickory wood, which he and I found in the woods and dragged out together with the help of his wife and three of his children and two of my dogs--and of course the tractor and winch.
     It was in building the desk that I discovered another truth or principle important in textual studies.   Because the jointer was expensive, I had to make the desk perfect.  Time and effort would just have to be spent to make things right.  Error could not be tolerated.  Error would destroy the aim of making a product that I could be proud of and that would not embarrass my son.  The harder I worked to remove the imperfections of my work, the more obvious the little flaws became.  When one uses rough sandpaper to get rid of obvious nicks and irregularities caused by rough power tools, more little nicks start showing up.  The finer the grain of sandpaper, the clearer and more abundant the tiny flaws become.
      My father once told me that the word “sincere” came from carpentry.   Carpenters, it seems, have from time immemorial used wax or some such substance to fill in the nicks and crannies that are too difficult to sand away.   The Latin word for wax, ceram, combined with the word for without, creates sincere, without wax.  Let me tell you.  It takes a lot of work to be sincere.
      In fact it is so hard to be sincere in carpentry that it reminds me of nothing more than how hard it was to be sincere in textual studies.   To be sincere means not to pretend to anything other than what you have done.   Either you use wax or some other substance to fudge your product and hope no one notices, or you work hard enough to not need wax or any other substance, OR you come to the realization that you cannot produce a perfect product.
      I began by saying that some of us are slow learners.  I now realize that this certainty of imperfection was well understood in textual studies long before I came along.  And yet not all students of texts, not even scholarly editors, have learned how to deal with the inevitability of imperfection.  The idea that one can produce an edition that will never have to be done again is a holy grail, tempting untold numbers of Jasons in the field.  But the old masters already knew.  They called their editions Critical Editions.  Some hopeful fool coined the term Definitive Edition, but that did not last long.  Textual scholars tend, on the whole, to be honest people--hopeful always, naive sometimes, but mostly honest.  No one claims to have produced the perfect critical article that will end the need for new critical articles.  That would be laughable.  Critics can do seminal work, but they cannot do terminal work.  Likewise, textual critics and scholarly editors can bring high levels of sophistication and skill to their work; they can be innovative and exciting; but they cannot produce a definitive edition that will make all other editions unnecessary or passé.  That is equally laughable.
       My son’s desk is imperfect.  I told him I would build a replacement desk.  In the meantime, he put the imperfect desk in his home office.  The first person who came to his house to do business offered to buy it.  But it is not good enough.   Neither is that edition you just finished producing.  Learn to live with it.

Bichitra: The Making of a Tagore Website

Sukanta Chaudhuri (with others)

            There are, in the world, more native speakers of Bengali than of Russian, Japanese, German, French, or Italian.   There is only one Bengali writer who has won the Nobel Prize for Literature, but the archive of his writings is larger than Shakespeare's, Goethe's, Proust's, or Faulkner's.    His name is Rabindranath Tagore, poet, novelist, historian, dramatist, painter, sculptor, composer, educator, translator.    His archive of manuscripts and printed works, amounting to over 140,000 pages, is the first major writer's archive to be entirely (almost) digitized and posted to the Internet--"almost" because 40 rare books out of 450 books and 300 out of 3,200 journal items could not (yet) be obtained for reproduction.  The virtual archive was accomplished in two years by a team of 30 plus researchers and computer programmers funded primarily by the Indian government, which found itself justly proud of its Nobel Laureate on the occasion of his 150th birthday in 2011.
            How they did it and why you should care is the subject of a new book, Bichitra: the Making of a Tagore Website, by the project director Sukanta Chaudhuri.   Readers of Chaudhuri's book, The Metaphysics of Text, are familiar with his elegant and clear prose, his attention to detail, his self-effacing grace, and his incredible stamina.  Most of the world needs this book because we don't know Tagore well enough, we don't know Bengali, and we don't know how to build or use virtual archives.  The onus is on us but Bichitra, the book, makes it easy to find out. 
            The first step is to understand the importance and achievements of Tagore himself.  He is a recognized world figure, but few will know that his works (he wrote in both Bengali and English) exist in multiple versions.  Sometimes he turned a play into a novel or vice versa, or he incorporated poems into novels or other works.  Sometimes his works were both collected and anthologized under his supervision, for which he made changes.  Sometimes he wrote the same work (more or less) in both Bengali and English.   But more often he was discovering new things to say with his already written works--he changed his mind--or finding a better way to say what he originally thought.    The richness of Tagore's archive for the study of the genesis of thought and of literary works is unsurpassed by any writer anywhere.  That is why it is called Bichitra, the various, the curious, the bizarre. 
           Obviously a reader needs more than just this book to explain Bichitra, the website.  One needs to be able to work one's way around in the archive.   So, there are tools: search engines and a concordance engine bring Tagore's words and subjects together.  A bibliography with links to every form of each work aggregates the related materials.  A collation program identifies the variants in the different forms of each work.
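            For readers who have never seen collation done by machine, the idea can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: it uses the standard library's difflib rather than anything from the Bichitra project, the function name collate is hypothetical, the two sample readings are invented, and splitting on white space is a simplification that a real collation tool for Bengali would have to refine.

```python
from difflib import SequenceMatcher


def collate(base: str, witness: str) -> list[tuple[str, str, str]]:
    """Return (operation, base reading, witness reading) for each variant.

    Hypothetical helper: compares two versions word by word and reports
    only the places where they differ.
    """
    base_words = base.split()
    witness_words = witness.split()
    matcher = SequenceMatcher(a=base_words, b=witness_words, autojunk=False)

    variants = []
    for operation, i1, i2, j1, j2 in matcher.get_opcodes():
        if operation == "equal":
            continue  # identical stretches are not variants
        variants.append(
            (operation,
             " ".join(base_words[i1:i2]),
             " ".join(witness_words[j1:j2]))
        )
    return variants


# Invented sample readings, not quoted from any real document.
first_version = "the quick brown fox jumps over the lazy dog"
second_version = "the quick red fox leaps over the dog"

for variant in collate(first_version, second_version):
    print(variant)
```

            Each reported variant is a substitution, a deletion, or an insertion relative to the first version.  Deciding which of those differences matter remains the scholar's job.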
            It is an archive, not an edition.  At one point Chaudhuri modestly calls it a "mere archive" to explain why the site does not explain the genetic process or explicate the significance of textual variants--except for a few examples to show the potential.  He rightly points out that doing so would be a major project in itself.  The site enables that kind of work; it does not do it for us.    There is nothing "mere" about this archive.  For the first time, persons interested in Tagore can read any one of dozens of versions of his works, can read rare works, can read works in the context of collections of Tagore's works or as originally printed, can read the images of original publications or the transcripts made of them in order to be computer searchable.  And readers can read manuscripts not only of works (mostly) published but also of versions that were never published.
            Suppose, however, you are not interested in Tagore.  You can still learn much about the Bengali language and its particular difficulties for keyboards, printing presses, and software for searching and collating.  Even questions about fonts receive careful attention.  In the absence of adequate software environments for major literary virtual archives (even for Roman alphabet languages), the Bichitra project invented its own standards for imaging, for transcriptions, and for collations.  Everyone with a large text project confronts the delight and disaster of OCR (Optical Character Recognition), which even at 98% accuracy produces an average of two errors per 100 characters (counting spaces), or 40 to 50 errors on a page of 2,000 to 2,500 characters.  And OCR is of no use at all for manuscripts, which have to be transcribed manually.   Bichitra represents major accomplishments of interest to digital humanists everywhere--if they can just overcome their lack of interest in Tagore or Bengali.  Ignorance is a comfortably debilitating condition, bliss--sort of.
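            The OCR arithmetic is easy to check.  The following is a minimal sketch in Python, not anything from the book: the function name is hypothetical, and the page sizes of 2,000 and 2,500 characters are an assumption, chosen because they are what the figure of 40 to 50 errors per page implies at 98% accuracy.

```python
def expected_ocr_errors(accuracy: float, characters: int) -> int:
    """Expected number of wrong characters at a given per-character accuracy.

    Hypothetical helper: assumes every character, spaces included, has the
    same independent chance of being misread.
    """
    return round((1.0 - accuracy) * characters)


# Two errors per 100 characters at 98% accuracy.
print(expected_ocr_errors(0.98, 100))

# Assumed page sizes; the source gives only the resulting error counts.
for characters_per_page in (2000, 2500):
    print(characters_per_page, expected_ocr_errors(0.98, characters_per_page))
```

            At that rate a 400-page book of such pages would carry well over ten thousand errors, which is why OCR output has to be proofread before it can serve as a transcription.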
            For me the major accomplishment of the Tagore archive is the images of (almost) every version of every work.  Digital collections of transcriptions are not archives, regardless of what anyone may claim for them.  A transcription is a copy, a reset copy.  It is different from its source text in every character because it is a copy susceptible to error at every character; it is not the original, it is not the same.   Of course, a digital image is a copy also, but it is at least visually accurate.  No one says that a picture of a person is the person.  None should say that a picture of a book is the book.  But digitally, images are as close as technology can get to providing surrogates for the material originals.   Bichitra's crown jewels are its images.  No institution has all the documents; but in this website they are collected, photographed, and mounted.  That is not only great for Tagore studies, but for all aspiring digital archives.  The process, the cameras, the lighting, the negotiations for permissions to photograph, and the alternatives for storing, archiving and displaying images are all so complex that anyone wanting to create a sophisticated archive website will learn much from the Bichitra experience.   But it is so much more.  Images cannot be searched, analyzed or collated.  For these operations transcriptions are needed.  Bichitra provides them. 
             Those last three words were so easy to write.  Over 47 thousand pages of manuscript made transcription anything but easy.  The chapter on manuscript transcription is easily the longest and most interesting because it deals so openly and sensibly with an extremely complex problem.  Most readers will soon get over their unfamiliarity with the language as they get deeper and deeper into considerations of what every manuscript transcriber has experienced.   Transcription is detective work, interpretive work, philosophical work, and practical work.  Before the end of the day, decisions have to be made about how to proceed.  Tagore was a rapid writer and inexhaustible reviser.  Some of his assistants learned to emulate his hand.  Is it a nightmare or a fertile field?  Chaudhuri seems to know that it is the former but he treats it as the latter.
            Every project director and every technical officer and computer science partner on a digital archive project will benefit from reading chapters 6 through 9 in particular.  Chapters 6, 7, and 8 do not shy from technical detail but even technically challenged textual scholars should have no difficulty understanding them.  
            They recount first the task of organizing the file structures required to keep track of hundreds of thousands of individual files of transcriptions and images.   The project team devised a new content management system because there was none to hand adequate for the job. The description of Tagore’s tangled bibliography is merely prelude to describing the organizational system that brought digital order to it.  Next they tackle the job of providing indexing and search capabilities to the website. Third, they describe the construction and function of a collation program that will handle the Bengali language and multiple versions.   These three back-end systems and tools represent a formidable accomplishment; given the time in which it was done, it is like a miracle.  
            Chapter 9 describes the front end--the user interface design and functions.  Given the intricate and orderly content management system, display of content for the user is potentially infinitely malleable.  The achieved system is not perfect but it is more than a very good beginning.  Nevertheless, the project was launched at a significantly high plateau of achievement.
       Chapter 10 treats the entire project as a good start and addresses three areas for improvement: additions to the content; improvements of the internal synchronization of images and transcriptions; and additional analytical tools and uses for the content.   The project, thus, fulfills the expectations of modern modular project structures, rejecting the intricate monoliths of early electronic projects.  It is extendible.
      The book begins and ends with acknowledgements to those who constructed or supported the project.  It is fitting that this description of so large a project, with such high standards, should begin and end so.  It takes a village to build a digital archive.