Current Projects:

In collaboration with scholars at the University of Pittsburgh and Bowdoin College, we are using text mining and machine learning methods to compare the word usage habits of three Supreme Court Justices and to suggest the possibility that Justice Scalia played a significant role in the authorship of the 2000 Bush v. Gore concurrence. An analysis of high-frequency function words in the concurrence shows many hallmarks of Chief Justice Rehnquist’s signature style. In the medium- to low-frequency strata of lexical words, however, we see compelling evidence of Scalia’s preferred vocabulary. Our preliminary analysis suggests that Chief Justice Rehnquist may indeed have drafted the document but that much of its more forceful language may have come from Justice Scalia’s intervention.
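As a rough illustration of what a function-word comparison can look like, the sketch below computes the relative frequency of a few function words in a text and a simple distance between two such profiles. It is not the project's actual pipeline: the word list, the function names, and the distance measure (mean absolute difference) are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# Hypothetical, abbreviated function-word list (an assumption);
# a real stylometric study would use a much longer list.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "it", "but"]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return the relative frequency of each function word in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total if total else 0.0 for w in words]


def profile_distance(p, q):
    """Mean absolute difference between two profiles (smaller = more similar)."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)
```

In use, one would build a profile for the disputed text and for known writing samples from each candidate author, then compare distances. Small differences on short texts should be read cautiously.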

In collaboration with the Walt Whitman Archive, the Lab is working on a project that uses computational tools to explore Whitman’s Leaves of Grass. Whitman’s collection of poetry has a unique publication history: it was published in nine separate editions during Whitman’s lifetime. Whitman heavily edited each edition, adding, changing, and deleting material. This project seeks to better understand Whitman’s editorial process by examining these changes. Currently, we are using topic modeling and sentiment analysis to examine whether the themes and emotional valence of Leaves of Grass changed across editions.

While validating the results of the Syuzhet R software package, the Lab noticed discrepancies among human coders in their categorization of sentences as positive, negative, or neutral. When sentences were ambiguous, different readers were more likely to assign them opposite sentiment values. A team including researchers from English and Psychology is using fMRI technology in the Center for Brain, Biology, and Behavior to better understand reader response to emotional language. The team is now analyzing fMRI data alongside demographic data from the pilot stage of the project to understand the relation between reading and affect.
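Disagreement between two coders of this kind is often summarized with a chance-corrected agreement statistic. The sketch below computes Cohen's kappa for two coders' sentence labels. It is offered only as an illustration; the source does not say which agreement measure, if any, the Lab used, and the function name and label strings are assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same sentences.

    Returns 1.0 for perfect agreement and about 0.0 when agreement
    is no better than chance.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty set of sentences.")

    n = len(labels_a)
    # Observed agreement: share of sentences given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each coder's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

For example, `cohens_kappa(["pos", "neg", "neu"], ["pos", "neg", "neu"])` reports perfect agreement, while two coders who agree only as often as chance predicts score near zero.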

Past Projects:

The Lab has worked on three projects that explore the works of the celebrated American novelist Willa Cather. All of these projects use computational methodologies to mine Cather’s texts, including her fiction, journalism, and correspondence, and all explored questions related to Cather’s style. One project examined how Cather’s novelistic style differs from the style of her letters and journalistic writing. One objective of this project was to identify a core or root style that transcends genre, which is a frequently noted complication in authorship attribution research. A second Cather project examined the extent to which Cather’s novels exemplify her own novelistic ideals. In this work, Cather’s prose is held up to the standards of prose style that she articulated in her ars poetica, “The Novel Démeublé.” Finally, a third Cather project employed the tools and techniques of computational authorship attribution to reassess a series of anonymous journalistic articles that have been attributed to Cather. You can read more about these projects in an article released by the University of Nebraska-Lincoln’s press office:


The Lab has also supported larger-scale projects focused on characterization in the 19th-century novel. One project developed new methods for the identification, extraction, and measurement of character networks. A second project explored character archetypes and specifically focused on the types of behaviors associated with male and female characters in 19th-century novels. These projects asked questions such as: What are the differences between character networks in 19th-century American and British novels? Do male and female authors create different types of character networks? What types of behaviors and actions are associated with male and female characters in the 19th-century novel? How do these behaviors evolve over the course of the century?
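One simple way to approximate a character network is to count how often two characters are named in the same sentence. The sketch below does only that and is not the Lab's method: the sentence splitter, the plain substring matching of names, and the function name are all simplifying assumptions.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_network(text, characters):
    """Count how often each pair of character names shares a sentence.

    Returns a Counter keyed by alphabetically sorted (name, name) tuples.
    Names are matched as plain substrings, so "Jane" would also match
    "Janet"; a real pipeline would need proper name resolution.
    """
    edges = Counter()
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    for sentence in sentences:
        present = sorted(name for name in characters if name in sentence)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges
```

The resulting edge weights can then be compared across novels, for instance by looking at how many distinct pairs occur or how concentrated the interactions are around a few characters.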



The Lab has supported projects that use computational methods to analyze the concept of “genre.” This work dovetails with our projects on character while adding genre as an additional facet for analysis and correlation. Here we explored questions such as: Are there trends in the ways in which female and male characters are portrayed within different genres? Are male and female characters “allowed” to behave and speak differently depending on the conventions of the genre in which they appear? To read more about the Lab’s work on gender and genre, please see
