DH Methods Presentation

April 28, 2014 / June 26, 2015 / Jana Rosinski

I made a handout for my presentation in RMH’s methods course. I tried to structure the presentation as a conversation in which I attempted to answer questions and show examples. I created text files from our exam reading list to mine and visualize, but we ran out of time. We were going to compare Bitzer’s “The Rhetorical Situation” with Vatz’s “The Myth of the Rhetorical Situation” in Voyant Tools. I also had an idea to look at “rhetoric” in each book of Aristotle’s “On Rhetoric” in Voyant Tools to see how the term was situated. I could still pursue these projects myself, but what arose was an interest in comparing rhet/comp exam reading lists from as many institutions as I can find. I don’t know what I’ll do with the data yet, but I am curious to look at the lists across institutions.

Working DH Literature Review

April 1, 2014 / June 26, 2015 / Jana Rosinski

For Rebecca Moore Howard’s CCR 635: Textual Research in Composition and Rhetoric

In creating a literature review for doing digital humanities work, I eventually had to pause: DH is active, and in looking for it, I seem to find it everywhere. Even when reading distantly, that is, using digital tools to read a large number of collected texts for the keywords that illustrate their focus, I still couldn’t cover enough of the smart work being done, the critical questions being raised, and the collaboration and conversation taking place. This synthesis is by no means comprehensive, or perhaps even representative, of the work being done under the umbrella of DH. I have attempted to immerse myself in the flow of conversations and materials circulating on the web, bookmarking every crumb that mentions tools, programs, articles, arguments, definitions, and projects. In a meeting, you (RMH) asked, “is it more accurate to talk about tools than methods?”, to which I couldn’t give an answer. I’m not sure I can now either, but I will work to cover responses that both answer and complicate this question for research that employs DH methods. I will discuss some approaches to framing DH work; how texts become sets of data; creating visualizations of data; interactions of scale in creating and reading visualizations; ethical, social, political, economic, and cultural considerations; and curation of DH research.

Framing DH: Definitions, Justifications, Orientations

I thought a reasonable place to begin my research into DH methods would be to define what they are. However, I quickly learned that DH has no uniformly accepted definition. From my reading, I attribute this to variations in what it means to do digital humanities work (as influenced by disciplinary affinity: computer science, artificial intelligence, social science, etc.); in how tightly one aligns oneself with the “digital” or the “humanities” (both rich terms in themselves); and in how one envisions the joining of the two terms, that is, what work is possible and what work should be of concern. Often DH is referred to as a large, overarching structure: a tent, an umbrella, something under which different alignments, investments, and affinities can find space. DH isn’t singular in its material base, projects, methods, or scholarly sphere. What surfaces with frequency is a conceptualization of research that works to answer the call of the humanities to increase the use of evidence, conducting RAD research that is replicable, aggregable, and data supported. This suggests a move that engages the materials of the humanities (texts) in ways that can yield more than anecdotal accounts of experience or interpretations of a small scope, such as a close reading of a text or a small handful of texts. The forms this research takes (how it is designed, what data it searches for patterns, and how its findings are presented through methods of data analysis and visualization) are what make DH so varied: the textbase (which allows for the creation of textual data), the emergent questions and curiosities, and how to illustrate what is of interest vary from inquiry to inquiry. The forms and the content develop in relation to one another to best see something interesting.
In Graphesis, a work that articulates the creation of visualizations of data in humanities research, Johanna Drucker explains that how we know what we know about any given concept is based on our models of knowing: our models, our visuals, “mediate our experience by providing conceptual schema or processing experience into form” (15). I think this is a provocative and durable statement to hold on to in thinking about DH methods because it captures both the essence of intrigue in the work (the desire to look at something differently, to look for things we have not yet seen) and the way we represent data as visual constructions of patterns that exist within the materials we care for and research from.

A line of tension relative to these visuals, our forms of what we know and how we know, exists in DH as issues of disciplinary alignment, often vis-à-vis tensions in thinking of research method or methodology. This is perhaps too tidy an explanation for something I have neither the history nor the breadth of examples to represent more robustly, but it needs articulating nonetheless. DH is at times oversimplified as “tools”: a fixation on technologies for collecting, reading, and visualizing that are treated as neutral or ahumanistic and more akin to the work of the (computer) sciences. While these tools are essential to DH work, there are other accounts of research that care more about who designs the tools, who has access to them, and what they afford and constrain, not just in findings, but in the context of people, institutions, and social, economic, cultural, political, and philosophical factors. Johanna Drucker, in “Humanistic Theory and Digital Scholarship”, asks humanities scholars what impact the humanities have had on the digital environment, and raises the possibility of digital platforms and interfaces created from humanistic methods instead of methods borrowed from outside the discipline, which she describes as at odds with the cares and concerns of humanities work. She explains that humanities work has encountered digital tools, but what of humanities tools in digital contexts? I see this, again maybe too simply, as a deep concern with methodology: how and what researchers are doing, for what reasons, and for whom or what. A humanistic approach, she explains,

“means that the premises are rooted in the recognition of the interpretative nature of knowledge, that the display itself is conceived to embody qualitative expressions, and that the information is understood as graphically constituted”.

While I will continue to refrain from defining DH too tightly or narrowly, or from constraining it to one epistemology over another, I can say from my readings that DH is not just tools; or rather, DH is not about the uncritical creation and use of tools without care for matters of concern in the tools’ interaction with contextual and textual networks of relations (person, place, time, and so on).

Text as Data: Mining, Encoding, and Method

For many, data has numerical connotations. In the sciences, data begins as quantified observations, which visualization methods are selected to highlight; so what does data mean in a discipline where texts are what serve as observations? Some scholars wish to distance data in the humanities from data in the sciences because it functions differently. Johanna Drucker contrasts data with capta, explaining that capta is “taken” actively while data is assumed to be a “given” that can be recorded and observed. The difference Drucker sees is that humanistic inquiry acknowledges that its knowledge is “situated, partial, and constitutive”; this is the recognition of knowledge as a construction, “not simply given as a natural representation of pre-existing fact” (“Humanities Approaches to Graphical Display”). While Drucker calls for a rethinking of data as capta that better expresses ambiguity over certainty, which gets at what she describes as interpretative complexity, DH data also needs to acknowledge the lens it constructs to look at its texts. I will provide a gloss of how data is garnered from humanities texts along with some methods for collecting it: data mining, textual analysis, and differentiated reading. To begin, data in the humanities does not have the same aims as data in the sciences. In the sciences, data is used to attempt to arrive at a truth, but in the humanities data is used to arrive at the questions we desire to ask (Stephen Ramsay, Reading Machines 68). The digital humanities exist to collect and create data, to notice something of interest, to formulate questions, to create visualizations that allow us to better see patterns of interest, and to postulate all over again: data transforms theory, and theory transforms data into interpretation for further theorizing (Cathy Davidson, “Humanities 2.0: Promise, Perils, Predictions”).
Data is the locus of digital humanities work because it functions as a representation of information that can be interpreted and reinterpreted, in a manner (or form) that functions as something to communicate, something to interpret, or something to process; it is what research works to articulate and make visible (Alex Poole, “Now is the Future Now?”). Texts are data, or have the potential to become sets of data, because they are addressable as a thing of many parts; one can query a position within the text at a certain level of abstraction, such as a character, word, phrase, or line (Michael Witmore, “Text: A Massively Addressable Object”). Data are created from text based on what is of interest to look for, and may come from a single text or from a large corpus of texts, the size of the base depending on what is of interest to the researcher. Texts are used to create textbases on which various methods and tools might be used to draw out patterns and examine trends, which serve as a catalyst for further questioning and research (Lang and Baehr, “Data Mining”). A textbase is a “coherent collection of semi- or unstructured digital documents” that “can come from any written discourse”; textbases “all cohere in some manner”, existing as corpora of documents assembled around a specific unifying principle, whether thematic or generic (Cooney et al., “The Notion of the Textbase”). A data set might come from a collection of romantic poetry in which the researcher wishes to see what adjectives are used to describe love; those adjectives from that set of texts would be the data. This data is created through a process called data or text mining.
Text (or data) mining is, to put it simply, knowledge discovery; creating data to apply DH tools to is not just a step in a process toward results, but a practice of curiosity and inquiry into what might result. This process can be exploratory, descriptive, and even predictive in its function (Lang and Baehr, “Data Mining”, 176-179).
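Witmore’s idea of a text as addressable at different levels of abstraction can be made concrete with a small sketch. This is only an illustration under simple assumptions: the function name `address` is hypothetical, the sample is the first two lines of Shakespeare’s Sonnet 18, and “word” here just means whitespace-separated strings, which a real tool would handle more carefully.

```python
# Illustrative sample: the first two lines of Shakespeare's Sonnet 18.
SAMPLE = (
    "Shall I compare thee to a summer's day?\n"
    "Thou art more lovely and more temperate:"
)


def address(text, level, index):
    """Return the unit found at `index` when the text is divided at `level`.

    `address` is a hypothetical helper, not part of any DH tool.
    Supported levels: "character", "word", "line".
    """
    if level == "character":
        units = list(text)
    elif level == "word":
        # Assumption: a word is any run of non-whitespace characters.
        units = text.split()
    elif level == "line":
        units = text.splitlines()
    else:
        raise ValueError(f"unknown level: {level}")
    return units[index]


first_character = address(SAMPLE, "character", 0)
third_word = address(SAMPLE, "word", 2)
second_line = address(SAMPLE, "line", 1)
```

The same text answers different queries depending on the level chosen, which is the sense in which one text can become many different data sets.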

Data mining works in this way. First, the features of the text that are of interest are determined; this is often done as texts are collected, because something of interest surfaces that helps determine how the texts will be read. The texts are then rendered into plain text format so that they can be read as text only (this means they are stripped of the style and layout of their publications). A number of tools can be used to perform mining on texts, but what must be determined is which parts will be looked for as differentiated from other parts of the text; this means that a script must be created that tells the tool (a computer program) how and what to read. Directives for reading the texts (for their particles, their pronouns, their adjectives, etc.) are created along with lists of words to not get hung up on, more commonly referred to as stop words. An algorithm is run on the texts to bring to the surface words that fit the criteria set and to exclude words that do not. What results will vary widely based on the topic of inquiry, the tool being used, and how the inquiry was framed. Distant reading and text analysis are not, to my understanding, conceptually different from data or text mining. All of these terms describe a shift in the scale at which we read texts: a move from closely reading one text from beginning to end as a whole, to ways of reading parts of a text or a collection of texts that can shift the scale of interaction based on what is driving inquiry. If I had to define each of these concepts to better frame this idea of texts as data, I would explain distant reading as a concept that drives DH work: a way of looking at texts differently, with distance or changes in scale, that seeks patterns that become visible with this differentiated reading. Data or text mining is more procedural in that it is what is done to texts so that they can become data sets for inquiry.
Text analysis is like text mining in that it looks at parts of a text; however, instead of singular terms, text analysis is interested in concordance, or the association or proximity of terms (Geoffrey Rockwell, “What is Text Analysis, Really?”).
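The mining steps described above (plain text in, stop words excluded, matching words surfaced) can be sketched roughly as follows. This is a minimal illustration, not the procedure of any particular tool: the function name `mine_terms`, the tiny stop word list, and the sample sentence are all made up for the example, and the “criteria” are simply raw word frequency.

```python
import re
from collections import Counter

# Assumption: a deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "of", "and", "to", "is", "in", "that", "it"}


def mine_terms(plain_text, stop_words=STOP_WORDS, top_n=5):
    """Count the most frequent words in plain text, skipping stop words.

    `mine_terms` is a hypothetical helper written for this sketch.
    """
    # Lowercase the text and keep only runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", plain_text.lower())
    # Exclude the words we do not want to get hung up on.
    kept = [token for token in tokens if token not in stop_words]
    return Counter(kept).most_common(top_n)


# Invented sample sentence, used only to show the shape of the result.
sample = "The rhetoric of the situation is the situation that invites rhetoric."
top_terms = mine_terms(sample, top_n=2)
```

Changing the stop word list or the tokenizing pattern changes what surfaces, which is one small way of seeing that the results depend on how the inquiry was framed.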
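The interest in proximity of terms can be illustrated with a keyword-in-context listing, one common form of concordance. The sketch below is an assumption-laden illustration rather than Rockwell’s method: the function name `concordance`, the window size, and the sample sentence are invented for the example.

```python
import re


def concordance(text, keyword, window=2):
    """List each occurrence of `keyword` with `window` words on either side.

    `concordance` is a hypothetical helper written for this sketch.
    Returns (left context, matched word, right context) tuples.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Invented sample sentence, used only to show the shape of the result.
sample = "The study of rhetoric begins with audience, and rhetoric ends with action."
hits = concordance(sample, "rhetoric", window=2)
```

Reading down the left and right columns of such a listing is what lets a researcher see which terms keep company with the keyword.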

Visualizing Data: Making Visible the Form of Content

Visualizations are as diverse and varied as the imagination can conjure: pie charts, line graphs, scatter plots, bar graphs, word clouds, bubble diagrams, maps, and so on. Visualizations are what are created from the data to represent findings. While I can’t account for all of the types of visualizations (but will include links to some resource lists at the end of this review), I would like to discuss what thinking goes into creating a visual of textual data. Visuals can work by offering a visual analogy, providing a visual image of non-visible phenomena, and providing visual conventions to structure operations (Johanna Drucker, Graphesis, 5). Because visualizations are articulations of patterns found in research, their design, or form, matters a great deal. A visualization brings attention to patterns, and thus needs to accommodate a mix of evidence and argumentation. Visualizations are intended to show patterns of interest and should provide a way to find possible patterns to investigate and a means to locate those patterns against a baseline provided by a set of other relevant works (Radzikowska et al., “Information Visualization for Humanities Scholars”).

In deciding what type of visualization should be used, experimentation with forms is encouraged to develop a perspective in which to situate the interpretations (Radzikowska et al.). Tanya Clement, quoting computer scientist Ben Shneiderman, describes visualizations as providing “a window into research results but have an inherent limitation of space that results in the ‘occlusion of data, disorientation, and misinterpretation’” (“Text Analysis, Data Mining, and Visualizations in Literary Scholarship”). Visualizations are the digital humanities’ lens for seeing its texts differently and at different scales.

Scales of Seeing

Scale is a matter of concern in creating visualizations of data because it determines what is seen. Franco Moretti, author of the oft-cited “Graphs, Maps, Trees: Abstract Models for Literary History” (which, based on frequency of citation, might be a seminal work in digital humanities), coined the term distant reading, which names a difference in scale in how texts are “read”: encountered from a distance, as part of a large corpus, with the assistance of computational tools. He describes the significance of scale:

“What do literary maps do … First, they are a good way to prepare a text for analysis. You choose a unit–walks, lawsuits, luxury goods, whatever–find its occurrences, place them in space … or in other words: you reduce the text to a few elements, and abstract them from the narrative flow, and construct a new, artificial object like the maps that I have been discussing. And with a little luck, these maps will be more than the sum of their parts: they will possess ‘emerging’ qualities, which were not visible at the lower level” (53).

Differential reading, or reading at scales, defamiliarizes texts, making them unrecognizable in a way (putting them at a distance or, conversely, in close proximity) that helps identify features otherwise unseen, make hypotheses, generate questions, and figure out patterns and how to read them (Clement, “Text Analysis, Data Mining, and Visualizations in Literary Scholarship”). Scales of interaction share a common objective, detail in our data: “The objective is much the same: to restore to our field of view precisely that which is right beneath our nose but too ubiquitous to be synthesized in the human mind” (Flanders and Jockers, “A Matter of Scale”). Close and distant reading are not the binaries we might assume from thinking of proximity to an object. Scale operates more along a continuum that shifts as attention to particular parts of texts, or across texts, shifts, which allows us to “pay attention to databases, data flow, data architectures and the human element behind them” (Clement). Scale can be set through the construction of the data set, or textbase; the number of materials in the corpus; or the tools being used to read the data. This notion of scale as both close and distant, micro and macro, surfaced in how scholars discussed tools as supportive of that kind of shuttling between different levels of scale: seeing patterns, seeing outliers, zooming in and zooming out (Flanders and Jockers, “A Matter of Scale”). “The computer revolutionizes, not because it proposes an alternative to the basic hermeneutical procedure, but because it reimagines that procedure at new scales, with new speeds, and among new sets of conditions” (Stephen Ramsay, Reading Machines, 31).

Critical Considerations

“We need to acknowledge how much the massive computational abilities that have transformed the sciences have also changed our field in ways writ large and small and hold possibilities for far greater transformation in—research, writing, and teaching—that matter most” (Cathy Davidson, “Humanities 2.0: Promise, Perils, Predictions”).

Criticisms and cautions in DH work are complex negotiations of context, resources, and cultural value. What emerged from many texts was the notion that DH work is done to advance the goals of humanist scholarship; what differs, though, are the affordances and constraints of working in the digital. Jamie Skye Bianco asks “does DH need an ethical turn?”, to which she responds yes, because it operates through webs of people, institutions, and politics in uneven networks of relation. People and institutions are a part of DH work: they have/n’t access to texts to research, are/n’t represented in texts, have/n’t access to tools for research, and have/n’t access or representation in what is created. Texts are contextual; they are heterogeneous and dynamic. Reading them for their semantic parts and rendering them as visualizations of selected parts, often without situating those parts in the whole, runs the risk of de-emphasizing the human element of the humanities. This risk may come from separating the methods of doing DH work (the tools) from the theories that give impetus to the work. This separation of theory and method risks flattening context by not revealing difference; “the constellation of context, affect, and embodiment must remain viably dynamic and collaborative in digital and computational work” (Bianco, “This Digital Humanities Which Is Not One”). Because digital and computational work “documents, establishes, and affectively produces an iteration of real worlds” that are “multimodally layered” (Bianco), not losing context (and its embedded elements) becomes a matter of concern. The challenge is to shift humanistic study from attention to the effects of technology to a humanistically informed theory of the making of technology, with considerations of affect and the constructivist force of knowledge as observer dependent and emergent (Drucker, “Humanistic Theory and Digital Scholarship”).
Digital work needs to consider the realms of the digital, and the contexts that are digitized and situated around digital materials need to be envisioned as “shared knowledge, culture, and semantic content” (Bianco).

Sustainability, Durability, and Curationability of DH

In reading about the tremendous labor that goes into this work, from digitizing and collecting texts in searchable databases with flexible metadata, to inventing and maintaining the tools, to creating and housing projects, I could not help but question who cares for DH and according to what protocols. For work that is necessarily digital, I wondered about the durability, even the lifespan, of such projects. While some work is being done to curate DH research, the uptake is, at this time, thin. Accessibility and availability of data seem to be the highest-priority matters of concern. Work is being done, by a variety of institutions and organizations, devoted to the preservation and representation of DH research to promote research on texts as cultural artifacts (Cooney et al., “The Notion of the Textbase: Design and Use of Textbases in the Humanities”). The goal is to move beyond issues in just aggregating data toward managing DH content as knowledge, which requires larger dialogues about access, proprietary rights, the boundaries of technologies, and conflicts between personal and communal interest (Graban et al., “In, Through, and About the Archive: What Digitization (Dis)Allows”). Key issues that affect curation include the size of the data set (digital files of large corpora are tremendous in size), the number of objects to be curated and their complexity, the interventions needed to care for the data, ethical and legal concerns, policies, practices, standards, and economic incentives (Poole, “Now is the Future Now? The Urgency of Digital Curation in the Digital Humanities”). Aside from the standards that would need to be set for each of these issues, much infrastructure would have to be created to take on such content; it would need to be flexible, scalable, and economically and technologically sustainable. Standardized interfaces for both human and machine curators would have to be created for managing this content.
Additionally, in order to create such a system and an interface to it, metadata standards would have to be created and agreed upon so that content is identifiable and retrievable, and therefore useful (Poole). While digital content and some tools and services exist, they are currently not necessarily useful or usable (Poole).

Treading Water in DH Flows

This literature review, as is to be expected from a beginner orienting myself in new texts and ideas, barely scratches the surface. Because I’m talking about DH more generally, instead of focusing on any particular tool or method, this work is more like a survey that establishes traces of work to re-immerse myself in, orienting as interest or use dictates. Until then, I pause to think through this handful of sources to establish connections, figures, concepts, and ways of doing, and to navigate DH not as a singular discipline, but as an assemblage of many.

Resources

This is in no way comprehensive (but on the web it can be amended and tended to, maybe in a separate location on my blog). Here is a (small) handful of lists and links to explore DH tools, concepts, and projects.

Glossary of terms from MLA Commons by Daniel Powell, Constance Crompton and Ray Siemens

DH tools for beginners

Getting Started in the Digital Humanities from Lisa Spiro – executive director of Digital Scholarship Services at Rice University’s Fondren Library

CUNY Digital Humanities Resource Guide

HASTAC (Humanities, Arts, Sciences, and Technology Alliance and Collaboratory) Digital Humanities Resource Guide

Digital Humanities at Princeton Resource Guide

Digital Humanities Research Tools and Resources Guide by University of Illinois Urbana-Champaign

Digital Humanities tool list built by Alan Liu

Tutorials for DH Tools and Methods list built by Alan Liu

Journal of Digital Humanities: “comprehensive, peer-reviewed, open access journal that features the best scholarship, tools, and conversations produced by the digital humanities community”

The International Directory of Digital Humanities Centers

Bibliography

In conversations in our Rhetoric, Composition and Digital Humanities seminar, Collin Gifford Brooke (CGB) has described an interest in creating bibliographies that assign and visualize weight in the use of texts in a written work. I rather liked this idea and attempted to make visible how much I used each text in my bibliography. Texts used the least are in 12 point font (one citation), while texts used the most are in 24 point font (five citations), with a range in between to capture the distributed attention each text received.
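One way to read the weighting scheme just described is as a simple mapping from citation count to point size. The sketch below assumes the sizes in between are spaced evenly (linear interpolation), which the description does not actually specify; the function name `font_size` and the placeholder source names and counts are hypothetical.

```python
def font_size(citations, min_cites=1, max_cites=5, min_pt=12.0, max_pt=24.0):
    """Map a citation count to a point size between min_pt and max_pt.

    Assumption: sizes between the two stated endpoints (1 citation = 12 pt,
    5 citations = 24 pt) are evenly spaced. Counts outside the range are
    clamped to the nearest endpoint.
    """
    clamped = max(min_cites, min(citations, max_cites))
    fraction = (clamped - min_cites) / (max_cites - min_cites)
    return min_pt + fraction * (max_pt - min_pt)


# Placeholder sources and counts, not the actual tallies for this bibliography.
citation_counts = {"Source A": 1, "Source B": 3, "Source C": 5}
sizes = {source: font_size(count) for source, count in citation_counts.items()}
```

Under this assumption each additional citation adds three points, so a text cited three times would sit at 18 point, halfway between the least and most used.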

Bianco, Jamie “Skye”. “This Digital Humanities Which Is Not One.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: University of Minnesota Press, 2012. Print.

Clement, Tanya. “Text Analysis, Data Mining, and Visualizations in Literary Scholarship.” Literary Studies in the Digital Age. MLA Commons. Web.

Cooney, Charles, Mark Olsen, and Glenn Roe. “The Notion of the Textbase: Design and Use of Textbases in the Humanities.” Literary Studies in the Digital Age. MLA Commons. Web.

Davidson, Cathy N. “Humanities 2.0: Promise, Perils, Predictions.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: University of Minnesota Press, 2012. Print.

Drucker, Johanna. “Graphesis: Visual Knowledge Production and Representation.” Poetess Archive Journal 2.1 (2010): 1-50. Web.

Drucker, Johanna. “Humanities Approaches to Graphical Display.” Digital Humanities Quarterly 5.1 (2011). Web.

Drucker, Johanna. “Humanistic Theory and Digital Scholarship.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: University of Minnesota Press, 2012. Print.

Graban, Tarez Samra, Alexis Ramsey-Tobienne, and Whitney Myers. “In, Through, and About the Archive: What Digitization (Dis)Allows.” Rhetoric and the Digital Humanities (forthcoming).

Jockers, Matthew L., and Julia Flanders. “A Matter of Scale.” UNL Digital Commons. Web.

Lang, Susan, and Craig Baehr. “Data Mining: A Hybrid Methodology for Complex and Dynamic Research.” College Composition and Communication 64.1 (2012): 172-194.

Moretti, Franco. “Graphs, Maps, Trees: Abstract Models for Literary History.” New Left Review 28 (2003): 67-93.

Poole, Alex H. “Now is the Future Now? The Urgency of Digital Curation in the Digital Humanities.” Digital Humanities Quarterly 7.2 (2013). 30 Jan. 2014. Web.

Radzikowska, Milena, Stan Ruecker, and Stéfan Sinclair. “Information Visualization for Humanities Scholars.” Literary Studies in the Digital Age. MLA Commons. Web.

Ramsay, Stephen. Reading Machines: Toward an Algorithmic Criticism. Urbana: University of Illinois Press, 2011. Print.

Rockwell, Geoffrey. “What is Text Analysis, Really?” LLC 18.2 (2003): 209-219.

Witmore, Michael. “Text: A Massively Addressable Object.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: University of Minnesota Press, 2012. Print.

Annotated Bibliography: Digital Humanities Methods

February 11, 2014 / June 26, 2015 / Jana Rosinski

I feel like this annotated bibliography should come with a disclaimer: this isn’t any sort of definitive digital humanities methods collection. That’s not an expression of self-deprecation, but a sincere reflection on how difficult it is to frame a research method in DH when there isn’t a single one. If I could put digital humanities simply (and this is of course too flattened a depiction of the work), the research methods are the ways of doing DH work, and there are any number of ways to do DH work based on variances in tools, textbases (text collections), and purpose. Going into this project, I was aware that I was going to encounter some difficulty in selecting texts about research methods that are core to the discipline. While digital humanities isn’t new per se, and there is an abundance of texts at varying scales, scopes, and domains, I was looking for resources that attended to the tools and ways of doing digital humanities work and that also cared for the methodological (or epistemological) impetus for the work, while not neglecting to visualize the work being done, so as not to leave it in theoretical abstraction.

I am fortunate enough to be taking Rhetoric, Composition, and Digital Humanities with Collin Gifford Brooke this semester, and his syllabus served as a conduit for finding sources. While some of the resources are texts (collections) we discuss or encounter in class, many of them were located by way of these texts. I found that Collin’s class functioned, for me, as the anchor in Cheryl Geisler’s “Anchoring in the Literature” approach to finding good sources. He identifies as a scholar of digital humanities (as he puts it, “I’m a digital humanities person, back from before there was a digital humanities”), and his knowledge was my entry into locating texts in the discipline. Looking at my collected sources, it can be seen that Johanna Drucker appears as the author of three different sources, and that two collections, Debates in the Digital Humanities and the MLA Commons volume “Literary Studies in the Digital Age”, house a large percentage of the other collected resources, serving as foundational texts. Here, though, I would like to unpack what makes these foundational texts foundational, even though they don’t quite follow Cheryl Geisler’s cited reference search process (image, page 3 of “Anchoring in the Literature”).

Many of these texts are digital and were designed to be digital; that is, many of these are not digitized versions of print texts, but texts designed to be interacted with (though to varying extents and with different tools) through layered textual features. These foundational texts, while foundational because they are collections of selected works, seem to retain a certain fluidity that keeps them from becoming “The Collected Works of DH, edition 2013”: they’re responsive and responding to developments in the discipline.

I approached collecting these sources thinking about them as tags or keywords within a digital humanities cloud: representations of tools, interfaces, ethics, durability, text materials, etc. With further reading (or maybe closer and more distant reading), I would like to visualize these texts more as an interface to doing digital humanities work, treating them as a textbase to apply DH methods to. What I would like this annotated bibliography to become is less an inventory of “big names” or key articles, and more a representation of what DH work can do.

Bianco, Jamie “Skye”. “This Digital Humanities Which Is Not One.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: University of Minnesota Press, 2012. Print.

“The digital humanities is one subset of computational and digitally mediated practices, though its current discursive regime articulates itself as an iteration of the one world, a world both felt and real. But work in computation and digital media is, in fact, a radically heterogeneous and a multimodally layered—read, not visible—set of practices, constraints, and codifications that operate below the level of user interaction. In this layered invisibility lies our critical work.”

Bianco is calling our attention to the “layered invisibility” of DH work that goes beyond consideration of context. It is the goal of DH research to reach broader publics outside the institutional academy, which involves complicated considerations of ethics at every level of the research: the textbase created, how the textbase will be “read” and for what interpretations, how the interpretations will be represented, collaboration, and access.

Clement, Tanya. “Text Analysis, Data Mining, and Visualizations in Literary Scholarship.” N.p.: n.p., n.d. N. pag. Literary Studies in the Digital Age. MLA Commons. Web. 5 Feb. 2014.

Clement opens with an oft-evoked resistance to digital methods in humanities research, based on the perception that “these tools seem too objective or deterministic—digital tools seem to take the ‘human’ (e.g., the significance of gender, race, class, religion, sexuality, and history) out of literary study”. Clement sets out to challenge this resistance by “presenting several computer-assisted modes of scholarship that depend on differential (close and distant, subjective and objective) reading practices, technologies of self-reflection and collaboration, and the value of plausibility, all of which have always been crucial to literary inquiry”.

Clement offers a rich sampling of projects using a variety of digital tools, and explores the complex negotiation between the human and nonhuman actors in DH research as co-actors, or extensions of one another. Clement reveals work that connects method to methodology.

Cooney, Charles, Mark Olsen, and Glenn Roe. “The Notion of the Textbase: Design and Use of Textbases in the Humanities.” N.p.: n.p., n.d. N. pag. Literary Studies in the Digital Age. MLA Commons. Web. 5 Feb. 2014.

Cooney et al. define a textual database as “a term that denotes a coherent collection of semi- or unstructured digital documents – any realm that produces written discourse”. They explain that

“Textbases all cohere in some manner. Unlike their cousins, large repositories of digitized texts like Project Gutenberg or Google Books, textbases exist as corpora of documents assembled around some specific unifying principle” and are built to enable text-centered scholarly research. Cooney et al. work to describe a selection of humanities databases with attention to their design principles that inform how they can be used, as well as scholarly approaches to work that can be done from textbases. This exploration between design of a textbase and what it dis/allows provides a more nuanced look at data mining, patterns, and visualization.

Davidson, Cathy N. “Humanities 2.0: Promise, Perils, Predictions.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: U of Minnesota P, 2012. Print.

Davidson explains that “Humanities 2.0 is distinguished from monumental, first-generation, data-based projects not just by its interactivity but also by an openness about participation grounded in a different set of theoretical premises, which decenter knowledge and authority.”

There are a number of texts that seek to define digital humanities and differentiate, to varying degrees, how DH is distinct (or not) from humanities. Davidson’s text insists on the need for our attention: technological changes (massive computational abilities) have transformed the humanities, and the discipline should be critically considering the implications for work and the future of work:
“Perhaps we need to see technology and the humanities not as a binary but as two sides of a necessarily interdependent, conjoined, and mutually constitutive set of intellectual, educational, social, political, and economic practices”.

Drucker, Johanna. “Graphesis: Visual Knowledge Production and Representation”. Poetess Archive Journal 2.1 (2010): 1-50. Web.

Drucker carefully explains the conceptual use of a methodology of data visualization, noting that “How we know what we know is predicated on the models of knowing that mediate our experience by providing conceptual schema or processing experience into form” (15). Visualizing our materials (the texts of our discipline) as a data set “is concerned with the creation of methods of interpretation that are generative and iterative” which have the potential “of producing new knowledge through the aesthetic provocation of graphical expressions” (41). This text articulates descriptive critical language for the analysis of graphical knowledge and makes the case for studying visualization from a humanities perspective. Given that much DH work involves visualizations of data from some textbase, this text seems a useful exploration of the conceptual use of visualizations, both in terms of creating and reading them.

Drucker, Johanna. “Humanities Approaches to Graphical Display.” Digital Humanities Quarterly 5.1 (2011): n. pag. Web.

As in Graphesis, which works to situate quantitative visualizations not typical of humanities research, Drucker continues to explain carefully the conceptual borrowing of graphical displays of information from the natural and social sciences, along with the limitations that borrowing carries into humanities work. Because knowledge in the humanities is interpretive and co-dependent with the observer, Drucker makes the case for the humanities to term data as constructed, and for graphical expression that shows ambiguity and complexity.

This has potential for DH work in that Drucker is working to make the quantitative methods that humanities research borrows better fitted to that research; that is, moving from the objective to the interpretive. Although DH uses nonhuman agents in its research as ways of doing work, the human element, the semantic, is essential.

Drucker, Johanna. “Humanistic Theory and Digital Scholarship.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: U of Minnesota P, 2012. Print.

Drucker frames her project with the question – “Have the humanities had any impact on the digital environment? Can we create graphical interfaces and digital platforms from humanistic methods?”

She sets out to articulate digital methods and theory that fit the humanistic value of humanities research, instead of explaining how digital methods might be applied to humanities research.

“We can cast an interpretative gaze on these instruments from a humanistic perspective, and we can build humanities content on their base; but we have rarely imagined creating computational protocols grounded in humanistic theory and methods. Is this even possible? Desirable? I suggest that it is essential if we are to assert the cultural authority of the humanities in a world whose fundamental medium is digital that we demonstrate that the methods and theory of the humanities have a critical purchase on the design of platforms that embody humanistic value.”

Drucker’s project is useful to DH research in that the methods or tools used should appropriately fit the purpose of the project to assert validity, authority, and value in the research.

Jockers, Matthew L. and Julia Flanders. “A Matter of Scale”. UNL Digital Commons. Accessed on 5 February 2014. Web.

These slides and accompanying script represent the keynote lecture of the Boston Area Days of Digital Humanities Conference at Northeastern University on March 18, 2013. The keynote was a staged debate between Julia Flanders and Matthew Jockers addressing the “matter of scale” in DH research. While scope is addressed in designing any humanities research project, DH projects also need to consider scale: how closely or distantly the textbase is being “read.” Scale, whether a micro or macro approach, influences which patterns are uncovered and can be represented. Scale dis/allows patterns to be seen.

Lang, Susan and Craig Baehr. “Data Mining: A Hybrid Methodology for Complex and Dynamic Research.” College Composition and Communication 64.1 (2012): 172-194.

Lang and Baehr present data mining as a move away from lore, or anecdotal evidence drawn from relatively small samples, as the justification for our assertions in scholarship (174), and toward uncovering, or making visible, interesting information in large amounts of data, namely the texts produced (176). They are careful to note that data mining cannot provide simple answers from noticing, but is a methodology that operates as exploratory, descriptive, and predictive of patterns (177).

Lang and Baehr work to define data mining as a research methodology: a move toward quantitative research that cares for the qualitative, or narrative, aspects of humanities research. This source is useful for illustrating how DH methods that seem unhumanistic because they are graphically based are in fact absolutely co-dependent on humanistic interpretation.

Moretti, Franco. “Graphs: Maps, Graphs, Trees: Abstract Models for Literary History”. New Left Review: 28 (2003): 67-93.

Moretti is considered one of the key, or originating, scholars in DH due to his work on distant reading, or reading a large textual corpus from a distance with computational aid. Instead of closely reading a text for meaning, Moretti explores what meaning might arise from a collection of texts if they can be represented at the level of patterns across a corpus. Moretti calls us to question not just how we interact with our materials in scholarship, but at what scale. Distant reading alters the scale at which we encounter and interact with our materials by moving away from, that is gaining distance from, the manner in which we read, understand, question, and act. Moretti describes the impetus for his work this way: “a field this large cannot be understood by stitching together separate bits of knowledge about individual cases, because it isn’t a sum of individual cases: it’s a collective system, that should be grasped as such, as a whole” (68).

Moretti’s distant reading models of literary historiography serve as a representation of DH research projects: creating a textbase, establishing a cultural significance or purpose in looking across the textbase, creating a means to mine the textbase to identify patterns, and creating a visualization that allows the patterns to be seen as emergent from the corpora.
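To make that sequence concrete for myself, here is a minimal Python sketch of the four moves: gather a textbase, pick a purpose (here, one search term), mine for a pattern, and visualize it. Everything in it is an assumption for illustration. The three "documents" are invented placeholder sentences, not passages from any source in this bibliography, and the names `TEXTBASE`, `term_counts`, and `bar_chart` are hypothetical. It is not Moretti's method, only a toy version of the workflow.

```python
import re
from collections import Counter

# Hypothetical stand-in for a textbase: a few invented sentences gathered
# around one unifying interest. A real project would load full texts.
TEXTBASE = {
    "doc_a": "Rhetoric responds to a situation. The situation invites rhetoric.",
    "doc_b": "Meaning is not found in a situation. Rhetors create meaning.",
    "doc_c": "Rhetoric is the counterpart of dialectic.",
}

def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_counts(textbase, term):
    """Mine the textbase: count how often `term` occurs in each document."""
    return {name: Counter(tokenize(text))[term] for name, text in textbase.items()}

def bar_chart(counts):
    """Visualize the counts as a plain-text bar chart, one row per document."""
    width = max(len(name) for name in counts)
    return "\n".join(
        f"{name.ljust(width)} | {'#' * n} {n}" for name, n in counts.items()
    )

counts = term_counts(TEXTBASE, "rhetoric")
print(bar_chart(counts))
```

Even at this size the chart shows something a single close reading would not foreground: which documents use the term at all, and how unevenly.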

Poole, Alex H. “Now is the Future Now? The Urgency of Digital Curation in the Digital Humanities.” Digital Humanities Quarterly 7.2 (2013): n. pag. Digital Humanities Quarterly. The Alliance of Digital Humanities Organizations, 30 Jan. 2014. Web. 5 Feb. 2014.

Poole concludes his article with the call of “now” that served as impetus for his work:
“In 2009, Christine Borgman asserted that “Digital content, tools, and services all exist, but they are not necessarily useful or usable” [Borgman 2009]. Despite obvious progress in digital curation in the humanities, she issued a “call to action” to stakeholders and insisted the “future is now.” Three years later, we may — we must — ask the same question, lest we are reduced ultimately to exclaiming, along with Michael Buckland, “What a waste!”

Poole sets out to:

define and situate the digital humanities and both data and Big Data
probe digital curation
discuss the professionals who curate data, the key issues in data curation and how best to approach them, the importance of a lifecycle approach, the machinations of sharing and reusing data, and the role of data management planning.
explore reports on and case studies of digital curation undertaken
consider the trajectory of digital curation efforts
assess the state of digital curation in the humanities in 2013

While Poole is not alone in raising concerns over durability of DH work, this resource looks at digital curation for research that is being undertaken.

Powell, Daniel, Constance Crompton, and Ray Siemens. “Glossary.” N.p.: n.p., n.d. N. pag. Literary Studies in the Digital Age. MLA Commons. Web. 5 Feb. 2014.

While a glossary might not seem useful in terms of an account of research method, there is usefulness in having a collection of tools, textbases, DH projects, and potential methods for research. This glossary does represent keywords in the discipline, but not as definitions, more like a resource collection.

Radzikowska, Milena, Stan Ruecker, and Stéfan Sinclair. “Information Visualization for Humanities Scholars.” N.p.: n.p., n.d. N. pag. Literary Studies in the Digital Age. MLA Commons. Web. 5 Feb. 2014.

Radzikowska et al. frame their project this way: “A primary index to the quality of visualizations for humanities scholars is the quality and originality of scholarship that the systems support. In each of the projects mentioned here, we have been working with humanities researchers in an effort to produce a useful visual form of the data. Since humanities scholarship is often exploratory, we have also come to believe that interactive formats are in most cases preferable to static ones, since they allow the person using the system to add and subtract elements, experiment with different forms, pursue hunches or insights, and so on.”

The usefulness of this work, of which they have created two, lies in changing the interface to the materials available for DH work. Interactive visualizations work to explore available information by visual grouping instead of hierarchical classification schemes.

Rogers, Richard. Digital Methods. Cambridge: MIT, 2013. Print.

“This is not a methods book, at least in the sense of a set of techniques and heuristics to be lugged like a heavy toolbox across vast areas of inquiry. It is also not the more contemporary exemplar of the instruction manual or list of answers to frequently asked questions…Rather, this book presents a methodological outlook for research with the web” (1).

Rogers sets out to name “methods of the medium,” which he explains as methods embedded in online devices, and to think along with those methods and the digital objects they handle. He argues that, by thinking along with devices and the digital objects they handle, digital methods as a research practice can follow the evolving methods of the medium (1).

This text, while not definitively DH, nor a text on methods, holds potential for projects that repurpose methods of the medium for research that is not exclusively about digital or online culture, using methods and tools for what they make visible in topics of cultural interest.

Witmore, Michael. “Text: A Massively Addressable Object.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis: U of Minnesota P, 2012. Print.

Witmore’s text explores what it means to be a text as evidenced in these excerpts:

“What does it mean to be an “item” or “computational object” within this collection? What is such a collection? In this post, I want to think further about the nature of the text objects and populations of texts we are working with”

and

“I would argue that a text is a text because it is massively addressable at different levels of scale. Addressable here means that one can query a position within the text at a certain level of abstraction.”

This resource has potential as it opens up texts to better understand how and why they are sites for researching. Understanding texts as addressable at different levels of scale helps conceptualize how textbases and text corpora can articulate a purpose or interest for DH research. While the data mining approach (how one is sifting through texts) and the visualization (representation of what is uncovered) are important, many texts seem to focus on those aspects of research, and not on how the projects come to light in the first place, that is, the textual origins from which we can envision DH projects.
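One way I can picture “addressable at different levels of scale” is as a small Python sketch in which the same text is queried by position at the level of paragraph, sentence, or word. This is only my own illustration under simple assumptions, not Witmore's formulation: the sample text is invented, the function names `address` and `query` are hypothetical, and the three levels are a tiny subset of the abstractions a real project might choose (genre, speaker, line, character, and so on).

```python
import re

# Invented sample text; paragraphs are separated by a blank line.
SAMPLE = (
    "A text can be read closely. It can also be read from afar.\n\n"
    "Scale changes what we see. Patterns emerge at a distance."
)

def address(text, level):
    """Split a text into addressable units at one level of abstraction."""
    if level == "paragraph":
        return [p.strip() for p in text.split("\n\n") if p.strip()]
    if level == "sentence":
        # Naive split: break on whitespace that follows ., !, or ?
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if level == "word":
        return re.findall(r"[A-Za-z']+", text)
    raise ValueError(f"unknown level: {level}")

def query(text, level, position):
    """Return the unit found at a zero-based position for the given level."""
    return address(text, level)[position]

print(query(SAMPLE, "paragraph", 1))
print(query(SAMPLE, "sentence", 1))
print(query(SAMPLE, "word", 3))
```

The same position number returns very different things depending on the level chosen, which is the point: deciding how a text is addressed is already an interpretive decision about the project.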
