Last week in RCDH, Collin challenged us to visualize our semester. He left the form and content up to us. As I tend to do, I thought it would be interesting to look at it from the level of composites, allowing me to focus on different intensities through different lenses of focus.
From Tim Baker and Chris Shier’s gifmelter, I started with a gif that became an inside joke among Jason, Lindsey, and me after I used it as a response in a course last semester. Breaking the constraints of the gif (rectangle, perception based on timed loops), the image can be viewed differently. I think this particular iteration takes on new meaning at this scale (particularly poignant in its ability to be broken apart and pretty representative of this semester):
A significant event for me this semester was presenting (for the first time) at 4Cs. The panel I presented on was well received, but it also projected outward; these are ideas that are very much unsettled, active. Threads from this text network have been pulled, followed, and have connected me to other people/texts/ideas.
Another significant event this semester was participating in THATCamp CNY. While I didn’t showcase/workshop anything, I was party to interesting tools, conversations, fruit trays, and distributed dh connections across the campus. I wanted to do something with my tweets to reconstruct the events of those two days and to account for the retweets and favorites that connected my own tweets, as well as the tweets I was drawn to saving/responding to. This is rather messy, but it’s an attempt at a sort of network (this is only the first of two days and chronicles the movement of my tweets + tweets I was mentioned in):
solid orange line: quoting speaker at THATCamp
solid black line: RT from someone in attendance
solid green: response to someone in attendance
solid purple: mentioned by someone in attendance (but not present)
dotted blue: favorited by someone not in attendance
long dashed black: favorited and RT by someone in attendance
dotted black: favorited by someone in attendance
black names: people present
blue names: people distant
Out of 17 tweets from those two days, 12 were tweets I RT’d or favorited. It’s debatable how to measure the furthest-reaching; the picture of @ahhitt and me with Otto the Orange had the most favorites and left the central NY area, but the response/comment I made to @ahhitt that evoked @s2ceball might be the furthest reaching, if she is still in Oslo on her Fulbright…
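The legend above amounts to a typed edge list. A minimal sketch of that structure in Python, with placeholder handles and tweet ids standing in for the actual accounts and tweets:

```python
# A typed edge list for the tweet network: each edge links one of my tweets
# to another account, labeled with a relation from the legend above.
# Handles and ids here are illustrative placeholders, not the real data.
from collections import Counter

edges = [
    (1, "@attendee_a", "rt_by_attendee"),        # solid black line
    (1, "@distant_b",  "favorited_by_distant"),  # dotted blue line
    (2, "@attendee_c", "response_to_attendee"),  # solid green line
    (3, "@speaker_d",  "quoting_speaker"),       # solid orange line
]

# Tallying relation types gives the counts behind each line style.
relation_counts = Counter(relation for _, _, relation in edges)
print(relation_counts.most_common())
```

Drawing the network by hand from a list like this keeps the relation types consistent across both days of tweets.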
Not pictured: Instagram foods and activities and other odd things; readings; notebooks and doodles; calendar of work and due dates; schedule of sleeping/waking; the starts and stops of an exercise routine; how many/little miles I walked each day; the amount of muscle rub applied to my “computer neck” on a daily basis; how many times I responded “[sigh] goooooood” when asked how I was doing vs. some other response.
In RCDH this week we’re discussing metadata by way of a collection of interesting readings. Jessica Reyman’s “User Data on the Social Web: Authorship, Agency, and Appropriation” (and texts she has shared via Twitter @jessreyman) primarily occupied my interest. I feel like I know enough about metadata to nod along in conversations where it surfaces, to understand that what it can index about data created on the web is tremendous (see the cross-space ads that track patterns of search and purchase, or these Twitter metadata visualizations), that it is aggregated by large (unseen or hiding in plain sight) entities, and that the line between it and what I “create” on the web is grey. Reyman gets at this grey area by exploring the complexities of using social web technologies, recognizing all of the productive activities that occur—not only the creation of content, but the overlooked generation of data (515). Reyman poses the following:
How do we write on the social Web?
How do we write ourselves through the social Web?
And how do we write the social Web? (516)
It is Reyman’s argument that data is not merely the by-product of algorithms and aggregation formulas, but “a dynamic, discursive narrative about the paths we have taken as users, the technologies we have used, how we have composed in such spaces, and with whom we have participated” (516). Reyman provides examples of different social web platforms’ policy language on the (not so clearly delineated) distinction between the data sites collect (which is framed through a rhetoric of enhancing user experience) and the content users produce. Reyman proceeded to blow my mind with a distinction I didn’t know: data and content/information are not viewed as the same thing (519). The data generated through content production is viewed as an “authorless object” (524), or one authored by technologies (nonhumans) alone. User consent requires only an agreement to the terms of service when signing up, not an active role in understanding or mediating responsibilities of use (521). What Reyman argues is that data should be considered a text coauthored with technologies, texts, and other users, which would reconceptualize it as living instead of as a sort of waste by-product of the content (523). Citing Krista Kennedy’s “Textual Machinery: Authorial Agency and Bot-Written Texts in Wikipedia,” Reyman makes visible the interaction between humans and nonhumans in data production: “user data generation depends on users, on their interactions, participation, and production. It does not exist without them” (527).
Reyman ultimately draws our attention to the need to balance technology companies’ rights over user data (which they claim in exchange for the free services provided) with the rights of users, explaining that “With a balancing of users’ and technology companies’ rights over user data, the social and participatory Web could be nourished as a space that provides access to tools for participation and production, and also recognizes the value of human agency required for rich, meaningful social networks” (529).
In thinking about how we write ourselves and the social web through the social web, I’m wondering what means of balancing data are available beyond the privacy fences (DoNotTrackMe and Ghostery) that can be built around browsing. I find myself thinking about the work the Creative Commons is doing to help users understand creating and interacting with web content; does data fall within the Creative Commons’ domain? Is it an issue of outdated copyright law policies (seeing data as by-product rather than content) in need of growing (more dynamically) with the times? I’m also curious to learn about more projects/texts created from data that could make visible what is too often dismissed as content debris—developing a meta awareness of metadata in interacting with the web.
“Environments are not just containers, but are processes that change the content totally.” Marshall McLuhan
In Unframing Models of Public Discourse, Jenny Rice works to destabilize the frame of the rhetorical situation by proposing that situations operate in more ecologically complex networks of “lived practical consciousness or structures of feelings” (7) – not a move to dissolve all boundaries for looking at the context of a situation, but to destabilize them enough to account for the limitations of too-discrete borders. Rhetoric is better understood and practiced (enacted) as the complex art that we describe it to be. Ecological metaphors for conceptualizing the growing and evolving structure of the field of rhetoric and composition are not new. Marilyn Cooper, Jenny Edbauer Rice, [fill in others] have worked to move beyond the static rhetorical situation toward ways of accounting for rhetoric that leaks into the complexly articulated social. (Lloyd Bitzer’s 1968 account of the rhetorical situation is an iconic and valuable text in the field that examined rhetorical discourse as called into existence by a situation. Despite its age, it is still of use; however, I am interested in more complex models that better articulate discourse as unfolding through media – in space/place, time, and the power to invoke through the connectivity of web linking and metadata.) While I draw from these works as theory for my own project, I take interest in employing ecology as methodology for engaging texts. Ecology as methodology seeks to take the structure and nature of ecologies and enact them as ways of doing. This isn’t a singular methodology, or a direct translation from ecology as theory to ecology as method, but an attempt to render this work visible.
If ecology as theory is interested in dynamic models of discourse that seek to articulate the complexity of what comprises a text (accounting for text in an expanded notion of article, book, dialogue, image, graphic, and so on) with an emphasis on the material assemblage, then ecology as methodology is a means of accounting for the materials that compose these dynamic models of text. Methodology as ecology seeks to make the invisible intermediaries that compose a text visible, accountable, and articulable. Accounting for the articulation of materials may allow for the articulation of interaction with texts based on the collection of materials individually, and in highlighting different relationships between the materials. Through employing distant reading methods, data mining, visualization, and pattern recognition as methodologies of ecology, represented as tagclouds, graphs, timelines, and maps, I will create ecological representations of Marilyn Cooper’s “The Ecology of Writing” (1986) as a means of making visible the ecological nature of text. “The Ecology of Writing” was chosen as the textual sample because it is arguably the piece of scholarship that introduced ecologies into rhetoric and composition – other pieces referenced here can be traced back to Cooper’s text through bibliometric and citation information. Additionally, exploring the creation of ecologies of “Ecology” is meant to be playful. Cooper’s text is a response to the cognitive process model dominant in the field at the time, which framed writing as entirely produced in the head of the author. Cooper’s critique was that such a model idealized the notion of a solitary author, isolating the author from the social world; she remarked that “Such changes in writing pedagogy indicate that the perspective allowed by the dominant model has again become too confining” (366).
Language, to Cooper, was social; ushering in a social turn, Cooper proposed an ecological model of writing “whose fundamental tenet is that writing is an activity through which a person is continually engaged with a variety of socially constituted systems” (367). Cooper distributed the material constituents of writing across more complex social systems comprised of, among other characteristics, other writers and writings:
“Writing, thus, is seen to be both constituted by and constitutive of these ever-changing systems, systems through which people relate as complete, social beings, rather than imaging each other as remote images: an author, an audience” (373).
To further ecological work, I take up the task of making models of systems (what constitutes materials and how they connect) to better understand ecological relationships.
In 2005, the National Humanities Center granted the Richard W. Lyman Award, a recognition of scholars who have advanced the humanities through the use of information technology, to John Unsworth, Vice Provost, University Librarian, Chief Information Officer, and Professor of English at Brandeis University. In his lecture, Unsworth made a call to action for more humanities work that employs text-mining, data-mining, visualization, modeling, and pattern recognition across large corpora of texts, where “the goal of data-mining (including text-mining) is to produce new knowledge by exposing similarities or differences, clustering or dispersal, co-occurrence and trends.” Unsworth described the work of the NORA Project, MONK (Metadata Offer New Knowledge), and WordHoard. The NORA Project was “a multi-institutional, multi-disciplinary Mellon-funded project to apply text mining and visualization techniques to large humanities-oriented digital library collections. The goal of the Nora project was to produce software for discovering, visualizing, and exploring significant patterns across large collections of full-text humanities resources in existing digital libraries.” MONK “is a digital environment designed to help humanities scholars discover and analyze patterns in the texts they study”; all code from the project is made available as open source. MONK describes the ongoing value of this mining and visualizing work by stating that
“the scholarly use of digital texts must progress beyond treating them as book surrogates and move towards the exploration of the potential that emerges when you put many texts in a single environment that allows a variety of analytical routines to be executed across some or all of them.”
Like MONK, WordHoard is an open-source program that can be downloaded. WordHoard applies corpus linguistics to a collection of texts to assign tags “according to morphological, lexical, prosodic, and narratological criteria”; the project explains that “Deeply tagged corpora of course support more finely grained inquiries at a verbal or stylistic level. But more importantly, access to the words of a text at such microscopic levels also lets you look in new ways at the imaginative worlds created by those words.” While these projects are of a much larger scale (and with many more pairs of eyes), similar data mining and visualizing methods can be applied to create an ecological reading on a much smaller (and budget-minimalistic) scale.
Ecological Reading as Invention: Method
In The Future of Invention, John Muckelbauer states that it is
“not always the case that an inventive inquiry works best when it responds to a problem by seeking its solution, or responds to the question with an answer; it may well be that by turning away from the question, we can uncover a different kind of trajectory, even a different kind of relentless directness with which to engage the problem.” (149-150)
As a newcomer to data mining, one primarily working through trial and error without the aid of purchased software, much of this work is scrappy – making do within the limitations of what I have. Reading is done by eye, but at an alteration of scale. The emphasis, whether or not one is using software, is to alter the reading of a text by changing the scale, or trajectory in Muckelbauer’s words, of how the text communicates – drawing out keywords and phrases. To begin, after locating the full-text PDF of the article through the JSTOR database, I used the note-taking application Evernote to function as OCR (optical character recognition) software: creating a new note, I dragged and dropped the article into the space. The text was then copied and pasted into a text-edit program to render it as plain text. This plain text was used to generate the tagclouds at Tagcrowd.com. Parameters can be set within the site to ignore stop words, to list frequency values, and to control how many keywords are represented in a cloud. Reading across the three clouds, I also created a list of the ten most frequently used words in the article, which served as the basis for an infographic. The ten most frequently used words with their numerical frequency in the article:
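The frequency step behind the tagclouds and the top-ten list can also be done by hand in a few lines. A minimal sketch, assuming the article has already been rendered as plain text; the stop-word list here is an illustrative stand-in for TagCrowd’s, not the one the site actually uses:

```python
# Count the most frequent words in a plain-text rendering of an article,
# skipping stop words, as TagCrowd does before drawing a cloud.
import re
from collections import Counter

# Placeholder stop-word list; TagCrowd's actual list is longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "that", "as"}

def top_words(plain_text, n=10):
    """Return the n most frequent non-stop words with their counts."""
    words = re.findall(r"[a-z']+", plain_text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

sample = "Writing is an activity; writing is ecological, and writing is social."
print(top_words(sample, 3))  # [('writing', 3), ('activity', 1), ('ecological', 1)]
```

Running `top_words` on the full plain text of “The Ecology of Writing” would yield the raw numbers behind the frequency infographic.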
To create the data for the other visualizations, the computer’s search feature was used to locate keywords, which were then tallied in an inventory list. I searched for “Cooper,” “ecology,” and “ecological” within each of the articles I collected that cited “Ecology” in their bibliography. I used these keywords to locate what was quoted and summarized of Cooper’s article, as well as to generate data on which elements of her text are being referenced. I located 14 quotes from her article and examined which pieces each article used in in-text citations. The use of quotations served as data to construct two infographics based on citation patterns. The last data set came from searching the NCTE (National Council of Teachers of English) database for articles that cite Cooper’s article, which accounted for the following:
Aviva Freedman “Show and Tell? The Role of Explicit Teaching in the Learning of New Genres” – Research in the Teaching of English 1993
Amber Buck “Examining Digital Literacy Practices on Social Network Sites” –Research in the Teaching of English 2012
Bruce Horner “Students, Authorship, and the Work of Composition” – College English 1997
Anish M. Dave and David R. Russell “Drafting and Revision Using Word Processing by Undergraduate Student Writers: Changing Conceptions and Practices” – Research in the Teaching of English 2010
Sidney I. Dobrin and Christian R. Weisser “Breaking Ground in Ecocomposition: Exploring Relationships between Discourse and Environment” – College English 2002
Jeremiah Dyehouse, Michael Pennell, and Linda K. Shamoon “‘Writing in Electronic Environments’: A Concept and a Course for the Writing and Rhetoric Major” – CCC 2009
Richard C. Freed and Glenn J. Broadhead “Discourse Communities, Sacred Texts, and Institutional Norms” – CCC 1987
Lester Faigley “Competing Theories of Process: A Critique and a Proposal” – College English 1986
Judy Kirscht, Rhonda Levine, and John Reiff “Evolving Paradigms: WAC and the Rhetoric of Inquiry” – CCC 1994
Lucille Parkinson McCarthy “A Stranger in Strange Lands: A College Student Writing Across the Curriculum” – Research in the Teaching of English 1987
Matthew Newcomb “Sustainability as a Design Principle for Composition: Situational Creativity as a Habit of Mind” – CCC 2012
Richard Fulkerson “Composition Theory in the Eighties: Axiological Consensus and Paradigmatic Diversity” – CCC 1990
Nathaniel A. Rivers and Ryan P. Weber “Ecological, Pedagogical, Public Rhetoric” – CCC 2011
Kristie S. Fleckenstein, Clay Spinuzzi, Rebecca J. Rickley, and Carole Clark Papper “The Importance of Harmony: An Ecological Metaphor for Writing Research” – CCC 2008
Trish Roberts-Miller “Discursive Conflict in Communities and Classrooms” – CCC 2003
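The “journals that cite the article” graph comes from tallying the venues in the list above. A sketch of that count, with the fifteen entries transcribed from the list:

```python
# Tally, per journal, the fifteen citing articles listed above.
from collections import Counter

citing_journals = [
    "Research in the Teaching of English",  # Freedman 1993
    "Research in the Teaching of English",  # Buck 2012
    "College English",                      # Horner 1997
    "Research in the Teaching of English",  # Dave and Russell 2010
    "College English",                      # Dobrin and Weisser 2002
    "CCC",                                  # Dyehouse, Pennell, and Shamoon 2009
    "CCC",                                  # Freed and Broadhead 1987
    "College English",                      # Faigley 1986
    "CCC",                                  # Kirscht, Levine, and Reiff 1994
    "Research in the Teaching of English",  # McCarthy 1987
    "CCC",                                  # Newcomb 2012
    "CCC",                                  # Fulkerson 1990
    "CCC",                                  # Rivers and Weber 2011
    "CCC",                                  # Fleckenstein et al. 2008
    "CCC",                                  # Roberts-Miller 2003
]

journal_counts = Counter(citing_journals)
print(journal_counts.most_common())
# [('CCC', 8), ('Research in the Teaching of English', 4), ('College English', 3)]
```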
These data sets, gathered from altering the scales at which the texts were read, served as the basis for the visualizations. The data focus could have been different for different ends, and could be represented in any number of ways; what is of interest in my work at this time is seeing what becomes visible in pulling out patterns and communicating that information graphically. Based on the patterns I noticed, I decided to construct:
a short video on the process of mining text
two animated gifs:
one that represents the process of mining an article to render plain text to work with
a gif of the tagclouds to look at patterns across the clouds [NOTE: there is script that can display tagclouds along an interactive timeline that I have used in the past developed by Chirag Mehta. However, the composition must be hosted on a web domain.]
two infographics that represent:
the ten most used words in Cooper’s article
the most quoted line in publications that cite Cooper’s article
two graphs that represent:
journals that cite the article
the ratio of most quoted lines from Cooper’s article
Data Mining and Visualization
Franco Moretti, to whom the concept of distant reading can be traced, described the impetus for his work: “a field this large cannot be understood by stitching together separate bits of knowledge about individual cases, because it isn’t a sum of individual cases: it’s a collective system, that should be grasped as such, as a whole” (Graphs, Maps, Trees 68). To Moretti, graphs provide data handles, not interpretation (72); “What graphs make us see…are the constraints and the inertia of the literary field—the limits of the imaginable” (82). The scholarship of Derek Mueller works to create visualizations that alter the scales at which we encounter texts. In Views from a Distance: A Nephological Model of the CCCC Chairs’ Addresses, 1977-2011, Mueller argues that
“As the scholarly record grows there is an escalating value in realizing connections. This is exceedingly important for newcomers to the field who must make inroads however they can, by conversation and conventional reading and writing, of course, but also by pattern-finding, by nomadically exploring conceptual interplay across abstracts and abstractive variations, and by finding and tracing linkages among materials and ideas, new and old.”
Mueller’s work takes an interest in what Moretti termed “distant reading,” which seeks to gather large textual datasets from which visualizations may be composed. While Moretti’s work focuses on charting entire literary genres, Mueller applies distant reading methods and data visualization more locally to alter the scales at which readers can encounter texts, generating word clouds that turn to data-mining processes to draw the most frequently used terms from full-text versions of the addresses. He notes that a radical reduction occurs in this process, allowing “selected parts to stand out from the thick, ecologically entangled whole.” These clouds are not written with the goal of summary or even coherence as a cohesive whole, instead attempting to shift what is available to a reader to wonder and wander about. Compositions are not self-contained; their materials are assembled from spaces beyond that of their frame, and when attention is confined to a works cited page, that microscopic view inhibits the visualization and the tracing of the materials that have been assembled, that have been constructed. In Grasping Rhetoric and Composition by Its Long Tail, Mueller describes data mining as a pre-inquiry methodology, explaining that “these [data mining] methods catalyze questions and begin to provide a means of addressing such questions more systematically than is otherwise available” (200). Visualizing data affords new perspectives and patterns that might otherwise go unnoticed in materials (207). For the discipline, the benefit of quantitative methods is not limited to the noticing of singular or isolated patterns, but makes available “the stabilization of reusable, interoperable, field-wide datasets” (200) – this work is meant to be recreated, reflecting its ecological nature. Mueller describes this as “heuretic disciplinography,” the
“writing and rewriting the field by exploring the intersections across different scholars’ bodies of work as well as the associated pedagogical, theoretical, and methodological approaches they mobilize.” (201)
Distant reading can aid in establishing and tracing the structure of scholarship through the relations of its materials, viewing our scholarship “in relation to the complex and highly distributed processes involved in the production, distribution, and valuation of those products” (Shipka 51), because compositions are not singular or isolated. In Toward a Composition Made Whole, Jody Shipka argues that
“in requiring that we trace the highly distributed processes associated with the production of texts, the framework also militates against text-dependent conceptions of multimodality by foregrounding the variety of tools, participants, and actions supported (or may even have thwarted) the production of a particular text.” (52)
Johanna Drucker describes data visualization in Graphesis as a “concern with the creation of methods of interpretation that are generative and iterative, capable of producing new knowledge through the aesthetic provocation of graphical expressions” (41). Data visualization is concerned with the affordances of structuring/producing knowledge through graphical form—creating methods of interpretation that are generative in their expressive provocation. These visualizations are dynamic in that they are models for generating new knowledge through visual means, not re-presentations of knowledge that already exists—a way of seeing what is usually unseen. These are knowledge in the making. Data visualization of the field is not writing histories or counter-histories to remember back or progress forward, but making visible what is both available and unavailable to us in terms of materiality. Drucker explains, “Our ideas of what something should be—a house, an airplane, an automobile—constrains our ability to design these things within an abstract model. Breakthroughs in knowledge come from changing the model, or by innovative expressions.” Our reading of texts is based on traditions of reading texts in a certain manner, at a certain distance or proximity; “How we know what we know is predicated on the models of knowing that mediate our experience by providing conceptual schema or processing experience into form” (15); what we compose is in part informed by how we have previously composed, a process that makes available certain favored materials while excluding others as unavailable, or unnoticed. She goes on to argue that
“Graphic schema create syntactic structures within which semantic values can be assigned and maintained. We can read the organizing syntax of these graphic structures. The structured relations among information elements is as much an expression of a way of thinking as any other intellectual form. To put it another way, graphical structures are rhetorical arguments.” (17)
Like Drucker’s work, creating word- and sentence-level data sets for an ecological reading is “concerned with the creation of methods of interpretation that are generative and iterative, capable of producing new knowledge through the aesthetic provocation of graphical expressions” (41). These ecological visualizations are premised on the idea that an image, like a text, is an aesthetic provocation, a field of potentialities, in which a viewer intervenes. Knowledge is not transferred, revealed, or perceived, but is created through a dynamic process (36).
Data Terrariums: Ecological Readings
Let your ears and eyes wonder/wander across the text at differing scales and representations. What do you notice? First, think of how you’ve engaged with the text up until this point – at what distance did you read it? Up close? Across sections? What might become visible in reading the following?
Mining an article to render plain text for data visualization work
These are only a few ways of reading ecologies based on materials and connections I chose to make visible – pulling them out of the rest. The same data sets I created can be represented differently, or the data set can be expanded. What is visible is different than reading blocks of alphanumeric text in a journal format; what can and can’t be noticed is different. Reading Marilyn Cooper’s “The Ecology of Writing” would surely draw attention to certain materials – keywords, citations – but there is something different in allowing these materials to shift themselves from the confines of the rectangular field. They are able to move between relationships. What can be noticed is ecologically evocative – one reader is likely to notice something differently from another. One word has greater resonance, one relationship makes a line of thinking more available; or, one might notice what isn’t present that we might have taken for granted or forgotten. These should not be the only representations; they should not be left to settle and decompose. Ecologies are dynamic and need relationships across materials to persist.

In exploring what ecological methodologies make available conceptually from the materials we exist within comes the necessity of raising as a matter of concern that such materials must be rendered visible for conceptual use. In order to construct ecologies as methodology, data is necessary. Much of our data is text-based scholarship that exists across an expanse of spaces in paper and digital print. Some is accessible only with membership to institutions or associations, some is open to circulate via Twitter, blogs, and the passing on of tags and links, and some is only available through the mail or library reserves. We exist in an abundance of materials, but much of it isn’t available as data, or to construct data for reworking. Data sets from textual corpora can be mined, rendered, and shared for different eyes to look over and notice and collaborate with.
Compositions can be more mindful of one another, and the materials assembled to make them possible.