In Liza Potts’s Digital Rhetoric class at Michigan State University, we were assigned a “Tracing Digital Events” project, with the objective of learning “how to trace events in digital spaces, broadening [our] understanding of these technologies, participants, organizations, and genres” (Potts, WRA 415 assignment sheet). As relatively new scholars in digital rhetoric, we decided to use this assignment as an opportunity to trace conversations we might soon be entering, choosing to focus our analysis on the 2013 Computers and Writing Conference held at Frostburg State University. This project helped us identify current trends in the Computers and Writing community, while simultaneously helping us think about how these trends are integrated and preserved in the digital platforms surrounding the conference itself. It also led us to an analysis of tools used to interpret big data in digital spaces.
Since this Sweetland DRC blog carnival is focusing on “big” digital data, we want to discuss our process for analyzing the 5,600+ tweets coming out of the 2013 C&W Conference, which, as Quinn Warnick pointed out via Twitter, caused the conference tweets to trend in the U.S. While we analyzed conversations surrounding the conference on digital platforms such as the conference website and Facebook page, here we want to share our analysis of the conversation that took place under the #cwcon Twitter hashtag.
The Tools: For the purposes of analyzing a relatively large data set within a limited time, we used Linguistic Inquiry and Word Count (LIWC) to identify and quantify various word categories that could illustrate the different types of moves being made through conversations at the conference. LIWC checks each word against its dictionary, which assigns words to one or more pre-determined categories and subcategories. For example, LIWC will categorize the word “cry” under the following: negative emotion, sadness, overall affect, verb, and past tense verb. It then reports each identified category as a percentage of the total words analyzed. We should also note that LIWC allows users to create their own dictionaries in the program, allowing researchers to focus on language that develops from the data itself. However, both the LIWC dictionary and user-generated dictionaries have some limitations in terms of providing contextual evidence to expand on linguistic categories.
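LIWC’s core move — matching each token against a dictionary that maps words to one or more categories, then reporting each category as a share of total words — can be sketched in a few lines of Python. The mini-dictionary below is our own illustration, not LIWC’s actual (and much larger) dictionary:

```python
from collections import Counter

# Toy dictionary in the spirit of LIWC: each word maps to one or more
# category labels. (Illustrative sample only, not LIWC's real data.)
DICTIONARY = {
    "cry":   ["negemo", "sad", "affect", "verb", "past"],
    "love":  ["posemo", "affect"],
    "think": ["insight", "cogmech"],
}

def categorize(text):
    """Return each category's share of total words, as a percentage."""
    words = text.lower().split()
    counts = Counter()
    for word in words:
        for category in DICTIONARY.get(word, []):
            counts[category] += 1
    total = len(words)
    return {cat: 100.0 * n / total for cat, n in counts.items()}

print(categorize("i love to think"))  # e.g. posemo and insight at 25% each
```

Note that, just as with LIWC, a word can raise several category counts at once (“love” counts toward both positive emotion and overall affect), which is why the percentages in our results below can sum to well over the total.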
Following an initial linguistic analysis, we analyzed Twitter conversations archived and kindly shared with us by Melanie Kill, who used TAGSExplorer to collect all #cwcon tweets during the conference last year. We then used visualization tools such as Tagxedo to analyze the conversations about the conference more broadly.
The Analysis: We ran all 5,625 archived tweets through LIWC’s default dictionary (after removing the symbols “@,” “#,” and “RT”). The following categories were identified by LIWC as most significant, each representing at least 1% of the Twitter conversation:
Pronouns (I, them, itself): 8.06%
Social (friends, family, humans, neighbor): 7.98%
Personal Pronouns (I, them, her): 5.04%
Positive Emotion (love, nice, sweet): 3.9%
Insight (think, know, consider): 2.38%
Achievement (earn, hero, win): 1.93%
Cause (because, effect, hence): 1.9%
Perceptual Processes (see, hear, feel): 1.73%
Leisure (chat, movie): 1.41%
Discrepancy (should, would, could): 1.21%
Negative Emotion (hurt, ugly, nasty, worried): 1.11%
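The preprocessing step mentioned above — stripping “@,” “#,” and “RT” before feeding the tweets to LIWC — can be approximated with a short Python function (a sketch of the cleanup, not the exact script we used):

```python
import re

def clean_tweet(tweet):
    """Strip the retweet marker and the '@'/'#' symbols, keeping the
    usernames and hashtag words themselves as countable tokens."""
    tweet = re.sub(r"\bRT\b", "", tweet)      # drop the retweet marker
    tweet = tweet.replace("@", "").replace("#", "")
    return " ".join(tweet.split())            # collapse leftover whitespace

print(clean_tweet("RT @someone: loving the #cwcon keynote!"))
# -> "someone: loving the cwcon keynote!"
```

A design choice worth flagging: removing only the symbols (rather than the whole @-mention or hashtag) means usernames and hashtag words still count toward LIWC’s word totals, which slightly dilutes every category percentage.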
So, what does this tell us? The high percentage of personal pronouns may be a reflection of the parameters of Twitter as a digital platform. Unlike Facebook, which provides a fixed space and a linear conversational record hosted by one anchor user, Twitter is a non-linear collection of statements or questions posted between individuals as an information network “without barriers” (About Twitter). Many of these individuals already know and “follow” each other in their professional and social interactions, and are anchored in this case by the #cwcon hashtag. The lack of a centralized conversation anchor (besides the hashtag) and the personal relationships built before the conference may encourage users to tweet from a more individualized perspective, thus using more personal pronouns.
The top three categories—pronouns, social, and personal pronouns—reflect the social nature of these conversations. Many of the tweets focus on supporting or lauding one another, expressing gratitude, or conveying personal emotion. Positive emotion greatly outweighed negative emotion. The instances where LIWC coded tweets as containing negative emotions reflect the stresses associated with an academic conference. Words such as “excited,” “great,” and “awesome” reflected positive emotions, while words such as “stress,” “overwhelmed,” and “hard” signaled the balance between work and leisure at play during the conference (and arguably in academia in general). While these tweets expressed an individual perspective, they were often retweeted, favorited, or responded to. In this way, tweets that expressed individual emotions became part of a collective experience as participants responded to each other. This tweet, for example, directly appeals to the group at large, and we can see that other tweeters interacted with it by retweeting. Through retweeting, favoriting, and responding, the community turned even tweets expressing personal emotion into social interactions within the conversation at large.
We can also see conversational moves in the many tweets concerned with reporting information. Many tweets draw attention to material presented in the conference sessions, simply restating the material or supporting what was discussed by the speakers. Some tweets pose questions based on this material, or link the information to their own areas of interest. These reporting tweets are where LIWC’s categories such as insight, achievement, cause, perceptual processes, and discrepancy can be seen. This is particularly the case during keynote presentations, where multiple conference attendees are often in the running to be the first to tweet interesting points made during the presentations. For example, check out the numerous references to Plato and video games captured by Michael Maune in this Storify of James Paul Gee’s keynote.
Repeated tweets stemming from one presentation may help us identify what the community perceives as important, in this case linking to both ancient rhetorical tradition and more contemporary aspects of the field. We can then argue that Twitter provides a space for new names to enter the conversation, primarily by documenting the work of other established scholars. While #cwcon twitter conversations are not necessarily anchored around specific individuals, there is an emphasis on highlighted presentations and on the people who tweeted the most during the conference.
Issues with research methods and tools: While the linguistic analysis of #cwcon tweets provided a starting point for looking at disciplinary conversations in computers and writing, there are certainly many limitations to this type of research. As internet studies scholar Nancy Baym suggests in “Data Not Seen,” when analyzing big data on social media, “we need to remain keenly aware of the inherent multiplicity of meanings they collapse, the contexts in which they are embedded, and, perhaps most importantly, the depth of what they do not reveal.” This contextual layer was not always evident in LIWC’s analysis, which led us to identify several potential discrepancies in the linguistic categorization:
Because of the informal way people communicate on Twitter, including the use of abbreviations, some key words are missing and could not be coded by LIWC. Here, two tweets that should have triggered “personal pronoun” went unnoticed because both omit “I.” Pronouns occupied the highest category percentage at 8.06%, but given the style of writing used in tweets, the actual percentage may be higher.

Without context, tools used to sift through data can also lead to misinterpretations of meaning. For example, LIWC flagged this tweet for the word “anxious,” which it categorizes as a negative emotion; the tweet, however, expresses positive emotion about the conference. In a second instance, another tweeter uses the word “anxiety” as a joke while referring to something unrelated to the conference. Conversely, LIWC may count positive words that are actually being used to express negative emotion. In this tweet, the writer expresses concern using words generally thought to be positive, such as “hoping” and “victor.” These coding discrepancies point to areas of concern in computational rhetoric, a field that seeks to provide context for machine-coded data.
As a way to provide some context to our data analysis, we decided to look more closely at several examples after the initial LIWC categorization. We started by running the tweets through Tagxedo, a visualization tool that helped us look for words tweeted most often. As evidenced in this word cloud, “students” and “writing” were major concerns for tweeters that remained absent from LIWC’s analysis. In addition to LIWC’s categories, visualizing the tweets helped us see areas of the conversation and the conference program that may or may not be represented on Twitter. For example, presentations about multimodality, learning, and students seemed to have more representation than issues of gender, language, and race, which appear much smaller in Tagxedo’s visual. Looking at LIWC’s categories in conjunction with Tagxedo’s visualization helped us contextualize the Twitter conversations, though there is certainly much more that could be done to represent the conversations more accurately.
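At bottom, a word cloud like Tagxedo’s is a frequency tally with common function words filtered out; the same underlying count can be reproduced with `collections.Counter`. The tweet list and stopword set below are stand-ins for illustration, not our real archive:

```python
import re
from collections import Counter

# Stand-in data; the real input was the 5,625 archived #cwcon tweets.
tweets = [
    "great session on students and writing",
    "thinking about writing and students",
    "students loved the keynote",
]

# A tiny stopword list; word-cloud tools filter a much longer one.
STOPWORDS = {"the", "and", "on", "about", "a", "of"}

words = Counter(
    w for tweet in tweets
    for w in re.findall(r"[a-z']+", tweet.lower())
    if w not in STOPWORDS
)
print(words.most_common(3))  # "students" and "writing" top the toy tally
```

Seeing the raw counts side by side with the cloud is useful, since word-cloud sizing can exaggerate or compress differences that the numbers make plain.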
Why does this analysis matter? After the conference is over, tweets are one of the most tangible records we have of the conversations participants engaged in. Many people who do not have access to physical conferences participate through online components, often reading and even blogging about conversations happening on Twitter. While Computers and Writing has an archived online conference, and while participants are increasingly sharing their conference resources online, it’s important to think about the values being represented through all digital interactions. What is said on Twitter says something about the study of Computers and Writing. It says something about what is really being discussed by the community of scholars in this field, beyond what is listed on the schedule. An analysis of these conversations brings greater visibility to areas of interest while helping us think about areas that might be neglected, and it may help us represent our disciplinary values more effectively. Using the record of our own conversation may help guide the community to areas that need more attention, and big data analysis with contextual additions may be only a small step in this direction.
–A big thanks to Liza Potts at MSU for helping us with this project and post, to Melanie Kill at the University of Maryland for sharing her #cwcon Twitter data, and to Lindsay Neuberger at the University of Central Florida for help running our data through LIWC.