YouTube Commentary: Social Interaction in Online Publics


Conversation, that spontaneous and informal expression of ideas, occupies space both in the minds of participants and in the social sphere. This research paper explores how YouTube users manage to socialize using interactive features of the website without traditional context clues. Through a discussion of social phenomena in this setting, this paper argues that users understand meaning through a process of speech mode formation. Speech modes take into account the composition of a group's individual communicative choices in response to a particular setting, and they document how that collective forms a sphere of meaning and builds standards of practice. A combination of qualitative and quantitative research methods was applied to one hundred randomly selected commentary threads from YouTube's fifteen content categories. The conceptual placement of speech within Jurgen Habermas’ publics allows for the creation of internet cultures and normative communicative behaviors which enable users to maximize their communicative competence and to manipulate meaning. In constructing speech modes, this paper addresses internet aggression (trolling), economic activity, social networking, supportive speech, and opinion transmission.



    With wrists resting on a smooth plastic surface, fingers positioned above square keys, and eyes focused towards an LCD screen, the computer user stares at a cursor flashing in a short text box positioned at the bottom of a website’s main feed. On any given day, two billion computer users from across the globe access the internet (Miniwatts Marketing Group, 2011). While browsing the web, they perform a wide range of activities, but one major component of these tasks is social in nature (Green, 2002). Those social activities may be limited to communicating with people the user already knows; but there is an ever increasing degree of social interaction pursued by complete strangers. In the past, communication research focused on how strangers interact online in a one-on-one or size limited setting. But interactive users now face the new quandary of establishing ways to speak effectively and meaningfully to an unknown audience that transcends all known physical divides (Watkins, 2009). Scant research addresses the fact that, even without an awareness of geographic location, biographical information, or time, users manage to complete the surreal and impressive task of conversing with an unknown audience. Throughout this research paper, I argue that social interaction within online public sites is possible through a process of speech mode formation and acculturation that occurs when new users join an existing online community. A speech mode takes into account the composition of a group’s individual communicative choices in response to a particular setting and documents how that collective forms a sphere of meaning and builds standards of practice. To demonstrate speech modes, I rigorously analyzed a video-sharing website, YouTube, using quantitative and qualitative investigation methods. Disclaimer: Before beginning my discussion of online communication, I would like to warn readers about the language composition found within my ethnographic field site. 
Although I have tried to be as conscientious as possible with my selection of samples, some expletives and other nefarious content are referenced as necessary to fully illustrate how speech modes are constructed and followed within the YouTube community.

    Mediated Communication

    Understanding social interaction on YouTube is difficult without first addressing how technology fundamentally changes the ways in which people communicate (Bolter & Grusin, 2000). In face-to-face interaction, meaning is constructed through a process of conversational turns (Schiffrin, Tannen, & Hamilton, 2003). The first speaker constructs his utterance by considering the intended meaning and perlocutionary goal, but the speaker must also understand the conditions he is speaking within (Austin, 1975). All speech is arbitrarily constructed; that is, there is a large array of possible ways to perform certain tasks, yet only a small range is considered socially appropriate. Normative social behavior is culturally bounded and fundamental to communicative competence (Hymes, 1964). Without understanding what is appropriate communication in particular settings, the speaker may not be understood or accepted in the social setting. After an utterance is completed, the listener must consider the statement, the speaker, and the listener’s relationship towards the situation to substantiate meaning and formulate a response. A large component of this process is compiling unspoken clues. In face-to-face interaction, the listener relies on non-verbal clues or the surrounding environment to assist in constructing meaning. This is most apparent with sarcasm or other comedic techniques; the recipient interprets and understands the original message differently based on the speaker’s delivery. Tone, facial expression, the social setting, or numerous other context clues all modify the interpretation of fundamentally similar statements. When the internet is the medium, recipients are forced to infer meaning from clues derived from the website structure, official site policies, content posted earlier, or by bluntly asking the speaker what meaning they intended. 
Social technology users, especially those who are just learning a new interface, can become frustrated and quickly recognize the implicit difficulty with text-only communication. Anyone who has ever struggled to interpret the tone of an email or tried to write the perfect resume to impress employers understands this dilemma.

    Speech act performance, as both a conscious and intuitive process, is learned early in life as an essential social skill (Austin, 1975). Yet despite being taught to communicate by parental figures early in life (also known as enculturation), the social learning process is never complete for human beings. This fluidity in communicative practice is even more common in contemporary society where mediated conversation is normal. Mediated conversation uses a medium to convey a message. In the scope of my research, the medium is YouTube and its interactive features. Technology such as YouTube requires expertise to understand the user interface; however, this social technology is more complex than an instruction manual could convey. Understanding how to use a piece of interactive software is very different from grasping the second-order messages relayed to the audience by selecting one medium over another. Remediation, as described by Jay Bolter and Richard Grusin, is how new mediums achieve their cultural significance (2000). Users renegotiate social assumptions for new technology, such as cellphones, based on older media, such as land-line phones. YouTube provides a salient example of this process. Users currently have the option to provide feedback by making a video a “favorite,” using a rating system, and/or posting commentary. YouTube is also testing a new reaction feature where users have the option to vote on videos as OMG, Epic, LOL, Fail, WTF, or Cute, a series of vernacular words representing different emotive states (Scott, 2011). The acronym OMG stands for “Oh My God” and is used if the video was shocking; “Epic” is used if something worthy of being remembered occurred. “Laugh Out Loud” (LOL) means the video caused the viewer to audibly laugh. “Fail” is the label given when a normal activity is performed incorrectly. “What The [F---]” (WTF) is another type of expression of shock. “Cute” is usually associated with children or pets. 
According to YouTube, these six reactions embody the bulk of sentiments expressed on comments throughout the website. The addition of this reaction system to the interactive features changes the social dynamic of the commentary thread in unforeseen ways, possibly by restricting the content that is appropriate to post. This is an example of remediation; if someone were to comment lol, their effort to type the comment takes on new expressive force, because the user turned down the time-saving option to apply the reaction feature’s LOL. The allocation of additional time spent reacting means the text takes on a more personal tone, but it also changes the expectations for the written word. Since those six reactions are codified, anything posted in the commentary thread must now be substantially different to be appreciated by YouTube users. That has implications for regional or intentional tone variation associated with those six emotive phrases. In the past, users could respond with lolz or lulz, a variation on LOL, only with a sarcastic tone, but the universalism of LOL could deprive users of that expressive force (Scott, 2011). Slight modifications to the user interface, such as this reaction feature, are common for YouTube as the company strives to maintain an industry leading status. Continual refurbishment of the site further impedes the creation of technology use standards and implied social meaning; users must negotiate and renegotiate their conversational activities and associated values.

    The negotiation of internet communication is further aggravated by users performing from real world positions. Although users glorify virtual experience for the opportunity it provides them to redefine their identity, a user’s behavior and life experience in the physical world shape their online persona. Positionality determines how the user will react and respond to certain content and speech formations, and physical reality is the only empirical tool users have to interpret text-based communication. The corporeal world informs how users interpret social interactions, at least until a virtual group constructs its own system of meaning or the user develops a sensitivity to internet cultures already in place. Once that occurs, the user can create frames of reference from the virtual-scape that exist independently of their material life. I argue that this dualistic progression is part of speech mode formation, since a group of speakers will form a foundational speech community that shares expectations surrounding the use of language. New members enter the community and learn those expectations through acculturation (a process of socialization). Once new members are experienced with the speech practices of that community, they are able to participate fully and influence the redefinition of communal norms and expectations. Social interaction online is naturally intricate, but the formation process for publics is complicated by a highly permeable boundary into the community. YouTube users are especially vulnerable to false interpretations of meaning when they interact within a community that spans the globe. YouTube’s distinct type of interaction is known as mediated quasi-interaction, a concept that describes how users broadcast speech to a large number of unknown potential recipients in their public sphere (Thompson, 1995).

    Online Publics

    While it is difficult to claim universal principles for a system that incorporates over two billion users, internet communication does follow a set of foundational principles. This ideology underlies most online social experience in some capacity and gives preference to freedom of speech, to challenges to authority, and to the free exchange of information (Poole, 2010). This is both a natural and intentional consequence of virtual reality. It naturally develops when communication is removed from physical reality, leading to a disinhibition effect which manifests in several ways (Suler, 2004). The most notorious is aggressive or provoking behavior online, labeled as trolling, where the user intentionally writes content designed to get a negative emotional reaction from the victim. Disinhibition effects also create an atmosphere where users are willing to post personal or emotional content, and individuals claim knowledge or authority over topics they could not objectively defend in face-to-face conversation. These three components often lead to conversation “derailment” and the stereotype that no one ever wins an internet argument. For American participants at least, this liberal and non-restrictive internet ideology is an intentional result of policy makers’ commitment to promoting democratic participation and diverse collaboration. Media theorist Jessica Clark describes the action agendas for future public media institutions and policy makers as a need “[to] develop a participatory national network and platform; to cross cultural, social, economic, ethnic, and political divides” (2009, para. 8). Most websites are developed from this position. While website hosts do strive to create a secure and safe environment, usually codified in their community rules or regulations, the hosts and moderators place a strong emphasis on not censoring participants. However, this egalitarian system produces some highly controversial situations. 
Beyond the issues associated with preventing speech related crime (hate speech, violations of copyright, etc.), online freedom facilitates the networking of hackers and some radical groups that nation states consider dangerous for security. An equally controversial situation originated from the free exchange of information in late 2010. Wikileaks and its founder, Julian Assange, sparked a transnational struggle when the company leaked classified documents detailing the United States’ diplomatic negotiations, interdepartmental memos, and military events from Afghanistan and Iraq (Stephens, 2011). This situation highlights the juncture between challenging authority and the pragmatic consequences of an ungoverned internet system. The conceptualization of that divide between idealism and real world issues is extremely important for framing speech mode development. While some types of speech may be considered obscene, personal, assertive, or criminally negligent in spoken public communication, without a competing interest prohibiting such speech, it is tolerated in an online setting.

    Another important idea to incorporate into a holistic perspective of YouTube interaction is the idea of a public. Referring to the public sphere is common today, yet the concept is relatively new. Jurgen Habermas, a German sociologist, developed his theory on publics in the 1950s by describing the bourgeois public sphere that emerged in Europe during the eighteenth century (Goode, 2005). Although Habermas’ theories concern face-to-face interaction and political speech, his ideas are adaptable to contemporary publics online. Historically, popular cultures were a platform for society’s powerful elites to represent themselves; but with political, economic, cultural, and educational transformations in Europe, an egalitarian public sphere rapidly developed where all citizens could meaningfully participate. Dr. Luke Goode, an expert on contemporary media, describes how Habermas romanticized the educated discourse that emerged from coffee shops and salons throughout Europe, while he simultaneously critiqued the “frenetic pace of modern life [which] didn’t lend itself to critical reasoning. Neither… did the evolving mass media and cultural industries” (2005, p. 19). For Habermas, the creation of widely disseminated media culture is a return to the old system where society’s elite decide content, and the unfortunate masses indiscriminately consume any propaganda placed before them. Habermas could not have predicted the development of meaningful participation seen in today’s world with Media 2.0, a class of media where content is selected and produced by the consumer. For this reason, contemporary scholars disagree with Habermas’ cynicism, but his processual arguments are foundational. The relevant component for my research is his seminal theory about the creation of an environment, the public, where strangers can meet, converse freely and meaningfully, and then part ways. 
Members of the public may resume the conversation with separate members at any point, and the dialogue continues when they leave the coffeehouse. In the YouTube setting, members select personally relevant topics and establish their own publics molded by internet cultures and speech ways. Because the commentary thread is a continuous dialogue, the participating public members may fluctuate, but the conversation persists.

    The political scientist Benedict Anderson further advances the understanding of public social spheres with his discussion of imagined communities. These communities are imagined, “because the members … will never know most of their fellow-members, meet them, or even hear of them, yet in the minds of each lives the image of their communion” (1983, p.6). Anderson is discussing the spread of nationalism, but other socially constructed phenomena function in the same way. When YouTube users compose utterances addressed to the entire viewership of a video, they imagine themselves as one element of the grand total. A user’s recognition of position within the grand system means the public sphere is not limited to a scientific hypothesis of social interaction; it is a real concept located within a cognitive process and is an integral part of interactive operations.

    Speech Modes

    The French sociologist Pierre Bourdieu contributed a large body of scholarship to understanding social interaction. One of his famous theories, from Outline of a Theory of Practice, explains the idea of habitus (1977). This mental paradigm explains how patterns of practice develop. Bourdieu argues that, over time, life experience provides individuals with an intuition for behavior that, when combined with an understanding of the exclusive domain they are performing within, allows them to generate and regulate spontaneous actions. Speakers understand social settings and their own standards for normal communicative behavior through a cognitive process that weighs their motivational hierarchy, a structure constituted exclusively from past experience. I locate speech practices within Bourdieu’s representation of unconscious and repeated activity. When those practices become standardized and are shared by a group, a mode of action has developed. Speech modes are, borrowing from the French scholar Michel de Certeau, a way of using that incorporates the situational context and social conditions for expression (Certeau, 1984). In computer mediated communication, I argue that Certeau’s logic for “the operation of actions relative to types of situations” develops and is shared between internet users in the public sphere, and that this practical intelligence is necessary for communicative competence (p. 21).

    My theory is as follows. Speech practices are personally understood communicative decisions that are socially bound to speech modes, which act as interaction protocols. Speech modes are continually forming through collective negotiation and individual renegotiation. New members within a public speech community are socialized to understand speech modes so that they may successfully develop their own personal speech practices. Finally, a communicative rule of law exists where no user is above the unified and diachronic influence of speech modes; all speakers react to the collective force either in promotion, compliance, or defiance. The collective force can restrict or enable speakers in two ways: first, through the speaker’s personal desire to excel at communication and second, through the censorship of users who do not comply with the majority’s demands. An individual has speech practices, while a group shares a speech mode.

    For the sake of this YouTube study, speech modes are formed at the intersection of tone and social enforcement. Refer to Figure 1. The three tones–positive, undetermined, and negative–all express the reigning attitude on commentary threads. While there will always be extreme cases of individual speech practice, in general, the comments follow a high ratio of either positive or negative tone. In a few cases, the ratio of positive to negative tones is approximately equal, so the commentary thread has an undetermined speech mode. When I qualify a commentary thread as undetermined, I am not arguing for the absence of speech modes; instead, I imply that within the framework of that video and subsequent conversation, the community prefers an inclusive speech practice. The positive tone includes cases of supportive speech, personal anecdotes, cordial humor, and benign questions. The negative tone is commonly exhibited as sarcasm, ad hominem attacks, or other behaviors that violate YouTube’s formal community guidelines. After the speaker establishes a tone for their personal statements, there are varying degrees of communal enforcement that may be employed to either tolerate or to correct messages. By voting comments thumbs up or thumbs down, other users demonstrate which comments are exceptional or deficient. They can also flag comments that violate community standards as spam, or vote a comment down repeatedly until it is hidden for receiving too many negative votes. Another option for correction is a direct response to the offending user using the @ (at) function.
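
    As a minimal sketch of how a tone ratio could be reduced to a thread-level label, assume that each comment has already been hand-coded as positive or negative. The function name `classify_speech_mode` and the `margin` threshold of 0.2 are hypothetical choices made for illustration; the study itself only states that an undetermined thread has an approximately equal ratio of tones.

```python
from collections import Counter


def classify_speech_mode(tones, margin=0.2):
    """Label a thread from hand-coded comment tones.

    tones  -- iterable of "positive" / "negative" labels, one per comment
    margin -- hypothetical band around an even split that counts as
              "approximately equal" (an assumption, not a value from the study)
    """
    counts = Counter(tones)
    positive = counts["positive"]
    negative = counts["negative"]
    total = positive + negative
    if total == 0:
        # No coded comments: there is no basis for a dominant tone.
        return "undetermined"

    positive_share = positive / total
    if positive_share >= 0.5 + margin:
        return "positive"
    if positive_share <= 0.5 - margin:
        return "negative"
    return "undetermined"


if __name__ == "__main__":
    # Illustrative labels only, not data from the study.
    thread = ["positive"] * 8 + ["negative"] * 2
    print(classify_speech_mode(thread))
```

    A wider margin labels more threads as undetermined, so the threshold would need to be justified against the coded sample rather than taken from this sketch.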

    The following excerpt demonstrates positive and negative speech practices with varying degrees of tolerance and correction by other speakers. YouTube users comment on a video documenting how a veterinarian treated a cat. Although the doctor’s technique to stabilize the feline is natural and does no harm to the animal, some express concern for the cat’s well-being. Animal cruelty is an inflammatory issue that can rapidly become highly controversial. No identifiable user names are given for these segments of YouTube commentary. Instead, the speakers are randomly denoted by a letter which is only applicable within that commentary segment. Please note that for all excerpts within this paper, quotations are taken directly from the internet without correcting spelling or grammar. Profane expletives are denoted in brackets with the latter portion of the word censored, while pejorative terms referring to ethnic groups or vulnerable populations are deleted, noted in brackets as [pejorative].

    Speaker A: “this did not work, i had to chase my cat to get the clip off….”
    Speaker B: @Speaker A “lol”
    Speaker C: @Speaker A “Same here. But I got the clip off to the side (she wouldn’t stay still long enough), so maybe that was the reason. I freaked because she kept hiding where I couldn’t reach her.”

    This positive segment illustrates the use of comments that are relevant to the video, contain personal anecdotes, and are supportive of Speaker A. When Speaker B laughed at Speaker A’s comment, that user validated Speaker A’s particular speech practice. That supportive laugh and Speaker C’s responsive story bolstered the social experience and added to the entire thread’s mood. The next segment, however, reveals increasingly volatile commentary.

    Speaker A: “In case someone asks…That does NOT hurt the cat! Mind you, that doesn’t work with all cats. I speak from experience….”
    Speaker B: @Speaker A “It hurts the cat. It works on all cats. Only stupid Americans don’t know this.”
    Speaker A: @Speaker B “I’m not american, and I have two cats. One of them does not react to this.”
    Speaker C: @Speaker B “Haha [d------], it’s an instinct from when they were kittens and being handled by their parent. It does not hurt, or else the parent cat wouldn’t do it to their kittens all the time.”
    Speaker B: @Speaker C “Learn how to read; you dumb American. It hurts the cat. It works on all cats.”
    Speaker D: @Speaker B “I can assure you the cat is just fine. How do you think we restrain ALL the cats at a veterinary office??? It’s just extra skin, get a good grip on it and they freeze just like when their mother carried them that way when they were kittens.”
    Speaker B: @Speaker D “That cat was seriously injured.”
    Speaker D: @Speaker B “You would know that HOW??? Thanks buddy but i have been working with animals the last 10+ years. The cat is FINE. As soon as your get your veterinary license and are qualified to overrule me you just let me know.”

    The prelude for this exchange was Speaker A’s strong defense of the veterinary practice. Compared to the last segment, Speaker A increased their communicative force (the social power of an assertion) by manipulating capitalization, employing exclamation points and ellipses, and claiming personal authority. Speaker B then chose to challenge Speaker A with a negative comment calling them a “stupid American.” The remainder of the dispute involved Speaker A and two other users vehemently defending the video by appealing to their own expertise to make claims and, occasionally, by mocking Speaker B’s perception of the event. Speaker B repeated his accusations that the technique harmed the cat. Due to the seriousness of allegations of animal abuse, this conversation promptly became intense, especially when Speakers A, C, and D felt personally invested in the perception of Americans and their treatment of animals.


    My research handles the social interaction between YouTube users in several ways. First, a registered member posts a video that is usually one minute to five minutes in length; the user can choose to restrict access or disable the interactive features, but most videos are fully enabled and available to the public. The original poster selects which content category they would like their video viewed under. In addition to this, they also write several tags which clarify the video’s contents. Audience members who browse the site may discover the video in a suggestion category, or they can search using keywords inside the search engine. Once the user opens a video’s main page, they are provided with a spectrum of feedback options: they can merely view the video and continue browsing; vote the video thumbs up or thumbs down; add the video to their favorites; comment; or address a reply to a specific user. If the user does choose to comment, others may reply and engage with them directly. In this way, lengthy conversations form between two strangers which are embedded within the socialization of thousands of other users. The decision trees for primary and secondary speech acts map the progression of this behavior. Refer to Figure 2 and Figure 3 for illustrations. Note that these interactive decisions occur in a matter of seconds. The process of communicating online becomes intuitive over time, and the users do not consider the implications of their speech decisions, but the nuances of the situation are still conveyed through the interpretive power of second-order messages. The primary speech act begins after a user views a video; they may continue browsing through the video selection, or they may provide feedback. The subsequent speech act(s) occur when someone directs a comment directly to another user. The first step is to determine the tone, whether it is aggressive or supportive. The speaker may choose to reply or ignore the directed speech. 
If they do choose to reply, they can be aggressive or conciliatory. The research focuses on how users navigate through this process as they interact.


    To understand the full complexity of YouTube interaction, I approached the community from several directions to both triangulate my results and to locate underlying patterns. I had four points of inquiry: statistical testing of video meta-data, an online survey distributed to Facebook users, content analysis of commentary threads, and an in-depth reading of commentary threads using techniques borrowed from conversation and discourse analysis. These specific methods contributed to my holistic impression of YouTube’s interactive community. The first aim of my investigation was to place YouTube commentary in perspective. This was achieved by understanding the technical aspects of interactive features and then by collecting the total number of users who engaged in those activities. By comparing and contrasting those numbers from different content categories and various sizes of viewership, I uncovered patterns within the views to interaction ratio. The comparison was completed using hypothesis testing of compiled distributions separated for size and content. My statistical analysis was computed using the SAS analytic software.
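
    The comparison described above was computed in SAS, and the specific test is not named here. As a rough illustration only, the following Python sketch compares the views-to-interaction ratio of two groups of videos using a two-sided permutation test, which is a stand-in technique and not necessarily the one used in the study. The function names, the seed, and all sample ratios are hypothetical.

```python
import random
from statistics import mean


def interaction_ratio(comments, views):
    """Comments per view for a single video."""
    return comments / views


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in mean ratio.

    Returns an approximate p-value for the null hypothesis that both
    groups of videos share the same distribution of interaction ratios.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Invented ratios for two content categories, for illustration only.
    category_one = [0.001, 0.002, 0.003, 0.002, 0.001, 0.003]
    category_two = [0.050, 0.060, 0.070, 0.060, 0.050, 0.070]
    print(permutation_test(category_one, category_two))
```

    A permutation test is used in this sketch because it makes no assumption about the shape of the ratio distribution, which is likely to be heavily skewed across videos of very different sizes.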

    The second point of inquiry employed a survey distributed through Facebook, a social networking site, to profile who uses YouTube’s interactive features and why. As a result of this indirect targeting technique, a large body of data was collected on YouTube users who never use the interactive features. Their survey answers covered why they expressly do not engage with others on public sites. The survey was hosted by, a company that provides online survey software. Another technique for investigating who actively uses YouTube was a review of YouTube users’ profile pages. This observation revealed several common material culture denominators that may be significant in understanding identity construction and, as a result, the social connections within YouTube commentary.

    In the next area of analysis, I used a Python script (a computer program executable on Windows) to tabulate the word frequency of a commentary thread. This content analysis was performed to ascertain what words were being added to the YouTube vernacular(s), and whether those words were necessary for meaningful expression. This information was recorded for ten videos of varying sizes within the fifteen content categories of YouTube. The changes in word choice and rate of appearance signal separate speech communities within YouTube.
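
    The original script is not reproduced in this paper. A minimal sketch of this kind of word-frequency tabulation, assuming the thread has already been collected as a list of comment strings, might look like the following. The function name `word_frequencies`, the tokenizing pattern, and the sample comments are illustrative assumptions and not the script used in the study.

```python
import re
from collections import Counter


def word_frequencies(comments):
    """Count how often each word appears across a commentary thread.

    Words are lowercased so that "LOL" and "lol" are tallied together;
    apostrophes are kept so contractions such as "couldn't" stay whole.
    """
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z0-9']+", comment.lower())
        counts.update(words)
    return counts


if __name__ == "__main__":
    # Invented sample comments, for illustration only.
    sample_thread = [
        "lol that cat",
        "LOL lolz",
        "this did not work",
    ]
    for word, count in word_frequencies(sample_thread).most_common(5):
        print(word, count)
```

    Lowercasing merges capitalization variants, which may hide expressive differences such as emphatic capitals, so a closer reading of tone would need the original casing preserved.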

    Ultimately, I used the context provided by the previous methods along with observation, or lurking as it is alternatively called within the internet community, to assist in my understanding of the social and cultural engagement that occurred throughout commentary on YouTube. Using methods from discourse and conversation analysis, I traced how members established meaning, navigated disagreements, and provided support when they agreed with a comment. Conversational turn-taking, sequencing, and repair, along with discursive assumptions and socio-psychological characteristics apparent in the YouTube commentary led to a thick and vibrant description of the speech modes apparent within the website. Finally, I verified my findings by speculating how users would interact in a certain setting based on the situational context, then observing if they did, in fact, follow the predicted speech mode.


    Rather than separating my findings into sections based on the particular research method used, I will examine speech modes and practices as specific themes, providing an explanation of how the themes relate to my theory and the analysis that supports this view. This functions as both an illustration of online interaction and an argument for my conceptualization of those interactions. I will begin with a discussion of evidence needed to support my theory, and how the collected data fulfills this requirement. The next section on community provides a starting point for discussing who is using the interactive features on YouTube, and why they are doing so. YouTube’s community guidelines, online survey responses, and contextually bound examples of users arguing about the proper use of features will be synthesized into operational terms. Then, in a section entitled functional roles, I will develop a public sphere based on those communicative desires. For the purposes of this study on YouTube, those functional roles emerge as trolling, economic advancement, social networking, supportive speech, or opinion transmission (although it is possible to have dual motives, such as economic and social networking); this list is not exhaustive. Once public spheres are abstractly delimited based on these roles, I will apply statistical testing and individual examples to demonstrate the communicative rule of law, the fourth tenet of my speech mode theory.

    The evidence necessary to prove my theory will demonstrate that users, without explicitly being directed to act in a particular way, will act in a patterned manner naturally. This is the area of contention between absolute freedom, which I argue is a mythical construct, and rigid control of the system, which YouTube is clearly not. Because every interactive moment on YouTube is performed not from a set algorithm, but by real people making individual decisions without consulting one another, there has to be an explanation for why their behavior is not randomly occurring. The following data demonstrate that interactive behavior within YouTube does follow patterns. Patterns are formed and controlled by the collective decisions within this social sphere.

    The Community

    The first issue, one fundamental to how YouTube users imagine themselves within a community, is who comprises the interacting assembly. In the context of internet communication, it is nearly impossible to successfully profile psychological impulses behind the actions of such a large multitude of users; however, by using the functional reasons for the users to interact in this setting, I can build a framework to reference when analyzing phenomena. YouTube videos are viewed hundreds of thousands of times a day, whether the video is embedded in other sites, discovered during a targeted search, or glimpsed while the individual user is simply browsing. The interesting component is that, despite a website that promotes interaction and collaboration, the majority of viewers are silent lurkers. To understand why these individuals opt not to participate, I employed a Facebook survey that was uniquely critical of those who post frequently on private networking sites, but then choose not to engage in public dialogues on YouTube. All but two survey respondents used Facebook as their primary social networking site. Of those, 81% posted original content at least once a week, with 26% posting daily. Seventy-one respondents used YouTube within the last six months, more than any other type of public content site (e.g., product review sites, news media, and forums). From that group, 53 users operated YouTube more than any other public site, and 82% used it at least once a week. However, 60% never posted new content on YouTube, and only 7% commented or produced original material for the site at least once a week. When asked if posting on Facebook felt the same as posting content on more public websites, only 21% thought they were equivalent. I ultimately asked the users to explain why, or why not, they commented on YouTube videos. The answers fell into nine content grades: five positive and four negative responses.
A few memorable answers illustrate the nuances of these positions.

    Survey Question: Have you ever commented on a YouTube video? Why or why not?

    1. Aggressive behavior for entertainment: “Yes, to troll people.” (2 responses)
    2. Economic concerns: “Yes, because I am trying to spread my brand, and the best way to get more channel views is to comment. Also, commenting is fun.” (1 response)
    3. Social Networking: “Yep, but only when I know the person who uploaded the [video]. Also [I have] never really posted anything negative.” (4 responses)
    4. Supportive: “Yes, to show my support and appreciation of the video/person who administered the material. I feel it’s encouraging for them.” (5 responses)
    5. Opinion Transmission: “Yes, because I have an opinion I felt worth sharing.” (6 responses)

    For the negation side of the answers, these four responses embody the bulk of respondents’ positions.

    1. Dissatisfaction with other users’ behavior: “Nope. Most YouTube users are either idiots, morons, or trolls, meaning that engaging in a conversation with them would be a waste of my time.” (18 responses)
    2. Private sphere: “Nope. I’m a private person; the whole world doesn’t need to know what I think about a video.” (5 responses)
    3. Registration as a barrier to participation: “No, just because I don’t have an account and people would attack my opinions.” (8 responses)
    4. Nothing to supplement the discussion: “No, because I have [never] felt the urge to add anything. Plus most comments have nothing to do with the original video or are just meant to anger people.” (12 responses) (July 15-30, 2011)

    These survey responses assisted me in delimiting the perlocutionary goals of users who comment on videos. The perlocutionary force is understood by parsing the psychological consequences of an utterance; i.e. was the speaker intending to persuade, antagonize, or otherwise influence another person’s mind. My theory states that users capitalize on speech standards and modes to manipulate and create meaning in their individual speech practice. By understanding the speaker’s primary motive, I can trace the potential results and original speech mode from which their speech act arose. One way to verify this approach emerges from a textual analysis of meta-conversations embedded in YouTube commentary threads. Users who post comments occasionally address issues of roles and proper conduct by actually discussing their reasons for occupying the site.


    Due to users’ difficulty navigating meaning within YouTube commentary threads, a significant degree of meta-conversation occurs; that is, users converse about how to converse. The topics they discuss usually deal with handling trolls, but also occasionally navigate construction of dialogue through authenticity and comedy. Users, without the formal observation bias that might occur from being made aware of my surveillance, are directly addressing what they are doing and how they go about doing it in a visible and permanent fashion. A recent example from an entertainment video on July 24th provides numerous examples of forceful and corrective comments directed towards someone who is acting in an unusual manner. This user, denoted in the samples as “Antagonist,” was holding a series of overlapping conversations with multiple users that spanned approximately two hours, an eternity in YouTube time. Instead of moving to a different commentary thread, he continued to aggressively pursue other speakers in abusive and profane ways depending on their original post. Here are the responses he received:

    Speaker: “Oh FOR THE LOVE OF ALL THAT IS RIGHT AND WRONG JUST SHUT UP. Do you see yourself commenting on a video that is entertaining and funny? If you have a problem with YouTube, THEN GO [B----] TO THEM.”
    Antagonist: “No, I will interfere where they support. Because Youtubes support team is not transparent and hardly listen.. Why do you think it took so long for them to remove a pedo [pedophile] from the spotlight? It’s a simple thing! Delete button. Actions speak louder than words. They took so long because they have trouble with direction. Not a professional business, which is why they are burning bridges with reputable companies such as: Procter and Gamble [the global consumer company].”

    In this instance, the speaker noticed the other user antagonizing multiple individuals about issues unrelated to the original video’s content, yet specific to YouTube policy. In the speaker’s mind, meaningful discourse about policy or complaints should not be made in the commentary section because it disrupts other users’ enjoyment. Another user commented a few minutes later echoing that sentiment following an especially violent comment by the antagonist:

    Speaker: “My home town should get blown from terrorist attack? OMG?? YOU ARE A MANIAC!!! Now I see why you are having too many arguments. You say scientists/ experts should be entitled with comment rating rights. This is youtube. Anyone can come and enjoy here. If you want to change things/rules/ rights go buy YT [YouTube] and do it. Posting comments and arguing = no change. All your doing is making impression that your a violent  minded person.”

    This speaker joined the conversation when the force of the antagonizer’s arguments crossed a threat of physical violence threshold. Beyond being upset by the language content, they furthermore believe in open access to YouTube, or that anyone should be allowed to voice their opinion without the threat of being drowned out by another user or an institutionalized authority, especially anyone considered “scientists” or “experts.” Based on a large body of observed arguments, this comment addresses an underlying principle that most YouTube posters believe, that original content should be user created. The idea is to offer a forum for dissemination of information that is grassroots in origin as opposed to being economically or politically motivated. Another user went so far as to accuse the antagonizer of cyber-bullying:

    Speaker: “Stop trolling kids [d------] . It’s cyber bullying. You can actually get sued for that, you’re one stupid adult. Normally someone would know that.”
    Antagonist: “It’s called unpopular opinion. No you can’t get sued. Otherwise I could sue you just for your comment there. You are stupid, stop trying to turn the tables. I know it takes all of you to take me on, but none of you have anything credible to say.”

    Cyber-bullying is a serious issue, although it is probably irrelevant in this setting since the antagonizer took every opportunity to air his grievances and was not targeting a single user. Yet the speaker’s comment does highlight the perception of overly aggressive behavior and how those types of actions have tangible consequences, such as prosecution for cyber-bullying. Most of these speakers received up votes for defending the commentary thread, and this argument reveals several key tenets of the speech community. While most individuals would have overlooked the antagonizer’s directed speech had he spoken to another aggressive user, that scenario was not the case here. The user targeted innocent commenters while they attempted to enjoy the entertainment video. By voicing his complaints against YouTube in the wrong forum, threatening the safety of several people, and appearing to engage in cyber-bullying, this user ostracized himself and received the scorn of a large group.

    Another level of meta-conversation exists within YouTube’s community standards page. These standards devised by YouTube offer a backdrop to silhouette the speech communities and their practices. For the purposes of this study, the following components are highly relevant, not because they are of greater significance than other regulations, but because they are systematically overlooked and unenforced depending on the particular video I observed. Discretionary enforcement supports the perspective that all speech, especially in this virtual setting, is context driven. The first systematically overlooked standard emerges from YouTube’s appeal to viewers to create a community by networking and interacting with others. The YouTube team, in an open letter, requests the following:

    “Remember that this is your community! Each and every user of YouTube makes the site what it is, so don’t be afraid to dig in and get involved! … Let folks know what you think. Feedback is part of the experience, and when done with respect, can be a great way to make friends, share stories, and make your time on YouTube richer. So leave comments, rate videos, make your own responses to videos that affect you, enter contests of interest- there’s a lot going on and a lot of ways to participate.” (The YouTube Team, n.d., para. 13)

    Fascinatingly, however, this open request for engagement is systematically unheeded; the vast majority of users browse videos based on popular suggestion and never engage with the community. After statistical hypothesis testing, I concluded that despite minor variation in interaction rate compared to overall viewership, the community-at-large is mostly engaged in lurking, or viewing videos without providing feedback. At an alpha level of .05, YouTube videos that have a viewership of fewer than one thousand individuals host approximately 0.1% user participation. For those videos with a viewership ranging between one thousand and ten thousand, the interaction decreases to a rate of approximately 0.05%. Videos that attract more than ten thousand views have a notoriously poor interaction rate at less than 0.05%. This means that for the most popular videos that receive hundreds of millions of views, the comment thread will only display tens of thousands of comments, a marginal fraction of the total viewership. For YouTube, this means their vision of a community based on interaction and collaboration is still a distant dream, with most of their traffic coming from a silent consumer base. This is not the only way in which their community standards are not reflected in user behavior, however.
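
    The arithmetic behind these participation figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the tier boundaries are read literally from the text (fewer than one thousand, one thousand to ten thousand, more than ten thousand), and the rate for the largest tier is treated as an upper bound because it is reported only as "less than" a value.

```python
def interaction_rate(comments: int, views: int) -> float:
    """Return comments as a percentage of views."""
    if views <= 0:
        raise ValueError("views must be positive")
    return 100.0 * comments / views


def reported_rate_percent(views: int) -> float:
    """Approximate participation rate (in percent) for a viewership tier.

    Assumption: the value for the largest tier is an upper bound,
    since the text gives it only as "less than" this figure.
    """
    if views < 1_000:
        return 0.1
    if views <= 10_000:
        return 0.05
    return 0.05  # upper bound for videos with more than ten thousand views


def expected_comments(views: int, rate_percent: float) -> int:
    """Number of comments implied by a view count and a percentage rate."""
    return round(views * rate_percent / 100.0)


if __name__ == "__main__":
    # A video with one hundred million views, at the upper-bound rate,
    # implies at most about fifty thousand comments.
    views = 100_000_000
    print(expected_comments(views, reported_rate_percent(views)))
```

    Under these assumptions, a video with one hundred million views yields an upper bound of fifty thousand comments, which matches the "tens of thousands" order of magnitude described in the paragraph above.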

    YouTube moderators impart to users the underlying principle to not cross the line and use common-sense while interacting with other community members. They openly state that there is no feasible way to police the millions of users who traverse YouTube daily, so they call upon others to self-regulate depending on the context:

    We encourage free speech and defend everyone’s right to express unpopular points of view. But we don’t permit hate speech (speech which attacks or demeans a group based on race or ethnic origin, religion, disability, gender, age, veteran status, and sexual orientation/ gender identity). (The YouTube Team, n.d., para. 8)

    Other decency standards elaborate on the prohibition on shocking or disgusting material, nudity, dangerous and illegal acts, references to children, harassment, impersonation, and threats. Yet despite the YouTube team’s clarity, some YouTube comments violate the community standards methodically and rambunctiously. Sometimes other users wield the spam or down vote feature to remove negative content from the page, but frequently violations are allowed to endure. Regulation depends heavily on the context and mode of speech constituted within that particular virtual environment.

    Functional Roles

    The five functional roles discussed below are extracted directly from my participant observation and the Facebook survey. The most obvious way to identify these separate roles is through the reaction of other users to statements. If the recipient’s reaction was not the one desired by the speaker, they modify their speech to improve reaction conditions. If the speaker continues to act in a certain way with repetition of certain response types, those reactions are the ones desired and sought after. Furthermore, users construct their identity through the choice of their usernames, which assist in signaling identity and speaker intent. Usernames such as “LOVESTOTROLL” or “noob4ever” announce to other users the types of speech to expect from those individuals. “LOVESTOTROLL” will probably act in an aggressive manner, while “noob4ever” will most likely refuse instruction on correct interaction within the community. Discussion of internet aggression is one of the more salient topics emerging from virtual communication today.

    Trolling. The intentional incitement of a negative response from another in a virtual setting, or trolling, is a multifaceted phenomenon without a universally identifiable cause. Trolls can be perceptive and skilled conversationalists and are often the most experienced technology users. In order to elicit emotive responses, these users successfully profile their victims by knowing which content is most sensitive and which approach to the subject is most irreverent. For this reason, those who troll on YouTube are useful from a research perspective, because they undermine the fluidity and cohesion of the speech community. The instigator’s comments clash with accepted standards, revealing the larger foundation for speech norms and a community’s standards of practice. This behavior, where one user disrupts a conversation by replying to another user with something intentionally obscene, is very common on YouTube. Here is an example from a commentary thread posted on an entertainment video, June 23, 2011. The video showed two children under the age of five excitedly showing their father a magic trick. The camera captures how the youth execute their act, but the father pretends he does not see them. One user in particular did not understand or intentionally ignored this irony:

    Speaker A: “you can see him run off in the background if you look closely you [f------] idiots.  [s---] trick, thumbs down.”
    Speaker B: @Speaker A “I believe you are the ‘[f------] idiot’ here….”
    Speaker C: @Speaker A “no [s---] sherlock”
    Speaker D: @Speaker A “You’re either the worst troll ever or the dumbest human being ever. either way God threw up when he created you.”
    Speaker E: @Speaker A “yes and of all the people calling you a [d------] for not getting sarcasm, the only comeback you can think of is calling us names.”

    Speaker A was intentionally aggressive with his harsh language and a directed insult while he critiqued other users on the thread who were pretending that they too did not see how the “magic” was performed. Although other speakers on the thread had also missed the irony being employed by the majority of speakers, they escaped the group’s corrective attention. Speaker A’s forceful words created tension, and four other users felt compelled to respond. Speaker D even pointed out that Speaker A was possibly trolling by posting such a negatively toned comment on what was an otherwise positive and relevant thread filled with humor. Aggressive behavior similar to this occurrence happens regularly, which has led concerned users and regulation groups to sponsor some critical investigation.

    Current positions within academia and from policy makers label trolling as deviance, anti-social behavior, cyber-bullying, or even as an omen of the disturbing future of American society (Evans, 2001; Jewkes, 2003; McQuade et al., 2009; Williams, 2006). There are several weaknesses to these claims that reveal fundamental misconceptions about aggressive behavior online (Lange, 2007). The first shortcoming to the alarmist perspective is the assertion that trolling or aggressive speech online is a modern phenomenon. Although the caustic trend in communication is often portrayed as a computer mediated problem alone, it is arguably the continuation of older social positions. One example is colloquially known as playing the Devil’s advocate or formally as the Advocatus Diaboli, a social role institutionalized and promoted by the Catholic Church for the canonization process beginning in 1587 (Herbermann, 1907). The advocate’s legal duties included questioning and arguing against the legitimacy of a candidate, even if the beliefs and arguments he espoused were not his own. That strategy, the use of aggressive argumentation and controversial positions to place someone in an uncomfortable position, is acceptable in countless settings today, including journalism, public debates, or as a conversational style. Trolls, or those who participate in flaming online, are performing a similar speech practice by placing their fellow users in extremely uncomfortable positions. They are usually motivated by entertainment seeking and are not explicitly testing another user’s legitimacy, but their speech acts and particular speech ways produce the same results as the Devil’s advocate. That is not to say, however, that trolls are motivated solely by their boredom. While that is consistently at least one of their reasons for pursuing the behavior, there are many potential layers of competing motivations.

    Contemporary portrayals of trolling in public media rely on sensationalized rhetoric to claim that limited control of communication online is creating a desensitized and malevolent society. While it is true that exposure to linguistic adversity does have a desensitizing effect in a communicative sense, there is no substantiated research demonstrating that this freedom of expression is destroying the public social sphere. This controversy encompasses other areas of expression, such as video games and popular music. The opposing perspective contends that free access to information and a lack of censorship mean individuals who were historically oppressed are allowed and encouraged to seize control of their own voices and offer new forms of resistance to institutional power. Blogging and alternative news sources perform in today’s world the same function that great historical leaders and social movements performed in the past: checking elite power bases. Trolling is the small and daily manner of resistance that online users have to question authority and express their frustrations. The troll’s satisfaction arises from the concept of LULmining (mining for laughs), which is achieved through harassment and system disruptions until the victim reaches a level of “building tensions beyond stress threshold[s],” at which point the LULZ (laughs) occur (Oh internet, 2011). Laughter is caused by the victim’s anger, frustration, or an emotional response, so users frequently post the phrase, “don’t feed the trolls,” on internet forums. However, while emotive thrill seeking is often the primary catalyst for aggressive behavior, there are underlying links to power and personal expression for some who practice within this speech mode. Using two email interviews and an array of survey responses, I will address the layers of association internet users and YouTube members have with trolling behavior.

    My two email interviews were with self-identified trolls and, in one case, the user received a citation for trolling on a political commentary thread. The first interview with Ben (a randomly selected alias) began with a discussion of identity construction to manipulate other users. When he was a young, single male studying in the United States, Ben would occasionally modify his speech patterns in computer mediated conversation to appear as a young female. On other occasions, he spoke as himself and consciously harassed other users. He remarked to me that he targeted “obnoxious” users who were vague about their identity. For him, lying about one’s identity is acceptable, but indecision is not. I was most intrigued to discover his motives for bothering other users in the virtual setting. He emphasized repeatedly that his trolling was done out of boredom. When I asked him for more detail, he explained:

    “It was just fun watching the person be uncomfortable- to see how long they’d actually sit in an environment that they had control over to leave, and actually sit there and take the stuff. And she [the target] stayed there listening to my crap for fifteen minutes. All she had to do was hit escape and say “I’m out of here.”  But she stayed there putting up with it. She kept replying, “no, no, stop, [f---] off,” and “leave me alone” instead of just leaving and ending it right there. Personally I think she enjoyed it [the harassment].”

    This young male found that eliciting an emotional response from those communicating in a virtual chat room was entertaining. He detached himself from responsibility for his comments by distinguishing that other users could forsake the exchange at any point. After a few months of this aggressive behavior, he eventually became bored and transitioned to other forms of entertainment.

    The next interview participant had a more colorful experience with his trolling behavior. Also a young male living in the United States, Derrick and his brother would often get online to pass the time and socialize when bored; “It was fun getting a rise out of people. Just for the cheap thrill” (interview, July 18, 2011). For several years, this pattern of entertainment through distressing other users continued. Then a major event occurred which challenged his perception of trolling behavior. One day the interviewee accessed a forum for police officers and began calling the other users “pigs” and emphasizing his dislike for the police force. The officers responded by checking for illegal material on his profile, where they did find sexually explicit content. Within minutes his internet provider disabled his AOL account, and Derrick contacted the local office for his service to be reinstated, a challenging and embarrassing process:

    “I had to go through the embarrassment of calling the operator at AOL and explaining why my profile stated those explicit things. The verbal warning’s pretty embarrassing. There was a little bit of reality that the things you say on the internet can catch up with you. It was embarrassing because the anonymity was gone, even though there was still a degree of anonymity with the phone conversation. But the fact that I actually had to speak to someone and kind of explain my comments. It was embarrassing.”

    This anecdote illustrates the divide between face-to-face communication and computer mediated speech. Due to the remoteness of mediated expression, Derrick never came to terms with the consequences of his actions until he was forced to call the internet operator. Psychologist John Suler refers to this as the disinhibition effect, a common psychological occurrence for individuals corresponding online (2004). Because virtual users never have to mitigate their actions by witnessing the tangible effects of their speech acts, they become desensitized to emotional distress and can subsequently exercise callous and combative behavior against others.

    I next asked Derrick why he targets certain people in this virtual environment. According to him, there are two major triggers. The first is someone “taking themselves seriously” while communicating with others. Because they are invested in their speech acts, these individuals signal to trolls that they are likely to give an emotional response. The second set contains individuals who argue from positions of authority and who hold power in governing institutions. Under ordinary circumstances, it is difficult to face and challenge powerful individuals, but online those barriers are removed. He mentions the police as an example of this limitation on expression.

    “[Online] you can say things back to these people that would otherwise have power over you in the real world. Police officers always speak from authority in their arguments, and they just hide behind their authority and don’t discuss the issues. Their big argument is that we can’t criticize police officers because we aren’t police officers.”

    These two triggers for action are remarkable. Derrick’s engagement with entertainment and the struggle for power illustrate that trolling is performed by individuals with layered motivation structures. The trolling act is comparable cognitively to other forms of virtual entertainment where the interaction involves transmitted information. The troll’s intent is not to inflict actual suffering on other users, but rather, to gain amusement by placing short term discomfort on others before they “rejuvenate.” This process is analogous to a multi-player video game where speakers know the consequences for their actions by consenting to play the game, or in this case, by joining YouTube and freely conversing with millions of potential viewers. While trolling, every statement is laden with strategy. The winner is the person who leaves the conversation the least upset from the exchange, while the loser is infuriated or silenced after an intense exchange. The other triggering event of someone arguing superiority and power over others, especially from a position of authority, invokes feelings of defenselessness. My informant Derrick gained personal agency by trolling as a power equalizer and by speaking out against representatives of state and institutional power.

    Although it is impossible to interview every YouTube user engaged in trolling, this behavior follows a singular functional pattern that substantiates my speech mode theory and analysis of how it is perceived. In particular settings, YouTube users will tolerate aggressive speech between two users as a spectacle to be enjoyed. Both parties involved will use obscene language, various content references, and insults in their struggle to see who will stop speaking first or become too emotionally invested in the contest. In other video contexts with small viewership communities, trolling is not allowed and is down voted out of the speech community; such quick rejection does not give trolls fodder to work with, so they usually seek out others to torment. These settings are also the two least likely to be viewed by a mainstream audience, which means that most assumptions about online communication do not emerge from places where trolling is controlled. Alternatively, videos based on content distributed to the public by other means, such as radio, news sites, or television, will attract a wider viewership. These videos are also the arena where aggressive behavior most often comes to fruition for trolls. The videos support commentary threads with permeable boundaries and a large viewer base, such as music videos. Because music videos attract a large and diverse audience, trolls can select individuals from the comments that signal their susceptibility to internet aggression. Since the community is large and less inclusive, the conversation between the troll and victim will often go unnoticed by the majority and therefore continues without corrective measures.

    There are users who, based on their speech, trigger the troll’s attention in two distinct ways. The first unwitting and unwilling victim will usually flag their presence by conveying a lack of experience with online communication. Newer users are not as cynical or prepared by past experience to navigate a conversation with trolls. The other types of “victim” are those users who intentionally seek out trolls to engage them in a conversational contest. They will do this by increasing the force of their initial utterance or by directly challenging trolls to attempt to harass them. Overall, trolling speech practices are demonstrative of the aggressive speech mode based within YouTube. Trolling fulfills a functional need for the YouTube user, yet is a socially bound protocol for interacting that manipulates meaning based on the setting. The standards for this practice have existed through time, but continue to change based on social evolution due to and independent of technological innovation. New YouTube viewers and commenters are highly susceptible to becoming victims of trolling until they are socialized by the community into understanding and avoiding it. The community has effective ways of promoting this speech practice simply by tolerating it, or the collective may choose to reject and correct for it. Those collective decisions then create the speech mode which governs interaction for distinct YouTube societies.

    Economic Interests. YouTube offers opportunities for popular channels to earn money through a partnership with the company, but that is not the sole way for users to create careers from the videos they post. It is a common goal for YouTube users to gain notoriety by posting videos, and many young musical artists have been discovered by talent scouts due to their popular content on the site. While MySpace was once considered the premier gateway for new artists, YouTube has seized a large chunk of the market with their ability to highlight popular musical appeal to a mainstream audience, something that record labels pursue as a guarantee of an artist’s investment worth. Justin Bieber is one individual who epitomizes this process. Beginning in 2007, the seventeen-year-old Canadian posted home-made videos on YouTube. His first few videos went viral with more than ten million views in the first month. That popularity attracted the attention of an agent; he flew Justin out to meet Usher, who shortly signed him to his record label. The rest was history for the now platinum selling, globally famous artist.

    Due to the mythical proportions of Justin’s success, other YouTube posters attempt to build their own popular following by commenting on videos advertising their work. YouTube has a large community which normatively allows anyone to express themselves, and economic advancement is not on its face a malicious act, yet self-aggrandizement is one of the few things that is universally disliked on YouTube and is immediately labeled as spam. Because of this taboo on the behavior, users attempt to cloak their true intention by referring to the content of the video they are commenting on, while slyly placing a reference to their own work within their comment. This example of covert self-advancement was posted on a popular cover of an Adele song:

    “My band just finished recording a full band cover and video of ‘Set Fire To The Rain’ by Adele. She sings it incredibly and we wanted to pay homage by recording our own take on the song. If you were to take a few minutes out of your day to listen to it we would be a very happy bunch of lads! To listen to our version click on our name underneath this post and let us know what you think. Thanks! :)”(August 7, 2011)

    This post was permitted despite the band’s obvious intent to promote itself due to its catered nature and individualized essence. The appearance of legitimacy and acting in a genuine manner is also a precursor to economic and social success on YouTube. The genuineness of one’s aspect, or authenticity, is a huge component of online communication, especially in an environment where there are no checks on fabricated realities or personalities. The above excerpt adds components, such as “lads” and a smile emoticon to make the comment seem less like an advertisement and instead appear to be a personal narrative with authentic feelings. Users must accept that the material they are consuming is based in truth, or they will be deceived and misled into following false premises. Because users act from this position of assumed honesty, falsehoods are treated as a major transgression. One result of this common ethical demand is the now common practice of notifying viewers if the content they are enjoying is fictional (if it is not obvious). One example from 2006 demonstrates this outrage only one year after YouTube began hosting videos. Two young Americans appeared online as video-bloggers, (referred to as vbloggers in the YouTube vernacular). The boy from Ohio was known as EmoKid21Ohio, and the girl went by EmoGirl21. Over the span of thirty five videos, they appeared to fall in love, and the romance attracted a large following of viewers who were entranced by the potential for human connection in a mediated virtual environment. During their last two videos, however, the couple announced that the entire affair was staged and fictitious. EmoKid21Ohio and EmoGirl21 were, in fact, two Australian actors practicing their craft. Their duplicity caused an extensive backlash among community members frustrated by the deception. Based on video responses and comments, the YouTube users were not upset that these individuals posed as emotive artists, a relatively common occurrence. 
What bothered them was the false relationship. In many cases it is acceptable to reinvent oneself; all users reconstruct their identity to a greater or lesser degree, but there are community standards for authenticity when portraying social interaction within the site. The following is a contemporary example of the anger that one user, Ray WiIIanm, caused when he claimed to be Ray William Johnson, a YouTube vblogger most famous for his show, =3, while commenting on an entertainment video. Ray WiIIanm’s first post received twenty-eight up votes until other viewers realized he was not the comedian. It began with a slow progression of users commenting that he might be fake. They then started compiling facts against him. While the real performer had over two hundred videos, this person only had five.

    Speaker A: @RayWiIIanm “Fake Ray is fake.”
    Ray WiIIanm: @Speaker A “What? No.”
    Speaker A: @RayWiIIanm “What? Yes. Moron, you’re not RayWILLAMJohnson”
    Ray WiIIanm: @Speaker A “Of course I am not RayWilliamJohnson. Did I say that?”
    Speaker B: @RayWiIIanm “failwannabename”
    Speaker C: @RayWiIIanm “FAKE AND GAY!”
    Ray WiIIanm: @Speaker C “I am the real RayWiIIanm”
    Speaker D: @RayWiIIanm “You are the real RayWIIanm…but not the RayWillanmJohnson ;)”
    Ray WiIIanm: @Speaker D “Who’s that again?”
    Speaker E: @RayWiIIanm “i see what you did there xD”    {Up voted 34 times}
    Speaker F: “oh wait ur a fake…. how depressing”
    Speaker G: @RayWiIIanm “Why you trying to be a faker?”
    Speaker H: @RayWiIIanm “Wow, doesn’t the william guy have like a million subscribers? And you’re a fan boy who has below 5. It’s cool that you’re a fan, but you could be honest about it though.” (July 24, 2011)

    The comment thread continues as people dissect the fake Ray’s claim to be ignorant of Ray William Johnson. It eventually becomes obvious that he was intentionally acting obtuse, and that he was aware of the attention he would receive for having a celebrity’s name. He noticeably received no support for his actions and instead suffered a continual barrage from over twenty other users. Those who made witty responses to his account were up voted into the top section of the web-page, as noted by Speaker E who received 34 votes of approval. Those YouTube users who wish to gain popular notice and advance their economic interests engage in commenting as a means to achieve those goals, but they are limited by communication standards. Their speech is only tolerated or successful if their comment is not obviously mass produced or self-advancing. Furthermore, their identity and/or activities must be genuine in nature to be accepted and respected within these speech communities.

    Social-Networking. Facebook, Twitter, and other social networking sites are the most obvious ways for physical communities to interact in the virtual sphere; however, YouTube successfully hosts some communities that are large enough to escape the purview of these networking sites. Groups with an institutional association or a common interest are two examples, and they usually support videos that have a relatively small viewership and thus have exclusive speech modes. Their comments come from individuals who are strangers, but who are affiliated with each other in external capacities; they use established codes for speaking. Within those codes, there is a decreased likelihood of aggressive or abnormal speech acts, since the disinhibition effect is less transformative in these narrowed settings. An example of this arose from a women’s liberal arts college in New England where students made a video supporting the LGBTQ (Lesbian, Gay, Bisexual, Transgender, and Questioning) community. One alumna confronts other users about the video and the ideology behind its creation. Although the speakers clearly disagree, the conversation remains polite and well-reasoned, especially compared to disagreements occurring in the broader YouTube community. The video has fourteen thousand views, but it contains only sixty-five comments, with the majority being supportive and directed towards women featured in the video whom the users personally know. The confronting alumna, denoted as the “Rival,” had comments marked as spam or too negative for the thread by other users. She begins her controversial position by posting an open letter directed to the entire viewership.

    Rival: “Dear Student(s), Please take your drivel off of the internet. You are making the institution look like a queer-driven sorority. Some graduates might choose to enter fields where the “bubble” and shielding people from reality don’t apply (read: anything except art).”
    Speaker A: @Rival “And some people still live in the bubble and want to enjoy it while they can. And some of us are glad to support them even if we don’t need that particular bubble ourselves!! Back off.”
    Speaker B: @Rival “What about this marks the institution as “queer-driven”? Are you familiar with the purpose of the “It Gets Better” project, at all? On the contrary, I think this reflects positively on our institution. If you’re worried about the students in this video damaging your idea of what our institution should look like, you need to rearrange your priorities. And even if you’re an alumna, parent, faculty or staff, your statement is no more valid than ours.”
    Speaker C: @Rival “Many schools and other institutions have chosen to voice their support of the LGBTQ community in this difficult time. This support does not mean that the LGBTQ women in this video are representative of the entire college. In fact, [this institution's] percentage of LGBTQ students is approximately that of the national average. In addition, sharing supportive experiences with similar students who might be experiencing these difficult issues should NOT be considered “shielding” them.”

    Considering that sexual orientation and freedom of choice are contentious subjects, this argument remained well reasoned and polite. The speakers presented differing points of view with some force behind them, but without profanity or any of the hallmarks of petulant speech seen elsewhere on YouTube commentary threads. The tangibility of the situation, or knowledge of the speakers outside of the YouTube community, in this case the alumnae network, creates a situation where statements are bound by their institutional standards for disagreements. This is further enforced by the idea that alumnae and students are representative of their home institutions, so while it is appropriate to disagree, it would have been inappropriate to engage in threats or salacious commenting.

    Social-networking is not limited to speech communities found in the physical world; this speech practice can also be based on internet cultures with external standards independent of YouTube. The most extreme example of this, due to their unique syntax and divergent vernacular, is the lolcatz community, which originates outside of YouTube. The community is based on a popular meme (viral cultural material) of a cat with script written over the top. Over the years, any cute photo of an animal came to be used, and a language developed for describing the animal’s feelings based on the photo. Here are a few examples of lolcatz language taken from a 2008 LolzCat YouTube compilation video: “furst i playz pianoe.. now i maek yu tube comment,” which translates to, first I played piano, now I made a YouTube comment. Another comment said, “kitthe needz rechrgd. plug in pls kthx. THATS DA AWESOME!” to convey that the cat needed recharg[ing], please plug them in, thank you, and the video is awesome. One user who was unfamiliar with the community attacked their unusual speech practice:

    Speaker A: “What´s with the [f------] spelling and grammar? I´m not english or american but…”I is gots your tail”??? Makes me think of Jamaica…. don´t ask me why haha.”
    Speaker B: @Speaker A “Its because, well it makes the cats look 5 times as cute, how bout that?”
    Speaker A: @Speaker B “Yeah, I´ll let it go for the sake of all the cuteness :D”

    Speaker A was receptive to understanding why the other users were behaving oddly, so when Speaker B explained it was done to enhance the visual impression of the cats, Speaker A agreed and tolerated the seemingly odd behavior from a social group based outside of YouTube. Other internet cultures that occasionally host discussions on YouTube commentary threads can be based on video games, other memes, or even artificial intelligence. These communities often situate their unique speech practices within a social-networking speech mode to create a unique user experience that also educates outsiders on the particulars of their speech community.

    Supportive. Supportive communication, of all the potential speech practices or functional roles to be discovered on YouTube, is the most closely aligned with the posted community guidelines that encourage feedback. The supportive speech provides validation to another user whenever someone comments responding to an original post. Although not necessarily complimentary, the basic act of replying reinforces the other user’s communicative confidence. The following exchange is a representative example of a story that provided a common link between strangers, and therefore receives a large number of supportive responses, in addition to being up voted three hundred and fourteen times:

    Speaker A: “Thumbs up if this is a song that you used to hear on the radio and loved it but never knew the name or artist and u just searched it based off of a few main lyrics in the chorous.”
    Speaker B: @Speaker A “That’s oddly EXACTLY what I did…”
    Speaker A: @Speaker B “I did the same thing and I just wondered how many other people did this too. thanks for the like sir.”
    Speaker C: @Speaker A “gotta love google. But I knew the lyrics and the name of the song. I miss hearing it on the radio while being driven to school in the morning :(”
    Speaker A: @Speaker C “Yea, Google works too. lol”

    Not only does this exchange highlight how strangers can meaningfully support a speaker, but the commentary also demonstrates a common pattern of nostalgia. YouTube users often bond over narratives describing archaic website features and older interaction styles. In this way, they support each other while simultaneously making a claim to legitimacy within the site. Those who have used the site the longest and witnessed the most change are perceived as more adept and skilled with communicating online. One video poster wrote on her July 31, 2011 travel video that, “if you were here, leave your mark, or comment. I miss the interactive YouTube of yore.” When the site initially began hosting in 2005, the smaller community meant that the interactive features were used by a larger proportion of the overall audience. Once YouTube gained a large mainstream audience and became the premier choice for video hosting, the interactive community transitioned into a plurality of various consumer groups. The core coalition still exists in the form of video blogs and other non-economic communities intent on socializing, but the bulk of YouTube has significantly changed, as this user notes in her bittersweet video description. Nostalgic behavior is also apparent in the music video commentary section where older audience members reference music from a better era with comments such as “I miss the 90′s;” and “My childhood, god i miss it.” These nostalgic comments can occasionally receive scrutiny from other users, as with this user, who felt that other users were posing as emotionally attached to older content that they were not actually old enough to remember.

    Speaker A: “WAIT. Why are people posting ‘I remember this song from when I was 3. <3′?? You don’t appreciate anything when you’re 3, let alone remember it. :p I grew up with this. I was freaking 8 when this came out and remember going to the skating rink every Saturday hoping they’d play my request for these songs. So unless you really have a ‘memory’ of actually loving this, don’t bother posting. You sound stupid. :p That is all. :] Thank you.”
    Speaker B: @Speaker A “pshh whatever, its all about the ten year olds!”

    These two speakers are jointly skeptical of nostalgic claims because they recognize the potential manipulation of life experience by others to gain YouTube legitimacy. Even though it has no direct effect on their lives, they feel personally cheated when others fabricate life experience to gain social desirability. It diminishes their own ability to enjoy and express nostalgia for that musician if everyone is behaving in a similar way. Their statements lose communicative force.

    Occasionally a user will commit a YouTube breach of social etiquette, yet despite their indiscretion, something within their post resonates with other users. In those cases, the respondents will gently point out the problem, but still support the overarching concept. The following exemplifies how a personal story compelled another person to reply despite the original poster’s request for a positive vote.

    Speaker A: “I’m the guy that girls look right past because for some reason, I’m not enough for them, even though I’m everything they could EVER want and anything they could EVER need. There are too many douche bag guys out there and I treat every girl with respect. Thumbs up if you feel me.”
    Speaker B: @Speaker A “I thumbs up bc we’re the same guy :/ it effin sux. sorry I don’t type with correct grammatical syntax online like you do btw. and I’m also sorry that you go through that, bc I wouldn’t wish this on even my worst enemies….”
    Speaker C: @Speaker A “I hate when people ask for thumbs up, but your rant describes my situation to the mark…so thumbs up chief.”                    (July 25, 2011)

    Beyond Speaker C’s forgiveness for a speech transgression, this exchange also illustrates linguistic insecurity. Speaker B recognized that their speech style is not compatible with Speaker A’s when they apologized for not typing “with correct grammatical syntax online;” in spite of this, they still replied to Speaker A and created a socially supportive bond.

    For videos that do not attract a large group of viewers, the same type of supportive speech occurs. Interaction tends to be extremely polite, since the users appear without a large crowd to obscure their actions. The social-robustness that might lead to joking relationships is also lacking, so consideration is made to not offend other users. This is a direct result of the small viewership; those who seek out other users to troll will be uninterested in videos with small audiences, and the small proportion of comments mitigates any disassociation from responsibility. This example taken from the entertainment category presents strangers commenting on a video with only 2,740 views:

    Speaker A: “I love this song too. I love playing with water too.”
    Speaker B: “hi :) can anyone tell me the name of the song, it’s so great”
    Speaker C: “A++++”

    These three, as the only users commenting on this video, were exclusively supportive and polite. Speaker B also noticeably formatted their question as a greeting, something that is patently unusual for videos with a large audience and a posting rate of one comment every few seconds. If they took the time to add the greeting conversational turn in a rapidly expanding commentary thread, their words would be overlooked. In this setting, the greeting is allowed and appreciated by the others who up voted the comment, completing an adjacency pair. Supportive speech practices may be found in numerous settings, but users engaged in these settings signal their intent to each other. It is the behavior least corrected for on YouTube due to its universal appeal and congenial nature.

    Opinion Transmission. The desire to express oneself is tempting for vocal internet users wherever a large audience congregates and an open forum is granted; YouTube commentary threads offer a perfect setting for those individuals intent on voicing their opinions, a practice which takes two distinct forms. The first case involves correcting immediate and unreasoned statements. Lay individuals commonly approach internet communication with one glaring question in mind: what is the point of speaking to a large group of complete strangers? From the outsider’s perspective, internet speech can resemble the bickering of young children. When none of the participants agree to disagree and fights devolve rapidly into vulgar altercations, it is understandable that internet commentary has a deplorable reputation. However, that is not the point; YouTube speech is not about winning a debate for the sake of universal Truth. In such an unstructured format, no one wins a disagreement (an observation which may also be directed towards casual face-to-face debates). Most YouTube users understand the limitations of persuading someone to change their convictions. Yet, there are several reasons why individual users bother to reply to erroneous statements online. It is partly to ensure the public record does not stand incorrectly. The comment feed is not merely a casual conversation addressed to a single person; it is published content that will remain on this video for an extended period of time. Subjectively incorrect statements become more important when comment impressions are taken into account. Based on responses given during the survey, very few individuals actively read entire comment threads. This means it is unlikely a large audience will consider any conversations in the future.
The exposure immediately following posting a comment, that salient moment where uninterested users glance over the first responsive lines, is fundamental to correct falsehoods, or else those comments will be the only impression shared from the video. As one user bluntly put it mere moments after a user posted something illogical and misplaced on the thread, “I signed in just to tell you you’re stupid.” When a user posts, that act may facilitate momentary notoriety. There are several facts that support this view. If the motivating principle was simply to troll or debate single users, individuals would seek out and find a well matched opponent to argue. Instead, they comment on recent posts and try to minimize the elapsed time between comment and reply.

    This practice of replying early and often to fresh comments does not hold in every case. Certain subjects lend themselves to more complex debates where users intend to correct behavior, language use, and/or disagree with the content underlying serious topics. Counter to the commonplace assumption about internet debate, some users do admit when they are wrong. These reasoned conversations are usually context specific. On July 12, YouTube users from around the world convened on a video discussing Malaysia’s 709 rally where police broke up the protestors with tear gas and other crowd control devices. Although there were some conversations that revolved around insults and offensive language, a significant number of the users engaged in well-reasoned exchanges (July 12, 2011), such as the following:

    Speaker A: “I agree with the abuse of power. But when will you realise the police weren’t there to hurt the citizens? It’s illegal demonstration. Accept the reality. If you were in Egypt, or Libya or other third world countries (and we are one of them), you would be shot-at point blank. I am a peace loving person and it hurts to see so many people are willing to get themselves killed in order to achieve one mans agenda. It’s just wrong.”
    Speaker B: @Speaker A “u are wrong dude… we are fighting for clean and fair election for Malaysian.. not on anyone’s agenda..”
    Speaker A: @Speaker B “ Yeah. If you say so. But I don’t believe the police were all bad. Most of them are good people. Maybe a few went berserk and hurt some fellow citizens. But of course, as we all should know, they are human they make mistakes just like the rest of us.”
    Speaker C: @Speaker A “ yeah, you are right. But when i saw this scene, my heart was twitching :(“

    These strangers connect and disagree while sharing their ideas in a computer mediated setting, and, quite remarkably, even admit when they are incorrect. Opinion transmission takes these dual forms of expressive power. The first is to correct high profile remarks that are considered incorrect, but the other is to engage in thought provoking conversation that is neither supportive speech nor trolling. Another way to share opinions on subjects while gaining momentary notoriety is through humor.

    Joking. A sense of humor is common enough among users to be considered requisite for conversing on certain YouTube video content categories. Jokes are especially important if the user wants to receive the popular support of other commenters. The more prolific jokes contain material borrowed from the videos being watched; the user posts the funniest lines within quotation marks, and other users will vote for that user if they agree it was the best segment. At other times, users will use extracted material from the broader YouTube culture, other websites, or even popular news items. Both of these joking postures demonstrate what Gary Fine and Michaela Soucey describe as joking culture, but with a unique variation (2005).

    Fine and Soucey discuss the embedding of joking culture as “known humorous themes that are returned to repeatedly throughout group interaction” (2005, p.1). They claim that a humorous remark may be exchanged with a stranger, but for a joking culture to be established, an on-going interpersonal relationship is necessary. Although the foundations of their theory are correct, this focus on interpersonal relationships is not necessary for anonymous and public online communication. Through the creation of internet cultures and speech communities, the users are not necessarily familiar with the person they are corresponding with, but they are intimately aware of the role users perform based on speech signals and site context. These clues direct users to understand speech modes as established community networks instead of individual relationships, enabling them to successfully joke online. Fine argues that relationships give the joker the right to joke and the ability to get away with such antics. Online, everyone is empowered to create witticisms with no concern for who the audience might actually be; furthermore, the separation of speech from the speaker means that the speaker can almost always get away with it. This dualism means that humor is no longer a signal for the social robustness of a relationship, but instead functions as a general practice bound within internet cultures. That also means that the material used must resonate with the virtual audience. Since the users come from contrasting backgrounds, they construct jocular elements from shared internet experience, or what Fine and Soucey refer to as idioculture. In the YouTube setting, joking material is accessed in several ways. It is often from the video being watched, since those who are attracted to that video know that their fellow audience members are also likely to find the same components humorous. 
The more intriguing jokes come from content that is not video specific and is instead relevant to the entire site. Although not contextually bound, these salient cultural items are time specific. The material is designed to be funny for the largest possible group, which means that even if the jokes are considered aggressive by YouTube’s community standards, they are accepted and admirable in that particular setting. With some frequency, counter movements will develop in the defense of whomever or whatever is on the receiving end of the humor.

    The strongest examples for this phenomenon on YouTube during the summer of 2011 are Justin Bieber and current event references, such as the downgrade of the United States’ credit rating. As mentioned during my discussion of economic engagement, Justin Bieber rose to fame as a result of his YouTube popularity. His use as a contested humor item is apparent after noting the voting record for the majority of his videos. Those cultural items that are unpopular, but that are not transformed into joking material will have a small sub-group that view and support the original poster. Alternatively, for those videos that attract an audience that is intent on disparaging the subject, the down votes will override the up votes with an approximately two-thirds majority. (In certain cases where the voting record is more strongly weighted towards the down votes, there are usually geopolitical or humanistic concerns involved, e.g. a video showing Hitler would likely receive all down votes, yet he is not considered joking material.) Users often discuss these subjects in great detail. They argue about why someone is hated on, what the correct forum for such statements may be, and other nuances of using these jokes. A few examples appeared on a Justin Bieber video posted February 19, 2011:

    Speaker A: “PLEASE READ THIS COMMENT! the reason this[pejorative] is getting so many views is because millions of people are just looking up this video to read the comments or write a comment making fun of him. so if people just don’t look up his videos and just ignore him then eventually all the other good music videos will out view this one and he will be forgotten.”
    Speaker B: @Speaker A “Please respect the LGBT community and try to refrain from using bigoted/prejudice terms like faggot to describe something you don’t like. You’re not making anything easier for LGBT youth experiencing discrimination. Thank you.”
    Speaker C: @Speaker A “why in the actual [f---] would you even say that. Making comments such as the one you just made is basically asking for [b----] fights. The kids famous, get the [f---] over it. Honestly it’s [f------] retarded, you don’t look cool or tough using the [f------] internet to bash on people you’ve never met, and obviously never will meet. It’s youtube, idiot. I’m sure any respectable artist doesn’t give two [s----]  if they have top views. Grow the [f---] up, and get real. Thanks.” (August 10, 2011)

    These individuals are very opinionated on the correct use of humorous material within the site. Speaker B, however, was more concerned with the use of sexually discriminatory remarks. Despite the best effort of many users, comments that refer to “fags” or “gays” are still common on many threads and are part of the heteronormative culture. That culture is also apparent in the implicit assumption that all users are young white males until the user signals otherwise in their speech style, or they expressly construct their identity as separate from the standard group. The categorization of conversational partners is both helpful and harmful to opinion transmission if the user misconstrues others in the commentary thread, since their speech is then less effective. Ultimately, opinion transmission as a speech practice is common throughout YouTube’s content categories, but it changes form and expressive power depending on the context and intent of the user. Other users also correct for behavior they deem to be inappropriate, as seen in the above LGBT comment.

    Concluding Discussion

    YouTube represents a large potential speech community from which differing speech practices and internet cultures are constituted. As an innovative website at the forefront of user experience, YouTube will continue to attract diverse traffic, and consequently, is a highly fluid and permeable environment. The phenomena I discussed above, trolling, economic advancement, social-networking, supportive speech, and opinion transmission, will continue to be relevant in the future because they are based on the user’s communicative goals. However, it is quite possible that their forms and rules for engagement will change. In the findings section I discuss and describe how these roles operate in pragmatic settings with individual users, but the more abstract discussion involves how those individual decisions can coalesce into a speech mode shared by many.

    My theory relies on the conglomerate force of individual actors exerting social pressure on others through the formation of speech modes. Especially in America where citizens are told they have the personal autonomy and authority granted to them by a democratic system, speakers believe that their conversational choices are a conscious and deliberate thought process that is then translated into individualized action. This constructed belief in unadulterated individual agency is a myth. Émile Durkheim began the process of demonstrating social facts and a collective consciousness in his 1897 monograph Le Suicide. Durkheim believed individual decisions that compose the fabric of society are not random, but rather that anyone with method can discover a discernible pattern for behavior. The pattern is governed by a social fact, and I contend that speech modes are related to Durkheim’s theory, but they require a more flexible conceptualization of how they function.

    Raymond Williams further debunks this mythical individuality and provides structural flexibility in his 1978 book Marxism and Literature while describing structures of feeling. He wrote about the ephemeral essence of a social arena that is expressed through media and communication networks as:

    a social experience which is still in process, often indeed not yet recognized as social but taken to be private, idiosyncratic, and even isolating, but which in analysis… has its emergent, connecting, and dominant characteristics, indeed its specific hierarchies. These are often more recognizable at a later stage, when they have been…formalized, classified, and in many cases built into institutions and formations (132).

    Communicative choices are made by individuals, but they are restricted to a limited range of options by the overarching structure of a social sphere. While Williams described how media can convey feeling and emotional tension using a common understanding of a population, he foreshadowed the creation of online communities where users create a similar social dynamic. Structures of feeling capture the lived experience of members while they are interacting in the present. Most other social theories falsely solidify cultural structures after they are useful to a social group. Speech modes occupy the present, and therefore are beyond the structural theories of older descriptive models. Online cultures and their derivative effects on behavior are bounded, similar to the social phenomena Williams observed in the physical world. Those who occupy the internet’s social sphere prefer independence and individuality, which in turn drives the technology’s structure. The speech is anonymous, forceful, and unique for that very reason. These principles are shared beliefs, the feelings of a community, but the individual users do not approach conversations and explicitly ask others what is happening and why. Only through practice and experience do they gain an intuition for the social interaction around them and begin exhibiting particular speech practices. These speech ways do not require excessive expertise to use correctly, and nothing contained in this research should surprise a YouTube community member. The generative component is explaining how speech modes form and why they are shared, a task initiated by this research and hopefully continued with future investigation.

    Limitations and Future Research

    A large body of contemporary social science research pursues one of two interesting tasks. Many studies either attempt to reinvent the wheel using newer methods, or they focus on cutting-edge phenomena that impact a minuscule part of the population. While both are necessary to correct historical fallacies and pave the way for future investigation, I believe insufficient time is spent scrutinizing the territory between these two extremes where the majority of people engage with their own humanity. Previous work within YouTube falls victim to this critique. With the advent of newer and more accessible technology, communication research has focused on novel video-responses while overlooking the user commentary; that trend towards the sexier research subjects continues today. Technology is developing at a rate too fast for cultural immersion and amalgamation to occur within a larger audience. Sub-groups (known as early adopters in marketing) develop that continue to incorporate this material culture, while the bulk of society struggles with reformulating their lives around older forms of technology. It is comparable to studying aviation over vehicular travel in the 1920s, five-star restaurants over a local chain of bakeries, or face-to-face phone chatting over standard cell-phone communication, and then claiming to have found the answers to social investigation’s grand questions. I believe it is important to focus on common practice with greater attention, because it is there that unconscious and intuitive social behavior occurs. It is accordingly closer to the answers social scientists wish to find, the answers that are embedded beyond the consideration of a single rational mind.

    In this manner, I believe that future computer-mediated communications research would benefit most from a thorough understanding of the divide between those within a public setting who use interactive features, and those who do not. The large group of silent viewers is registered by page and video view counters, which subsequently impact popularity and represent an understudied component of memes (viral cultural material). Further investigations would also benefit from more rigorous statistical analysis of meta-data provided by public sites with interactive features. By studying demographic and user data, scientists can discover accurate trends that will inspire useful and complementary ethnographic investigation.


    References

    • Anderson, B. (1983). Imagined communities: Reflections on the origin and spread of nationalism. London: Verso.
    • Austin, J.L. (1975). How to do things with words. Cambridge, MA: Harvard University Press.
    • Bolter, J.D., & Grusin, R. (2000). Remediation: Understanding new media. Cambridge, MA: The MIT Press.
    • Bourdieu, P. (1977). Outline of a theory of practice. Cambridge: Cambridge University Press.
    • Certeau, M.D. (1984). The practice of everyday life. Berkeley: University of California Press.
    • Clark, J. (2009). Public media 2.0: Dynamic, engaged publics. Washington, D.C.: American University, School of Communication, Center for Social Media. Retrieved from http://www.centerforsocial
    • Durkheim, E. (1897). Suicide: A study in sociology. Trans. Spaulding, J., & Simpson, G. (1951). New York: Free Press of Glencoe.
    • Evans, A. (2001). This virtual life: Escapism and simulation in our media world. London: Fusion Press.
    • Fine, G., & Soucey, M. (2005). Joking cultures: Humor themes as social regulation in group life. International Journal of Humor Research, 18, 1-22.
    • Goode, L. (2005). Jurgen Habermas: Democracy and the public sphere. London: Pluto Press.
    • Green, L. (2002). Communication, technology and society. London: SAGE Publications Ltd.
    • Herbermann, C.G. (1907). The Catholic encyclopedia. New York: The Encyclopedia Press.
    • Hymes, D. (1964). Language in culture and society: A reader in linguistics and anthropology. New York: Harper & Row.
    • Internet World Stats. (2011, March 23). Retrieved July 27, 2011 from Miniwatts Marketing Group:
    • Jewkes, Y. (2003). Dot.cons: Crime, deviance and identity on the internet. Cullompton: Willan.
    • Lange, P.G. (2007). Commenting on comments: Investigating responses to antagonism on YouTube. Paper contributed to the Society for Applied Anthropology Conference, FL.
    • McQuade, S.C., Colt, J.P., & Meyer, N.B. (2009). Cyber bullying: Protecting kids and adults from online bullies. Westport, CT: Praeger Publishers.
    • Oh Internet. (2011). Retrieved from
    • Poole, C. (February 2010). Christopher “moot” Poole: The case for anonymity online. TED talks. Retrieved from
    • Schiffrin, D., Tannen, D., & Hamilton, H.E. (2003). The handbook of discourse analysis. Malden, MA: Wiley-Blackwell.
    • Scott, J. (June 2011). YouTube testing new “reaction” buttons: Omg, epic, lol, fail, wtf, & cute. Retrieved from
    • Stephens, B. (2011, July 19). News of the World vs. WikiLeaks. The Wall Street Journal. Retrieved from
    • Suler, J.R. (2004). The online disinhibition effect. CyberPsychology and Behavior, 7, 321-326. Retrieved from
    • The YouTube Team. YouTube community guidelines. Retrieved from
    • Thompson, J.B. (1995). The media and modernity: A social theory of the media. Stanford, CA: Stanford University Press.
    • Watkins, S.C. (2009). The young and the digital: What the migration to social-network sites, games, and anytime, anywhere media means for our future. Boston: Beacon Press.
    • Williams, M. (2006). Virtually criminal: Crime, deviance and regulation online. London: Routledge.
    • Williams, R. (1978). Marxism and literature. New York: Oxford University Press, USA.

    Figure 1: Speech Mode Formation

    Figure 2: YouTube Decision Tree- Primary Speech Act

    Figure 3: YouTube Decision Tree- Secondary Speech Act