QUOTUS


The Structure of Political Media Coverage as Revealed by Quoting Patterns

Vlad Niculae*, Caroline Suen*, Justine Zhang*, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec
Proceedings of WWW 2015. PDF



Media outlets are commonly accused of exhibiting systematic bias in their selection and portrayal of issues to cover. We seek to capture this bias by examining the set of quotes various media outlets cite from presidential addresses. Quotes are especially suitable since they correspond to an outlet’s explicit choices of whether or not to cover specific parts of a larger statement. For instance, the above image illustrates differences and similarities in quoting volume by outlets commonly considered to be liberal and conservative, for the 2010 State of the Union address. Check out the respective quotes and outlets citing them in our interactive visualization.

About


We propose a framework based on quoting patterns for characterizing and quantifying the degree to which media outlets exhibit systematic bias. We apply this framework to a massive dataset of news articles spanning Barack Obama's presidency. By encoding quoting patterns in a low-rank space, we provide an analysis of the structure of political media coverage. More details are in our paper.



Quoting patterns reveal a latent media bias space that aligns surprisingly well with political ideology and outlet type. Here we show a projection of media outlets, and linguistic features of the cited quotes, onto the first two latent bias dimensions. From left to right: 1. a selection of media outlets (more in the paper); 2. sentiment of the quoted paragraphs; 3. proportion of cited quotes containing negation. We can see that outlets mapped to the mainstream conservative side of the latent space focus on quotes that portray a presidential persona disproportionately characterized by negativity.
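As a rough illustration of what projecting outlets onto latent dimensions can look like, here is a minimal sketch that assumes a binary outlet-by-quote matrix (1 if the outlet cited the quote) and uses a plain truncated SVD on column-centered data. The function name, the centering step, and the toy matrix are assumptions made for illustration; the factorization used in the paper may differ.

```python
import numpy as np

def latent_outlet_coordinates(quote_matrix, n_dims=2):
    """Place each outlet (row) in the top n_dims latent dimensions of the matrix."""
    X = np.asarray(quote_matrix, dtype=float)
    # Assumption: center each quote column so the dimensions capture differences
    # between outlets rather than overall quote popularity.
    X = X - X.mean(axis=0, keepdims=True)
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Scale the left singular vectors by their singular values.
    return U[:, :n_dims] * s[:n_dims]

# Hypothetical toy data: outlets 0-1 cite quotes 0-1, outlets 2-3 cite quotes 2-3.
toy = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
coords = latent_outlet_coordinates(toy, n_dims=2)
```

On this toy matrix, the two groups of outlets land on opposite sides of the first dimension, and the second dimension carries no signal because the centered matrix has rank one.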

An interactive visualization of quoting patterns is also available.



Download QUOTUS Data


Quotes: quotes.tar.gz (37 MB)

The archive contains the quotes, their locations within the corresponding White House speech, and information about the articles that cite them. Details are in the README.


White House speech transcripts: transcripts.tar.gz (33 MB)

The archive contains preprocessed transcripts of speeches originally downloaded from the White House archive. Details are in the README.
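Before consulting the READMEs, it can help to peek at what a downloaded archive contains. The sketch below uses only Python's standard `tarfile` module and makes no assumption about the internal file layout, which is documented in the READMEs rather than here. The helper name `list_archive_members` is hypothetical.

```python
import tarfile

def list_archive_members(archive_path, limit=10):
    """Return the names of the first `limit` entries in a .tar.gz archive."""
    names = []
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if len(names) >= limit:
                break
            names.append(member.name)
    return names

# Example call, assuming the archive was saved in the working directory:
# list_archive_members("quotes.tar.gz")
```

Iterating over the archive rather than extracting it avoids writing anything to disk until the layout is understood.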

Abstract


Given the extremely large pool of events and stories available, media outlets need to focus on a subset of issues and aspects to convey to their audience. Outlets are often accused of exhibiting a systematic bias in this selection process, with different outlets portraying different versions of reality. However, in the absence of objective measures and empirical evidence, the direction and extent of systematicity remain widely disputed.

In this paper we propose a framework based on quoting patterns for quantifying and characterizing the degree to which media outlets exhibit systematic bias. We apply this framework to a massive dataset of news articles spanning the six years of Obama's presidency and all of his speeches, and reveal that a systematic pattern does indeed emerge from the outlets' quoting behavior. Moreover, we show that this pattern can be successfully exploited in an unsupervised prediction setting, to determine which new quotes an outlet will select to broadcast. By encoding bias patterns in a low-rank space we provide an analysis of the structure of political media coverage. This reveals a latent media bias space that aligns surprisingly well with political ideology and outlet type. A linguistic analysis exposes striking differences across these latent dimensions, showing how the different types of media outlets portray different realities even when reporting on the same events. For example, outlets mapped to the mainstream conservative side of the latent space focus on quotes that portray a presidential persona disproportionately characterized by negativity.
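To make the prediction idea concrete, the sketch below scores unseen outlet-quote pairs with a rank-k SVD reconstruction of a binary outlet-by-quote matrix. This is a generic low-rank scoring scheme chosen for illustration, not necessarily the method used in the paper; the function names and the toy matrix are assumptions.

```python
import numpy as np

def low_rank_scores(observed, rank=1):
    """Rank-`rank` SVD reconstruction of an outlet-by-quote matrix, used as affinity scores."""
    X = np.asarray(observed, dtype=float)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def rank_unseen_quotes(observed, scores, outlet):
    """Order the quotes an outlet has not cited by descending reconstructed score."""
    unseen = np.flatnonzero(np.asarray(observed)[outlet] == 0)
    return unseen[np.argsort(-scores[outlet, unseen])]

# Hypothetical toy data: outlets 0-2 favor quotes 0-2, outlets 3-4 favor quotes 3-5.
# Outlet 2's citation of quote 2 is treated as held out (set to 0).
observed = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
])
scores = low_rank_scores(observed, rank=2)
ranking = rank_unseen_quotes(observed, scores, outlet=2)
# Quote 2 ranks first for outlet 2: similar outlets cite it, unlike quotes 3-5.
```

The reconstruction borrows strength from outlets with similar quoting behavior, which is why the held-out quote scores higher than quotes cited only by the other group.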

BibTeX entry:

            @InProceedings{Niculae+al:15a,
            author={Vlad Niculae and Caroline Suen and Justine Zhang and
            Cristian Danescu-Niculescu-Mizil and Jure Leskovec},
            title={QUOTUS: The Structure of Political Media Coverage
            as Revealed by Quoting Patterns},
            booktitle={Proceedings of WWW 2015},
            year={2015}
            }