AI methods for studying media homogenization
This article briefly presents a case study in which media and journalism scholars worked productively with computer science scholars.
Background:
Media pluralism is important. The news media ought to hold the powerful accountable, inform citizens with diverse perspectives, expose falsehoods, and enable effective public participation in self-governance by fostering a shared understanding of the world. We wanted to analyze the plurality of the journalism system and the homogenization tendencies within it.
In this study we analyze key dimensions and patterns of diversity in media production and distribution in Slovenia. The research question is: to what extent is the content of Slovenian media homogenized or pluralized?
How do we approach this from a methodological standpoint? We do so by analyzing weekly news production of 213 Slovenian media outlets (radio, TV, and internet) comprising altogether 15,338 news items.
Carried out manually, this would be very time-consuming research, but AI methods proved helpful.
The paper analyzes the weekly news production of all (213) relevant Slovenian media outlets (radio, TV, and internet), comprising 15,338 news items. It approaches the study of homogenization from three distinct angles: the homogenization of topics, sources, and opinions. By transcribing audio and video content and applying topic recognition and clustering techniques, the study identifies the topics and issues covered by each media outlet and examines the extent of their homogenization within the Slovenian media system. The topic recognition process employs natural language processing (NLP) techniques to detect recurring themes, while clustering is conducted using hierarchical and k-means methods to group similar news items.
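As a rough illustration of the clustering step, here is a minimal sketch of hierarchical (agglomerative) grouping of news items by word overlap. It is not the study's actual pipeline: the bag-of-words vectors, the average-linkage rule, the similarity threshold of 0.3, and the four example texts are all assumptions made for illustration, and it uses only the Python standard library where a real analysis would rely on dedicated NLP tooling.

```python
import math
from collections import Counter

def tf_vector(text):
    """Lowercased bag-of-words term counts for one news item."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def agglomerative(texts, threshold=0.3):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters until no pair has an
    average similarity of at least `threshold` (a hypothetical cut-off).
    Returns clusters as sorted lists of item indices.
    """
    vectors = [tf_vector(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def avg_sim(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Invented example items: two about politics, two about sport.
items = [
    "government budget vote parliament",
    "parliament budget debate government",
    "football match final score",
    "football final match result",
]
clusters = agglomerative(items)
```

With these invented items, the two politics texts end up in one cluster and the two sports texts in another. The fewer and larger the clusters across outlets, the more the outlets are covering the same topics.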
We do this by examining the extent and patterns of coverage that media outlets dedicate to public topics and events, and the space they dedicate to different societal actors. We analyse Slovenian TV, radio, print, and online media content. Methodologically, we rely on large-scale dataset analysis.
The study investigates media source diversity by identifying and categorizing sources through entity recognition models and manual coding. It assesses source dispersion, which measures the distribution of affiliations and status positions, and content dispersion, which evaluates the range of ideas, perspectives, and opinions present in a news item (Voakes et al., 1996). Ultimately, the study introduces a media homogeneity index, offering an original perspective on the evolving role of journalism in ensuring democratic accountability within a specific societal context.
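The text above does not say how source dispersion or the homogeneity index is calculated. One common way to quantify how evenly sources are spread across categories is normalized Shannon entropy, sketched below. The formula, the function names, and the source categories are assumptions for illustration only, not the study's definitions.

```python
import math
from collections import Counter

def source_dispersion(labels, n_categories):
    """Normalized Shannon entropy of source categories, in [0, 1].

    0 means every source falls into a single category; 1 means sources are
    spread evenly over all `n_categories` possible categories.
    """
    if not labels or n_categories < 2:
        return 0.0
    counts = Counter(labels)
    total = sum(counts.values())
    entropy = sum((c / total) * math.log(total / c) for c in counts.values())
    return entropy / math.log(n_categories)

def homogeneity(labels, n_categories):
    """Hypothetical homogeneity score: 1 minus the dispersion."""
    return 1.0 - source_dispersion(labels, n_categories)

# Invented coding scheme with four source categories.
CATEGORIES = ["government", "business", "ngo", "citizen"]

one_voice = ["government"] * 4                         # fully homogeneous
balanced = ["government", "business", "ngo", "citizen"]  # fully dispersed
mixed = ["government", "government", "government", "ngo"]

scores = {
    "one_voice": homogeneity(one_voice, len(CATEGORIES)),
    "balanced": homogeneity(balanced, len(CATEGORIES)),
    "mixed": homogeneity(mixed, len(CATEGORIES)),
}
```

Under this assumed measure, a news item quoting only government sources scores as fully homogeneous, one quoting each category equally scores as fully dispersed, and the mixed case falls in between.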
This is an example of how social science and computer science researchers can come together to gain insights into society.
This research will be presented at the Future of Journalism conference, Conflicting Journalisms: Resistance, Struggle, and Prospects, in Cardiff on 11-12 September. Authors: Igor Vobič, Boris Mance, Aleš Žagar, and Marko Robnik Šikonja.