Large Language Models and Provenance Metadata for Determining the Relevance of Images and Videos in News Stories

Authors: Tomas Peterka, Matyas Bohacek

Published: 2025-02-13 16:48:27+00:00

AI Summary

This paper proposes a system that uses a large language model (LLM) to assess the relevance of images and videos in news stories by analyzing the article text together with each media item's provenance metadata. The system determines whether the media's origin and any recorded edits are relevant to the news article and produces an overall relevance assessment.
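A minimal sketch of what such an assessment could look like as a data structure; the field names and the three-part breakdown (origin, edits, overall) are assumptions inferred from this summary, not the authors' actual schema.

```python
# Hypothetical container for the system's relevance assessment; the fields are
# assumed from the summary above, not taken from the authors' code.
from dataclasses import dataclass

@dataclass
class RelevanceAssessment:
    origin_relevant: bool    # does the media's recorded origin fit the story?
    edits_relevant: bool     # are the recorded edits consistent with the story?
    overall_relevant: bool   # combined verdict for the article as a whole
    rationale: str           # free-text explanation produced by the LLM

# Example of what the system might return for an in-context photograph.
example = RelevanceAssessment(
    origin_relevant=True,
    edits_relevant=True,
    overall_relevant=True,
    rationale="The photo was captured at the reported location and only cropped.",
)
print(example)
```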

Abstract

The most effective misinformation campaigns are multimodal, often combining text with images and videos taken out of context -- or fabricating them entirely -- to support a given narrative. Contemporary methods for detecting misinformation, whether in deepfakes or text articles, often miss the interplay between multiple modalities. Built around a large language model, the system proposed in this paper addresses these challenges. It analyzes both the article's text and the provenance metadata of included images and videos to determine whether they are relevant. We open-source the system prototype and interactive web interface.


Key findings
The proposed method effectively combines LLM analysis with provenance metadata to assess media relevance. A prototype web interface demonstrates the system's functionality. Limitations include potential LLM hallucinations and the current lack of widespread provenance metadata adoption.
Approach
The system uses a large language model (LLM) to analyze the news article text, image/video captions, and provenance metadata (origin, edit history, etc.). The LLM assesses whether the media's origin and any recorded edits are relevant to the article and produces an overall relevance assessment. A prototype web interface lets users interact with the system; a sketch of the prompting step follows.
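A minimal sketch of the prompting step this approach implies, assuming the Phi-3 checkpoint `microsoft/Phi-3-mini-4k-instruct` served through Hugging Face `transformers`; the prompt wording and provenance fields are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: prompt an LLM with article text, a media caption, and provenance
# metadata, and ask for a relevance judgement. The checkpoint, prompt wording, and
# provenance fields are assumptions; only the overall flow follows the paper.
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

def assess_media_relevance(article_text: str, caption: str, provenance: dict) -> str:
    """Return the LLM's free-text verdict on whether the media fits the article."""
    provenance_facts = "\n".join(f"- {key}: {value}" for key, value in provenance.items())
    messages = [{
        "role": "user",
        "content": (
            "You are checking whether an image or video belongs in a news article.\n\n"
            f"Article text:\n{article_text}\n\n"
            f"Media caption:\n{caption}\n\n"
            f"Provenance metadata (origin and edit history):\n{provenance_facts}\n\n"
            "State whether the media's origin is relevant to the article, whether its "
            "edits are relevant, and give an overall relevance verdict with a short reason."
        ),
    }]
    result = generator(messages, max_new_tokens=300)
    # With chat-style input, generated_text holds the conversation; the last turn
    # is the model's reply.
    return result[0]["generated_text"][-1]["content"]

# Hypothetical provenance fields, e.g. as they might be read from a C2PA manifest.
print(assess_media_relevance(
    article_text="Heavy flooding hit the city centre on Thursday...",
    caption="Residents wade through floodwater near the main square.",
    provenance={"capture_date": "2021-07-15", "capture_location": "unknown",
                "edits": "crop, exposure adjustment"},
))
```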
Datasets
UNKNOWN
Model(s)
Phi-3 LLM
Author countries
Czech Republic, United States