research product . 2022

Conditional Neural Headline Generation for Finnish

Koppatz, Maximilian
Open Access English
  • Published: 01 Jan 2022
  • Publisher: Helsingin yliopisto
  • Country: Finland
Abstract
Automatic headline generation has the potential to significantly assist editors charged with headlining articles. Approaches to automation in the headlining process can range from tools that serve as creative aids to complete end-to-end automation. The latter is difficult to achieve, as journalistic requirements imposed on headlines must be met with little room for error, and these requirements depend on the news brand in question. This thesis investigates automatic headline generation in the context of the Finnish newsroom. The primary question I seek to answer is how well the current state of text generation using deep neural language models can be applied to the headlining process in Finnish news media. To answer this, I have implemented and pre-trained a Finnish generative language model based on the Transformer architecture. I have fine-tuned this language model for headline generation as autoregression of headlines conditioned on the article text. To generate a diverse set of headlines for a given text, I have designed and implemented a variation of the Diverse Beam Search algorithm with additional parameters. The evaluation of the generative capabilities of this system was done with real-world usage in mind. I asked domain experts in headlining to evaluate a generated set of text-headline pairs. The task was to accept or reject the individual headlines on key criteria. The survey responses were then quantitatively and qualitatively analyzed. Based on the analysis and feedback, this model can already be useful as a creative aid in the newsroom despite being far from ready for automation. I have identified concrete improvement directions based on the most common types of errors, which provide interesting future work.
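
As a rough illustration of the decoding idea mentioned in the abstract, the following Python sketch implements a basic form of Diverse Beam Search: beams are split into groups, and each group is penalized for picking tokens that earlier groups already chose at the same step. It is not the thesis's variation, whose additional parameters are not described in this record. The toy model, the names `diverse_beam_search` and `toy_model`, and every parameter name are assumptions made for illustration only. One simplification is stated in the comments: the penalty is used only for ranking candidates, while the stored score stays the plain log-probability.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Hypothesis = Tuple[Tuple[str, ...], float]  # (tokens, summed log-probability)


def toy_model(prefix: Sequence[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a language model: fixed next-token log-probs."""
    table = {"a": 0.5, "b": 0.3, "</s>": 0.2}
    return {tok: math.log(p) for tok, p in table.items()}


def diverse_beam_search(
    next_log_probs: Callable[[Sequence[str]], Dict[str, float]],
    num_groups: int = 2,
    beams_per_group: int = 2,
    max_len: int = 5,
    diversity_strength: float = 0.5,
    eos: str = "</s>",
) -> List[List[Hypothesis]]:
    """Return one list of hypotheses per group."""
    groups: List[List[Hypothesis]] = [[((), 0.0)] for _ in range(num_groups)]
    for _ in range(max_len):
        # How often each token was chosen at this step by earlier groups.
        chosen_counts: Dict[str, int] = {}
        for g in range(num_groups):
            # Candidate: (ranking score, plain score, tokens, newly extended?)
            candidates = []
            for tokens, score in groups[g]:
                if tokens and tokens[-1] == eos:
                    # Finished hypotheses are carried over unchanged.
                    candidates.append((score, score, tokens, False))
                    continue
                for tok, lp in next_log_probs(tokens).items():
                    penalty = diversity_strength * chosen_counts.get(tok, 0)
                    # Assumption: the penalty affects ranking only.
                    candidates.append(
                        (score + lp - penalty, score + lp, tokens + (tok,), True)
                    )
            candidates.sort(key=lambda c: c[0], reverse=True)
            kept = candidates[:beams_per_group]
            groups[g] = [(tokens, plain) for _, plain, tokens, _ in kept]
            # Record this group's new tokens so later groups avoid them.
            for _, _, tokens, extended in kept:
                if extended:
                    chosen_counts[tokens[-1]] = chosen_counts.get(tokens[-1], 0) + 1
    return groups


if __name__ == "__main__":
    for i, group in enumerate(diverse_beam_search(toy_model, diversity_strength=10.0)):
        for tokens, score in group:
            print(i, " ".join(tokens), round(score, 3))
```

With `diversity_strength=0.0` every group returns the same hypotheses as ordinary beam search; raising it pushes later groups toward tokens the earlier groups did not pick, which is the property that makes the output set useful as a list of alternative headlines.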
Subjects
ACM Computing Classification System: ComputingMilieux_THECOMPUTINGPROFESSION
free text keywords: text generation, natural language processing, deep learning, algorithms, headline generation, no specialization, Master's Programme in Data Science
Communities
  • Digital Humanities and Cultural Heritage