
Recognising non-named spatial entities in literary texts: a novel spatial entities classifier

Explosion AI Blog, by Explosion, December 4, 2024

🧒 Explain Like I'm 5 (simple language)

Hey there, little explorer! Imagine you have a super-smart robot friend who loves to read old storybooks, like from your great-grandma's time! 🤖📚

This robot friend is learning a special trick. It's trying to find all the places in the stories that don't have a name of their own! Not places with proper names, like a town or a lake you could find on a map, but places like "under the big tree" or "behind the tall bush." 🌳

The robot uses a special brain, like a super-duper puzzle solver, to sniff out these hidden spots in old Swiss stories. It's like a treasure hunt, but for places! This helps grown-ups understand the stories even better. Isn't that cool? 🎉

In this paper, we present a case study on the prediction of what we call ‘non-named spatial entities’ (NNSE) in a historical corpus of Swiss-German novels using a deep learning model in conjunction with BERT and Prodigy.

Recognising non-named spatial entities in literary texts: a novel spatial entities classifier

(short paper)

Authors: Daniel Kababgi, Giulia Grisot, Federico Pennino and Berenike Herrmann

Presented in Session 2A: Literature


Abstract

Predicting spatial representations in literature is a challenging task that requires advanced machine learning methods and manual annotations. In this paper, we present a study that leverages manual annotations and a BERT language model to automatically detect and recognise non-named spatial entities in a historical corpus of Swiss novels. The annotated data, consisting of Swiss narrative texts in German from the period of 1840 to 1950, was used to train the machine learning model and fine-tune a deep learning model specifically for literary German. The annotation process, facilitated by the use of Prodigy, enabled iterative improvement of the model’s predictions by selecting informative instances from the unlabelled data. Our evaluation metrics (F1 score) demonstrate the model’s ability to predict various categories of spatial entities in our corpus. This new method enables researchers to explore spatial representations in literary text, contributing both to digital humanities and literary studies. While our study shows promising results, we acknowledge challenges such as representativeness of the annotated data, biases in manual annotations, and domain-specific language. By addressing these limitations and discussing the implications of our findings, we provide a foundation for future research in sentiment and spatial analysis in literature. Our findings not only contribute to the understanding of literary narratives but also demonstrate the potential of automated spatial analysis in historical and literary research.
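The abstract reports F1 scores per category of spatial entity but does not spell out how they are computed. As a rough illustration only, the following sketch computes exact-match, span-level precision, recall and F1 per label. The span representation as (start, end, label) tuples, the function name `span_f1`, and the category names in the example are assumptions made here for illustration; they are not taken from the paper, which may use a different matching scheme.

```python
def span_f1(gold, pred):
    """Per-label exact-match precision, recall and F1.

    gold, pred: sets of (start, end, label) tuples (an assumed representation).
    Returns a dict mapping label -> (precision, recall, f1).
    """
    labels = {label for _, _, label in gold | pred}
    scores = {}
    for label in sorted(labels):
        gold_spans = {s for s in gold if s[2] == label}
        pred_spans = {s for s in pred if s[2] == label}
        true_pos = len(gold_spans & pred_spans)  # spans matching exactly
        precision = true_pos / len(pred_spans) if pred_spans else 0.0
        recall = true_pos / len(gold_spans) if gold_spans else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical category names and token offsets, for illustration only.
gold = {(0, 2, "NATURAL"), (5, 6, "RURAL"), (9, 10, "INTERIOR")}
pred = {(0, 2, "NATURAL"), (5, 7, "RURAL"), (9, 10, "INTERIOR"), (12, 13, "INTERIOR")}

for label, (p, r, f) in span_f1(gold, pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact matching is the strictest choice: the "RURAL" prediction above overlaps the gold span but has a different end offset, so it counts as both a false positive and a false negative. A relaxed (overlap-based) match would score it differently.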

Original source

Explosion AI Blog

https://2024.computational-humanities-research.org/papers/paper59/
