Program Overview

Type of Event Dates Location
Workshop / Tutorial / Shared Task Day 1 10. September 2024 Radetzkystraße 2, 1030 Vienna
Main Conference Day 1 11. September 2024 Währingerstraße 29, 1090 Vienna
GSCL-DGfS-CL Networking Lunch* 11. September 2024 (12:00) Restaurant Rebhuhn, Berggasse 24, 1090 Vienna
Conference dinner 11. September 2024 (18:45) Zum Martin Sepp, Cobenzlgasse 34, 1190 Vienna
Main Conference Day 2 12. September 2024 Währingerstraße 29, 1090 Vienna
GSCL Members Meeting 12. September 2024 (17:30) Währingerstraße 29, 1090 Vienna
GSCL PhD Award 12. September 2024 Währingerstraße 29, 1090 Vienna
PhD dinner 12. September 2024 (19:30) Restaurant Hansy, Heinestraße 42, 1020 Vienna
Workshop / Tutorial / Shared Task Day 2 13. September 2024 Währingerstraße 29, 1090 Vienna

GermEval Shared Task 1: GERMS-DETECT Workshop, 10. September 2024

Time Session/Event Location
10. Sept, 12:00-16:00 Paper presentations and discussions Radetzkystraße 2, 1030 Vienna
10. Sept, 16:30-17:30 Panel discussion Radetzkystraße 2, 1030 Vienna

Please find the entire program of the workshop here.

The paper presentations and discussions are followed by a panel discussion with representatives from the media, law and ethics on "The role of AI in media production".

Please find details on the panel discussion here in English and in German.

Please bring your passport or identity card to get access to the event space (workshop and panel discussion)!

Main conference

All abstracts for the main conference can be found here

Proceedings (pdf)

Please see this Google calendar for a detailed schedule.

11. September 2024

Time Session/Event
11. Sept., 8:30-9:30 Registration
11. Sept., 9:30-10:00 Opening
11. Sept., 10:00-11:00 Keynote 1: Leonie Weißweiler
11. Sept., 11:00-12:00 Poster 1 + Coffee
11. Sept., 12:00-14:00 Break
11. Sept., 14:00-15:00 Keynote 2: Sebastian Schuster
What does it mean for a language model to exhibit a language understanding ability?
11. Sept., 15:00-16:00 Poster 2 + Coffee
11. Sept., 16:00-17:00 Oral 1
11. Sept., 17:00-19:00 Social event**
11. Sept., 19:00-21:00 Dinner

** Social event: an organized bus trip to the Cobenzl viewing platform above Vienna's vineyards, followed by a short walk (30 mins) downhill through the vineyards to the dinner location. Alternatively, if you do not wish to join the walk, you can join only the trip to the viewing platform and take the private bus down to the dinner location.

12. September 2024

Time Session/Event
12. Sept., 8:30-9:30 Registration
12. Sept., 9:30-10:30 Keynote 3: Jana Diesner
12. Sept., 10:30-11:00 Coffee
12. Sept., 11:00-12:00 Oral 2
12. Sept., 12:00-14:00 Break
12. Sept., 14:00-15:00 Oral 3
12. Sept., 15:00-16:00 Poster 3 + Coffee
12. Sept., 16:00-17:00 GSCL PhD Award
12. Sept., 17:00-17:30 Closing session
12. Sept., 17:30 GSCL Members Meeting

Oral presentations & posters*

Paper number Title Authors Presentation slot
3 CO-Fun: A German Dataset on Company Outsourcing in Fund Prospectuses for Named Entity Recognition and Relation Extraction Neda Foroutan, Markus Schröder, Andreas Dengel Poster 2
8 Lex2Sent: A bagging approach to unsupervised sentiment analysis Kai-Robin Lange, Jonas Rieger, Carsten Jentsch Oral 3
9 Discourse-Level Features in Spoken and Written Communication Hannah J. Seemann, Sara Shahmohammadi, Manfred Stede, Tatjana Scheffler Oral 3
10 Semiautomatic Data Generation for Academic Named Entity Recognition in German Text Corpora Pia Schwarz Poster 2
11 GERestaurant: A German Dataset of Annotated Restaurant Reviews for Aspect-Based Sentiment Analysis Nils-Constantin Hellwig, Jakob Fehle, Markus Bink, Christian Wolff Poster 2
12 Revisiting the Phenomenon of Syntactic Complexity Convergence on German Dialogue Data Yu Wang, Hendrik Buschmeier Poster 1
13 OMoS-QA: A Dataset for Cross-lingual Extractive Question Answering in a German Migration Context Steffen Kleinle, Jakob Prange, Annemarie Friedrich Oral 2
16 Few-Shot Prompting for Subject Indexing of German Medical Book Titles Lisa Kluge, Maximilian Kähler Poster 2
19 Querying Repetitions in Spoken Language Corpora Elena Frick, Henrike Helmer, Dolores Lemmenmeier-Batinić Poster 3
21 A comparison of data filtering techniques for English-Polish neural machine translation in the biomedical domain Jorge del Pozo Lérida, Kamil Kojs, Janos Mate, Mikołaj Antoni Barański, Christian Hardmeier Poster 3
22 Linguistic and extralinguistic factors in automatic speech recognition of German atypical speech Eugenia Rykova, Mathias Walther Poster 3
23 An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry Stephen Meisenbacher, Tim Schopf, Weixin Yan, Patrick Holl, Florian Matthes Poster 2
29 A Multilingual Dataset of Adversarial Attacks to Automatic Content Scoring Systems Ronja Laarmann-Quante, Christopher Chandler, Noemi Incirkus, Vitaliia Ruban, Alona Solopov, Luca Steen Poster 3
30 Version Control for Speech Corpora Vlad Dumitru, Matthias Boehm, Martin Hagmüller, Barbara Schuppler Poster 3
31 Tabular JSON: A Proposal for a Pragmatic Linguistic Data Format Adam Roussel Poster 2
37 Redundancy Aware Multiple Reference Based Gainwise Evaluation of Extractive Summarization Mousumi Akter, Shubhra Kanti Karmaker Santu Oral 1
38 Complexity of German Texts Written by Primary School Children Jammila Laâguidi, Dana Neumann, Ronja Laarmann-Quante, Stefanie Dipper, Mihail Chifligarov Poster 1
42 Using GermaNet for the Generation of Crossword Puzzles Claus Zinn, Marie Hinrichs, Erhard Hinrichs Poster 1
46 A Crosslingual Approach to Dependency Parsing for Middle High German Cora Haiber Poster 1
47 Discourse Parsing for German with new RST Corpora Sara Shahmohammadi, Manfred Stede Poster 3
48 Fine-grained quotation detection and attribution in German news articles Fynn Petersen-Frey, Chris Biemann Oral 1
50 AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt Poster 3
51 OneLove beyond the field - A few-shot pipeline for topic and sentiment analysis during the FIFA World Cup in Qatar Christoph Rauchegger, Sonja Mei Wang, Pieter Delobelle Poster 3
53 How to Translate SQuAD to German? A Comparative Study of Answer Span Retrieval Methods for Question Answering Dataset Creation Jens Kaiser, Agnieszka Falenska Poster 2
55 LLM-based Translation Across 500 Years. The Case for Early New High German Martin Volk, Dominic P. Fischer, Patricia Scheurer, Raphael Schwitter, Phillip Benjamin Ströbel Poster 3
56 Binary indexes for optimising corpus queries Peter Ljunglöf, Nicholas Smallbone, Mijo Thoresson, Victor Salomonsson Poster 2
58 Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems Moncef Benaicha, David Thulke, Mehmet Ali Tuğtekin Turan Poster 2
60 Evaluating and Fine-Tuning Retrieval-Augmented Language Models to Generate Text with Accurate Citations Vinzent Penzkofer, Timo Baumann Poster 1
62 Decoding 16th-Century Letters: From Topic Models to GPT-Based Keyword Mapping Phillip Benjamin Ströbel, Stefan Aderhold, Ramona Roller Oral 1
63 Towards Improving ASR Outputs of Spontaneous Speech with LLMs Karner Manuel, Julian Linke, Mark Kröll, Barbara Schuppler, Bernhard C Geiger Poster 3
64 Exploring Data Acquisition Strategies for the Domain Adaptation of QA Models Maurice Falk, Adrian Ulges, Dirk Krechel Poster 2
66 Estimating Word Concreteness from Contextualized Embeddings Christian Wartena Poster 1
67 Features and Detectability of German Texts Generated with Large Language Models Verena Irrgang, Veronika Solopova, Steffen Zeiler, Robert M. Nickel, Dorothea Kolossa Oral 3
68 Exploring Automatic Text Simplification for Lithuanian Justina Mandravickaitė, Egle Rimkiene, Danguolė Kalinauskaitė, Danguolė Kotryna Kapkan Poster 1
69 Large Language Models as Evaluators for Scientific Synthesis Julia Evans, Jennifer D'Souza, Sören Auer Poster 1
71 Role-Playing LLMs in Professional Communication Training: The Case of Investigative Interviews with Children Don Tuggener, Teresa Schneider, Ariana Huwiler, Tobias Kreienbühl, Simon Hischier, Pius von Däniken, Susanna Niehaus Oral 2
72 Analysing Effects of Inducing Gender Bias in Language Models Stephanie Gross, Brigitte Krenn, Craig Lincoln, Lena Holzwarth Oral 2
73 Exploring Phonetic Features in Language Embeddings for Unseen Language Varieties of Austrian German Lorenz Gutscher, Michael Pucher Poster 3
76 Word alignment in Discourse Representation Structure parsing Christian Obereder, Gabor Recski Poster 1

*All abstracts for the main conference can be found here

Workshops / Shared Tasks, 13. September 2024 (preliminary)

Time Event Location
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30) GermEval Shared Task 2: Statement in German Easy Language (StaGE) SR 3
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30) Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO) SR 4
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30) Workshop on Computational Linguistics for Political Text Analysis (CPSS) Lecture Hall 1 (HS1)

Accepted Workshops and Shared Tasks

GermEval Shared Task 1: Sexism Detection in German Online News Fora (GERMS-DETECT); planned for 10. September 2024; see https://ofai.github.io/GermEval2024-GerMS/ for more information!

GermEval Shared Task 2: Statement in German Easy Language (StaGE); planned for 13. September 2024; see https://german-easy-to-read.github.io/statements/ for more information!

Workshop on Computational Linguistics for Political Text Analysis (CPSS); planned for 13. September 2024; see https://sites.google.com/view/cpss2024konvens/home-page for more information!

Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO); planned for 13. September 2024; see https://sites.google.com/view/limo-2024 for more information!

Keynotes

Keynote 1: Leonie Weissweiler, UT Austin

Constructions all the way down: rethinking compositionality in LLMs

Abstract: Why are LLMs still not modelling all aspects of language perfectly? Previous work has suggested that the reason is their deficits in compositionality, that is, in regularly building the meaning of an expression as a function of its parts. But in fact, human language is not compositional in this way. Rather, meaning is combined compositionally using constructions, which are pairings of form and function that vary wildly in shape and scope. This means that to achieve the full creativity and flexibility of human language, LLMs will have to assign meaning to constructions and use this to build the meaning of expressions. I will show that this is still not adequately handled by LLMs, and elaborate why construction-compositionality is one of the last remaining challenges that we must solve on our way to more cognitively plausible language models.

Short Bio: Leonie Weissweiler is a postdoc at UT Austin Linguistics where she works with Kyle Mahowald on the computational learnability of rare linguistic phenomena. She received her PhD from LMU Munich in July 2024, where she worked with Hinrich Schütze on the contributions of Construction Grammar and Morphology to NLP, and vice versa. Her research now focuses on using language models to discover and test hypotheses in Linguistics, while using insights from Linguistics to point out issues with language models.

Keynote 2: Sebastian Schuster, University College London

What does it mean for a language model to exhibit a language understanding ability?

Abstract: Large language models (LLMs) such as GPTs, Gemini or Llama often provide answers that fulfil user requests, which suggests that the model is at least to a large extent able to infer the user’s intent and to generate appropriate responses. However, given the open-ended nature of user requests and model responses, it has been quite challenging to systematically evaluate to what extent models exhibit specific language understanding abilities. In my talk, I will focus on one such ability, namely keeping track of how the states of entities change as a discourse unfolds. I will use this ability as a case study for how different evaluation methods can lead to different conclusions about model abilities. I will also discuss challenges in evaluating understanding abilities in LLMs and consider some recommendations on how to overcome some of these challenges.

Short Bio: Sebastian Schuster is currently a lecturer in computational linguistics at University College London, and he will start a WWTF-funded research group at the University of Vienna in mid-2025. Before joining UCL, he was a postdoc at New York University and at Saarland University, after completing his PhD at Stanford University. His research focuses on computational semantics and pragmatics and he builds and evaluates computational models of interpreting language in context. His work has won awards at ACL and he has been a senior area chair and program chair at several *ACL conferences and workshops.

Keynote 3: Jana Diesner, Technical University of Munich

Using Natural Language Processing to Advance Social Science, Responsibly

Abstract: Leveraging natural language processing techniques to consider the content of information at scale allows us to discover and re-evaluate theories and patterns of societal behavior. This process requires researchers to make a multitude of decisions that require expertise from multiple fields, including how to sample, represent, and preprocess data, implement algorithms, and validate results. I present findings and lessons learned from using NLP techniques, especially entity disambiguation and relation extraction, to study how and why people collaborate and respond to crises. I discuss sources of biases and strategies for mitigating them.

Short Bio: Jana Diesner is a Full Professor at the Technical University of Munich, School of Social Science and Technology. There, she leads the Human Centered Computing group. Her interdisciplinary group works on methods from network analysis, natural language processing, machine learning and AI, and integrates them with theories from the social sciences to advance our knowledge about complex societal systems and responsible computing. Before joining TU Munich in 2024, she was a tenured professor at the School of Information Sciences at the University of Illinois Urbana-Champaign. Jana earned her Ph.D. at Carnegie Mellon, School of Computer Science.