Program Overview

Type of Event	Dates	Location
Workshop / Tutorial / Shared Task Day 1	10. September 2024	Radetzkystraße 2, 1030 Vienna
Main Conference Day 1	11. September 2024	Währingerstraße 29, 1090 Vienna
GSCL-DGfs-CL Networking Lunch*	11. September 2024 (12:00)	Restaurant Rebhuhn, Berggasse 24, 1090 Vienna
Conference dinner	11. September 2024 (18:45)	Zum Martin Sepp, Cobenzlgasse 34, 1190 Vienna
Main Conference Day 2	12. September 2024	Währingerstraße 29, 1090 Vienna
GSCL Members Meeting	12. September 2024 (17:30)	Währingerstraße 29, 1090 Vienna
GSCL PhD Award	12. September 2024	Währingerstraße 29, 1090 Vienna
PhD dinner	12. September 2024 (19:30)	Restaurant Hansy, Heinestraße 42, 1020 Vienna
Workshop / Tutorial / Shared Task Day 2	13. September 2024	Währingerstraße 29, 1090 Vienna

* Only for members of the organizations GSCL/DGfs-CL.

Detailed schedule: https://calendar.google.com/calendar/u/0?cid=NzBmMjllN2FiYmYzMzJjY2E1MGY2MzQyYWVmYTJjMTJmNDZlYWEwOGVhZGQ4NmJmNDFjYjI2YjU2OGQ3MDljNkBncm91cC5jYWxlbmRhci5nb29nbGUuY29t

GermEval Shared Task 1: GERMS-DETECT Workshop, 10. September 2024

Time	Session/Event	Location
10. Sept, 12:00-16:00	Paper presentations and discussions	Radetzkystraße 2, 1030 Vienna
10. Sept, 16:30-17:30	Panel discussion	Radetzkystraße 2, 1030 Vienna

Please find the entire program of the workshop here.

The paper presentations and discussions are followed by a panel discussion with representatives from the media, law and ethics on "The role of AI in media production".

Please find details on the panel discussion here in English and in German.

Please bring your passport or identity card to get access to the event space (workshop and panel discussion)!

Main conference

All abstracts for the main conference can be found here.

Proceedings (pdf)

Please see this Google calendar for a detailed schedule!

11. September 2024

Time	Session/Event
11. Sept., 8:30-9:30	Registration
11. Sept., 9:30-10:00	Opening
11. Sept., 10:00-11:00	Keynote 1, Leonie Weißweiler
11. Sept., 11:00-12:00	Poster 1 + Coffee
11. Sept., 12:00-14:00	Break
11. Sept., 14:00-15:00	Keynote 2, Sebastian Schuster: What does it mean for a language model to exhibit a language understanding ability?
11. Sept., 15:00-16:00	Poster 2 + Coffee
11. Sept., 16:00-17:00	Oral 1
11. Sept., 17:00-19:00	Social event**
11. Sept., 19:00-21:00	Dinner

** Social event: organized bus trip to the Cobenzl viewing platform above Vienna's vineyards, followed by a short walk (30 mins) downhill through the vineyards to the dinner location. If you do not wish to join the walk, you can join only the trip to the viewing platform and take the private bus down to the dinner location.

12. September 2024

Time	Session/Event
12. Sept., 8:30-9:30	Registration
12. Sept., 9:30-10:30	Keynote 3, Jana Diesner
12. Sept., 10:30-11:00	Coffee
12. Sept., 11:00-12:00	Oral 2
12. Sept., 12:00-14:00	Break
12. Sept., 14:00-15:00	Oral 3
12. Sept., 15:00-16:00	Poster 3 + Coffee
12. Sept., 16:00-17:00	GSCL PhD Award
12. Sept., 17:00-17:30	Closing session
12. Sept., 17:30	GSCL Members Meeting

Oral presentations & posters*

Paper number	Title	Authors	Presentation slot
3	CO-Fun: A German Dataset on Company Outsourcing in Fund Prospectuses for Named Entity Recognition and Relation Extraction	Neda Foroutan, Markus Schröder, Andreas Dengel	Poster 2
8	Lex2Sent: A bagging approach to unsupervised sentiment analysis	Kai-Robin Lange, Jonas Rieger, Carsten Jentsch	Oral 3
9	Discourse-Level Features in Spoken and Written Communication	Hannah J. Seemann, Sara Shahmohammadi, Manfred Stede, Tatjana Scheffler	Oral 3
10	Semiautomatic Data Generation for Academic Named Entity Recognition in German Text Corpora	Pia Schwarz	Poster 2
11	GERestaurant: A German Dataset of Annotated Restaurant Reviews for Aspect-Based Sentiment Analysis	Nils-Constantin Hellwig, Jakob Fehle, Markus Bink, Christian Wolff	Poster 2
12	Revisiting the Phenomenon of Syntactic Complexity Convergence on German Dialogue Data	Yu Wang, Hendrik Buschmeier	Poster 1
13	OMoS-QA: A Dataset for Cross-lingual Extractive Question Answering in a German Migration Context	Steffen Kleinle, Jakob Prange, Annemarie Friedrich	Oral 2
16	Few-Shot Prompting for Subject Indexing of German Medical Book Titles	Lisa Kluge, Maximilian Kähler	Poster 2
19	Querying Repetitions in Spoken Language Corpora	Elena Frick, Henrike Helmer, Dolores Lemmenmeier-Batinić	Poster 3
21	A comparison of data filtering techniques for English-Polish neural machine translation in the biomedical domain	Jorge del Pozo Lérida, Kamil Kojs, Janos Mate, Mikołaj Antoni Barański, Christian Hardmeier	Poster 3
22	Linguistic and extralinguistic factors in automatic speech recognition of German atypical speech	Eugenia Rykova, Mathias Walther	Poster 3
23	An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry	Stephen Meisenbacher, Tim Schopf, Weixin Yan, Patrick Holl, Florian Matthes	Poster 2
29	A Multilingual Dataset of Adversarial Attacks to Automatic Content Scoring Systems	Ronja Laarmann-Quante, Christopher Chandler, Noemi Incirkus, Vitaliia Ruban, Alona Solopov, Luca Steen	Poster 3
30	Version Control for Speech Corpora	Vlad Dumitru, Matthias Boehm, Martin Hagmüller, Barbara Schuppler	Poster 3
31	Tabular JSON: A Proposal for a Pragmatic Linguistic Data Format	Adam Roussel	Poster 2
37	Redundancy Aware Multiple Reference Based Gainwise Evaluation of Extractive Summarization	Mousumi Akter, Shubhra Kanti Karmaker Santu	Oral 1
38	Complexity of German Texts Written by Primary School Children	Jammila Laâguidi, Dana Neumann, Ronja Laarmann-Quante, Stefanie Dipper, Mihail Chifligarov	Poster 1
42	Using GermaNet for the Generation of Crossword Puzzles	Claus Zinn, Marie Hinrichs, Erhard Hinrichs	Poster 1
46	A Crosslingual Approach to Dependency Parsing for Middle High German	Cora Haiber	Poster 1
47	Discourse Parsing for German with new RST Corpora	Sara Shahmohammadi, Manfred Stede	Poster 3
48	Fine-grained quotation detection and attribution in German news articles	Fynn Petersen-Frey, Chris Biemann	Oral 1
50	AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection	Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt	Poster 3
51	OneLove beyond the field - A few-shot pipeline for topic and sentiment analysis during the FIFA World Cup in Qatar	Christoph Rauchegger, Sonja Mei Wang, Pieter Delobelle	Poster 3
53	How to Translate SQuAD to German? A Comparative Study of Answer Span Retrieval Methods for Question Answering Dataset Creation	Jens Kaiser, Agnieszka Falenska	Poster 2
55	LLM-based Translation Across 500 Years. The Case for Early New High German	Martin Volk, Dominic P. Fischer, Patricia Scheurer, Raphael Schwitter, Phillip Benjamin Ströbel	Poster 3
56	Binary indexes for optimising corpus queries	Peter Ljunglöf, Nicholas Smallbone, Mijo Thoresson, Victor Salomonsson	Poster 2
58	Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems	Moncef Benaicha, David Thulke, Mehmet Ali Tuğtekin Turan	Poster 2
60	Evaluating and Fine-Tuning Retrieval-Augmented Language Models to Generate Text with Accurate Citations	Vinzent Penzkofer, Timo Baumann	Poster 1
62	Decoding 16th-Century Letters: From Topic Models to GPT-Based Keyword Mapping	Phillip Benjamin Ströbel, Stefan Aderhold, Ramona Roller	Oral 1
63	Towards Improving ASR Outputs of Spontaneous Speech with LLMs	Karner Manuel, Julian Linke, Mark Kröll, Barbara Schuppler, Bernhard C Geiger	Poster 3
64	Exploring Data Acquisition Strategies for the Domain Adaptation of QA Models	Maurice Falk, Adrian Ulges, Dirk Krechel	Poster 2
66	Estimating Word Concreteness from Contextualized Embeddings	Christian Wartena	Poster 1
67	Features and Detectability of German Texts Generated with Large Language Models	Verena Irrgang, Veronika Solopova, Steffen Zeiler, Robert M. Nickel, Dorothea Kolossa	Oral 3
68	Exploring Automatic Text Simplification for Lithuanian	Justina Mandravickaitė, Egle Rimkiene, Danguolė Kalinauskaitė, Danguolė Kotryna Kapkan	Poster 1
69	Large Language Models as Evaluators for Scientific Synthesis	Julia Evans, Jennifer D'Souza, Sören Auer	Poster 1
71	Role-Playing LLMs in Professional Communication Training: The Case of Investigative Interviews with Children	Don Tuggener, Teresa Schneider, Ariana Huwiler, Tobias Kreienbühl, Simon Hischier, Pius von Däniken, Susanna Niehaus	Oral 2
72	Analysing Effects of Inducing Gender Bias in Language Models	Stephanie Gross, Brigitte Krenn, Craig Lincoln, Lena Holzwarth	Oral 2
73	Exploring Phonetic Features in Language Embeddings for Unseen Language Varieties of Austrian German	Lorenz Gutscher, Michael Pucher	Poster 3
76	Word alignment in Discourse Representation Structure parsing	Christian Obereder, Gabor Recski	Poster 1

*All abstracts for the main conference can be found here.

Workshops / Shared Tasks, 13. September 2024 (preliminary)

Time	Event	Location
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30)	GermEval Shared Task 2: Statement in German Easy Language (StaGE)	SR 3
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30)	Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO)	SR 4
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30)	Workshop on Computational Linguistics for Political Text Analysis (CPSS)	Lecture Hall 1 (HS1)

Accepted Workshops and Shared Tasks

GermEval Shared Task 1: Sexism Detection in German Online News Fora (GERMS-DETECT); planned for 10. September 2024; see https://ofai.github.io/GermEval2024-GerMS/ for more information!

GermEval Shared Task 2: Statement in German Easy Language (StaGE); planned for 13. September 2024; see https://german-easy-to-read.github.io/statements/ for more information!

Workshop on Computational Linguistics for Political Text Analysis (CPSS); planned for 13. September 2024; see https://sites.google.com/view/cpss2024konvens/home-page for more information!

Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO); planned for 13. September 2024; see https://sites.google.com/view/limo-2024 for more information!

Keynotes

Keynote 1: Leonie Weissweiler, UT Austin

Constructions all the way down: rethinking compositionality in LLMs

Abstract: Why are LLMs still not modelling all aspects of language perfectly? Previous work has suggested that the reason is their deficits in compositionality, that is, regularly building the meaning of an expression as a function of its parts. But in fact, human language is not compositional in this way. Rather, meaning is combined compositionally using constructions, which are pairings of form and function that vary wildly in shape and scope. This means that to achieve the full creativity and flexibility of human language, LLMs will have to assign meaning to constructions and use this to build the meaning of expressions. I will show that this is still not adequately handled by LLMs, and elaborate why construction-compositionality is one of the last remaining challenges that we must solve on our way to more cognitively plausible language models.

Short Bio: Leonie Weissweiler is a postdoc at UT Austin Linguistics where she works with Kyle Mahowald on the computational learnability of rare linguistic phenomena. She received her PhD from LMU Munich in July 2024, where she worked with Hinrich Schütze on the contributions of Construction Grammar and Morphology to NLP, and vice versa. Her research now focuses on using language models to discover and test hypotheses in Linguistics, while using insights from Linguistics to point out issues with language models.

Keynote 2: Sebastian Schuster, University College London

What does it mean for a language model to exhibit a language understanding ability?

Abstract: Large language models (LLMs) such as GPTs, Gemini or Llama often provide answers that fulfil user requests, which suggests that the model is at least to a large extent able to infer the user’s intent and to generate appropriate responses. However, given the open-ended nature of user requests and model responses, it has been quite challenging to systematically evaluate to what extent models exhibit specific language understanding abilities. In my talk, I will focus on one such ability, namely keeping track of how the states of entities change as a discourse unfolds. I will use this ability as a case study for how different evaluation methods can lead to different conclusions about model abilities. I will also discuss challenges in evaluating understanding abilities in LLMs and consider some recommendations on how to overcome some of these challenges.

Short Bio: Sebastian Schuster is currently a lecturer in computational linguistics at University College London, and he will start a WWTF-funded research group at the University of Vienna in mid-2025. Before joining UCL, he was a postdoc at New York University and at Saarland University, after completing his PhD at Stanford University. His research focuses on computational semantics and pragmatics and he builds and evaluates computational models of interpreting language in context. His work has won awards at ACL and he has been a senior area chair and program chair at several *ACL conferences and workshops.

Keynote 3: Jana Diesner, Technical University of Munich

Using Natural Language Processing to Advance Social Science, Responsibly

Abstract: Leveraging natural language processing techniques to consider the content of information at scale allows us to discover and re-evaluate theories and patterns of societal behavior. This process requires researchers to make a multitude of decisions that require expertise from multiple fields, including how to sample, represent, and preprocess data, implement algorithms, and validate results. I present findings and lessons learned from using NLP techniques, especially entity disambiguation and relation extraction, to study how and why people collaborate and respond to crises. I discuss sources of biases and strategies for mitigating them.

Short Bio: Jana Diesner is a Full Professor at the Technical University of Munich, School of Social Science and Technology. There, she leads the Human Centered Computing group. Her interdisciplinary group works on methods from network analysis, natural language processing, machine learning and AI, and integrates them with theories from the social sciences to advance our knowledge about complex societal systems and responsible computing. Before joining TU Munich in 2024, she was a tenured professor at the School of Information Sciences at the University of Illinois Urbana-Champaign. Jana earned her Ph.D. at Carnegie Mellon, School of Computer Science.