Program Overview
Type of Event | Dates | Location |
---|---|---|
Workshop / Tutorial / Shared Task Day 1 | 10. September 2024 | Radetzkystraße 2, 1030 Vienna |
Main Conference Day 1 | 11. September 2024 | Währingerstraße 29, 1090 Vienna |
GSCL-DGfs-CL Networking Lunch* | 11. September 2024 (12:00) | Restaurant Rebhuhn, Berggasse 24, 1090 Vienna |
Conference dinner | 11. September 2024 (18:45) | Zum Martin Sepp, Cobenzlgasse 34, 1190 Vienna |
Main Conference Day 2 | 12. September 2024 | Währingerstraße 29, 1090 Vienna |
GSCL Members Meeting | 12. September 2024 (17:30) | Währingerstraße 29, 1090 Vienna |
GSCL PhD Award | 12. September 2024 | Währingerstraße 29, 1090 Vienna |
PhD dinner | 12. September 2024 (19:30) | Restaurant Hansy, Heinestraße 42, 1020 Vienna |
Workshop / Tutorial / Shared Task Day 2 | 13. September 2024 | Währingerstraße 29, 1090 Vienna |
* Only for members of the organizations GSCL/DGfs-CL.
Detailed schedule: https://calendar.google.com/calendar/u/0?cid=NzBmMjllN2FiYmYzMzJjY2E1MGY2MzQyYWVmYTJjMTJmNDZlYWEwOGVhZGQ4NmJmNDFjYjI2YjU2OGQ3MDljNkBncm91cC5jYWxlbmRhci5nb29nbGUuY29t
GermEval Shared Task 1: GERMS-DETECT Workshop, 10. September 2024
Time | Session/Event | Location |
---|---|---|
10. Sept, 12:00-16:00 | Paper presentations and discussions | Radetzkystraße 2, 1030 Vienna |
10. Sept, 16:30-17:30 | Panel discussion | Radetzkystraße 2, 1030 Vienna |
Please find the entire program of the workshop here.
The paper presentations and discussions are followed by a panel discussion with representatives from the media, law and ethics on "The role of AI in media production".
Please find details on the panel discussion here in English and in German.
Please bring your passport or identity card to get access to the event space (workshop and panel discussion)!
Main conference
All abstracts for the main conference can be found here.
Proceedings (pdf)
Please go to this Google calendar for a detailed schedule!
11. September 2024
Time | Session/Event |
---|---|
11. Sept., 8:30-9:30 | Registration |
11. Sept., 9:30-10:00 | Opening |
11. Sept., 10:00-11:00 | Keynote 1, Leonie Weißweiler |
11. Sept., 11:00-12:00 | Poster 1 + Coffee |
11. Sept., 12:00-14:00 | Break |
11. Sept., 14:00-15:00 | Keynote 2, Sebastian Schuster: What does it mean for a language model to exhibit a language understanding ability? |
11. Sept., 15:00-16:00 | Poster 2 + Coffee |
11. Sept., 16:00-17:00 | Oral 1 |
11. Sept., 17:00-19:00 | Social event** |
11. Sept., 19:00-21:00 | Dinner |
** Social event: organized bus trip to the Cobenzl viewing platform above Vienna's vineyards, followed by a short walk (30 mins) downhill through the vineyards to the dinner location. Alternatively, if you do not wish to join the walk, you can join only the trip to the viewing platform and take the private bus down to the dinner location.
12. September 2024
Time | Session/Event |
---|---|
12. Sept., 8:30-9:30 | Registration |
12. Sept., 9:30-10:30 | Keynote 3, Jana Diesner |
12. Sept., 10:30-11:00 | Coffee |
12. Sept., 11:00-12:00 | Oral 2 |
12. Sept., 12:00-14:00 | Break |
12. Sept., 14:00-15:00 | Oral 3 |
12. Sept., 15:00-16:00 | Poster 3 + Coffee |
12. Sept., 16:00-17:00 | GSCL PhD Award |
12. Sept., 17:00-17:30 | Closing session |
12. Sept., 17:30 | GSCL Members Meeting |
Oral presentations & posters*
Paper number | Title | Authors | Presentation slot |
---|---|---|---|
3 | CO-Fun: A German Dataset on Company Outsourcing in Fund Prospectuses for Named Entity Recognition and Relation Extraction | Neda Foroutan, Markus Schröder, Andreas Dengel | Poster 2 |
8 | Lex2Sent: A bagging approach to unsupervised sentiment analysis | Kai-Robin Lange, Jonas Rieger, Carsten Jentsch | Oral 3 |
9 | Discourse-Level Features in Spoken and Written Communication | Hannah J. Seemann, Sara Shahmohammadi, Manfred Stede, Tatjana Scheffler | Oral 3 |
10 | Semiautomatic Data Generation for Academic Named Entity Recognition in German Text Corpora | Pia Schwarz | Poster 2 |
11 | GERestaurant: A German Dataset of Annotated Restaurant Reviews for Aspect-Based Sentiment Analysis | Nils-Constantin Hellwig, Jakob Fehle, Markus Bink, Christian Wolff | Poster 2 |
12 | Revisiting the Phenomenon of Syntactic Complexity Convergence on German Dialogue Data | Yu Wang, Hendrik Buschmeier | Poster 1 |
13 | OMoS-QA: A Dataset for Cross-lingual Extractive Question Answering in a German Migration Context | Steffen Kleinle, Jakob Prange, Annemarie Friedrich | Oral 2 |
16 | Few-Shot Prompting for Subject Indexing of German Medical Book Titles | Lisa Kluge, Maximilian Kähler | Poster 2 |
19 | Querying Repetitions in Spoken Language Corpora | Elena Frick, Henrike Helmer, Dolores Lemmenmeier-Batinić | Poster 3 |
21 | A comparison of data filtering techniques for English-Polish neural machine translation in the biomedical domain | Jorge del Pozo Lérida, Kamil Kojs, Janos Mate, Mikołaj Antoni Barański, Christian Hardmeier | Poster 3 |
22 | Linguistic and extralinguistic factors in automatic speech recognition of German atypical speech | Eugenia Rykova, Mathias Walther | Poster 3 |
23 | An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry | Stephen Meisenbacher, Tim Schopf, Weixin Yan, Patrick Holl, Florian Matthes | Poster 2 |
29 | A Multilingual Dataset of Adversarial Attacks to Automatic Content Scoring Systems | Ronja Laarmann-Quante, Christopher Chandler, Noemi Incirkus, Vitaliia Ruban, Alona Solopov, Luca Steen | Poster 3 |
30 | Version Control for Speech Corpora | Vlad Dumitru, Matthias Boehm, Martin Hagmüller, Barbara Schuppler | Poster 3 |
31 | Tabular JSON: A Proposal for a Pragmatic Linguistic Data Format | Adam Roussel | Poster 2 |
37 | Redundancy Aware Multiple Reference Based Gainwise Evaluation of Extractive Summarization | Mousumi Akter, Shubhra Kanti Karmaker Santu | Oral 1 |
38 | Complexity of German Texts Written by Primary School Children | Jammila Laâguidi, Dana Neumann, Ronja Laarmann-Quante, Stefanie Dipper, Mihail Chifligarov | Poster 1 |
42 | Using GermaNet for the Generation of Crossword Puzzles | Claus Zinn, Marie Hinrichs, Erhard Hinrichs | Poster 1 |
46 | A Crosslingual Approach to Dependency Parsing for Middle High German | Cora Haiber | Poster 1 |
47 | Discourse Parsing for German with new RST Corpora | Sara Shahmohammadi, Manfred Stede | Poster 3 |
48 | Fine-grained quotation detection and attribution in German news articles | Fynn Petersen-Frey, Chris Biemann | Oral 1 |
50 | AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection | Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt | Poster 3 |
51 | OneLove beyond the field - A few-shot pipeline for topic and sentiment analysis during the FIFA World Cup in Qatar | Christoph Rauchegger, Sonja Mei Wang, Pieter Delobelle | Poster 3 |
53 | How to Translate SQuAD to German? A Comparative Study of Answer Span Retrieval Methods for Question Answering Dataset Creation | Jens Kaiser, Agnieszka Falenska | Poster 2 |
55 | LLM-based Translation Across 500 Years. The Case for Early New High German | Martin Volk, Dominic P. Fischer, Patricia Scheurer, Raphael Schwitter, Phillip Benjamin Ströbel | Poster 3 |
56 | Binary indexes for optimising corpus queries | Peter Ljunglöf, Nicholas Smallbone, Mijo Thoresson, Victor Salomonsson | Poster 2 |
58 | Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems | Moncef Benaicha, David Thulke, Mehmet Ali Tuğtekin Turan | Poster 2 |
60 | Evaluating and Fine-Tuning Retrieval-Augmented Language Models to Generate Text with Accurate Citations | Vinzent Penzkofer, Timo Baumann | Poster 1 |
62 | Decoding 16th-Century Letters: From Topic Models to GPT-Based Keyword Mapping | Phillip Benjamin Ströbel, Stefan Aderhold, Ramona Roller | Oral 1 |
63 | Towards Improving ASR Outputs of Spontaneous Speech with LLMs | Karner Manuel, Julian Linke, Mark Kröll, Barbara Schuppler, Bernhard C Geiger | Poster 3 |
64 | Exploring Data Acquisition Strategies for the Domain Adaptation of QA Models | Maurice Falk, Adrian Ulges, Dirk Krechel | Poster 2 |
66 | Estimating Word Concreteness from Contextualized Embeddings | Christian Wartena | Poster 1 |
67 | Features and Detectability of German Texts Generated with Large Language Models | Verena Irrgang, Veronika Solopova, Steffen Zeiler, Robert M. Nickel, Dorothea Kolossa | Oral 3 |
68 | Exploring Automatic Text Simplification for Lithuanian | Justina Mandravickaitė, Egle Rimkiene, Danguolė Kalinauskaitė, Danguolė Kotryna Kapkan | Poster 1 |
69 | Large Language Models as Evaluators for Scientific Synthesis | Julia Evans, Jennifer D'Souza, Sören Auer | Poster 1 |
71 | Role-Playing LLMs in Professional Communication Training: The Case of Investigative Interviews with Children | Don Tuggener, Teresa Schneider, Ariana Huwiler, Tobias Kreienbühl, Simon Hischier, Pius von Däniken, Susanna Niehaus | Oral 2 |
72 | Analysing Effects of Inducing Gender Bias in Language Models | Stephanie Gross, Brigitte Krenn, Craig Lincoln, Lena Holzwarth | Oral 2 |
73 | Exploring Phonetic Features in Language Embeddings for Unseen Language Varieties of Austrian German | Lorenz Gutscher, Michael Pucher | Poster 3 |
76 | Word alignment in Discourse Representation Structure parsing | Christian Obereder, Gabor Recski | Poster 1 |
*All abstracts for the main conference can be found here.
Workshops / Shared Tasks, 13. September 2024 (preliminary)
Time | Event | Location |
---|---|---|
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30) | GermEval Shared Task 2: Statement in German Easy Language (StaGE) | SR 3 |
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30) | Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO) | SR 4 |
13. Sept, 9:00-18:00 (coffee breaks: 11:00-11:30, 15:00-15:30) | Workshop on Computational Linguistics for Political Text Analysis (CPSS) | Lecture Hall 1 (HS1) |
Accepted Workshops and Shared Tasks
GermEval Shared Task 1: Sexism Detection in German Online News Fora (GERMS-DETECT); planned for 10. September 2024; see https://ofai.github.io/GermEval2024-GerMS/ for more information!
GermEval Shared Task 2: Statement in German Easy Language (StaGE); planned for 13. September 2024; see https://german-easy-to-read.github.io/statements/ for more information!
Workshop on Computational Linguistics for Political Text Analysis (CPSS); planned for 13. September 2024; see https://sites.google.com/view/cpss2024konvens/home-page for more information!
Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO); planned for 13. September 2024; see https://sites.google.com/view/limo-2024 for more information!
Keynotes
Keynote 1: Leonie Weissweiler, UT Austin
Constructions all the way down: rethinking compositionality in LLMs
Abstract: Why are LLMs still not modelling all aspects of language perfectly? Previous work has suggested that the reason is their deficits in compositionality, that is, regularly building the meaning of an expression as a function of its parts. But in fact, human language is not compositional in this way. Rather, meaning is combined compositionally using constructions, which are pairings of form and function that vary wildly in shape and scope. This means that to achieve the full creativity and flexibility of human language, LLMs will have to assign meaning to constructions and use this to build the meaning of expressions. I will show that this is still not adequately handled by LLMs, and elaborate why construction-compositionality is one of the last remaining challenges that we must solve on our way to more cognitively plausible language models.
Short Bio: Leonie Weissweiler is a postdoc at UT Austin Linguistics where she works with Kyle Mahowald on the computational learnability of rare linguistic phenomena. She received her PhD from LMU Munich in July 2024, where she worked with Hinrich Schütze on the contributions of Construction Grammar and Morphology to NLP, and vice versa. Her research now focuses on using language models to discover and test hypotheses in Linguistics, while using insights from Linguistics to point out issues with language models.
Keynote 2: Sebastian Schuster, University College London
What does it mean for a language model to exhibit a language understanding ability?
Abstract: Large language models (LLMs) such as GPTs, Gemini or Llama often provide answers that fulfil user requests, which suggests that the model is at least to a large extent able to infer the user’s intent and to generate appropriate responses. However, given the open-ended nature of user requests and model responses, it has been quite challenging to systematically evaluate to what extent models exhibit specific language understanding abilities. In my talk, I will focus on one such ability, namely keeping track of how the states of entities change as a discourse unfolds. I will use this ability as a case study for how different evaluation methods can lead to different conclusions about model abilities. I will also discuss challenges in evaluating understanding abilities in LLMs and consider some recommendations on how to overcome some of these challenges.
Short Bio: Sebastian Schuster is currently a lecturer in computational linguistics at University College London, and he will start a WWTF-funded research group at the University of Vienna in mid-2025. Before joining UCL, he was a postdoc at New York University and at Saarland University, after completing his PhD at Stanford University. His research focuses on computational semantics and pragmatics and he builds and evaluates computational models of interpreting language in context. His work has won awards at ACL and he has been a senior area chair and program chair at several *ACL conferences and workshops.
Keynote 3: Jana Diesner, Technical University of Munich
Using Natural Language Processing to Advance Social Science, Responsibly
Abstract: Leveraging natural language processing techniques to consider the content of information at scale allows us to discover and re-evaluate theories and patterns of societal behavior. This process requires researchers to make a multitude of decisions that require expertise from multiple fields, including how to sample, represent, and preprocess data, implement algorithms, and validate results. I present findings and lessons learned from using NLP techniques, especially entity disambiguation and relation extraction, to study how and why people collaborate and respond to crises. I discuss sources of biases and strategies for mitigating them.
Short Bio: Jana Diesner is a Full Professor at the Technical University of Munich, School of Social Science and Technology. There, she leads the Human Centered Computing group. Her interdisciplinary group works on methods from network analysis, natural language processing, machine learning and AI, and integrates them with theories from the social sciences to advance our knowledge about complex societal systems and responsible computing. Before joining TU Munich in 2024, she was a tenured professor at the School of Information Sciences at the University of Illinois Urbana-Champaign. Jana earned her Ph.D. at Carnegie Mellon, School of Computer Science.