Programme


Please note that this programme may be subject to minor changes.

09:30 — Welcome coffee


10:00 — Opening session of the colloquium by Yannick Toussaint (deputy director of Loria) and Jean-Marie Pierrel (former director of the Loria and ATILF laboratories)


10:30 — Invited talk: Bonnie Webber, University of Edinburgh, recipient of the ACL Lifetime Achievement Award (2020) [Slides]

    Title: Supporting Further Advances in Discourse-based Sentence Splitting

    Abstract: In recent work, Claire, together with her student Liam Cripwell and colleague Joël Legrand, explored sentence splitting (mapping a complex sentence into a sequence of simpler sentences) from the dual perspectives of sentence-level syntax and discourse [Cripwell et al., 2021; 2022]. I found the work particularly interesting, and have started speculating on whether the effort could be taken further by taking account of properties of version 3 of the Penn Discourse TreeBank (PDTB 3.0), which annotates several thousand more instances of intra-sentential discourse relations, many modified forms of discourse connectives, and cases where two discourse spans (sentences or clauses) have both an explicitly marked relation between them and one that has been left unmarked.

Liam Cripwell, Joël Legrand, and Claire Gardent (2021). Discourse-Based Sentence Splitting. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 261–273.

Liam Cripwell, Joël Legrand, and Claire Gardent (2022). Controllable Sentence Simplification via Operation Classification. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 2091–2103.


11:30 — Invited talk: Marc Dymetman, NAVER Labs, Grenoble [Slides]

    Title: Controlling the Quality of Large Language Models: a Distributional Approach

    Abstract: I will cover a line of work and collaborations, started a few years ago at NAVER Labs, in which one augments a standard neural language model with constraints over the generative distribution. These constraints help account for aspects of the training data that may be missed by these models (the descriptive dimension), but also make it possible to introduce normative criteria (the prescriptive dimension) that control for biases, offensiveness, or other deficiencies of the standard training process.
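
    A minimal sketch of the distributional idea (an illustration under assumed names, not the speaker's actual implementation): if the target distribution is the base model tilted by an exponential term over constraint features, one can approximate it by sampling from the base model and importance-reweighting. The names sample_lm, phi, and lam below are hypothetical placeholders.

        import math
        import random

        def constrained_samples(sample_lm, phi, lam, n=1000):
            """Sampling-importance-resampling from p(x) ∝ lm(x) · exp(lam · phi(x)).

            sample_lm(): draws one sequence from the standard language model
            phi(x):      constraint features of a sequence x, e.g. [1.0] if x
                         is judged inoffensive, else [0.0]
            lam:         weights tilting the LM toward the desired feature values
            """
            xs = [sample_lm() for _ in range(n)]
            # The proposal is the base LM itself, so its probability cancels in
            # the importance ratio, leaving only the exponential constraint term.
            ws = [math.exp(sum(l * f for l, f in zip(lam, phi(x)))) for x in xs]
            return random.choices(xs, weights=ws, k=n)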


12:30 — Lunch


14:00 — Invited talk: Shashi Narayan, Google Brain, London [Slides]

    Title: Conditional Generation with a Question-Answering Blueprint

    Abstract: The ability to convey relevant and faithful information is critical for many tasks in conditional generation, yet it remains elusive for neural seq-to-seq models, whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. We propose a new conceptualization of text plans as a sequence of question-answer (QA) pairs and enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for content selection (i.e., what to say) and planning (i.e., in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models that vary in how they incorporate the blueprint into the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives that do not resort to planning and allow tighter control of the generation output.
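
    As a rough sketch of the data representation described in the abstract (the field names and separator tokens are assumptions, not the paper's actual format), each training example becomes an input-blueprint-output tuple whose blueprint is an ordered list of QA pairs:

        from dataclasses import dataclass

        @dataclass
        class BlueprintExample:
            """One input-blueprint-output tuple: the blueprint is an ordered
            list of question-answer pairs acting as a proxy for content
            selection (what to say) and planning (in what order)."""
            source: str                        # input document
            blueprint: list[tuple[str, str]]   # (question, answer) pairs, in order
            target: str                        # reference output, e.g. a summary

        def linearize(ex: BlueprintExample) -> str:
            """Flatten the blueprint and the target into a single decoder
            string, so a standard seq-to-seq model generates its plan before
            the text. The separator tokens here are illustrative."""
            plan = " ".join(f"Q: {q} A: {a}" for q, a in ex.blueprint)
            return f"{plan} [SUMMARY] {ex.target}"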


15:00 — Invited talk: Benoit Crabbé, Université Paris Cité, honorary member of the Institut Universitaire de France (2014) [Slides]

    Title: The promise of language models for language sciences? Let's chat!

    Abstract: The field of computational linguistics is currently going through a paradigm shift: large language models are now ubiquitous, with ChatGPT creating the latest buzz. If you ask ChatGPT about its promise for the future of language sciences, you get the somewhat confident reply: "Large language models like myself hold great promise for the field of linguistics. They offer improved language understanding, access to vast amounts of data, automatic language analysis, and the ability to test linguistic theories. These tools can help linguists to gain new insights into how language works, identify patterns in language usage, and refine their linguistic theories." In this talk I will put into perspective two key modeling directions in computational linguistics: modeling language structure and modeling language in relation to world knowledge, and I will explain how these directions eventually led to current language models. I will show that, given what they are, current language models achieve sometimes surprising results with respect to the modeling of language structure, and highlight some potential research perspectives in the language sciences as well as some current limitations.


16:00 — Coffee break


16:30 — Invited talk: Mark Steedman, University of Edinburgh, recipient of the ACL Lifetime Achievement Award (2018) [Slides]

    Title: Inference in the Time of GPT ☆

    Abstract: Large pretrained language models (LLMs) such as GPT-3 have upended NLP, calling into question many established methods. In particular, they have been claimed to be capable of logical inference when fine-tuned on entailment datasets, or when prompted with small numbers of examples of inferential tasks. The talk will review and assess these claims, and propose that we should not give up on alternative methods.

☆ With apologies to Gabriel García Márquez.


17:30 — Closing session by Christian Rétoré (Université de Montpellier / LIRMM)


18:00 — Group photo and cocktail