
WebNLG Challenge 2023

Info

The challenge has started. The deadline for the submission of system outputs is June 15th, 2023.

The new edition of WebNLG focuses on four under-resourced languages that are severely under-represented in research on text generation: Maltese, Irish, Breton and Welsh. In addition, WebNLG 2023 will once again include Russian, which was first featured in WebNLG 2020.

Task

The challenge focuses on RDF-to-text generation, similar to WebNLG 2017 but targeting Breton, Irish, Maltese, Welsh, and Russian.

Given the four RDF triples shown in (a), the aim is to generate a text such as (b) or (c).

Example

(a) Set of RDF triples

<entry category="Company" eid="Id21" shape="(X (X) (X) (X) (X))" shape_type="sibling" size="4">
    <modifiedtripleset>
        <mtriple>Trane | foundingDate | 1913-01-01</mtriple>
        <mtriple>Trane | location | Ireland</mtriple>
        <mtriple>Trane | foundationPlace | La_Crosse,_Wisconsin</mtriple>
        <mtriple>Trane | numberOfEmployees | 29000</mtriple>
    </modifiedtripleset>
</entry>

(b) English text

Trane, which was founded on January 1st 1913 in La Crosse, Wisconsin, is based in Ireland. It has 29,000 employees.

(c) Russian text

Компания "Тране", основанная 1 января 1913 года в Ла-Кроссе в штате Висконсин, находится в Ирландии. В компании работают 29 тысяч человек.
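For processing, each `<entry>` in the WebNLG XML can be turned into (subject, predicate, object) tuples with standard XML tooling. A minimal Python sketch over the entry shown in (a):

```python
import xml.etree.ElementTree as ET

# The <entry> element from example (a); tag and attribute names
# follow the WebNLG XML format shown above.
entry_xml = """
<entry category="Company" eid="Id21" shape="(X (X) (X) (X) (X))" shape_type="sibling" size="4">
    <modifiedtripleset>
        <mtriple>Trane | foundingDate | 1913-01-01</mtriple>
        <mtriple>Trane | location | Ireland</mtriple>
        <mtriple>Trane | foundationPlace | La_Crosse,_Wisconsin</mtriple>
        <mtriple>Trane | numberOfEmployees | 29000</mtriple>
    </modifiedtripleset>
</entry>
"""

def parse_triples(xml_string):
    """Return (subject, predicate, object) tuples from one <entry>."""
    entry = ET.fromstring(xml_string)
    triples = []
    for mtriple in entry.iter("mtriple"):
        subj, pred, obj = (part.strip() for part in mtriple.text.split(" | "))
        triples.append((subj, pred, obj))
    return triples

for triple in parse_triples(entry_xml):
    print(triple)
```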

Data

The WebNLG 2023 dataset for training comprises 1,399 data-text pairs for Breton and 1,665 for Welsh, Irish and Maltese. The Russian data includes all data made available for the WebNLG 2020 Challenge.

See corpus documentation for the WebNLG format.

Important Dates

  • 24 February 2023: Release of noisy training data, gold development data, and evaluation scripts.
  • 8 June 2023: Release of test data.
  • 15 June 2023: Deadline for submission of system outputs.
  • 30 June 2023: Release of automatic evaluation results to participants.
  • 15 August 2023: Deadline for submission of short papers describing systems.

The final presentation of results will be held during a workshop. Current plans are to hold this in September 2023.

Contacts

webnlg-challenge@inria.fr

Organising Committee

  • Enrico Aquilina, University of Malta
  • Anya Belz, Dublin City University, Ireland
  • Claudia Borg, University of Malta, Malta
  • Liam Cripwell, CNRS/LORIA and Lorraine University, France
  • Claire Gardent, CNRS/LORIA, France
  • Albert Gatt, Utrecht University, The Netherlands
  • John Judge, Dublin City University, Ireland
  • William Soto-Martinez, CNRS/LORIA and Lorraine University, France

Evaluation

System outputs are assessed with automatic and human evaluation.

Automatic Evaluation

Generation is evaluated with automatic metrics: BLEU, METEOR, chrF++, TER, and BERT-Score. The evaluation scripts can be found here.
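As an illustration of how one of these metrics works, chrF is a character n-gram F-score averaged over n-gram orders. A simplified pure-Python sketch of the idea (the official chrF++ script additionally scores word n-grams and handles tokenisation details):

```python
from collections import Counter

def char_ngrams(text, n):
    """Counter of character n-grams (runs of whitespace collapsed to one space)."""
    text = " ".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: mean character n-gram F_beta over n = 1..max_n.

    Illustration only; not the official implementation used for the
    challenge evaluation.
    """
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())          # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta ** 2) * precision * recall
                      / (beta ** 2 * precision + recall))
    return sum(scores) / len(scores) if scores else 0.0
```

With beta=2, recall is weighted twice as heavily as precision, rewarding outputs that cover the reference content.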

Human Evaluation

System outputs are assessed according to criteria such as grammaticality/correctness, appropriateness/adequacy, fluency/naturalness, etc., by native speakers.

Submission Format

Your submission file must be a .txt file (UTF-8 encoding) where each text is true-cased and detokenised. Example for English.

Each line should correspond to the verbalisation of a DBpedia triple set: line 1 should contain the verbalisation of the triple set with ID=1, line 2 that of the triple set with ID=2, and so on.
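A file in this format can be produced with a few lines of code. A minimal sketch, where `outputs` is a hypothetical mapping from triple-set ID to the generated text:

```python
# Hypothetical system outputs keyed by triple-set ID (order in the
# dict does not matter; the file must be ordered by ID).
outputs = {
    2: "Trane has 29,000 employees.",
    1: "Trane, which was founded on January 1st 1913 in La Crosse, Wisconsin, is based in Ireland.",
}

# One detokenised, true-cased text per line, UTF-8 encoded.
with open("submission.txt", "w", encoding="utf-8") as f:
    for triple_set_id in sorted(outputs):   # line i holds the verbalisation of ID=i
        f.write(outputs[triple_set_id] + "\n")
```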

Participant FAQ

Which resources are allowed?

There are no restrictions for any task. For example, you may use a pre-trained language model, external corpora, etc.

Can I submit multiple outputs?

Yes, provided that they stem from substantially different systems. However, for human assessment we may ask you to nominate a primary system to be evaluated.

Can I participate for one language only?

Yes. You can participate only in, say, RDF-to-text generation for Breton.

Can I download the data without participating in the challenge?

Yes.

Will it be possible to withdraw my results if my team's performance is unsatisfactory?

Yes. We will first announce the results to participants anonymously, and you will have an opportunity to withdraw your results.

Workshop

The challenge results will be presented at the MM-NLG 2023 workshop to take place at INLG 2023 on September 12th, 2023 in Prague.