2024:Program/Wikipedia's Role in Preserving Minority Languages through Open Technology

Session title: Wikipedia's Role in Preserving Minority Languages through Open Technology

Session type: Lecture
Track: Technology
Language: en

The preservation and promotion of minority languages have become crucial as the number of endangered languages continues to grow. Open communities such as Wikipedia are essential to preserving linguistic diversity through collective wisdom. Language technology can play a significant role in supporting these languages. In this session, we will bridge the gap between Wikimedians and language technology research by exploring how Wikipedia is used in current cutting-edge approaches and by focusing on open-source initiatives, as opposed to proprietary systems. We hope that bringing the two together will lead to a Wikipedia powered entirely by open-source technology.

Description

In a world where the digital divide is not only about access to technology but also about representation within it, minority languages face the threat of digital extinction. The preservation and promotion of minority languages have become crucial as the number of endangered languages continues to grow. Open communities such as Wikipedia are essential to preserving linguistic diversity through collective wisdom.

Furthermore, language technology can play a significant role in supporting these languages. By developing tools for under-resourced languages, we prolong their life in the digital era by enabling speakers to actively use them online. Machine Translation (MT), in particular, has the potential to greatly benefit these languages by increasing access to information and resources, as well as by facilitating communication between speakers. In fact, MT is a central tool within the Wikiverse.

In this session, we will bridge the gap between Wikimedians and language technology research by exploring how Wikipedia is used in current state-of-the-art approaches and by focusing on open-source initiatives, as opposed to proprietary systems. We hope that bringing the two together will lead to a Wikipedia powered entirely by open-source technology.

Session recording: https://www.youtube.com/watch?v=BbGrkYK8FEk&list=PLhV3K_DS5YfJ1xyY0LNDNX3RKyRQEXOdB&t=8587

How does your session relate to the event theme, Collaboration of the Open?

The session directly aligns with the event theme of "Collaboration of the Open" by highlighting the collaborative efforts between open communities such as Wikipedia and language technology researchers to promote minority languages, ultimately contributing to the greater good of preserving linguistic diversity.

What is the experience level needed for the audience for your session?

Everyone can participate in this session

Etherpad link

https://etherpad.wikimedia.org/p/WM2024_Day3_Ochrid_-_Room_9

Resources

Speakers

  • Ona de Gibert
I am a PhD student in Machine Translation at the University of Helsinki, where I contribute to the development of language technologies for minority languages. I graduated in Modern Languages and Literatures (University of Barcelona) and hold a master's degree in Language Analysis and Processing (UPV-EHU).
I am a language and scientific activist and believe in the power of the collective to transform society. I am a member of the Wikipedia community in the Catalan language and would like to develop open-source systems for Wikipedia to achieve technological sovereignty.