Language Magazine is a monthly print and online publication that provides cutting-edge information for language learners, educators, and professionals around the world.

Swedish AI Models Preserve 500 Years of History

Athina Kontos reports on Sweden's pioneering use of AI to analyze text

January 24, 2023

The National Library of Sweden is harnessing AI technology developed by NVIDIA to preserve almost half a millennium of literature in digital form.

The library, renowned for archiving ancient and modern Swedish literature, is now working on converting millions of documents into accessible digital assets. The project will benefit researchers in the humanities, linguistics, history, and media studies, and will also play a principal role in preserving and showcasing medieval manuscripts.

Swedish law requires that a copy of everything officially published in Swedish be submitted to the National Library of Sweden (Kungliga Biblioteket) for public record. This includes state documentation, journals, books, plays, internet content, menus, all TV, film, and radio media, and even video games. This enormous body of data, 26 petabytes in total, has provided a wealth of material for NVIDIA DGX systems and everything needed for a comprehensive Swedish-language training program for AI models.

Researchers are currently developing more than 24 open-source transformer models to enable research at the library building in Humlegården, Stockholm, and at other academic institutions around the country.

In 2019, Kungliga Biblioteket (KB) established a department called KBLab. Researchers began by experimenting on just 5GB of Swedish-language text, drawing inspiration from early language-processing models created by Google. Soon after, the lab began testing AI training methods on an international data set of Dutch, German, and Norwegian text. This work continues efforts toward building larger models for international language research and content translation.

As results grew more positive, researchers at KBLab began to focus more on their own body of Swedish-language data and on upgrading their systems.

The current DGX models are effective in helping researchers create specialized data sets to understand the specific context and nature of every piece of Swedish-language content. From postcards to blog posts, videos, and social media, this technology will also enable language analysts to review how written and spoken Swedish has evolved over time, what has influenced it socially, and how it differs from other European languages.

In addition to the transformer models, KBLab is working on an AI sound-transcription tool to create a written record of existing digital media.

Partnering with the University of Gothenburg, KBLab has also announced an upcoming project to support the Swedish Academy’s work to modernize data-driven techniques for creating Swedish-language dictionaries.
