Language Technology

Some of the technology behind language revitalization

This summer I set out on what I thought was an admirable and achievable goal: to raise the profile of an endangered language. In this article I’m going to explain the software, programs, and websites that I tried to use to do this. Whether I reach my goal remains to be seen, but hopefully I can at least succeed in explaining these programs to you in the present. The language I set out to promote was Faetar, which has fewer than 1,000 speakers; a large proportion of the community emigrated from southern Italy to Canada following the Second World War. My purpose in raising awareness of Faetar was twofold: it is a minority language here in Toronto that is at risk of disappearing, and, somewhat selfishly, I wanted to learn how to use these programs so that I can contribute to the documentation and revitalization of other languages in the future. The increased interconnectivity and use of technology in the Western world is often seen as a huge challenge for endangered languages, but I strongly believe those factors can be used to preserve and promote these languages in a way that was impossible just ten years ago.

Wikipedia: The first stop on any endangered-language outreach project is to get a Wikipedia page published. That is quite a remarkable sentence, but in today’s world, if it doesn’t exist in Wikipedia, it doesn’t exist. Writing a Wikipedia article is much easier than I had thought it was going to be. One simply makes an account, goes to the profile’s “Sandbox”, and writes an article. The part that is much more difficult for a layperson, which I was and am, is the coding and formatting necessary for a good Wikipedia article. I don’t know how to code; however, I managed to make sections and subsections on my page, and I even built one of those nifty tables found on any language’s page that shows the “family tree” from which the language is descended. This was not a short process, but the help sections and forums on Wikipedia made it much more manageable. Once the page is ready, and you have seen the preview and like it, you submit the article for review. Now, Wikipedia is a pretty popular website, so when I submitted my article I was told there were 2,637 articles ahead of mine awaiting review. If you are simply improving and expanding a page that already exists, this is much quicker, but a new page takes time to be reviewed. The reward for your patience is that, in the opinion of the internet, your language, or cause, exists because it exists in Wikipedia.

Memrise, Quizlet, Anki, etc.: These are websites that I think anyone can, and should, use. These websites, and others like them, are a modern take on flashcard-style learning, except the “cards” are digital. This makes a pack of “digital flashcards” immediately available to all, portable, and much more customizable than paper cards. The advantages are many. First, you can find just about any subject you’d like, from Doric (a variety of Scots), to Cree syllabics (the writing system used instead of an alphabet), to the flags of the world (not language-y, I know, but still a fun thing to know). All of these websites simply require you to make an account, or link to your Facebook account; you can then choose your courses, and your progress is tracked wherever you log in. Second, with a simple spreadsheet of words and translations, you can upload and create your own course on any subject or language you’d like. Creating a course is very easy. If you create a course in your desired language, anyone on earth can take it up and decide to learn the vocabulary, sentences, or expressions you’ve put into the system. Quizlet and Anki are more basic than Memrise, but with all three you can include audio of the words or phrases you want to learn. You can also include photos, so that when attempting to learn the Cree word for “tree” you can hear the word pronounced, see a picture of a tree, and see how it is written.
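For anyone curious what that “simple spreadsheet” step can look like behind the scenes, here is a minimal Python sketch. It assumes the wordlist has been saved as a CSV file with columns named `word` and `translation` (hypothetical names), and it writes one tab-separated line per card, a plain-text shape that flashcard tools commonly accept for import. The sample rows are placeholders, not real vocabulary, and each site’s own import screen should be checked for the exact format it expects.

```python
import csv
import io

# Hypothetical sample data standing in for a real wordlist spreadsheet.
SAMPLE_CSV = """word,translation
word_one,first meaning
word_two,second meaning
"""


def wordlist_to_flashcards(csv_text, front="word", back="translation"):
    """Turn CSV wordlist text into tab-separated flashcard lines.

    The column names `front` and `back` are assumptions about how the
    spreadsheet is laid out; rows missing either value are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        term = (row.get(front) or "").strip()
        meaning = (row.get(back) or "").strip()
        if not term or not meaning:
            continue  # an incomplete row would make a useless card
        lines.append(f"{term}\t{meaning}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(wordlist_to_flashcards(SAMPLE_CSV))
```

The output can be pasted or saved to a text file and offered to the flashcard site’s import feature; audio and pictures would still need to be attached through the site itself.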

FLEX, Webonary: This is getting into more technical territory, but I’ll do my best to make it clear. FLEX is a program for creating a dictionary of a language while conducting fieldwork. It allows words to be added all at once or one at a time, and the user can add as much information about each word as desired. My goal this summer was to turn the wordlist I had created as a spreadsheet for Faetar into a very basic dictionary, to make it more accessible. To accomplish this, I set out to create a Webonary, which allows you to host a website of your own for an endangered language. I applied for a page for the Faetar language, and within three days I had been given the login information for my own website hosted by Webonary. It is very easy to customize and set up the Webonary page; however, words have to be uploaded through the FLEX software. FLEX allows an enormous amount of detail to be added for a word. For example, in FLEX you can include whether a word is a noun, a sample phrase using the word, its pronunciation, a brief translation, and much more. To keep all of those sections of a word’s entry organized, Webonary asks that the words be added from FLEX. The end result is that the endangered language now has its own website, and the user can customize the pages of that website to contain any information they’d like: an introduction to the dictionary, links to other websites related to the language, and anything else. None of this customization requires formatting or coding; it uses a click-and-type system that makes things very easy for the user.
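To give a feel for how a spreadsheet row becomes a structured dictionary entry, here is a rough Python sketch. It assumes the wordlist is exported as a plain-text “standard format” file using MDF-style backslash markers (`\lx` for the headword, `\ps` for part of speech, `\ge` for an English gloss, `\xv` and `\xe` for an example and its translation), which is one route dictionary tools of this kind commonly accept for bulk import. This is not necessarily how my own import was done, and the dictionary keys (`word`, `part_of_speech`, and so on) are hypothetical column names.

```python
# Mapping from MDF-style markers to hypothetical spreadsheet column names.
MARKERS = [
    ("lx", "word"),                 # headword
    ("ps", "part_of_speech"),       # e.g. n, v
    ("ge", "gloss"),                # brief English translation
    ("xv", "example"),              # sample phrase in the language
    ("xe", "example_translation"),  # English translation of the sample
]


def entry_to_sfm(entry):
    """Format one wordlist row (a dict) as backslash-marked lines."""
    lines = []
    for marker, key in MARKERS:
        value = (entry.get(key) or "").strip()
        if value:  # leave out fields the row does not have
            lines.append(f"\\{marker} {value}")
    return "\n".join(lines)


def wordlist_to_sfm(entries):
    """Join entries with blank lines, skipping rows that lack a headword."""
    kept = [e for e in entries if (e.get("word") or "").strip()]
    return "\n\n".join(entry_to_sfm(e) for e in kept)


if __name__ == "__main__":
    # Placeholder rows, not real vocabulary.
    rows = [
        {"word": "lemma_one", "part_of_speech": "n", "gloss": "tree"},
        {"word": "lemma_two", "gloss": "house"},
    ]
    print(wordlist_to_sfm(rows))
```

Keeping each kind of information under its own marker is what lets the dictionary software, and in turn the website, show the part of speech, gloss, and example in the right places.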

Mukurtu and other archiving websites: Mukurtu, and websites like it, allow languages and communities to archive culturally significant and sensitive information. Mukurtu is still in beta testing, but in using it this summer, I found it effective and easy to use. It allows users to upload anything from songs, to pictures, to articles, to recordings of interviews, onto a language subpage. What makes Mukurtu different from many other community-archiving websites is that a whole series of pages, an individual page, or even an individual song can be set to different levels of security and protection. This is based on ideas of who traditionally can and should have access to which pieces of cultural information. Depending on the culture and language in question, you can specify that some songs are for women only, or elders only, or should not be heard when one is using the website away from the community’s land. For example, there was a song recorded in an area of Alaska, and the singer, before beginning her song, said “this song should not be heard outside of this community”. With Mukurtu, those wishes can be respected, and different pages can be accessible to different people.
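The idea of different access levels can be pictured with a small Python sketch. This is purely an illustration of the concept and not Mukurtu’s actual code or settings: the `Visitor` and `Item` classes, the group names, and the `community_land_only` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visitor:
    """A hypothetical website user."""
    name: str
    groups: frozenset = frozenset()   # e.g. {"elders"}
    on_community_land: bool = False


@dataclass(frozen=True)
class Item:
    """A hypothetical archived song, picture, or recording."""
    title: str
    allowed_groups: frozenset = frozenset()  # empty means open to all groups
    community_land_only: bool = False


def can_access(visitor, item):
    """Return True only if every restriction on the item is satisfied."""
    if item.community_land_only and not visitor.on_community_land:
        return False
    if item.allowed_groups and not (visitor.groups & item.allowed_groups):
        return False
    return True


if __name__ == "__main__":
    song = Item("community song", community_land_only=True)
    away = Visitor("visitor away from home")
    home = Visitor("visitor at home", on_community_land=True)
    print(can_access(away, song), can_access(home, song))
```

The point of the design is that the restriction travels with the item itself, so a single song can be protected even when the rest of its page is open.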

I hope I have shown that a plethora of websites and programs are available to those interested in learning more about language documentation and revitalization. I know there are many more that I never mentioned, or have not yet heard of, but the overall message is to explore the digital tools available to speakers of any language. I promise that if I could figure them out over one summer, I’d be willing to bet you can as well. Even if none of these programs seems applicable to the work you do, check them out! At the very least, you may find yourself using online flashcards to learn basic Gascon, or to tell fresh herbs apart by their appearance.


Take care eh,

– Michael Iannozzi, and the Canadian Language Museum team

(If you have any ideas for topics of future posts please email

