November 11, 2007
By Janna Quitney Anderson, Director of Imagining the Internet and Assistant Professor of Communications, Elon University
Rio de Janeiro, Brazil – You can find almost anything on the Internet, but when you find it, chances are it’s written or spoken in English. According to most estimates, more than half and perhaps as much as 80 percent of Internet content is in English.
According to recent statistics, nearly 70 percent of the world’s Internet domains are hosted in the United States, about 11 times more than the second-place nation – Germany. One reason for this is the lack of internationalized domain names, but there are changes in the works to make them available soon.
Since only about a third of the world’s 1.1 billion Internet users speak English as their first language, two-thirds of the people using the Internet have to have some understanding of a non-native language if they want to read the rich content available online.
Many see this relative monoculture of the Internet as a tremendous advantage. For example, people from all over the world can take free online courses at MIT. With English serving as a sort of default cyber-language, we can avoid the complexities and inefficiencies that prevailed in pre-Internet communications.
But others say it is time for the Internet to become more diverse, and things are starting to change as the Internet expands beyond its Western origins to reach people across the world.
That is why the issue of Internet “diversity” is high on the agenda here at the second annual Internet Governance Forum in Rio de Janeiro, Brazil. A globe-spanning group of people from many different ethnicities, religions, economies and cultures has gathered here to talk about ways to broaden the spectrum of people who go online and encourage the development of more content that represents diverse languages, ages, abilities and perspectives.
In short, they want the Internet to look more like the world.
Today, Internet engineers are working to overcome the biases that favor English and western cultures. They are adapting the Internet’s main traffic directories that keep track of the computer host names used in Web and e-mail addresses. Up to now, these addresses have only been able to use the 26 letters of the Latin alphabet used in English.
But the Internet Corporation for Assigned Names and Numbers, known as ICANN, is now testing internationalized domain names that allow Internet users to use addresses written in their own languages and scripts. The test includes Arabic, Chinese, Greek, Hindi, Japanese, Korean, Persian, Russian, Tamil and Yiddish.
“There is today a test under way with 11 scripts that are not using Latin characters in order to evaluate the effect of those kinds of top-level domains on the various applications, the browsers, the e-mail applications, and the like, that might encounter such domain names,” said former ICANN chairman Vint Cerf. “The intention is to reach the point where ICANN can invite proposals for top-level domains in these new character sets somewhere around the middle of 2008. And this objective is for both the country code TLDs (top-level domains) and also for the generic ones.”
These are important steps toward building a truly multi-lingual Internet.
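One way to picture what an internationalized domain name involves: the non-Latin name a user types is converted to an ASCII-compatible form, with labels beginning "xn--", that existing directories can store. The article does not describe this mechanism, so the following is only a minimal sketch. It assumes Python's built-in "idna" codec, which implements the older IDNA 2003 rules, and the host names are purely illustrative rather than the names used in ICANN's test.

```python
# Sketch: convert an internationalized host name to and from its ASCII form.
# Assumes Python's built-in "idna" codec (IDNA 2003 rules).
# The host names below are illustrative, not taken from ICANN's test.

def to_ascii(hostname: str) -> str:
    """Return the ASCII-compatible form, using "xn--" labels where needed."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(ascii_hostname: str) -> str:
    """Return the readable form of an ASCII-encoded host name."""
    return ascii_hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    for name in ["bücher.example", "пример.испытание"]:
        # Print only the ASCII form so the output is safe on any terminal.
        print(to_ascii(name))
```

Under these assumptions, to_ascii("bücher.example") gives "xn--bcher-kva.example", and to_unicode reverses it. A plain Latin name such as "example.com" passes through unchanged, which is why older software keeps working while applications are updated to display the new scripts.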
As more non-English-speaking people begin to use the Internet, these efforts are gaining momentum, especially because of the necessity for governments, companies and organizations to reach out to people in their native languages.
At the IGF main-room session on Diversity, Ben Petrazzini of the International Development Research Center spoke about efforts to use new technologies to encourage positive development. He said it is important to develop fonts for more local languages.
“We have invested $2 million in a project that is hosted at the national University of Lahore in Pakistan to develop and adapt 11 languages in the region,” he said. “It has developed localized versions of the open source operating system Linux. It has developed optical character recognition and text-to-speech software and a wide range of supporting applications and utilities such as lexicons and fonts in eight languages. This is an example of the kind of things we should start investing in if we aim to narrow the digital divide. In Africa we are in the early stages of a project that will develop localized terms, software and keyboard development for 24 African languages. We are developing digital local content.”
The numbers tell the story of a rapid change in the languages of the digital revolution.
While the number of English-speaking users on the Internet has increased by about 150 percent since 2000, the number of Arabic-speaking users has increased nearly ten-fold. The number of Portuguese speakers online increased more than 500 percent, and Chinese, French and Spanish-speaking users weren’t far behind.
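Because these growth figures mix percentages and multiples, it can help to put them on one scale. The sketch below is only illustrative arithmetic on the article's approximate numbers. It assumes "ten-fold" means the user count grew to about ten times its 2000 level, and the function names are hypothetical.

```python
# Sketch: put "percent increase" and "N-fold" growth figures on one scale.
# Assumption: "ten-fold" means the total grew to about 10 times its 2000 level.

def percent_increase_to_multiple(percent: float) -> float:
    """A 150 percent increase means the total is 2.5 times the starting value."""
    return 1 + percent / 100


def multiple_to_percent_increase(multiple: float) -> float:
    """Growing to 10 times the starting value is a 900 percent increase."""
    return (multiple - 1) * 100


if __name__ == "__main__":
    print(percent_increase_to_multiple(150))  # English-speaking users
    print(percent_increase_to_multiple(500))  # Portuguese-speaking users
    print(multiple_to_percent_increase(10))   # Arabic-speaking users
```

On that reading, English-speaking users grew to roughly 2.5 times their 2000 number, Portuguese speakers to more than 6 times, and Arabic speakers to nearly 10 times, which is an increase of close to 900 percent.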
And of course China’s more than 1.3 billion people are in the early adoption phase of the Internet revolution. Fewer than 14 percent of Chinese are online, and as more of them get connected to the network they’ll be looking for sites in their native language.
So where is all this push toward a multilingual Internet taking us? If I can’t understand your Web site and you can’t understand mine, what have we accomplished?
“Of the 40,000 languages we conceived on the planet only about 6,000 to 9,000 remain, and of them less than 500 have digital existence,” said Daniel Pimienta, president of Networks in Development Foundation. “Of these, less than 50 languages gather more than 99 percent of the content of the Internet. To reduce cultural diversity is to jeopardize the possibility for our species to evolve and adapt. If the Internet is truly for everyone, then its responsibility is to embrace the issue of diversity and give it the priority and attention it deserves. We must turn the virtual Babel into the model of respect and diversity that collective intelligence is capable of building to feed human creativity and development.”
Adama Samassékou, a leader of the World Summit on the Information Society and president of the African Academy of Languages, said there are three primary challenges when discussing diversity at the global level. “The first is how can we transform what is commonly called the digital divide into the ‘digital for everyone,'” he said. “The second is how can we use information communication technologies to accelerate the process of achieving the Millennium Development Goals. And the third is how can we strengthen, promote and develop cultural and linguistic diversity, which is the main universal common good? The diversity we experience is the best instrument for dialogue between cultures and languages. Culture lies at the core of any discussion of identity, social cohesion and the development of any economy based on knowledge.”
Developers are trying to design more sophisticated translation software. An approach called “statistical machine translation” is being used to develop a 20-billion-word world translation base. In the meantime, Babelfish, Google Translate and other extremely rough translation systems are in use, and more Web sites are being developed to provide content in multiple languages.
Diversity also means opening the Web up to people who have disabilities, and those who are illiterate.
The Daisy Consortium expands usability by creating Digital Talking Books. DeafPlanet is a site that emphasizes the use of large symbols, bright colors and loud noises. Optional features also allow those with hearing disabilities to follow audio on the site through sign language.
Work continues to fully include the disabled and illiterate online. For instance, the Internet Society and other organizations support a global standard of usability known as “universal design.” At the major IGF session on Diversity, Monthian Buntan, president of the Thailand Association for the Blind, read from a script in Braille as he spoke as an advocate for the 650 million people with disabilities. “For us,” he said, “Internet accessibility could be made through the concept of universal design and the use of assistive technology. By embracing these we can truly accept and practice and respect diversity and move toward an inclusive and accessible information society which is caring and peaceful and barrier-free and happiness-based for all. Such a standard is open, nonproprietary and it embraces synchronized multimedia.”
Providing content that is useful to people of all abilities presents significant technical and economic challenges. But the prospect of bringing a more diverse mix of users to the Internet offers rich benefits.
The promise of diverse users contributing more diverse content can bring forward the wisdom of indigenous people; the magic of ideas that are a bit – or even a lot – out of the mainstream; the vision of an isolated but talented thinker or artist; and the cross-pollination of concepts that can advance humanity in ways we may not now imagine.
When people at the Internet Governance Forum discuss diversity, they are also discussing the need for open software and Internet architecture standards that everyone can use; ways of bringing the Internet to other under-served groups, including older persons and, in some cultures, women; and public policies that support user-generated content online.
Samassékou urged direct action moving forward. “We must leave here with a determination to show that between now and next year’s IGF in New Delhi we see improvement in freeware translation software,” he said. “Technologies can promote dialogue between people speaking several languages.”
From the beginning, the concepts of the Internet have been built around empowerment, allowing people everywhere, in every circumstance, the power to make a difference globally. This, then, is the promise of diversity, and moving all of us toward that goal is the reason meetings such as this are so important.
(Dannika Lewis and Dan Anderson were contributing writers for this article.)