
The internet is on the verge of one of the most fundamental changes in its history. The Internet Corporation for Assigned Names and Numbers (ICANN) is expected to agree on the use of internet addresses in non-Latin characters during this week’s ICANN convention in Seoul. If all goes according to plan, it will be possible to use Greek, Cyrillic, Arabic, Chinese, Korean and many other characters in the internet browser’s address bar. More than half of the 1.6 billion internet users in the world use a script other than Latin. ICANN therefore expects the number of non-Latin domain names, and with it the number of new internet users, to increase rapidly.
This far-reaching change in the use of the internet is based on a system that can “translate” or “convert” different writing systems (sometimes with different writing directions, e.g. Arabic and Hebrew). On a high level, it would look a little like this, I would imagine:
| عربي | 中文 | English | 日本語 | Deutsch | Français | Español | Русский | Português | 한국어 | Italiano |
| AR | ZH | EN | JA | DE | FR | ES | RU | PT | KO | IT |
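To make the “convert” step a little more concrete: internationalized domain names are generally carried through the DNS in an ASCII-compatible form (labels starting with `xn--`), produced by an encoding called Punycode. The following is only a minimal sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules and may not match what registries or browsers do today. The helper names `to_ascii` and `to_unicode` and the sample domains are made up for illustration.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible (xn--) form."""
    # The built-in "idna" codec applies IDNA 2003 rules label by label.
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


# Hypothetical sample names, chosen only for illustration.
for name in ("münchen.de", "中文.example"):
    encoded = to_ascii(name)
    print(name, "->", encoded, "->", to_unicode(encoded))
```

For instance, `to_ascii("münchen.de")` gives `xn--mnchen-3ya.de`. The user types and sees the name in their own script, while the network only ever handles the ASCII form.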
Naturally, this phenomenon raises questions concerning the matching of internet addresses. Is ووو.هُمَنِنفِرِرِنسِ.كُم the same as www.humaninference.com? It appears that generic multilingual data matching issues also apply in this particular case. How do we handle these comparisons? For a couple of thoughts, please read this…….
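As a rough illustration of why this is a matching problem rather than a formatting one, here is a small sketch of a loose comparison key. It is an assumption about how one might start, not a description of any particular matching product: it decomposes the text (NFKD), drops nonspacing marks such as Arabic vowel signs, and case-folds. The function names `match_key` and `loosely_equal` are hypothetical.

```python
import unicodedata


def match_key(domain: str) -> str:
    """Build a loose comparison key: no case differences, no nonspacing marks."""
    decomposed = unicodedata.normalize("NFKD", domain)
    # Category "Mn" covers nonspacing marks, e.g. accents and Arabic vowel signs.
    no_marks = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return no_marks.casefold()


def loosely_equal(a: str, b: str) -> bool:
    """Compare two names on their loose keys only."""
    return match_key(a) == match_key(b)


latin = "www.humaninference.com"
arabic = "ووو.هُمَنِنفِرِرِنسِ.كُم"

print(loosely_equal("WWW.HumanInference.COM", latin))  # differs only in case
print(loosely_equal(arabic, latin))                    # differs in script
```

Normalization of this kind removes differences in case and diacritics, but the Arabic and Latin spellings still produce different keys. Bridging the two would need some form of transliteration or phonetic matching on top, which is exactly the generic multilingual matching issue raised above.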