A Quick Overview of IDN and Punycode

September 7, 2011 Posted by Tyler Cruz

About a week ago, I was approving a new publisher to 1upAds.com when I noticed that his submitted website domains were very suspicious. He had entered half a dozen domains that looked like the following:

xn--k2j02n2alsdkf0d.com, xn--a0j0n323kflkjd.com, xn--pqopwpqgjbk2jblka.com, etc.

My initial thought was that these sites were garbage – very low quality sites that were most likely receiving untargeted traffic through some cheap source such as PPV traffic. I was concerned about the traffic quality that would stem from these sites, and thought there was a good chance that this particular publisher would send in fraudulent traffic.

However, the domains were so strange-looking that I thought something must be up… that they must serve some useful purpose (because who would register such throw-away domains?), and so I did a bit of research.

Many of you may already know what these domains are, and perhaps this is fairly common knowledge these days, but I sure didn’t recognize what these domains were at the time, and so that means that some of you probably don’t either. So here’s an introduction to IDN and Punycode.

IDN (IDNA)

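IDN is an abbreviation for Internationalized Domain Name. You will often see it mentioned alongside IDNA (Internationalizing Domain Names in Applications), which is the mechanism applications use to handle these names.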

There are plenty of articles explaining what an IDN is, but basically it is a domain name that contains one or more characters outside the ASCII set allowed in “standard” domains (a-z, 0-9, and the hyphen).
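To make that definition concrete, here is a minimal Python sketch that flags a domain as needing IDN treatment when any of its labels contains a character outside a-z, 0-9, and the hyphen. The function name `needs_idn` is my own, purely for illustration, and the check ignores other hostname rules such as label length.

```python
import string

# Characters allowed in a "standard" domain label: a-z, 0-9 and the hyphen.
ALLOWED = set(string.ascii_lowercase + string.digits + "-")


def needs_idn(domain: str) -> bool:
    """Return True if any label has a character outside a-z, 0-9, '-'.

    Hypothetical helper for illustration; domain names are case-insensitive,
    so the input is lowercased before checking.
    """
    labels = domain.lower().split(".")
    return any(ch not in ALLOWED for label in labels for ch in label)
```

With this, `needs_idn("宜家.com")` is true while `needs_idn("1upads.com")` is false.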

They are used for languages written in non-Latin scripts, such as Arabic, Hebrew, and Korean, or in Latin scripts with diacritics (e.g. é).

IKEA was apparently quick to get on the IDN bandwagon and offer IDN domains to their customers from around the world.

Here is what their IDN domains look like in the following languages:

  • Arabic – ايكيا.com
  • Chinese – 宜家.com
  • Japanese – イケア.com
  • Greek – Ελλάδα

Assuming your browser supports these Unicode characters (which it should, unless it’s severely out of date), you should see the example IDN domains above.

Punycode

Punycode is an encoding (RFC 3492) that was created to represent the Unicode characters in IDN domains using only the standard domain ASCII characters (a-z, 0-9, and the hyphen).

This encoding was created so that multilingual IDN domains could be standardized into the WHOIS registry, used properly as network host names, and used in other situations where non-ASCII characters wouldn’t function properly.

In short, Punycode is an encoded version of an IDN domain.
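If you want to try the conversion yourself, here is a minimal sketch using the `idna` codec that ships with Python. It assumes that codec's behaviour, which follows the older IDNA 2003 rules (RFC 3490), so the results can differ from what a modern browser or registrar produces for some names. The helper name `to_punycode` is my own.

```python
def to_punycode(domain: str) -> str:
    """Encode each label of a domain to its ASCII (Punycode) form.

    Python's built-in "idna" codec splits the name on dots, leaves plain
    ASCII labels alone and encodes the rest with the "xn--" prefix.
    """
    return domain.encode("idna").decode("ascii")


# The IKEA examples from earlier in the post.
for name in ["ايكيا.com", "宜家.com", "イケア.com"]:
    print(to_punycode(name))  # prints the ASCII form only
```

A plain ASCII name such as `example.com` comes back unchanged, since there is nothing to encode.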

The Punycode versions of the IKEA IDN domains mentioned earlier are as follows:

Note: IDN domains may not work in older browsers that can't convert them to Punycode; the Punycode form itself is plain ASCII.

Converted IDN domains will always begin with the xn-- prefix.
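That prefix (the letters xn followed by two ASCII hyphens) makes these domains easy to spot in something like a publisher application. Here is a rough sketch, again leaning on Python's built-in `idna` codec (IDNA 2003 rules), that detects the prefix and tries to decode the name back to its Unicode form. Both function names are hypothetical, and anything that fails to decode is simply returned as-is.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def has_punycode_label(domain: str) -> bool:
    """Return True if any dot-separated label starts with 'xn--'."""
    return any(
        label.lower().startswith(ACE_PREFIX) for label in domain.split(".")
    )


def to_unicode(domain: str) -> str:
    """Decode a Punycode domain to Unicode, or return it unchanged.

    Input that is not ASCII, or whose encoded labels are not valid,
    raises UnicodeError inside the codec; we fall back to the original.
    """
    try:
        return domain.encode("ascii").decode("idna")
    except UnicodeError:
        return domain
```

A reviewer could run `to_unicode()` on anything where `has_punycode_label()` is true to see what the domain actually says before judging it.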

When you visit an IDN domain with non-Latin characters, your browser will convert it to Punycode automatically.
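As a rough illustration of that step (not how any particular browser actually implements it), the sketch below rewrites only the host part of a URL into its ASCII form with Python's standard library. The name `to_ascii_url` is made up, the `idna` codec follows the older IDNA 2003 rules, and the sketch drops any username or password in the URL and does not handle IPv6 literals.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return the URL with its hostname converted to Punycode.

    Only the host is touched; scheme, port, path, query and fragment
    are passed through unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is lowercased by urlsplit
    ascii_host = host.encode("idna").decode("ascii")
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

A URL whose host is already plain ASCII passes through unchanged.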

There are many Punycode converters out there, such as this one.

And Now You Know!

Again, you may have already known about Punycode, but I sure hadn’t. But now that I do, the next time I see a publisher with submitted website domains looking like:

xn--k2j02n2alsdkf0d.com, xn--a0j0n323kflkjd.com, xn--pqopwpqgjbk2jblka.com, etc.

I will know what’s going on 🙂

Posted: September 7th, 2011 under Articles  

30 Responses to “A Quick Overview of IDN and Punycode”

  1. Rhys Davies says:

    What’s the benefit of this though?

    So he can target countries that most people can’t?

    • The QWERTY keyboard is not the only keyboard around 😉

      There are many different keyboards especially for people from different countries so IDN domains are useful to these users.

      Hence using punycode and idn domains allows you to target those users. Additionally, all the generic punycode / idn domains aren’t registered so you can buy and resell them 😉

      • Sam says:

        Lol… Really? Do we have any other keyboard other than qwerty? This is the first time I am hearing. I thought it is universally followed. I have qwert in my BB, iphone, ipad and in all devices. So wondering….

    • Tyler Cruz says:

      There is no benefit to the publisher in submitting the Punycode version of his IDN domains. I believe he did this to make sure that we could visit his sites properly, in case the special characters didn’t work in the application form, for example.

    • Sam says:

      Not just one benefit. There are tons of them. Mostly they are used for selling after some time. When a new TLD comes to the market, people try to buy good domain names with it and then sell them later. Imagine if you can buy loan.co and then sell it to a loan company for thousands of $$$. That’s the concept.

  2. As a Greek I find it very useful, thanks for sharing!

  3. we learn new things everyday, thanks for this

  4. stranger says:

    I just picked up ☺☺☺☺.com

  5. Wow, good information, Tyler. I had never heard of IDN or Punycode either.

  6. Dresses says:

    Doesn’t seem to work on a pretty recent update of Chrome!

    • “Google Chrome decides if it should show IDN or punycode for each component of a hostname separately. To decide if a component should be shown in IDN form, Google Chrome uses an algorithm that depends on the languages that the user claims to understand. On Windows and Linux, these languages can be configured in the Google Chrome’s Fonts and Languages dialog.”

      From: chromium.org/developers/design-documents/idn-in-google-chrome

  7. Michael says:

    Very interesting. I never heard of these domains either. You learn something new every day. Very weird looking though. I’d definitely be suspicious at first as well.

  8. I’ve seen these domains before. Someone has one registered as a shorturl service and basically it looks like an arrow then .com. Pretty cool.

  9. Jasmine says:

    Good introduction to IDN. Yes, it’s definitely a good idea to invest in some good idn now.

    • Potentially, but only if you’re going to develop them.

      Having seen some of the prices for these types of domains, it is worthwhile to develop them rather than let them just sit there on a parked page.

    • Sam says:

      Yeah, I know people who spent $10,000+ when this new .co came up and now they are reaping profit. One of my friends sold a domain for $9000 which he got at auction for just $90. I missed that period.

  10. Interesting, but at the same time the big companies in those countries are using Latin domains so they do not miss out on business, so this could be good for the internal market.

  11. That is really interesting. I love when I find things like that. It is a great learning experience.

  12. IDN domains have been steadily growing for a while. With more bloggers such as yourself highlighting them, people are seeing the true value in them and acquiring/developing them 🙂

  13. Stop Smoking says:

    This is an extremely informative article. You were able to grab my interest with your original ideas. I agree with most of your insightful views.

  14. we learn new things everyday, thanks for this

  15. Chris says:

    Great to know, thanks for sharing

  16. Aipmt 2012 says:

    Thanks for sharing. very informative.

  17. Wow, this is certainly something new for me. I never knew that they had different characters for domain names.

  18. Continue on inspiring us with your writing! Keep on sharing.
    Train Services
