Internationalized Domain Names in Applications (IDNA) ===================================================== A library to support the Internationalised Domain Names in Applications (IDNA) protocol as specified in `RFC 5891 `_. This version of the protocol is often referred to as “IDNA2008” and can produce different results from the earlier standard from 2003. The library is also intended to act as a suitable drop-in replacement for the “encodings.idna” module that comes with the Python standard library but currently only supports the older 2003 specification. Its basic functions are simply executed: .. code-block:: pycon >>> import idna >>> idna.encode(u'ドメイン.テスト') 'xn--eckwd4c7c.xn--zckzah' >>> print idna.decode('xn--eckwd4c7c.xn--zckzah') ドメイン.テスト Packages -------- The latest tagged release version is published in the PyPI repository: .. image:: https://badge.fury.io/py/idna.svg :target: http://badge.fury.io/py/idna Installation ------------ To install this library, you can use PIP: .. code-block:: bash $ pip install idna Alternatively, you can install the package using the bundled setup script: .. code-block:: bash $ python setup.py install This library should work with Python 2.7, and Python 3.3 or later. Usage ----- For typical usage, the ``encode`` and ``decode`` functions will take a domain name argument and perform a conversion to an A-label or U-label respectively. .. code-block:: pycon >>> import idna >>> idna.encode(u'ドメイン.テスト') 'xn--eckwd4c7c.xn--zckzah' >>> print idna.decode('xn--eckwd4c7c.xn--zckzah') ドメイン.テスト You may use the codec encoding and decoding methods using the ``idna.codec`` module. .. code-block:: pycon >>> import idna.codec >>> print u'домена.испытание'.encode('idna') xn--80ahd1agd.xn--80akhbyknj4f >>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna') домена.испытание Conversions can be applied at a per-label basis using the ``ulabel`` or ``alabel`` functions if necessary: .. code-block:: pycon >>> idna.alabel(u'测试') 'xn--0zwm56d' Compatibility Mapping (UTS #46) +++++++++++++++++++++++++++++++ As described in RFC 5895, the IDNA specification no longer including mappings from different forms of input that a user may enter, to the form that is provided to the IDNA functions. This functionality is now a local user-interface issue distinct from the IDNA functionality. The Unicode Consortium has developed one such user-level mapping, known as `Unicode IDNA Compatibility Processing `_. It provides for both transitional mapping and non-transitional mapping described in this document. .. code-block:: pycon >>> import idna >>> idna.encode(u'Königsgäßchen') ... idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of u'K\xf6nigsg\xe4\xdfchen' not allowed >>> idna.encode(u'Königsgäßchen', uts46=True) 'xn--knigsgchen-b4a3dun' >>> idna.encode(u'Königsgäßchen', uts46=True, transitional=True) 'xn--knigsgsschen-lcb0w' Note that implementors should use transitional processing with caution as the outputs of the functions may differ from what is expected, as noted in the example. ``encodings.idna`` Compatibility ++++++++++++++++++++++++++++++++ Function calls from the Python built-in ``encodings.idna`` module are mapping to their IDNA 2008 equivalents using the ``idna.compat`` module. Simply substitute the ``import`` clause in your code to refer to the new module name. Exceptions ---------- All errors raised during the conversion following the specification should raise an exception derived from the ``idna.IDNAError`` base class. More specific exceptions that may be generated as ``idna.IDNABidiError`` when the error reflects an illegal combination of left-to-right and right-to-left characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ but the contextual requirements are not satisfied.) Testing ------- The library has a test suite based on each rule of the IDNA specification, as well as test that are provided as part of the Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing `_. The tests are run automatically on each commit to the master branch of the idna git repository at Travis CI: .. image:: https://travis-ci.org/kjd/idna.svg?branch=master :target: https://travis-ci.org/kjd/idna