Encode::IMAPUTF7(3pm) | User Contributed Perl Documentation | Encode::IMAPUTF7(3pm) |
NAME
Encode::IMAPUTF7 - modification of UTF-7 encoding for IMAP
VERSION
version 1.07
SYNOPSIS
    use Encode qw/encode decode/;
    use Encode::IMAPUTF7;

    print encode('IMAP-UTF-7', 'Répertoire');
    print decode('IMAP-UTF-7', 'R&AOk-pertoire');
PERL VERSION
This library should run on perls released even a long time ago. It should work on any version of perl released in the last five years.
Although it may work on older versions of perl, no guarantee is made that the minimum required version will not be increased. The version may be increased for any reason, and there is no promise that patches will be accepted to lower the minimum required perl.
ABSTRACT
IMAP mailbox names are encoded in a modified UTF-7 when they contain international characters outside the printable ASCII range. The modified UTF-7 encoding is defined in RFC 2060 (section 5.1.3).
RFC2060 - section 5.1.3 - Mailbox International Naming Convention
By convention, international mailbox names are specified using a modified version of the UTF-7 encoding described in [UTF-7]. The purpose of these modifications is to correct the following problems with UTF-7:
1. UTF-7 uses the "+" character for shifting; this conflicts with the common use of "+" in mailbox names, in particular USENET newsgroup names.
2. UTF-7's encoding is BASE64 which uses the "/" character; this conflicts with the use of "/" as a popular hierarchy delimiter.
3. UTF-7 prohibits the unencoded usage of "\"; this conflicts with the use of "\" as a popular hierarchy delimiter.
4. UTF-7 prohibits the unencoded usage of "~"; this conflicts with the use of "~" in some servers as a home directory indicator.
5. UTF-7 permits multiple alternate forms to represent the same string; in particular, printable US-ASCII characters can be represented in encoded form.
In modified UTF-7, printable US-ASCII characters except for "&" represent themselves; that is, characters with octet values 0x20-0x25 and 0x27-0x7e. The character "&" (0x26) is represented by the two-octet sequence "&-".
All other characters (octet values 0x00-0x1f, 0x7f-0xff, and all Unicode 16-bit octets) are represented in modified BASE64, with a further modification from [UTF-7] that "," is used instead of "/". Modified BASE64 MUST NOT be used to represent any printing US-ASCII character which can represent itself.
"&" is used to shift to modified BASE64 and "-" to shift back to US-ASCII. All names start in US-ASCII, and MUST end in US-ASCII (that is, a name that ends with a Unicode 16-bit octet MUST end with a "-").
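The rules above can be sketched in a few lines of Perl using only core modules. This is an illustration of the convention, not the implementation used by Encode::IMAPUTF7; the function name imap_utf7_encode is hypothetical, and the input is assumed to be a Perl character string rather than raw bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper (not part of Encode::IMAPUTF7): encode a Perl
# character string as a modified UTF-7 mailbox name per RFC 2060 5.1.3.
sub imap_utf7_encode {
    my ($name) = @_;
    my $out = '';

    # Walk the string as alternating runs of printable US-ASCII
    # (0x20-0x7e) and everything else.
    while ($name =~ /\G([\x20-\x7e]+|[^\x20-\x7e]+)/g) {
        my $run = $1;
        if ($run =~ /^[\x20-\x7e]/) {
            $run =~ s/&/&-/g;    # "&" is the shift character, so it becomes "&-"
            $out .= $run;
        }
        else {
            # Non-ASCII runs are sent as UTF-16BE in modified BASE64.
            my $b64 = encode_base64(encode('UTF-16BE', $run), '');
            $b64 =~ s/=+\z//;    # UTF-7 style BASE64 carries no "=" padding
            $b64 =~ tr{/}{,};    # "," replaces "/" in the alphabet
            $out .= "&$b64-";    # shift in with "&", shift out with "-"
        }
    }
    return $out;
}

print imap_utf7_encode("R\x{e9}pertoire"), "\n";    # prints R&AOk-pertoire
```

Splitting the name into runs first keeps printable US-ASCII out of the BASE64 sections, which is what the MUST NOT requirement above asks for.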
For example, here is a mailbox name which mixes English, Japanese, and Chinese text: "~peter/mail/&ZeVnLIqe-/&U,BTFw-"
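To see what the example name contains, the reverse direction can be sketched the same way. Again this is only an illustration under stated assumptions: imap_utf7_decode and _decode_run are hypothetical names, the code relies on core modules only, and it does no validation of malformed input.

```perl
use strict;
use warnings;
use Encode qw(decode);
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode one shifted section (the text between
# "&" and "-"). An empty section is the escape for a literal "&".
sub _decode_run {
    my ($b64) = @_;
    return '&' if $b64 eq '';
    $b64 =~ tr{,}{/};                              # back to the standard alphabet
    $b64 .= '=' x ((4 - length($b64) % 4) % 4);    # restore the dropped padding
    my $octets = decode_base64($b64);
    return decode('UTF-16BE', $octets);
}

# Hypothetical helper (not part of Encode::IMAPUTF7): decode a modified
# UTF-7 mailbox name into a Perl character string.
sub imap_utf7_decode {
    my ($name) = @_;
    $name =~ s/&([^-]*)-/_decode_run($1)/ge;
    return $name;
}

# Show the code points hidden in the first encoded section of the example.
my $decoded = imap_utf7_decode('&ZeVnLIqe-');
print join(' ', map { sprintf 'U+%04X', ord } split //, $decoded), "\n";
# prints U+65E5 U+672C U+8A9E
```

Applied to the full example name, this sketch leaves "~peter/mail/" and the "/" delimiter untouched and expands only the two shifted sections.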
AUTHOR
Sava Chankov <sava@cpan.org>
CONTRIBUTOR
Ricardo Signes <rjbs@semiotic.systems>
COPYRIGHT AND LICENSE
This software is copyright (c) 2005 by Sava Chankov.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
2025-05-29 | perl v5.40.1 |