NAME
Unicode::Japanese::JA - Japanese character encoding conversion
SYNOPSIS
use Unicode::Japanese;
use Unicode::Japanese qw(unijp);
# convert utf8 -> sjis
print Unicode::Japanese->new($str)->sjis;
print unijp($str)->sjis; # same as above.
# convert sjis -> utf8
print Unicode::Japanese->new($str,'sjis')->get;
# convert sjis (imode_EMOJI) -> utf8
print Unicode::Japanese->new($str,'sjis-imode')->get;
# convert zenkaku (utf8) -> hankaku (utf8)
print Unicode::Japanese->new($str)->z2h->get;
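DESCRIPTION
Unicode::Japanese is a module that converts Japanese text between character encodings.
FEATURES
- An instance of Unicode::Japanese holds its string internally in UTF-8.
- The module is provided in both XS and No-XS forms. The XS build is recommended for speed. The No-XS form can be used simply by copying the file into place (Japanese.pm is all that is needed).
- Converts between zenkaku (full-width) and hankaku (half-width) characters, and between hiragana and katakana.
- Maps mobile phone emoji (DoCoMo i-mode, KDDI AU, Softbank Mobile, ASTEL dot-i) into the Unicode private use area, so that they can be handled like ordinary characters, for example when stored in a DB.
- Converts emoji between the sets of different mobile phone carriers.
- Conversion between SJIS and Unicode uses the MS-CP932 mapping table.
- On Unicode -> SJIS conversion (and EUC-JP/JIS), characters that cannot be represented in SJIS are converted to &#dddd; form, except that characters in the Unicode private use area become '?'. When converting for mobile phones, every character that cannot be converted becomes '?'.
- On Perl-5.8.0 and later, the utf8 flag is handled properly. Use utf8() to get a utf-8 `byte' string and getu() to get a utf-8 `character' string. get() currently returns a utf-8 `byte' string (this may change in the future). sjis(), jis(), utf8(), etc. return byte strings. The new, set and getcode methods accept input whether it is utf8-flagged or bytes.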
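The &#dddd; escape described above is a decimal numeric character reference. As an analogy only, the sketch below shows the same style of fallback using the core Encode module rather than Unicode::Japanese; how Unicode::Japanese treats this particular character is not claimed here.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Analogy only: this uses the core Encode module, not Unicode::Japanese.
# U+1F600 has no CP932 mapping, so it cannot be written as SJIS bytes.
my $text  = "A\x{1F600}";
my $copy  = $text;    # work on a copy in case encode() touches its input
my $bytes = encode('cp932', $copy, Encode::FB_HTMLCREF);

print "$bytes\n";     # A&#128512;
```

The number inside the reference is the code point in decimal (0x1F600 is 128512).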
REQUIREMENTS
- perl 5.10.x, 5.8.x, etc. (5.004 or later).
- (optional) A C compiler. The module can be used either as XS or as Pure Perl. Without a C compiler, Unicode::Japanese is installed as a Pure Perl module.
- (optional) Test.pm and Test::More, for running the tests.
No other modules are required at run time.
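METHODS
- $s = Unicode::Japanese->new($str [, $icode [, $encode]])
- Creates a new Unicode::Japanese instance.
If arguments are given, they are passed to the "set" method.
- $s = unijp($str [, $icode [, $encode]])
- Same as Unicode::Japanese->new(...).
- $s->set($str [, $icode [, $encode]])
- $str: string
- $icode: character encoding of the string; optional, defaults to 'utf8'
- $encode: ASCII encoding of the string; optional
Sets the string held by the instance. If the encoding is omitted, the string is treated as UTF-8.
The following encodings can be specified: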
auto
utf8 ucs2 ucs4
utf16-be utf16-le utf16
utf32-be utf32-le utf32
sjis cp932 euc euc-jp jis
sjis-imode sjis-imode1 sjis-imode2
utf8-imode utf8-imode1 utf8-imode2
sjis-doti sjis-doti1
sjis-jsky sjis-jsky1 sjis-jsky2
jis-jsky jis-jsky1 jis-jsky2
utf8-jsky utf8-jsky1 utf8-jsky2
sjis-au sjis-au1 sjis-au2
jis-au jis-au1 jis-au2
sjis-icon-au sjis-icon-au1 sjis-icon-au2
euc-icon-au euc-icon-au1 euc-icon-au2
jis-icon-au jis-icon-au1 jis-icon-au2
utf8-icon-au utf8-icon-au1 utf8-icon-au2
ascii binary
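(See also the table of supported encodings.)
To have the encoding detected automatically, you must specify 'auto'. With 'auto', detection is performed by the getcode() method.
For $encode, 'base64' may be specified. In that case the string is base64-decoded before being stored in the Unicode::Japanese instance.
To store a binary string without any decoding, specify 'binary' as the encoding.
With sjis-imode and sjis-doti, &#dddd; sequences in the string are converted to emoji.
Because the code ranges of different encodings can overlap, automatic detection is not always possible.
A string that could be either sjis or utf8 is treated as sjis, and one that could be either sjis-au or sjis-doti is treated as sjis-au.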
- $str = $s->get
Gets the string in UTF-8.
It currently returns a `byte' string, but this may change in the future.
If you need a byte string, use utf8(); if you need a character string, use getu().
- $str = $s->getu
Gets the string in UTF-8.
On Perl-5.8.0 and later, returns a utf-8 string with the utf-8 flag set.
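- $code = $s->getcode($str)
- $str: string
- $code: name of the encoding
Guesses the character encoding of the given string ($str).
Note that, exceptionally, this method does not examine the string held by the instance; it examines only the string passed as its argument.
The encoding is guessed by the following algorithm: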
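(With PurePerl)
1. If the string has a UTF-32 BOM, it is judged as utf32.
2. If the string has a UTF-16 BOM, it is judged as utf16.
3. If the string is valid as UTF-32BE, it is judged as utf32-be.
4. If the string is valid as UTF-32LE, it is judged as utf32-le.
5. If the string contains neither ESC nor any character with the eighth bit set, it is judged as ascii. (ASCII control characters other than ESC, 0x00-0x1F and 0x7F, are regarded as part of ascii.)
6. If the string contains JIS escape sequences, it is judged as jis.
7. If the string contains J-PHONE emoji, it is judged as sjis-jsky.
8. If the string is valid as EUC-JP, it is judged as euc.
9. If the string is valid as SJIS, it is judged as sjis.
10. If the string is valid as SJIS with au emoji, it is judged as sjis-au.
11. If the string is valid as SJIS with i-mode emoji, it is judged as sjis-imode.
12. If the string is valid as SJIS with dot-i emoji, it is judged as sjis-doti.
13. If the string is valid as UTF-8, it is judged as utf8.
14. If none of the above applies, it is judged as unknown.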
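(With XS)
1. If the string has a UTF-32 BOM, it is judged as utf32.
2. If the string has a UTF-16 BOM, it is judged as utf16.
3. For each of the following encodings, state transitions are used to check whether the string is valid in that encoding.
ascii / euc / sjis / jis / utf8 / utf32-be / utf32-le / sjis-jsky / sjis-imode / sjis-au / sjis-doti
4. From the encodings that remained valid to the end of the string, one is chosen in the following order of priority and returned.
utf32-be / utf32-le / ascii / jis / euc / sjis / sjis-jsky / sjis-imode / sjis-au / sjis-doti / utf8
5. If none of the above applies, it is judged as unknown.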
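Note the following points about encoding detection:
- A UTF-8 string may be judged as SJIS.
- UCS2 cannot be detected automatically.
- UTF-16 is detected only when it has a BOM.
- Mobile phone emoji are recognized only when they are present directly as binary. When they are written in &#dddd; form, the emoji encoding is not detected.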
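XS and PurePerl use different detection algorithms, so their results may differ. In particular, for sjis text that contains escape characters, PurePerl judges it as sjis while XS does not recognize it. This is to avoid misdetecting sjis-jsky. Note also that strings containing escape characters are still judged correctly when they are euc-jp.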
- $code = $s->getcodelist($str)
- $str: string
- $code: names of the encodings
Guesses the character encoding of the given string ($str).
Unlike getcode, it returns every encoding the string was judged to match.
- $str = $s->conv($ocode, $encode)
- $ocode: output encoding (choose from the following)
utf8 ucs2 ucs4 utf16
sjis cp932 euc euc-jp jis
sjis-imode sjis-imode1 sjis-imode2
utf8-imode utf8-imode1 utf8-imode2
sjis-doti sjis-doti1
sjis-jsky sjis-jsky1 sjis-jsky2
jis-jsky jis-jsky1 jis-jsky2
utf8-jsky utf8-jsky1 utf8-jsky2
sjis-au sjis-au1 sjis-au2
jis-au jis-au1 jis-au2
sjis-icon-au sjis-icon-au1 sjis-icon-au2
euc-icon-au euc-icon-au1 euc-icon-au2
jis-icon-au jis-icon-au1 jis-icon-au2
utf8-icon-au utf8-icon-au1 utf8-icon-au2
binary
(See also the table of supported encodings.)
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- $encode: ASCII encoding; optional
- $str: string
Gets the string converted to the specified encoding.
For $encode, 'base64' may be specified. In that case a base64-encoded string is returned.
On perl-5.8.0 and later, the returned string is a byte string without the utf-8 flag.
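- $s->tag2bin
- Converts every &#dddd; sequence in the string to the character it represents.
- $s->z2h
- Converts zenkaku (full-width) to hankaku (half-width).
- $s->h2z
- Converts hankaku (half-width) to zenkaku (full-width).
- $s->hira2kata
- Converts hiragana to katakana.
- $s->kata2hira
- Converts katakana to hiragana.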
- $str = $s->jis
- $str: byte string in JIS encoding
Gets the string converted to JIS (ISO-2022-JP).
- $str = $s->euc
- $str: byte string in euc-jp encoding
Gets the string converted to EUC-JP.
- $str = $s->utf8
- $str: byte string in utf-8 encoding
Gets the string converted to UTF-8.
On perl-5.8.0 and later, a byte string is still returned.
- $str = $s->ucs2
- $str: byte string in ucs2 encoding
Gets the string converted to UCS2.
- $str = $s->ucs4
- $str: byte string in ucs4 encoding
Gets the string converted to UCS4.
- $str = $s->utf16
- $str: byte string in utf-16 encoding
Gets the string converted to UTF-16. No BOM is added; the result is big-endian.
- $str = $s->sjis
- $str: byte string in sjis encoding
Gets the string converted to SJIS (MS-CP932).
- $str = $s->sjis_imode
- $str: byte string in sjis/imode emoji encoding
Gets the string converted to SJIS for i-mode. This is an alias for the latest imode emoji set.
- $str = $s->sjis_imode1
- $str: byte string in sjis/imode emoji encoding
Gets the string converted to SJIS for i-mode. Only the basic emoji are used.
- $str = $s->sjis_imode2
- $str: byte string in sjis/imode emoji encoding
Gets the string converted to SJIS for i-mode. Both the basic and the extended emoji are used.
- $str = $s->sjis_doti
- $str: byte string in sjis/dot-i emoji encoding
Gets the string converted to SJIS for dot-i.
- $str = $s->sjis_jsky
- $str: byte string in sjis/j-sky emoji encoding
Gets the string converted to SJIS for j-sky. This is an alias for the latest j-sky emoji set (jsky2 as of VERSION 0.15).
- $str = $s->sjis_jsky1
- $str: byte string in sjis/j-sky emoji encoding
Gets the string converted to SJIS for j-sky. Only the emoji of Pages 1-3 are used.
- $str = $s->sjis_jsky2
- $str: byte string in sjis/j-sky emoji encoding
Gets the string converted to SJIS for j-sky. The emoji of Pages 1-6 are used.
- $str = $s->sjis_icon_au
- $str: byte string in sjis/AU icon tag encoding
Gets the string converted to SJIS for AU.
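- $str_arrayref = $s->strcut($len)
- $len: width to cut at (in half-width units)
- $str_arrayref: strings
Splits the string into pieces no wider than $len (half-width units) and returns them as an array.
Each element of the array is a utf-8 string with the utf-8 flag set.
- $len = $s->strlen
- $len: display width of the string
With a UTF-8 string, length() counts 3 for each full-width character. This method counts 2 for each full-width character, as with a traditional SJIS string.
- $s->join_csv(@values);
- @values: array of data
Converts the array to a CSV string and stores it in the instance. A newline ("\n") is appended to the end of the string.
- @values = $s->split_csv;
- @values: array of data
Treats the string held by the instance as CSV and splits it into an array.
A trailing newline ("\n"), if present, is removed before splitting.
If the input was not binary, utf-8 strings are returned. If it was binary, byte strings are returned.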
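The byte counts behind the strlen entry above can be checked without Unicode::Japanese. This minimal sketch uses only the core Encode module and one full-width character chosen for illustration; it shows why length() on UTF-8 bytes gives 3 where SJIS gives 2.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+3042 HIRAGANA LETTER A, a full-width character (illustrative choice).
my $char = "\x{3042}";

my $utf8_len = length encode('UTF-8', $char);    # bytes in UTF-8
my $sjis_len = length encode('cp932', $char);    # bytes in SJIS (MS-CP932)

print "$utf8_len $sjis_len\n";    # 3 2
```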
SUPPORTED ENCODINGS
+---------------+----+-----+-------+
|encoding       | in | out | guess |
+---------------+----+-----+-------+
|auto           | OK | --  | ----- |
+---------------+----+-----+-------+
|utf8           | OK | OK  | OK    |
|ucs2           | OK | OK  | ----- |
|ucs4           | OK | OK  | ----- |
|utf16-be       | OK | --  | ----- |
|utf16-le       | OK | --  | ----- |
|utf16          | OK | OK  | OK(#) |
|utf32-be       | OK | --  | OK    |
|utf32-le       | OK | --  | OK    |
|utf32          | OK | --  | OK(#) |
+---------------+----+-----+-------+
|sjis           | OK | OK  | OK    |
|cp932          | OK | OK  | ----- |
|euc            | OK | OK  | OK    |
|euc-jp         | OK | OK  | ----- |
|jis            | OK | OK  | OK    |
+---------------+----+-----+-------+
|sjis-imode     | OK | OK  | OK    |
|sjis-imode1    | OK | OK  | ----- |
|sjis-imode2    | OK | OK  | ----- |
|utf8-imode     | OK | OK  | ----- |
|utf8-imode1    | OK | OK  | ----- |
|utf8-imode2    | OK | OK  | ----- |
+---------------+----+-----+-------+
|sjis-doti      | OK | OK  | OK    |
|sjis-doti1     | OK | OK  | ----- |
+---------------+----+-----+-------+
|sjis-jsky      | OK | OK  | OK    |
|sjis-jsky1     | OK | OK  | ----- |
|sjis-jsky2     | OK | OK  | ----- |
|jis-jsky       | OK | OK  | ----- |
|jis-jsky1      | OK | OK  | ----- |
|jis-jsky2      | OK | OK  | ----- |
|utf8-jsky      | OK | OK  | ----- |
|utf8-jsky1     | OK | OK  | ----- |
|utf8-jsky2     | OK | OK  | ----- |
+---------------+----+-----+-------+
|sjis-au        | OK | OK  | OK    |
|sjis-au1       | OK | OK  | ----- |
|sjis-au2       | OK | OK  | ----- |
|jis-au         | OK | OK  | ----- |
|jis-au1        | OK | OK  | ----- |
|jis-au2        | OK | OK  | ----- |
|sjis-icon-au   | OK | OK  | ----- |
|sjis-icon-au1  | OK | OK  | ----- |
|sjis-icon-au2  | OK | OK  | ----- |
|euc-icon-au    | OK | OK  | ----- |
|euc-icon-au1   | OK | OK  | ----- |
|euc-icon-au2   | OK | OK  | ----- |
|jis-icon-au    | OK | OK  | ----- |
|jis-icon-au1   | OK | OK  | ----- |
|jis-icon-au2   | OK | OK  | ----- |
|utf8-icon-au   | OK | OK  | ----- |
|utf8-icon-au1  | OK | OK  | ----- |
|utf8-icon-au2  | OK | OK  | ----- |
+---------------+----+-----+-------+
|ascii          | OK | --  | OK    |
|binary         | OK | OK  | ----- |
+---------------+----+-----+-------+
(#): guessed only when the string has a BOM.
GUESSING ORDER
1. utf32 (#)
2. utf16 (#)
3. utf32-be
4. utf32-le
5. ascii
6. jis
7. sjis-jsky (pp)
8. euc
9. sjis
10. sjis-jsky (xs)
11. sjis-au
12. sjis-imode
13. sjis-doti
14. utf8
15. unknown
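The first two entries of the order above depend only on a byte order mark, and testing utf32 before utf16 matters because the little-endian UTF-32 BOM (FF FE 00 00) begins with the little-endian UTF-16 BOM (FF FE). The following is a minimal sketch of just those two checks. The helper name guess_by_bom is hypothetical and is not part of Unicode::Japanese, whose actual implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Japanese: it covers only the
# two BOM-based entries (utf32, then utf16) of the guessing order.
sub guess_by_bom {
    my ($bytes) = @_;
    # UTF-32 is tested first: FF FE 00 00 would otherwise match the
    # UTF-16 little-endian BOM FF FE.
    return 'utf32' if $bytes =~ /\A(?:\x00\x00\xFE\xFF|\xFF\xFE\x00\x00)/;
    return 'utf16' if $bytes =~ /\A(?:\xFE\xFF|\xFF\xFE)/;
    return;    # no BOM: the remaining entries would have to decide
}

print guess_by_bom("\xFF\xFE\x00\x00A\x00\x00\x00"), "\n";    # utf32
print guess_by_bom("\xFE\xFF\x00A"), "\n";                    # utf16
```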
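DESCRIPTION OF UNICODE MAPPING
Mapping to and from Unicode is done as follows.
- Shift_JIS
- Uses the MS-CP932 to Unicode mapping.
The mapping table is available at the following URL.
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT>
When mapping from Unicode to SJIS, characters that cannot be represented are converted to &#dddd; form.
Mobile phone emoji, however, are converted to ?.
When converting to SJIS for mobile phones, every character that cannot be represented is converted to ?.
- EUC-JP/ISO-2022-JP
- The string is first converted to SJIS and then mapped to Unicode. Characters that cannot be represented in SJIS are therefore not mapped correctly.
- DoCoMo i-mode
- The parts of F800 - F9FF where emoji exist are mapped to U+0FF800 - U+0FF9FF.
- ASTEL dot-i
- The parts of F000 - F4FF where emoji exist are mapped to U+0FF000 - U+0FF4FF.
- J-PHONE J-SKY
- A J-SKY emoji sequence consists of the escape sequence "\e\$", then the first byte of the emoji, then one or more second bytes, then "\x0f". When emoji with the same first byte follow one another, they can be compressed by writing only the second bytes consecutively.
Each pair of first byte and second byte is treated as one character, and the region 4500 - 47FF is mapped to U+0FFB00 - U+0FFDFF.
When Unicode::Japanese maps Unicode to J-SKY emoji, consecutive emoji with the same first byte are compressed automatically.
- AU
- The parts where emoji exist are mapped to U+0FF500 - U+0FF6FF.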
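The ranges listed above can be read as an offset calculation. The sketch below assumes that codes map linearly within each range, which the matching range sizes suggest but the text does not state; it also ignores the fact that only the parts of a range where emoji exist are mapped. The helper name emoji_code_point is hypothetical and not part of the Unicode::Japanese API.

```perl
use strict;
use warnings;

# Ranges as listed above: [first source code, last source code, first code point].
# ASSUMPTION: codes map linearly within a range; the text gives only endpoints.
my %EMOJI_RANGE = (
    imode => [0xF800, 0xF9FF, 0x0FF800],
    doti  => [0xF000, 0xF4FF, 0x0FF000],
    jsky  => [0x4500, 0x47FF, 0x0FFB00],    # first byte and second byte as a pair
);

# Hypothetical helper, not part of the Unicode::Japanese API.
sub emoji_code_point {
    my ($carrier, $code) = @_;
    my $range = $EMOJI_RANGE{$carrier} or return;
    my ($first, $last, $base) = @$range;
    return if $code < $first || $code > $last;    # outside the listed range
    return $base + ($code - $first);
}

printf "U+%06X\n", emoji_code_point(imode => 0xF89F);    # U+0FF89F
```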
PurePerl mode
use Unicode::Japanese qw(PurePerl);
Passing 'PurePerl' as an argument to use explicitly tells the module not to use XS.
BUGS
Please report bugs and requests to "bug-unicode-japanese at rt.cpan.org", or through the web interface at <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Unicode-Japanese>. The author is notified from there, and you are automatically informed of progress on your report as changes are made.
- EUC-JP and JIS strings are converted to SJIS and then to UTF-8, so characters that cannot be represented in SJIS are not converted correctly.
- When XS is used, a string in EUC-JP or SJIS (including emoji) that contains \e fails to be judged as EUC-JP or SJIS, so automatic detection and conversion do not work correctly.
- Japanese.pm contains binary data in the latter half of the file, so transferring it by FTP in ASCII mode corrupts the file.
SUPPORT
Documentation for this module can be read with the perldoc command.
perldoc Unicode::Japanese
Further information is available at:
- AnnoCPAN: Annotated CPAN documentation
<http://annocpan.org/dist/Unicode-Japanese>
- CPAN Ratings
<http://cpanratings.perl.org/d/Unicode-Japanese>
- RT: CPAN's request tracker
<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Unicode-Japanese>
- Search CPAN
<http://search.cpan.org/dist/Unicode-Japanese>
CREDITS
Thanks very much to:
NAKAYAMA Nao
SUGIURA Tatsuki & Debian JP Project
COPYRIGHT & LICENSE
Copyright 2001-2008 SANO Taku (SAWATARI Mikage) and YAMASHINA Hio, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.