NAME¶
Tcl_GetEncoding, Tcl_FreeEncoding, Tcl_GetEncodingFromObj,
  Tcl_ExternalToUtfDString, Tcl_ExternalToUtf, Tcl_UtfToExternalDString,
  Tcl_UtfToExternal, Tcl_WinTCharToUtf, Tcl_WinUtfToTChar, Tcl_GetEncodingName,
  Tcl_SetSystemEncoding, Tcl_GetEncodingNameFromEnvironment,
  Tcl_GetEncodingNames, Tcl_CreateEncoding, Tcl_GetEncodingSearchPath,
  Tcl_SetEncodingSearchPath, Tcl_GetDefaultEncodingDir,
  Tcl_SetDefaultEncodingDir - procedures for creating and using encodings
SYNOPSIS¶
#include <tcl.h>
Tcl_Encoding
Tcl_GetEncoding(interp, name)
void
Tcl_FreeEncoding(encoding)
int
Tcl_GetEncodingFromObj(interp, objPtr, encodingPtr)
char *
Tcl_ExternalToUtfDString(encoding, src, srcLen, dstPtr)
char *
Tcl_UtfToExternalDString(encoding, src, srcLen, dstPtr)
int
Tcl_ExternalToUtf(interp, encoding, src, srcLen, flags, statePtr,
                  dst, dstLen, srcReadPtr, dstWrotePtr, dstCharsPtr)
int
Tcl_UtfToExternal(interp, encoding, src, srcLen, flags, statePtr,
                  dst, dstLen, srcReadPtr, dstWrotePtr, dstCharsPtr)
char *
Tcl_WinTCharToUtf(tsrc, srcLen, dstPtr)
TCHAR *
Tcl_WinUtfToTChar(src, srcLen, dstPtr)
const char *
Tcl_GetEncodingName(encoding)
int
Tcl_SetSystemEncoding(interp, name)
const char *
Tcl_GetEncodingNameFromEnvironment(bufPtr)
void
Tcl_GetEncodingNames(interp)
Tcl_Encoding
Tcl_CreateEncoding(typePtr)
Tcl_Obj *
Tcl_GetEncodingSearchPath()
int
Tcl_SetEncodingSearchPath(searchPath)
const char *
Tcl_GetDefaultEncodingDir(void)
void
Tcl_SetDefaultEncodingDir(path)
ARGUMENTS¶
  - Tcl_Interp *interp (in)
 
  - Interpreter to use for error reporting, or NULL if no error reporting is
      desired.
 
  - const char *name (in)
 
  - Name of encoding to load.
 
  - Tcl_Encoding encoding (in)
 
  - The encoding to query, free, or use for converting text. If
      encoding is NULL, the current system encoding is used.
 
  - Tcl_Obj *objPtr (in)
 
  - Name of encoding to get token for.
 
  - Tcl_Encoding *encodingPtr (out)
 
  - Points to storage where encoding token is to be written.
 
  - const char *src (in)
 
  - For the Tcl_ExternalToUtf functions, an array of bytes in the
      specified encoding that are to be converted to UTF-8. For the
      Tcl_UtfToExternal and Tcl_WinUtfToTChar functions, an array
      of UTF-8 characters to be converted to the specified encoding.
 
  - const TCHAR *tsrc (in)
 
  - An array of Windows TCHAR characters to convert to UTF-8.
 
  - int srcLen (in)
 
  - Length of src or tsrc in bytes. If the length is negative,
      the encoding-specific length of the string is used.
 
  - Tcl_DString *dstPtr (out)
 
  - Pointer to an uninitialized or free Tcl_DString in which the
      converted result will be stored.
 
  - int flags (in)
 
  - Various flag bits OR-ed together. TCL_ENCODING_START signifies that
      the source buffer is the first block in a (potentially multi-block) input
      stream, telling the conversion routine to reset to an initial state and
      perform any initialization that needs to occur before the first byte is
      converted. TCL_ENCODING_END signifies that the source buffer is the
      last block in a (potentially multi-block) input stream, telling the
      conversion routine to perform any finalization that needs to occur after
      the last byte is converted and then to reset to an initial state.
      TCL_ENCODING_STOPONERROR signifies that the conversion routine
      should return immediately upon reading a source character that does not
      exist in the target encoding; otherwise a default fallback character will
      automatically be substituted.
 
  - Tcl_EncodingState *statePtr (in/out)
 
  - Used when converting a (generally long or indefinite length) byte stream
      in a piece-by-piece fashion. The conversion routine stores its current
      state in *statePtr after src (the buffer containing the
      current piece) has been converted; that state information must be passed
      back when converting the next piece of the stream so the conversion
      routine knows what state it was in when it left off at the end of the last
      piece. May be NULL, in which case the value specified for flags is
      ignored and the source buffer is assumed to contain the complete string to
      convert.
 
  - char *dst (out)
 
  - Buffer in which the converted result will be stored. No more than
      dstLen bytes will be stored in dst.
 
  - int dstLen (in)
 
  - The maximum length of the output buffer dst in bytes.
 
  - int *srcReadPtr (out)
 
  - Filled with the number of bytes from src that were actually
      converted. This may be less than the original source length if there was a
      problem converting some source characters. May be NULL.
 
  - int *dstWrotePtr (out)
 
  - Filled with the number of bytes that were actually stored in the output
      buffer as a result of the conversion. May be NULL.
 
  - int *dstCharsPtr (out)
 
  - Filled with the number of characters that correspond to the number of
      bytes stored in the output buffer. May be NULL.
 
  - Tcl_DString *bufPtr (out)
 
  - Storage for the prescribed system encoding name.
 
  - const Tcl_EncodingType *typePtr (in)
 
  - Structure that defines a new type of encoding.
 
  - Tcl_Obj *searchPath (in)
 
  - List of filesystem directories in which to search for encoding data
    files.
 
  - const char *path (in)
 
  - A path to the location of the encoding file.
    
    
     
   
INTRODUCTION¶
These routines convert between Tcl's internal character representation, UTF-8,
  and character representations used by various operating systems or file
  systems, such as Unicode, ASCII, or Shift-JIS. When operating on strings, for
  example when obtaining the names of files or displaying characters using
  international fonts, the strings must be translated into one or possibly
  multiple formats that the various system calls can use. For instance, on a
  Japanese Unix workstation, a user might obtain a filename represented in the
  EUC-JP file encoding and then translate the characters to the jisx0208 font
  encoding in order to display the filename in a Tk widget. The purpose of the
  encoding package is to help bridge the translation gap. UTF-8 provides an
  intermediate staging ground for all the various encodings. In the example
  above, text would be translated into UTF-8 from whatever file encoding the
  operating system is using. Then it would be translated from UTF-8 into
  whatever font encoding the display routines require.
Some basic encodings are compiled into Tcl. Others can be defined by the user or
  dynamically loaded from encoding files in a platform-independent manner.
DESCRIPTION¶
Tcl_GetEncoding finds an encoding given its name. The name may refer to a
  built-in Tcl encoding, a user-defined encoding registered by calling
  Tcl_CreateEncoding, or a dynamically-loadable encoding file. The return value
  is a token that represents the encoding and can be used in subsequent calls to
  procedures such as Tcl_GetEncodingName, Tcl_FreeEncoding, and
  Tcl_UtfToExternal. If the name does not refer to any known or loadable
  encoding, NULL is returned and an error message is left in interp.
The encoding package maintains a database of all encodings currently in use. The
  first time name is seen, Tcl_GetEncoding returns an encoding with a reference
  count of 1. If the same name is requested again, the reference count for that
  encoding is incremented without the overhead of allocating a new encoding and
  all its associated data structures.
When an encoding is no longer needed, Tcl_FreeEncoding should be called to
  release it. When an encoding is no longer in use anywhere (i.e., it has been
  freed as many times as it has been obtained), Tcl_FreeEncoding will release
  all storage the encoding was using and delete it from the database.
Tcl_GetEncodingFromObj treats the string representation of objPtr as an
  encoding name, and finds an encoding with that name, just as Tcl_GetEncoding
  does. When an encoding is found, it is cached within the objPtr value for
  future reference, the Tcl_Encoding token is written to the storage pointed to
  by encodingPtr, and the value TCL_OK is returned. If no such encoding is
  found, the value TCL_ERROR is returned, and nothing is written to
  *encodingPtr. Just as with Tcl_GetEncoding, the caller should call
  Tcl_FreeEncoding on the resulting encoding token when that token will no
  longer be used.
Tcl_ExternalToUtfDString converts a source buffer src from the specified
  encoding into UTF-8. The converted bytes are stored in dstPtr, which is then
  null-terminated. The caller should eventually call Tcl_DStringFree to free any
  information stored in dstPtr. When converting, if any of the characters in the
  source buffer cannot be represented in the target encoding, a default fallback
  character will be used. The return value is a pointer to the value stored in
  the DString.
Tcl_ExternalToUtf converts a source buffer src from the specified encoding into
  UTF-8. Up to srcLen bytes are converted from the source buffer and up to
  dstLen converted bytes are stored in dst. In all cases, *srcReadPtr is filled
  with the number of bytes that were successfully converted from src and
  *dstWrotePtr is filled with the corresponding number of bytes that were stored
  in dst. The return value is one of the following:
  - TCL_OK
 
  - All bytes of src were converted.
 
  - TCL_CONVERT_NOSPACE
 
  - The destination buffer was not large enough for all of the converted data;
      as many characters as could fit were converted though.
 
  - TCL_CONVERT_MULTIBYTE
 
  - The last few bytes in the source buffer were the beginning of a multibyte
      sequence, but more bytes were needed to complete this sequence. A
      subsequent call to the conversion routine should pass a buffer containing
      the unconverted bytes that remained in src plus some further bytes
      from the source stream to properly convert the formerly split-up multibyte
      sequence.
 
  - TCL_CONVERT_SYNTAX
 
  - The source buffer contained an invalid character sequence. This may occur
      if the input stream has been damaged or if the input encoding method was
      misidentified.
 
  - TCL_CONVERT_UNKNOWN
 
  - The source buffer contained a character that could not be represented in
      the target encoding and TCL_ENCODING_STOPONERROR was
    specified.
 
 
Tcl_UtfToExternalDString converts a source buffer src from UTF-8 into the
  specified encoding. The converted bytes are stored in dstPtr, which is then
  terminated with the appropriate encoding-specific null. The caller should
  eventually call Tcl_DStringFree to free any information stored in dstPtr. When
  converting, if any of the characters in the source buffer cannot be
  represented in the target encoding, a default fallback character will be used.
  The return value is a pointer to the value stored in the DString.
Tcl_UtfToExternal converts a source buffer src from UTF-8 into the specified
  encoding. Up to srcLen bytes are converted from the source buffer and up to
  dstLen converted bytes are stored in dst. In all cases, *srcReadPtr is filled
  with the number of bytes that were successfully converted from src and
  *dstWrotePtr is filled with the corresponding number of bytes that were stored
  in dst. The return values are the same as the return values for
  Tcl_ExternalToUtf.
Tcl_WinUtfToTChar and Tcl_WinTCharToUtf are Windows-only convenience functions
  for converting between UTF-8 and Windows strings. On Windows 95 (as with the
  Unix operating system), all strings exchanged between Tcl and the operating
  system are “char” based. On Windows NT, some strings exchanged between Tcl and
  the operating system are “char” oriented while others are in Unicode. By
  convention, in Windows a TCHAR is a character in the ANSI code page on Windows
  95 and a Unicode character on Windows NT.
If you planned to use the same “char” based interfaces on both Windows 95 and
  Windows NT, you could use Tcl_UtfToExternal and Tcl_ExternalToUtf (or their
  Tcl_DString equivalents) with an encoding of NULL (the current system
  encoding). On the other hand, if you planned to use the Unicode interface when
  running on Windows NT and the “char” interfaces when running on Windows 95,
  you would have to perform the following type of test over and over in your
  program (as represented in pseudo-code):
if (running NT) {
    encoding <- Tcl_GetEncoding("unicode");
    nativeBuffer <- Tcl_UtfToExternal(encoding, utfBuffer);
    Tcl_FreeEncoding(encoding);
} else {
    nativeBuffer <- Tcl_UtfToExternal(NULL, utfBuffer);
}
 
Tcl_WinUtfToTChar and Tcl_WinTCharToUtf automatically handle this test and use
  the proper encoding based on the current operating system. Tcl_WinUtfToTChar
  returns a pointer to a TCHAR string, and Tcl_WinTCharToUtf expects a TCHAR
  string pointer as the tsrc string. Otherwise, these functions behave
  identically to Tcl_UtfToExternalDString and Tcl_ExternalToUtfDString.
Tcl_GetEncodingName is roughly the inverse of Tcl_GetEncoding. Given an
  encoding, the return value is the name argument that was used to create the
  encoding. The string returned by Tcl_GetEncodingName is only guaranteed to
  persist until the encoding is deleted. The caller must not modify this string.
Tcl_SetSystemEncoding sets the default encoding that should be used whenever the
  user passes a NULL value for the encoding argument to any of the other
  encoding functions. If name is NULL, the system encoding is reset to the
  default system encoding, binary. If the name does not refer to any known or
  loadable encoding, TCL_ERROR is returned and an error message is left in
  interp. Otherwise, this procedure increments the reference count of the new
  system encoding, decrements the reference count of the old system encoding,
  and returns TCL_OK.
Tcl_GetEncodingNameFromEnvironment provides a means for the Tcl library to
  report the encoding name it believes to be the correct one to use as the
  system encoding, based on system calls and examination of the environment
  suitable for the platform. It accepts bufPtr, a pointer to an uninitialized or
  freed Tcl_DString, and writes the encoding name to it. The Tcl_DStringValue
  of bufPtr is returned.
Tcl_GetEncodingNames sets the interp result to a list consisting of the names of
  all the encodings that are currently defined or can be dynamically loaded,
  searching the encoding path specified by Tcl_SetDefaultEncodingDir. This
  procedure does not ensure that the dynamically-loadable encoding files contain
  valid data, but merely that they exist.
Tcl_CreateEncoding defines a new encoding and registers the C procedures that
  are called back to convert between the encoding and UTF-8. Encodings created
  by Tcl_CreateEncoding are thereafter visible in the database used by
  Tcl_GetEncoding. Just as with the Tcl_GetEncoding procedure, the return value
  is a token that represents the encoding and can be used in subsequent calls to
  other encoding functions. Tcl_CreateEncoding returns an encoding with a
  reference count of 1. If an encoding with the specified name already exists,
  then its entry in the database is replaced with the new encoding; the token
  for the old encoding will remain valid and continue to behave as before, but
  users of the new token will now call the new encoding procedures.
The typePtr argument to Tcl_CreateEncoding contains information about the name
  of the encoding and the procedures that will be called to convert between this
  encoding and UTF-8. It is defined as follows:
typedef struct Tcl_EncodingType {
    const char *encodingName;
    Tcl_EncodingConvertProc *toUtfProc;
    Tcl_EncodingConvertProc *fromUtfProc;
    Tcl_EncodingFreeProc *freeProc;
    ClientData clientData;
    int nullSize;
} Tcl_EncodingType;
 
The encodingName provides a string name for the encoding, by which it can be
  referred to in other procedures such as Tcl_GetEncoding. The toUtfProc refers
  to a callback procedure to invoke to convert text from this encoding into
  UTF-8. The fromUtfProc refers to a callback procedure to invoke to convert
  text from UTF-8 into this encoding. The freeProc refers to a callback
  procedure to invoke when this encoding is deleted. The freeProc field may be
  NULL. The clientData contains an arbitrary one-word value passed to
  toUtfProc, fromUtfProc, and freeProc whenever they are called. Typically,
  this is a pointer to a data structure containing encoding-specific
  information that can be used by the callback procedures. For instance, two
  very similar encodings such as ascii and macRoman may use the same callback
  procedure, but use different values of clientData to control its behavior.
  The nullSize specifies the number of zero bytes that signify end-of-string in
  this encoding. It must be 1 (for single-byte or multi-byte encodings like
  ASCII or Shift-JIS) or 2 (for double-byte encodings like Unicode).
  Constant-sized encodings with 3 or more bytes per character (such as
  CNS11643) are not accepted.
The callback procedures toUtfProc and fromUtfProc should match the type
  Tcl_EncodingConvertProc:
typedef int Tcl_EncodingConvertProc(
    ClientData clientData,
    const char *src,
    int srcLen,
    int flags,
    Tcl_EncodingState *statePtr,
    char *dst,
    int dstLen,
    int *srcReadPtr,
    int *dstWrotePtr,
    int *dstCharsPtr);
 
The toUtfProc and fromUtfProc procedures are called by the Tcl_ExternalToUtf or
  Tcl_UtfToExternal family of functions to perform the actual conversion. The
  clientData parameter to these procedures is the same as the clientData field
  specified to Tcl_CreateEncoding when the encoding was created. The remaining
  arguments to the callback procedures are the same as the arguments,
  documented at the top, to Tcl_ExternalToUtf or Tcl_UtfToExternal, with the
  following exceptions. If the srcLen argument to one of those high-level
  functions is negative, the value passed to the callback procedure will be the
  appropriate encoding-specific string length of src. If any of the srcReadPtr,
  dstWrotePtr, or dstCharsPtr arguments to one of the high-level functions is
  NULL, the corresponding value passed to the callback procedure will be a
  non-NULL location.
The callback procedure freeProc, if non-NULL, should match the type
  Tcl_EncodingFreeProc:
typedef void Tcl_EncodingFreeProc(
        ClientData  clientData);
 
This freeProc function is called when the encoding is deleted. The clientData
  parameter is the same as the clientData field specified to Tcl_CreateEncoding
  when the encoding was created.
Tcl_GetEncodingSearchPath and Tcl_SetEncodingSearchPath are called to access and
  set the list of filesystem directories searched for encoding data files.
The value returned by Tcl_GetEncodingSearchPath is the value stored by the last
  successful call to Tcl_SetEncodingSearchPath. If no calls to
  Tcl_SetEncodingSearchPath have occurred, Tcl will compute an initial value
  based on the environment. There is one encoding search path for the entire
  process, shared by all threads in the process.
Tcl_SetEncodingSearchPath stores searchPath and returns TCL_OK, unless
  searchPath is not a valid Tcl list, which causes TCL_ERROR to be returned. The
  elements of searchPath are not verified as existing readable filesystem
  directories. When a search for encoding data files takes place, any
  non-existent or non-readable filesystem directories on the searchPath are
  silently ignored.
Tcl_GetDefaultEncodingDir and Tcl_SetDefaultEncodingDir are obsolete interfaces
  best replaced with calls to Tcl_GetEncodingSearchPath and
  Tcl_SetEncodingSearchPath. They are called to access and set the first element
  of the searchPath list. Since Tcl searches searchPath for encoding data files
  in list order, these routines establish the “default” directory in which to
  find encoding data files.
ENCODING FILES¶
Space would prohibit precompiling into Tcl every possible encoding algorithm, so
  many encodings are stored on disk as dynamically-loadable encoding files. This
  behavior also allows the user to create additional encoding files that can be
  loaded using the same mechanism. These encoding files contain information
  about the tables and/or escape sequences used to map between an external
  encoding and Unicode. The external encoding may consist of single-byte,
  multi-byte, or double-byte characters.
Each dynamically-loadable encoding is represented as a text file. The initial
  line of the file, beginning with a “#” symbol, is a comment that
  provides a human-readable description of the file. The next line identifies
  the type of encoding file. It can be one of the following letters:
  - [1] S
 
  - A single-byte encoding, where one character is always one byte long in the
      encoding. An example is iso8859-1, used by many European
    languages.
 
  - [2] D
 
  - A double-byte encoding, where one character is always two bytes long in
      the encoding. An example is big5, used for Chinese text.
 
  - [3] M
 
  - A multi-byte encoding, where one character may be either one or two bytes
      long. Certain bytes are lead bytes, indicating that another byte must
      follow and that together the two bytes represent one character. Other
      bytes are not lead bytes and represent themselves. An example is
      shiftjis, used by many Japanese computers.
 
  - [4] E
 
  - An escape-sequence encoding, specifying that certain sequences of bytes do
      not represent characters, but commands that describe how following bytes
      should be interpreted.
 
The rest of the lines in the file depend on the type.
Cases [1], [2], and [3] are collectively referred to as table-based encoding
  files. The lines in a table-based encoding file are in the same format as this
  example taken from the shiftjis encoding (this is not the complete file):
# Encoding file: shiftjis, multi-byte
M
003F 0 40
00
0000000100020003000400050006000700080009000A000B000C000D000E000F
0010001100120013001400150016001700180019001A001B001C001D001E001F
0020002100220023002400250026002700280029002A002B002C002D002E002F
0030003100320033003400350036003700380039003A003B003C003D003E003F
0040004100420043004400450046004700480049004A004B004C004D004E004F
0050005100520053005400550056005700580059005A005B005C005D005E005F
0060006100620063006400650066006700680069006A006B006C006D006E006F
0070007100720073007400750076007700780079007A007B007C007D203E007F
0080000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000FF61FF62FF63FF64FF65FF66FF67FF68FF69FF6AFF6BFF6CFF6DFF6EFF6F
FF70FF71FF72FF73FF74FF75FF76FF77FF78FF79FF7AFF7BFF7CFF7DFF7EFF7F
FF80FF81FF82FF83FF84FF85FF86FF87FF88FF89FF8AFF8BFF8CFF8DFF8EFF8F
FF90FF91FF92FF93FF94FF95FF96FF97FF98FF99FF9AFF9BFF9CFF9DFF9EFF9F
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
81
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
300030013002FF0CFF0E30FBFF1AFF1BFF1FFF01309B309C00B4FF4000A8FF3E
FFE3FF3F30FD30FE309D309E30034EDD30053006300730FC20152010FF0F005C
301C2016FF5C2026202520182019201C201DFF08FF0930143015FF3BFF3DFF5B
FF5D30083009300A300B300C300D300E300F30103011FF0B221200B100D70000
00F7FF1D2260FF1CFF1E22662267221E22342642264000B0203220332103FFE5
FF0400A200A3FF05FF03FF06FF0AFF2000A72606260525CB25CF25CE25C725C6
25A125A025B325B225BD25BC203B301221922190219121933013000000000000
000000000000000000000000000000002208220B2286228722822283222A2229
000000000000000000000000000000002227222800AC21D221D4220022030000
0000000000000000000000000000000000000000222022A52312220222072261
2252226A226B221A223D221D2235222B222C0000000000000000000000000000
212B2030266F266D266A2020202100B6000000000000000025EF000000000000
 
The third line of the file is three numbers. The first number is the fallback
  character (in base 16) to use when converting from UTF-8 to this encoding. The
  second number is a 1 if this file represents the encoding for a symbol font,
  or 0 otherwise. The last number (in base 10) is how many pages of data follow.
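The three fields of that line can be read with a single sscanf call, as in the
  sketch below. The TableHeader structure and the ParseTableHeader function are
  hypothetical names used only for this illustration; they are not part of the
  Tcl API.

```c
#include <stdio.h>

/* Hypothetical container for the third line of a table-based file. */
typedef struct {
    unsigned int fallback;  /* fallback character, written in base 16 */
    int symbol;             /* 1 for a symbol font, 0 otherwise */
    int numPages;           /* number of pages that follow, in base 10 */
} TableHeader;

/* Returns 1 on success, 0 if the line does not hold three numbers. */
static int
ParseTableHeader(const char *line, TableHeader *hdr)
{
    return sscanf(line, "%x %d %d", &hdr->fallback, &hdr->symbol,
            &hdr->numPages) == 3;
}
```

Applied to the shiftjis line "003F 0 40", this yields a fallback character of
  0x3F (the question mark), a symbol flag of 0, and a page count of 40.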
Subsequent lines in the example above are pages that describe how to map from
  the encoding into 2-byte Unicode. The first line in a page identifies the page
  number. Following it are 256 double-byte numbers, arranged as 16 rows of 16
  numbers. Given a character in the encoding, the high byte of that character is
  used to select which page, and the low byte of that character is used as an
  index to select one of the double-byte numbers in that page; the value
  obtained is the corresponding Unicode character. By examination of the
  example above, one can see that the characters 0x7E and 0x8163 in shiftjis
  map to 203E and 2026 in Unicode, respectively.
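The page lookup just described can be sketched in a few lines of C. The function
  name LookupInPage and its calling convention (an array holding the 16 text
  rows of one page) are assumptions of this example; Tcl's own loader is
  organized differently.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given the 16 text rows of the page selected by a
 * character's high byte, returns the Unicode value stored for lowByte.
 * Returns 0 (no such character) if the needed row is missing or short.
 */
static unsigned int
LookupInPage(const char *const rows[16], unsigned int lowByte)
{
    /* The high nibble of the low byte picks the row. */
    const char *row = rows[(lowByte >> 4) & 0xF];
    char hex[5];

    if (row == NULL || strlen(row) < 64) {
        return 0;
    }
    /* The low nibble picks one of the 16 four-digit numbers in the row. */
    memcpy(hex, row + 4 * (lowByte & 0xF), 4);
    hex[4] = '\0';
    return (unsigned int) strtoul(hex, NULL, 16);
}
```

For the character 0x8163, the high byte 0x81 selects page 81 and the low byte
  0x63 selects row 6, column 3 of that page, which holds 2026 in the shiftjis
  excerpt.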
Following the first page will be all the other pages, each in the same format as
  the first: one number identifying the page followed by 256 double-byte Unicode
  characters. If a character in the encoding maps to the Unicode character 0000,
  it means that the character does not actually exist. If all characters on a
  page would map to 0000, that page can be omitted.
Case [4] is the escape-sequence encoding file. The lines in this type of file
  are in the same format as this example taken from the iso2022-jp encoding:
# Encoding file: iso2022-jp, escape-driven
E
init		{}
final		{}
iso8859-1	\x1b(B
jis0201		\x1b(J
jis0208		\x1b$@
jis0208		\x1b$B
jis0212		\x1b$(D
gb2312		\x1b$A
ksc5601		\x1b$(C
 
In the file, the first column represents an option and the second column is the
  associated value. init is a string to emit or expect before the first
  character is converted, while final is a string to emit or expect after the
  last character. All other options are names of table-based encodings; the
  associated value is the escape-sequence that marks that encoding. Tcl syntax
  is used for the values; in the above example, for instance, “{}” represents
  the empty string and “\x1b” represents character 27.
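As a rough illustration of how such a value turns into bytes, the sketch below
  decodes only the two forms that appear in the example: the literal {} and
  text containing \x escapes followed by exactly two hexadecimal digits. Full
  Tcl syntax allows much more, so this is a deliberately limited subset, and
  the function name DecodeEscapeValue is hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: decodes a value from an escape-sequence encoding
 * file into out. Returns the number of bytes written, or -1 if out is
 * too small. Only "{}" and two-digit \xHH escapes are understood.
 */
static int
DecodeEscapeValue(const char *value, char *out, int outSize)
{
    int n = 0;

    if (strcmp(value, "{}") == 0) {
        return 0;                       /* the empty string */
    }
    while (*value != '\0') {
        char c = *value++;

        if (c == '\\' && *value == 'x'
                && isxdigit((unsigned char) value[1])
                && isxdigit((unsigned char) value[2])) {
            char hex[3];

            hex[0] = value[1];
            hex[1] = value[2];
            hex[2] = '\0';
            c = (char) strtol(hex, NULL, 16);
            value += 3;                 /* skip 'x' and both digits */
        }
        if (n >= outSize) {
            return -1;
        }
        out[n++] = c;
    }
    return n;
}
```

For the jis0212 entry, the value \x1b$(D decodes to four bytes: character 27
  followed by the three literal characters $, (, and D.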
When Tcl_GetEncoding encounters an encoding name that has not been loaded, it
  attempts to load an encoding file called name.enc from the encoding
  subdirectory of each directory that Tcl searches for its script library. If
  the encoding file exists, but is malformed, an error message will be left in
  interp.
KEYWORDS¶
utf, encoding, convert