.\" Copyright (C) 2001 Information-technology Promotion Agency (IPA) .\" Copyright (C) 2001-2011 .\" National Institute of Advanced Industrial Science and Technology (AIST) .\" This file is part of the m17n library documentation. .\" Permission is granted to copy, distribute and/or modify this document .\" under the terms of the GNU Free Documentation License, Version 1.2 or .\" any later version published by the Free Software Foundation; with no .\" Invariant Section, no Front-Cover Texts, .\" and no Back-Cover Texts. A copy of the license is included in the .\" appendix entitled "GNU Free Documentation License". .TH "mdbGeneral" 5 "Mon Sep 25 2023" "Version 1.8.4" "The m17n Library" \" -*- nroff -*- .ad l .nh .SH NAME mdbGeneral \- General Format .SH "DESCRIPTION" .PP The mdatabase_load() function returns the data specified by tags in the form of plist if the first tag is not \fCMchartable\fP nor \fCMcharset\fP\&. The keys of the returned plist are limited to \fCMinteger\fP, \fCMsymbol\fP, \fCMtext\fP, and \fCMplist\fP\&. The type of the value is unambiguously determined by the corresponding key\&. If the key is \fCMinteger\fP, the value is an integer\&. If the key is \fCMsymbol\fP, the value is a symbol\&. And so on\&. .PP A number of expressions are possible to represent a plist\&. For instance, we can use the form \fC(K1:V1, K2:V2, \&.\&.\&., Kn:Vn)\fP to represent a plist whose first property key and value are K1 and V1, second key and value are K2 and V2, and so on\&. However, we can use a simpler expression here because the types of plists used in the m17n database are fairly restricted\&. .PP Hereafter, we use an expression, which is similar to S\-expression, to represent a plist\&. (Actually, the default database loader of the m17n library is designed to read data files written in this expression\&.) .PP The expression consists of one or more \fIelements\fP\&. Each element represents a property, i\&.e\&. a single element of a plist\&. 
.PP
Elements are separated by one or more \fIwhitespace\fP characters, i\&.e\&. a space (code 32), a tab (code 9), or a newline (code 10)\&. Comments begin with a semicolon (\fC;\fP) and extend to the end of the line\&.
.PP
The key and the value of each property are determined by the type of the element, as explained below\&.
.PP
.PD 0
.IP "\(bu" 2
INTEGER
.PP
An element that matches the regular expression \fC\-?[0\-9]+\fP or \fC0[xX][0\-9A\-Fa\-f]+\fP represents a property whose key is \fCMinteger\fP\&. An element matching the former expression is interpreted as an integer in decimal notation, and one matching the latter is interpreted as an integer in hexadecimal notation\&. The value of the property is the result of this interpretation\&.
.PP
For instance, the element \fC0xA0\fP represents a property whose value is 160 in decimal\&.
.PP
.IP "\(bu" 2
SYMBOL
.PP
An element that matches the regular expression \fC[^\-(0\-9]\fP\fC([^\\()]|\\\&.)+\fP represents a property whose key is \fCMsymbol\fP\&. In the element, \fC\\t\fP, \fC\\n\fP, \fC\\r\fP, and \fC\\e\fP are replaced with tab (code 9), newline (code 10), carriage return (code 13), and escape (code 27) respectively\&. Any other character following a backslash is interpreted literally\&. The value of the property is the symbol having the resulting string as its name\&.
.PP
For instance, the element \fCabc\\ def\fP represents a property whose value is the symbol having the name 'abc def'\&.
.PP
.IP "\(bu" 2
MTEXT
.PP
An element that matches the regular expression \fC"([^"]|\\")*"\fP represents a property whose key is \fCMtext\fP\&. The backslash escapes explained above also apply here\&. Moreover, each part of the element matching the regular expression \fC\\[xX][0\-9A\-Fa\-f][0\-9A\-Fa\-f]\fP is replaced with the byte given by its hexadecimal interpretation\&.
.PP
After the backslash escapes have been resolved, the byte sequence between the double quotes is interpreted as a UTF\-8 sequence and decoded into an M\-text\&.
This M\-text is the value of the property\&.
.PP
.IP "\(bu" 2
PLIST
.PP
Zero or more elements surrounded by a pair of parentheses represent a property whose key is \fCMplist\fP\&. Whitespace before and after a parenthesis can be omitted\&. The value of the property is a plist, which is the result of recursive interpretation of the elements between the parentheses\&.
.PP
.PP
.SH "SYNTAX NOTATION"
.PP
In an explanation of a plist format of data, a BNF\-like notation is used\&. In the notation, non\-terminals are represented by a string of uppercase letters (including '\-' in the middle), and terminals are represented by a string surrounded by single quotes, as in \fC'('\fP\&. The special non\-terminals INTEGER, SYMBOL, MTEXT, and PLIST represent a property whose value is an integer, a symbol, an M\-text, or a plist respectively\&.
.SH "EXAMPLE"
.PP
Here is an example syntax for database data that is read into a plist of this simple format:
.PP
.PP
.nf
DATA\-FORMAT ::= [ INTEGER | SYMBOL | MTEXT | FUNC ] *
FUNC ::= '(' FUNC\-NAME FUNC\-ARG * ')'
FUNC\-NAME ::= SYMBOL
FUNC\-ARG ::= INTEGER | SYMBOL | MTEXT | '(' FUNC\-ARG ')'
.fi
.PP
.PP
For instance, a data file that contains this text matches the above syntax:
.PP
.PP
.nf
abc 123 (pqr 0xff) "m\\"text" (_\\\\_ ("string" xyz) \-456)
.fi
.PP
.PP
and is read into this plist:
.PP
.PP
.nf
1st element: key: Msymbol, value: abc
2nd element: key: Minteger, value: 123
3rd element: key: Mplist, value: a plist of these elements:
    1st element: key: Msymbol, value: pqr
    2nd element: key: Minteger, value: 255
4th element: key: Mtext, value: m"text
5th element: key: Mplist, value: a plist of these elements:
    1st element: key: Msymbol, value: _\\_
    2nd element: key: Mplist, value: a plist of these elements:
        1st element: key: Mtext, value: string
        2nd element: key: Msymbol, value: xyz
    3rd element: key: Minteger, value: \-456
.fi
.PP
.SH COPYRIGHT
Copyright (C) 2001 Information\-technology Promotion Agency (IPA)
.br
Copyright (C) 2001\-2011 National Institute of Advanced Industrial Science and Technology (AIST)
.br
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License\&.
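.SH "PARSING SKETCH"
.PP
The element rules in the DESCRIPTION section can be illustrated with a small reader\&. The following Python listing is a minimal sketch under stated assumptions, not the loader of the m17n library\&. The names \fCparse\fP, \fCunescape\fP, and \fCas_integer\fP are hypothetical, and a plist is modeled as a list of (key, value) pairs with the key given as a string\&. As a simplification, any token that is not an INTEGER is treated as a SYMBOL, a symbol is assumed to end at an unescaped whitespace character or parenthesis, and symbol names are assumed to be UTF\-8\&. The backslash is written as character code 92 so that the listing itself contains no escape sequences\&.
.PP
.nf
```python
# Sketch of a reader for the general format; all names are illustrative.
BACKSLASH = 92                                   # code of the backslash
ESCAPES = {ord("t"): 9, ord("n"): 10, ord("r"): 13, ord("e"): 27}
DIGITS = b"0123456789"
HEX = DIGITS + b"abcdefABCDEF"
SPACE = bytes([32, 9, 10])                       # space, tab, newline
DELIMS = SPACE + b"()"


def unescape(raw, hex_escapes):
    """Resolve backslash escapes in raw bytes and return the new bytes."""
    out = bytearray()
    i = 0
    while i < len(raw):
        c = raw[i]
        if c != BACKSLASH or i + 1 == len(raw):
            out.append(c)
            i += 1
            continue
        nxt = raw[i + 1]
        pair = raw[i + 2:i + 4]
        is_hex = len(pair) == 2 and all(h in HEX for h in pair)
        if hex_escapes and nxt in b"xX" and is_hex:
            out.append(int(pair, 16))            # two hex digits: one byte
            i += 4
        else:
            out.append(ESCAPES.get(nxt, nxt))    # other characters: literal
            i += 2
    return bytes(out)


def as_integer(tok):
    """Return the value of an INTEGER token, or None for anything else."""
    body = tok[1:] if tok[:1] == b"-" else tok
    if body and all(c in DIGITS for c in body):
        return int(tok)
    if tok[:2] in (b"0x", b"0X") and len(tok) > 2:
        if all(c in HEX for c in tok[2:]):
            return int(tok[2:], 16)
    return None


def parse(data, pos=0, nested=False):
    """Parse elements of data from pos; return (plist, next position)."""
    plist = []
    n = len(data)
    while pos < n:
        c = data[pos]
        if c in SPACE:
            pos += 1
        elif c == ord(";"):                      # comment: skip to newline
            while pos < n and data[pos] != 10:
                pos += 1
        elif c == ord("("):                      # PLIST: recurse
            value, pos = parse(data, pos + 1, True)
            plist.append(("Mplist", value))
        elif c == ord(")"):
            if not nested:
                raise ValueError("unbalanced closing parenthesis")
            return plist, pos + 1
        elif c == ord('"'):                      # MTEXT: find closing quote
            end = pos + 1
            while end < n and data[end] != ord('"'):
                end += 2 if data[end] == BACKSLASH else 1
            if end >= n:
                raise ValueError("unterminated M-text")
            raw = unescape(data[pos + 1:end], True)
            plist.append(("Mtext", raw.decode("utf-8")))
            pos = end + 1
        else:                                    # INTEGER or SYMBOL token
            end = pos
            while end < n and data[end] not in DELIMS:
                end += 2 if data[end] == BACKSLASH else 1
            end = min(end, n)
            tok = data[pos:end]
            num = as_integer(tok)
            if num is None:
                name = unescape(tok, False).decode("utf-8")
                plist.append(("Msymbol", name))
            else:
                plist.append(("Minteger", num))
            pos = end
    if nested:
        raise ValueError("missing closing parenthesis")
    return plist, pos


# The data of the EXAMPLE section, with each backslash spelled as chr(92).
BS = chr(92)
sample = ('abc 123 (pqr 0xff) "m' + BS + '"text" (_' + BS + BS
          + '_ ("string" xyz) -456)')
plist, _ = parse(sample.encode("utf-8"))
for key, value in plist:
    print(key, value)
```
.fi
.PP
Applied to the data of the EXAMPLE section, \fCparse\fP is expected to yield five top\-level pairs whose keys are Msymbol, Minteger, Mplist, Mtext, and Mplist, in that order\&. Working on bytes rather than decoded text keeps the hexadecimal escapes of an M\-text simple, because each escape contributes exactly one byte before the UTF\-8 decoding step\&.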