ZEBRASRV(8) | [FIXME: manual] | ZEBRASRV(8) |
NAME
zebrasrv - Zebra Server
SYNOPSIS
zebrasrv
[-install] [-installa] [-remove]
[-a file] [-v level]
[-l file] [-u uid]
[-c config] [-f vconfig]
[-C fname] [-t minutes]
[-k kilobytes] [-d daemon]
[-w dir] [-p pidfile]
[-ziDST1] [listener-spec...]
DESCRIPTION
Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (e.g. email, XML, MARC) and allows access to them through exact boolean search expressions and relevance-ranked free-text queries.
OPTIONS
The options for zebrasrv are the same as those for YAZ' yaz-ztest. Option -c specifies a Zebra configuration file; if omitted, zebra.cfg is read.
-a file
Specify a file for dumping PDUs (for
diagnostic purposes). The special name - (dash) sends output to stderr.
-S
Don't fork or make threads on connection
requests. This is good for debugging, but not recommended for real operation:
Although the server is asynchronous and non-blocking, it can be nice to keep a
software malfunction (okay then, a crash) from affecting all current users.
The server can only accept a single connection in this mode.
-1
Like -S but after one session the server
exits. This mode is for debugging only.
-T
Operate the server in threaded mode. The
server creates a thread for each connection rather than forking a process. Only
available on UNIX systems that offer POSIX threads.
-s
Use the SR protocol (obsolete).
-z
Use the Z39.50 protocol (default). This option
and -s complement each other. You can use both multiple times on the same
command line, between listener-specifications (see below). This way, you can
set up the server to listen for connections in both protocols concurrently, on
different local ports.
-l file
Specify an output file for the diagnostic
messages. The default is to write this information to stderr.
-c config-file
Read configuration information from
config-file. The default configuration is ./zebra.cfg.
-f vconfig
This specifies an XML file that describes one
or more YAZ frontend virtual servers. See section VIRTUAL HOSTS for
details.
-C fname
Sets SSL certificate file name for server
(PEM).
-v level
The log level. Use a comma-separated list of
members of the set {fatal,debug,warn,log,malloc,all,none}.
-u uid
Set user ID. Sets the real UID of the server
process to that of the given user. It's useful if you aren't comfortable with
having the server run as root, but you need to start it as such to bind a
privileged port.
-w working-directory
The server changes to this working directory
before listening for incoming connections. This option is useful when
the server is operating from the inetd daemon (see -i).
-p pidfile
Specifies that the server should write its
process ID to the file given by pidfile. A typical location would be
/var/run/zebrasrv.pid.
-i
Use this to make the server run from the
inetd server (UNIX only). Make sure you use the logfile option -l in
conjunction with this mode and specify the -l option before any other
options.
-D
Use this to make the server put itself in the
background and run as a daemon. If neither -i nor -D is given, the server
starts in the foreground.
-install
Use this to install the server as an NT
service (Windows NT/2000/XP only). Control the server by going to the Services
in the Control Panel.
-installa
Use this to install and activate the server as
an NT service (Windows NT/2000/XP only). Control the server by going to the
Services in the Control Panel.
-remove
Use this to remove the server from the NT
services (Windows NT/2000/XP only).
-t minutes
Idle session timeout, in minutes. Default is
60 minutes.
-k size
Maximum record size/message size, in
kilobytes. Default is 1024 KB (1 MB).
-d daemon
Set name of daemon to be used in hosts access
file. See hosts_access(5) and tcpd(8).
A listener-spec consists of an optional transport mode followed by a
colon (:) followed by a listener address. The transport mode is either a file
system socket unix, an SSL TCP/IP socket ssl, or a plain TCP/IP socket tcp
(default).
For TCP, an address has the form
    hostname | IP-number [: portnumber]
For example:
    zebrasrv @
    zebrasrv tcp:some.server.name.org:1234
    zebrasrv ssl:@:3000
The same kinds of listener can be combined with -u to run the server as the
user daemon:
    zebrasrv -u daemon @
    zebrasrv -u daemon tcp:@:210
    zebrasrv -u daemon unix:/some/file/system/socket
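To make the listener-spec grammar concrete, the following Python sketch splits a spec into its transport mode, address and optional port. It is an illustration only, not the parser that YAZ or zebrasrv actually uses; the names ListenerSpec and parse_listener_spec are hypothetical, and IPv6 literals are deliberately not handled.

```python
from typing import NamedTuple, Optional


class ListenerSpec(NamedTuple):
    transport: str        # "tcp", "ssl" or "unix"
    address: str          # host name, IP number, "@", or a socket path
    port: Optional[int]   # None when no port was given (always None for unix)


def parse_listener_spec(spec: str) -> ListenerSpec:
    """Split a listener-spec into transport, address and optional port.

    Hypothetical helper: it mirrors the grammar described in the text,
    not the behaviour of the real server.
    """
    transport, address = "tcp", spec              # tcp is the documented default
    for mode in ("tcp", "ssl", "unix"):
        if spec.startswith(mode + ":"):
            transport, address = mode, spec[len(mode) + 1:]
            break
    if transport == "unix":
        return ListenerSpec(transport, address, None)   # a path carries no port
    host, sep, port = address.rpartition(":")
    if sep and port.isdigit():
        return ListenerSpec(transport, host, int(port))
    return ListenerSpec(transport, address, None)


for example in ("@", "tcp:some.server.name.org:1234", "ssl:@:3000",
                "unix:/some/file/system/socket"):
    print(example, "->", parse_listener_spec(example))
```

The sketch leaves the port as None when it is omitted, because the default port is chosen by the server rather than by the spec.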
Z39.50 PROTOCOL SUPPORT AND BEHAVIOR
Z39.50 Initialization
During initialization, the server will negotiate to version 3 of the Z39.50 protocol, and the option bits for Search, Present, Scan, NamedResultSets, and concurrentOperations will be set, if requested by the client. The maximum PDU size is negotiated down to a maximum of 1 MB by default.
Z39.50 Search
The supported query types are 1 and 101. All operators are currently supported with the restriction that only proximity units of type "word" are supported for the proximity operator. Queries can be arbitrarily complex. Named result sets are supported, and result sets can be used as operands without limitations. Searches may span multiple databases. The server has full support for piggy-backed retrieval (see also the following section).
Z39.50 Present
The present facility is supported in a standard fashion. The requested record syntax is matched against the ones supported by the profile of each record retrieved. If no record syntax is given, SUTRS is the default. The requested element set name, again, is matched against any provided by the relevant record profiles.
Z39.50 Scan
The attribute combinations provided with the termListAndStartPoint are processed in the same way as operands in a query (see above). Currently, only the term and the globalOccurrences are returned with the termInfo structure.
Z39.50 Sort
Z39.50 Close
If a Close PDU is received, the server will respond with a Close PDU with reason=FINISHED, no matter which protocol version was negotiated during initialization. If the protocol version is 3 or more, the server will generate a Close PDU under certain circumstances, including a session timeout (60 minutes by default), and certain kinds of protocol errors. Once a Close PDU has been sent, the protocol association is considered broken, and the transport connection will be closed immediately upon receipt of further data, or following a short timeout.
Z39.50 Explain
Zebra maintains a "classic" Z39.50 Explain[1] database on the side. This database is called IR-Explain-1 and can be searched using the attribute set exp-1. The records in the explain database are of type grs.sgml. The root element for the Explain grs.sgml records is explain, thus explain.abs is used for indexing.
THE SRU SERVER
In addition to Z39.50, Zebra supports the more recent and web-friendly IR protocol SRU[2]. SRU can be carried over SOAP or a REST-like protocol that uses HTTP GET or POST to request search responses. The request itself is made of parameters such as query, startRecord, maximumRecords and recordSchema; the response is an XML document containing hit-count, result-set records, diagnostics, etc. SRU can be thought of as a re-casting of Z39.50 semantics in web-friendly terms; or as a standardisation of the ad-hoc query parameters used by search engines such as Google and AltaVista; or as a superset of A9's OpenSearch (which it predates).
Zebra supports Z39.50, SRU GET, SRU POST and SRU SOAP (SRW), all on the same port, recognising which protocol is used by each incoming request and handling it accordingly. This is achieved through the use of Deep Magic; civilians are warned not to stand too close.
Running zebrasrv as an SRU Server
Because Zebra supports all protocols on one port, it would seem to follow that the SRU server is run in the same way as the Z39.50 server, as described above. This is true, but only in an uninterestingly vacuous way: a Zebra server run in this manner will indeed recognise and accept SRU requests; but since it doesn't know how to handle the CQL queries that these protocols use, all it can do is send failure responses.
A PQF query can still be submitted through the non-standard x-pquery parameter (the URLs here are broken across lines for readability):
    http://localhost:9999/Default?version=1.1
        &operation=searchRetrieve
        &x-pquery=mineral
        &startRecord=1
        &maximumRecords=1
To handle CQL, give the server a YAZ frontend configuration file (option -f) that names a CQL-to-RPN conversion file:
    <yazgfs>
      <server>
        <config>zebra.cfg</config>
        <cql2rpn>../../tab/pqf.properties</cql2rpn>
      </server>
    </yazgfs>
With such a configuration in place, a CQL query can be used:
    http://localhost:9999/Default?version=1.1
        &operation=searchRetrieve
        &query=title=utah and description=epicent*
        &startRecord=1
        &maximumRecords=1
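The CQL request above contains spaces and other characters that need percent-encoding before they can travel in a real HTTP GET. A minimal Python sketch of building such a URL with the standard library follows; the function name sru_search_url is hypothetical, and the base URL is simply the localhost address used in the example.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs


def sru_search_url(base, cql, start=1, maximum=1, version="1.1"):
    """Build an SRU searchRetrieve URL (hypothetical helper, stdlib only)."""
    params = [
        ("version", version),
        ("operation", "searchRetrieve"),
        ("query", cql),                 # the CQL query, percent-encoded below
        ("startRecord", start),
        ("maximumRecords", maximum),
    ]
    # quote_via=quote encodes spaces as %20 rather than "+".
    return base + "?" + urlencode(params, quote_via=quote)


url = sru_search_url("http://localhost:9999/Default",
                     "title=utah and description=epicent*")
print(url)

# Round trip: decoding the query string gives back the original CQL text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Encoding the whole query value, including the = signs inside the CQL, keeps the parameter boundaries unambiguous for whatever parses the request.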
SRU PROTOCOL SUPPORT AND BEHAVIOR
Zebra running as an SRU server supports SRU version 1.1, including CQL version 1.1. In particular, it provides support for the following elements of the protocol.
SRU Search and Retrieval
Zebra supports the SRU searchRetrieve[3] operation. One of the great strengths of SRU is that it mandates a standard query language, CQL, and that all conforming implementations can therefore be trusted to correctly interpret the same queries. It is with some shame, then, that we admit that Zebra also supports an additional query language, our own Prefix Query Format (PQF[4]). A PQF query is submitted using the extension parameter x-pquery, in which case the query parameter must be omitted. Please feel free to use this facility within your own applications; but be aware that such a request is not only non-standard SRU but not even syntactically valid, since it omits the mandatory query parameter.
SRU Scan
Zebra supports the SRU scan[5] operation. Scanning using CQL syntax is the default, where the standard scanClause parameter is used. In addition, a mutant form of SRU scan is supported, using the non-standard x-pScanClause parameter in place of the standard scanClause to scan on a PQF query clause.
SRU Explain
Zebra supports SRU explain[6]. The ZeeRex record explaining a database may be requested either with a fully fledged SRU request (with operation=explain and version-number specified) or with a simple HTTP GET at the server's basename. The ZeeRex record returned in response is the one embedded in the YAZ Frontend Server configuration file that is described in the section called "YAZ SERVER VIRTUAL HOSTS".
Unfortunately, the data found in the CQL-to-PQF text file must be added by hand to the explain section of the YAZ Frontend Server configuration file to be able to provide a suitable explain record. Too bad, but this is all extreme new alpha stuff, and a lot of work has yet to be done.
There is no linkage whatsoever between the Z39.50 explain model and the SRU explain response (well, at least not implemented in Zebra, that is). Zebra does not provide a means using Z39.50 to obtain the ZeeRex record.
Other SRU operations
In the Z39.50 protocol, Initialization, Present, Sort and Close are separate operations. In SRU, however, these operations do not exist.
• SRU has no explicit initialization handshake phase, but commences immediately
with searching, scanning and explain operations.
• Neither does SRU have a close
operation, since the protocol is stateless and each request is self-contained.
(It is true that multiple SRU request/response pairs may be implemented as
multiple HTTP request/response pairs over a single persistent TCP/IP
connection; but the closure of that connection is not a protocol-level
operation.)
• Retrieval in SRU is part of the
searchRetrieve operation, in which a search is submitted and the response
includes a subset of the records in the result set. There is no direct
analogue of Z39.50's Present operation which requests records from an
established result set. In SRU, this is achieved by sending a subsequent
searchRetrieve request with the query cql.resultSetId=id where
id is the identifier of the previously generated result-set.
• Sorting in SRU is done within the
searchRetrieve operation - in v1.1, by an explicit sort parameter, but the
forthcoming v1.2 or v2.0 will most likely use an extension of the query
language, CQL sorting[7].
It can be seen, then, that while Zebra operating as an SRU server does not
provide the same set of operations as when operating as a Z39.50 server, it
does provide equivalent functionality.
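As an illustration of the Present-style retrieval described above, the Python sketch below generates successive searchRetrieve requests that walk through an existing result set by querying on cql.resultSetId. The helper name, the result-set identifier rs1 and the hit count are all hypothetical; whether a given server still holds the result set when the follow-up request arrives is an assumption that must be checked against its responses.

```python
from urllib.parse import urlencode, quote


def result_set_page_urls(base, result_set_id, total, page_size):
    """Yield searchRetrieve URLs that page through an existing result set.

    Hypothetical helper: result_set_id would come from an earlier
    searchRetrieve response, and total from its reported hit count.
    """
    query = "cql.resultSetId=" + result_set_id
    for start in range(1, total + 1, page_size):
        params = [
            ("version", "1.1"),
            ("operation", "searchRetrieve"),
            ("query", query),
            ("startRecord", start),
            # The final page may hold fewer than page_size records.
            ("maximumRecords", min(page_size, total - start + 1)),
        ]
        yield base + "?" + urlencode(params, quote_via=quote)


for url in result_set_page_urls("http://localhost:9999/Default", "rs1", 5, 2):
    print(url)
```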
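SRU EXAMPLES
Surf into http://localhost:9999 to get an explain response. The requests below exercise the other operations.
Search using a CQL query:
    http://localhost:9999/?version=1.1&operation=searchRetrieve
        &query=text=(plant%20and%20soil)
Fetch two records, starting at record 5, in the dc record schema:
    http://localhost:9999/?version=1.1&operation=searchRetrieve
        &query=text=(plant%20and%20soil)
        &startRecord=5&maximumRecords=2&recordSchema=dc
Search using a PQF query through the non-standard x-pquery parameter:
    http://localhost:9999/?version=1.1&operation=searchRetrieve
        &x-pquery=@attr%201=text%20@and%20plant%20soil
Scan an index through the non-standard x-pScanClause parameter:
    http://localhost:9999/?version=1.1&operation=scan
        &x-pScanClause=@attr%201=text%20something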
YAZ SERVER VIRTUAL HOSTS
The Virtual hosts mechanism allows a YAZ frontend server to support multiple backends. A backend is selected on the basis of the TCP/IP binding (port+listening address) and/or the virtual host. A backend can be configured to execute in a particular working directory. The YAZ frontend may also perform CQL[8] to RPN conversion, thus allowing traditional Z39.50 backends to be offered as an SRU[2] service. SRU Explain information for a particular backend may also be specified.
For the HTTP protocol, the virtual host is specified in the Host header. For the Z39.50 protocol, the virtual host is specified in the Initialize Request in the OtherInfo, OID 1.2.840.10003.10.1000.81.1.
Content of a listen element:
CDATA (required)
The CDATA for the listen element holds the
listener string, such as tcp:@:210, tcp:server1:2100, etc.
attribute id (optional)
Identifier for this listener. This may be
referred to from server sections.
Content of a server element:
attribute id (optional)
Identifier for this server. Currently not used
for anything, but it might be for logging purposes.
attribute listenref (optional)
Specifies the listener for this server. If this
attribute is not given, the server is accessible from all listeners. In order
for the server to be used for real, however, the virtual host must match (if
specified in the configuration).
element config (optional)
Specifies the server configuration. This is
equivalent to the config specified using command line option -c.
element directory (optional)
Specifies a working directory for this backend
server. If specified, the YAZ frontend changes current working directory to
this directory whenever a backend of this type is started (backend handler
bend_start), stopped (backend handler bend_stop) and initialized
(bend_init).
element host (optional)
Specifies the virtual host for this server. If
this is specified, a client must specify this host string in order to
use this backend.
element cql2rpn (optional)
Specifies the name of a file that defines the
CQL[8] to RPN conversion for this backend server. See the CQL[8]
section in the YAZ manual. If given, the backend server will only "see"
a Type-1/RPN query.
element explain (optional)
Specifies SRU[2] ZeeRex content for
this server - copied verbatim to the client. As things are now, some of the
Explain content seems redundant because host information, etc. is also stored
elsewhere.
The format of the Explain record is described in detail, with examples, at the
ZeeRex[9] web site.
The XML below configures a server that accepts connections from two ports,
TCP/IP port 9900 and a local UNIX file socket. We name the TCP/IP listener
public and the other listener internal.
    <yazgfs>
      <listen id="public">tcp:@:9900</listen>
      <listen id="internal">unix:/var/tmp/socket</listen>
      <server id="server1">
        <host>server1.mydomain</host>
        <directory>/var/www/s1</directory>
        <config>config.cfg</config>
      </server>
      <server id="server2">
        <host>server2.mydomain</host>
        <directory>/var/www/s2</directory>
        <config>config.cfg</config>
        <cql2rpn>../etc/pqf.properties</cql2rpn>
        <explain xmlns="http://explain.z3950.org/dtd/2.0/">
          <serverInfo>
            <host>server2.mydomain</host>
            <port>9900</port>
            <database>a</database>
          </serverInfo>
        </explain>
      </server>
      <server id="server3" listenref="internal">
        <directory>/var/www/s3</directory>
        <config>config.cfg</config>
      </server>
    </yazgfs>
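The listenref rule (a server without listenref is reachable from every listener) can be checked against a configuration with a few lines of Python. This is a sketch for inspecting a config file, not part of YAZ. The function name servers_by_listener is hypothetical, the embedded configuration is a shortened copy of the example above, and listenref is assumed to name a single listener.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example configuration (explain section omitted).
CONFIG = """
<yazgfs>
  <listen id="public">tcp:@:9900</listen>
  <listen id="internal">unix:/var/tmp/socket</listen>
  <server id="server1"><host>server1.mydomain</host></server>
  <server id="server2"><host>server2.mydomain</host></server>
  <server id="server3" listenref="internal"/>
</yazgfs>
"""


def servers_by_listener(xml_text):
    """Map each server id to the listener strings it is reachable from."""
    root = ET.fromstring(xml_text)
    listeners = {el.get("id"): (el.text or "").strip()
                 for el in root.findall("listen")}
    reachable = {}
    for server in root.findall("server"):
        ref = server.get("listenref")
        # No listenref: the server is accessible from all listeners.
        ids = [ref] if ref is not None else list(listeners)
        reachable[server.get("id")] = [listeners[i] for i in ids]
    return reachable


for server_id, where in servers_by_listener(CONFIG).items():
    print(server_id, where)
```

Note that reachability through a listener is only half of the selection: as described above, a server that specifies a host element is used only when the client's virtual host matches.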
SEE ALSO
NOTES
1. Z39.50 Explain
2. SRU
3. SRU searchRetrieve
4. PQF
5. SRU scan
6. SRU explain
7. CQL sorting
8. CQL
9. ZeeRex
07/08/2011 | zebra 2.0.44 |