NAME¶
http - Client-side implementation of the HTTP/1.1 protocol
SYNOPSIS¶
package require http ?2.7?
::http::config ?options?
::http::geturl url ?options?
::http::formatQuery key value ?
key value ...?
::http::reset token ?
why?
::http::wait token
::http::status token
::http::size token
::http::code token
::http::ncode token
::http::meta token
::http::data token
::http::error token
::http::cleanup token
::http::register proto port command
::http::unregister proto
DESCRIPTION¶
The
http package provides the client side of the HTTP/1.1 protocol, as
defined in RFC 2616. The package implements the GET, POST, and HEAD operations
of HTTP/1.1. It allows configuration of a proxy host to get through firewalls.
The package is compatible with the
Safesock security policy, so it can
be used by untrusted applets to do URL fetching from a restricted set of
hosts. This package can be extended to support additional HTTP transport
protocols, such as HTTPS, by providing a custom
socket command, via
::http::register.
The
::http::geturl procedure does a HTTP transaction. Its
options
determine whether a GET, POST, or HEAD transaction is performed. The return
value of
::http::geturl is a token for the transaction. The value is
also the name of an array in the ::http namespace that contains state
information about the transaction. The elements of this array are described in
the
STATE ARRAY section.
If the
-command option is specified, then the HTTP operation is done in
the background.
::http::geturl returns immediately after generating the
HTTP request and the callback is invoked when the transaction completes. For
this to work, the Tcl event loop must be active. In Tk applications this is
always true. For pure-Tcl applications, the caller can use
::http::wait
after calling
::http::geturl to start the event loop.
COMMANDS¶
- ::http::config ?options?
- The ::http::config command is used to set and query
the name of the proxy server and port, and the User-Agent name used in the
HTTP requests. If no options are specified, then the current configuration
is returned. If a single argument is specified, then it should be one of
the flags described below. In this case the current value of that setting
is returned. Otherwise, the options should be a set of flags and values
that define the configuration:
- -accept mimetypes
- The Accept header of the request. The default is */*, which
means that all types of documents are accepted. Otherwise you can supply a
comma-separated list of mime type patterns that you are willing to
receive. For example, “image/gif, image/jpeg, text/*”.
- -proxyhost hostname
- The name of the proxy host, if any. If this value is the
empty string, the URL host is contacted directly.
- -proxyport number
- The proxy port number.
- -proxyfilter command
- The command is a callback that is made during
::http::geturl to determine if a proxy is required for a given
host. One argument, a host name, is added to command when it is
invoked. If a proxy is required, the callback should return a two-element
list containing the proxy server and proxy port. Otherwise the filter
should return an empty list. The default filter returns the values of the
-proxyhost and -proxyport settings if they are
non-empty.
- -urlencoding encoding
- The encoding used for creating the x-url-encoded
URLs with ::http::formatQuery. The default is utf-8, as
specified by RFC 2718. Prior to http 2.5 this was unspecified, and that
behavior can be returned by specifying the empty string ( {}),
although iso8859-1 is recommended to restore similar behavior but
without the ::http::formatQuery throwing an error processing
non-latin-1 characters.
- -useragent string
- The value of the User-Agent header in the HTTP request. The
default is “ Tcl http client package 2.7”.
- ::http::geturl url ?options?
- The ::http::geturl command is the main procedure in
the package. The -query option causes a POST operation and the
-validate option causes a HEAD operation; otherwise, a GET
operation is performed. The ::http::geturl command returns a
token value that can be used to get information about the
transaction. See the STATE ARRAY and ERRORS section for
details. The ::http::geturl command blocks until the operation
completes, unless the -command option specifies a callback that is
invoked when the HTTP transaction completes. ::http::geturl takes
several options:
- -binary boolean
- Specifies whether to force interpreting the URL data as
binary. Normally this is auto-detected (anything not beginning with a
text content type or whose content encoding is gzip or
compress is considered binary data).
- -blocksize size
- The block size used when reading the URL. At most
size bytes are read at once. After each block, a call to the
-progress callback is made (if that option is specified).
- -channel name
- Copy the URL contents to channel name instead of
saving it in state(body).
- -command callback
- Invoke callback after the HTTP transaction
completes. This option causes ::http::geturl to return immediately.
The callback gets an additional argument that is the token
returned from ::http::geturl. This token is the name of an array
that is described in the STATE ARRAY section. Here is a template
for the callback:
proc httpCallback {token} {
upvar #0 $token state
# Access state as a Tcl array
}
- -handler callback
- Invoke callback whenever HTTP data is available; if
present, nothing else will be done with the HTTP data. This procedure gets
two additional arguments: the socket for the HTTP data and the
token returned from ::http::geturl. The token is the name of
a global array that is described in the STATE ARRAY section. The
procedure is expected to return the number of bytes read from the socket.
Here is a template for the callback:
proc httpHandlerCallback {socket token} {
upvar #0 $token state
# Access socket, and state as a Tcl array
# For example...
...
set data [read $socket 1000]
set nbytes [string length $data]
...
return $nbytes
}
- -headers keyvaluelist
- This option is used to add extra headers to the HTTP
request. The keyvaluelist argument must be a list with an even
number of elements that alternate between keys and values. The keys become
header field names. Newlines are stripped from the values so the header
cannot be corrupted. For example, if keyvaluelist is Pragma
no-cache then the following header is included in the HTTP
request:
- -keepalive boolean
- If true, attempt to keep the connection open for servicing
multiple requests. Default is 0.
- -method type
- Force the HTTP request method to type.
::http::geturl will auto-select GET, POST or HEAD based on other
options, but this option enables choices like PUT and DELETE for webdav
support.
- -myaddr address
- Pass an specific local address to the underlying
socket call in case multiple interfaces are available.
- -progress callback
- The callback is made after each transfer of data
from the URL. The callback gets three additional arguments: the
token from ::http::geturl, the expected total size of the
contents from the Content-Length meta-data, and the current number
of bytes transferred so far. The expected total size may be unknown, in
which case zero is passed to the callback. Here is a template for the
progress callback:
proc httpProgress {token total current} {
upvar #0 $token state
}
- -protocol version
- Select the HTTP protocol version to use. This should be 1.0
or 1.1 (the default). Should only be necessary for servers that do not
understand or otherwise complain about HTTP/1.1.
- -query query
- This flag causes ::http::geturl to do a POST request
that passes the query to the server. The query must be an
x-url-encoding formatted query. The ::http::formatQuery procedure
can be used to do the formatting.
- -queryblocksize size
- The block size used when posting query data to the URL. At
most size bytes are written at once. After each block, a call to
the -queryprogress callback is made (if that option is
specified).
- -querychannel channelID
- This flag causes ::http::geturl to do a POST request
that passes the data contained in channelID to the server. The data
contained in channelID must be an x-url-encoding formatted query
unless the -type option below is used. If a Content-Length header
is not specified via the -headers options, ::http::geturl
attempts to determine the size of the post data in order to create that
header. If it is unable to determine the size, it returns an error.
- -queryprogress callback
- The callback is made after each transfer of data to
the URL (i.e. POST) and acts exactly like the -progress option (the
callback format is the same).
- -strict boolean
- Whether to enforce RFC 3986 URL validation on the request.
Default is 1.
- -timeout milliseconds
- If milliseconds is non-zero, then
::http::geturl sets up a timeout to occur after the specified
number of milliseconds. A timeout results in a call to
::http::reset and to the -command callback, if specified.
The return value of ::http::status is timeout after a
timeout has occurred.
- -type mime-type
- Use mime-type as the Content-Type value,
instead of the default value ( application/x-www-form-urlencoded)
during a POST operation.
- -validate boolean
- If boolean is non-zero, then ::http::geturl
does an HTTP HEAD request. This request returns meta information about the
URL, but the contents are not returned. The meta information is available
in the state(meta) variable after the transaction. See the
STATE ARRAY section for details.
- ::http::formatQuery key value ?key
value ...?
- This procedure does x-url-encoding of query data. It takes
an even number of arguments that are the keys and values of the query. It
encodes the keys and values, and generates one string that has the proper
& and = separators. The result is suitable for the -query value
passed to ::http::geturl.
- ::http::reset token ?why?
- This command resets the HTTP transaction identified by
token, if any. This sets the state(status) value to
why, which defaults to reset, and then calls the registered
-command callback.
- ::http::wait token
- This is a convenience procedure that blocks and waits for
the transaction to complete. This only works in trusted code because it
uses vwait. Also, it is not useful for the case where
::http::geturl is called without the -command option
because in this case the ::http::geturl call does not return until
the HTTP transaction is complete, and thus there is nothing to wait
for.
- ::http::data token
- This is a convenience procedure that returns the
body element (i.e., the URL data) of the state array.
- ::http::error token
- This is a convenience procedure that returns the
error element of the state array.
- ::http::status token
- This is a convenience procedure that returns the
status element of the state array.
- ::http::code token
- This is a convenience procedure that returns the
http element of the state array.
- ::http::ncode token
- This is a convenience procedure that returns just the
numeric return code (200, 404, etc.) from the http element of the
state array.
- ::http::size token
- This is a convenience procedure that returns the
currentsize element of the state array, which represents the number
of bytes received from the URL in the ::http::geturl call.
- ::http::meta token
- This is a convenience procedure that returns the
meta element of the state array which contains the HTTP response
headers. See below for an explanation of this element.
- ::http::cleanup token
- This procedure cleans up the state associated with the
connection identified by token. After this call, the procedures
like ::http::data cannot be used to get information about the
operation. It is strongly recommended that you call this function
after you are done with a given HTTP request. Not doing so will result in
memory not being freed, and if your app calls ::http::geturl enough
times, the memory leak could cause a performance hit...or worse.
- ::http::register proto port command
- This procedure allows one to provide custom HTTP transport
types such as HTTPS, by registering a prefix, the default port, and the
command to execute to create the Tcl channel. E.g.:
package require http
package require tls
::http::register https 443 ::tls::socket
set token [::http::geturl https://my.secure.site/]
- ::http::unregister proto
- This procedure unregisters a protocol handler that was
previously registered via ::http::register.
ERRORS¶
The
::http::geturl procedure will raise errors in the following cases:
invalid command line options, an invalid URL, a URL on a non-existent host, or
a URL at a bad port on an existing host. These errors mean that it cannot even
start the network transaction. It will also raise an error if it gets an I/O
error while writing out the HTTP request header. For synchronous
::http::geturl calls (where
-command is not specified), it will
raise an error if it gets an I/O error while reading the HTTP reply headers or
data. Because
::http::geturl does not return a token in these cases, it
does all the required cleanup and there is no issue of your app having to call
::http::cleanup.
For asynchronous
::http::geturl calls, all of the above error situations
apply, except that if there is any error while reading the HTTP reply headers
or data, no exception is thrown. This is because after writing the HTTP
headers,
::http::geturl returns, and the rest of the HTTP transaction
occurs in the background. The command callback can check if any error occurred
during the read by calling
::http::status to check the status and if
its
error, calling
::http::error to get the error message.
Alternatively, if the main program flow reaches a point where it needs to know
the result of the asynchronous HTTP request, it can call
::http::wait
and then check status and error, just as the callback does.
In any case, you must still call
::http::cleanup to delete the state
array when you are done.
There are other possible results of the HTTP transaction determined by examining
the status from
::http::status. These are described below.
- ok
- If the HTTP transaction completes entirely, then status
will be ok. However, you should still check the ::http::code
value to get the HTTP status. The ::http::ncode procedure provides
just the numeric error (e.g., 200, 404 or 500) while the
::http::code procedure returns a value like “HTTP 404 File
not found”.
- eof
- If the server closes the socket without replying, then no
error is raised, but the status of the transaction will be
eof.
- error
- The error message will also be stored in the error
status array element, accessible via ::http::error.
Another error possibility is that
::http::geturl is unable to write all
the post query data to the server before the server responds and closes the
socket. The error message is saved in the
posterror status array
element and then
::http::geturl attempts to complete the transaction.
If it can read the server's response it will end up with an
ok status,
otherwise it will have an
eof status.
STATE ARRAY¶
The
::http::geturl procedure returns a
token that can be used to
get to the state of the HTTP transaction in the form of a Tcl array. Use this
construct to create an easy-to-use array variable:
Once the data associated with the URL is no longer needed, the state array
should be unset to free up storage. The
::http::cleanup procedure is
provided for that purpose. The following elements of the array are supported:
- body
- The contents of the URL. This will be empty if the
-channel option has been specified. This value is returned by the
::http::data command.
- charset
- The value of the charset attribute from the
Content-Type meta-data value. If none was specified, this defaults
to the RFC standard iso8859-1, or the value of
$::http::defaultCharset. Incoming text data will be automatically
converted from this charset to utf-8.
- coding
- A copy of the Content-Encoding meta-data value.
- currentsize
- The current number of bytes fetched from the URL. This
value is returned by the ::http::size command.
- error
- If defined, this is the error string seen when the HTTP
transaction was aborted.
- http
- The HTTP status reply from the server. This value is
returned by the ::http::code command. The format of this value
is:
The
code is a three-digit number defined in the HTTP standard. A code of
200 is OK. Codes beginning with 4 or 5 indicate errors. Codes beginning with 3
are redirection errors. In this case the
Location meta-data specifies a
new URL that contains the requested information.
- meta
- The HTTP protocol returns meta-data that describes the URL
contents. The meta element of the state array is a list of the keys
and values of the meta-data. This is in a format useful for initializing
an array that just contains the meta-data:
array set meta $state(meta)
Some of the meta-data keys are listed below, but the HTTP standard defines more,
and servers are free to add their own.
- Content-Type
- The type of the URL contents. Examples include
text/html, image/gif, application/postscript and
application/x-tcl.
- Content-Length
- The advertised size of the contents. The actual size
obtained by ::http::geturl is available as state(size).
- Location
- An alternate URL that contains the requested data.
- posterror
- The error, if any, that occurred while writing the post
query data to the server.
- status
- Either ok, for successful completion, reset
for user-reset, timeout if a timeout occurred before the
transaction could complete, or error for an error condition. During
the transaction this value is the empty string.
- totalsize
- A copy of the Content-Length meta-data value.
- type
- A copy of the Content-Type meta-data value.
- url
- The requested URL.
EXAMPLE¶
# Copy a URL to a file and print meta-data
proc httpcopy { url file {chunk 4096} } {
set out [open $file w]
set token [ ::http::geturl $url -channel $out \
-progress httpCopyProgress -blocksize $chunk]
close $out
# This ends the line started by httpCopyProgress
puts stderr ""
upvar #0 $token state
set max 0
foreach {name value} $state(meta) {
if {[string length $name] > $max} {
set max [string length $name]
}
if {[regexp -nocase ^location$ $name]} {
# Handle URL redirects
puts stderr "Location:$value"
return [httpcopy [string trim $value] $file $chunk]
}
}
incr max
foreach {name value} $state(meta) {
puts [format "%-*s %s" $max $name: $value]
}
return $token
}
proc httpCopyProgress {args} {
puts -nonewline stderr .
flush stderr
}
SEE ALSO¶
safe(3tcl), socket(3tcl), safesock(3tcl)
KEYWORDS¶
security policy, socket