HXWLS(1) | HTML-XML-utils | HXWLS(1)
NAME
hxwls - list links in an HTML file
SYNOPSIS
hxwls [ -l ] [ -t ] [ -r ] [ -h ] [ -a ] [ -b base ] [ file ]
DESCRIPTION
The hxwls command reads an HTML file (standard input by default) and
prints all links it finds to standard output.
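To give a feel for what "listing links" involves, here is a minimal Python sketch. It is not how hxwls is implemented; it assumes, for illustration only, that a link is any element carrying an href or src attribute, and the names LinkLister and list_links are hypothetical.

```python
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect (element, rel, target) for each tag that carries a link."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: only href and src attributes are treated as links here.
        target = attrs.get("href") or attrs.get("src")
        if target:
            self.links.append((tag, attrs.get("rel") or "", target))


def list_links(html):
    """Return the links found in an HTML string, in document order."""
    parser = LinkLister()
    parser.feed(html)
    parser.close()
    return parser.links
```

Each tuple holds the element name, the value of the REL attribute (empty if absent), and the target as written in the document.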
OPTIONS
The following options are supported:
- -l
- Produce a long listing. Instead of just the URI, hxwls prints three
columns: the element name, the value of the REL attribute, and the target
URI.
- -t
- Produce a tuple listing. hxwls prints four columns: the URI of the
document itself, the element name, the value of the REL attribute, and the
target URI.
- -r
- Print relative URLs as they are, without converting them to absolute
URLs.
- -b base
- Use base as the initial base URL. If there is a <base>
element in the document, it will override the -b option.
- -h
- Output as HTML. The output will be listed in the form of <a>
elements.
- -a
- Convert any IRIs (Internationalized Resource Identifiers) to ASCII-only
URIs. This causes any non-ASCII characters in the path of a URI to be
encoded as %-escaped octets and non-ASCII characters in the domain name as
punycode. (Punycode encoding is only available if hxwls is compiled
with libidn support.)
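The effect of -r, -b and -a can be approximated with Python's standard library. This is only a sketch of the behavior described above, not hxwls's own code: the functions resolve and to_ascii_uri are hypothetical, and Python's built-in "idna" codec stands in for libidn.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit


def resolve(target, base=None, doc_base=None, keep_relative=False):
    """Resolve a link the way -r and -b are described.

    base is the value given with -b; doc_base is the href of a <base>
    element found in the document, which overrides base.
    """
    if keep_relative:  # corresponds to -r
        return target
    effective = doc_base or base
    return urljoin(effective, target) if effective else target


def to_ascii_uri(iri):
    """Convert an IRI to an ASCII-only URI, as described for -a.

    Simplification: user info in the authority part is dropped, and
    only the path and the domain name are converted.
    """
    parts = urlsplit(iri)
    host = parts.hostname or ""
    if host:
        # Non-ASCII domain labels become punycode ("xn--..." form).
        host = host.encode("idna").decode("ascii")
    netloc = f"{host}:{parts.port}" if parts.port else host
    # Non-ASCII path characters become %-escaped UTF-8 octets;
    # ASCII characters are left untouched.
    path = "".join(c if ord(c) < 128 else quote(c) for c in parts.path)
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))
```

For example, resolve("b.html", base="http://example.org/dir/a.html") gives "http://example.org/dir/b.html", while the same call with keep_relative=True returns "b.html" unchanged.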
OPERANDS
The following operand is supported:
- file
- The name or the URL of an HTML file. If absent, standard input is read
instead.
DIAGNOSTICS
The following exit values are returned:
- 0
- Successful completion.
- > 0
- An error occurred in the parsing of the HTML file. hxwls will try
to correct the error and produce output anyway.
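Because output may still be produced when the exit value is greater than 0, a calling script may want to keep the output and record the error rather than discard everything. A minimal Python sketch of that pattern follows; the function name run_and_keep_output is hypothetical, and the command to run is passed in by the caller.

```python
import subprocess
import sys


def run_and_keep_output(cmd, html):
    """Run cmd with html on standard input.

    Returns (lines, had_error): the lines written to standard output,
    and whether the exit value was non-zero. The output is kept in
    both cases.
    """
    proc = subprocess.run(cmd, input=html, capture_output=True, text=True)
    return proc.stdout.splitlines(), proc.returncode != 0
```

Assuming hxwls is installed and on the PATH, a call such as run_and_keep_output(["hxwls", "-l"], html) would return the long listing together with a flag telling whether a parse error was reported.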