NAME
html2markdown - converts a page of HTML into markdown.
SYNOPSIS
html2markdown [options...] [(filename|url) [encoding]]
DESCRIPTION
html2markdown reads the specified HTML page and converts it to text
marked up with Markdown. The source HTML page may be a local file or a remote
URL. If no source is specified, the HTML is read from standard input. The
output is printed to standard output.
If an encoding is specified, it will override any encoding information
provided by the HTTP server. When not specified, python-feedparser (if
available) will be used to determine the source encoding. If it is not
available, or when reading local files, the encoding is assumed to be UTF-8.
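The encoding precedence described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not the program's actual source: the function names choose_encoding and decode_page, and the "detected" parameter (standing in for whatever python-feedparser reports), are hypothetical. It also assumes that an explicitly given encoding applies to local files as well, which the synopsis suggests but the text does not spell out.

```python
from typing import Optional


def choose_encoding(explicit: Optional[str] = None,
                    detected: Optional[str] = None,
                    is_local_file: bool = False) -> str:
    """Pick a source encoding following the documented precedence.

    explicit      -- encoding given on the command line, if any
    detected      -- encoding found by a detector such as python-feedparser,
                     or None when no detector is available (hypothetical input)
    is_local_file -- True when the source is a local file rather than a URL
    """
    # 1. An explicitly specified encoding overrides everything else.
    if explicit:
        return explicit
    # 2. For remote pages, use the detected encoding when there is one.
    if not is_local_file and detected:
        return detected
    # 3. Otherwise (no detector, or a local file) assume UTF-8.
    return "utf-8"


def decode_page(raw: bytes, **kwargs) -> str:
    """Decode raw page bytes with the encoding chosen above."""
    return raw.decode(choose_encoding(**kwargs))
```

In practice this means a Latin-1 page stored locally needs the encoding argument, since the UTF-8 default would otherwise be applied to it.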
OPTIONS
- --ignore-emphasis
- Don't include any formatting for emphasis.
- --ignore-links
- Don't include any formatting for links.
- --ignore-images
- Don't include any formatting for images.
- -g, --google-doc
- Convert an HTML-exported Google Document.
- -d, --dash-unordered-list
- Use a dash rather than a star for unordered list items.
- -e, --asterisk-emphasis
- Use an asterisk rather than an underscore for emphasized text.
- -b BODY_WIDTH, --body-width=BODY_WIDTH
- Number of characters per output line, 0 for no wrap.
- -i LIST_INDENT, --google-list-indent=LIST_INDENT
- Number of pixels Google indents nested lists.
- -s, --hide-strikethrough
- Hide strike-through text. Only relevant when -g is specified as well.
- --escape-all
- Escape all special characters. Output is less readable, but avoids corner
case formatting issues.
- --bypass-tables
- Format tables in HTML rather than Markdown syntax.
- --version
- Show program's version number and exit.
- -h, --help
- Show a help message and exit.
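As a rough illustration of how the options combine with the synopsis, the following Python sketch assembles an html2markdown command line and optionally pipes HTML to it on standard input. The helper names build_command and convert and their keyword parameters are hypothetical; only the flags themselves come from the list above. The sketch assumes html2markdown is on PATH and raises an error otherwise.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(source: Optional[str] = None,
                  encoding: Optional[str] = None,
                  ignore_links: bool = False,
                  dash_lists: bool = False,
                  body_width: Optional[int] = None) -> List[str]:
    """Build an argument list matching the synopsis:
    html2markdown [options...] [(filename|url) [encoding]]
    """
    cmd = ["html2markdown"]
    if ignore_links:
        cmd.append("--ignore-links")
    if dash_lists:
        cmd.append("--dash-unordered-list")
    if body_width is not None:
        # 0 disables line wrapping, per the option description.
        cmd.append(f"--body-width={body_width}")
    if source is not None:
        cmd.append(source)
        # The encoding is only meaningful after a filename or URL.
        if encoding is not None:
            cmd.append(encoding)
    return cmd


def convert(html: str, **flags) -> str:
    """Pipe HTML through html2markdown on standard input.

    Pass only flag keywords here (not source or encoding), so that the
    program reads from standard input.
    """
    if shutil.which("html2markdown") is None:
        raise RuntimeError("html2markdown is not installed")
    result = subprocess.run(build_command(**flags), input=html,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

For example, build_command("page.html", "latin-1", ignore_links=True, body_width=0) yields the equivalent of running html2markdown --ignore-links --body-width=0 page.html latin-1.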
AUTHOR
This manpage was written for Debian, by Stefano Rivera
<stefanor@debian.org>.