Parser(3pm) User Contributed Perl Documentation Parser(3pm)

NAME

HTML::StripScripts::Parser - XSS filter using HTML::Parser

SYNOPSIS

  use HTML::StripScripts::Parser();
  my $hss = HTML::StripScripts::Parser->new(
       {
           Context => 'Document',       ## HTML::StripScripts configuration
           Rules   => { ... },
       },
       strict_comment => 1,             ## HTML::Parser options
       strict_names   => 1,
  );
  $hss->parse_file("foo.html");
  print $hss->filtered_document;
  # or, for HTML already held in a scalar:
  print $hss->filter_html($html);

DESCRIPTION

This class provides an easy interface to "HTML::StripScripts", using "HTML::Parser" to parse the HTML.

See HTML::Parser for details of how to customise how the raw HTML is parsed into tags, and HTML::StripScripts for details of how to customise the way those tags are filtered.

CONSTRUCTORS

"new ( {CONFIG}, [PARSER_OPTIONS] )"

Creates a new "HTML::StripScripts::Parser" object.

The CONFIG parameter has the same semantics as the CONFIG parameter to the "HTML::StripScripts" constructor.

Any PARSER_OPTIONS supplied will be passed on to the HTML::Parser init method, allowing you to influence the way the input is parsed.

You cannot use PARSER_OPTIONS to set the "HTML::Parser" event handlers (see "Events" in HTML::Parser) since "HTML::StripScripts::Parser" uses all of the event hooks itself. However, you can use "Rules" (see "Rules" in HTML::StripScripts) to customise the handling of all tags and attributes.
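As a minimal sketch of how the two argument groups fit together, the following passes a CONFIG hash reference first and an HTML::Parser option after it. The sample markup and the "img => 0" rule are illustrative assumptions, not part of this module's documentation; see "Rules" in HTML::StripScripts for the full rule syntax.

```perl
use strict;
use warnings;
use HTML::StripScripts::Parser ();

# First argument: HTML::StripScripts configuration (CONFIG).
# Remaining arguments: HTML::Parser options (PARSER_OPTIONS).
my $hss = HTML::StripScripts::Parser->new(
    {
        Context => 'Flow',    # treat the input as a body fragment
        Rules   => {
            img => 0,         # illustrative rule: reject every <img> tag
        },
    },
    strict_comment => 1,      # handed on to HTML::Parser
);

# Made-up input for the sketch.
my $html     = '<p>Hello <img src="x.png"> <script>alert(1)</script>world</p>';
my $filtered = $hss->filter_html($html);

print $filtered, "\n";
```

The exact text that replaces rejected markup depends on the HTML::StripScripts configuration, so inspect the printed output rather than relying on a fixed string.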

METHODS

See HTML::Parser for input methods and HTML::StripScripts for output methods.

"filter_html()"

"filter_html()" is a convenience method for filtering HTML already loaded into a scalar variable. It combines calls to "HTML::Parser::parse()", "HTML::Parser::eof()" and "HTML::StripScripts::filtered_document()".

    $filtered_html = $hss->filter_html($html);
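To illustrate the combination described above, this sketch filters the same string twice: once with "filter_html()" and once with the three separate calls it wraps. The input string is an assumption made for the example, and two separate filter objects are used so that neither run depends on the state left by the other.

```perl
use strict;
use warnings;
use HTML::StripScripts::Parser ();

# Made-up input for the sketch.
my $html = '<b>bold</b> <script>alert(1)</script>';

# One-step form.
my $hss_a    = HTML::StripScripts::Parser->new( { Context => 'Flow' } );
my $one_step = $hss_a->filter_html($html);

# Long form: feed the input, signal end of input, then collect the result.
my $hss_b = HTML::StripScripts::Parser->new( { Context => 'Flow' } );
$hss_b->parse($html);
$hss_b->eof;
my $long_form = $hss_b->filtered_document;

print $one_step eq $long_form ? "same output\n" : "outputs differ\n";
```

The long form is mainly useful when the input arrives in chunks, since "parse()" can be called repeatedly before "eof()".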

SUBCLASSING

The "HTML::StripScripts::Parser" class is subclassable. Filter objects are plain hashes. The "hss_init()" method takes the same arguments as "new()", and calls the initialization methods of both "HTML::StripScripts" and "HTML::Parser".

See "SUBCLASSING" in HTML::StripScripts and "SUBCLASSING" in HTML::Parser.
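One possible use of this is a subclass that supplies its own default configuration. The sketch below overrides "hss_init()" to merge a default Context into whatever CONFIG the caller passes, then defers to the parent implementation. The package name My::StrictFilter and the choice of default are hypothetical; only "hss_init()" and its argument order come from the text above.

```perl
package My::StrictFilter;    # hypothetical subclass name
use strict;
use warnings;
use parent 'HTML::StripScripts::Parser';

# Same arguments as new(): a CONFIG hashref, then HTML::Parser options.
sub hss_init {
    my ( $self, $cfg, @parser_options ) = @_;

    # Assumed default for this sketch; caller-supplied keys take precedence.
    my %merged = ( Context => 'Inline', %{ $cfg || {} } );

    return $self->SUPER::hss_init( \%merged, @parser_options );
}

package main;

my $filter = My::StrictFilter->new();
my $result = $filter->filter_html('<b>hi</b>');

print $result, "\n";
```

Because the caller's keys are merged last, My::StrictFilter->new({ Context => 'Document' }) would still override the subclass default.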

SEE ALSO

HTML::StripScripts, HTML::Parser, HTML::StripScripts::LibXML

BUGS

None reported.

Please report any bugs or feature requests to bug-html-stripscripts-parser@rt.cpan.org, or through the web interface at <http://rt.cpan.org>.

AUTHOR

Original author Nick Cleaton <nick@cleaton.net>

New code added and module maintained by Clinton Gormley <clint@traveljury.com>

COPYRIGHT

Copyright (C) 2003 Nick Cleaton. All Rights Reserved.

Copyright (C) 2007 Clinton Gormley. All Rights Reserved.

LICENSE

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

2021-01-03 perl v5.32.0