NAME¶
URI::Find::Delimited - Find URIs which may be wrapped in enclosing delimiters.
DESCRIPTION¶
Works like URI::Find, but is prepared for URIs in your text to be wrapped in a
pair of delimiters and optionally have a title. This will be useful for
processing text that already has some minimal markup in it, like bulletin
board posts or wiki text.
SYNOPSIS¶
my $finder = URI::Find::Delimited->new;
my $text = "This is a [http://the.earth.li/ titled link].";
$finder->find(\$text);
print $text;
METHODS¶
- new
-
my $finder = URI::Find::Delimited->new(
callback => \&callback,
delimiter_re => [ '\[', '\]' ],
ignore_quoted => 1 # defaults to 0
);
All arguments are optional; defaults are provided (see below).
Creates a new URI::Find::Delimited object. This object works similarly to a
URI::Find object, but as well as just looking for URIs it is also aware of
the concept of a wrapped, titled URI. These look something like
[http://foo.com/ the foo website]
where:
- * "[" is the opening delimiter
- * "]" is the closing delimiter
- * "http://foo.com/" is the URI
- * "the foo website" is the title
- * the URI and title are separated by spaces and/or
tabs
The URI::Find::Delimited object will extract each of these parts separately and
pass them to your callback.
- callback
- "callback" is a function which is called on each
URI found. It is passed five arguments: the opening delimiter (if found),
the closing delimiter (if found), the URI, the title (if found), and any
whitespace found between the URI and title.
The return value of the callback will replace the original URI in the text.
If you do not supply your own callback, the object will create a default one
which will put your URIs in 'a href' tags using the URI for the target and
the title for the link text. If no title is provided for a URI then the
URI itself will be used as the title. If the delimiters aren't balanced
(eg if the opening one is present but no closing one is found) then the
URI is treated as not being wrapped.
Note: the default callback will not remove the delimiters from the text. It
should be simple enough to write your own callback to remove them, based
on the one in the source, if that's what you want. In fact there's an
example in this distribution, in "t/delimited.t".
- delimiter_re
- The "delimiter_re" parameter is optional. If you
do supply it then it should be a ref to an array containing two regexes.
It defaults to using single square brackets as the delimiters.
Don't use capturing groupings "( )" in your delimiters or things
will break. Use non-capturing "(?: )" instead.
- ignore_quoted
- If the "ignore_quoted" parameter is supplied and
set to a true value, then any URIs immediately preceded with a
double-quote character will not be matched, ie your callback will not be
executed for them and they'll be treated just as normal text.
This is kinda lame but it's in here because I need to be able to ignore
things like
<img src="http://foo.com/bar.gif">
A better implementation may happen at some point.
SEE ALSO¶
URI::Find.
AUTHOR¶
Kake Pugh (kake@earth.li).
COPYRIGHT¶
Copyright (C) 2003 Kake Pugh. All Rights Reserved.
This module is free software; you can redistribute it and/or modify it under the
same terms as Perl itself.
CREDITS¶
Tim Bagot helped me stop faffing over the name, by pointing out that RFC 2396
Appendix E uses "delimited". Dave Hinton helped me fix the regex to
make it work for delimited URIs with no title. Nick Cleaton helped me make
"ignore_quoted" work. Some of the code was taken from
URI::Find.