.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.42)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Email::Find 3pm"
.TH Email::Find 3pm "2022-06-13" "perl v5.34.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Email::Find \- Find RFC 822 email addresses in plain text
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  use Email::Find;
\&
\&  # new object\-oriented interface
\&  my $finder = Email::Find\->new(\e&callback);
\&  my $num_found = $finder\->find(\e$text);
\&
\&  # good old functional style
\&  $num_found = find_emails($text, \e&callback);
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Email::Find is a module for finding a \fIsubset\fR of \s-1RFC 822\s0 email
addresses in arbitrary text (see \*(L"\s-1CAVEATS\*(R"\s0).  The addresses it finds
are not guaranteed to exist or even actually be email addresses at all
(see \*(L"\s-1CAVEATS\*(R"\s0), but they will be valid \s-1RFC 822\s0 syntax.
.PP
Email::Find will perform some heuristics to avoid some of the more
obvious red herrings and false addresses, but there's only so much
which can be done without a human.
.SH "METHODS"
.IX Header "METHODS"
.IP "new" 4
.IX Item "new"
.Vb 1
\&  $finder = Email::Find\->new(\e&callback);
.Ve
.Sp
Constructs a new Email::Find object.  The specified callback is called
with each email address as it is found.
.IP "find" 4
.IX Item "find"
.Vb 1
\&  $num_emails_found = $finder\->find(\e$text);
.Ve
.Sp
Finds email addresses in the text and calls the registered callback
for each one.
.Sp
The callback is given two arguments.  The first is a Mail::Address
object representing the address found.  The second is the actual
original email as found in the text.  Whatever the callback returns
will replace the original text.
.SH "FUNCTIONS"
.IX Header "FUNCTIONS"
For backward compatibility, Email::Find exports one function,
\&\fBfind_emails()\fR.  It works much like URI::Find's \fBfind_uris()\fR.
.SH "EXAMPLES"
.IX Header "EXAMPLES"
.Vb 1
\&  use Email::Find;
\&
\&  # Simply print out all the addresses found leaving the text undisturbed.
\&  my $finder = Email::Find\->new(sub {
\&      my($email, $orig_email) = @_;
\&      print "Found ".$email\->format."\en";
\&      return $orig_email;
\&  });
\&  $finder\->find(\e$text);
\&
\&  # For each email found, ping its host to see whether it is alive.
\&  require Net::Ping;
\&  my $ping = Net::Ping\->new;
\&  my %Pinged = ();
\&  my $finder = Email::Find\->new(sub {
\&      my($email, $orig_email) = @_;
\&      my $host = $email\->host;
\&      # Ping each host only once; always hand back the original text.
\&      $Pinged{$host} = $ping\->ping($host) unless exists $Pinged{$host};
\&      return $orig_email;
\&  });
\&
\&  $finder\->find(\e$text);
\&
\&  while( my($host, $up) = each %Pinged ) {
\&      print "$host is ", ($up ? \*(Aqup\*(Aq : \*(Aqdown\*(Aq), "\en";
\&  }
\&
\&  # Count how many addresses are found.
\&  my $finder = Email::Find\->new(sub { $_[1] });
\&  print "Found ", $finder\->find(\e$text), " addresses\en";
\&
\&  # Wrap each address in an HTML mailto link.
\&  my $finder = Email::Find\->new(
\&      sub {
\&          my($email, $orig_email) = @_;
\&          my($address) = $email\->format;
\&          return qq|<a href="mailto:$address">$orig_email</a>|;
\&      },
\&  );
\&  $finder\->find(\e$text);
.Ve
.SH "SUBCLASSING"
.IX Header "SUBCLASSING"
If you want to change the way this module finds email addresses, you
can subclass Email::Find and override the
\&\f(CW\*(C`addr_regex\*(C'\fR and \f(CW\*(C`do_validate\*(C'\fR methods.
.PP
For example, the following class additionally finds email addresses
with a dot before the at sign.  This is illegal in \s-1RFC 822\s0; see
Email::Valid::Loose for details.
.PP
.Vb 3
\&  package Email::Find::Loose;
\&  use base qw(Email::Find);
\&  use Email::Valid::Loose;
\&
\&  # should return regex, which Email::Find will use in finding
\&  # strings which are "thought to be" email addresses
\&  sub addr_regex {
\&      return $Email::Valid::Loose::Addr_spec_re;
\&  }
\&
\&  # should validate whether $addr is a valid email or not.
\&  # if so, return the address as a string.
\&  # else, return undef
\&  sub do_validate {
\&      my($self, $addr) = @_;
\&      return Email::Valid::Loose\->address($addr);
\&  }
.Ve
.PP
Here is another example, which uses the Mail::CheckUser module to
check whether the address actually exists.
.PP
.Vb 3
\&  package Email::Find::Existent;
\&  use base qw(Email::Find);
\&  use Mail::CheckUser qw(check_email);
\&
\&  sub do_validate {
\&      my($self, $addr) = @_;
\&      return check_email($addr) ? $addr : undef;
\&  }
.Ve
.SH "CAVEATS"
.IX Header "CAVEATS"
.IP "Why a subset of \s-1RFC 822\s0?" 4
.IX Item "Why a subset of RFC 822?"
I say that this module finds a \fIsubset\fR of \s-1RFC 822\s0 because if I
attempted to look for \fIall\fR possible valid \s-1RFC 822\s0 addresses I'd wind
up practically matching the entire block of text!  The complete
specification is so wide open that it's difficult to construct
something that's \fInot\fR an \s-1RFC 822\s0 address.
.Sp
To keep myself sane, I look for the 'address spec' or 'global address'
part of an \s-1RFC 822\s0 address.  This is the part which most people
consider to be an email address (the 'foo@bar.com' part) and it is also
the part which contains the information necessary for delivery.
.IP "Why are some of the matches not email addresses?" 4
.IX Item "Why are some of the matches not email addresses?"
Alas, many things which aren't email addresses \fIlook\fR like email
addresses and parse just fine as them.  The biggest headache is email
and Usenet message IDs.  I do my best to avoid them, but there's only
so much cleverness you can pack into one library.
.SH "AUTHORS"
.IX Header "AUTHORS"
Copyright 2000, 2001 Michael G Schwern.  All rights reserved.
.PP
The current maintainer is Tatsuhiko Miyagawa.
.SH "THANKS"
.IX Header "THANKS"
Schwern thanks Jeremy Howard for his patch to make it work under 5.005.
.SH "LICENSE"
.IX Header "LICENSE"
This module is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.
.PP
The author \fB\s-1STRONGLY SUGGESTS\s0\fR that this module not be used for the
purposes of sending unsolicited email (i.e. spamming) in any way, shape
or form or for the purposes of generating lists for commercial sale.
.PP
If you use this module for spamming I reserve the right to make fun of
you.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Email::Valid, \s-1RFC 822,\s0 URI::Find, Apache::AntiSpam,
Email::Valid::Loose