NAME
News::Scan - gather and report Usenet newsgroup statistics
SYNOPSIS
use News::Scan;
my $scan = News::Scan->new;
DESCRIPTION
This module provides a class whose objects can be used to gather and report
Usenet newsgroup statistics.
CONSTRUCTOR
- new ( [ OPTIONS ] )
- "OPTIONS" is a list of named parameters (i.e.
given as key-value pairs). Valid options are:
- Group
- The value of this option is the name of the newsgroup you
wish to scan.
- From
- The value of this option should be either 'spool' or 'NNTP'
(case is not significant). Any other value will produce an error (see the
"error" method description below). A value of 'spool' indicates
that you would like to scan articles in a spool (see the Spool
option below). A value of 'NNTP' indicates that articles should be
retrieved from your NNTP server (see the NNTPServer option
below).
- Spool
- The value of this option should be the path to the spool
directory that contains the articles you would like to scan. This option
is ignored unless the value of From is 'spool'.
- NNTPServer
- The value of this option (in the form
server:port, with both being optional--see Net::NNTP for the
semantics of omitting one or both of these parameters) indicates the NNTP
server from which to retrieve articles. This option is ignored unless
From is 'NNTP'. See the description of the NNTPAuthLogin and
NNTPAuthPasswd options below.
- NNTPAuthLogin
- The value of this option should be a valid NNTP
authentication login for your NNTP server. This option is only necessary
if your NNTP server requires authentication.
- NNTPAuthPasswd
- The value of this option should be the password
corresponding to the login in NNTPAuthLogin. Hardcoding this password
in a script is bad practice, and a better mechanism is needed.
- Period
- The value of this option is the length (in days) of the
period, ending when the program is invoked, from which you
would like to scan articles. The default period is seven (7) days.
- QuoteRE
- The value of this option is a Perl regular expression that
accepts quoted lines and rejects unquoted or original lines. The default
regular expression is "^\s{0,3}(?:>|:|\S+>|\+\+)".
- Exclude
- The value of this option should be a reference to an array
containing regular expressions that accept email addresses of posters
whose articles you wish to ignore.
- Aliases
- The value of this option should be a reference to a hash
whose keys are email addresses that should be transformed into the email
addresses that are their corresponding values, i.e.
"alias => 'real@address'".
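The QuoteRE, Exclude, and Aliases options all come down to matching or mapping strings, so a small standalone sketch may help. It does not use News::Scan and is not the module's implementation. It assumes the default QuoteRE reads "^\s{0,3}(?:>|:|\S+>|\+\+)"; the Exclude patterns, the Aliases entry, and the helper names (classify_lines, is_excluded, canonical_address) are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Assumed default QuoteRE, taken from the option description above.
my $quote_re = qr/^\s{0,3}(?:>|:|\S+>|\+\+)/;

# Hypothetical Exclude and Aliases values, for illustration only.
my @exclude = ('^spammer@', '@bots\.example$');
my %aliases = ('old@example.com' => 'real@example.com');

# Split article body lines into quoted and original lines.
sub classify_lines {
    my ($re, @lines) = @_;
    my (@quoted, @original);
    for my $line (@lines) {
        if ($line =~ $re) { push @quoted,   $line }
        else              { push @original, $line }
    }
    return (\@quoted, \@original);
}

# True if any exclusion pattern accepts the address.
sub is_excluded {
    my ($addr) = @_;
    return scalar grep { $addr =~ /$_/ } @exclude;
}

# Replace an aliased address with its real address; pass others through.
sub canonical_address {
    my ($addr) = @_;
    return exists $aliases{$addr} ? $aliases{$addr} : $addr;
}

my @body = (
    '> a conventionally quoted line',
    ': colon-style quoting',
    'Greg> initials-style quoting',
    '++ plus-style quoting',
    '  > quoted after two leading spaces',
    'This sentence is original text.',
    '',
);
my ($quoted, $original) = classify_lines($quote_re, @body);
printf "%d quoted, %d original\n", scalar @$quoted, scalar @$original;
printf "%s -> %s\n", 'old@example.com', canonical_address('old@example.com');
print "spammer\@example.net is excluded\n" if is_excluded('spammer@example.net');
```

Whether the module resolves aliases before or after applying exclusions is not stated here, so the two helpers are kept separate rather than chained.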
METHODS
- configure ( [ OPTIONS ] )
- "OPTIONS" is a list of named parameters identical
to those accepted by "new". Re-"configure"-ing an
object after scanning is probably a bad idea. This method returns
"undef" if it encounters an error.
The following methods are the actual underlying methods used to set and retrieve
the configuration options of the same name (modulo case):
- name ( [ NEWSGROUP-NAME ] )
- spool ( [ SPOOL-DIRECTORY ] )
- period ( [ INTERVAL-LENGTH ] )
- aliases ( [ ALIASES-HASHREF ] )
- from ( 'NNTP' | 'spool' )
- quote_re ( [ QUOTE-REGEX-ARRAYREF ] )
- exclude ( [ EXCLUSION-REGEX-ARRAYREF ] )
- nntp_server ( [ [ NNTP-SERVER ]:[ NNTP-PORT ] ] )
- nntp_auth_login ( [ LOGIN ] )
- nntp_auth_passwd ( [ PASSWORD ] )
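As a rough sketch of how construction, "configure", and the accessors might fit together: the option names below come from the CONSTRUCTOR section, while every value (group name, spool path, addresses) is a placeholder. It assumes that "new" returns an object even when an option is rejected, so that "error" can be consulted afterwards, and that accessors called without an argument return the current value. Neither assumption is confirmed by this page. The module is loaded at run time so the script still runs where News::Scan is not installed.

```perl
use strict;
use warnings;

# Option names come from the CONSTRUCTOR section; all values are placeholders.
my %options = (
    Group   => 'comp.lang.perl.misc',
    From    => 'spool',
    Spool   => '/var/spool/news/comp/lang/perl/misc',
    Period  => 14,
    Exclude => [ '^spammer@' ],
    Aliases => { 'old@example.com' => 'real@example.com' },
);

# Load the module at run time so the sketch still runs where it is absent.
if (eval { require News::Scan; 1 }) {
    # Assumption: new returns an object whose error method can be checked.
    my $scan = News::Scan->new(%options);
    die 'new failed: ' . $scan->error . "\n" if $scan->error;

    # configure accepts the same named parameters as new.
    $scan->configure(Period => 7);
    die 'configure failed: ' . $scan->error . "\n" if $scan->error;

    # Assumption: accessors set a value when given an argument and
    # return the current value otherwise.
    $scan->period(30);
    printf "Scanning %s over the last %d days\n", $scan->name, $scan->period;
}
else {
    print "News::Scan is not installed; skipping.\n";
}
```

Checking "error" after each call relies only on the documented guarantee that it is 0 after a successful method call, which avoids guessing what "configure" returns on success.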
These methods can be used to retrieve information from the
"News::Scan" object or ask it to perform some action.
- error ( [ MESSAGE ] )
- Use this method to determine whether an object has
encountered an error condition. The return value of "error" is
guaranteed to be 0 after any method completes successfully (except
"error"). (Keep in mind that this will also overwrite any
previous error message.) If there has been an error, this method should
return some useful message.
If provided, "MESSAGE" sets the object's error message.
- articles
- Returns the number of articles accounted for.
- volume
- Returns the volume of traffic (in bytes) to the newsgroup
in the period.
- header_volume
- Returns the volume (in bytes) generated by headers.
- header_lines
- Returns the number of lines consumed by headers.
- body_volume
- Returns the volume (in bytes) generated by message
bodies.
- body_lines
- Returns the number of lines consumed by message
bodies.
- orig_volume
- Returns the volume (in bytes) of text which has been
determined to be original (see QuoteRE). Note that original traffic
is a subset of body traffic.
- orig_lines
- Returns the number of lines that are determined to be
original.
- signatures
- Returns the number of messages that had a cutline (/^-- $/).
- sig_volume
- Returns the volume (in bytes) generated by signatures.
- sig_lines
- Returns the number of lines consumed by signatures.
- earliest ( [ TIME ] )
- Use this method to determine the date (in seconds since the
Epoch) that the oldest article found within the period was posted to
Usenet.
If "TIME" is given, it is treated as a candidate for the earliest
article. If "TIME" is successful (i.e. is less than the previous
earliest), this method returns 1, else 0.
- latest ( [ TIME ] )
- Use this method to determine the date (in seconds since the
Epoch) that the youngest article found within the period was posted to
Usenet.
If "TIME" is given, it is treated as a candidate for the latest
article. If "TIME" is successful (i.e. is greater than the
previous latest), this method returns 1, else 0.
- excludes
- Returns the list of regular expressions used to determine
whether an article from a given email address should be ignored.
- posters
- Returns a reference to a hash whose keys are email
addresses and whose values are "News::Scan::Poster" objects
corresponding to those email addresses. See News::Scan::Poster.
- threads
- Returns a reference to a hash whose keys are subjects and
whose values are "News::Scan::Thread" objects corresponding to
those subjects. See News::Scan::Thread.
- crossposts
- Returns a reference to a hash whose keys are newsgroup
names and whose values are the number of times the corresponding groups
have been crossposted to.
- collect
- Use this method to mirror the articles from the specified
NNTP server to the specified spool. Please be kind to the NNTP
server.
- scan
- Instruct the object to gather information about the
newsgroup.
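The reporting methods above can be combined into a short summary. The following is a minimal sketch, not code from the distribution: "summarize" is a hypothetical helper that accepts any object offering the methods it calls (articles, volume, body_volume, orig_volume, earliest, latest, crossposts), and the constructor values are placeholders. It assumes that "earliest" and "latest" return a false value when no article was found.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a short text report from any object that offers the reporting
# methods described above.
sub summarize {
    my ($scan) = @_;
    my $body = $scan->body_volume;
    my $orig = $scan->orig_volume;
    # Original text is a subset of body text, so this is at most 100%.
    my $pct = $body ? 100 * $orig / $body : 0;

    my @lines = (
        sprintf('articles: %d', $scan->articles),
        sprintf('volume: %d bytes', $scan->volume),
        sprintf('original: %.1f%% of body text', $pct),
    );

    # earliest and latest return seconds since the Epoch.
    for my $which (qw(earliest latest)) {
        my $time = $scan->$which;
        push @lines, "$which: " . strftime('%Y-%m-%d', gmtime $time) if $time;
    }

    # List crossposted groups, most frequent first, ties broken by name.
    my $xposts = $scan->crossposts;
    for my $group (sort { $xposts->{$b} <=> $xposts->{$a} || $a cmp $b }
                   keys %$xposts) {
        push @lines, "crossposted to $group: $xposts->{$group}";
    }
    return join "\n", @lines;
}

# Run against the real module only where it is installed.
if (eval { require News::Scan; 1 }) {
    my $scan = News::Scan->new(
        Group => 'comp.lang.perl.misc',                  # placeholder
        From  => 'spool',
        Spool => '/var/spool/news/comp/lang/perl/misc',  # placeholder
    );
    $scan->scan unless $scan->error;
    print $scan->error ? 'error: ' . $scan->error . "\n"
                       : summarize($scan) . "\n";
}
else {
    print "News::Scan is not installed; skipping.\n";
}
```

Keeping the formatting in a separate subroutine means it can be exercised with a stand-in object, without a spool or an NNTP server.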
EXAMPLES
See the eg/ directory in the News-Scan distribution, available from the CPAN: http://www.perl.com/CPAN/.
SEE ALSO
perlre, News::Scan::Poster, News::Scan::Thread, News::Scan::Article, Net::NNTP
AUTHOR
Greg Bacon <gbacon@cs.uah.edu>
COPYRIGHT
Copyright (c) 1997 Greg Bacon. All Rights Reserved. This library is free
software. You may distribute and/or modify it under the same terms as Perl
itself.