.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.42)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. \*(C+ will
.\" give a nicer C++. Capital omega is used to do unbreakable dashes and
.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.\}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "MediaWiki::DumpFile::Compat 3pm"
.TH MediaWiki::DumpFile::Compat 3pm "2022-06-15" "perl v5.34.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
MediaWiki::DumpFile::Compat \- Compatibility with Parse::MediaWikiDump
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  use MediaWiki::DumpFile::Compat;
\&
\&  $pmwd = Parse::MediaWikiDump\->new;
\&
\&  $pages = $pmwd\->pages(\*(Aqpages\-articles.xml\*(Aq);
\&  $revisions = $pmwd\->revisions(\*(Aqpages\-articles.xml\*(Aq);
\&  $links = $pmwd\->links(\*(Aqlinks.sql\*(Aq);
.Ve
.SH "ABOUT"
.IX Header "ABOUT"
This software suite provides the tools needed to process the contents of the
\s-1XML\s0 page dump files and the \s-1SQL\s0 based links dump file from a
MediaWiki instance. This is a compatibility layer between MediaWiki::DumpFile
and Parse::MediaWikiDump; instead of \*(L"use Parse::MediaWikiDump;\*(R" you
\*(L"use MediaWiki::DumpFile::Compat;\*(R". The benefit of using the
compatibility module is increased processing speed; see the
MediaWiki::DumpFile::Benchmarks documentation for benchmark results.
.SH "MORE DOCUMENTATION"
.IX Header "MORE DOCUMENTATION"
The original Parse::MediaWikiDump documentation is also available in this
package; it has been updated to include new features introduced by
MediaWiki::DumpFile.
You can find the documentation in the following locations:
.IP "MediaWiki::DumpFile::Compat::Pages" 4
.IX Item "MediaWiki::DumpFile::Compat::Pages"
.PD 0
.IP "MediaWiki::DumpFile::Compat::Revisions" 4
.IX Item "MediaWiki::DumpFile::Compat::Revisions"
.IP "MediaWiki::DumpFile::Compat::page" 4
.IX Item "MediaWiki::DumpFile::Compat::page"
.IP "MediaWiki::DumpFile::Compat::Links" 4
.IX Item "MediaWiki::DumpFile::Compat::Links"
.IP "MediaWiki::DumpFile::Compat::link" 4
.IX Item "MediaWiki::DumpFile::Compat::link"
.PD
.SH "USAGE"
.IX Header "USAGE"
This module is a factory class that allows you to create instances of the
individual parser objects.
.ie n .IP "$pmwd\->pages" 4
.el .IP "\f(CW$pmwd\fR\->pages" 4
.IX Item "$pmwd->pages"
Returns a Parse::MediaWikiDump::Pages object capable of parsing an article
\s-1XML\s0 dump file with one revision per article.
.ie n .IP "$pmwd\->revisions" 4
.el .IP "\f(CW$pmwd\fR\->revisions" 4
.IX Item "$pmwd->revisions"
Returns a Parse::MediaWikiDump::Revisions object capable of parsing an article
\s-1XML\s0 dump file with multiple revisions per article.
.ie n .IP "$pmwd\->links" 4
.el .IP "\f(CW$pmwd\fR\->links" 4
.IX Item "$pmwd->links"
Returns a Parse::MediaWikiDump::Links object capable of parsing an article
links \s-1SQL\s0 dump file.
.SS "General"
.IX Subsection "General"
All parser creation invocations require the location of the source data to
parse; this argument can be either a filename or a reference to an already
open filehandle. This entire software suite will \fBdie()\fR upon errors in
the file or if internal inconsistencies have been detected. If this concerns
you, wrap the portion of your code that uses these calls in \fBeval()\fR.
.SH "COMPATIBILITY"
.IX Header "COMPATIBILITY"
Any deviation of the behavior of MediaWiki::DumpFile::Compat from
Parse::MediaWikiDump that is not listed below is a bug. Please report it so
that this package can act as a near-perfect stand-in for the original.
Compatibility is verified by using the existing Parse::MediaWikiDump test
suite with the following adjustments:
.SS "Parse::MediaWikiDump::Pages"
.IX Subsection "Parse::MediaWikiDump::Pages"
.IP "\(bu" 4
Parse::MediaWikiDump did not need to load all revisions of an article into
memory when processing dump files that contain more than one revision, but
this compatibility module does. The \s-1API\s0 does not change, but the
memory requirements for parsing those dump files do. It is, however, highly
unlikely that you will notice this, as most of the documents with many
revisions per article are so large that Parse::MediaWikiDump would not have
been able to parse them in any reasonable timeframe.
.IP "\(bu" 4
The results from \fBnamespaces()\fR are now sorted by namespace \s-1ID\s0
instead of being in document order.
.SS "Parse::MediaWikiDump::Links"
.IX Subsection "Parse::MediaWikiDump::Links"
.IP "\(bu" 4
Values from \fBnext()\fR are now returned in the same order as they appear in
the \s-1SQL\s0 file.
.SH "BUGS"
.IX Header "BUGS"
.IP "\(bu" 4
The value of \fBcurrent_byte()\fR wraps at around 2 gigabytes of input
\s-1XML\s0; see http://rt.cpan.org/Public/Bug/Display.html?id=56843
.SH "LIMITATIONS"
.IX Header "LIMITATIONS"
.IP "\(bu" 4
This compatibility layer is not yet well tested.
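.SH "EXAMPLES"
.IX Header "EXAMPLES"
The following is a minimal sketch, not a definitive recipe, of the factory
usage and \fBeval()\fR error handling described under \s-1USAGE\s0. It assumes
that a dump file named pages\-articles.xml exists in the current directory
(the filename is only an example) and that the \fBnext()\fR and \fBtitle()\fR
methods behave as described in MediaWiki::DumpFile::Compat::Pages and
MediaWiki::DumpFile::Compat::page.
.PP
.Vb 3
\&  use strict;
\&  use warnings;
\&  use MediaWiki::DumpFile::Compat;
\&
\&  # Example filename only; substitute the path to your own dump file.
\&  my $file = \*(Aqpages\-articles.xml\*(Aq;
\&  my $pmwd = Parse::MediaWikiDump\->new;
\&
\&  # The suite die()s on bad input, so trap errors with eval.
\&  my $ok = eval {
\&      my $pages = $pmwd\->pages($file);
\&
\&      # Print the title of each page until the dump is exhausted.
\&      while (defined(my $page = $pages\->next)) {
\&          print $page\->title, "\en";
\&      }
\&      1;
\&  };
\&  warn "Parsing $file failed: $@" unless $ok;
.Ve
.PP
Because errors are reported with \fBdie()\fR, the block passed to
\fBeval()\fR ends with a true value so that a failure can be told apart from
a normal return.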