Text::Levenshtein::Damerau(3pm) | User Contributed Perl Documentation | Text::Levenshtein::Damerau(3pm) |
NAME¶
Text::Levenshtein::Damerau - Damerau Levenshtein edit distance.
SYNOPSIS¶
    use Text::Levenshtein::Damerau;
    use warnings;
    use strict;

    my @targets = ('fuor','xr','fourrrr','fo');

    # Initialize Text::Levenshtein::Damerau object with text to compare against
    my $tld = Text::Levenshtein::Damerau->new('four');

    print $tld->dld($targets[0]);
    # prints 1

    # Keep the object in $tld; store the returned hashref separately
    my $distances = $tld->dld({ list => \@targets });
    print $distances->{'fuor'};
    # prints 1

    print $tld->dld_best_match({ list => \@targets });
    # prints fuor

    print $tld->dld_best_distance({ list => \@targets });
    # prints 1

    # or even more simply
    use Text::Levenshtein::Damerau qw/edistance/;
    use warnings;
    use strict;

    print edistance('Neil','Niel');
    # prints 1
DESCRIPTION¶
Returns the true Damerau Levenshtein edit distance of strings with adjacent transpositions. Useful for fuzzy matching, DNA variation metrics, and fraud detection.
Defaults to the pure-Perl Text::Levenshtein::Damerau::PP, but has an XS addon, Text::Levenshtein::Damerau::XS, for massive speed improvements. Works correctly with utf8 if the backend supports it; known to work with "Text::Levenshtein::Damerau::PP" and "Text::Levenshtein::Damerau::XS".
    use utf8;
    my $tld = Text::Levenshtein::Damerau->new('ⓕⓞⓤⓡ');
    print $tld->dld('ⓕⓤⓞⓡ');
    # prints 1
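To make "edit distance with adjacent transpositions" concrete, here is a minimal pure-Perl sketch of the unrestricted Damerau-Levenshtein recurrence. It is not the code used by Text::Levenshtein::Damerau::PP or Text::Levenshtein::Damerau::XS; the function name dl_distance and all variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Illustrative sketch only: unrestricted Damerau-Levenshtein distance.
# dl_distance is a hypothetical name, not part of the module's API.
sub dl_distance {
    my ($source, $target) = @_;
    my @s = split //, $source;
    my @t = split //, $target;
    my ($m, $n) = (scalar @s, scalar @t);
    my $inf = $m + $n;    # larger than any possible distance

    my %last_row;         # last 1-based source position where each character occurred
    my @d;                # $d[$i+1][$j+1] = distance between the first $i and first $j characters

    # Border of "infinite" cells plus the usual first row and column.
    $d[0][0] = $inf;
    for my $i (0 .. $m) { $d[$i + 1][0] = $inf; $d[$i + 1][1] = $i; }
    for my $j (0 .. $n) { $d[0][$j + 1] = $inf; $d[1][$j + 1] = $j; }

    for my $i (1 .. $m) {
        my $last_col = 0; # last column in this row where the characters matched
        for my $j (1 .. $n) {
            my $k = $last_row{ $t[$j - 1] } // 0;
            my $l = $last_col;
            my $cost = 1;
            if ($s[$i - 1] eq $t[$j - 1]) {
                $cost     = 0;
                $last_col = $j;
            }
            $d[$i + 1][$j + 1] = min(
                $d[$i][$j] + $cost,                              # match or substitution
                $d[$i + 1][$j] + 1,                              # insertion
                $d[$i][$j + 1] + 1,                              # deletion
                $d[$k][$l] + ($i - $k - 1) + 1 + ($j - $l - 1),  # transposition
            );
        }
        $last_row{ $s[$i - 1] } = $i;
    }
    return $d[$m + 1][$n + 1];
}

print dl_distance('four', 'fuor'), "\n";   # prints 1
print dl_distance('ca', 'abc'), "\n";      # prints 2
```

The second call is the case that separates the unrestricted distance from the simpler "optimal string alignment" variant, which would report 3 for the same pair. The sketch compares characters as Perl sees them, so non-ASCII input needs to be decoded text (for example under use utf8 for literals).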
CONSTRUCTOR¶
new¶
Creates and returns a "Text::Levenshtein::Damerau" object. Takes a scalar with the text (source) you want to compare against.
    my $tld = Text::Levenshtein::Damerau->new('Neil');
    # Creates a new Text::Levenshtein::Damerau object $tld
METHODS¶
$tld->dld¶
Scalar Argument: Takes a string to compare with.
Returns: an integer representing the edit distance between the source and the passed argument.
Hashref Argument: Takes a hashref containing:
- list => \@array (array ref of strings to compare with)
- OPTIONAL max_distance => $int (only return results with $int distance or less).
- OPTIONAL backend => 'Some::Module::its_function' Any module function that takes 2 arguments and returns an int. If the module fails to load, the function doesn't exist, or the function doesn't return a number when passed 2 strings, then "backend" remains unchanged.
    # Override defaults and use Text::Levenshtein::Damerau::PP's pp_edistance()
    $tld->dld({ list => \@list, backend => 'Text::Levenshtein::Damerau::PP::pp_edistance' });

    # Override defaults and use Text::Levenshtein::Damerau::XS's xs_edistance()
    use Text::Levenshtein::Damerau;
    require Text::Levenshtein::Damerau::XS;
    ...
    $tld->dld({ list => \@list, backend => 'Text::Levenshtein::Damerau::XS::xs_edistance' });
Returns: a hashref with each word from the passed list as a key and its edit distance as the value (limited to results within max_distance, which is unlimited by default).
    my $tld = Text::Levenshtein::Damerau->new('Neil');
    print $tld->dld( 'Niel' );
    # prints 1

    # or if you want to check the distance of various items in a list
    my @names_list = ('Niel','Jack');
    my $d_ref = $tld->dld({ list => \@names_list }); # pass a list, returns a hash ref
    print $d_ref->{'Niel'}; # prints 1
    print $d_ref->{'Jack'}; # prints 4
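The three failure conditions listed for the backend option (module fails to load, function doesn't exist, function doesn't return a number) can be sketched as a small validation routine. This is only an illustration of those checks, not the module's own code; resolve_backend, the probe strings, and the stand-in $current backend are hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: turn 'Some::Module::its_function' into a code ref,
# or return nothing if any of the documented checks fails.
sub resolve_backend {
    my ($name) = @_;

    # Split into module and function at the last '::'.
    my ($module, $func) = $name =~ /\A(.+)::(\w+)\z/
        or return;

    # 1. The module must load.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };

    # 2. The function must exist.
    my $code = $module->can($func) or return;

    # 3. It must return a number when passed two strings.
    my $probe = eval { $code->('four', 'fuor') };
    return unless defined $probe && looks_like_number($probe);

    return $code;
}

# Stand-in for whatever backend is already active (not a real distance).
my $current = sub { $_[0] eq $_[1] ? 0 : 1 };

# A backend that cannot be loaded leaves the current one in place.
my $backend = resolve_backend('No::Such::Module::distance') || $current;
```

Falling back with || mirrors the documented behaviour that "backend" remains unchanged when a check fails.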
$tld->dld_best_match¶
Argument: a hashref containing list => \@array (array ref of strings to compare with).
Returns: the string with the smallest edit distance between the source and the array of strings passed.
Takes distance of $tld source against every item in the passed list, then returns the string of the best match.
    my $tld = Text::Levenshtein::Damerau->new('Neil');
    my @name_spellings = ('Niel','Neell','KNiel');
    print $tld->dld_best_match({ list => \@name_spellings });
    # prints Niel
$tld->dld_best_distance¶
Argument: a hashref containing list => \@array (array ref of strings to compare with).
Returns: the smallest edit distance between the source and the array reference of strings passed.
Takes distance of $tld source against every item in the passed array, then returns the smallest edit distance.
    my $tld = Text::Levenshtein::Damerau->new('Neil');
    my @name_spellings = ('Niel','Neell','KNiel');
    print $tld->dld_best_distance({ list => \@name_spellings });
    # prints 1
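One way to picture dld_best_match and dld_best_distance is as reductions over the word-to-distance hashref that dld returns for a list. The sketch below uses hand-computed distances from 'Neil' as a stand-in for that hashref so that it runs without the module; the variable names and the alphabetical tie-break are assumptions made for the example, not documented behaviour.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hand-computed distances from 'Neil', standing in for the hashref
# that $tld->dld({ list => \@name_spellings }) would return.
my %distance_for = ( Niel => 1, Neell => 2, KNiel => 2 );

# Smallest distance in the set (the "best distance").
my $best_distance = min values %distance_for;

# A word with that distance (the "best match"). Ties are broken
# alphabetically here, which is an arbitrary choice for this sketch.
my ($best_match) = sort {
    $distance_for{$a} <=> $distance_for{$b} or $a cmp $b
} keys %distance_for;

print "$best_match $best_distance\n";   # prints "Niel 1"
```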
EXPORTABLE METHODS¶
edistance¶
Arguments: source string and target string.
- OPTIONAL 3rd argument int (max distance; only return results with $int distance or less). 0 = unlimited. Default = 0.
Returns: int that represents the edit distance between the two arguments. -1 if max distance is set and reached.
Wrapper function to take the edit distance between a source and target string. It will attempt to use, in order:
- Text::Levenshtein::Damerau::XS xs_edistance
- Text::Levenshtein::Damerau::PP pp_edistance
    use Text::Levenshtein::Damerau qw/edistance/;
    print edistance('Neil','Niel');
    # prints 1
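The max-distance convention for the optional 3rd argument (0 means unlimited, -1 signals that the limit was hit) can be illustrated with a small stand-alone helper. This is a sketch of the convention only: apply_max_distance is a hypothetical name, it works on an already computed distance rather than on strings, and it assumes that "reached" means the distance is greater than the limit, in line with "only return results with $int distance or less".

```perl
use strict;
use warnings;

# Hypothetical helper: apply the documented max-distance convention to a
# distance that has already been computed.
# Assumption: a distance equal to the limit is still returned as-is.
sub apply_max_distance {
    my ($distance, $max_distance) = @_;
    $max_distance = int($max_distance || 0);    # missing or 0 means unlimited
    return $distance if $max_distance == 0;
    return $distance <= $max_distance ? $distance : -1;
}

print apply_max_distance(4, 0), "\n";   # prints 4  (unlimited)
print apply_max_distance(4, 2), "\n";   # prints -1 (over the limit)
print apply_max_distance(1, 2), "\n";   # prints 1  (within the limit)
```

Callers that set a limit should therefore test for -1 before treating the return value as a distance.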
SEE ALSO¶
- <https://github.com/ugexe/Text--Levenshtein--Damerau> Repository
- <http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance> Damerau levenshtein explanation
- Text::Fuzzy Regular levenshtein distance
BUGS¶
Please report bugs to:
<https://rt.cpan.org/Public/Dist/Display.html?Name=Text-Levenshtein-Damerau>
AUTHOR¶
Nick Logan <ug@skunkds.com>
LICENSE AND COPYRIGHT¶
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
2022-10-13 | perl v5.34.0 |