reference/strings/functions/levenshtein.xml
a484d5d2bbad7e385a8864370b5280c4e05add8c
...
...
@@ -1,6 +1,6 @@
1
1
<?xml version="1.0" encoding="utf-8"?>
2
2
<!-- $Revision$ -->
3
-
<refentry xmlns="http://docbook.org/ns/docbook" xml:id="function.levenshtein" xmlns:xlink="http://www.w3.org/1999/xlink">
3
+
<refentry xml:id="function.levenshtein" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
4
4
<refnamediv>
5
5
<refname>levenshtein</refname>
6
6
<refpurpose>Calculate Levenshtein distance between two strings</refpurpose>
...
...
@@ -10,39 +10,29 @@
10
10
&reftitle.description;
11
11
<methodsynopsis>
12
12
<type>int</type><methodname>levenshtein</methodname>
13
-
<methodparam><type>string</type><parameter>str1</parameter></methodparam>
14
-
<methodparam><type>string</type><parameter>str2</parameter></methodparam>
15
-
</methodsynopsis>
16
-
<methodsynopsis>
17
-
<type>int</type><methodname>levenshtein</methodname>
18
-
<methodparam><type>string</type><parameter>str1</parameter></methodparam>
19
-
<methodparam><type>string</type><parameter>str2</parameter></methodparam>
20
-
<methodparam><type>int</type><parameter>cost_ins</parameter></methodparam>
21
-
<methodparam><type>int</type><parameter>cost_rep</parameter></methodparam>
22
-
<methodparam><type>int</type><parameter>cost_del</parameter></methodparam>
13
+
<methodparam><type>string</type><parameter>string1</parameter></methodparam>
14
+
<methodparam><type>string</type><parameter>string2</parameter></methodparam>
15
+
<methodparam choice="opt"><type>int</type><parameter>insertion_cost</parameter><initializer>1</initializer></methodparam>
16
+
<methodparam choice="opt"><type>int</type><parameter>replacement_cost</parameter><initializer>1</initializer></methodparam>
17
+
<methodparam choice="opt"><type>int</type><parameter>deletion_cost</parameter><initializer>1</initializer></methodparam>
23
18
</methodsynopsis>
24
19
<para>
25
20
The Levenshtein distance is defined as the minimal number of
26
21
characters you have to replace, insert or delete to transform
27
-
<parameter>str1</parameter> into <parameter>str2</parameter>.
22
+
<parameter>string1</parameter> into <parameter>string2</parameter>.
28
23
The complexity of the algorithm is <literal>O(m*n)</literal>,
29
24
where <literal>n</literal> and <literal>m</literal> are the
30
-
length of <parameter>str1</parameter> and
31
-
<parameter>str2</parameter> (rather good when compared to
32
-
<function>similar_text</function>, which is O(max(n,m)**3),
25
+
length of <parameter>string1</parameter> and
26
+
<parameter>string2</parameter> (rather good when compared to
27
+
<function>similar_text</function>, which is <literal>O(max(n,m)**3)</literal>,
33
28
but still expensive).
34
29
</para>
35
30
<para>
36
-
In its simplest form the function will take only the two
37
-
strings as parameter and will calculate just the number of
38
-
insert, replace and delete operations needed to transform
39
-
<parameter>str1</parameter> into <parameter>str2</parameter>.
40
-
</para>
41
-
<para>
42
-
A second variant will take three additional parameters that
43
-
define the cost of insert, replace and delete operations. This
44
-
is more general and adaptive than variant one, but not as
45
-
efficient.
31
+
If <parameter>insertion_cost</parameter>, <parameter>replacement_cost</parameter>
32
+
and/or <parameter>deletion_cost</parameter> are unequal to <literal>1</literal>,
33
+
the algorithm adapts to choose the cheapest transforms.
34
+
E.g. if <code>$insertion_cost + $deletion_cost &lt; $replacement_cost</code>,
35
+
no replacements will be done, but rather inserts and deletions instead.
46
36
</para>
47
37
</refsect1>
48
38

...
...
@@ -51,7 +41,7 @@
51
41
<para>
52
42
<variablelist>
53
43
<varlistentry>
54
-
<term><parameter>str1</parameter></term>
44
+
<term><parameter>string1</parameter></term>
55
45
<listitem>
56
46
<para>
57
47
One of the strings being evaluated for Levenshtein distance.
...
...
@@ -59,7 +49,7 @@
59
49
</listitem>
60
50
</varlistentry>
61
51
<varlistentry>
62
-
<term><parameter>str2</parameter></term>
52
+
<term><parameter>string2</parameter></term>
63
53
<listitem>
64
54
<para>
65
55
One of the strings being evaluated for Levenshtein distance.
...
...
@@ -67,7 +57,7 @@
67
57
</listitem>
68
58
</varlistentry>
69
59
<varlistentry>
70
-
<term><parameter>cost_ins</parameter></term>
60
+
<term><parameter>insertion_cost</parameter></term>
71
61
<listitem>
72
62
<para>
73
63
Defines the cost of insertion.
...
...
@@ -75,7 +65,7 @@
75
65
</listitem>
76
66
</varlistentry>
77
67
<varlistentry>
78
-
<term><parameter>cost_rep</parameter></term>
68
+
<term><parameter>replacement_cost</parameter></term>
79
69
<listitem>
80
70
<para>
81
71
Defines the cost of replacement.
...
...
@@ -83,7 +73,7 @@
83
73
</listitem>
84
74
</varlistentry>
85
75
<varlistentry>
86
-
<term><parameter>cost_del</parameter></term>
76
+
<term><parameter>deletion_cost</parameter></term>
87
77
<listitem>
88
78
<para>
89
79
Defines the cost of deletion.
...
...
@@ -98,11 +88,40 @@
98
88
&reftitle.returnvalues;
99
89
<para>
100
90
This function returns the Levenshtein-Distance between the
101
-
two argument strings or -1, if one of the argument strings
102
-
is longer than the limit of 255 characters.
91
+
two argument strings.
103
92
</para>
104
93
</refsect1>
105
94

95
+
<refsect1 role="changelog">
96
+
&reftitle.changelog;
97
+
<informaltable>
98
+
<tgroup cols="2">
99
+
<thead>
100
+
<row>
101
+
<entry>&Version;</entry>
102
+
<entry>&Description;</entry>
103
+
</row>
104
+
</thead>
105
+
<tbody>
106
+
<row>
107
+
<entry>8.0.0</entry>
108
+
<entry>
109
+
Prior to this version, <function>levenshtein</function> had to be called
110
+
with either two or five arguments.
111
+
</entry>
112
+
</row>
113
+
<row>
114
+
<entry>8.0.0</entry>
115
+
<entry>
116
+
Prior to this version, <function>levenshtein</function> would return <literal>-1</literal>
117
+
if one of the argument strings is longer than 255 characters.
118
+
</entry>
119
+
</row>
120
+
</tbody>
121
+
</tgroup>
122
+
</informaltable>
123
+
</refsect1>
124
+

106
125
<refsect1 role="examples">
107
126
&reftitle.examples;
108
127
<para>
...
...
@@ -181,7 +200,6 @@ Did you mean: carrot?
181
200
</refsect1>
182
201

183
202
</refentry>
184
-

185
203
<!-- Keep this comment at the end of the file
186
204
Local variables:
187
205
mode: sgml
188
206