language/types/string.xml
affa37e16f562d9297e83b2e21ec416aadc8b72d
...
...
@@ -4,7 +4,7 @@
4
4
<title>Strings</title>
5
5

6
6
<para>
7
-
A <type>string</type> is series of characters, where a character is
7
+
A <type>string</type> is a series of characters, where a character is
8
8
the same as a byte. This means that PHP only supports a 256-character set,
9
9
and hence does not offer native Unicode support. See
10
10
<link linkend="language.types.string.details">details of the string
...
...
@@ -13,10 +13,8 @@
13
13

14
14
<note>
15
15
<simpara>
16
-
As of PHP 7.0.0, there are no particular restrictions regarding the length of
17
-
a <type>string</type> on 64-bit builds. On 32-bit builds and in earlier
18
-
versions, a
19
-
<type>string</type> can be as large as up to 2GB (2147483647 bytes maximum)
16
+
On 32-bit builds, a <type>string</type> can be as large as 2GB
17
+
(2147483647 bytes maximum).
20
18
</simpara>
21
19
</note>
22
20

...
...
@@ -46,7 +44,6 @@
46
44
<listitem>
47
45
<simpara>
48
46
<link linkend="language.types.string.syntax.nowdoc">nowdoc syntax</link>
49
-
(since PHP 5.3.0)
50
47
</simpara>
51
48
</listitem>
52
49
</itemizedlist>
...
...
@@ -144,15 +141,15 @@ echo 'Variables do not $expand $either';
144
141
</row>
145
142
<row>
146
143
<entry><literal>\v</literal></entry>
147
-
<entry>vertical tab (VT or 0x0B (11) in ASCII) (since PHP 5.2.5)</entry>
144
+
<entry>vertical tab (VT or 0x0B (11) in ASCII)</entry>
148
145
</row>
149
146
<row>
150
147
<entry><literal>\e</literal></entry>
151
-
<entry>escape (ESC or 0x1B (27) in ASCII) (since PHP 5.4.4)</entry>
148
+
<entry>escape (ESC or 0x1B (27) in ASCII)</entry>
152
149
</row>
153
150
<row>
154
151
<entry><literal>\f</literal></entry>
155
-
<entry>form feed (FF or 0x0C (12) in ASCII) (since PHP 5.2.5)</entry>
152
+
<entry>form feed (FF or 0x0C (12) in ASCII)</entry>
156
153
</row>
157
154
<row>
158
155
<entry><literal>\\</literal></entry>
...
...
@@ -169,24 +166,25 @@ echo 'Variables do not $expand $either';
169
166
<row>
170
167
<entry><literal>\[0-7]{1,3}</literal></entry>
171
168
<entry>
172
-
the sequence of characters matching the regular expression is a
173
-
character in octal notation, which silently overflows to fit in a byte
174
-
(e.g. "\400" === "\000")
169
+
Octal: the sequence of characters matching the regular expression <literal>[0-7]{1,3}</literal>
170
+
is a character in octal notation (e.g. <literal>"\101" === "A"</literal>),
171
+
which silently overflows to fit in a byte (e.g. <literal>"\400" === "\000"</literal>)
175
172
</entry>
176
173
</row>
177
174
<row>
178
175
<entry><literal>\x[0-9A-Fa-f]{1,2}</literal></entry>
179
176
<entry>
180
-
the sequence of characters matching the regular expression is a
181
-
character in hexadecimal notation
177
+
Hexadecimal: the sequence of characters matching the regular expression
178
+
<literal>[0-9A-Fa-f]{1,2}</literal> is a character in hexadecimal notation
179
+
(e.g. <literal>"\x41" === "A"</literal>)
182
180
</entry>
183
181
</row>
184
182
<row>
185
183
<entry><literal>\u{[0-9A-Fa-f]+}</literal></entry>
186
184
<entry>
187
-
the sequence of characters matching the regular expression is a
188
-
Unicode codepoint, which will be output to the string as that
189
-
codepoint's UTF-8 representation (added in PHP 7.0.0)
185
+
Unicode: the sequence of characters matching the regular expression <literal>[0-9A-Fa-f]+</literal>
186
+
is a Unicode codepoint, which will be output to the string as that codepoint's UTF-8 representation.
187
+
The braces are required in the sequence. E.g. <literal>"\u{41}" === "A"</literal>
190
188
</entry>
191
189
</row>
192
190
</tbody>
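A minimal sketch of the numeric escape sequences from the table in this hunk. It is written for illustration and is not part of the patch; the variable names are hypothetical.

```php
<?php
// Octal, hexadecimal and Unicode escapes can all produce the byte for "A" (0x41).
$octal   = "\101";
$hex     = "\x41";
$unicode = "\u{41}";
var_dump($octal === "A", $hex === "A", $unicode === "A");

// Codepoints above U+007F are written into the string as multi-byte UTF-8.
$eAcute = "\u{e9}";          // U+00E9, LATIN SMALL LETTER E WITH ACUTE
var_dump(bin2hex($eAcute));  // the two UTF-8 bytes, shown as hex: c3a9
var_dump(strlen($eAcute));   // strlen() counts bytes, so this is 2
```

Because a character is the same as a byte in PHP, the Unicode escape changes only how the literal is written, not how the resulting string is measured.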
...
...
@@ -195,8 +193,7 @@ echo 'Variables do not $expand $either';
195
193

196
194
<para>
197
195
As in single quoted <type>string</type>s, escaping any other character will
198
-
result in the backslash being printed too. Before PHP 5.1.1, the backslash
199
-
in <literal>\{$var}</literal> had not been printed.
196
+
result in the backslash being printed too.
200
197
</para>
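The rule in the paragraph above can be checked with a short sketch (illustrative only, with hypothetical variable names): an unrecognised escape keeps its backslash, while a recognised one collapses to a single byte.

```php
<?php
// "\q" is not a recognised escape sequence, so the backslash and the "q" both remain.
$unknown = "\q";
var_dump(strlen($unknown));    // 2 bytes: backslash and "q"
var_dump($unknown === '\q');   // same content as the single quoted form

// "\n" is recognised and becomes one line feed byte.
$known = "\n";
var_dump(strlen($known));      // 1 byte
```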
201
198

202
199
<para>
...
...
@@ -218,16 +215,198 @@ echo 'Variables do not $expand $either';
218
215
</simpara>
219
216

220
217
<simpara>
221
-
The closing identifier <emphasis>must</emphasis> begin in the first column
222
-
of the line. Also, the identifier must follow the same naming rules as any
218
+
The closing identifier may be indented by space or tab, in which case
219
+
the indentation will be stripped from all lines in the doc string.
220
+
Prior to PHP 7.3.0, the closing identifier <emphasis>must</emphasis>
221
+
begin in the first column of the line.
222
+
</simpara>
223
+

224
+
<simpara>
225
+
Also, the closing identifier must follow the same naming rules as any
223
226
other label in PHP: it must contain only alphanumeric characters and
224
227
underscores, and must start with a non-digit character or underscore.
225
228
</simpara>
226
229

230
+
<example>
231
+
<title>Basic Heredoc example as of PHP 7.3.0</title>
232
+
<programlisting role="php">
233
+
<![CDATA[
234
+
<?php
235
+
// no indentation
236
+
echo <<<END
237
+
a
238
+
b
239
+
c
240
+
\n
241
+
END;
242
+

243
+
// 4 spaces of indentation
244
+
echo <<<END
245
+
a
246
+
b
247
+
c
248
+
END;
249
+
]]>
250
+
</programlisting>
251
+
&example.outputs.73;
252
+
<screen>
253
+
<![CDATA[
254
+
a
255
+
b
256
+
c
257
+

258
+
a
259
+
b
260
+
c
261
+
]]>
262
+
</screen>
263
+
</example>
264
+

265
+
<simpara>
266
+
If the closing identifier is indented further than any lines of the body, then a <classname>ParseError</classname> will be thrown:
267
+
</simpara>
268
+

269
+
<example>
270
+
<title>Closing identifier must not be indented further than any lines of the body</title>
271
+
<programlisting role="php">
272
+
<![CDATA[
273
+
<?php
274
+
echo <<<END
275
+
a
276
+
b
277
+
c
278
+
END;
279
+
]]>
280
+
</programlisting>
281
+
&example.outputs.73;
282
+
<screen>
283
+
<![CDATA[
284
+
PHP Parse error: Invalid body indentation level (expecting an indentation level of at least 3) in example.php on line 4
285
+
]]>
286
+
</screen>
287
+
</example>
288
+

289
+
<simpara>
290
+
If the closing identifier is indented, tabs can be used as well; however,
291
+
tabs and spaces <emphasis>must not</emphasis> be intermixed regarding
292
+
the indentation of the closing identifier and the indentation of the body
293
+
(up to the closing identifier). In any of these cases, a <classname>ParseError</classname> will be thrown.
294
+

295
+
These whitespace constraints have been included because mixing tabs and
296
+
spaces for indentation is harmful to legibility.
297
+
</simpara>
298
+

299
+
<example>
300
+
<title>Different indentation for body and closing identifier</title>
301
+
<programlisting role="php">
302
+
<![CDATA[
303
+
<?php
304
+
// None of the following examples work.
305
+

306
+
// different indentation for body (spaces) and ending marker (tabs)
307
+
{
308
+
echo <<<END
309
+
a
310
+
END;
311
+
}
312
+

313
+
// mixing spaces and tabs in body
314
+
{
315
+
echo <<<END
316
+
a
317
+
END;
318
+
}
319
+

320
+
// mixing spaces and tabs in ending marker
321
+
{
322
+
echo <<<END
323
+
a
324
+
END;
325
+
}
326
+
]]>
327
+
</programlisting>
328
+
&example.outputs.73;
329
+
<screen>
330
+
<![CDATA[
331
+
PHP Parse error: Invalid indentation - tabs and spaces cannot be mixed in example.php line 8
332
+
]]>
333
+
</screen>
334
+
</example>
335
+

336
+
<simpara>
337
+
The closing identifier for the body string is not required to be
338
+
followed by a semicolon or newline. For example, the following code
339
+
is allowed as of PHP 7.3.0:
340
+
</simpara>
341
+

342
+
<example>
343
+
<title>Continuing an expression after a closing identifier</title>
344
+
<programlisting role="php">
345
+
<![CDATA[
346
+
<?php
347
+
$values = [<<<END
348
+
a
349
+
b
350
+
c
351
+
END, 'd e f'];
352
+
var_dump($values);
353
+
]]>
354
+
</programlisting>
355
+
&example.outputs.73;
356
+
<screen>
357
+
<![CDATA[
358
+
array(2) {
359
+
[0] =>
360
+
string(11) "a
361
+
b
362
+
c"
363
+
[1] =>
364
+
string(5) "d e f"
365
+
}
366
+
]]>
367
+
</screen>
368
+
</example>
369
+

227
370
<warning>
228
371
<simpara>
229
-
It is very important to note that the line with the closing identifier must
230
-
contain no other characters, except a semicolon (<literal>;</literal>).
372
+
If the closing identifier is found at the start of a line, then
373
+
regardless of whether it is part of another word, it may be treated
374
+
as the closing identifier and cause a <classname>ParseError</classname>.
375
+
</simpara>
376
+

377
+
<example>
378
+
<title>Closing identifier in body of the string tends to cause ParseError</title>
379
+
<programlisting role="php">
380
+
<![CDATA[
381
+
<?php
382
+
$values = [<<<END
383
+
a
384
+
b
385
+
END ING
386
+
END, 'd e f'];
387
+
]]>
388
+
</programlisting>
389
+
&example.outputs.73;
390
+
<screen>
391
+
<![CDATA[
392
+
PHP Parse error: syntax error, unexpected identifier "ING", expecting "]" in example.php on line 6
393
+
]]>
394
+
</screen>
395
+
</example>
396
+

397
+
<simpara>
398
+
To avoid this problem, it is safe to follow the simple rule:
399
+
<emphasis>do not choose a closing identifier that appears in the body
400
+
of the text</emphasis>.
401
+
</simpara>
402
+

403
+
</warning>
404
+

405
+
<warning>
406
+
<simpara>
407
+
Prior to PHP 7.3.0, it is very important to note that the line with the
408
+
closing identifier must contain no other characters, except a semicolon
409
+
(<literal>;</literal>).
231
410
That means especially that the identifier
232
411
<emphasis>may not be indented</emphasis>, and there may not be any spaces
233
412
or tabs before or after the semicolon. It's also important to realize that
...
...
@@ -245,7 +424,7 @@ echo 'Variables do not $expand $either';
245
424
</simpara>
246
425

247
426
<example>
248
-
<title>Invalid example</title>
427
+
<title>Invalid example, prior to PHP 7.3.0</title>
249
428
<programlisting role="php">
250
429
<!-- This is an INVALID example -->
251
430
<![CDATA[
...
...
@@ -261,7 +440,7 @@ bar
261
440
</programlisting>
262
441
</example>
263
442
<example>
264
-
<title>Valid example</title>
443
+
<title>Valid example, even prior to PHP 7.3.0</title>
265
444
<programlisting role="php">
266
445
<!-- This is a VALID example -->
267
446
<![CDATA[
...
...
@@ -277,8 +456,7 @@ EOT;
277
456
</example>
278
457

279
458
<para>
280
-
Heredocs can not be used for initializing class properties. Since PHP 5.3,
281
-
this limitation is valid only for heredocs containing variables.
459
+
Heredocs containing variables can not be used for initializing class properties.
282
460
</para>
283
461

284
462
</warning>
...
...
@@ -355,7 +533,7 @@ EOD
355
533
</example>
356
534

357
535
<para>
358
-
As of PHP 5.3.0, it's possible to initialize static variables and class
536
+
It's possible to initialize static variables and class
359
537
properties/constants using the Heredoc syntax:
360
538
</para>
361
539

...
...
@@ -389,7 +567,7 @@ FOOBAR;
389
567
</example>
390
568

391
569
<para>
392
-
Starting with PHP 5.3.0, the opening Heredoc identifier may optionally be
570
+
The opening Heredoc identifier may optionally be
393
571
enclosed in double quotes:
394
572
</para>
395
573

...
...
@@ -503,12 +681,6 @@ EOT;
503
681
</programlisting>
504
682
</example>
505
683

506
-
<note>
507
-
<para>
508
-
Nowdoc support was added in PHP 5.3.0.
509
-
</para>
510
-
</note>
511
-

512
684
</sect3>
513
685

514
686
<sect3 xml:id="language.types.string.parsing">
...
...
@@ -549,11 +721,14 @@ EOT;
549
721
<?php
550
722
$juice = "apple";
551
723

552
-
echo "He drank some $juice juice.".PHP_EOL;
553
-
// Invalid. "s" is a valid character for a variable name, but the variable is $juice.
554
-
echo "He drank some juice made of $juices.";
555
-
// Valid. Explicitly specify the end of the variable name by enclosing it in braces:
556
-
echo "He drank some juice made of ${juice}s.";
724
+
echo "He drank some $juice juice." . PHP_EOL;
725
+

726
+
// Unintended. "s" is a valid character for a variable name, so this refers to $juices, not $juice.
727
+
echo "He drank some juice made of $juices." . PHP_EOL;
728
+

729
+
// Explicitly specify the end of the variable name by enclosing the reference in braces.
730
+
echo "He drank some juice made of {$juice}s.";
731
+

557
732
?>
558
733
]]>
559
734
</programlisting>
...
...
@@ -656,8 +831,8 @@ Changing the character at index -3 to o gives strong.
656
831
<simpara>
657
832
Any scalar variable, array element or object property with a
658
833
<type>string</type> representation can be included via this syntax.
659
-
Simply write the expression the same way as it would appear outside the
660
-
<type>string</type>, and then wrap it in <literal>{</literal> and
834
+
The expression is written the same way as it would appear outside the
835
+
<type>string</type>, and then wrapped in <literal>{</literal> and
661
836
<literal>}</literal>. Since <literal>{</literal> can not be escaped, this
662
837
syntax will only be recognised when the <literal>$</literal> immediately
663
838
follows the <literal>{</literal>. Use <literal>{\$</literal> to get a
...
...
@@ -691,9 +866,9 @@ echo "This works: {$arr['key']}";
691
866
echo "This works: {$arr[4][3]}";
692
867

693
868
// This is wrong for the same reason as $foo[bar] is wrong outside a string.
694
-
// In other words, it will still work, but only because PHP first looks for a
695
-
// constant named foo; an error of level E_NOTICE (undefined constant) will be
696
-
// thrown.
869
+
// PHP first looks for a constant named foo, and throws an error if not found.
870
+
// If the constant is found, its value (and not 'foo' itself) would be used
871
+
// for the array index.
697
872
echo "This is wrong: {$arr[foo][3]}";
698
873

699
874
// Works. When using multi-dimensional arrays, always use braces around arrays
...
...
@@ -713,6 +888,11 @@ echo "This is the value of the var named by the return value of \$object->getNam
713
888

714
889
// Won't work, outputs: This is the return value of getName(): {getName()}
715
890
echo "This is the return value of getName(): {getName()}";
891
+

892
+
// Won't work, outputs: C:\folder\{fantastic}.txt
893
+
echo "C:\folder\{$great}.txt";
894
+
// Works, outputs: C:\folder\fantastic.txt
895
+
echo "C:\\folder\\{$great}.txt";
716
896
?>
717
897
]]>
718
898
<!-- maybe it's better to leave this out??
...
...
@@ -756,9 +936,9 @@ I am bar.
756
936

757
937
<note>
758
938
<para>
759
-
Functions, method calls, static class variables, and class
760
-
constants inside <literal>{$}</literal> work since PHP
761
-
5. However, the value accessed will be interpreted as the name
939
+
The value accessed from functions, method calls, static class variables,
940
+
and class constants inside
941
+
<literal>{$}</literal> will be interpreted as the name
762
942
of a variable in the scope in which the string is defined. Using
763
943
single curly braces (<literal>{}</literal>) will not work for
764
944
accessing the return values of functions or methods or the
...
...
@@ -819,8 +999,9 @@ echo "I'd like an {${beers::$ale}}\n";
819
999

820
1000
<note>
821
1001
<simpara>
822
-
<type>String</type>s may also be accessed using braces, as in
1002
+
Prior to PHP 8.0.0, <type>string</type>s could also be accessed using braces, as in
823
1003
<varname>$str{42}</varname>, for the same purpose.
1004
+
This curly brace syntax was deprecated as of PHP 7.4.0 and is no longer supported as of PHP 8.0.0.
824
1005
</simpara>
825
1006
</note>
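A hedged sketch of what the note means for code that still uses braces: the square bracket form covers both reading and writing a character. The sample string and variable names are assumptions for illustration.

```php
<?php
$str = 'This is a test.';

// Reading: square brackets replace the removed $str{0} form.
$first = $str[0];    // "T"
$last  = $str[-1];   // "." (negative offsets need PHP 7.1.0 or later)

// Writing: the same syntax assigns a single character at an offset.
$str[0] = 't';
var_dump($first, $last, $str);
```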
826
1007

...
...
@@ -877,12 +1058,13 @@ $str[strlen($str)-1] = 'e';
877
1058
</example>
878
1059

879
1060
<para>
880
-
As of PHP 5.4 string offsets have to either be integers or integer-like strings, otherwise a warning
881
-
will be thrown. Previously an offset like <literal>"foo"</literal> was silently cast to <literal>0</literal>.
1061
+
String offsets have to either be integers or integer-like strings,
1062
+
otherwise a warning will be thrown.
882
1063
</para>
883
1064

884
1065
<example>
885
-
<title>Differences between PHP 5.3 and PHP 5.4</title>
1066
+
<!-- TODO Update for PHP 8.0 -->
1067
+
<title>Example of Illegal String Offsets</title>
886
1068
<programlisting role="php">
887
1069
<![CDATA[
888
1070
<?php
...
...
@@ -902,20 +1084,7 @@ var_dump(isset($str['1x']));
902
1084
?>
903
1085
]]>
904
1086
</programlisting>
905
-
&example.outputs.53;
906
-
<screen>
907
-
<![CDATA[
908
-
string(1) "b"
909
-
bool(true)
910
-
string(1) "b"
911
-
bool(true)
912
-
string(1) "a"
913
-
bool(true)
914
-
string(1) "b"
915
-
bool(true)
916
-
]]>
917
-
</screen>
918
-
&example.outputs.54;
1087
+
&example.outputs;
919
1088
<screen>
920
1089
<![CDATA[
921
1090
string(1) "b"
...
...
@@ -944,10 +1113,18 @@ bool(false)
944
1113

945
1114
<note>
946
1115
<para>
947
-
PHP 5.5 added support for accessing characters within string literals
1116
+
Characters within string literals can be accessed
948
1117
using <literal>[]</literal> or <literal>{}</literal>.
949
1118
</para>
950
1119
</note>
1120
+

1121
+
<note>
1122
+
<para>
1123
+
Accessing characters within string literals using the
1124
+
<literal>{}</literal> syntax has been deprecated in PHP 7.4.
1125
+
This has been removed in PHP 8.0.
1126
+
</para>
1127
+
</note>
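Taken together, the two notes above leave square brackets as the only form that works on PHP 8.0 and later. A small sketch, with hypothetical variable names:

```php
<?php
// An offset can be applied directly to a string literal.
$first = 'string'[0];    // "s"
$last  = 'string'[-1];   // "g" (negative offsets need PHP 7.1.0 or later)
var_dump($first, $last);
```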
951
1128
</sect3>
952
1129
</sect2><!-- end syntax -->
953
1130

...
...
@@ -1000,14 +1177,14 @@ bool(false)
1000
1177
</para>
1001
1178

1002
1179
<para>
1003
-
A <type>boolean</type> &true; value is converted to the <type>string</type>
1004
-
<literal>"1"</literal>. <type>Boolean</type> &false; is converted to
1180
+
A <type>bool</type> &true; value is converted to the <type>string</type>
1181
+
<literal>"1"</literal>. <type>bool</type> &false; is converted to
1005
1182
<literal>""</literal> (the empty string). This allows conversion back and
1006
-
forth between <type>boolean</type> and <type>string</type> values.
1183
+
forth between <type>bool</type> and <type>string</type> values.
1007
1184
</para>
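The round trip described in this paragraph can be sketched as follows (illustrative only; the variable names are hypothetical):

```php
<?php
// bool to string
$t = (string) true;    // "1"
$f = (string) false;   // "" (the empty string)

// string back to bool: "1" is truthy, the empty string is falsy
$backToTrue  = (bool) $t;
$backToFalse = (bool) $f;
var_dump($t, $f, $backToTrue, $backToFalse);
```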
1008
1185

1009
1186
<para>
1010
-
An <type>integer</type> or <type>float</type> is converted to a
1187
+
An <type>int</type> or <type>float</type> is converted to a
1011
1188
<type>string</type> representing the number textually (including the
1012
1189
exponent part for <type>float</type>s). Floating point numbers can be
1013
1190
converted using exponential notation (<literal>4.1E+6</literal>).
...
...
@@ -1015,7 +1192,9 @@ bool(false)
1015
1192

1016
1193
<note>
1017
1194
<para>
1018
-
The decimal point character is defined in the script's locale (category
1195
+
As of PHP 8.0.0, the decimal point character is always
1196
+
a period ("<literal>.</literal>"). Prior to PHP 8.0.0,
1197
+
the decimal point character was defined in the script's locale (category
1019
1198
LC_NUMERIC). See the <function>setlocale</function> function.
1020
1199
</para>
1021
1200
</note>
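A short sketch of numeric to string conversion, assuming PHP's default <literal>precision</literal> setting; on PHP 7 it additionally assumes the default "C" locale. The variable names are hypothetical.

```php
<?php
$int   = (string) 42;        // "42"
$float = (string) 1.5;       // "1.5"
$large = (string) 4.1E+6;    // "4100000": small enough to be written without an exponent
$huge  = (string) 1.0E+25;   // "1.0E+25": large magnitudes keep the exponent part
var_dump($int, $float, $large, $huge);
```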
...
...
@@ -1030,7 +1209,7 @@ bool(false)
1030
1209
</para>
1031
1210

1032
1211
<para>
1033
-
In order to convert <type>object</type>s to <type>string</type> magic
1212
+
In order to convert <type>object</type>s to <type>string</type>, the magic
1034
1213
method <link linkend="language.oop5.magic">__toString</link> must be used.
1035
1214
</para>
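As an illustration of the paragraph above, a minimal sketch of a class that opts in to string conversion. The class <literal>Temperature</literal> and its property are hypothetical and appear nowhere in the manual.

```php
<?php
// Hypothetical value object, used only to show __toString() at work.
final class Temperature
{
    private $celsius;

    public function __construct(float $celsius)
    {
        $this->celsius = $celsius;
    }

    // Called whenever the object is used where a string is expected.
    public function __toString(): string
    {
        return $this->celsius . ' C';
    }
}

$reading = new Temperature(21.5);
$cast    = (string) $reading;          // explicit cast
$message = "Currently $reading";       // interpolation calls __toString() too
var_dump($cast, $message);
```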
1036
1215

...
...
@@ -1065,77 +1244,6 @@ bool(false)
1065
1244

1066
1245
</sect2>
1067
1246

1068
-
<sect2 xml:id="language.types.string.conversion">
1069
-
<title>String conversion to numbers</title>
1070
-

1071
-
<simpara>
1072
-
When a <type>string</type> is evaluated in a numeric context, the resulting
1073
-
value and type are determined as follows.
1074
-
</simpara>
1075
-

1076
-
<simpara>
1077
-
If the <type>string</type> does not contain any of the characters '.', 'e',
1078
-
or 'E' and the numeric value fits into integer type limits (as defined by
1079
-
<constant>PHP_INT_MAX</constant>), the <type>string</type> will be evaluated
1080
-
as an <type>integer</type>. In all other cases it will be evaluated as a
1081
-
<type>float</type>.
1082
-
</simpara>
1083
-

1084
-
<para>
1085
-
The value is given by the initial portion of the <type>string</type>. If the
1086
-
<type>string</type> starts with valid numeric data, this will be the value
1087
-
used. Otherwise, the value will be 0 (zero). Valid numeric data is an
1088
-
optional sign, followed by one or more digits (optionally containing a
1089
-
decimal point), followed by an optional exponent. The exponent is an 'e' or
1090
-
'E' followed by one or more digits.
1091
-
</para>
1092
-

1093
-
<informalexample>
1094
-
<programlisting role="php">
1095
-
<![CDATA[
1096
-
<?php
1097
-
$foo = 1 + "10.5"; // $foo is float (11.5)
1098
-
$foo = 1 + "-1.3e3"; // $foo is float (-1299)
1099
-
$foo = 1 + "bob-1.3e3"; // $foo is integer (1)
1100
-
$foo = 1 + "bob3"; // $foo is integer (1)
1101
-
$foo = 1 + "10 Small Pigs"; // $foo is integer (11)
1102
-
$foo = 4 + "10.2 Little Piggies"; // $foo is float (14.2)
1103
-
$foo = "10.0 pigs " + 1; // $foo is float (11)
1104
-
$foo = "10.0 pigs " + 1.0; // $foo is float (11)
1105
-
?>
1106
-
]]>
1107
-
</programlisting>
1108
-
</informalexample>
1109
-

1110
-
<simpara>
1111
-
For more information on this conversion, see the Unix manual page for
1112
-
strtod(3).
1113
-
</simpara>
1114
-

1115
-
<para>
1116
-
To test any of the examples in this section, cut and paste the examples and
1117
-
insert the following line to see what's going on:
1118
-
</para>
1119
-

1120
-
<informalexample>
1121
-
<programlisting role="php">
1122
-
<![CDATA[
1123
-
<?php
1124
-
echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1125
-
?>
1126
-
]]>
1127
-
</programlisting>
1128
-
</informalexample>
1129
-

1130
-
<para>
1131
-
Do not expect to get the code of one character by converting it to integer,
1132
-
as is done in C. Use the <function>ord</function> and
1133
-
<function>chr</function> functions to convert between ASCII codes and
1134
-
characters.
1135
-
</para>
1136
-

1137
-
</sect2>
1138
-

1139
1247
<sect2 xml:id="language.types.string.details">
1140
1248

1141
1249
<title>Details of the String Type</title>
...
...
@@ -1204,15 +1312,7 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1204
1312
<listitem>
1205
1313
<simpara>
1206
1314
Others use the current locale (see <function>setlocale</function>), but
1207
-
operate byte-by-byte. This is the case of <function>strcasecmp</function>,
1208
-
<function>strtoupper</function> and <function>ucfirst</function>.
1209
-
This means they can be used only with single-byte encodings, as long as
1210
-
the encoding is matched by the locale. For instance
1211
-
<literal>strtoupper("á")</literal> may return <literal>"Á"</literal> if the
1212
-
locale is correctly set and <literal>á</literal> is encoded with a single
1213
-
byte. If it is encoded in UTF-8, the correct result will not be returned
1214
-
and the resulting string may or may not be returned corrupted, depending
1215
-
on the current locale.
1315
+
operate byte-by-byte.
1216
1316
</simpara>
1217
1317
</listitem>
1218
1318
<listitem>
...
...
@@ -1222,9 +1322,6 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1222
1322
<link linkend="book.intl">intl</link> extension and in the
1223
1323
<link linkend="book.pcre">PCRE</link> extension
1224
1324
(in the last case, only when the <literal>u</literal> modifier is used).
1225
-
Although this is due to their special purpose, the function
1226
-
<function>utf8_decode</function> assumes a UTF-8 encoding and the
1227
-
function <function>utf8_encode</function> assumes an ISO-8859-1 encoding.
1228
1325
</simpara>
1229
1326
</listitem>
1230
1327
</itemizedlist>
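To make the last list item concrete, a hedged sketch of how the PCRE <literal>u</literal> modifier changes what counts as one character. The input is written with a Unicode escape so the result does not depend on the encoding of the source file; the variable names are hypothetical.

```php
<?php
$text = "\u{e9}";   // "é", stored as the two UTF-8 bytes 0xC3 0xA9

// Without the u modifier, "." matches one byte at a time.
$byteMatches = preg_match_all('/./', $text);

// With the u modifier, pattern and subject are treated as UTF-8,
// so "." matches the whole codepoint.
$charMatches = preg_match_all('/./u', $text);

var_dump($byteMatches, $charMatches);   // 2 and 1
```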
1231
1328