language/types/string.xml
affa37e16f562d9297e83b2e21ec416aadc8b72d
...
...
@@ -4,7 +4,7 @@
4
4
<title>Strings</title>
5
5

6
6
<para>
7
-
A <type>string</type> is series of characters, where a character is
7
+
A <type>string</type> is a series of characters, where a character is
8
8
the same as a byte. This means that PHP only supports a 256-character set,
9
9
and hence does not offer native Unicode support. See
10
10
<link linkend="language.types.string.details">details of the string
...
...
@@ -13,7 +13,8 @@
13
13

14
14
<note>
15
15
<simpara>
16
-
<type>string</type> can be as large as up to 2GB (2147483647 bytes maximum)
16
+
On 32-bit builds, a <type>string</type> can be as large as 2GB
17
+
(2147483647 bytes maximum)
17
18
</simpara>
18
19
</note>
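<simpara>
Because a character is the same as a byte, functions such as
<function>strlen</function> count bytes rather than characters. The
following is a minimal sketch of that behaviour; it assumes the text is
written with the <literal>\u{...}</literal> escape so that the result does
not depend on the encoding of the script file.
</simpara>
<example>
<title>Illustrative sketch: a string is a series of bytes</title>
<programlisting role="php">
<![CDATA[
<?php
// "\u{e9}" is U+00E9 (é); PHP stores it as its two-byte UTF-8 encoding.
$text = "caf\u{e9}";

$byteCount = strlen($text);      // counts bytes: 5
$lastByte  = bin2hex($text[4]);  // offsets address single bytes: "a9"

var_dump($byteCount, $lastByte);
?>
]]>
</programlisting>
</example>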
19
20

...
...
@@ -43,7 +44,6 @@
43
44
<listitem>
44
45
<simpara>
45
46
<link linkend="language.types.string.syntax.nowdoc">nowdoc syntax</link>
46
-
(since PHP 5.3.0)
47
47
</simpara>
48
48
</listitem>
49
49
</itemizedlist>
...
...
@@ -141,15 +141,15 @@ echo 'Variables do not $expand $either';
141
141
</row>
142
142
<row>
143
143
<entry><literal>\v</literal></entry>
144
-
<entry>vertical tab (VT or 0x0B (11) in ASCII) (since PHP 5.2.5)</entry>
144
+
<entry>vertical tab (VT or 0x0B (11) in ASCII)</entry>
145
145
</row>
146
146
<row>
147
147
<entry><literal>\e</literal></entry>
148
-
<entry>escape (ESC or 0x1B (27) in ASCII) (since PHP 5.4.4)</entry>
148
+
<entry>escape (ESC or 0x1B (27) in ASCII)</entry>
149
149
</row>
150
150
<row>
151
151
<entry><literal>\f</literal></entry>
152
-
<entry>form feed (FF or 0x0C (12) in ASCII) (since PHP 5.2.5)</entry>
152
+
<entry>form feed (FF or 0x0C (12) in ASCII)</entry>
153
153
</row>
154
154
<row>
155
155
<entry><literal>\\</literal></entry>
...
...
@@ -166,24 +166,25 @@ echo 'Variables do not $expand $either';
166
166
<row>
167
167
<entry><literal>\[0-7]{1,3}</literal></entry>
168
168
<entry>
169
-
the sequence of characters matching the regular expression is a
170
-
character in octal notation, which silently overflows to fit in a byte
171
-
(e.g. "\400" === "\000")
169
+
Octal: the sequence of characters matching the regular expression <literal>[0-7]{1,3}</literal>
170
+
is a character in octal notation (e.g. <literal>"\101" === "A"</literal>),
171
+
which silently overflows to fit in a byte (e.g. <literal>"\400" === "\000"</literal>)
172
172
</entry>
173
173
</row>
174
174
<row>
175
175
<entry><literal>\x[0-9A-Fa-f]{1,2}</literal></entry>
176
176
<entry>
177
-
the sequence of characters matching the regular expression is a
178
-
character in hexadecimal notation
177
+
Hexadecimal: the sequence of characters matching the regular expression
178
+
<literal>[0-9A-Fa-f]{1,2}</literal> is a character in hexadecimal notation
179
+
(e.g. <literal>"\x41" === "A"</literal>)
179
180
</entry>
180
181
</row>
181
182
<row>
182
183
<entry><literal>\u{[0-9A-Fa-f]+}</literal></entry>
183
184
<entry>
184
-
the sequence of characters matching the regular expression is a
185
-
Unicode codepoint, which will be output to the string as that
186
-
codepoint's UTF-8 representation (added in PHP 7.0.0)
185
+
Unicode: the sequence of characters matching the regular expression <literal>[0-9A-Fa-f]+</literal>
186
+
is a Unicode codepoint, which will be output to the string as that codepoint's UTF-8 representation.
187
+
The braces are required in the sequence. E.g. <literal>"\u{41}" === "A"</literal>
187
188
</entry>
188
189
</row>
189
190
</tbody>
...
...
@@ -192,8 +193,7 @@ echo 'Variables do not $expand $either';
192
193

193
194
<para>
194
195
As in single quoted <type>string</type>s, escaping any other character will
195
-
result in the backslash being printed too. Before PHP 5.1.1, the backslash
196
-
in <literal>\{$var}</literal> had not been printed.
196
+
result in the backslash being printed too.
197
197
</para>
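<simpara>
The escape rules above can be checked with a short script. This is only an
illustrative sketch; the variable names are arbitrary.
</simpara>
<example>
<title>Illustrative sketch: escape sequences in double quoted strings</title>
<programlisting role="php">
<![CDATA[
<?php
// Octal, hexadecimal and Unicode escapes can all produce the same byte.
$octal   = "\101";    // "A"
$hex     = "\x41";    // "A"
$unicode = "\u{41}";  // "A"

// An unrecognised escape keeps its backslash, so this string has two bytes.
$unknown = "\q";

var_dump($octal === $hex && $hex === $unicode);  // bool(true)
var_dump(strlen($unknown));                      // int(2)
?>
]]>
</programlisting>
</example>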
198
198

199
199
<para>
...
...
@@ -215,22 +215,204 @@ echo 'Variables do not $expand $either';
215
215
</simpara>
216
216

217
217
<simpara>
218
-
The closing identifier <emphasis>must</emphasis> begin in the first column
219
-
of the line. Also, the identifier must follow the same naming rules as any
218
+
The closing identifier may be indented by space or tab, in which case
219
+
the indentation will be stripped from all lines in the doc string.
220
+
Prior to PHP 7.3.0, the closing identifier <emphasis>must</emphasis>
221
+
begin in the first column of the line.
222
+
</simpara>
223
+

224
+
<simpara>
225
+
Also, the closing identifier must follow the same naming rules as any
220
226
other label in PHP: it must contain only alphanumeric characters and
221
227
underscores, and must start with a non-digit character or underscore.
222
228
</simpara>
223
229

230
+
<example>
231
+
<title>Basic Heredoc example as of PHP 7.3.0</title>
232
+
<programlisting role="php">
233
+
<![CDATA[
234
+
<?php
235
+
// no indentation
236
+
echo <<<END
237
+
a
238
+
b
239
+
c
240
+
\n
241
+
END;
242
+

243
+
// 4 spaces of indentation
244
+
echo <<<END
245
+
    a
246
+
    b
247
+
    c
248
+
    END;
249
+
]]>
250
+
</programlisting>
251
+
&example.outputs.73;
252
+
<screen>
253
+
<![CDATA[
254
+
a
255
+
b
256
+
c
257
+

258
+
a
259
+
b
260
+
c
261
+
]]>
262
+
</screen>
263
+
</example>
264
+

265
+
<simpara>
266
+
If the closing identifier is indented further than any lines of the body, then a <classname>ParseError</classname> will be thrown:
267
+
</simpara>
268
+

269
+
<example>
270
+
<title>Closing identifier must not be indented further than any lines of the body</title>
271
+
<programlisting role="php">
272
+
<![CDATA[
273
+
<?php
274
+
echo <<<END
275
+
  a
276
+
 b
277
+
c
278
+
   END;
279
+
]]>
280
+
</programlisting>
281
+
&example.outputs.73;
282
+
<screen>
283
+
<![CDATA[
284
+
PHP Parse error: Invalid body indentation level (expecting an indentation level of at least 3) in example.php on line 4
285
+
]]>
286
+
</screen>
287
+
</example>
288
+

289
+
<simpara>
290
+
If the closing identifier is indented, tabs can be used as well; however,
291
+
tabs and spaces <emphasis>must not</emphasis> be intermixed regarding
292
+
the indentation of the closing identifier and the indentation of the body
293
+
(up to the closing identifier). In any of these cases, a <classname>ParseError</classname> will be thrown.
294
+

295
+
These whitespace constraints have been included because mixing tabs and
296
+
spaces for indentation is harmful to legibility.
297
+
</simpara>
298
+

299
+
<example>
300
+
<title>Different indentation for body (spaces) and closing identifier (tabs)</title>
301
+
<programlisting role="php">
302
+
<![CDATA[
303
+
<?php
304
+
// None of the following code works.
305
+

306
+
// different indentation for body (spaces) ending marker (tabs)
307
+
{
308
+
echo <<<END
309
+
a
310
+
END;
311
+
}
312
+

313
+
// mixing spaces and tabs in body
314
+
{
315
+
echo <<<END
316
+
a
317
+
END;
318
+
}
319
+

320
+
// mixing spaces and tabs in ending marker
321
+
{
322
+
echo <<<END
323
+
a
324
+
END;
325
+
}
326
+
]]>
327
+
</programlisting>
328
+
&example.outputs.73;
329
+
<screen>
330
+
<![CDATA[
331
+
PHP Parse error: Invalid indentation - tabs and spaces cannot be mixed in example.php line 8
332
+
]]>
333
+
</screen>
334
+
</example>
335
+

336
+
<simpara>
337
+
The closing identifier for the body string is not required to be
338
+
followed by a semicolon or newline. For example, the following code
339
+
is allowed as of PHP 7.3.0:
340
+
</simpara>
341
+

342
+
<example>
343
+
<title>Continuing an expression after a closing identifier</title>
344
+
<programlisting role="php">
345
+
<![CDATA[
346
+
<?php
347
+
$values = [<<<END
348
+
a
349
+
b
350
+
c
351
+
END, 'd e f'];
352
+
var_dump($values);
353
+
]]>
354
+
</programlisting>
355
+
&example.outputs.73;
356
+
<screen>
357
+
<![CDATA[
358
+
array(2) {
359
+
[0] =>
360
+
string(11) "a
361
+
b
362
+
c"
363
+
[1] =>
364
+
string(5) "d e f"
365
+
}
366
+
]]>
367
+
</screen>
368
+
</example>
369
+

224
370
<warning>
225
371
<simpara>
226
-
It is very important to note that the line with the closing identifier must
227
-
contain no other characters, except a semicolon (<literal>;</literal>).
372
+
If the closing identifier is found at the start of a line, then
373
+
regardless of whether it is part of another word, it may be treated
374
+
as the closing identifier and cause a <classname>ParseError</classname>.
375
+
</simpara>
376
+

377
+
<example>
378
+
<title>Closing identifier in body of the string tends to cause ParseError</title>
379
+
<programlisting role="php">
380
+
<![CDATA[
381
+
<?php
382
+
$values = [<<<END
383
+
a
384
+
b
385
+
END ING
386
+
END, 'd e f'];
387
+
]]>
388
+
</programlisting>
389
+
&example.outputs.73;
390
+
<screen>
391
+
<![CDATA[
392
+
PHP Parse error: syntax error, unexpected identifier "ING", expecting "]" in example.php on line 6
393
+
]]>
394
+
</screen>
395
+
</example>
396
+

397
+
<simpara>
398
+
To avoid this problem, it is safe to follow the simple rule:
399
+
<emphasis>do not choose a closing identifier that appears in the body
400
+
of the text</emphasis>.
401
+
</simpara>
402
+

403
+
</warning>
404
+

405
+
<warning>
406
+
<simpara>
407
+
Prior to PHP 7.3.0, the line with the
408
+
closing identifier must contain no other characters, except a semicolon
409
+
(<literal>;</literal>).
228
410
That means especially that the identifier
229
411
<emphasis>may not be indented</emphasis>, and there may not be any spaces
230
412
or tabs before or after the semicolon. It's also important to realize that
231
413
the first character before the closing identifier must be a newline as
232
414
defined by the local operating system. This is <literal>\n</literal> on
233
-
UNIX systems, including Mac OS X. The closing delimiter must also be
415
+
UNIX systems, including macOS. The closing delimiter must also be
234
416
followed by a newline.
235
417
</simpara>
236
418

...
...
@@ -241,14 +423,10 @@ echo 'Variables do not $expand $either';
241
423
current file, a parse error will result at the last line.
242
424
</simpara>
243
425

244
-
<para>
245
-
Heredocs can not be used for initializing class properties. Since PHP 5.3,
246
-
this limitation is valid only for heredocs containing variables.
247
-
</para>
248
-
249
426
<example>
250
-
<title>Invalid example</title>
427
+
<title>Invalid example, prior to PHP 7.3.0</title>
251
428
<programlisting role="php">
429
+
<!-- This is an INVALID example -->
252
430
<![CDATA[
253
431
<?php
254
432
class foo {
...
...
@@ -256,10 +434,31 @@ class foo {
256
434
bar
257
435
EOT;
258
436
}
437
+
// Identifier must not be indented
438
+
?>
439
+
]]>
440
+
</programlisting>
441
+
</example>
442
+
<example>
443
+
<title>Valid example, even prior to PHP 7.3.0</title>
444
+
<programlisting role="php">
445
+
<!-- This is a VALID example -->
446
+
<![CDATA[
447
+
<?php
448
+
class foo {
449
+
public $bar = <<<EOT
450
+
bar
451
+
EOT;
452
+
}
259
453
?>
260
454
]]>
261
455
</programlisting>
262
456
</example>
457
+

458
+
<para>
459
+
Heredocs containing variables can not be used for initializing class properties.
460
+
</para>
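<simpara>
One way to work around this limitation is to assign the property in the
constructor, where a heredoc containing variables is an ordinary
expression. The sketch below assumes a hypothetical class named
<classname>Greeting</classname>, which is used only for illustration.
</simpara>
<example>
<title>Illustrative sketch: assigning an interpolated heredoc in a constructor</title>
<programlisting role="php">
<![CDATA[
<?php
// Hypothetical class, used only for illustration.
class Greeting
{
    public $text;

    public function __construct($name)
    {
        // Interpolation is allowed here because this is not a property initializer.
        $this->text = <<<EOT
Hello, $name!
EOT;
    }
}

$greeting = new Greeting('World');
echo $greeting->text, PHP_EOL;  // Hello, World!
?>
]]>
</programlisting>
</example>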
461
+

263
462
</warning>
264
463

265
464
<para>
...
...
@@ -287,7 +486,7 @@ class foo
287
486
var $foo;
288
487
var $bar;
289
488

290
-
function foo()
489
+
function __construct()
291
490
{
292
491
$this->foo = 'Foo';
293
492
$this->bar = array('Bar1', 'Bar2', 'Bar3');
...
...
@@ -334,7 +533,7 @@ EOD
334
533
</example>
335
534

336
535
<para>
337
-
As of PHP 5.3.0, it's possible to initialize static variables and class
536
+
It's possible to initialize static variables and class
338
537
properties/constants using the Heredoc syntax:
339
538
</para>
340
539

...
...
@@ -368,7 +567,7 @@ FOOBAR;
368
567
</example>
369
568

370
569
<para>
371
-
Starting with PHP 5.3.0, the opening Heredoc identifier may optionally be
570
+
The opening Heredoc identifier may optionally be
372
571
enclosed in double quotes:
373
572
</para>
374
573

...
...
@@ -413,19 +612,34 @@ FOOBAR;
413
612
<programlisting role="php">
414
613
<![CDATA[
415
614
<?php
416
-
$str = <<<'EOD'
417
-
Example of string
418
-
spanning multiple lines
419
-
using nowdoc syntax.
615
+
echo <<<'EOD'
616
+
Example of string spanning multiple lines
617
+
using nowdoc syntax. Backslashes are always treated literally,
618
+
e.g. \\ and \'.
420
619
EOD;
620
+
]]>
621
+
</programlisting>
622
+
&example.outputs;
623
+
<screen>
624
+
<![CDATA[
625
+
Example of string spanning multiple lines
626
+
using nowdoc syntax. Backslashes are always treated literally,
627
+
e.g. \\ and \'.
628
+
]]>
629
+
</screen>
630
+
</example>
421
631

422
-
/* More complex example, with variables. */
632
+
<example>
633
+
<title>Nowdoc string quoting example with variables</title>
634
+
<programlisting role="php">
635
+
<![CDATA[
636
+
<?php
423
637
class foo
424
638
{
425
639
public $foo;
426
640
public $bar;
427
641

428
-
function foo()
642
+
function __construct()
429
643
{
430
644
$this->foo = 'Foo';
431
645
$this->bar = array('Bar1', 'Bar2', 'Bar3');
...
...
@@ -467,12 +681,6 @@ EOT;
467
681
</programlisting>
468
682
</example>
469
683

470
-
<note>
471
-
<para>
472
-
Nowdoc support was added in PHP 5.3.0.
473
-
</para>
474
-
</note>
475
-

476
684
</sect3>
477
685

478
686
<sect3 xml:id="language.types.string.parsing">
...
...
@@ -513,11 +721,14 @@ EOT;
513
721
<?php
514
722
$juice = "apple";
515
723

516
-
echo "He drank some $juice juice.".PHP_EOL;
517
-
// Invalid. "s" is a valid character for a variable name, but the variable is $juice.
518
-
echo "He drank some juice made of $juices.";
519
-
// Valid. Explicitly specify the end of the variable name by enclosing it in braces:
520
-
echo "He drank some juice made of ${juice}s."
724
+
echo "He drank some $juice juice." . PHP_EOL;
725
+

726
+
// Unintended. "s" is a valid character for a variable name, so this refers to $juices, not $juice.
727
+
echo "He drank some juice made of $juices." . PHP_EOL;
728
+

729
+
// Explicitly specify the end of the variable name by enclosing the reference in braces.
730
+
echo "He drank some juice made of {$juice}s.";
731
+

521
732
?>
522
733
]]>
523
734
</programlisting>
...
...
@@ -580,6 +791,31 @@ Robert Paulsen greeted the two .
580
791
</example>
581
792

582
793
<simpara>
794
+
As of PHP 7.1.0, <emphasis>negative</emphasis> numeric indices are also
795
+
supported.
796
+
</simpara>
797
+

798
+
<example><title>Negative numeric indices</title>
799
+
<programlisting role="php">
800
+
<![CDATA[
801
+
<?php
802
+
$string = 'string';
803
+
echo "The character at index -2 is $string[-2].", PHP_EOL;
804
+
$string[-3] = 'o';
805
+
echo "Changing the character at index -3 to o gives $string.", PHP_EOL;
806
+
?>
807
+
]]>
808
+
</programlisting>
809
+
&example.outputs;
810
+
<screen>
811
+
<![CDATA[
812
+
The character at index -2 is n.
813
+
Changing the character at index -3 to o gives strong.
814
+
]]>
815
+
</screen>
816
+
</example>
817
+

818
+
<simpara>
583
819
For anything more complex, you should use the complex syntax.
584
820
</simpara>
585
821
</sect4>
...
...
@@ -595,8 +831,8 @@ Robert Paulsen greeted the two .
595
831
<simpara>
596
832
Any scalar variable, array element or object property with a
597
833
<type>string</type> representation can be included via this syntax.
598
-
Simply write the expression the same way as it would appear outside the
599
-
<type>string</type>, and then wrap it in <literal>{</literal> and
834
+
The expression is written the same way as it would appear outside the
835
+
<type>string</type>, and then wrapped in <literal>{</literal> and
600
836
<literal>}</literal>. Since <literal>{</literal> can not be escaped, this
601
837
syntax will only be recognised when the <literal>$</literal> immediately
602
838
follows the <literal>{</literal>. Use <literal>{\$</literal> to get a
...
...
@@ -630,9 +866,9 @@ echo "This works: {$arr['key']}";
630
866
echo "This works: {$arr[4][3]}";
631
867

632
868
// This is wrong for the same reason as $foo[bar] is wrong outside a string.
633
-
// In other words, it will still work, but only because PHP first looks for a
634
-
// constant named foo; an error of level E_NOTICE (undefined constant) will be
635
-
// thrown.
869
+
// PHP first looks for a constant named foo, and throws an error if not found.
870
+
// If the constant is found, its value (and not 'foo' itself) would be used
871
+
// for the array index.
636
872
echo "This is wrong: {$arr[foo][3]}";
637
873

638
874
// Works. When using multi-dimensional arrays, always use braces around arrays
...
...
@@ -652,6 +888,11 @@ echo "This is the value of the var named by the return value of \$object->getNam
652
888

653
889
// Won't work, outputs: This is the return value of getName(): {getName()}
654
890
echo "This is the return value of getName(): {getName()}";
891
+

892
+
// Won't work, outputs: C:\folder\{fantastic}.txt
893
+
echo "C:\folder\{$great}.txt";
894
+
// Works, outputs: C:\folder\fantastic.txt
895
+
echo "C:\\folder\\{$great}.txt";
655
896
?>
656
897
]]>
657
898
<!-- maybe it's better to leave this out??
...
...
@@ -695,9 +936,9 @@ I am bar.
695
936

696
937
<note>
697
938
<para>
698
-
Functions, method calls, static class variables, and class
699
-
constants inside <literal>{$}</literal> work since PHP
700
-
5. However, the value accessed will be interpreted as the name
939
+
The value accessed from functions, method calls, static class variables,
940
+
and class constants inside
941
+
<literal>{$}</literal> will be interpreted as the name
701
942
of a variable in the scope in which the string is defined. Using
702
943
single curly braces (<literal>{}</literal>) will not work for
703
944
accessing the return values of functions or methods or the
...
...
@@ -748,8 +989,19 @@ echo "I'd like an {${beers::$ale}}\n";
748
989

749
990
<note>
750
991
<simpara>
751
-
<type>String</type>s may also be accessed using braces, as in
992
+
As of PHP 7.1.0, negative string offsets are also supported. These specify
993
+
the offset from the end of the string.
994
+
Formerly, negative offsets emitted <constant>E_NOTICE</constant> for reading
995
+
(yielding an empty string) and <constant>E_WARNING</constant> for writing
996
+
(leaving the string untouched).
997
+
</simpara>
998
+
</note>
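<simpara>
A minimal sketch of reading and writing with a negative offset follows. It
assumes PHP 7.1.0 or later; the variable names are arbitrary.
</simpara>
<example>
<title>Illustrative sketch: negative string offsets</title>
<programlisting role="php">
<![CDATA[
<?php
$str = 'abcdef';

$last = $str[-1];  // "f": -1 addresses the last byte
$str[-2] = 'X';    // overwrites the second-to-last byte

var_dump($last);   // string(1) "f"
var_dump($str);    // string(6) "abcdXf"
?>
]]>
</programlisting>
</example>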
999
+

1000
+
<note>
1001
+
<simpara>
1002
+
Prior to PHP 8.0.0, <type>string</type>s could also be accessed using braces, as in
752
1003
<varname>$str{42}</varname>, for the same purpose.
1004
+
This curly brace syntax was deprecated as of PHP 7.4.0 and is no longer supported as of PHP 8.0.0.
753
1005
</simpara>
754
1006
</note>
755
1007

...
...
@@ -757,10 +1009,10 @@ echo "I'd like an {${beers::$ale}}\n";
757
1009
<simpara>
758
1010
Writing to an out of range offset pads the string with spaces.
759
1011
Non-integer types are converted to integer.
760
-
Illegal offset type emits <constant>E_NOTICE</constant>.
761
-
Negative offset emits <constant>E_NOTICE</constant> in write but reads empty string.
1012
+
Illegal offset type emits <constant>E_WARNING</constant>.
762
1013
Only the first character of an assigned string is used.
763
-
Assigning empty string assigns NULL byte.
1014
+
As of PHP 7.1.0, assigning an empty string throws a fatal error. Formerly,
1015
+
it assigned a NULL byte.
764
1016
</simpara>
765
1017
</warning>
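<simpara>
The padding behaviour described in the warning can be observed with a short
script. This is only a sketch with an arbitrary variable name.
</simpara>
<example>
<title>Illustrative sketch: writing past the end of a string</title>
<programlisting role="php">
<![CDATA[
<?php
$str = 'abc';

// Offset 5 is past the end, so offsets 3 and 4 are filled with spaces.
$str[5] = 'x';

var_dump($str);  // string(6) "abc  x"
?>
]]>
</programlisting>
</example>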
766
1018

...
...
@@ -773,6 +1025,13 @@ echo "I'd like an {${beers::$ale}}\n";
773
1025
</simpara>
774
1026
</warning>
775
1027

1028
+
<note>
1029
+
<simpara>
1030
+
As of PHP 7.1.0, applying the empty index operator on an empty string throws a fatal
1031
+
error. Formerly, the empty string was silently converted to an array.
1032
+
</simpara>
1033
+
</note>
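<simpara>
The following sketch shows how that error can be caught. It assumes
PHP 7.1.0 or later, where the failure is raised as an
<classname>Error</classname> exception.
</simpara>
<example>
<title>Illustrative sketch: empty index operator on an empty string</title>
<programlisting role="php">
<![CDATA[
<?php
$str = '';
$message = null;

try {
    // The empty index operator is not supported for strings.
    $str[] = 'a';
} catch (Error $e) {
    $message = $e->getMessage();
}

var_dump($message !== null);  // bool(true) on PHP 7.1.0 and later
?>
]]>
</programlisting>
</example>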
1034
+

776
1035
<example>
777
1036
<title>Some string examples</title>
778
1037
<programlisting role="php">
...
...
@@ -799,12 +1058,13 @@ $str[strlen($str)-1] = 'e';
799
1058
</example>
800
1059

801
1060
<para>
802
-
As of PHP 5.4 string offsets have to either be integers or integer-like strings, otherwise a warning
803
-
will be thrown. Previously an offset like <literal>"foo"</literal> was silently cast to <literal>0</literal>.
1061
+
String offsets have to either be integers or integer-like strings,
1062
+
otherwise a warning will be thrown.
804
1063
</para>
805
1064

806
1065
<example>
807
-
<title>Differences between PHP 5.3 and PHP 5.4</title>
1066
+
<!-- TODO Update for PHP 8.0 -->
1067
+
<title>Example of Illegal String Offsets</title>
808
1068
<programlisting role="php">
809
1069
<![CDATA[
810
1070
<?php
...
...
@@ -824,20 +1084,7 @@ var_dump(isset($str['1x']));
824
1084
?>
825
1085
]]>
826
1086
</programlisting>
827
-
&example.outputs.53;
828
-
<screen>
829
-
<![CDATA[
830
-
string(1) "b"
831
-
bool(true)
832
-
string(1) "b"
833
-
bool(true)
834
-
string(1) "a"
835
-
bool(true)
836
-
string(1) "b"
837
-
bool(true)
838
-
]]>
839
-
</screen>
840
-
&example.outputs.54;
1087
+
&example.outputs;
841
1088
<screen>
842
1089
<![CDATA[
843
1090
string(1) "b"
...
...
@@ -866,10 +1113,18 @@ bool(false)
866
1113

867
1114
<note>
868
1115
<para>
869
-
PHP 5.5 added support for accessing characters within string literals
1116
+
Characters within string literals can be accessed
870
1117
using <literal>[]</literal> or <literal>{}</literal>.
871
1118
</para>
872
1119
</note>
1120
+

1121
+
<note>
1122
+
<para>
1123
+
Accessing characters within string literals using the
1124
+
<literal>{}</literal> syntax was deprecated in PHP 7.4
1125
+
and removed in PHP 8.0.
1126
+
</para>
1127
+
</note>
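<simpara>
A minimal sketch of accessing characters within string literals using
<literal>[]</literal>, the form that remains supported, is shown below.
</simpara>
<example>
<title>Illustrative sketch: offsets on string literals</title>
<programlisting role="php">
<![CDATA[
<?php
// Square brackets work directly on a string literal.
$first = 'PHP'[0];      // "P"
$last  = "string"[-1];  // "g" (negative offsets require PHP 7.1.0 or later)

var_dump($first, $last);
?>
]]>
</programlisting>
</example>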
873
1128
</sect3>
874
1129
</sect2><!-- end syntax -->
875
1130

...
...
@@ -889,16 +1144,15 @@ bool(false)
889
1144

890
1145
<simpara>
891
1146
See the <link linkend="ref.strings">string functions section</link> for
892
-
general functions, and the <link linkend="ref.regex">regular expression
893
-
functions</link> or the <link linkend="ref.pcre">Perl-compatible regular
1147
+
general functions, and the <link linkend="ref.pcre">Perl-compatible regular
894
1148
expression functions</link> for advanced find &amp; replace functionality.
895
1149
</simpara>
896
1150

897
1151
<simpara>
898
1152
There are also <link linkend="ref.url">functions for URL strings</link>, and
899
1153
functions to encrypt/decrypt strings
900
-
(<link linkend="ref.mcrypt">mcrypt</link> and
901
-
<link linkend="ref.mhash">mhash</link>).
1154
+
(<link linkend="ref.sodium">Sodium</link> and
1155
+
<link linkend="ref.hash">Hash</link>).
902
1156
</simpara>
903
1157

904
1158
<simpara>
...
...
@@ -923,14 +1177,14 @@ bool(false)
923
1177
</para>
924
1178

925
1179
<para>
926
-
A <type>boolean</type> &true; value is converted to the <type>string</type>
927
-
<literal>"1"</literal>. <type>Boolean</type> &false; is converted to
1180
+
A <type>bool</type> &true; value is converted to the <type>string</type>
1181
+
<literal>"1"</literal>. <type>bool</type> &false; is converted to
928
1182
<literal>""</literal> (the empty string). This allows conversion back and
929
-
forth between <type>boolean</type> and <type>string</type> values.
1183
+
forth between <type>bool</type> and <type>string</type> values.
930
1184
</para>
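<simpara>
The round trip between <type>bool</type> and <type>string</type> can be
sketched as follows; the variable names are arbitrary.
</simpara>
<example>
<title>Illustrative sketch: converting bool to string and back</title>
<programlisting role="php">
<![CDATA[
<?php
$fromTrue  = (string) true;   // "1"
$fromFalse = (string) false;  // ""

// Converting back: "1" becomes true, the empty string becomes false.
var_dump((bool) $fromTrue);   // bool(true)
var_dump((bool) $fromFalse);  // bool(false)
?>
]]>
</programlisting>
</example>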
931
1185

932
1186
<para>
933
-
An <type>integer</type> or <type>float</type> is converted to a
1187
+
An <type>int</type> or <type>float</type> is converted to a
934
1188
<type>string</type> representing the number textually (including the
935
1189
exponent part for <type>float</type>s). Floating point numbers can be
936
1190
converted using exponential notation (<literal>4.1E+6</literal>).
...
...
@@ -938,7 +1192,9 @@ bool(false)
938
1192

939
1193
<note>
940
1194
<para>
941
-
The decimal point character is defined in the script's locale (category
1195
+
As of PHP 8.0.0, the decimal point character is always
1196
+
a period ("<literal>.</literal>"). Prior to PHP 8.0.0,
1197
+
the decimal point character was defined in the script's locale (category
942
1198
LC_NUMERIC). See the <function>setlocale</function> function.
943
1199
</para>
944
1200
</note>
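<simpara>
A short sketch of numeric conversion follows. The values are arbitrary, and
the exact text produced for very large floats depends on the
<literal>precision</literal> setting, so that result is shown without an
expected string.
</simpara>
<example>
<title>Illustrative sketch: converting numbers to strings</title>
<programlisting role="php">
<![CDATA[
<?php
$int   = (string) 42;    // "42"
$float = (string) 1.5;   // "1.5"
$large = (string) 1e25;  // exponential notation is used for large magnitudes

var_dump($int, $float, $large);
?>
]]>
</programlisting>
</example>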
...
...
@@ -953,12 +1209,8 @@ bool(false)
953
1209
</para>
954
1210

955
1211
<para>
956
-
<type>Object</type>s in PHP 4 are always converted to the <type>string</type>
957
-
<literal>"Object"</literal>. To print the values of object properties for
958
-
debugging reasons, read the paragraphs below. To get an object's class name,
959
-
use the <function>get_class</function> function. As of PHP 5, the
960
-
<link linkend="language.oop5.magic">__toString</link> method is used when
961
-
applicable.
1212
+
In order to convert <type>object</type>s to <type>string</type>, the magic
1213
+
method <link linkend="language.oop5.magic">__toString</link> must be used.
962
1214
</para>
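<simpara>
The sketch below shows one way to implement <literal>__toString</literal>.
The class <classname>Temperature</classname> is hypothetical and exists
only for illustration.
</simpara>
<example>
<title>Illustrative sketch: converting an object to a string</title>
<programlisting role="php">
<![CDATA[
<?php
// Hypothetical class, used only for illustration.
class Temperature
{
    private $degrees;

    public function __construct($degrees)
    {
        $this->degrees = $degrees;
    }

    // Called whenever the object is used where a string is expected.
    public function __toString()
    {
        return $this->degrees . ' degrees';
    }
}

$reading  = new Temperature(21);
$asString = (string) $reading;  // "21 degrees"

echo "It is $reading outside.", PHP_EOL;  // It is 21 degrees outside.
?>
]]>
</programlisting>
</example>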
963
1215

964
1216
<para>
...
...
@@ -987,80 +1239,7 @@ bool(false)
987
1239
<para>
988
1240
Most PHP values can also be converted to <type>string</type>s for permanent
989
1241
storage. This method is called serialization, and is performed by the
990
-
<function>serialize</function> function. If the PHP engine was built with
991
-
<link linkend="ref.wddx">WDDX</link> support, PHP values can also be
992
-
serialized as well-formed XML text.
993
-
</para>
994
-

995
-
</sect2>
996
-

997
-
<sect2 xml:id="language.types.string.conversion">
998
-
<title>String conversion to numbers</title>
999
-

1000
-
<simpara>
1001
-
When a <type>string</type> is evaluated in a numeric context, the resulting
1002
-
value and type are determined as follows.
1003
-
</simpara>
1004
-

1005
-
<simpara>
1006
-
If the <type>string</type> does not contain any of the characters '.', 'e',
1007
-
or 'E' and the numeric value fits into integer type limits (as defined by
1008
-
<constant>PHP_INT_MAX</constant>), the <type>string</type> will be evaluated
1009
-
as an <type>integer</type>. In all other cases it will be evaluated as a
1010
-
<type>float</type>.
1011
-
</simpara>
1012
-

1013
-
<para>
1014
-
The value is given by the initial portion of the <type>string</type>. If the
1015
-
<type>string</type> starts with valid numeric data, this will be the value
1016
-
used. Otherwise, the value will be 0 (zero). Valid numeric data is an
1017
-
optional sign, followed by one or more digits (optionally containing a
1018
-
decimal point), followed by an optional exponent. The exponent is an 'e' or
1019
-
'E' followed by one or more digits.
1020
-
</para>
1021
-

1022
-
<informalexample>
1023
-
<programlisting role="php">
1024
-
<![CDATA[
1025
-
<?php
1026
-
$foo = 1 + "10.5"; // $foo is float (11.5)
1027
-
$foo = 1 + "-1.3e3"; // $foo is float (-1299)
1028
-
$foo = 1 + "bob-1.3e3"; // $foo is integer (1)
1029
-
$foo = 1 + "bob3"; // $foo is integer (1)
1030
-
$foo = 1 + "10 Small Pigs"; // $foo is integer (11)
1031
-
$foo = 4 + "10.2 Little Piggies"; // $foo is float (14.2)
1032
-
$foo = "10.0 pigs " + 1; // $foo is float (11)
1033
-
$foo = "10.0 pigs " + 1.0; // $foo is float (11)
1034
-
?>
1035
-
]]>
1036
-
</programlisting>
1037
-
</informalexample>
1038
-

1039
-
<simpara>
1040
-
For more information on this conversion, see the Unix manual page for
1041
-
strtod(3).
1042
-
</simpara>
1043
-

1044
-
<para>
1045
-
To test any of the examples in this section, cut and paste the examples and
1046
-
insert the following line to see what's going on:
1047
-
</para>
1048
-

1049
-
<informalexample>
1050
-
<programlisting role="php">
1051
-
<![CDATA[
1052
-
<?php
1053
-
echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1054
-
?>
1055
-
]]>
1056
-
</programlisting>
1057
-
</informalexample>
1058
-

1059
-
<para>
1060
-
Do not expect to get the code of one character by converting it to integer,
1061
-
as is done in C. Use the <function>ord</function> and
1062
-
<function>chr</function> functions to convert between ASCII codes and
1063
-
characters.
1242
+
<function>serialize</function> function.
1064
1243
</para>
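<simpara>
A minimal round trip through <function>serialize</function> and
<function>unserialize</function> might look like the following sketch; the
array contents are arbitrary.
</simpara>
<example>
<title>Illustrative sketch: serializing a value to a string</title>
<programlisting role="php">
<![CDATA[
<?php
$original = ['name' => 'apple', 'count' => 3];

$stored   = serialize($original);  // a string suitable for storage
$restored = unserialize($stored);  // rebuilds an equal value

var_dump(is_string($stored));       // bool(true)
var_dump($restored === $original);  // bool(true)
?>
]]>
</programlisting>
</example>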
1065
1244

1066
1245
</sect2>
...
...
@@ -1095,7 +1274,7 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1095
1274
it is encoded in the script file. Thus, if the script is written in
1096
1275
ISO-8859-1, the string will be encoded in ISO-8859-1 and so on. However,
1097
1276
this does not apply if Zend Multibyte is enabled; in that case, the script
1098
-
may be written in an arbitrary encoding (which is explicity declared or is
1277
+
may be written in an arbitrary encoding (which is explicitly declared or is
1099
1278
detected) and then converted to a certain internal encoding, which is then
1100
1279
the encoding that will be used for the string literals.
1101
1280
Note that there are some constraints on the encoding of the script (or on the
...
...
@@ -1133,15 +1312,7 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1133
1312
<listitem>
1134
1313
<simpara>
1135
1314
Others use the current locale (see <function>setlocale</function>), but
1136
-
operate byte-by-byte. This is the case of <function>strcasecmp</function>,
1137
-
<function>strtoupper</function> and <function>ucfirst</function>.
1138
-
This means they can be used only with single-byte encodings, as long as
1139
-
the encoding is matched by the locale. For instance
1140
-
<literal>strtoupper("á")</literal> may return <literal>"Á"</literal> if the
1141
-
locale is correctly set and <literal>á</literal> is encoded with a single
1142
-
byte. If it is encoded in UTF-8, the correct result will not be returned
1143
-
and the resulting string may or may not be returned corrupted, depending
1144
-
on the current locale.
1315
+
operate byte-by-byte.
1145
1316
</simpara>
1146
1317
</listitem>
1147
1318
<listitem>
...
...
@@ -1151,9 +1322,6 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1151
1322
<link linkend="book.intl">intl</link> extension and in the
1152
1323
<link linkend="book.pcre">PCRE</link> extension
1153
1324
(in the last case, only when the <literal>u</literal> modifier is used).
1154
-
Although this is due to their special purpose, the function
1155
-
<function>utf8_decode</function> assumes a UTF-8 encoding and the
1156
-
function <function>utf8_encode</function> assumes an ISO-8859-1 encoding.
1157
1325
</simpara>
1158
1326
</listitem>
1159
1327
</itemizedlist>
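<simpara>
The difference between byte-oriented functions and functions that assume
UTF-8 can be sketched as follows. The example uses
<function>preg_match_all</function> with the <literal>u</literal> modifier
as the UTF-8 aware function; the sample text is arbitrary.
</simpara>
<example>
<title>Illustrative sketch: counting bytes versus UTF-8 characters</title>
<programlisting role="php">
<![CDATA[
<?php
// "na\u{ef}ve" is "naïve": five characters, six bytes in UTF-8.
$text = "na\u{ef}ve";

$bytes      = strlen($text);                            // byte-oriented: 6
$characters = preg_match_all('/./u', $text, $matches);  // UTF-8 aware: 5

var_dump($bytes, $characters);
?>
]]>
</programlisting>
</example>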
1160
1328