language/types/string.xml
affa37e16f562d9297e83b2e21ec416aadc8b72d
...
...
@@ -4,7 +4,7 @@
4
4
<title>Strings</title>
5
5

6
6
<para>
7
-
A <type>string</type> is series of characters, where a character is
7
+
A <type>string</type> is a series of characters, where a character is
8
8
the same as a byte. This means that PHP only supports a 256-character set,
9
9
and hence does not offer native Unicode support. See
10
10
<link linkend="language.types.string.details">details of the string
...
...
@@ -13,7 +13,8 @@
13
13

14
14
<note>
15
15
<simpara>
16
-
<type>string</type> can be as large as up to 2GB (2147483647 bytes maximum)
16
+
On 32-bit builds, a <type>string</type> can be as large as 2GB
17
+
(2147483647 bytes maximum).
17
18
</simpara>
18
19
</note>
19
20

...
...
@@ -43,7 +44,6 @@
43
44
<listitem>
44
45
<simpara>
45
46
<link linkend="language.types.string.syntax.nowdoc">nowdoc syntax</link>
46
-
(since PHP 5.3.0)
47
47
</simpara>
48
48
</listitem>
49
49
</itemizedlist>
...
...
@@ -112,7 +112,7 @@ echo 'Variables do not $expand $either';
112
112

113
113
<para>
114
114
If the <type>string</type> is enclosed in double-quotes ("), PHP will
115
-
interpret more escape sequences for special characters:
115
+
interpret the following escape sequences for special characters:
116
116
</para>
117
117

118
118
<table>
...
...
@@ -141,15 +141,15 @@ echo 'Variables do not $expand $either';
141
141
</row>
142
142
<row>
143
143
<entry><literal>\v</literal></entry>
144
-
<entry>vertical tab (VT or 0x0B (11) in ASCII) (since PHP 5.2.5)</entry>
144
+
<entry>vertical tab (VT or 0x0B (11) in ASCII)</entry>
145
145
</row>
146
146
<row>
147
147
<entry><literal>\e</literal></entry>
148
-
<entry>escape (ESC or 0x1B (27) in ASCII) (since PHP 5.4.4)</entry>
148
+
<entry>escape (ESC or 0x1B (27) in ASCII)</entry>
149
149
</row>
150
150
<row>
151
151
<entry><literal>\f</literal></entry>
152
-
<entry>form feed (FF or 0x0C (12) in ASCII) (since PHP 5.2.5)</entry>
152
+
<entry>form feed (FF or 0x0C (12) in ASCII)</entry>
153
153
</row>
154
154
<row>
155
155
<entry><literal>\\</literal></entry>
...
...
@@ -166,15 +166,25 @@ echo 'Variables do not $expand $either';
166
166
<row>
167
167
<entry><literal>\[0-7]{1,3}</literal></entry>
168
168
<entry>
169
-
the sequence of characters matching the regular expression is a
170
-
character in octal notation
169
+
Octal: the sequence of characters matching the regular expression <literal>[0-7]{1,3}</literal>
170
+
is a character in octal notation (e.g. <literal>"\101" === "A"</literal>),
171
+
which silently overflows to fit in a byte (e.g. <literal>"\400" === "\000"</literal>)
171
172
</entry>
172
173
</row>
173
174
<row>
174
175
<entry><literal>\x[0-9A-Fa-f]{1,2}</literal></entry>
175
176
<entry>
176
-
the sequence of characters matching the regular expression is a
177
-
character in hexadecimal notation
177
+
Hexadecimal: the sequence of characters matching the regular expression
178
+
<literal>[0-9A-Fa-f]{1,2}</literal> is a character in hexadecimal notation
179
+
(e.g. <literal>"\x41" === "A"</literal>)
180
+
</entry>
181
+
</row>
182
+
<row>
183
+
<entry><literal>\u{[0-9A-Fa-f]+}</literal></entry>
184
+
<entry>
185
+
Unicode: the sequence of characters matching the regular expression <literal>[0-9A-Fa-f]+</literal>
186
+
is a Unicode codepoint, which will be output to the string as that codepoint's UTF-8 representation.
187
+
The braces are required in the sequence. E.g. <literal>"\u{41}" === "A"</literal>
178
188
</entry>
179
189
</row>
180
190
</tbody>
...
...
@@ -183,8 +193,7 @@ echo 'Variables do not $expand $either';
183
193

184
194
<para>
185
195
As in single quoted <type>string</type>s, escaping any other character will
186
-
result in the backslash being printed too. Before PHP 5.1.1, the backslash
187
-
in <literal>\{$var}</literal> had not been printed.
196
+
result in the backslash being printed too.
188
197
</para>
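The equivalences listed in the escape-sequence table can be checked directly. The following is a minimal sketch, assuming PHP 7.0 or later (needed for <literal>\u{}</literal>); the variable names are illustrative only.

```php
<?php
// Each value restates an equivalence from the escape-sequence table.
$octal   = "\101";    // octal 101 is decimal 65, the byte for "A"
$hex     = "\x41";    // hexadecimal 41 is also decimal 65
$unicode = "\u{41}";  // code point U+0041, written to the string as UTF-8

var_dump($octal === "A");    // bool(true)
var_dump($hex === "A");      // bool(true)
var_dump($unicode === "A");  // bool(true)

// A code point above U+007F occupies more than one byte in UTF-8.
$eAcute = "\u{e9}";
var_dump(strlen($eAcute));   // int(2)
var_dump(bin2hex($eAcute));  // string(4) "c3a9"

// Single-quoted strings leave the same sequence untouched:
// backslash, "x", "4" and "1" are four separate bytes.
$literal = '\x41';
var_dump(strlen($literal));  // int(4)
```

Because a character is the same as a byte here, <function>strlen</function> reports two for the single code point U+00E9.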
189
198

190
199
<para>
...
...
@@ -206,22 +215,204 @@ echo 'Variables do not $expand $either';
206
215
</simpara>
207
216

208
217
<simpara>
209
-
The closing identifier <emphasis>must</emphasis> begin in the first column
210
-
of the line. Also, the identifier must follow the same naming rules as any
218
+
The closing identifier may be indented by space or tab, in which case
219
+
the indentation will be stripped from all lines in the doc string.
220
+
Prior to PHP 7.3.0, the closing identifier <emphasis>must</emphasis>
221
+
begin in the first column of the line.
222
+
</simpara>
223
+

224
+
<simpara>
225
+
Also, the closing identifier must follow the same naming rules as any
211
226
other label in PHP: it must contain only alphanumeric characters and
212
227
underscores, and must start with a non-digit character or underscore.
213
228
</simpara>
214
229

230
+
<example>
231
+
<title>Basic Heredoc example as of PHP 7.3.0</title>
232
+
<programlisting role="php">
233
+
<![CDATA[
234
+
<?php
235
+
// no indentation
236
+
echo <<<END
237
+
a
238
+
b
239
+
c
240
+
\n
241
+
END;
242
+

243
+
// 4 spaces of indentation
244
+
echo <<<END
245
+
a
246
+
b
247
+
c
248
+
END;
249
+
]]>
250
+
</programlisting>
251
+
&example.outputs.73;
252
+
<screen>
253
+
<![CDATA[
254
+
a
255
+
b
256
+
c
257
+

258
+
a
259
+
b
260
+
c
261
+
]]>
262
+
</screen>
263
+
</example>
264
+

265
+
<simpara>
266
+
If the closing identifier is indented further than any line of the body, then a <classname>ParseError</classname> will be thrown:
267
+
</simpara>
268
+

269
+
<example>
270
+
<title>Closing identifier must not be indented further than any lines of the body</title>
271
+
<programlisting role="php">
272
+
<![CDATA[
273
+
<?php
274
+
echo <<<END
275
+
a
276
+
b
277
+
c
278
+
END;
279
+
]]>
280
+
</programlisting>
281
+
&example.outputs.73;
282
+
<screen>
283
+
<![CDATA[
284
+
PHP Parse error: Invalid body indentation level (expecting an indentation level of at least 3) in example.php on line 4
285
+
]]>
286
+
</screen>
287
+
</example>
288
+

289
+
<simpara>
290
+
If the closing identifier is indented, tabs can be used as well; however,
291
+
tabs and spaces <emphasis>must not</emphasis> be intermixed regarding
292
+
the indentation of the closing identifier and the indentation of the body
293
+
(up to the closing identifier). In any of these cases, a <classname>ParseError</classname> will be thrown.
294
+

295
+
These whitespace constraints have been included because mixing tabs and
296
+
spaces for indentation is harmful to legibility.
297
+
</simpara>
298
+

299
+
<example>
300
+
<title>Different indentation for body (spaces) and closing identifier (tabs)</title>
301
+
<programlisting role="php">
302
+
<![CDATA[
303
+
<?php
304
+
// None of the following code works.
305
+

306
+
// different indentation for body (spaces) and ending marker (tabs)
307
+
{
308
+
echo <<<END
309
+
a
310
+
END;
311
+
}
312
+

313
+
// mixing spaces and tabs in body
314
+
{
315
+
echo <<<END
316
+
a
317
+
END;
318
+
}
319
+

320
+
// mixing spaces and tabs in ending marker
321
+
{
322
+
echo <<<END
323
+
a
324
+
END;
325
+
}
326
+
]]>
327
+
</programlisting>
328
+
&example.outputs.73;
329
+
<screen>
330
+
<![CDATA[
331
+
PHP Parse error: Invalid indentation - tabs and spaces cannot be mixed in example.php line 8
332
+
]]>
333
+
</screen>
334
+
</example>
335
+

336
+
<simpara>
337
+
The closing identifier for the body string is not required to be
338
+
followed by a semicolon or newline. For example, the following code
339
+
is allowed as of PHP 7.3.0:
340
+
</simpara>
341
+

342
+
<example>
343
+
<title>Continuing an expression after a closing identifier</title>
344
+
<programlisting role="php">
345
+
<![CDATA[
346
+
<?php
347
+
$values = [<<<END
348
+
a
349
+
b
350
+
c
351
+
END, 'd e f'];
352
+
var_dump($values);
353
+
]]>
354
+
</programlisting>
355
+
&example.outputs.73;
356
+
<screen>
357
+
<![CDATA[
358
+
array(2) {
359
+
[0] =>
360
+
string(11) "a
361
+
b
362
+
c"
363
+
[1] =>
364
+
string(5) "d e f"
365
+
}
366
+
]]>
367
+
</screen>
368
+
</example>
369
+

215
370
<warning>
216
371
<simpara>
217
-
It is very important to note that the line with the closing identifier must
218
-
contain no other characters, except a semicolon (<literal>;</literal>).
372
+
If the closing identifier is found at the start of a line, then
373
+
regardless of whether it is part of another word, it may be treated
374
+
as the closing identifier and cause a <classname>ParseError</classname>.
375
+
</simpara>
376
+

377
+
<example>
378
+
<title>Closing identifier in body of the string tends to cause ParseError</title>
379
+
<programlisting role="php">
380
+
<![CDATA[
381
+
<?php
382
+
$values = [<<<END
383
+
a
384
+
b
385
+
END ING
386
+
END, 'd e f'];
387
+
]]>
388
+
</programlisting>
389
+
&example.outputs.73;
390
+
<screen>
391
+
<![CDATA[
392
+
PHP Parse error: syntax error, unexpected identifier "ING", expecting "]" in example.php on line 6
393
+
]]>
394
+
</screen>
395
+
</example>
396
+

397
+
<simpara>
398
+
To avoid this problem, it is safe to follow the simple rule:
399
+
<emphasis>do not choose a closing identifier that appears in the body
400
+
of the text</emphasis>.
401
+
</simpara>
402
+

403
+
</warning>
404
+

405
+
<warning>
406
+
<simpara>
407
+
Prior to PHP 7.3.0, it is very important to note that the line with the
408
+
closing identifier must contain no other characters, except a semicolon
409
+
(<literal>;</literal>).
219
410
That means especially that the identifier
220
411
<emphasis>may not be indented</emphasis>, and there may not be any spaces
221
412
or tabs before or after the semicolon. It's also important to realize that
222
413
the first character before the closing identifier must be a newline as
223
414
defined by the local operating system. This is <literal>\n</literal> on
224
-
UNIX systems, including Mac OS X. The closing delimiter must also be
415
+
UNIX systems, including macOS. The closing delimiter must also be
225
416
followed by a newline.
226
417
</simpara>
227
418

...
...
@@ -232,14 +423,10 @@ echo 'Variables do not $expand $either';
232
423
current file, a parse error will result at the last line.
233
424
</simpara>
234
425

235
-
<para>
236
-
Heredocs can not be used for initializing class properties. Since PHP 5.3,
237
-
this limitation is valid only for heredocs containing variables.
238
-
</para>
239
-
240
426
<example>
241
-
<title>Invalid example</title>
427
+
<title>Invalid example, prior to PHP 7.3.0</title>
242
428
<programlisting role="php">
429
+
<!-- This is an INVALID example -->
243
430
<![CDATA[
244
431
<?php
245
432
class foo {
...
...
@@ -247,10 +434,31 @@ class foo {
247
434
bar
248
435
EOT;
249
436
}
437
+
// Identifier must not be indented
250
438
?>
251
439
]]>
252
440
</programlisting>
253
441
</example>
442
+
<example>
443
+
<title>Valid example, even prior to PHP 7.3.0</title>
444
+
<programlisting role="php">
445
+
<!-- This is a VALID example -->
446
+
<![CDATA[
447
+
<?php
448
+
class foo {
449
+
public $bar = <<<EOT
450
+
bar
451
+
EOT;
452
+
}
453
+
?>
454
+
]]>
455
+
</programlisting>
456
+
</example>
457
+

458
+
<para>
459
+
Heredocs containing variables cannot be used for initializing class properties.
460
+
</para>
461
+

254
462
</warning>
255
463

256
464
<para>
...
...
@@ -278,7 +486,7 @@ class foo
278
486
var $foo;
279
487
var $bar;
280
488

281
-
function foo()
489
+
function __construct()
282
490
{
283
491
$this->foo = 'Foo';
284
492
$this->bar = array('Bar1', 'Bar2', 'Bar3');
...
...
@@ -325,7 +533,7 @@ EOD
325
533
</example>
326
534

327
535
<para>
328
-
As of PHP 5.3.0, it's possible to initialize static variables and class
536
+
It's possible to initialize static variables and class
329
537
properties/constants using the Heredoc syntax:
330
538
</para>
331
539

...
...
@@ -359,7 +567,7 @@ FOOBAR;
359
567
</example>
360
568

361
569
<para>
362
-
Starting with PHP 5.3.0, the opening Heredoc identifier may optionally be
570
+
The opening Heredoc identifier may optionally be
363
571
enclosed in double quotes:
364
572
</para>
365
573

...
...
@@ -404,19 +612,34 @@ FOOBAR;
404
612
<programlisting role="php">
405
613
<![CDATA[
406
614
<?php
407
-
$str = <<<'EOD'
408
-
Example of string
409
-
spanning multiple lines
410
-
using nowdoc syntax.
615
+
echo <<<'EOD'
616
+
Example of string spanning multiple lines
617
+
using nowdoc syntax. Backslashes are always treated literally,
618
+
e.g. \\ and \'.
411
619
EOD;
620
+
]]>
621
+
</programlisting>
622
+
&example.outputs;
623
+
<screen>
624
+
<![CDATA[
625
+
Example of string spanning multiple lines
626
+
using nowdoc syntax. Backslashes are always treated literally,
627
+
e.g. \\ and \'.
628
+
]]>
629
+
</screen>
630
+
</example>
412
631

413
-
/* More complex example, with variables. */
632
+
<example>
633
+
<title>Nowdoc string quoting example with variables</title>
634
+
<programlisting role="php">
635
+
<![CDATA[
636
+
<?php
414
637
class foo
415
638
{
416
639
public $foo;
417
640
public $bar;
418
641

419
-
function foo()
642
+
function __construct()
420
643
{
421
644
$this->foo = 'Foo';
422
645
$this->bar = array('Bar1', 'Bar2', 'Bar3');
...
...
@@ -443,13 +666,6 @@ This should not print a capital 'A': \x41]]>
443
666
</screen>
444
667
</example>
445
668

446
-
<note>
447
-
<para>
448
-
Unlike heredocs, nowdocs can be used in any static data context. The
449
-
typical example is initializing class properties or constants:
450
-
</para>
451
-
</note>
452
-
453
669
<example>
454
670
<title>Static data example</title>
455
671
<programlisting role="php">
...
...
@@ -465,12 +681,6 @@ EOT;
465
681
</programlisting>
466
682
</example>
467
683

468
-
<note>
469
-
<para>
470
-
Nowdoc support was added in PHP 5.3.0.
471
-
</para>
472
-
</note>
473
-

474
684
</sect3>
475
685

476
686
<sect3 xml:id="language.types.string.parsing">
...
...
@@ -511,9 +721,14 @@ EOT;
511
721
<?php
512
722
$juice = "apple";
513
723

514
-
echo "He drank some $juice juice.".PHP_EOL;
515
-
// Invalid. "s" is a valid character for a variable name, but the variable is $juice.
516
-
echo "He drank some juice made of $juices.";
724
+
echo "He drank some $juice juice." . PHP_EOL;
725
+

726
+
// Unintended. "s" is a valid character for a variable name, so this refers to $juices, not $juice.
727
+
echo "He drank some juice made of $juices." . PHP_EOL;
728
+

729
+
// Explicitly specify the end of the variable name by enclosing the reference in braces.
730
+
echo "He drank some juice made of {$juice}s.";
731
+

517
732
?>
518
733
]]>
519
734
</programlisting>
...
...
@@ -522,6 +737,7 @@ echo "He drank some juice made of $juices.";
522
737
<![CDATA[
523
738
He drank some apple juice.
524
739
He drank some juice made of .
740
+
He drank some juice made of apples.
525
741
]]>
526
742
</screen>
527
743
</informalexample>
...
...
@@ -575,6 +791,31 @@ Robert Paulsen greeted the two .
575
791
</example>
576
792

577
793
<simpara>
794
+
As of PHP 7.1.0, <emphasis>negative</emphasis> numeric indices are also
795
+
supported.
796
+
</simpara>
797
+

798
+
<example><title>Negative numeric indices</title>
799
+
<programlisting role="php">
800
+
<![CDATA[
801
+
<?php
802
+
$string = 'string';
803
+
echo "The character at index -2 is $string[-2].", PHP_EOL;
804
+
$string[-3] = 'o';
805
+
echo "Changing the character at index -3 to o gives $string.", PHP_EOL;
806
+
?>
807
+
]]>
808
+
</programlisting>
809
+
&example.outputs;
810
+
<screen>
811
+
<![CDATA[
812
+
The character at index -2 is n.
813
+
Changing the character at index -3 to o gives strong.
814
+
]]>
815
+
</screen>
816
+
</example>
817
+

818
+
<simpara>
578
819
For anything more complex, you should use the complex syntax.
579
820
</simpara>
580
821
</sect4>
...
...
@@ -590,8 +831,8 @@ Robert Paulsen greeted the two .
590
831
<simpara>
591
832
Any scalar variable, array element or object property with a
592
833
<type>string</type> representation can be included via this syntax.
593
-
Simply write the expression the same way as it would appear outside the
594
-
<type>string</type>, and then wrap it in <literal>{</literal> and
834
+
The expression is written the same way as it would appear outside the
835
+
<type>string</type>, and then wrapped in <literal>{</literal> and
595
836
<literal>}</literal>. Since <literal>{</literal> can not be escaped, this
596
837
syntax will only be recognised when the <literal>$</literal> immediately
597
838
follows the <literal>{</literal>. Use <literal>{\$</literal> to get a
...
...
@@ -612,7 +853,6 @@ echo "This is { $great}";
612
853

613
854
// Works, outputs: This is fantastic
614
855
echo "This is {$great}";
615
-
echo "This is ${great}";
616
856

617
857
// Works
618
858
echo "This square is {$square->width}00 centimeters broad.";
...
...
@@ -626,9 +866,9 @@ echo "This works: {$arr['key']}";
626
866
echo "This works: {$arr[4][3]}";
627
867

628
868
// This is wrong for the same reason as $foo[bar] is wrong outside a string.
629
-
// In other words, it will still work, but only because PHP first looks for a
630
-
// constant named foo; an error of level E_NOTICE (undefined constant) will be
631
-
// thrown.
869
+
// PHP first looks for a constant named foo, and throws an error if not found.
870
+
// If the constant is found, its value (and not 'foo' itself) would be used
871
+
// for the array index.
632
872
echo "This is wrong: {$arr[foo][3]}";
633
873

634
874
// Works. When using multi-dimensional arrays, always use braces around arrays
...
...
@@ -648,6 +888,11 @@ echo "This is the value of the var named by the return value of \$object->getNam
648
888

649
889
// Won't work, outputs: This is the return value of getName(): {getName()}
650
890
echo "This is the return value of getName(): {getName()}";
891
+

892
+
// Won't work, outputs: C:\folder\{fantastic}.txt
893
+
echo "C:\folder\{$great}.txt";
894
+
// Works, outputs: C:\folder\fantastic.txt
895
+
echo "C:\\folder\\{$great}.txt";
651
896
?>
652
897
]]>
653
898
<!-- maybe it's better to leave this out??
...
...
@@ -676,7 +921,7 @@ $foo = new foo();
676
921
$bar = 'bar';
677
922
$baz = array('foo', 'bar', 'baz', 'quux');
678
923
echo "{$foo->$bar}\n";
679
-
echo "{$foo->$baz[1]}\n";
924
+
echo "{$foo->{$baz[1]}}\n";
680
925
?>
681
926
]]>
682
927
</programlisting>
...
...
@@ -691,9 +936,9 @@ I am bar.
691
936

692
937
<note>
693
938
<para>
694
-
Functions, method calls, static class variables, and class
695
-
constants inside <literal>{$}</literal> work since PHP
696
-
5. However, the value accessed will be interpreted as the name
939
+
The value accessed from functions, method calls, static class variables,
940
+
and class constants inside
941
+
<literal>{$}</literal> will be interpreted as the name
697
942
of a variable in the scope in which the string is defined. Using
698
943
single curly braces (<literal>{}</literal>) will not work for
699
944
accessing the return values of functions or methods or the
...
...
@@ -744,8 +989,19 @@ echo "I'd like an {${beers::$ale}}\n";
744
989

745
990
<note>
746
991
<simpara>
747
-
<type>String</type>s may also be accessed using braces, as in
992
+
As of PHP 7.1.0, negative string offsets are also supported. These specify
993
+
the offset from the end of the string.
994
+
Formerly, negative offsets emitted <constant>E_NOTICE</constant> for reading
995
+
(yielding an empty string) and <constant>E_WARNING</constant> for writing
996
+
(leaving the string untouched).
997
+
</simpara>
998
+
</note>
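A negative offset counts back from the end of the string, so <literal>-1</literal> addresses the last byte. The following is a minimal sketch, assuming PHP 7.1.0 or later; the variable names are illustrative only.

```php
<?php
$str = 'abcdef';

// Reading: -1 is the last byte, -2 the one before it.
$last       = $str[-1];  // "f"
$secondLast = $str[-2];  // "e"

// Writing through a negative offset replaces the byte in place.
// -6 addresses the first byte of this six-byte string.
$str[-6] = 'A';

var_dump($last);        // string(1) "f"
var_dump($secondLast);  // string(1) "e"
var_dump($str);         // string(6) "Abcdef"
```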
999
+

1000
+
<note>
1001
+
<simpara>
1002
+
Prior to PHP 8.0.0, <type>string</type>s could also be accessed using braces, as in
748
1003
<varname>$str{42}</varname>, for the same purpose.
1004
+
This curly brace syntax was deprecated as of PHP 7.4.0 and is no longer supported as of PHP 8.0.0.
749
1005
</simpara>
750
1006
</note>
751
1007

...
...
@@ -753,10 +1009,10 @@ echo "I'd like an {${beers::$ale}}\n";
753
1009
<simpara>
754
1010
Writing to an out of range offset pads the string with spaces.
755
1011
Non-integer types are converted to integer.
756
-
Illegal offset type emits <constant>E_NOTICE</constant>.
757
-
Negative offset emits <constant>E_NOTICE</constant> in write but reads empty string.
1012
+
Illegal offset type emits <constant>E_WARNING</constant>.
758
1013
Only the first character of an assigned string is used.
759
-
Assigning empty string assigns NULL byte.
1014
+
As of PHP 7.1.0, assigning an empty string throws a fatal error. Formerly,
1015
+
it assigned a NULL byte.
760
1016
</simpara>
761
1017
</warning>
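The write behaviours described in the warning above can be observed with a few assignments. The following is a minimal sketch, assuming PHP 7.1.0 or later; the variable names are illustrative, and recent PHP versions may additionally emit a warning for the multi-character assignment.

```php
<?php
// Writing past the end pads the gap with spaces.
$padded = 'abc';
$padded[5] = 'x';
var_dump($padded);  // string(6) "abc  x"

// Only the first character of the assigned string is used.
$first = 'abc';
$first[0] = 'XYZ';
var_dump($first);   // string(3) "Xbc"

// Assigning an empty string raises an Error, which can be caught.
$unchanged = 'abc';
$threw = false;
try {
    $unchanged[0] = '';
} catch (\Error $e) {
    $threw = true;
}
var_dump($threw);      // bool(true)
var_dump($unchanged);  // string(3) "abc"
```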
762
1018

...
...
@@ -769,6 +1025,13 @@ echo "I'd like an {${beers::$ale}}\n";
769
1025
</simpara>
770
1026
</warning>
771
1027

1028
+
<note>
1029
+
<simpara>
1030
+
As of PHP 7.1.0, applying the empty index operator on an empty string throws a fatal
1031
+
error. Formerly, the empty string was silently converted to an array.
1032
+
</simpara>
1033
+
</note>
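The fatal error mentioned in the note above is raised as an <classname>Error</classname>, so it can be caught. The following is a minimal sketch, assuming PHP 7.1.0 or later; the variable names are illustrative, and the exact message text is left unchecked because it may differ between versions.

```php
<?php
$str = '';
$caught = null;

try {
    // The empty index operator is not supported on strings.
    $str[] = 'a';
} catch (\Error $e) {
    $caught = $e;
}

var_dump($caught instanceof \Error);  // bool(true)
var_dump($str);                       // string(0) ""
```

The string is left as it was, rather than being converted to an array as in earlier versions.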
1034
+

772
1035
<example>
773
1036
<title>Some string examples</title>
774
1037
<programlisting role="php">
...
...
@@ -795,12 +1058,13 @@ $str[strlen($str)-1] = 'e';
795
1058
</example>
796
1059

797
1060
<para>
798
-
As of PHP 5.4 string offsets have to either be integers or integer-like strings, otherwise a warning
799
-
will be thrown. Previously an offset like <literal>"foo"</literal> was silently cast to <literal>0</literal>.
1061
+
String offsets have to either be integers or integer-like strings,
1062
+
otherwise a warning will be thrown.
800
1063
</para>
801
1064

802
1065
<example>
803
-
<title>Differences between PHP 5.3 and PHP 5.4</title>
1066
+
<!-- TODO Update for PHP 8.0 -->
1067
+
<title>Example of Illegal String Offsets</title>
804
1068
<programlisting role="php">
805
1069
<![CDATA[
806
1070
<?php
...
...
@@ -820,20 +1084,7 @@ var_dump(isset($str['1x']));
820
1084
?>
821
1085
]]>
822
1086
</programlisting>
823
-
&example.outputs.53;
824
-
<screen>
825
-
<![CDATA[
826
-
string(1) "b"
827
-
bool(true)
828
-
string(1) "b"
829
-
bool(true)
830
-
string(1) "a"
831
-
bool(true)
832
-
string(1) "b"
833
-
bool(true)
834
-
]]>
835
-
</screen>
836
-
&example.outputs.54;
1087
+
&example.outputs;
837
1088
<screen>
838
1089
<![CDATA[
839
1090
string(1) "b"
...
...
@@ -862,10 +1113,18 @@ bool(false)
862
1113

863
1114
<note>
864
1115
<para>
865
-
PHP 5.5 added support for accessing characters within string literals
1116
+
Characters within string literals can be accessed
866
1117
using <literal>[]</literal> or <literal>{}</literal>.
867
1118
</para>
868
1119
</note>
1120
+

1121
+
<note>
1122
+
<para>
1123
+
Accessing characters within string literals using the
1124
+
<literal>{}</literal> syntax has been deprecated in PHP 7.4.
1125
+
This has been removed in PHP 8.0.
1126
+
</para>
1127
+
</note>
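The square bracket form applies to a string literal in the same way as to a variable. The following is a minimal sketch, assuming PHP 7.1.0 or later for the negative offset; the variable names are illustrative only.

```php
<?php
// An offset can be applied directly to a literal.
$second = 'string'[1];   // "t"
$last   = 'string'[-1];  // "g"

// The same works on a string returned from a function call.
$initial = strtoupper('php')[0];  // "P"

var_dump($second, $last, $initial);
```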
869
1128
</sect3>
870
1129
</sect2><!-- end syntax -->
871
1130

...
...
@@ -885,16 +1144,15 @@ bool(false)
885
1144

886
1145
<simpara>
887
1146
See the <link linkend="ref.strings">string functions section</link> for
888
-
general functions, and the <link linkend="ref.regex">regular expression
889
-
functions</link> or the <link linkend="ref.pcre">Perl-compatible regular
1147
+
general functions, and the <link linkend="ref.pcre">Perl-compatible regular
890
1148
expression functions</link> for advanced find &amp; replace functionality.
891
1149
</simpara>
892
1150

893
1151
<simpara>
894
1152
There are also <link linkend="ref.url">functions for URL strings</link>, and
895
1153
functions to encrypt/decrypt strings
896
-
(<link linkend="ref.mcrypt">mcrypt</link> and
897
-
<link linkend="ref.mhash">mhash</link>).
1154
+
(<link linkend="ref.sodium">Sodium</link> and
1155
+
<link linkend="ref.hash">Hash</link>).
898
1156
</simpara>
899
1157

900
1158
<simpara>
...
...
@@ -919,14 +1177,14 @@ bool(false)
919
1177
</para>
920
1178

921
1179
<para>
922
-
A <type>boolean</type> &true; value is converted to the <type>string</type>
923
-
<literal>"1"</literal>. <type>Boolean</type> &false; is converted to
1180
+
A <type>bool</type> &true; value is converted to the <type>string</type>
1181
+
<literal>"1"</literal>. <type>bool</type> &false; is converted to
924
1182
<literal>""</literal> (the empty string). This allows conversion back and
925
-
forth between <type>boolean</type> and <type>string</type> values.
1183
+
forth between <type>bool</type> and <type>string</type> values.
926
1184
</para>
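The round trip between <type>bool</type> and <type>string</type> can be sketched with explicit casts. The variable names below are illustrative only.

```php
<?php
$fromTrue  = (string) true;   // "1"
$fromFalse = (string) false;  // "" (the empty string)

var_dump($fromTrue);   // string(1) "1"
var_dump($fromFalse);  // string(0) ""

// Casting back yields the original values, because "" is falsy
// and "1" is truthy.
$backToTrue  = (bool) $fromTrue;
$backToFalse = (bool) $fromFalse;

var_dump($backToTrue);   // bool(true)
var_dump($backToFalse);  // bool(false)
```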
927
1185

928
1186
<para>
929
-
An <type>integer</type> or <type>float</type> is converted to a
1187
+
An <type>int</type> or <type>float</type> is converted to a
930
1188
<type>string</type> representing the number textually (including the
931
1189
exponent part for <type>float</type>s). Floating point numbers can be
932
1190
converted using exponential notation (<literal>4.1E+6</literal>).
...
...
@@ -934,7 +1192,9 @@ bool(false)
934
1192

935
1193
<note>
936
1194
<para>
937
-
The decimal point character is defined in the script's locale (category
1195
+
As of PHP 8.0.0, the decimal point character is always
1196
+
a period ("<literal>.</literal>"). Prior to PHP 8.0.0,
1197
+
the decimal point character is defined in the script's locale (category
938
1198
LC_NUMERIC). See the <function>setlocale</function> function.
939
1199
</para>
940
1200
</note>
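Numeric conversion can be sketched in the same way. The following assumes PHP 8.0.0 or later, where the decimal point is always a period; on older versions the float results depend on the locale. The variable names are illustrative, and no exact output is claimed for the very large float because its form depends on the <literal>precision</literal> setting.

```php
<?php
$fromInt   = (string) 42;   // "42"
$fromFloat = (string) 1.5;  // "1.5" on PHP 8.0.0 and later

var_dump($fromInt);
var_dump($fromFloat);

// Very large floats are rendered with an exponent part.
$fromLarge = (string) 1.0E+25;
var_dump(strpos($fromLarge, 'E') !== false);  // bool(true)
```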
...
...
@@ -949,12 +1209,8 @@ bool(false)
949
1209
</para>
950
1210

951
1211
<para>
952
-
<type>Object</type>s in PHP 4 are always converted to the <type>string</type>
953
-
<literal>"Object"</literal>. To print the values of object properties for
954
-
debugging reasons, read the paragraphs below. To get an object's class name,
955
-
use the <function>get_class</function> function. As of PHP 5, the
956
-
<link linkend="language.oop5.magic">__toString</link> method is used when
957
-
applicable.
1212
+
In order to convert <type>object</type>s to <type>string</type>, the magic
1213
+
method <link linkend="language.oop5.magic">__toString</link> must be used.
958
1214
</para>
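A class opts in to string conversion by defining <literal>__toString</literal>. The following is a minimal sketch, assuming PHP 7.0 or later for the type declarations; <literal>Temperature</literal> is a hypothetical class used only for illustration.

```php
<?php
// Hypothetical class, not part of PHP itself.
class Temperature
{
    private $celsius;

    public function __construct(float $celsius)
    {
        $this->celsius = $celsius;
    }

    // Called whenever the object is used where a string is expected.
    public function __toString(): string
    {
        return $this->celsius . ' C';
    }
}

$t = new Temperature(21.5);

$cast     = (string) $t;   // explicit cast
$sentence = "It is $t.";   // interpolation also calls __toString

var_dump($cast);      // string(6) "21.5 C"
var_dump($sentence);  // string(13) "It is 21.5 C."
```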
959
1215

960
1216
<para>
...
...
@@ -983,80 +1239,7 @@ bool(false)
983
1239
<para>
984
1240
Most PHP values can also be converted to <type>string</type>s for permanent
985
1241
storage. This method is called serialization, and is performed by the
986
-
<function>serialize</function> function. If the PHP engine was built with
987
-
<link linkend="ref.wddx">WDDX</link> support, PHP values can also be
988
-
serialized as well-formed XML text.
989
-
</para>
990
-

991
-
</sect2>
992
-

993
-
<sect2 xml:id="language.types.string.conversion">
994
-
<title>String conversion to numbers</title>
995
-

996
-
<simpara>
997
-
When a <type>string</type> is evaluated in a numeric context, the resulting
998
-
value and type are determined as follows.
999
-
</simpara>
1000
-

1001
-
<simpara>
1002
-
If the <type>string</type> does not contain any of the characters '.', 'e',
1003
-
or 'E' and the numeric value fits into integer type limits (as defined by
1004
-
<constant>PHP_INT_MAX</constant>), the <type>string</type> will be evaluated
1005
-
as an <type>integer</type>. In all other cases it will be evaluated as a
1006
-
<type>float</type>.
1007
-
</simpara>
1008
-

1009
-
<para>
1010
-
The value is given by the initial portion of the <type>string</type>. If the
1011
-
<type>string</type> starts with valid numeric data, this will be the value
1012
-
used. Otherwise, the value will be 0 (zero). Valid numeric data is an
1013
-
optional sign, followed by one or more digits (optionally containing a
1014
-
decimal point), followed by an optional exponent. The exponent is an 'e' or
1015
-
'E' followed by one or more digits.
1016
-
</para>
1017
-

1018
-
<informalexample>
1019
-
<programlisting role="php">
1020
-
<![CDATA[
1021
-
<?php
1022
-
$foo = 1 + "10.5"; // $foo is float (11.5)
1023
-
$foo = 1 + "-1.3e3"; // $foo is float (-1299)
1024
-
$foo = 1 + "bob-1.3e3"; // $foo is integer (1)
1025
-
$foo = 1 + "bob3"; // $foo is integer (1)
1026
-
$foo = 1 + "10 Small Pigs"; // $foo is integer (11)
1027
-
$foo = 4 + "10.2 Little Piggies"; // $foo is float (14.2)
1028
-
$foo = "10.0 pigs " + 1; // $foo is float (11)
1029
-
$foo = "10.0 pigs " + 1.0; // $foo is float (11)
1030
-
?>
1031
-
]]>
1032
-
</programlisting>
1033
-
</informalexample>
1034
-

1035
-
<simpara>
1036
-
For more information on this conversion, see the Unix manual page for
1037
-
strtod(3).
1038
-
</simpara>
1039
-

1040
-
<para>
1041
-
To test any of the examples in this section, cut and paste the examples and
1042
-
insert the following line to see what's going on:
1043
-
</para>
1044
-

1045
-
<informalexample>
1046
-
<programlisting role="php">
1047
-
<![CDATA[
1048
-
<?php
1049
-
echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1050
-
?>
1051
-
]]>
1052
-
</programlisting>
1053
-
</informalexample>
1054
-

1055
-
<para>
1056
-
Do not expect to get the code of one character by converting it to integer,
1057
-
as is done in C. Use the <function>ord</function> and
1058
-
<function>chr</function> functions to convert between ASCII codes and
1059
-
characters.
1242
+
<function>serialize</function> function.
1060
1243
</para>
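Serialization turns a value into a <type>string</type> that can be stored and later restored. The following is a minimal sketch; the array contents and variable names are illustrative only.

```php
<?php
$value = ['id' => 7, 'tags' => ['a', 'b'], 'active' => true];

// serialize() returns a string representation of the value.
$stored = serialize($value);
var_dump(is_string($stored));  // bool(true)

// unserialize() rebuilds an equal value from that string.
$restored = unserialize($stored);
var_dump($restored === $value);  // bool(true)
```

Data passed to <function>unserialize</function> should come from a trusted source, since unserializing arbitrary input can instantiate objects.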
1061
1244

1062
1245
</sect2>
...
...
@@ -1091,7 +1274,7 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1091
1274
it is encoded in the script file. Thus, if the script is written in
1092
1275
ISO-8859-1, the string will be encoded in ISO-8859-1 and so on. However,
1093
1276
this does not apply if Zend Multibyte is enabled; in that case, the script
1094
-
may be written in an arbitrary encoding (which is explicity declared or is
1277
+
may be written in an arbitrary encoding (which is explicitly declared or is
1095
1278
detected) and then converted to a certain internal encoding, which is then
1096
1279
the encoding that will be used for the string literals.
1097
1280
Note that there are some constraints on the encoding of the script (or on the
...
...
@@ -1129,15 +1312,7 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1129
1312
<listitem>
1130
1313
<simpara>
1131
1314
Others use the current locale (see <function>setlocale</function>), but
1132
-
operate byte-by-byte. This is the case of <function>strcasecmp</function>,
1133
-
<function>strtoupper</function> and <function>ucfirst</function>.
1134
-
This means they can be used only with single-byte encodings, as long as
1135
-
the encoding is matched by the locale. For instance
1136
-
<literal>strtoupper("á")</literal> may return <literal>"Á"</literal> if the
1137
-
locale is correctly set and <literal>á</literal> is encoded with a single
1138
-
byte. If it is encoded in UTF-8, the correct result will not be returned
1139
-
and the resulting string may or may not be returned corrupted, depending
1140
-
on the current locale.
1315
+
operate byte-by-byte.
1141
1316
</simpara>
1142
1317
</listitem>
1143
1318
<listitem>
...
...
@@ -1147,9 +1322,6 @@ echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
1147
1322
<link linkend="book.intl">intl</link> extension and in the
1148
1323
<link linkend="book.pcre">PCRE</link> extension
1149
1324
(in the last case, only when the <literal>u</literal> modifier is used).
1150
-
Although this is due to their special purpose, the function
1151
-
<function>utf8_decode</function> assumes a UTF-8 encoding and the
1152
-
function <function>utf8_encode</function> assumes an ISO-8859-1 encoding.
1153
1325
</simpara>
1154
1326
</listitem>
1155
1327
</itemizedlist>
1156
1328