PHP's SimpleXML extension is simple and easy to use when it comes to parsing well-formed XML files, that is, XML that is not broken and is not full of errors.
One of the handiest functions of this extension is simplexml_load_string. Here is an example:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
    <row>
        <name>Harry</name>
        <age>25</age>
    </row>
</resultset>';

$simple = simplexml_load_string($xml);
print_r($simple);
Use print_r or var_dump, whichever you like. The output should be:
SimpleXMLElement Object
(
    [row] => Array
        (
            [0] => SimpleXMLElement Object
                (
                    [name] => Happy
                    [age] => 20
                )

            [1] => SimpleXMLElement Object
                (
                    [name] => Harry
                    [age] => 25
                )

        )

)
Nothing special there. But one thing many developers try to do is convert the above object into an array, so that they can loop over it with foreach somewhere in their code.
You can attempt this in many ways. On the PHP documentation page, many people have posted comments with their own version of code to convert a SimpleXML object to an array.
Here is a common trick:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
    <row>
        <name>Harry</name>
        <age>25</age>
    </row>
</resultset>';

$simple = simplexml_load_string($xml);
$arr = json_decode(json_encode($simple), 1);
print_r($arr);
Together, those two JSON functions convert a SimpleXML object to an array recursively. The output is:
Array
(
    [row] => Array
        (
            [0] => Array
                (
                    [name] => Happy
                    [age] => 20
                )

            [1] => Array
                (
                    [name] => Harry
                    [age] => 25
                )

        )

)
But the above approach has a problem. Take a closer look:
<?php
$xml_single = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
</resultset>';

$xml_multi = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
    <row>
        <name>Harry</name>
        <age>25</age>
    </row>
</resultset>';

$simple_single = simplexml_load_string($xml_single);
$simple_multi = simplexml_load_string($xml_multi);

$single_array = json_decode(json_encode($simple_single), 1);
$multi_array = json_decode(json_encode($simple_multi), 1);

print_r($single_array);
print_r($multi_array);
The above code has two XML strings: one has a single row element and the other has two. The output, as you can guess, is:
Array
(
    [row] => Array
        (
            [name] => Happy
            [age] => 20
        )

)
Array
(
    [row] => Array
        (
            [0] => Array
                (
                    [name] => Happy
                    [age] => 20
                )

            [1] => Array
                (
                    [name] => Harry
                    [age] => 25
                )

        )

)
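Looks OK? Not quite. As we can see above, the array indexing is not identical: with a single row there is no numeric index. So if someone were to do a:
foreach($arr['row'] as $c => $row)
then it would work in the second case, but in the first case the loop would run over the name and age values instead of over rows.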
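If you want to stay with the JSON trick, one possible workaround is to normalize the value before looping. The sketch below uses a helper called ensure_list, which is a made-up name for illustration and not part of PHP or SimpleXML. It assumes that a list of rows always carries the numeric key 0, and you still have to know in advance which elements may repeat:

```php
<?php
// Hypothetical helper (not a PHP built-in): wrap a single row in a list
// so the one-row and many-row shapes can be looped over the same way.
function ensure_list($value)
{
    // A list of rows produced by json_decode has a numeric key 0.
    if (is_array($value) && array_key_exists(0, $value)) {
        return $value;
    }
    // Anything else is treated as a single row and wrapped.
    return array($value);
}

$xml_single = '<resultset><row><name>Happy</name><age>20</age></row></resultset>';
$xml_multi  = '<resultset><row><name>Happy</name><age>20</age></row>'
            . '<row><name>Harry</name><age>25</age></row></resultset>';

$single = json_decode(json_encode(simplexml_load_string($xml_single)), true);
$multi  = json_decode(json_encode(simplexml_load_string($xml_multi)), true);

$single_names = array();
foreach (ensure_list($single['row']) as $row) {
    $single_names[] = $row['name'];
}

$multi_names = array();
foreach (ensure_list($multi['row']) as $row) {
    $multi_names[] = $row['name'];
}

print_r($single_names);
print_r($multi_names);
```

This only patches the symptom for one known key, which is why the rest of the post looks for a conversion that is consistent by construction.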
You may want to try an alternative function like:
function xml2array($xmlObject, $out = array())
{
    foreach ((array) $xmlObject as $index => $node) {
        $out[$index] = (is_object($node)) ? xml2array($node) : $node;
    }
    return $out;
}
used like this:
<?php
$xml_single = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
</resultset>';

$xml_multi = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
    <row>
        <name>Harry</name>
        <age>25</age>
    </row>
</resultset>';

$simple_single = simplexml_load_string($xml_single);
$simple_multi = simplexml_load_string($xml_multi);

$single_array = xml2array($simple_single);
$multi_array = xml2array($simple_multi);

print_r($single_array);
print_r($multi_array);

function xml2array($xmlObject, $out = array())
{
    foreach ((array) $xmlObject as $index => $node) {
        $out[$index] = (is_object($node)) ? xml2array($node) : $node;
    }
    return $out;
}
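But the result is no better. The single-row case still has no numeric index, and in the multi-row case the rows are left as SimpleXMLElement objects, because is_object() is false for the array that holds them: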
Array
(
    [row] => Array
        (
            [name] => Happy
            [age] => 20
        )

)
Array
(
    [row] => Array
        (
            [0] => SimpleXMLElement Object
                (
                    [name] => Happy
                    [age] => 20
                )

            [1] => SimpleXMLElement Object
                (
                    [name] => Harry
                    [age] => 25
                )

        )

)
The problem with the above methods is that they all loop over the SimpleXML object as if it were an array. This approach does not work well. The SimpleXMLElement class provides a method called children(), which returns the children of an element.
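As a rough illustration of what children() gives you, the sketch below collects the names from each row. The function name row_names and the shortened XML strings are only for illustration. The loop body runs once per row element, whether the document has one row or several:

```php
<?php
// Illustrative helper: collect the <name> text of every <row> child.
function row_names($xmlString)
{
    $xml = simplexml_load_string($xmlString);
    $names = array();
    // children() yields each child element in turn, one at a time,
    // regardless of how many siblings share the same tag name.
    foreach ($xml->children() as $row) {
        $names[] = (string) $row->name;
    }
    return $names;
}

$xml_single = '<resultset><row><name>Happy</name><age>20</age></row></resultset>';
$xml_multi  = '<resultset><row><name>Happy</name><age>20</age></row>'
            . '<row><name>Harry</name><age>25</age></row></resultset>';

print_r(row_names($xml_single));
print_r(row_names($xml_multi));
```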
Use the children method
function xml2array($xml)
{
    $arr = array();
    foreach ($xml->children() as $r) {
        $t = array();
        if (count($r->children()) == 0) {
            $arr[$r->getName()] = strval($r);
        } else {
            $arr[$r->getName()][] = xml2array($r);
        }
    }
    return $arr;
}
Use it like this:
<?php
$xml_single = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
</resultset>';

$xml_multi = '<?xml version="1.0" encoding="UTF-8"?>
<resultset>
    <row>
        <name>Happy</name>
        <age>20</age>
    </row>
    <row>
        <name>Harry</name>
        <age>25</age>
    </row>
</resultset>';

$simple_single = simplexml_load_string($xml_single);
$simple_multi = simplexml_load_string($xml_multi);

$single_array = xml2array($simple_single);
$multi_array = xml2array($simple_multi);

print_r($single_array);
print_r($multi_array);

function xml2array($xml)
{
    $arr = array();
    foreach ($xml->children() as $r) {
        $t = array();
        if (count($r->children()) == 0) {
            $arr[$r->getName()] = strval($r);
        } else {
            $arr[$r->getName()][] = xml2array($r);
        }
    }
    return $arr;
}
and the output is:
Array
(
    [row] => Array
        (
            [0] => Array
                (
                    [name] => Happy
                    [age] => 20
                )

        )

)
Array
(
    [row] => Array
        (
            [0] => Array
                (
                    [name] => Happy
                    [age] => 20
                )

            [1] => Array
                (
                    [name] => Harry
                    [age] => 25
                )

        )

)
In this output the keys are consistent, so a foreach($arr['row'] as $row) works fine in both cases.
A few notable points about this last method:
1. count($r->children()) == 0 - If a SimpleXMLElement object has no child elements, it is treated as a leaf node and stored in the array as a string.
2. $arr[$r->getName()] = strval($r) - $r is a SimpleXMLElement object, so strval should be used to get its text value. The getName method returns the name of the tag.
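To make those two points concrete, here is a small sketch that inspects each child of a single row. The $info array and the sample XML are only there for illustration:

```php
<?php
$xml = simplexml_load_string('<row><name>Happy</name><age>20</age></row>');

$info = array();
foreach ($xml->children() as $child) {
    $info[] = array(
        'tag'      => $child->getName(),         // tag name, e.g. "name"
        'children' => count($child->children()), // 0 for a leaf element
        'value'    => strval($child),            // text content as a string
    );
}

foreach ($info as $item) {
    printf("%s: %d children, value %s\n", $item['tag'], $item['children'], $item['value']);
}

// The <row> element itself has two child elements, so it is not a leaf.
var_dump(count($xml->children()));
```

Without strval (or a (string) cast) the array would hold SimpleXMLElement objects rather than plain strings.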
Try the method!!
Works perfectly, thanks.
$t = array(); // I think you can remove this line in xml2array() method.
This “tiny” great solution saved me a lot of coding hours.
Thank you, bro!
This content was really helpful to me.
Thank you so much
Hi, I’m currently using your code in a new PHP script to generate KML (Google Maps, Google Earth, OpenStreetMap) based on iCal from Google Calendar.
I plan to release it under the GNU LGPL license.
I could mention your name and include your email address. Are you happy with that?
Best regards
Marc Van Coillie (from France)
Yes, but this is not suitable for large XML files of over 200k lines. You will get an empty json_decode result.
Thank you! Very helpful.
This just saved me after pulling at my hair for quite a while.
thanks so much.
A little “fast” fix to convert:
SimpleXMLElement Object
(
[gebruiker] => Array
(
[0] => 94
[1] => [email protected]
)
to:
[gebruiker] => Array
(
[0] => 94
[1] => [email protected]
)
==> (see if($xml->$k->count() == 1){ )
function xml2array($xml) {
    $arr = array();
    foreach ($xml->children() as $k => $r) {
        if (count($r->children()) == 0) {
            // Leaf element: keep a plain string when the tag occurs once,
            // otherwise collect the repeated values in a list.
            if ($xml->$k->count() == 1) {
                $arr[$r->getName()] = strval($r);
            } else {
                $arr[$r->getName()][] = strval($r);
            }
        } else {
            // Element with children: convert it recursively.
            $arr[$r->getName()][] = xml2array($r);
        }
    }
    return $arr;
}
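To see what this fix changes, here is a variant sketch with a usage example. It is not the commenter's exact code: instead of calling count() on $xml->$k, it counts the tag names in a first pass. The function name xml2array_counted and the sample values are made up for illustration:

```php
<?php
// Variant sketch: repeated leaf elements are collected in a list
// instead of overwriting each other.
function xml2array_counted($xml)
{
    // First pass: count how often each tag name occurs among the children.
    $counts = array();
    foreach ($xml->children() as $child) {
        $name = $child->getName();
        $counts[$name] = isset($counts[$name]) ? $counts[$name] + 1 : 1;
    }

    // Second pass: build the array.
    $arr = array();
    foreach ($xml->children() as $child) {
        $name = $child->getName();
        if (count($child->children()) == 0) {
            if ($counts[$name] == 1) {
                $arr[$name] = strval($child);   // single leaf: plain string
            } else {
                $arr[$name][] = strval($child); // repeated leaf: list of strings
            }
        } else {
            $arr[$name][] = xml2array_counted($child);
        }
    }
    return $arr;
}

// Made-up sample data with a repeated leaf element.
$xml = simplexml_load_string(
    '<data><gebruiker>94</gebruiker><gebruiker>someone</gebruiker><id>7</id></data>'
);
$result = xml2array_counted($xml);
print_r($result);
```

With the function from the post, the second gebruiker value would overwrite the first. Here both are kept, while the single id element stays a plain string.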