Getting XML data from an external page and parsing it with PHP
- by James P
I'm trying to create a database of World of Warcraft gems. If I go to this page:
http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=purple&searchType=items
And go to View Source in Firefox, I see a tonne of XML data which is exactly what I want. I wrote up this quick script to try and parse some of it:
<?php
$gemUrls = array(
'Blue' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=blue&searchType=items',
'Red' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=red&searchType=items',
'Yellow' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=yellow&searchType=items',
'Meta' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=meta&searchType=items',
'Green' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=green&searchType=items',
'Orange' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=orange&searchType=items',
'Purple' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=purple&searchType=items',
'Prismatic' => 'http://www.wowarmory.com/search.xml?fl[source]=all&fl[type]=gems&fl[subTp]=prismatic&searchType=items'
);
// Get blue gems
$blueGems = file_get_contents($gemUrls['Blue']);
$xml = new SimpleXMLElement($blueGems);
echo $xml->items[0]->item;
?>
But I get a load of errors like this:
Warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: Entity: line 20: parser error : xmlParseEntityRef: no name in C:\xampp\htdocs\WoW\index.php on line 19
Warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: if(Browser.iphone && Number(getcookie2("mobIntPageVisits")) < 3 && getcookie2( in C:\xampp\htdocs\WoW\index.php on line 19
I'm not sure what's wrong. I think file_get_contents() is returning something other than the XML I see in the browser, possibly an HTML page with embedded JavaScript, judging by the Browser.iphone code in the errors.
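One way to check that guess is to parse the response with libxml's internal error handling switched on and then look at the collected messages and the start of the body. The sketch below is only an illustration: tryParseXml() is a hypothetical helper, and $sample is a made-up string standing in for whatever file_get_contents() actually returned.

```php
<?php
// Hypothetical helper (not part of the original script): tries to parse a
// string as XML and collects libxml's messages instead of emitting warnings.
function tryParseXml(string $body): array
{
    $previous = libxml_use_internal_errors(true); // buffer errors quietly
    $xml = simplexml_load_string($body);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    return array(
        'xml'    => ($xml === false ? null : $xml),
        'errors' => $messages,
    );
}

// Made-up sample that mimics an HTML page with inline JavaScript.
$sample = '<html><script>if (a && b) { run(); }</script></html>';
$result = tryParseXml($sample);

if ($result['xml'] === null) {
    echo "Not well-formed XML:\n";
    foreach ($result['errors'] as $message) {
        echo "  $message\n";
    }
    // Looking at the start of the body usually shows whether it is HTML.
    echo "Start of response:\n", substr($sample, 0, 200), "\n";
}
```

A bare `&&` inside a script block is enough to trigger the same kind of xmlParseEntityRef error, which would fit the idea that the response is HTML rather than XML.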
Is there any way to get back just the XML from that page, without any HTML or anything else?
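One thing that might be worth trying, on the unconfirmed assumption that the server decides between HTML and XML based on the User-Agent header, is to send a browser-like User-Agent through a stream context. In this sketch buildBrowserContext() is a hypothetical helper and the User-Agent string is just an example value, not something the site is known to require.

```php
<?php
// Hypothetical helper: builds a stream context that sends a browser-like
// User-Agent header along with the request made by file_get_contents().
function buildBrowserContext(string $userAgent)
{
    return stream_context_create(array(
        'http' => array(
            'method'  => 'GET',
            'header'  => "User-Agent: $userAgent\r\n" .
                         "Accept: text/xml,application/xml\r\n",
            'timeout' => 10, // seconds
        ),
    ));
}

// Example value only; whether the server cares about it is an assumption.
$context = buildBrowserContext(
    'Mozilla/5.0 (Windows NT 10.0; rv:100.0) Gecko/20100101 Firefox/100.0'
);

// Usage with the script above (needs network access, so commented out here):
// $blueGems = file_get_contents($gemUrls['Blue'], false, $context);
// $xml = new SimpleXMLElement($blueGems);
```

If the response still fails to parse with this context, the User-Agent assumption is probably wrong and the raw body should be inspected directly.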
Thanks :)