Understanding PHP XML Parsers

This article gives a brief overview of PHP's XML parsers (SimpleXML, XMLReader, the XML Expat parser, and the XML DOM parser) and shows examples of how to use them to parse and handle XML data in PHP, including how to read node and attribute values with SimpleXML.

By the end, you should have a clear picture of how these parsers differ and which one best fits your needs.

XML (Extensible Markup Language) is a markup language widely used for storing data and transporting it between systems.

PHP offers several ways to parse XML documents, each with its own advantages and disadvantages.



What Is XML?

XML structures data in a format that is easy to understand and can be shared across websites and systems.

The XML format is used by a number of web technologies, such as RSS feeds and podcasts.

Creating XML files is very easy. The format is similar to HTML except that you’re able to create your own tags in it.


What is an XML Parser?

You need an XML parser to read, create, update, and otherwise manipulate an XML document.

PHP offers two main types of XML parsers:

  • Tree-Based Parsers
  • Event-Based Parsers

Tree-Based Parsers

A tree-based parser reads the entire XML document into memory and converts it into a tree structure.

It analyzes the whole document and gives you access to every element through that tree (the DOM).

Parsers of this type work well for smaller files, but memory use grows with the size of the document, so large files can exhaust the memory available to the script.

The following are some examples of tree-based parsers:

  • SimpleXML
  • DOM

SimpleXML Parser

SimpleXML is a PHP extension that provides an easy way to manipulate XML data. It allows developers to access XML elements and attributes as if they were properties of an object.

The SimpleXML parser works best with small to medium-sized XML documents whose structure you already know. Like the DOM parser, it loads the whole document into memory, but it needs far less code to read values.

DOM Parser

DOM stands for Document Object Model. This parser reads an entire XML document into memory and creates a tree-like structure that can be easily navigated.

Because the entire document must be held in memory at once, the DOM parser is best suited to small and medium-sized XML documents.

Event-Based Parsers

An event-based parser does not hold the entire document in memory. Instead, it reads the document one node at a time and lets you act on each node as it is read.

As soon as the parser moves on to the next node, the previous one is discarded.

A parser of this type is well suited to processing large XML documents, because it parses faster and uses less memory.

The following are examples of event-based parsers:

  • XMLReader
  • XML Expat Parser

XMLReader

If you are looking for a fast and memory-efficient way to read large XML files in PHP, XMLReader is a great choice.

This PHP extension reads an XML document in a streaming fashion, which means it doesn’t need to load the entire file into memory at once. Instead, it reads the file one node at a time and provides methods to access the data within each node.

This makes it a great choice for working with large, complex XML files where memory usage is a concern.
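
The node-by-node reading described above can be sketched as follows. This is a minimal illustration rather than a complete implementation: the sample XML string is made up for the example, and for a real file you would call open() with a file path instead of XML().

```php
<?php
// Sample data for illustration only; a large file would be opened with
// $reader->open('path/to/file.xml') instead.
$xml = <<<XML
<bookstore>
  <book category="PHP"><title>PHP and MySQL Web Development</title></book>
  <book category="Python"><title>Programming Python</title></book>
</bookstore>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$titles = [];
// read() advances to the next node and returns false at the end of the document
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'title') {
        // readString() returns the text content of the current element
        $titles[] = $reader->readString();
    }
}
$reader->close();

print_r($titles);
```

Only the current node is available at any time, so the loop collects what it needs as it goes.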

PHP XML Parsers

How you pass a document to a parser depends on the parser type. Tree-based parsers load the whole document into memory, either from a string (for example, one obtained with file_get_contents()) or directly from a file. Stream-based parsers open the file and read it piece by piece. The sections below show each parser in turn.


PHP SimpleXML Parser

SimpleXML gives you an object-oriented interface for accessing and manipulating XML elements and attributes.

It is a popular choice for parsing small to medium-sized XML documents because it is simple to use. The parser is tree-based, which gives you a clear and organized way to navigate and manipulate XML data.

One of the key features of SimpleXML is its ability to read an element’s name, attributes, and textual contents with just a few lines of code. This is especially useful if you’re already familiar with the structure or layout of the XML document, allowing you to quickly extract the data you need.

With SimpleXML, you can convert an XML document into a data structure that can be easily iterated as an array or object collection of data. This makes it simple for you to extract and manipulate the data according to your requirements, saving you time and effort in parsing and extracting data from XML documents.

Compared to other XML parsers like DOM or Expat, SimpleXML requires fewer lines of code to read text data from an element. This makes it a concise and efficient choice for reading text data, allowing you to write cleaner and more concise code.
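
To make the comparison concrete, here is a small sketch that reads the same element with SimpleXML and with DOM. The XML string is sample data for illustration only.

```php
<?php
$xml = '<data><to>Denis</to><from>Mr Examples</from></data>';

// SimpleXML: child elements are exposed as object properties
$data = simplexml_load_string($xml);
$fromSimple = (string) $data->from;

// DOM: the same value needs an explicit node lookup
$dom = new DOMDocument();
$dom->loadXML($xml);
$fromDom = $dom->getElementsByTagName('from')->item(0)->textContent;

echo $fromSimple, PHP_EOL;
echo $fromDom, PHP_EOL;
```

Both lines print the same text. The cast to string matters because $data->from is a SimpleXMLElement object, not a plain string.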


Installation

SimpleXML has been bundled with PHP since PHP 5 and is enabled by default, so in most setups no separate installation is needed.


PHP SimpleXML – Read From String

In order to read XML data from a string, the PHP function simplexml_load_string() is used.

Let us suppose that we have a variable which contains XML data, like the following:

$xmlData ="<?xml version='1.0' encoding='UTF-8'?>
<data>
<to>Denis</to>
<from>Mr Examples</from>
<heading>Important</heading>
<body>simplexml_load_string() can use to parse XML data</body>
</data>";

The following example loads the XML data from that string with the simplexml_load_string() function:

Example: 

<?php
$xmlData = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<data>
  <to>Denis</to>
  <from>Mr Examples</from>
  <heading>Important</heading>
  <body>simplexml_load_string() can use to parse XML data</body>
</data>";

// Parse the string; stop with a message if it is not well-formed
$xml = simplexml_load_string($xmlData) or die("Error: Cannot create object");

print_r($xml);
?>

Here is what the code above will produce as an output:

SimpleXMLElement Object
(
    [to] => Denis
    [from] => Mr Examples
    [heading] => Important
    [body] => simplexml_load_string() can use to parse XML data
)

Here is another example that loads XML data from a string using simplexml_load_string() and then loops over all of the data to display it:

Example: 

<?php
$book = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<Book>
  <Name>PHP</Name>
  <Name>Java</Name>
  <Name>Python</Name>
  <Name>JavaScript</Name>
</Book>";

$xmlStr = simplexml_load_string($book);

// Print the name of the root element
echo $xmlStr->getName() . "<br>";

// Print the name and value of each child element
foreach ($xmlStr->children() as $language) {
    echo $language->getName() . ": " . $language . "<br>";
}
?>

Example Explanation

We have some PHP code here that loads an XML string into a SimpleXMLElement object and then prints out some information from that object.

First, we define a variable called $book and assign it an XML string that represents a book with four different programming languages. Then, we use the simplexml_load_string() function to parse that XML string and create a SimpleXMLElement object called $xmlStr.

Next, we use the getName() method to print out the name of the root element of the XML document, which is “Book”.

Finally, we use a foreach loop to iterate through each child element of the $xmlStr object and print out its name and value. In this case, each child element is a programming language represented by a <Name> element, so we print out the name of the element followed by its value (the name of the programming language) on a separate line for each language.

When the code is executed, it outputs:

Book
Name: PHP
Name: Java
Name: Python
Name: JavaScript

Tip for handling XML errors when loading a document: call libxml_use_internal_errors(true) before loading, then use libxml_get_errors() to retrieve all XML errors and loop over them. Below is an example in which we try to load a broken XML string:

Example: 

<?php
// Collect XML errors internally instead of emitting PHP warnings
libxml_use_internal_errors(true);

$xmlData = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<document>
  <user>Denis zakaria</wronguser>
  <email>[email protected]</wrongemail>
</document>";

$xml = simplexml_load_string($xmlData);

if ($xml === false) {
    echo "Error loading XML: ";
    // Print the message of every error libxml recorded
    foreach (libxml_get_errors() as $error) {
        echo " ", $error->message;
    }
} else {
    print_r($xml);
}
?>

Here is what the code above will produce as an output:

Error loading XML: Opening and ending tag mismatch: user line 3 and wronguser Opening and ending tag mismatch: email line 4 and wrongemail
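
Each entry returned by libxml_get_errors() is a LibXMLError object that also carries the position of the problem. The sketch below, which uses a made-up broken XML string, collects the line and column along with the message and then clears the error buffer.

```php
<?php
// Collect errors internally instead of emitting PHP warnings
libxml_use_internal_errors(true);

// Sample broken document for illustration: <user> is closed by </wronguser>
$broken = "<document>\n<user>Denis</wronguser>\n</document>";
$xml = simplexml_load_string($broken);

$messages = [];
if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        // LibXMLError exposes level, code, line, column and message
        $messages[] = sprintf(
            'line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }
    // Empty the buffer so old errors do not show up after the next parse
    libxml_clear_errors();
}

print_r($messages);
```

The exact number and wording of the messages depend on the libxml version PHP was built against.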

PHP SimpleXML Read From File

PHP simplexml_load_file() is an XML reading function that you can use to extract XML data from a file.

Assume there is a file called “data.xml” with the following contents:

<?xml version='1.0' encoding='UTF-8'?>
<data>
<to>Denis</to>
<from>Mr Examples</from>
<heading>Important</heading>
<body>simplexml_load_string() can use to parse XML data</body>
</data>

The following example reads the XML data from that file using the simplexml_load_file() function:

Example: 

<?php
// Load and parse the file; stop with a message if that fails
$xml = simplexml_load_file("data.xml") or die("Error: Cannot create an object");

print_r($xml);
?>

Here is what the code above will produce as an output:

SimpleXMLElement Object
(
    [to] => Denis
    [from] => Mr Examples
    [heading] => Important
    [body] => simplexml_load_string() can use to parse XML data
)

Hint: In the next section, we will look at how to read node and attribute values from an XML file with SimpleXML.

PHP SimpleXML Get Node/Attribute Values

The PHP SimpleXML extension lets you access and modify XML documents with very little code.

One of the most common tasks when working with XML is extracting data from nodes and attributes.


PHP SimpleXML Get Node Values

From the “data.xml” file we are able to get the node values as follows:

Example: 

<?php
$data = simplexml_load_file("data.xml") or die("Error: Cannot make object");

// Each child element is available as a property of the root object
echo $data->to . "<br>";
echo $data->from . "<br>";
echo $data->heading . "<br>";
echo $data->body;
?>

The code above produces the following output:

Denis
Mr Examples
Important
simplexml_load_string() can use to parse XML data

Another XML File

Let us suppose that there is an XML file called “bookstore.xml”, which looks like this:

<?xml version="1.0" encoding="utf-8"?>
<bookstore>
   <book category="PHP">
      <title lang="en">PHP and MySQL Web Development</title>
      <author>Luke Welling</author>
      <year>2001</year>
      <price>41.99</price>
   </book>

   <book category="JAVA">
      <title lang="en">Java: A Beginner's Guide, Eight Edition</title>
      <author>Herbert Schildt</author>
      <year>2018</year>
      <price>25.15</price>
   </book>

   <book category="Python">
      <title lang="en-us">Programming Python</title>
      <author>Mark Lutz</author>
      <year>2014</year>
      <price>71.84</price>
   </book>

   <book category="JavaScript">
      <title lang="en-us">You Don't Know JS: ES6 and Beyond</title>
      <author>Kyle Simpson</author>
      <year>2015</year>
      <price>13.94</price>
   </book>
</bookstore>

PHP SimpleXML – Get Node Values of Specific Elements

The following example gets the value of the <title> element of the first and second <book> elements in the “bookstore.xml” file:

Example: 

<?php
$books = simplexml_load_file("bookstore.xml") or die("Error: Unable to create an object");

// Repeated elements are indexed from 0, like an array
echo $books->book[0]->title . "<br>";
echo $books->book[1]->title;
?>

Here is what the code above will produce as an output:

PHP and MySQL Web Development
Java: A Beginner's Guide, Eight Edition

PHP SimpleXML Get Node Values Through Loop

In the following example, we loop over all of the <book> elements in the “bookstore.xml” file and get the title, author, year, and price of each book:

Example: 

<?php
$books = simplexml_load_file("bookstore.xml") or die("Error: Unable to create an object");

// children() returns every <book> element under the root
foreach ($books->children() as $book) {
    echo "Title: " . $book->title . "<br>";
    echo "Author: " . $book->author . "<br>";
    echo "Year: " . $book->year . "<br>";
    // Two line breaks leave a blank line between books
    echo "Price: " . $book->price . "<br><br>";
}
?>

Example Explanation

In this example, we load an XML file called “bookstore.xml” and store it in a variable called “$books”. Then, we use a “foreach” loop to iterate through each child element of the root <bookstore> element in the XML file.

For each “book” element, we print its title, author, year, and price using the “echo” statement. Specifically, we access the values of the “title”, “author”, “year”, and “price” child elements of the current “book” element using the “->” operator.

In short, we parse an XML file and display the information for each book in a readable format.

So, here is what the code above will produce as an output:

Title: PHP and MySQL Web Development
Author: Luke Welling
Year: 2001
Price: 41.99

Title: Java: A Beginner's Guide, Eight Edition
Author: Herbert Schildt
Year: 2018
Price: 25.15

Title: Programming Python
Author: Mark Lutz
Year: 2014
Price: 71.84

Title: You Don't Know JS: ES6 and Beyond
Author: Kyle Simpson
Year: 2015
Price: 13.94
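
Node values are SimpleXMLElement objects rather than plain strings or numbers, so it is safer to cast them before doing arithmetic. The sketch below assumes a shortened, inline version of the bookstore data and adds up the prices.

```php
<?php
// Shortened sample data for illustration; a real script could use
// simplexml_load_file("bookstore.xml") instead.
$books = simplexml_load_string(
    '<bookstore>
       <book><title>PHP and MySQL Web Development</title><price>41.99</price></book>
       <book><title>Programming Python</title><price>71.84</price></book>
     </bookstore>'
);

$total = 0.0;
foreach ($books->book as $book) {
    // Cast the element to float before adding it up
    $total += (float) $book->price;
}

echo number_format($total, 2), PHP_EOL;
```

The same idea applies to text: use (string) $book->title when you need a real string, for example as an array key.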

PHP SimpleXML Get Attribute Values

Here is an example of getting the “category” attribute value of the first book element and the “lang” attribute value of the second book element:

Example: 

<?php
$books = simplexml_load_file("bookstore.xml") or die("Error: Unable to create an object");

// Attributes are read with array syntax on the element
echo $books->book[0]['category'] . "<br>";
echo $books->book[1]->title['lang'];
?>

Here is what the code above will produce as an output:
PHP
en

PHP SimpleXML Get Attribute Values – Loop

A simple example of getting the attribute values of the ‘title’ elements in the “bookstore.xml” file can be as follows:

Example: 

<?php
$books = simplexml_load_file("bookstore.xml") or die("Error: Unable to create an object");

// Print the lang attribute of each book's <title>, one per line
foreach ($books->children() as $book) {
    echo $book->title['lang'];
    echo "<br>";
}
?>

Here is what the code above will produce as an output:
en
en
en-us
en-us
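
When you do not know the attribute names in advance, the attributes() method lets you iterate over all of them. The following sketch uses a one-book sample string, and the “isbn” attribute it checks for is a hypothetical name that does not exist in the data.

```php
<?php
// Sample data for illustration only
$xml = simplexml_load_string(
    '<bookstore><book category="PHP"><title lang="en">PHP and MySQL Web Development</title></book></bookstore>'
);

$attrs = [];
// Attributes of the <book> element
foreach ($xml->book[0]->attributes() as $name => $value) {
    $attrs[$name] = (string) $value;
}
// Attributes of its <title> element
foreach ($xml->book[0]->title->attributes() as $name => $value) {
    $attrs[$name] = (string) $value;
}

// isset() tells you whether an attribute exists ("isbn" does not)
$hasIsbn = isset($xml->book[0]['isbn']);

print_r($attrs);
var_dump($hasIsbn);
```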

PHP XML Expat Parser

The PHP XML Expat Parser extension lets you parse XML data with an event-driven API.

Unlike SimpleXML and DOM, it does not build a tree of the document.

Instead, it reads the XML as a stream and reports each piece it encounters as an event, which lets it handle large XML documents efficiently.


The XML Expat Parser

Expat is an event-based parser: it processes the document as a stream of events.

Consider the following XML fragment:

<from>Mr Examples</from>

An event-based parser reports the XML above as a series of three events:

  • Element start: from
  • Character data, value: Mr Examples
  • Element end: from
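
The three events above can be observed with a short sketch that records each event in an array. The anonymous handler functions and the $events variable are illustrative choices, not part of the original example. Note that element names arrive in upper case because case folding is enabled by default.

```php
<?php
$parser = xml_parser_create();
$events = [];

// Record the start and end of every element
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "start: $name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "end: $name";
    }
);

// Record character data found between tags
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        $events[] = "data: $data";
    }
);

// The third argument marks this chunk as the final one
xml_parse($parser, '<from>Mr Examples</from>', true);

print_r($events);
```

A parser may deliver longer text in more than one character-data call, so real handlers usually append the pieces instead of assuming a single call.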

The XML Expat Parser functions are also bundled with PHP and enabled by default, so no separate installation is normally needed to use them.


The XML File

The following example uses the XML file “data.xml”, which has these contents:

<?xml version='1.0' encoding='UTF-8'?>
<data>
<to>Denis</to>
<from>Mr Examples</from>
<heading>Important</heading>
<body>simplexml_load_string() can be use to parse XML data</body>
</data>

Initializing the XML Expat Parser

To parse the XML file, we initialize the XML Expat Parser and define handlers for the different XML events:

Example: 

<?php
// Initialize the XML parser
$parser = xml_parser_create();

// Called at the start of each element. Names arrive in upper case
// because case folding is enabled by default.
function start($parser, $e_name, $e_attrs) {
    switch ($e_name) {
        case "NOTE": // a <note> root element; data.xml uses <data>, so this branch is not hit here
            echo "-- Note --<br>";
            break;
        case "TO":
            echo "To: ";
            break;
        case "FROM":
            echo "From: ";
            break;
        case "HEADING":
            echo "Heading: ";
            break;
        case "BODY":
            echo "Message: ";
    }
}

// Called at the end of each element
function stop($parser, $e_name) {
    echo "<br>";
}

// Called for character data inside an element
function char($parser, $data) {
    echo $data;
}

// Register the element handlers
xml_set_element_handler($parser, "start", "stop");

// Register the character data handler
xml_set_character_data_handler($parser, "char");

// Open the XML file
$file = fopen("data.xml", "r");

// Read and parse the file in 4 KB chunks
while ($data = fread($file, 4096)) {
    xml_parse($parser, $data, feof($file)) or die(sprintf(
        "XML Error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    ));
}

fclose($file);

// Free the XML parser
xml_parser_free($parser);
?>

Example Explanation

We are using PHP to parse an XML file. First, we initialize the XML parser using the “xml_parser_create()” function. Then, we define three functions for handling XML elements and character data.

The “start” function is called when the parser encounters the start of an XML element. It checks the element name and prints the matching label. For example, if the element is “TO”, it prints “To: ”. Element names are compared in upper case because the parser converts them to upper case by default.

The “stop” function is called when the parser encounters the end of an XML element. In this example it only prints a separator after each element.

The “char” function is called when the parser encounters character data within an element. It simply prints out the character data.

We then set the element and character data handlers for the parser using “xml_set_element_handler()” and “xml_set_character_data_handler()”.

Next, we open the XML file using the “fopen()” function and read the data using “fread()”. We parse the data using “xml_parse()” and continue reading until we reach the end of the file.

If there is an error during parsing, we use the “die()” function to print out the error message and exit the program.

Finally, we free the XML parser using “xml_parser_free()”.
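
If you would rather receive element names exactly as they are written in the document, you can switch off case folding with xml_parser_set_option(). The sketch below is a minimal illustration that parses an inline sample string and collects the element names. The $names variable and the anonymous handlers are assumptions made for the example.

```php
<?php
$parser = xml_parser_create();

// Turn off the default conversion of element names to upper case
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

$names = [];
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$names) {
        $names[] = $name;   // name as written in the document
    },
    function ($parser, $name) {
        // nothing to do at the end of an element in this sketch
    }
);

xml_parse($parser, '<data><to>Denis</to><from>Mr Examples</from></data>', true);

print_r($names);
```

With case folding off, a switch statement in the start handler would compare against “to” and “from” instead of “TO” and “FROM”.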


PHP XML DOM Parser

The PHP XML DOM Parser lets you work with an XML document through its Document Object Model (DOM), so you can read and manipulate the document's nodes.

It makes processing and reading data from XML files in PHP straightforward.

The DOM parser is tree-based: it represents the document as a tree of nodes that mirrors the document's structure.

Consider the following XML fragment:

<?xml version="1.0" encoding="UTF-8"?>
<from>Mr Examples</from>

The DOM represents the XML above as the following tree:

  • Level 1: XML Document
  • Level 2: Root element: <from>
  • Level 3: Text node: “Mr Examples”

Installation

The DOM extension is bundled with PHP and enabled by default.

No separate installation is normally needed to use it.


The XML File

Throughout this example, we use the XML file “data.xml”, which has these contents:

<?xml version='1.0' encoding='UTF-8'?>
<data>
<to>Denis</to>
<from>Mr Examples</from>
<heading>Important</heading>
<body>simplexml_load_string() can be use to parse XML data</body>
</data>

Load XML And Output

The following example creates a DOMDocument, loads the XML file into it, and outputs the result:

Example: 

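<?php
$dataDOM = new DOMDocument();

// Load and parse the file into the document object
$dataDOM->load("data.xml");

// saveXML() returns the document as an XML string
print $dataDOM->saveXML();
?>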

Because the browser treats the XML tags as unknown markup, the page shows only the text content:

Denis Mr Examples Important simplexml_load_string() can be use to parse XML data

If you select “View source” in the browser window, you will see the XML that was actually sent:

<?xml version="1.0" encoding="UTF-8"?>
<data>
<to>Denis</to>
<from>Mr Examples</from>
<heading>Important</heading>
<body>simplexml_load_string() can be use to parse XML data</body>
</data>

The example above creates a DOMDocument object and loads the XML from the “data.xml” file into it.

The saveXML() method then returns the internal XML document as a string, which we print.
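
Because saveXML() returns a string, you can also change the document first and then serialize the result. The following sketch assumes an inline sample document and a hypothetical <priority> element that is not part of “data.xml”.

```php
<?php
$dom = new DOMDocument();
// Sample data for illustration; load("data.xml") would work the same way
$dom->loadXML('<data><to>Denis</to><from>Mr Examples</from></data>');

// Create a new element (hypothetical name) and append it to the root
$priority = $dom->createElement('priority', 'high');
$dom->documentElement->appendChild($priority);

// Serialize the modified document back to a string
$output = $dom->saveXML();
echo $output;
```

To write the result to disk instead, DOMDocument also provides a save() method that takes a file path.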


Looping Through XML

The following example loads the XML and loops over all child nodes of the <data> element:

Example: 

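<?php
$dataDOM = new DOMDocument();
$dataDOM->load("data.xml");

// documentElement is the root element, <data>
$element = $dataDOM->documentElement;

// Print the name and value of every child node, one per line
foreach ($element->childNodes as $item) {
    print $item->nodeName . " = " . $item->nodeValue . "<br>";
}
?>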

Here is what the code above will produce as an output:

#text =
to = Denis
#text =
from = Mr Examples
#text =
heading = Important
#text =
body = simplexml_load_string() can be use to parse XML data
#text =

Besides the four elements, the <data> element also contains whitespace-only text nodes between its child elements. The DOM reports these nodes with the node name #text.

XML files often contain white space (line breaks and indentation) between nodes, and the XML DOM parser treats that white space as ordinary nodes. If you are not aware of this, these extra nodes can cause problems, for example when you expect the first child node to be an element.
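
One way to deal with these whitespace nodes is to check the node type inside the loop and skip everything that is not an element. The sketch below uses an inline sample string that contains line breaks between the elements.

```php
<?php
$dom = new DOMDocument();
// Sample data for illustration, with line breaks between the elements
$dom->loadXML("<data>\n<to>Denis</to>\n<from>Mr Examples</from>\n</data>");

$values = [];
foreach ($dom->documentElement->childNodes as $node) {
    // Skip whitespace-only text nodes (and anything else that is not an element)
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        continue;
    }
    $values[$node->nodeName] = $node->nodeValue;
}

print_r($values);
```

Alternatively, setting the preserveWhiteSpace property of the DOMDocument to false before loading tells the parser to drop whitespace-only text nodes.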


Using the PHP XML DOM Parser in Your Applications

Now that you understand the basics of the PHP XML DOM parser, you can start using it in your own PHP applications. Here are some tips to keep in mind:

  1. Always check for errors: When working with XML data, it’s important to check for errors that might occur during parsing. Call libxml_use_internal_errors(true) before loading, then use the libxml_get_errors() function to retrieve any errors that occurred during parsing.
  2. Use XPath expressions: XPath is a powerful language for navigating XML documents. You can use XPath expressions with the PHP XML DOM parser, through the DOMXPath class, to select specific elements or attributes in an XML document.
  3. Be careful with memory usage: When parsing large XML documents, the PHP XML DOM parser can use a lot of memory. DOMDocument has no dedicated method for releasing it, so unset the variable (or let it go out of scope) when you’re done with the object, and consider a stream-based parser such as XMLReader for very large files.
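
The tips above can be combined in one short sketch. It assumes a shortened, inline version of the bookstore data. The XPath expression and the variable names are illustrative choices.

```php
<?php
// Tip 1: collect parse errors instead of emitting warnings
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadXML(
    '<bookstore>
       <book category="PHP"><title>PHP and MySQL Web Development</title></book>
       <book category="Python"><title>Programming Python</title></book>
     </bookstore>'
);

$titles = [];
if ($loaded === false) {
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), PHP_EOL;
    }
    libxml_clear_errors();
} else {
    // Tip 2: select only the titles of books in the "PHP" category
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//book[@category="PHP"]/title') as $node) {
        $titles[] = $node->textContent;
    }
    // Tip 3: drop the references so PHP can reclaim the memory
    unset($xpath, $dom);
}

print_r($titles);
```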

Conclusion

In conclusion, PHP provides several XML parsers that can help you parse and extract data from XML files. SimpleXML and DOM are tree-based and convenient for smaller documents, while XMLReader and the XML Expat parser process the document as a stream and are better suited to large files. With these parsers, you can extract data from XML files and manipulate it as needed for your application.
