September 30th, 2008

XPath in SimpleXML

SimpleXML, as its name implies, is a very simple API for traversing XML, implemented specifically for PHP. It covers much of the same ground as XPath, but because its syntax is more PHP-friendly, PHP developers really like to use it.
As an example, take this XML:

<dwml>
  <data>
    <location>
      <location-key>point1</location-key>
      <point latitude="37.39" longitude="-122.07"></point>
    </location>
  </data>
  .....
</dwml>

In the general case, the XPath query to get the latitude is:

/dwml/data/location/point/@latitude

Whereas with SimpleXML it is just a familiar PHP statement:

$simplexml->data->location->point->attributes()->latitude

You can still use XPath inside your SimpleXML code, though. You can execute XPath queries by calling the xpath() method on any SimpleXMLElement. It returns an array of SimpleXMLElement objects that match your query. So for the above example, the XPath version would look something like this:

$simplexml = new SimpleXMLElement($xml);
$lats = $simplexml->xpath('/dwml/data/location/point/@latitude');
echo $lats[0];
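
To see the two styles side by side, here is a small self-contained sketch. It assumes the sample XML above, trimmed to the parts that are shown; the variable names $viaProperties and $viaXPath are just mine.

```php
<?php
// Sample document from the example above (nowdoc, so nothing is interpolated).
$xml = <<<'XML'
<dwml>
  <data>
    <location>
      <location-key>point1</location-key>
      <point latitude="37.39" longitude="-122.07"></point>
    </location>
  </data>
</dwml>
XML;

$simplexml = new SimpleXMLElement($xml);

// Property-style traversal returns a SimpleXMLElement, so cast it to a string.
$viaProperties = (string) $simplexml->data->location->point->attributes()->latitude;

// xpath() returns an array of matches; guard against an empty or failed result.
$matches = $simplexml->xpath('/dwml/data/location/point/@latitude');
$viaXPath = $matches ? (string) $matches[0] : null;

var_dump($viaProperties === $viaXPath);
```

Casting to string is worth the habit: both approaches hand back SimpleXMLElement objects, which only look like strings when echoed.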

This simplicity lets you switch between the two methods freely, choosing whichever best fits your application. Here are some cases where I think XPath is easier to use.

Ability to use XPath shorthand
Take the above example XML itself. If there is only one attribute named ‘latitude’ in the whole XML document, you can get its value with

//@latitude
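
As a rough sketch of the shorthand in PHP, assuming a cut-down version of the sample document with a single ‘latitude’ attribute:

```php
<?php
$xml = '<dwml><data><location>'
     . '<point latitude="37.39" longitude="-122.07"/>'
     . '</location></data></dwml>';

$simplexml = new SimpleXMLElement($xml);

// '//' searches the whole document, so the full path is not needed.
$lats = $simplexml->xpath('//@latitude');

echo count($lats), ' match(es), first = ', (string) $lats[0], "\n";
```

If the document had several ‘latitude’ attributes, the array would contain all of them, so the shorthand is only safe when the name is unique.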

If an XML node name or attribute name contains characters like ‘-’, which are not allowed in PHP variable names
In the example, if you want to access the value inside the ‘location-key’ node using SimpleXML, you might write:

echo $simplexml->data->location->location-key;

This will not give you the expected result, because PHP parses ‘location-key’ as two tokens, ‘location’ and ‘key’, with a minus operator between them. So this particular code can be replaced with the xpath function.

$keys =  $simplexml->xpath('/dwml/data/location/location-key');
echo $keys[0];
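
For completeness, PHP's curly-brace property syntax is another way around the hyphen that stays inside plain SimpleXML. The sketch below shows it next to the XPath version, using a trimmed copy of the sample XML; the variable names are only illustrative.

```php
<?php
$xml = <<<'XML'
<dwml>
  <data>
    <location>
      <location-key>point1</location-key>
    </location>
  </data>
</dwml>
XML;

$simplexml = new SimpleXMLElement($xml);

// Curly braces let PHP accept a property name that contains '-'.
$viaBraces = (string) $simplexml->data->location->{'location-key'};

// The XPath route from the text, for comparison.
$keys = $simplexml->xpath('/dwml/data/location/location-key');
$viaXPath = (string) $keys[0];

echo $viaBraces, ' ', $viaXPath, "\n";
```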

You want to iterate through nodes with the same name in an XML document

If the nodes we want to iterate over sit at regular positions in the XML (as in the following example), both approaches are equally easy to use.

<root>
  <mynode>value1</mynode>
  <mynode>value2</mynode>
  <mynode>value3</mynode>
  <mynode>value4</mynode>
  <mynode>value5</mynode>
</root>

But what if the ‘mynode’ elements were in different locations in the XML, like this?

<root>
  <anothernode>
     <mynode>value1</mynode>
  </anothernode>
  <anotheranothernode>
     <anotheranotheranothernode>
       <mynode>value2</mynode>
     </anotheranotheranothernode>
     <mynode>value3</mynode>
  </anotheranothernode>
  <mynode>value4</mynode>
</root>

You can iterate over all the ‘mynode’ nodes with the following XPath query.

//mynode
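
In PHP that iteration might look like the following sketch, which runs the query against the scattered example above and collects the values into an array ($values is just a name I picked):

```php
<?php
$xml = <<<'XML'
<root>
  <anothernode>
     <mynode>value1</mynode>
  </anothernode>
  <anotheranothernode>
     <anotheranotheranothernode>
       <mynode>value2</mynode>
     </anotheranotheranothernode>
     <mynode>value3</mynode>
  </anotheranothernode>
  <mynode>value4</mynode>
</root>
XML;

$simplexml = new SimpleXMLElement($xml);

// '//mynode' matches every 'mynode' element, whatever its depth.
$values = [];
foreach ($simplexml->xpath('//mynode') as $node) {
    $values[] = (string) $node;
}

echo count($values), ' nodes: ', implode(', ', $values), "\n";
```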

Note that this case can also be handled easily in DOM with getElementsByTagName.

To use the power of XPath functions and axes
You can use XPath functions like last() and position(), and even string manipulation functions like substring(), in an XPath statement.
For example, in the XML above, if you only want the value of the last ‘mynode’ in the document, use this expression (the parentheses matter: without them, //mynode[last()] selects the last ‘mynode’ under each parent):

(//mynode)[last()]
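
Here is a quick sketch of how a last() predicate behaves on the scattered XML above. A predicate attached directly to //mynode is evaluated per parent, while wrapping the path in parentheses applies it to the whole result. The variable names $perParent and $overall are mine.

```php
<?php
$xml = <<<'XML'
<root>
  <anothernode>
     <mynode>value1</mynode>
  </anothernode>
  <anotheranothernode>
     <anotheranotheranothernode>
       <mynode>value2</mynode>
     </anotheranotheranothernode>
     <mynode>value3</mynode>
  </anotheranothernode>
  <mynode>value4</mynode>
</root>
XML;

$simplexml = new SimpleXMLElement($xml);

// Last 'mynode' among the children of EACH parent: every node here qualifies.
$perParent = $simplexml->xpath('//mynode[last()]');

// Last 'mynode' in the whole document: exactly one node.
$overall = $simplexml->xpath('(//mynode)[last()]');

echo count($perParent), ' vs ', count($overall), ': ', (string) $overall[0], "\n";
```

On the flat example, where all the ‘mynode’ elements share one parent, the two expressions select the same node.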

You can also use the power of axes in XPath queries. If you want to iterate over all the ancestors of the current node, just use this query:

'ancestor::*'
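
A small sketch of the ancestor axis, assuming the scattered XML from earlier. It first picks the ‘mynode’ holding value2 as the current node, then runs the relative query from it. The names $start and $names are only for illustration.

```php
<?php
$xml = <<<'XML'
<root>
  <anotheranothernode>
     <anotheranotheranothernode>
       <mynode>value2</mynode>
     </anotheranotheranothernode>
     <mynode>value3</mynode>
  </anotheranothernode>
  <mynode>value4</mynode>
</root>
XML;

$simplexml = new SimpleXMLElement($xml);

// Pick the current node: the 'mynode' whose text is "value2".
$found = $simplexml->xpath('//mynode[.="value2"]');
$start = $found[0];

// A relative query is evaluated with $start as the context node.
$names = [];
foreach ($start->xpath('ancestor::*') as $ancestor) {
    $names[] = $ancestor->getName();
}

echo implode(', ', $names), "\n";
```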

Access elements with different namespaces

<saleItems>
   <ns1:car xmlns:ns1="http:/toyota.xxx.com">$3000</ns1:car>
   <ns2:car xmlns:ns2="http:/suziki.rrr.com">$4000</ns2:car>
</saleItems>

Suppose you want to extract the cars using SimpleXML. You can do this with the following code.

$simplexml= new SimpleXMLElement($xml);
$ns1_childs = $simplexml->children("http:/toyota.xxx.com");
echo $ns1_childs->car;

$ns2_childs = $simplexml->children("http:/suziki.rrr.com");
echo $ns2_childs->car;

Every time you access a different namespace you have to call the children method with the namespace as an argument.

If you use the XPath approach, you first register the namespaces with a prefix and then just use those prefixes in your XPath queries.

$simplexml= new SimpleXMLElement($xml);

$simplexml->registerXPathNamespace("p1", "http:/toyota.xxx.com");
$simplexml->registerXPathNamespace("p2", "http:/suziki.rrr.com");

$toyota_cars = $simplexml->xpath('//p1:car');
$suziki_cars = $simplexml->xpath('//p2:car');

echo $toyota_cars[0];
echo $suziki_cars[0];

SimpleXML is simple and powerful in its native form. But whenever it is impossible or difficult to use, you don’t need to go back to tedious DOM calls or manual string manipulation. You can use XPath queries to get the work done within the SimpleXML environment itself.

The REST API for Twitter is very simple to learn and implement, and it has comprehensive documentation.

Here are some selected operations, just to show its design. Note that userid should be replaced with a valid Twitter user ID or user name, and format with the required output format (.xml, json, rss and atom are the possible output formats).

The example requests set the username to ‘dimuthu’ and the output format to .xml.

Operation: Get public (all users) statuses
HTTP Verb: GET
URL: http://twitter.com/statuses/public_timeline
Example HTTP Request:
  GET http://twitter.com/statuses/public_timeline

Operation: Get a user's statuses
HTTP Verb: GET
URL: http://twitter.com/statuses/user_timeline/userid.format
Example HTTP Request:
  GET http://twitter.com/statuses/user_timeline/dimuthu.xml

Operation: Get a particular status
HTTP Verb: GET
URL: http://twitter.com/statuses/show/statusid.format
Example HTTP Request:
  GET http://twitter.com/statuses/show/938135815.xml

Operation: Create a new status
HTTP Verb: POST
URL: http://twitter.com/statuses/update.format
Example HTTP Request:
  POST http://twitter.com/statuses/update.xml
  Authorization: Basic xxxx
  ………..
  <status>my status message</status>

Operation: Delete a particular status
HTTP Verb: DELETE / POST
URL: http://twitter.com/statuses/destroy/statusid.xml
Example HTTP Request:
  DELETE http://twitter.com/statuses/destroy/939390294.xml
  Authorization: Basic xxxx
  ………..
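
To make the URL patterns above concrete, here is a minimal sketch that only builds the request lines and never touches the network. The helper names user_timeline_url() and show_status_url() are hypothetical ones of mine, not part of any Twitter library.

```php
<?php
// Hypothetical helper: fills in the userid.format pattern from the list above.
function user_timeline_url(string $userId, string $format = 'xml'): string
{
    return 'http://twitter.com/statuses/user_timeline/'
        . rawurlencode($userId) . '.' . $format;
}

// Hypothetical helper: fills in the statusid.format pattern from the list above.
function show_status_url(string $statusId, string $format = 'xml'): string
{
    return 'http://twitter.com/statuses/show/'
        . rawurlencode($statusId) . '.' . $format;
}

echo 'GET ', user_timeline_url('dimuthu'), "\n";
echo 'GET ', show_status_url('938135815'), "\n";
```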

After having a look at this API, the first question I had was whether it is actually RESTful. In a RESTful design we expect to map a resource to a URL and perform CRUD (Create, Read, Update and Delete) operations by sending requests with different HTTP verbs (POST, GET, PUT, DELETE) to that same URL. Look at my blog on RESTful CRUD Data Services Demo for more clarification.

So if the API had been designed following the above theory, it would have looked like this.

Operation                              HTTP Request
Get all statuses                       GET http://twitter.com/statuses.xml
Get a particular user's statuses       GET http://twitter.com/users/{user_id}/statuses.xml
Get a particular status of a user      GET http://twitter.com/users/{user_id}/statuses/{status_id}.xml
Create a status for a user             POST http://twitter.com/users/{user_id}/statuses.xml
Update a particular status of a user   PUT http://twitter.com/users/{user_id}/statuses/{status_id}.xml
Delete a particular status of a user   DELETE http://twitter.com/users/{user_id}/statuses/{status_id}.xml
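
The point of this layout is that one resource URL serves every CRUD operation and only the verb changes. A rough sketch of that mapping follows. The function status_request() is a hypothetical helper of mine, and these URLs are the imagined design above, not Twitter's actual API.

```php
<?php
// Hypothetical helper: maps a CRUD operation onto a verb plus resource URL.
function status_request(string $operation, string $userId, ?string $statusId = null): string
{
    $verbs = ['create' => 'POST', 'read' => 'GET', 'update' => 'PUT', 'delete' => 'DELETE'];
    if (!isset($verbs[$operation])) {
        throw new InvalidArgumentException("Unknown operation: $operation");
    }

    $collection = 'http://twitter.com/users/' . rawurlencode($userId) . '/statuses';

    if ($operation === 'create') {
        // New statuses are posted to the collection itself.
        $url = $collection . '.xml';
    } elseif ($statusId === null) {
        // Only a read makes sense without a status id (it lists the collection).
        if ($operation !== 'read') {
            throw new InvalidArgumentException("$operation needs a status id");
        }
        $url = $collection . '.xml';
    } else {
        // Read, update and delete all address the same single-status URL.
        $url = $collection . '/' . rawurlencode($statusId) . '.xml';
    }

    return $verbs[$operation] . ' ' . $url;
}

echo status_request('read', 'dimuthu', '938135815'), "\n";
echo status_request('delete', 'dimuthu', '938135815'), "\n";
```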

So I think that although the Twitter API is really nice and easy, it is not really a RESTful API. If it were really RESTful, the URLs might have been better organized and so easier to remember or predict. Still, this API allows thousands of third-party applications to talk to Twitter, demonstrating the value of providing web services rather than just web pages on a website.

The PHP Web Services Demo Site contains a set of nice tools that help with developing web services in PHP.

  • WSDL2PHP tool - This allows you to generate PHP code for your WSDL. Note that it needs your WSDL to be at a URL that it can access.
  • PHP2WSDL tool - Here you can paste your annotated PHP code and get the WSDL (both version 1.1 and 2.0) generated. You can find the annotation syntax here.
  • DBS2PHP tool - WSO2 has a Data Services library implemented in both Java and PHP. In Java Data Services you give the configuration via an XML file (with the .dbs extension), whereas in PHP you give the configuration via simple PHP code that uses arrays to feed in the configuration parameters. If you are more familiar with writing XML than PHP, you can first write the XML and then convert it to PHP using the DBS2PHP tool.