In Apache Axis2/C, AXIOM is the basic object model used to represent XML. AXIOM provides a DOM-like API that makes it easy to traverse and build XML.

Underneath, however, AXIOM differs from DOM: it uses techniques that optimize XML parsing specifically for SOAP message processing in web services. For example, a SOAP processor can validate a SOAP message by reading only some of the SOAP header fields, and if the message is not valid, it can skip processing the body entirely. And since AXIOM is designed to be built from a stream of data retrieved from a transport, a SOAP processor can sometimes validate the message without reading the full stream.

There are likely many other applications that need this kind of optimization when parsing XML, and they can easily adopt AXIOM/C. Here is an AXIOM/C tutorial that covers both parsing and building XML with AXIOM. In this post I’d like to show some code that builds an AXIOM tree from a string (char buffer), which we call deserialization.

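    /* Builds an AXIOM tree from a NULL-terminated XML string and returns
     * its root node, or NULL on failure. */
    axiom_node_t* AXIS2_CALL
    deserialize_my_buffer (
        const axutil_env_t * env,
        char *buffer)
    {
        axiom_xml_reader_t *reader = NULL;
        axiom_stax_builder_t *builder = NULL;
        axiom_document_t *document = NULL;
        axiom_node_t *payload = NULL;

        if (!buffer)
        {
            return NULL;
        }

        /* Create a pull parser that reads from the in-memory buffer. */
        reader = axiom_xml_reader_create_for_memory (env,
            buffer, axutil_strlen (buffer),
            AXIS2_UTF_8, AXIS2_XML_PARSER_TYPE_BUFFER);

        if (!reader)
        {
            return NULL;
        }

        builder = axiom_stax_builder_create (env, reader);

        if (!builder)
        {
            /* No builder took over the reader, so free it here. */
            axiom_xml_reader_free (reader, env);
            return NULL;
        }
        document = axiom_stax_builder_get_document (builder, env);
        if (!document)
        {
            AXIS2_LOG_ERROR (env->log, AXIS2_LOG_SI,
                    "Document is null for deserialization");
            /* Release the builder and the reader attached to it. */
            axiom_stax_builder_free (builder, env);
            return NULL;
        }

        payload = axiom_document_get_root_element (document, env);

        if (!payload)
        {
            AXIS2_LOG_ERROR (env->log, AXIS2_LOG_SI,
                    "Root element of the document is not found");
            axiom_stax_builder_free (builder, env);
            return NULL;
        }
        /* Parse the rest of the input so the tree is complete
         * before the builder goes away. */
        axiom_document_build_all (document, env);

        /* Release the builder but keep the node tree for the caller. */
        axiom_stax_builder_free_self (builder, env);

        return payload;
    }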

Even though this piece of code has been used many times by Axis2 and by applications that use Axis2, it has never been identified as a core AXIOM function. I think it would be better to provide this function as an alternative way to create an AXIOM tree.

    axiom_node_t *AXIS2_CALL
    axiom_node_create_from_buffer(const axutil_env_t *env, axis2_char_t *buffer);

I have already suggested this on the Axis2/C mailing list, and hopefully it will be included in the next release.

Here, to create the AXIOM tree from the character buffer, we used the “axiom_xml_reader_create_for_memory” function. However, whenever a transport reads a data stream from the wire, it always uses the “axiom_xml_reader_create_for_io” function.

    /**
     * This create an instance of axiom_xml_reader to
     * parse a xml document in a buffer. It takes a callback
     * function that takes a buffer and the size of the buffer
     * The user must implement a function that takes in buffer
     * and size and fill the buffer with specified size
     * with xml stream, parser will call this function to fill the
     * buffer on the fly while parsing.
     * @param env environment MUST NOT be NULL.
     * @param read_input_callback() callback function that fills
     * a char buffer.
     * @param close_input_callback() callback function that closes
     * the input stream.
     * @param ctx, context can be any data that needs to be passed
     * to the callback method.
     * @param encoding encoding scheme of the xml stream
     */
    AXIS2_EXTERN axiom_xml_reader_t *AXIS2_CALL
    axiom_xml_reader_create_for_io(
        const axutil_env_t * env,
        AXIS2_READ_INPUT_CALLBACK read_callback,
        AXIS2_CLOSE_INPUT_CALLBACK close_callback,
        void *ctx,
        const axis2_char_t * encoding);

As you may have noticed, it requires us to implement a “read_callback” function. Here is an example prototype for this callback.

    int AXIS2_CALL
    some_function(
            char *buffer,
            int size,
            void *ctx);

This function will be called by the parser as required to parse the XML read from some stream.
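To make the callback contract concrete, here is a minimal sketch of one possible implementation. It is an illustration under assumptions, not code taken from Axis2/C: the `memory_stream_t` type and the `memory_stream_read` name are hypothetical, the `AXIS2_CALL` macro is left out so that the sketch compiles without the Axis2/C headers, and it assumes the parser expects the number of bytes written as the return value, with 0 meaning end of input. An in-memory cursor stands in for a real socket or file handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context type: a cursor over an in-memory XML string,
 * standing in for a socket or file handle. */
typedef struct
{
    const char *data;   /* source bytes */
    size_t length;      /* total number of bytes in data */
    size_t offset;      /* number of bytes already handed to the parser */
} memory_stream_t;

/* Same shape as the callback prototype above. Copies at most `size`
 * bytes into `buffer` and returns the number of bytes copied.
 * Assumption: returning 0 tells the parser there is no more input. */
int
memory_stream_read(char *buffer, int size, void *ctx)
{
    memory_stream_t *stream = (memory_stream_t *) ctx;
    size_t remaining;
    size_t chunk;

    if (!stream || !buffer || size <= 0)
    {
        return 0;
    }

    /* The cursor must never run past the end of the data. */
    assert(stream->offset <= stream->length);

    remaining = stream->length - stream->offset;
    chunk = remaining < (size_t) size ? remaining : (size_t) size;

    memcpy(buffer, stream->data + stream->offset, chunk);
    stream->offset += chunk;

    return (int) chunk;
}
```

With the real headers in place, a function like this would be passed as the “read_callback” argument of “axiom_xml_reader_create_for_io”, with a pointer to the `memory_stream_t` as `ctx`. The parser then pulls the XML in chunks of whatever size it asks for, which is what lets it stop early without consuming the whole stream.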

So if your application reads data from a stream, you are recommended to use this function (i.e. “axiom_xml_reader_create_for_io”) instead of “axiom_xml_reader_create_for_memory” to create the AXIOM model more efficiently.

September 30th, 2008
XPath in SimpleXML

SimpleXML, as its name implies, is a very simple API for traversing XML, implemented specifically for PHP. It serves much the same purpose as XPath, but because its syntax is more PHP-friendly, PHP developers really like to use it.
As an example, take this XML:

<dwml>
  <data>
    <location>
      <location-key>point1</location-key>
      <point latitude="37.39" longitude="-122.07"></point>
    </location>
  </data>
  .....
</dwml>

The XPath query that retrieves the latitude is:

/dwml/data/location/point/@latitude

Whereas with SimpleXML it is just a familiar PHP statement:

$simplexml->data->location->point->attributes()->latitude

Still, you can use XPath inside your SimpleXML code. You can execute XPath queries by calling the xpath function on any SimpleXMLElement. It returns an array of SimpleXMLElement objects that match your query. So for the above example, your XPath query would be something like this:

$simplexml= new SimpleXMLElement($xml);
$lats =  $simplexml->xpath('/dwml/data/location/point/@latitude');
echo $lats[0];

This simplicity allows you to switch between the two methods as best fits your application. Here are some cases where I think XPath is easier to use.

Using XPath shorthand
Take the above example XML itself. If there is only one attribute named ‘latitude’ throughout the XML, you can retrieve its value with

//@latitude

When an XML node name or attribute name contains characters like ‘-’, which are not allowed in PHP variable names
In the example, if you want to access the value inside the ‘location-key’ node using SimpleXML, you might try

echo $simplexml->data->location->location-key;

This will not give you the expected result, because PHP treats ‘location’ and ‘key’ in ‘location-key’ as two separate tokens with a minus operator between them. So this particular code can be replaced with the xpath function.

$keys =  $simplexml->xpath('/dwml/data/location/location-key');
echo $keys[0];

When you want to iterate through nodes with the same name in an XML

If the nodes we want to iterate over sit at regular positions in the XML (like the following), both approaches are equally easy to use.

<root>
  <mynode>value1</mynode>
  <mynode>value2</mynode>
  <mynode>value3</mynode>
  <mynode>value4</mynode>
  <mynode>value5</mynode>
</root>

But what if the ‘mynode’ elements were in different locations in the XML, like this?

<root>
  <anothernode>
     <mynode>value1</mynode>
  </anothernode>
  <anotheranothernode>
     <anotheranotheranothernode>
       <mynode>value2</mynode>
     </anotheranotheranothernode>
     <mynode>value3</mynode>
  </anotheranothernode>
  <mynode>value4</mynode>
</root>

You can iterate over all the ‘mynode’ nodes with the following XPath query.

//mynode

Note that this case can also be handled easily in DOM with getElementsByTagName.

To use the power of XPath functions and axes
You can use XPath functions like last() and position(), and even string manipulation functions like substring(), in an XPath statement.
For example, in the XML above, if you want only the value of the last ‘mynode’ in the document, just use this expression (the parentheses matter: without them, //mynode[last()] selects the last ‘mynode’ child of every parent)

(//mynode)[last()]

You can also use the power of axes in XPath queries. If you want to iterate over all the ancestors of the current node, just use this query

'ancestor::*'

Access elements with different namespaces

<saleItems>
   <ns1:car xmlns:ns1="http:/toyota.xxx.com">$3000</ns1:car>
   <ns2:car xmlns:ns2="http:/suziki.rrr.com">$4000</ns2:car>
</saleItems>

Suppose you want to extract the cars using SimpleXML. You can do this with the following code.

$simplexml= new SimpleXMLElement($xml);
$ns1_childs = $simplexml->children("http:/toyota.xxx.com");
echo $ns1_childs->car;

$ns2_childs = $simplexml->children("http:/suziki.rrr.com");
echo $ns2_childs->car;

Every time you access a different namespace you have to call the children method with the namespace as an argument.

If you use the XPath approach, you first register the namespaces with a prefix and then just use those prefixes in your XPath queries.

$simplexml= new SimpleXMLElement($xml);

$simplexml->registerXPathNamespace("p1", "http:/toyota.xxx.com");
$simplexml->registerXPathNamespace("p2", "http:/suziki.rrr.com");

$toyota_cars = $simplexml->xpath('//p1:car');
$suziki_cars = $simplexml->xpath('//p2:car');

echo $toyota_cars[0];
echo $suziki_cars[0];

SimpleXML is simple and powerful in its native form. But whenever it is impossible or difficult to use, you don’t need to fall back to tedious DOM or manual string manipulation. You can use XPath queries to get the work done within the SimpleXML environment itself.

In a WSDL, the XML Schema section is where the message format for each operation is defined, which eventually becomes the real API that users are interested in. It is also the trickiest part of the WSDL. Nowadays there are many tools that let you design and use WSDLs without knowing the meaning of a single line of the WSDL. But there are situations where you may find it better to have some knowledge of the XML Schema section and of WSDL overall.
For this post I’m taking a simple example of the use of nillable=”true” and minOccurs=”0”. Take the following example.

<xs:element name="myelements"