In Apache Axis2/C AXIOM is used as the basic object model to represent XML. AXIOM provide a DOM like API that allows to traverse and build the XML very easily.

Anyway in underneath, AXIOM is different from DOM, as it has used some techniques to optimize the parsing of the XML as suited specially for SOAP message processing in web services. For an example the SOAP processor can validate a SOAP message by reading only some parts of the SOAP header fields, and if it is not valid, they can completely skip processing the body part. And since AXIOM is designed to built from a stream of data retrieved from a transport, sometimes SOAP processors can validate the message without the need of reading the full stream.

Anyway there should be lot of application that needs this optimization in parsing XMLs. They can easily adapt AXIOM/C to their application. Here is an AXIOM/C tutorial that covers both parsing and building XMLs from AXIOM. In this post I’d like to mention a code that can be used to retrieve an AXIOM from a String (char buffer) which we call as deserialization.

    axiom_node_t* AXIS2_CALL
    deserialize_my_buffer (
        const axutil_env_t * env,
        char *buffer)
    {
        axiom_xml_reader_t *reader = NULL;
        axiom_stax_builder_t *builder = NULL;
        axiom_document_t *document = NULL;
        axiom_node_t *payload = NULL;

        reader = axiom_xml_reader_create_for_memory (env,
            buffer, axutil_strlen (buffer),
            AXIS2_UTF_8, AXIS2_XML_PARSER_TYPE_BUFFER);

        if (!reader)
        {
            return NULL;
        }

        builder = axiom_stax_builder_create (env, reader);

        if (!builder)
        {
            return NULL;
        }
        document = axiom_stax_builder_get_document (builder, env);
        if (!document)
        {
            AXIS2_LOG_ERROR (env->log, AXIS2_LOG_SI,
                    "Document is null for deserialization");
            return NULL;
        }

        payload = axiom_document_get_root_element (document, env);

        if (!payload)
        {
            AXIS2_LOG_ERROR (env->log, AXIS2_LOG_SI,
                    "Root element of the document is not found");
            return NULL;
        }
        axiom_document_build_all (document, env);

        axiom_stax_builder_free_self (builder, env);

        return payload;
    }

Regardless of the fact this piece of code is been used many time by Axis2 and application that uses Axis2, it has never been identified as a core AXIOM function. I think it is better we have this function as an alternative method to create an axiom.

axiom_node_t *AXIS2_CALL
axiom_node_create_from_buffer(const axutil_env_t *env, axis2_char_t *buffer);

I already suggested this in Axis2/C mailing list and hopefully it will be included from the next release.

Here when we create the axiom tree function from the character buffer, we used “axiom_xml_reader_create_for_memory” function. Anyway whenever transport read data stream from wire it always uses the “axiom_xml_reader_create_for_io” function.

    /**
     * This create an instance of axiom_xml_reader to
     * parse a xml document in a buffer. It takes a callback
     * function that takes a buffer and the size of the buffer
     * The user must implement a function that takes in buffer
     * and size and fill the buffer with specified size
     * with xml stream, parser will call this function to fill the
     * buffer on the fly while parsing.
     * @param env environment MUST NOT be NULL.
     * @param read_input_callback() callback function that fills
     * a char buffer.
     * @param close_input_callback() callback function that closes
     * the input stream.
     * @param ctx, context can be any data that needs to be passed
     * to the callback method.
     * @param encoding encoding scheme of the xml stream
     */
    AXIS2_EXTERN axiom_xml_reader_t *AXIS2_CALL
    axiom_xml_reader_create_for_io(
        const axutil_env_t * env,
        AXIS2_READ_INPUT_CALLBACK read_callback,
        AXIS2_CLOSE_INPUT_CALLBACK close_callback,
        void *ctx,
        const axis2_char_t * encoding);

As you may have noticed it requires us to implement a “read_callback” function. Here is an example function prototype to implement this callback.

    int AXIS2_CALL
    some_function(
            char *buffer,
            int size,
            void *ctx);

This function will be called by the parser as required to parse the XML read from some stream.

So if your application involves reading data from a stream you are always recommended to use this function (i.e. “axiom_xml_reader_create_for_io”) instead of “axiom_xml_read_create_for_buffer” to create the AXIOM model more effectively.

Axis2/C ADB is a C language binding to the XML schema. ADB object model represents an XML specific to a given schema in a WSDL. You can use the Axis2 codegen tool to generate ADB codes for your WSDL and use that to build and parse your XMLs. The idea is, if you use ADB to build and parse your xmls, it will be really easy to do that and you don’t need to know or understand anything about the schema or the wsdl.

Apache Axis2/C to can be used to send and receive binaries as MTOM, SWA or base64 encoded. But ADB generated code still support to send and receive base64 encoded binaries only. So if you use contract first approach  with Axis2/C (i.e start with the WSDL, then write the service based on that), you have to use base64-encoded (non-optimized) as the binary transferring method. Note that you can use the other methods like MTOM or SWA, if you write the code to build and parse the xmls from AXIOM which is a general model for XML like DOM.

Say you have an element with base64Binary type in your request XML. So the schema for that element would be,

<xs:complexType name="Person">
    <xs:sequence>
        <xs:element name="image" type="xs:base64Binary"/>
        ... <!-- some more elements -->
    </xs:sequence>
</xs:complexType>

After you code generated, you will get the adb_person.h and adb_person.c files with the following function prototypes and the implementations,

        /**
         * Constructor for creating adb_Person_t
         * @param env pointer to environment struct
         * @return newly created adb_Person_t object
         */
        adb_Person_t* AXIS2_CALL
        adb_Person_create(
            const axutil_env_t *env );

        /**
         * Free adb_Person_t object
         * @param  _Person adb_Person_t object to free
         * @param env pointer to environment struct
         * @return AXIS2_SUCCESS on success, else AXIS2_FAILURE
         */
        axis2_status_t AXIS2_CALL
        adb_Person_free (
            adb_Person_t* _Person,
            const axutil_env_t *env);

        /**
         * Getter for image.
         * @param  _Person adb_Person_t object
         * @param env pointer to environment struct
         * @return axutil_base64_binary_t*
         */
        axutil_base64_binary_t* AXIS2_CALL
        adb_Person_get_image(
            adb_Person_t* _Person,
            const axutil_env_t *env);

        /**
         * Setter for image.
         * @param  _Person adb_Person_t object
         * @param env pointer to environment struct
         * @param arg_image axutil_base64_binary_t*
         * @return AXIS2_SUCCESS on success, else AXIS2_FAILURE
         */
        axis2_status_t AXIS2_CALL
        adb_Person_set_image(
            adb_Person_t* _Person,
            const axutil_env_t *env,
            axutil_base64_binary_t*  arg_image);

So you can manipulate the person element in c language using the create, get, set and free function.

    /* here the env is the axutil_env_t* instance - the axis2/c environment */
    FILE *f = fopen("./images/person.png", "r+");
    int binary_count;
    /* binary read a function you may write to read the binary data to the
     variable binary and the count to the variable binary_count */
    unsigned char *binary = binary_read(f, env, &binary_count);
    axutil_base64_binary_t *base64 = axutil_base64_binary_create_with_plain_binary(
                                        env, binary, binary_count);

    adb_Person_t *person = adb_Person_create(env);
    adb_Person_set_image(person, env, base64);

You can set this adb person directly to your request or to a setter of another adb instance to complete the ADB tree. So this way you can send binaries (as base64 encoded) using the Axis2/C ADB generated code.

Few weeks back I wrote a blog post about Writing RESTful Services in C which explain the use of Axis2/C REST API. Basically when you provide a HTTP Method (GET, POST, PUT or DELETE) and a HTTP URL, it is matched with a given HTTP method and a URL pattern in order to identify the operation and extract out the request parameters. For the example mentioned in the above blog, we can summarize the URL mapping like this.

Operation HTTP Method URL Pattern Example Requests
getSubjects GET subjects GET subjects
getSubjectInfoPerName GET subjects/{name} GET subjects/maths
getStudnets GET students GET students
getStudnetsInfoPerName GET students/{name} GET students/john
getMarksPerSubjectPerStudent GET students/{student}/marks/{subject} GET students/john/marks/maths

You can watch an application with this URL mapping in live, written using WSF/PHP which in fact run Axis2/C algorithms underneath.

Last week I updated this REST mapping algorithm and started a discussion about the changes on Axis2/C Dev list. I thought it would be better explain the idea on by blog too.

What the early algorithm (before my changes) did was, it search each pattern in the order it was declared, and returns when a match is found. Sequential searching for a matching pattern can reduce the performance as the number of operations grows. So my solutions was to keep the url pattern in a multi level (recursive) structure and match the url from one level to another.

Here is the structure of the ‘c struct’. (defined in src/core/util/core_utils.c)

/* internal structure to keep the rest map in a multi level hash */
typedef struct {
    /* the structure will keep as many as following fields */

    /* if the mapped value is directly the operation */
    axis2_op_t *op_desc;

    /* if the mapped value is a constant, this keeps a hash map of
    possible constants => corrosponding map_internal structure */
    axutil_hash_t *consts_map;

    /* if the mapped value is a param, this keeps a hash map of
    possible param_values => corrosponding_map_internal structre */
    axutil_hash_t *params_map;

} axutil_core_utils_map_internal_t;

Here is how it will looks like when the above URL pattern set (shown in the above table)