Organic Reaction Simulator

Organic Reaction Simulator is a tool that simulates organic reactions and automatically generates IUPAC names for the organic compounds you draw. It covers most of the Organic Chemistry syllabus for the G.C.E. A/L examination.

Features

  • A canvas for drawing your organic compounds.
  • Generates IUPAC names for the drawn compound wherever possible. (It generates IUPAC names for almost all the compounds that the G.C.E. A/L examination expects students to know.)
  • Simulates reactions of organic compounds with your chosen inorganic/organic compounds under the selected conditions. (168 reactions are currently supported.)
  • New IUPAC naming rules can be added without recompiling the code.
  • Custom reactions can be added too, again without compiling a single line of code.

Here is a screenshot of the main panel when the ‘Aldol reaction’ is simulated for some simple reactants.

Aldole Reaction - Reaction Simulator

The Drawing Panel

The ‘Reaction Simulator’ app provides two views. One is the main panel, shown in the figure above, which displays and manages reactions. The other is the drawing panel, which is not much different from your favorite drawing application.

Drawing Panel - Reaction Simulator

Here you can select elements and bond types to draw your organic compound. It also has tools to move or delete selected parts of your drawing.

As you may have already noticed, you don’t need to complete all the bonds for a particular element. If you leave one end of a single bond unconnected, it is automatically connected to an ‘H’ (Hydrogen) at rendering. Similarly, the default element for an open double bond is ‘O’ (Oxygen), and for an open triple bond it is ‘N’ (Nitrogen). You don’t even need to draw every bond of an element: if you connect only one bond to a ‘C’, the other three possible bonds are treated as connected to ‘H’s.
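The defaulting rule fits in a few lines of C. This is only a sketch of the idea, not the simulator’s actual code (which is C++): the function names and the encoding of bond types as 1, 2 and 3 are my own assumptions for illustration.

```c
#include <assert.h>

/* Assumed encoding of bond types: 1 = single, 2 = double, 3 = triple. */
char default_atom_for_open_bond(int bond_order)
{
    assert(bond_order >= 1 && bond_order <= 3);
    switch (bond_order) {
        case 1:  return 'H';  /* open single bond -> Hydrogen */
        case 2:  return 'O';  /* open double bond -> Oxygen   */
        default: return 'N';  /* open triple bond -> Nitrogen */
    }
}

/* Carbon forms four bonds in total; whatever the drawing leaves unused
   is treated as bonds to hydrogen. `used_valence` is the sum of the
   bond orders already drawn on this carbon. */
int implicit_hydrogens_on_carbon(int used_valence)
{
    assert(used_valence >= 0 && used_valence <= 4);
    return 4 - used_valence;
}
```

So a ‘C’ with only one single bond drawn gets three implicit hydrogens, and a bare ‘C’ gets four, which is how a lone ‘C’ can stand for methane.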

Try the drawing panel yourself to see how user friendly it is. Once you finish drawing your compound, just press “RETURN”.

IUPAC Name Generator - Reaction Simulator

Yes, it renders our compound nicely, and if you look at the bottom, it has also generated the IUPAC name for the compound.

The Main Panel

This contains a canvas where you can add, edit and delete organic compounds (which takes you to the ‘Drawing Panel’), sidebar panels to select the inorganic compounds and additional conditions required for a reaction, and a set of buttons to start reactions and manage the result set. At the end of the sidebar there is a ‘help’ button that takes you to the user guide.

Extending IUPAC Naming Rules

If you go to the “Data/IUPAC/” directory (you can browse it in SVN at http://svn.dimuthu.org/organic_chemistry/Organic_Reaction_Simulator/Data/IUPAC/), you will find a set of .txt files (in fact XML files) that define the IUPAC rules.

If you open one of them (we will take al.txt), it looks something like this:

<IUPAC basename="al" subname="oxo" level="6" nonumber="true" affectto="1">
	<bond type="2">
		<element type="7">
		</element>
	</bond>
	<bond type="1">
		<element type="1">
		</element>
	</bond>
</IUPAC>

These are the naming rules for aldehyde compounds (H-C=O). The rule first identifies the aldehyde by checking for a double bond to ‘O’ (element type ‘7’) and a single bond to ‘H’ (element type ‘1’). If the group is found, it is named ‘al’ or ‘oxo’, depending on whether or not it is the main group of the compound.
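To make the matching idea concrete, here is a rough C sketch of how such a rule could be checked against one carbon. It is not the simulator’s real code: the types bond_t and iupac_rule_t and the function iupac_rule_name are hypothetical names of mine, and the other attributes of the rule (level, nonumber, affectto) are not modeled at all.

```c
#include <assert.h>
#include <stddef.h>

/* One bond from the carbon under test to a neighbouring element. */
typedef struct {
    int bond_type;     /* 1 = single, 2 = double, 3 = triple */
    int element_type;  /* element code as in the rule file: 1 = H, 7 = O */
} bond_t;

/* A naming rule: every required bond must be present on the carbon. */
typedef struct {
    const char *basename;   /* used when the group is the main group */
    const char *subname;    /* used when it is not */
    const bond_t *required;
    size_t required_count;
} iupac_rule_t;

/* Returns 1 if `bonds` contains a bond equal to `wanted`. */
static int has_bond(const bond_t *bonds, size_t count, bond_t wanted)
{
    for (size_t i = 0; i < count; i++) {
        if (bonds[i].bond_type == wanted.bond_type &&
            bonds[i].element_type == wanted.element_type) {
            return 1;
        }
    }
    return 0;
}

/* Returns the name fragment the rule contributes for this carbon,
   or NULL if the carbon does not carry the group. */
const char *iupac_rule_name(const iupac_rule_t *rule, const bond_t *bonds,
                            size_t count, int is_main_group)
{
    assert(rule != NULL && bonds != NULL);
    for (size_t i = 0; i < rule->required_count; i++) {
        if (!has_bond(bonds, count, rule->required[i])) {
            return NULL;
        }
    }
    return is_main_group ? rule->basename : rule->subname;
}
```

With the al.txt rule expressed as the two required bonds {2, 7} and {1, 1}, a carbon that carries both gets ‘al’ when it is the main group and ‘oxo’ otherwise, while a carbon with only a single bond to ‘O’ gets no name from this rule.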

If you look through all these rules, you will get a good idea of what each piece of syntax means. You may even add your own rules if you find something missing or incorrect.

Extending the Reactions

The reaction rules are stored in the “Data/Reactions/” directory (http://svn.dimuthu.org/organic_chemistry/Organic_Reaction_Simulator/Data/Reactions/).

I will take the first reaction we studied in the ‘Organic Chemistry’ Class.

CH4 + 4Cl2 + hv (dim light) ----> CCl4 + 4HCl

Here is the rule defining that reaction (cl2hv.txt).

<reaction inorganics="Cl2" conditions="hv">
	<check>
		<bond type="1">
			<element type="H">
			</element>
		</bond>
	</check>

	<reordering activity="remain">
		<bond type="1">
			<check>
				<element type="H">
				</element>
			</check>
			<reordering activity="remain">
				<element type="H">
					<reordering activity="replace" to="Cl">
					</reordering>
				</element>
			</reordering>
		</bond>
	</reordering>
</reaction>

If you take a close look, what it says is this: if the inorganic ‘Cl2’ and the condition ‘hv’ are present (reaction inorganics="Cl2" conditions="hv"), check for an ‘H’ element (element type="H") connected to a ‘C’ element (which is the root of the XML) through a single bond (bond type="1"). Then don’t replace the ‘C’ (reordering activity="remain") and don’t replace the bond type (reordering activity="remain"); just replace the ‘H’ with ‘Cl’ (reordering activity="replace" to="Cl"). It is as simple as that.

These rules are applied to all the C elements in reactants. Look at the following figure for this rule in application.
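In plain C, the effect of this particular rule could be sketched roughly as below. This is my own illustration and not how the simulator is implemented (it interprets the XML rule generically): the types carbon_t and substituent_t and both function names are hypothetical, and every singly bonded ‘H’ is replaced, matching the CH4 to CCl4 example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BONDS 4

/* The atom at the far end of one bond of a carbon. */
typedef struct {
    int bond_type;    /* 1 = single, 2 = double, 3 = triple */
    char element[3];  /* element symbol, e.g. "H" or "Cl" */
} substituent_t;

typedef struct {
    substituent_t bonds[MAX_BONDS];
    size_t bond_count;
} carbon_t;

/* Apply the rule to one carbon: keep the carbon and the bond type,
   replace every singly bonded H with Cl. Returns the number replaced. */
size_t chlorinate_carbon(carbon_t *c)
{
    size_t replaced = 0;
    assert(c != NULL && c->bond_count <= MAX_BONDS);
    for (size_t i = 0; i < c->bond_count; i++) {
        substituent_t *s = &c->bonds[i];
        if (s->bond_type == 1 && strcmp(s->element, "H") == 0) {
            strcpy(s->element, "Cl");
            replaced++;
        }
    }
    return replaced;
}

/* The rule is applied to all the C elements of the reactant. */
size_t chlorinate_all(carbon_t *carbons, size_t count)
{
    size_t replaced = 0;
    assert(carbons != NULL);
    for (size_t i = 0; i < count; i++) {
        replaced += chlorinate_carbon(&carbons[i]);
    }
    return replaced;
}
```

For methane, a single carbon with four singly bonded ‘H’s, chlorinate_all replaces all four and leaves a carbon bonded to four ‘Cl’s.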

Alkane + Cl2 Reaction - Reaction Simulator

These reaction rules can be as complex as you want. In particular, when two or more reactants are involved in a reaction, the rules defining it get a little complex. Look at how the aldol reaction (applied in the first figure of this post) is written in aldole.txt.

There are altogether 168 reactions currently defined in this way. If you find that some reaction is missing, feel free to add another rule file defining it. And if you want to share it with others, just let me know :) so I can integrate it into the distribution.

Download

This software application has not had an official release yet. I thought I would use this post to do a pre-release of the ‘Reaction Simulator’. You can download the Windows binary of the pre-release version (343KB) from http://downloads.dimuthu.org/bins/chemistry/reaction_simulator_pre_release/reaction_simulator_pre_release.zip

Source Code

The SVN location for the source code of ‘Reaction Simulator’ is http://svn.dimuthu.org/organic_chemistry/Organic_Reaction_Simulator/

Known Limitations – Possible Improvements

  • IUPAC name generation and reaction simulation are not supported for compounds that involve a benzene ring.
  • IUPAC name generation is not supported for cyclic compounds, which are not part of the G.C.E. A/L syllabus anyway.
  • The set of elements available for drawing compounds, and the sets of inorganic compounds and conditions available for reactions, are fixed and cannot be extended without changing the code and recompiling.
  • You can’t start with an IUPAC name and derive the compound. Currently you always have to draw the compound first and then generate the IUPAC name.
  • Windows only!

A Little Background

Four years ago, when I was a level 2 student at the University of Moratuwa, I participated in a competition for making educational software tools organized by C.S.E and N.I.E (National Institute of Education). I submitted a software application that teaches Organic Chemistry. It consisted of interactive tutorials targeting the local G.C.E. A/L exams, with exercises and revisions in both Sinhala (my mother tongue) and English (not in Unicode though). I won second prize for that.

After the award I decided to improve my software application by adding an ‘Organic Reaction Simulator’. In fact, I managed to complete it during the two-week New Year vacation.

Although the N.I.E was supposed to distribute the applications submitted to the competition throughout the country, that didn’t happen. So I decided to publish at least the ‘Reaction Simulator’ application, which I didn’t actually submit to the competition.

Technologies Used

This is written completely in C++. It doesn’t use MFC, because I thought MFC was too heavy for such a small application. I used a small, lightweight image library, taken with some custom changes from a Windows game programming book.

It uses XML to load the data about reactions and IUPAC names, which I described in detail earlier in this post. It uses a small built-in XML parser (just one recursive function) to parse these XML files.

So it has no dependencies on third-party libraries. You can just check out the source code from SVN, open the Visual Studio project and compile it (press ‘F7’).

A few weeks back I wrote a blog post about Writing RESTful Services in C, which explains the use of the Axis2/C REST API. Basically, when a request arrives with an HTTP method (GET, POST, PUT or DELETE) and a URL, they are matched against the HTTP method and URL pattern declared for each operation, in order to identify the operation and extract the request parameters. For the example in that post, we can summarize the URL mapping like this.

Operation                    | HTTP Method | URL Pattern                        | Example Request
-----------------------------+-------------+------------------------------------+-------------------------------
getSubjects                  | GET         | subjects                           | GET subjects
getSubjectInfoPerName        | GET         | subjects/{name}                    | GET subjects/maths
getStudnets                  | GET         | students                           | GET students
getStudnetsInfoPerName       | GET         | students/{name}                    | GET students/john
getMarksPerSubjectPerStudent | GET         | students/{student}/marks/{subject} | GET students/john/marks/maths

You can see an application with this URL mapping live, written using WSF/PHP, which in fact runs the Axis2/C algorithms underneath.

Last week I updated this REST mapping algorithm and started a discussion about the changes on the Axis2/C dev list. I thought it would be good to explain the idea on my blog too.

The earlier algorithm (before my changes) searched each pattern in the order it was declared and returned when a match was found. Sequentially searching for a matching pattern can hurt performance as the number of operations grows. So my solution was to keep the URL patterns in a multi-level (recursive) structure and match the URL one level at a time.

Here is the C struct (defined in src/core/util/core_utils.c).

/* internal structure to keep the rest map in a multi level hash */
typedef struct {
    /* the structure keeps as many of the following fields as apply */

    /* if the mapped value is directly the operation */
    axis2_op_t *op_desc;

    /* if the mapped value is a constant, this keeps a hash map of
    possible constants => corresponding map_internal structure */
    axutil_hash_t *consts_map;

    /* if the mapped value is a param, this keeps a hash map of
    possible param_values => corresponding map_internal structure */
    axutil_hash_t *params_map;

} axutil_core_utils_map_internal_t;

Here is how it looks when the URL pattern set from the table above is kept inside this multi-level (recursive) structure.

svc->op_rest_map  (hash)
                |
            "GET:students" --------- axutil_core_utils_map_internal_t (instance)
                |                                            |
                |                                        op_desc (axis2_op_t* for "GET students" op)
                |                                            |
                |                                        consts_map (empty hash)
                |                                            |
                |                                        params_map (hash)
                |                                                         |
                |                                                      "{student_id}" ------------- axutil_core_utils_map_internal_t (instance)
                |                                                                                            |
                |                                                                                op_desc (axis2_op_t* for "GET students/{student_id}" op)
                |                                                                                            |
                |                                                                                params_map (empty hash)
                |                                                                                            |
                |                                                                                 consts_map (hash)
                |                                                                                            |
                |                                                                                        "marks" ------------------- axutil_core_utils_map_internal_t (instance)
                |                                                                                                                            |
                |                                                                                                                    op_desc (NULL)
                |                                                                                                                            |
                |                                                                                                                   consts_map (empty hash)
                |                                                                                                                            |
                |                                                                                                                   params_map (hash)
                |                                                                                                                            |
                |                                                                                                                      "{subject_id}" ----------- axutil_core_utils_map_internal_t (instance)
                |                                                                                                                                                                               |
                |                                                                                                                                       op_desc (axis2_op_t* for "GET students/{student_id}/marks/{subject_id}" op)
                |                                                                                                                                                                               |
                |                                                                                                                                                                 consts_map / params_map (Both NULL)
                |
            "GET:subjects" --------- axutil_core_utils_map_internal_t (instance)
                                                            |
                                                        op_desc (axis2_op_t* for "GET subjects" op)
                                                            |
                                                        consts_map (empty hash)
                                                            |
                                                        params_map (hash)
                                                            |
                                                      "{subject_id}" ------------- axutil_core_utils_map_internal_t (instance)
                                                                                                          |
                                                                                  op_desc (axis2_op_t* for "GET subjects/{subject_id}" op)
                                                                                                          |
                                                                                             consts_map / params_map (Both NULL)

This structure is built when the server initializes the services (in the “axis2_svc_get_rest_op_list_with_method_and_location” function in src/core/description/svc.c).

As each request hits the service, the request’s HTTP method and URL are matched against the above structure (we call this ‘REST dispatching’) using the following algorithm, defined in the “axis2_rest_disp_find_op” function in src/core/engine/rest_disp.c. Note that the real code also extracts the user’s REST parameters along the way, but that is not shown here.

  1. The request URL is split into URL components at the ‘/’ character. Retrieve the instance of axutil_core_utils_map_internal_t from svc->op_rest_map into the variable ‘mapping_struct’.
  2. Check whether any URL components remain, i.e. count(URL components) > 0.
  3. If no URL components remain, take the value of mapping_struct->op_desc, where mapping_struct is the current instance of axutil_core_utils_map_internal_t. If mapping_struct->op_desc is not NULL, we have found the operation. If it is NULL, exit returning NULL.
  4. Otherwise, if some URL components remain, look up the first URL component in the mapping_struct->consts_map hash. If mapping_struct->consts_map['first_url_component'] is not NULL, assign that value to mapping_struct and repeat from step 2 with the remaining URL components. (Here hash['key'] denotes the value stored for the key ‘key’ in the hash ‘hash’.) If that finds an operation, we are done; if not, continue to step 5.
  5. If mapping_struct->consts_map['first_url_component'] is NULL, or the constant branch did not lead to an operation, match the first URL component against each key (which is a URL component pattern) in the mapping_struct->params_map hash. (The function “axis2_core_utils_match_url_component_with_pattern” in src/core/util/core_utils.c matches a URL component against a URL component pattern.) If a matching pattern is found, assign mapping_struct->params_map['key'] to mapping_struct and repeat from step 2 with the remaining URL components. If that finds an operation for some key, it is the matching operation.
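The steps above can be sketched in self-contained C as follows. Please read it as an illustration of the idea rather than the Axis2/C source: rest_node_t, rest_edge_t, rest_find_op and matches_pattern are hypothetical names, small arrays with a linear scan stand in for the axutil_hash_t maps, operations are plain strings, and the pattern matcher simply accepts any non-empty component for a "{...}" key. I also assume the HTTP method has already been used to pick the root node, and parameter extraction is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 8

typedef struct rest_node rest_node_t;

typedef struct {
    const char *key;     /* constant text, or a pattern such as "{name}" */
    rest_node_t *child;  /* next level of the structure */
} rest_edge_t;

struct rest_node {
    const char *op_desc;               /* operation ending here, or NULL */
    rest_edge_t consts[MAX_CHILDREN];  /* stand-in for consts_map */
    size_t const_count;
    rest_edge_t params[MAX_CHILDREN];  /* stand-in for params_map */
    size_t param_count;
};

/* Hypothetical matcher: a "{...}" pattern accepts any non-empty component. */
static int matches_pattern(const char *component, size_t len, const char *pattern)
{
    size_t plen = strlen(pattern);
    (void)component;
    return len > 0 && plen >= 2 && pattern[0] == '{' && pattern[plen - 1] == '}';
}

/* Walks `url` one '/'-separated component at a time (steps 2 to 5). */
const char *rest_find_op(const rest_node_t *node, const char *url)
{
    assert(node != NULL && url != NULL);
    while (*url == '/') {
        url++;                           /* skip separators */
    }
    if (*url == '\0') {
        return node->op_desc;            /* step 3: no components left */
    }

    size_t len = strcspn(url, "/");      /* length of the first component */
    const char *rest = url + len;        /* remaining components */

    /* step 4: try the constants first */
    for (size_t i = 0; i < node->const_count; i++) {
        const char *key = node->consts[i].key;
        if (strlen(key) == len && strncmp(key, url, len) == 0) {
            const char *op = rest_find_op(node->consts[i].child, rest);
            if (op != NULL) {
                return op;
            }
        }
    }
    /* step 5: fall back to the parameter patterns */
    for (size_t i = 0; i < node->param_count; i++) {
        if (matches_pattern(url, len, node->params[i].key)) {
            const char *op = rest_find_op(node->params[i].child, rest);
            if (op != NULL) {
                return op;
            }
        }
    }
    return NULL;
}
```

Trying constants before parameter patterns means that a literal component such as ‘marks’ wins over a ‘{...}’ pattern at the same level, and the fallback to step 5 only happens when the constant branch leads nowhere.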

The earlier algorithm, by contrast, can be simplified to:

  1. Match the request URL against the URL pattern of each operation. This is like calling the function “axis2_core_utils_match_url_component_with_pattern” (mentioned in step 5 of the above algorithm) on the complete URL rather than on a single URL component.
  2. If a pattern matches, the corresponding operation is selected for the request.
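For comparison, the sequential approach can be sketched like this. Again it is only an illustration under my own assumptions: rest_op_t, url_matches and find_op_sequential are hypothetical names, operations are plain strings, and the matcher is a simplified one in which a "{...}" placeholder matches exactly one non-empty URL component.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *pattern;  /* e.g. "students/{name}" */
    const char *op_name;
} rest_op_t;

/* Simplified whole-URL matcher: "{...}" matches one non-empty component,
   every other character must match literally. */
static int url_matches(const char *url, const char *pattern)
{
    while (*pattern != '\0') {
        if (*pattern == '{') {
            const char *close = strchr(pattern, '}');
            size_t len = strcspn(url, "/");
            if (close == NULL || len == 0) {
                return 0;
            }
            url += len;           /* consume the parameter value */
            pattern = close + 1;  /* consume the placeholder */
        } else if (*pattern == *url) {
            pattern++;
            url++;
        } else {
            return 0;
        }
    }
    return *url == '\0';
}

/* Tries each operation in declaration order; the first match wins. */
const char *find_op_sequential(const rest_op_t *ops, size_t count, const char *url)
{
    assert(ops != NULL && url != NULL);
    for (size_t i = 0; i < count; i++) {
        if (url_matches(url, ops[i].pattern)) {
            return ops[i].op_name;
        }
    }
    return NULL;
}
```

Every request walks the operation list from the top, so an operation declared late in the list pays for a full pattern match against every operation declared before it.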

I roughly calculated the time complexity of both of these algorithms.

Here is the time complexity of the earlier, sequential algorithm.

Average cost of iterating over ‘n’ operations: n/2 = O(n)
Cost of matching a pattern against a URL of length ‘p’ (the complexity of the ‘axis2_core_utils_match_url_component_with_pattern’ function): O(p^2)
Complete time complexity of the algorithm: O(n*p^2)

Time complexity of the newer, multi-level algorithm (which is currently in SVN).

Time complexity of a hash search: O(1)
Average number of hash searches required, which is the average number of levels in the tree of recursive structures drawn above: log(n)/2 = O(log(n))
Cost of matching a pattern against a URL component of average length ‘d’, d < p (p = the length of the complete URL): O(d^2)
Number of times pattern matching is required = number of param components in the URL = k, with k < p/d (p = the length of the URL, d = the average length of a URL component): O(k)
Complete time complexity of the algorithm: O(log(n)*d^2*k)

Considering that O(log(n)) < O(n), d < p and k < p/d, we can safely conclude

O(log(n)*d^2*k) < O(n*p^2)  => The newer algorithm has the better (lower) time complexity.

However, the time complexity argument is valid only for large values of the parameters. For small values the newer algorithm can actually take longer, because of the constant overhead of the recursion and the hash searches. So in order to judge the performance of the algorithm, we have to run some test cases and measure the actual times. Possibly a task for the weekend :)


© 2007 Dimuthu’s Blog