D.I.Y. messaging queues are not a good idea
Written by Matthew Cooper on April 04, 2013
At my day job we use an e-commerce system developed in-house that pushes operational data to the Netsuite ERP system, where the logistics and accounting teams do their thing. This push of information happens pretty much transparently and in real time. Working out the best way to synchronize information between the two systems, however, took a lot of trial and error, and I think my experience makes for an interesting read on how NOT to integrate with Netsuite. I will disclose some of the complications I faced, my initial workarounds, and why I think a message queue is a necessity for logistical success at any serious e-commerce website connecting to an external ERP.
Cron jobs are only as good as the programs they execute
Due to time constraints and a lack of access to a sandbox environment during development, I had only tested the communication between the website and Netsuite locally – I had not done any kind of stress testing. Because of this I had no idea that the standard Netsuite web service is slow and only handles one request per user at a time. I found this out the hard way when we launched the integration and had massive amounts of visitors. Things went downhill fast. When you attempt to make more than one SOAP call with the same user credentials at the same time in Netsuite, only the first request goes through; the rest are interrupted. This quickly sucked me into the 7th circle of programmer hell!
Overwhelmed by this limitation, I came up with a simple queuing system which allowed me to run several SOAP users at the same time. It involved a MySQL table that stored a queue of information to be pushed to Netsuite, and a cron job which executed every 5 minutes and ran a script that randomly divided the queue into segments and pushed the data with one of several available SOAP users. Once pushed, each record was marked as processed.
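For illustration, here is a minimal sketch of that kind of D.I.Y. database-backed queue. It uses SQLite as a stand-in for the MySQL table, the table and column names are hypothetical, and the actual SOAP push is stubbed out:

```python
import sqlite3

# In-memory SQLite standing in for the MySQL queue table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE netsuite_queue (
    id        INTEGER PRIMARY KEY,
    payload   TEXT NOT NULL,
    processed INTEGER NOT NULL DEFAULT 0)""")
conn.executemany("INSERT INTO netsuite_queue (payload) VALUES (?)",
                 [("order-1001",), ("order-1002",), ("order-1003",)])

def run_cron_batch(conn, batch_size=2):
    """One cron-job iteration: grab a slice of unprocessed rows,
    'push' them, then mark them as processed."""
    rows = conn.execute(
        "SELECT id, payload FROM netsuite_queue WHERE processed = 0 LIMIT ?",
        (batch_size,)).fetchall()
    for row_id, payload in rows:
        # push_to_netsuite(payload)  # the SOAP call would go here
        conn.execute("UPDATE netsuite_queue SET processed = 1 WHERE id = ?",
                     (row_id,))
    return len(rows)
```

Note that nothing here stops two overlapping cron iterations from selecting the same unprocessed rows – which is exactly the duplication problem described below.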
El barato sale caro (cheap turns out expensive)
Almost immediately I ran into bigger problems. Pushing sales data from the website to Netsuite was taking so long that often, before one request had finished processing an order, the cron job had spawned another PHP process which began processing the same information again, resulting in duplicate sales orders. My first reaction was to blame Netsuite for being a certified piece of crap for allowing this inconsistent behavior, but actually the problem was not Netsuite's fault at all.
By this point the code was evolving into an unmaintainable mess; at times even I didn't understand what it was doing. Either way, we continued a little longer down the path of least resistance and created another workaround for the duplication issue. This time the fix came in the form of a file-locking mechanism. When one SOAP user was processing a collection of records, an empty file was created. This empty file told the next iteration of the cron job to exit, on the assumption that another process was already running. Once the original cron job finished, the lock was deleted, allowing the next iteration to begin processing a new collection of records. If, for whatever reason, the lock file wasn't deleted cleanly after the original process finished, the queue deadlocked and nothing was pushed at all. This simple I/O problem is a stressful situation for mission-critical websites which depend on real-time synchronization.
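The lock mechanism itself is only a few lines; here is a sketch of the idea in Python (the lock path and function names are my own invention). The failure mode is visible right in the code: if the process dies between acquiring and releasing the lock, the stale file shuts out every future cron run.

```python
import os

LOCK_PATH = "/tmp/netsuite_queue.lock"  # hypothetical location

def acquire_lock(path=LOCK_PATH):
    """Atomically create the lock file.

    O_CREAT | O_EXCL makes creation fail if the file already
    exists, so only one process can ever win the race.
    Returns True if we got the lock, False if another run holds it.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

def release_lock(path=LOCK_PATH):
    """Delete the lock file. If this never runs (crash, kill -9),
    every subsequent cron iteration exits immediately: deadlock."""
    os.remove(path)
```

A typical cron iteration would call `acquire_lock()`, exit if it returns False, and otherwise process a batch and call `release_lock()` in a finally block.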
Save yourself a headache and just use a tested Messaging Queue.
Home made solutions such as I described above are simply not adequate. Theoretically even you are able to perfect a D.I.Y. queuing system, relying on cron jobs will mean that your system will never able to take advantage of “lost” processing time during intervals in which the cron job is “dormant”. There are many unseen complexities involved, which is why in the end I realized that a tested message broker was necessary. While are loads of open source messaging queues out there, I chose to implement RabbitMQ which is an implementation of the Advanced Messaging Queue Protocol written the Erlang programming language. It runs in the background of your website operating system as a daemon waiting for you to push information to it and queuing it for processing in any way you can think of. There are many extensions in several popular languages, which interface with Rabbit quite easily.
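As a rough sketch of what publishing to Rabbit looks like – shown here with Python's pika client rather than the PHP extension I used, and with a hypothetical queue name – a durable queue plus persistent messages is what keeps data safe when connectivity drops:

```python
import json

def publish_order(order, queue="netsuite_push", host="localhost"):
    """Publish one order to a durable RabbitMQ queue.

    Queue name and host are illustrative assumptions.
    """
    import pika  # deferred so the sketch can be read without a running broker
    conn = pika.BlockingConnection(pika.ConnectionParameters(host))
    ch = conn.channel()
    # durable=True: the queue itself survives a broker restart
    ch.queue_declare(queue=queue, durable=True)
    ch.basic_publish(
        exchange="",
        routing_key=queue,
        body=json.dumps(order),
        # delivery_mode=2: persist the message to disk
        properties=pika.BasicProperties(delivery_mode=2),
    )
    conn.close()
```

A consumer process then reads from the same queue at its own pace, so there are no cron intervals and no duplicate batches to worry about.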
I used the PHP AMQP extension. Since I prefer to work at a higher level of abstraction, I developed a simple Zend Framework module to interact with Rabbit following the Zend_Queue structure. Scouring github.com you should be able to find similar implementations for different frameworks and languages, which work out of the box.
Since implementing RabbitMQ, sales are pushed to Netsuite a lot faster than with the cron job solution. Another plus is that accidental sales order duplicates are a thing of the past, and if there is a server connectivity issue the queue is not lost – messages are stored and consumed later. This is a very valuable property when dealing with SaaS-type architectures. Later I plan to use node.js to help visualize RabbitMQ through a browser, so that communication issues can be identified easily by anyone.
I hope you found this article interesting; if you did I would love to hear from you.
Follow me on Twitter @debugthat. Thanks!