Dec 11, 2015
 

A friend of mine who works for a seismic processing startup was chatting with me about a problem he had with sending commands to various machines that are supposed to process some data. Essentially, he wanted the ability to queue work for those machines. Since the nodes don’t need to communicate with each other, it’s not necessary to use anything fancy like MPI, but he didn’t have any built-in mechanism to manage jobs. He was able to accomplish the task with a clever, though crude, combination of bash, cron, and top. Of course, curiosity got the best of me: I wondered how this might be implemented, and shortly thereafter a Python cluster manager was born.

Since I’ve been focused on C++ and CUDA lately, I thought it would be a nice refresher to write a small Python package that provides a more robust solution to this scheduling problem. It gave me a chance to review some simple threading and socket communication using the Python standard library, along with experimenting with a couple of other packages. Moreover, the more I worked on it, the more I wanted to do with it. I only wanted to dedicate one weekend of casual coding, but this took about three such weekends to put together. I give you clustermuster:

https://github.com/bee-rock/clustermuster
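For a rough idea of what the scheduling problem looks like, here is a minimal sketch using only the standard library. It is not taken from clustermuster; the names `run_jobs`, `node_worker`, and `run_locally` are made up for illustration, and the default runner executes commands on the local machine rather than over ssh. Each node gets one worker thread that pulls the next command from a shared queue, so a node only picks up new work once it has finished its previous job.

```python
import queue
import subprocess
import threading


def run_locally(node, command):
    """Stand-in runner (assumption): runs the command on this machine.

    A real manager would execute it on `node`, for example over ssh.
    """
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def node_worker(node, jobs, results, lock, runner):
    """Run queued commands one at a time on behalf of a single node."""
    while True:
        command = jobs.get()
        if command is None:  # sentinel: no more work for this worker
            break
        returncode, output = runner(node, command)
        with lock:  # protect the shared results list
            results.append((node, command, returncode, output))


def run_jobs(nodes, commands, runner=run_locally):
    """Distribute `commands` across `nodes`, one running job per node."""
    jobs = queue.Queue()
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=node_worker, args=(node, jobs, results, lock, runner))
        for node in nodes
    ]
    for thread in threads:
        thread.start()
    for command in commands:
        jobs.put(command)
    for _ in threads:  # one sentinel per worker so every thread exits
        jobs.put(None)
    for thread in threads:
        thread.join()
    return results
```

Calling `run_jobs(["node1", "node2"], ["echo a", "echo b", "echo c"])` would run the three commands with at most two in flight at once. Which node ends up running which command depends on thread timing, so the order of the results is not fixed.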

Though the server itself requires SSH authentication with its various nodes, I haven’t included a secure authentication mechanism for sending commands to the manager, apart from requiring that each command match an appropriate schema. Using it within your own network would be okay, provided you trust everyone on your network. If you plan on doing something like this in production, I’m sure there are libraries out there that would accommodate your needs. To be honest, the socket module in the Python standard library is very easy to use incorrectly. Though it’s a great exercise to figure out how to use it and to play with it, the next time I write an application using sockets in Python, I would certainly consider using the Tornado framework, http://www.tornadoweb.org/en/stable/.
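As one example of how sockets can bite (not necessarily the issue I ran into), a single `recv` call on a TCP socket may return only part of a message, so the receiver has to loop until it has everything. The sketch below shows one common way to handle that: prefix each JSON message with a 4-byte length and then check the decoded message against a very small schema. The field names in `REQUIRED_FIELDS` and the helper names are hypothetical and are not clustermuster's actual protocol.

```python
import json
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix

# Hypothetical schema: field name -> required type.
REQUIRED_FIELDS = {"command": str, "node": str}


def recv_exactly(sock, n):
    """Keep calling recv until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:  # empty bytes means the peer closed the connection
            raise ConnectionError("connection closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send payload as length-prefixed JSON; sendall retries partial sends."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(HEADER.pack(len(body)) + body)


def recv_message(sock):
    """Read one length-prefixed JSON message."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return json.loads(recv_exactly(sock, length).decode("utf-8"))


def is_valid(message):
    """Check that the message is a dict with the required typed fields."""
    if not isinstance(message, dict):
        return False
    return all(
        isinstance(message.get(name), expected)
        for name, expected in REQUIRED_FIELDS.items()
    )
```

A schema check like this only rejects malformed requests. It does nothing to authenticate the sender, which is why the trusted-network caveat above still applies.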

I’ll likely do a demo with it in an upcoming post.
