Self-Starting Singleton Scripts

Stupid Python Tricks, No. 3453

I wrote some nonsense a while back about how to make a Python script run one instance of itself only.  This works very well for me, with scripts being (re)started at regular intervals by cron jobs, the new instances detecting that there’s an old instance and dying off.  But it’s sometimes a bit of a pain to have to manually kill off a task when I’ve updated the actual Python script.  I don’t want to arbitrarily kill it, because it might be in the middle of something that I want to run to completion – a lot of the database stuff that goes on is time-consuming and whilst a kill will roll back anything done so far, that’s not efficient.

So a few minutes’ thought brought this to light (this includes the run-one-of-me-only code):

import os
import sys
import time

#Make a note of the time we start running.
startTime = int(time.time())

#Get the current PID of this process
pid = os.getpid()

#Use an external command to see if any other process is running this script
#Derive the scriptname from sys.argv.  Note that if there are scripts with
#the same name in different directories, this check will fail.
scriptName = os.path.basename(sys.argv[0])
ps = os.popen('ps --no-headers -elf|fgrep %s' % scriptName)
psl = ps.readlines()
ps.close()
for p in psl:
        #Only consider Python processes that are running this script
        if p.find('python')>-1 and p.find(scriptName)>-1:
                #The PID is the fourth column of ps -elf output
                otherpid = int(p.split()[3])
                #Another instance is already running, so bow out
                if otherpid != pid: sys.exit()
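If you’d rather be able to test the line-matching without actually running ps, the same check can be pulled out into a function.  This is only a sketch: the name other_instance_pids and the sample lines are made up for illustration, and it assumes the ps -elf layout in which the PID is the fourth column.

```python
def other_instance_pids(ps_lines, script_name, own_pid):
    """Return PIDs of other Python processes that are running script_name.

    ps_lines is assumed to be 'ps -elf' style output, where the PID is
    the fourth whitespace-separated column.
    """
    pids = []
    for line in ps_lines:
        fields = line.split()
        # Skip anything that doesn't look like a ps -elf process line.
        if len(fields) < 4 or not fields[3].isdigit():
            continue
        if 'python' in line and script_name in line:
            other_pid = int(fields[3])
            if other_pid != own_pid:
                pids.append(other_pid)
    return pids


# Made-up sample output, for illustration only.
sample = [
    "0 S user 1234    1 0 80 0 - 5000 - 10:00 ? 00:00:01 python trigger.py\n",
    "0 S user 5678    1 0 80 0 - 5000 - 10:05 ? 00:00:00 python trigger.py\n",
    "0 S user 5680 5678 0 80 0 - 1000 - 10:05 ? 00:00:00 fgrep trigger.py\n",
]
print(other_instance_pids(sample, "trigger.py", 5678))  # prints [1234]
```

The fgrep line is skipped because it doesn’t mention python, which is the same reason the original loop ignores it.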

Then, in the main loop of the script…

while True:
        #Sleep for a bit...
        time.sleep(SLEEPTIME)
 
        #Check to see if this script has been updated since the last
        #iteration.  If so, die now so that the new version can take over
        #and be restarted by cron.  Assumes that the working directory
        #is that of the script (or that from which the script was started).
        x = os.stat(sys.argv[0])
        #Check the modification time of the script against the time we started running
        if x.st_mtime > startTime:
                print("Script has changed; will exit to allow restart")
                sys.exit(0)
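The modification-time check can be tried out on its own, away from cron and the main loop.  What follows is a sketch rather than a drop-in: script_changed_since is a name invented here, and a temporary file stands in for the real script.

```python
import os
import tempfile
import time


def script_changed_since(path, start_time):
    """True if the file at path was modified after start_time (epoch seconds)."""
    return os.stat(path).st_mtime > start_time


# A throwaway file stands in for the script itself.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as f:
    demo_path = f.name

start = time.time()

# Pretend the file was last touched a minute before we started...
os.utime(demo_path, (start - 60, start - 60))
unchanged = script_changed_since(demo_path, start)

# ...and then that somebody edited it a minute after we started.
os.utime(demo_path, (start + 60, start + 60))
changed = script_changed_since(demo_path, start)

os.remove(demo_path)
print(unchanged, changed)
```

One way to relax the working-directory assumption would be to capture os.path.abspath(sys.argv[0]) once at startup and pass that in as path, though I haven’t needed to.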

       
Cheap and cheerful stuff, but useful.  And hey, I didn’t even mention static typing once…

Total Eclipse Of The Python

Eclipse & Python, as the title suggests.  I’m not always elliptical in my references, you know.

I apologise if the title brings back memories of one of the several records with which Jim “I Produced Meatloaf, You Know” Steinman ruined Bonnie “Gravel Is For Gargling” Tyler’s 80s career, but if I have it in my head, I see no reason why it shouldn’t also end up in yours.  A problem shared is a problem halved.

Most of my development these days is done in a mixture of Java and Python, each language being used for the tasks for which it is best suited.  For a while there, I was doing Java in Eclipse 3.0 and Python in Boa Constructor, but even with 512MB and not much else open, the laptop was beginning to groan.  Eclipse is approximating wonderful, but it eats memory like Americans eat corn-syrup-enhanced snacks.  Boa also likes to take up a fair amount of heap, so one of them had to go.  It had reached the point at which even partially exposing an Eclipse window after half an hour of ignoring it would slow everything down to a grind for a minute.  So I did some digging and discovered that Eclipse does support Python development.  Well, support is a relative term; it’s possible, if you’re willing to give up some things, or find workarounds.

First off, how to get Eclipse to handle Python at all.  There are at least five Eclipse Python plugins at some stage of development; after some testing (and not a few Eclipse crashes) I chose PyDev.  The most obvious effect of installing this is that Eclipse recognises .py files and opens them in a Python-aware editor, with the Outline showing a nice list of methods, imports and classes, including any if __name__ == '__main__': block (no module-level variables shown, though).  The editor follows the Eclipse style of syntax colouring; might not be to your taste, but at least it’s more-or-less consistent between languages.

The most obvious lack is fully-Python-style code completion.  Many and better people than I have written on how damn difficult it is to implement this in Python anyway, so I suppose it’s a benefit to get anything.  Code completion’s off by default, and the PyDev Preferences tab for it contains dire warnings about how it works and what the drawbacks are.  In short, it “executes the code you write on the top level on the module” – I assume this means that it “imports” your code.  It looks like code completion’s then done with dir().  There are no tips for parameters to methods or functions, but you do at least get the docstring displayed.
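To make that guess concrete, here’s roughly what import-then-dir() completion could look like.  This is my own illustration of the general idea and not PyDev’s actual code; the function name complete is invented, and os.path is used only because it’s always available.

```python
import importlib


def complete(module_name, prefix):
    """Import module_name and return (name, first docstring line) pairs
    for every attribute whose name starts with prefix."""
    # Importing runs the module's top-level code, hence the dire warnings.
    module = importlib.import_module(module_name)
    results = []
    for name in dir(module):
        if name.startswith(prefix):
            doc = getattr(getattr(module, name), "__doc__", None) or ""
            doc = doc.strip()
            first_line = doc.splitlines()[0] if doc else ""
            results.append((name, first_line))
    return results


for name, doc in complete("os.path", "base"):
    print(name, "-", doc)
```

It also shows why parameter tips are harder: dir() hands back names, and the docstring is about the only extra information that comes for free.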

Something else I missed pretty soon – a built-in Python shell.  I’ve impressed programmers in other languages before now by flipping between code-in-progress and an interactive shell to test out the statements I’m writing (avoids so many fence-post errors).  I’ve always found it especially good for regular expressions.  Also, of course, when you’re done coding, you have a bunch of handy test data and statements to hand to build the unit tests.  The quickest Eclipse workaround I’ve found is just to keep a Cygwin or Quasi window open running Python and to keep handy with the Windows console cut-and-paste.  I suppose SPE or PyCrust would do as well…

PyDev debugging has never worked for me, at all, though this is probably a blessing, as it means I tend to do much more test-driven development.  Since there are a bunch of other Python environments around if I really need to step through code, I haven’t bothered to dig into the depths of exactly why the PyDev Debug option never even appears for me.  Anyway, most of the Python code runs in Zope, and debugging Zope interactively is just… don’t go there.  Really.

A nice touch is that PyDev does its best to implement the Eclipse refactoring support, including renaming and extracting code to methods, though there are limitations on how far the renaming search appears to go.  Also, it sometimes seems to get slightly confused as to whether the source has changed or not.

Less obvious features: if you have the Subclipse plugin for Subversion integration with Eclipse, it’s now as easy to keep Python scripts and modules under control as it is Java (or any other file).  I especially like the ability to check in a set of Java and Python files all together when changes to both are needed to implement some system-wide change.  The Eclipse Local History (a sort of “persistent Undo”) also applies to Python scripts.

In short: it’s mature enough to do real serious work with, and the benefits of having all my code in one environment outweigh the niggles.  After all, there’s no other Python IDE I’ve found that has everything I want in one package, let alone Subversion integration.  In this world we are destined always to strive for perfection and never to find it.  I’ll settle for Eclipse & PyDev for the moment.

More info at http://www.python.org/moin/EclipsePythonIntegration

Sublunary Paths

Zope alternatives to GET/POST parameters.

Before you ask, sublunary paths comes from a poem by Philip Larkin (it’s “Many famous feet have trod”). It has nothing much to do with web programming, nor Zope, but every time I see the identifier subpath it reminds me. Besides, I’m sure you appreciate the break from IT once in a while?

A subpath in Zope terms is that part of a URL that comes after the script, method or whatever that’s actually executed. Bear in mind that Zope is all about inheritance and object-orientedness, so if we have a script available at:

http://www.myserver.com/scripts/myScript

…then it’s possible to access it via URLs like:

http://www.myserver.com/scripts/myScript/part1/part2/part3

…and you can consider that the “part3” URL “inherits” myScript.  Those extra parts (part1/part2/part3 above) don’t need to refer to actual directories[1]; they can be pretty much arbitrary.  They’re made available in the subpath element of the REQUEST object, and here’s where that comes in handy.

If you point a browser at a URL that results in the contents of a file being returned, then the question arises of what the file name should be.  HTTP allows the server to be specific about the type of the data that’s sent, but there’s no facility to supply a suitable name.  This makes sense; naming conventions are extremely varied across platforms, so there’s no guarantee that a given name will be appropriate.  Instead, what usually happens is that the browser deduces a name from the URL.  And here problems may occur.

Consider a URL of the form:

http://www.myserver.com/scripts/downloadFile?customer=144876ab6&transaction=76yghj576xz7

…which downloads a GIF file.  There are browsers in existence[0] that will propose the filename downloadFile?customer=144876ab6&transaction=76yghj576xz7.gif, which is less than useful.  The worst part is that the parameters to the request get added to the filename.  The request could be made as a POST rather than a GET to avoid this, but there’s still no way to specify the filename, meaning that all the files downloaded from this URL may end up with the same name.  Not good, especially if they’re being paid for and later ones overwrite previous ones.
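I don’t know exactly how those browsers arrive at the name, but a guess that reproduces the behaviour is that they take whatever follows the last slash.  A sketch under that assumption (naive_filename is a made-up name):

```python
def naive_filename(url):
    """Guess a download filename by taking whatever follows the last '/'."""
    return url.rsplit("/", 1)[-1]


url = ("http://www.myserver.com/scripts/downloadFile"
       "?customer=144876ab6&transaction=76yghj576xz7")
print(naive_filename(url))
# prints downloadFile?customer=144876ab6&transaction=76yghj576xz7
```

The query string comes along for the ride, and the browser then tacks an extension on the end.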

The subpath trick, however, can be used in a couple of ways to work around this.
First, since the subpath is ignored by Zope, it can be used to suggest a filename when a POST request is made:

http://www.myserver.com/scripts/downloadFile/myNewFile.gif

That’s good, but we can go further.  The subpath can be used itself to pass the parameters to the script.  Consider this:

http://www.myserver.com/scripts/downloadFile/customer=144876ab6/transaction=76yghj576xz7/myNewFile.gif

In Python terms, we need to look along the subpath and extract those parameters.  Here’s a suitable method to do it:

import re

#Regular expression to spot parameters of the form identifier = value, where
#value may be empty.
paramRe = re.compile(r"(\w+)\s*=\s*(.*)")
def getParametersFromRequest(self):
    """Extract parameters from the subpath and return them in a dict."""
    params = {}
    #The subpath is a list of path elements; in other words, it's
    #been split by '/' characters for us already.
    for p in self.REQUEST.subpath:
        #Check if it's a parameter.
        m = paramRe.match(p)
        if m:
            #It is a match.  group(1) is the identifier,
            #group(2) is the value, which we strip.
            params[m.group(1)] = m.group(2).strip()
    return params

You can call this from any ExternalMethod or Script (Python), passing either self or the container respectively, and it’ll return a dict of parameters.
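To try the parsing outside Zope, the same logic can be run against a plain list standing in for REQUEST.subpath.  This is a sketch: params_from_subpath and build_download_url are names invented for the example, and the builder assumes the values contain nothing that needs URL-quoting (no slashes, in particular).

```python
import re

# Same pattern as above: identifier = value, where value may be empty.
PARAM_RE = re.compile(r"(\w+)\s*=\s*(.*)")


def params_from_subpath(subpath):
    """Return a dict of the identifier=value elements found in subpath."""
    params = {}
    for element in subpath:
        m = PARAM_RE.match(element)
        if m:
            params[m.group(1)] = m.group(2).strip()
    return params


def build_download_url(base, params, filename):
    """Go the other way: put params and a suggested filename on the subpath."""
    parts = [base.rstrip("/")]
    parts += ["%s=%s" % (key, params[key]) for key in sorted(params)]
    parts.append(filename)
    return "/".join(parts)


# What Zope would hand over for the URL shown earlier.
subpath = ["customer=144876ab6", "transaction=76yghj576xz7", "myNewFile.gif"]
params = params_from_subpath(subpath)
print(params)
print(build_download_url("http://www.myserver.com/scripts/downloadFile",
                         params, "myNewFile.gif"))
```

The trailing myNewFile.gif element has no equals sign, so it’s ignored by the parser and survives only as the name the browser sees.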

[0] This is an issue that crops up a lot on mobile phone browsers, but wget shows it too.
[1] This depends on what it is that you append the subpath to; it works for ExternalMethods, but other types of Zope object require the subdirectories to exist.  Which is a pain.