Zope, Threads and Things That Are Not Modules

Thread-local storage & thread-safe locking in Zope External Methods

You may, like me, have been misled by part of the Zope management interface into believing something that’s not true: that External Methods live in a module.  This would be a perfectly natural assumption; when you add an External Method, one of the fields in the form you fill out is titled “Module Name”.  If you read the Zope Book, you’ll also see phrases like “You’ve created a Python function in a Python module”, “You can define any number of functions in one module” or “put this code in a module called my_extensions.py”.

But they’re not modules; at least, not in key senses of the word.  What actually happens is that the source file is loaded and compiled, and the resulting code object is then used to get access to the methods as needed.  This is clever, but has some unfortunate side-effects, one of which is that you can’t rely on certain module-level semantics, such as a module-level object being created exactly once.
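To make that concrete, here’s a rough sketch in modern Python 3 of what happens when the same source text is compiled and executed more than once. This is not Zope’s actual loading code; the function name load_like_external_method and the SOURCE string are made up purely for illustration.

```python
# Hypothetical stand-in for the contents of an External Method file.
SOURCE = """
import threading
my_module_level_lock = threading.Lock()
"""

def load_like_external_method(source):
    """Compile and run the source in a fresh namespace; return that namespace."""
    namespace = {}
    code = compile(source, "<external method file>", "exec")
    exec(code, namespace)
    return namespace

# Assumption for illustration: the file gets executed twice.
first = load_like_external_method(SOURCE)
second = load_like_external_method(SOURCE)

# Each execution builds its own "module-level" objects.
locks_are_shared = first["my_module_level_lock"] is second["my_module_level_lock"]
print("Same lock object?", locks_are_shared)  # Same lock object? False
```

A real import, by contrast, runs the module body once and hands every importer the same objects.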

One of the things that doesn’t appear to work is a trick like this:

import thread

my_module_level_lock = thread.allocate_lock()

def my_mutex_method():
    """A method that uses the lock to enforce thread-safety"""
    my_module_level_lock.acquire()
    try:
        f = open("myfile.log", "a")
        f.write("Yowza!\n")
        f.close()
    finally:
        my_module_level_lock.release()

Obviously, what we’re doing here is sharing the one module-level lock amongst all threads.  But because of the way Zope uses external methods, there may in fact be more than one module-level lock.  And thus, of course, you won’t get proper thread exclusion.  Even worse – it fails silently, so you may not even be aware that you’re not locking around access to shared resources.
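Here’s a small sketch (Python 3, using the threading module rather than the old thread module) of why two lock objects give you no exclusion at all. The names lock_a and lock_b are invented; they stand in for the locks that two separate executions of the same External Method file would create.

```python
import threading

lock_a = threading.Lock()  # the lock one copy of the file created
lock_b = threading.Lock()  # the lock another copy created

lock_a.acquire()

# Non-blocking attempts: True means "got it", False means "already held".
got_other_lock = lock_b.acquire(False)  # a different object, so this succeeds
got_same_lock = lock_a.acquire(False)   # the same object, so this is refused

print(got_other_lock, got_same_lock)  # True False

lock_b.release()
lock_a.release()
```

Code guarded by lock_b runs happily while lock_a is held, and nothing raises an error to tell you so.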

Need proof?  Try this trick.  Create a “module-level” class in your External Method file (i.e., the file that contains the source to your External Methods; call it an External Method file).  Create a “module-level” instance of it that, in the __init__ method, dumps its id and the thread ident to stdout:

import thread

class Noddy:
    def __init__(self):
        print "Noddy %s" % self
        print "Created by thread %d" % thread.get_ident()

    def __del__(self):
        print "Noddy %s is being deleted" % self

noddy = Noddy()  # Allocate a module-level Noddy()

Use bin/runzope to run the Zope instance and catch the output.  Here’s some I collected earlier:

Noddy __builtin__.Noddy instance at 0x40fa8f6c
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x4130a3cc
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x40fa8f6c is being deleted
Noddy __builtin__.Noddy instance at 0x40fa8c6c
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x4131afec
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x4130202c
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x40fd076c
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x40fe084c
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x41153fac
Created by thread 1026
Noddy __builtin__.Noddy instance at 0x4114ceac
Created by thread 1026

Yow.  The module level object is being created multiple times, all by the one thread.  Imagine if we’d put some useful resource-eating allocation at the module level.  However, if we put the Noddy stuff into a module that the External Method file imports, we get:

Noddy ThreadShared.Noddy instance at 0x40fa976c
Created by thread 1026

Just the one.  As expected.  And even if we run parallel requests to get multiple Zope threads working, we still get only the one instance of Noddy.
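The reason a real module behaves itself is that Python caches imported modules in sys.modules, so the module body only runs the first time. A minimal Python 3 sketch, assuming a throwaway module called shared_demo written to a temporary directory (both are invented for the example):

```python
import importlib
import os
import sys
import tempfile

# Write a tiny stand-in module to a temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "shared_demo.py"), "w") as f:
    f.write("class Noddy:\n    pass\n\nnoddy = Noddy()\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# Import it twice, as two separate executions of an External Method file would.
first = importlib.import_module("shared_demo")
second = importlib.import_module("shared_demo")

same_module = first is second
same_noddy = first.noddy is second.noddy
print(same_module, same_noddy)  # True True
```

The second import is just a dictionary lookup, which is why there is only ever the one Noddy.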

I ran into this problem whilst trying to allocate a single MySQLdb database connection per thread; for this, I wanted to keep a mapping of thread idents to MySQLdb Connection objects.  But whatever I did, the mapping refused to behave as a module-level object.  Thus, from necessity, was born the ThreadShared module, which returns a TLS (thread-local-storage) object per thread.

And it looks like this:

#!/usr/bin/env python
# Shared resource module for Zope-level ExternalMethod thread-local storage

import thread

# A TLS is just an object on which arbitrary attributes may
# be set.  In a Zope system, you could make this a
# Products.PythonScripts.standard.Object
# You can add methods on it to do anything you need.
# If you have a thing about preferring dicts rather than arbitrary objects,
# then use a dict.
class TLS:
    pass

tlsMap = {}  # dict that maps thread idents to TLS objects
tlsLock = thread.allocate_lock()  # global lock object over map

def getTLS():
    """Return the thread-local-storage object for the current thread.
    If there isn't one, create one."""

    # Lock the tlsLock.
    tlsLock.acquire()

    # Obtain or create the object.  The finally clause releases the
    # lock even if something unexpected is raised.
    try:
        try:
            tls = tlsMap[thread.get_ident()]
        except KeyError:
            tls = TLS()
            tlsMap[thread.get_ident()] = tls
    finally:
        tlsLock.release()

    return tls
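If you want to convince yourself that each thread really gets its own object, here’s a rough Python 3 port with a small check bolted on. The snake_case names, the barrier and the worker function are my additions for the demonstration, not part of the original module.

```python
import threading

class TLS:
    pass

tls_map = {}                 # maps thread idents to TLS objects
tls_lock = threading.Lock()  # guards tls_map

def get_tls():
    """Return the TLS object for the current thread, creating it if needed."""
    with tls_lock:
        ident = threading.get_ident()
        if ident not in tls_map:
            tls_map[ident] = TLS()
        return tls_map[ident]

NUM_THREADS = 3
# The barrier keeps every thread alive until all have fetched their TLS,
# so no thread ident can be recycled during the test.
barrier = threading.Barrier(NUM_THREADS)
results = []

def worker():
    tls = get_tls()
    same_on_second_call = get_tls() is tls
    results.append((tls, same_on_second_call))
    barrier.wait()

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

distinct_objects = len(set(id(tls) for tls, _ in results))
stable_per_thread = all(same for _, same in results)
print(distinct_objects, stable_per_thread)  # 3 True
```

One caveat worth knowing: Python may recycle a thread ident once the thread has exited, so a later thread can inherit an old entry in the map unless you clean up after yourself.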

Okay, so you have a TLS object, how might you use it?  Well, for anything that needs to be allocated once per thread.  For example, assuming you’ve imported ThreadShared:

tls = ThreadShared.getTLS()
conn = getattr(tls, 'DatabaseConnection', None)
if conn is None:
    conn = MySQLdb.connect( etc etc )
    setattr(tls, 'DatabaseConnection', conn)
# And here we have a connection for this thread.

In this case, it might well be worth adding a method to the TLS class to generate the connection, but you get the general idea.
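One way that method might look, as a Python 3 sketch: the get_connection name and the connect argument are hypothetical, and fake_connect is a stand-in so the example runs without MySQLdb installed. In real use you’d pass something that wraps MySQLdb.connect with your own arguments.

```python
class TLS:
    """Per-thread holder that creates its connection on first use."""

    def __init__(self):
        self._connection = None

    def get_connection(self, connect):
        # `connect` is a hypothetical zero-argument factory supplied by the caller.
        if self._connection is None:
            self._connection = connect()
        return self._connection

# Stand-in factory that records how many times it was called.
calls = []

def fake_connect():
    calls.append(1)
    return object()

tls = TLS()
first = tls.get_connection(fake_connect)
second = tls.get_connection(fake_connect)

print(first is second, len(calls))  # True 1
```

The factory runs once per TLS object, which means once per thread if the TLS objects are handed out per thread.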

Another problem that comes from the same source is this: how do you get a module-level lock in an External Method file?  The answer is to put it in another module (a real module) that your External Method file imports.  Be careful with the import path, though: by default the Extensions instance directory isn’t on the import path, so if you like all your External Method code to live in the one place, you’ll need to add that directory to sys.path yourself.
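A minimal sketch of the sys.path fiddling, assuming a made-up instance location (INSTANCE_HOME is a placeholder, so substitute your own). Because the External Method file may be executed more than once, the helper checks before appending so the path doesn’t grow on every reload.

```python
import os
import sys

# Hypothetical location; substitute your own Zope instance home.
INSTANCE_HOME = "/path/to/zope/instance"
EXTENSIONS_DIR = os.path.join(INSTANCE_HOME, "Extensions")

def ensure_on_path(directory):
    """Append directory to sys.path unless it is already there."""
    if directory not in sys.path:
        sys.path.append(directory)

ensure_on_path(EXTENSIONS_DIR)
ensure_on_path(EXTENSIONS_DIR)  # second call is a no-op

occurrences = sys.path.count(EXTENSIONS_DIR)
print(occurrences)  # 1
```

After that, an ordinary import of your shared module from the External Method file should find it.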

Of course, all this will come as no surprise to seasoned Zopistas, and I expect the usual level of flames informing me of my rank ignorance and stupidity in not having worked this out ages ago (presumably by reading the source code).  Yet given that this is a pretty key point, you’d expect that, at the very least, External Method files weren’t called “modules” in so many places.  Because they’re not modules, are they?

Reality, modified

Fascinating article in Wired about a guy called Graham Flint, whose Gigapxl Project [sic] is capturing huge, um, gigapixel images of places in America.  But I found this odd; when talking about a picture of a nude-sunbathing beach, he’s quoted as saying: “We might have to add fig leaves in Photoshop, it’s that good”.

Why capture reality in as dense detail as you can… and then edit it to fit the retarded moral sensibilities of those who think the human body is something to be ashamed of?