Following on from my previous post about grimace, a Python package that allows generation of regular expressions using fluent syntax, I’ve added some more tricks.
In the previous post, I showed how one can use grimace to write regular expressions in this form:
All good, but do we really need to invoke each method and have all those parentheses cluttering up the line? Because grimace is written using functional principles, each method returns a new RE object, based on the previous one but modified. Thus in theory we must call a method because we need to do some work, not just return an attribute.
Python descriptors will save us. A descriptor is a class that defines one or more of the
__delete__ methods. Like properties in C#, they allow code to be executed when a descriptor is accessed as through it were an attribute.
Previously in grimace, we had methods like:
"""Return a new RE with the start anchor '^' appended to the element list"""
return RE(self, '^')
Now we can define an
Extender class and use it to rewrite
"""An Extender is a descriptor intended for use with RE objects, whose __get__ method returns a
new RE based on the RE on which it was invoked, but with the elements extended with a given
element, set when the Extender is initialized. It exists so that methods like start() and end()
can be invoked as attributes or methods."""
def __init__(self, element=None):
self.element = element
def __get__(self, instance, owner):
if isinstance(instance, RE):
if self.element is not None:
return RE(instance, self.element)
#... and here's how start() is now defined in the RE class...
start = Extender('^')
So far so good. We can now use grimace as in this example:
But what happens if a poor programmer gets confused and invokes
dot as methods? We can fix this easily. The result of getting
dot or any other
Extender is a new RE object, so executing
digits() will get that new RE and then try to call it. All we need to do is make a RE instance callable, by adding this
def __call__(self): return self
That’s all we need. It means that the execution of
digit() becomes get the digit attribute, which is a new RE instance, then call it, which returns itself.
There is one more nicely functional trick. We can use an
Extender to append more than just strings; we can also use it to append special objects, like the grimace
Repeater that defines how many repetitions of a character are required. Because all objects in grimace are functional, they’re immutable, and that means any instance may be shared between any number of threads or other bits of code. So I can write the
not_a method (which adds a Not instance to the regular expression, to invert the meaning of the next match) as:
not_a = Extender(Not())
That instance of Not() is then shared between every RE() object that uses it. Functional programming is an excellent tool.
(The documentation for grimace is now in the github wiki page)