The Nagging Doubts

Tuples, asserts Guido, are for heterogeneous data.  The C programmer I used to be would, of course, throw together a struct (almost certainly typedef’d) to be the equivalent.  As that same C programmer, I was burned sufficiently often by rogue constants and editing errors to adopt a firm rule of thumb: if I could avoid hard-wiring constants, I would.  This extended such that, wherever possible, my code would be controlled by constants #defined at the top of the source file; change those, and the code changed with them.

Which brings me back to tuples.  Imagine that I construct a tuple thusly, to represent, say, a book on Amazon:

record = (author, title, ISBN, price)

To access it, I can write code like:

print "Author is %s" % record[0]

Yet the C-programmer-within spots that stray [0] and nags me: “rogue hardwired constant!”.  There’s a strong urge to replace it with a constant, something like:

AUTHOR=0
TITLE=1

print "Author is %s, Title is %s" % (record[AUTHOR], record[TITLE])

Yet even that doesn’t satisfy him; he worries that someone will write an assignment to the tuple and inadvertently transpose a couple of elements.  There is, he says, no way to hardwire the relationship between the order in which items are stuffed into a tuple and the indexes used to retrieve them.

To which I say: well, if you’re that worried, use an object.  Yet still I find those stray remaining indexes… disturbing.
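A minimal sketch of one such object, assuming a Python recent enough to ship collections.namedtuple (the snippets above are Python 2; this one is written for Python 3). The Book type and the sample values are hypothetical. The field names are declared once, in order, so names and positions cannot drift apart, and building the record by keyword sidesteps the transposition worry:

```python
from collections import namedtuple

# Hypothetical record type: the field order is declared exactly once, here.
Book = namedtuple("Book", ["author", "title", "isbn", "price"])

# Keyword construction: the order of the arguments no longer matters.
record = Book(title="Example Title", author="A. Writer",
              isbn="0000000000", price=9.99)

print("Author is %s, Title is %s" % (record.author, record.title))

# It is still a real tuple, so indexing and unpacking keep working.
author, title, isbn, price = record
```

Whether this counts as "a tuple" or "an object" is a matter of taste; it is both.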

Just now I wrote (apropos of some database code):

# Read a record, take the first element, strip off whitespace, lower-case it and split it into words.
value = cr.readone()
if value and value[0]: return value[0].strip().lower().split()

Now that really sets off the deep, C-derived danger instincts.  What, he cries, if anything were to return NULL in that wild and assumptive sequence of function calls?  I think I need to go and make him a cup of tea and settle his nerves; he’s still not entirely used to exceptions.
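To calm him a little, here is a rough sketch of what happens when one link in that chain is not what was expected. The function name first_words and the sample rows are hypothetical stand-ins for the database code, and the snippet is written for Python 3. A missing row is handled by the guard; a wrongly typed element raises an exception that can be caught, rather than dereferencing a NULL:

```python
def first_words(value):
    """Mirror the snippet above: lower-cased words of the first element."""
    if value and value[0]:
        return value[0].strip().lower().split()
    return None

print(first_words(("  Hello World  ",)))   # a well-formed row
print(first_words(None))                   # no row at all: the guard returns None

try:
    first_words((42,))                     # first element is not a string
except AttributeError as exc:
    # The failure is loud and catchable, not silent memory corruption.
    print("caught: %s" % exc)
```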

7 thoughts on “The Nagging Doubts”

  1. Yay!

    I don’t think anyone uses tuples as data structures. There’s a difference between heterogeneous data and a data structure. Ad-hoc data structures are usually created out of dicts, whereas tuples are generally used to reference multiple values that are traveling together.

    I like to think of enumerate() as the perfect example of what Guido means by tuples containing heterogeneous data. It generates a tuple containing the index into a sequence and the data at that index. It’s rare that anyone using it ever has to code a “rogue hardwired constant” because iteration loops just have a “tuple” to unpack into (for idx, data in enumerate(somelist)). Returning multiple values from a function: a-ok. Returning an unpredictable number of values from a function: using a dict with documented expected keys is much more future-proof when you add new data.

    – Mathieu (http://stompstompstomp.com/)
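The enumerate() point in the comment above can be sketched as follows (Python 3; somelist's contents and the min_max helper are hypothetical, chosen only for illustration). Neither the loop nor the function call needs a hard-wired index:

```python
somelist = ["spam", "eggs", "ham"]   # hypothetical sample data

# Each item produced by enumerate() is an (index, value) tuple,
# unpacked directly into two names by the for statement.
for idx, data in enumerate(somelist):
    print("%d: %s" % (idx, data))

def min_max(values):
    """Return two values travelling together as a tuple."""
    return min(values), max(values)

# The caller depends on the order in exactly one place.
lowest, highest = min_max([3, 1, 4, 1, 5])
```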

    • Re: Yay!

      I agree that one would never expect to see a tuple with an indeterminate number of values (except in a “print %”, perhaps), and I didn’t intend to suggest that usage in any way.

      However, surely multiple values that are traveling together is a pretty good definition of a data structure? But if tuples were never intended to be indexed, merely unpacked, why allow it?

      Of course, unpacking a tuple via:
      (a,b) = myTuple
      is as transposition-error-prone as
      a = myTuple[0] (etc)

      🙂
      regards
      ben

      • Re: Yay!

        1) Function return values are traveling (into lvalues or into oblivion), objects are staying (until garbage collected).
        Data in objects are expected to be referenced as many times as necessary by whatever code is holding a reference, while return values can only be referenced once: a,b,c=someTupleFunction() is not dangerous because you need to know what the function returns and you depend on that order in one place only.
        2) Tuples are meant to be indexed: they have positional access in order to be sequences.
        Bundles of return values are only one use case; tuples can be an immutable list, a generic tree node, a set, etc.
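Two of those other uses, sketched under the assumption of Python 3; the distances table and the tree are made-up data, and count_nodes is a hypothetical helper:

```python
# A tuple is hashable when its elements are, so it can serve as a dictionary key.
distances = {("London", "Paris"): 344}   # made-up figure, for illustration only

# A generic tree node as (value, children), where children is itself a tuple.
tree = ("root", (("left", ()), ("right", ())))

def count_nodes(node):
    """Count a node plus all of its descendants."""
    value, children = node
    return 1 + sum(count_nodes(child) for child in children)

print(count_nodes(tree))

# Immutability: item assignment on a tuple raises TypeError.
try:
    tree[0] = "new root"
except TypeError:
    print("tuples cannot be modified in place")
```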

      • Re: Yay!

        I wrote a little class to help with decoding tuples for those cases where you do not want to use a dictionary or class to directly hold the data.

        class DecoderRing(object):
            def __init__(self, codes):
                # Bind each code name to its position in the tuple.
                for ndx, code in enumerate(codes):
                    setattr(self, code, ndx)
                self.length = len(codes)

            def __len__(self):
                return self.length

        hx = DecoderRing(("AB", "R", "H", "HR", "RBI", "BB", "SB", "BA", "OBP", "SLG"))

        hx helps decode a tuple of baseball batting statistics.

        average = tuple[hx.BA]

        In general, I think you are better off using dictionaries or custom classes. However, this approach has worked well for me.

  2. structs in python

    That’s really not how tuples are used. This is how you do what you want. It’s the same approach used in Ruby. Note though that the name I use here interferes with a standard module.

    >>> class struct:
    ...     def __init__(self, **kwargs):
    ...         self.__dict__.update(kwargs)
    ...
    >>> record = struct(author='foo', title='bar', ISBN='phlegm', price='$0.02')
    >>> print dir(record)
    ['ISBN', '__doc__', '__init__', '__module__', 'author', 'price', 'title']
    >>> print record.ISBN
    phlegm
    >>>
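For readers on Python 3.3 or later, the standard library's types.SimpleNamespace behaves much like the class above and avoids the name clash with the struct module. A minimal sketch, reusing the same sample values:

```python
from types import SimpleNamespace

# Keyword arguments become attributes, as with the hand-rolled class.
record = SimpleNamespace(author="foo", title="bar", ISBN="phlegm", price="$0.02")
print(record.ISBN)

# Unlike a tuple, attributes can be reassigned after creation.
record.price = "$0.03"
```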

    • Re: structs in python

      PEAK (the Python Enterprise Application Kit) has the best struct class I’ve seen for Python. From its docstring:

      Typed, immutable, multi-field object w/sequence and mapping interfaces

      Usage::

      class myRecord(struct):
          __fields__ = 'first', 'second', 'third'

      # the following will now all produce identical objects
      # and they'll all compare equal to the tuple (1,2,3):

      r = myRecord([1,2,3])
      r = myRecord(first=1, second=2, third=3)
      r = myRecord({'first':1, 'second':2, 'third':3})
      r = myRecord.fromMapping({'first':1, 'second':2, 'third':3})
      r = myRecord.extractFromMapping(
          {'first':1, 'second':2, 'third':3, 'blue':'lagoon'}
      )
      r = myRecord.fromMapping( myRecord([1,2,3]) )

      # the following will all print the same thing for any 'r' above:

      print r
      print (r.first, r.second, r.third)
      print (r[0], r[1], r[2])
      print (r['first'], r['second'], r['third'])

      If you want to define your own properties in place of the automagically
      generated ones, just include them in your class. Your defined properties
      will be inherited by subclasses, as long as the field of that name is at
      the same position in the record. If a subclass changes the field order,
      the inherited property will be overridden by a generated one, unless the
      subclass supplies a replacement as part of the class dictionary.

      The Struct module can be seen here via viewcvs (scroll down a bit — PEAK’s author likes a lot of vertical whitespace). Interestingly, it’s implemented as a subclass of tuple.
