Home
About
Projects
Articles
Resources
Music

Mutable Lists in Python

Not long into my first project with geodjango, I found myself wanting more convenient ways to manipulate the GEOS geometry objects. Python's list interface seemed the most natural to me:

geometryObj[i:j] = ((33.42,-70.89),(34.819,-67.61))

Setting and retrieving individual elements was already supported, but slice and delete operations were not. I set about expanding the __getitem__ and __setitem__ implementations for the python GEOS geometry wrappers, and adding a definition of __delitem__.

ListMixins

As I began, I realized quickly that I could factor out a lot of the interface logic. I wanted to put the list-logic routines in a base class. The GEOS wrapper objects were designed with pointer-safety in mind, so very few direct mutations of the internal objects are performed. I needed to add a layer which would allow me to maintain this level of pointer safety behind the scenes.

I found a class similar to what I had in mind posted in the active state code repository. However, I ultimately designed my own ListMixin class to maximize convenience and simplicity while maintaining the pointer safety of the GEOS wrapper implementation.

The logic allows for only two very simple ways to modify the data: either modify a single element, or rebuild the entire list. The first one is optional. A custom metaclass fills in the necessary code at class creation time if the subclass doesn't provide a method for in-place mutations. The advantage of this is it becomes quite easy to make a mutable list wrapper around immutable objects. For example, to make a mutable string object, we can write the following:

from mutable_list import ListMixin

class mstring(ListMixin):
    _s = ''
    _allowed = (str,)

    def __init__(self, s=''):
        self._set_list(None, s)
        super(mstring,self).__init__()              # required

    def __getitem__(self, i):
        return ''.join(super(mstring,self).__getitem__(i))

    def _get_single_external(self, i):              # required
        return self._s[i]

    def _set_list(self, length, items):             # required
        self._s = ''.join(items)

    def __len__(self):  return len(self._s)         # required
    def __str__(self):  return self._s
    def __repr__(self): return repr(self._s)

Which gives us the following functionality:

>>> from mstring import mstring
>>> s = mstring('I love developing with django')
>>> s.append('!!')
>>> s
'I love developing with django!!'
>>> del s[:7]
>>> s[7:10] = ' more'
>>> s
'develop more with django!!'
>>> s.reverse()
>>> s.insert(0, 'Secret message: ')
>>> s
'Secret message: !!ognajd htiw erom poleved'

Here's the entire package, complete with unit tests.

Ordered Many-To-Many Fields

Though I realized when I build this class that it would be useful for wrapping any kind of external API, I found myself coming back to my new ListMixin class even sooner than I expected to create a new ordered relation field API for django. As previously noted, an ordered many-to-many relation field is a basic pattern with many applications. While the ability to designate an intermediate model in django makes this possible, it could be a lot more simple than that. Since Python's list interface allows natural and native manipulation of an ordered collection, the ListMixin class lends itself well to this purpose.

I set about creating a new OrderedManyToMany field which would:

  1. Provide ordering in one direction, meaning that the target model would be ordered with respect to the source model, but not the other way round. This implies that self-referential relations can not be symmetric.
  2. Not require the use of an intermediate model
  3. Allow simple manipulation of the list using Python's standard list interface
  4. Not require modifying the django source tree

Since django does not honor post_create_sql methods on relation fields, however, I abandoned the last requirement for the sake of the other three. I had to add a few lines to the django source in order to add an ordering column to the join table. Given goals 2 and 3, and for the sake of simplicity, I decided to prohibit the use of an intermediate model.

Here's a patch for the django 1.1 beta release which provides a very natural ordered many-to-many API. For example, with the following models definitions:

# models.py
from django.db import models

class Item(models.Model):
    name    = models.CharField(max_length=100)

    def __str__(self):
        return self.name

class Bin(models.Model):
    name    = models.CharField(max_length=100)
    items   = models.OrderedManyToManyField(Item)

    def __str__(self):
        return self.name

We can perform manipulations like this:

In [1]: from example.proof import models

In [2]: ra = models.Bin.objects.get(name='A')

In [3]: ra.items[:]
Out[3]: [<Item: 3>, <Item: 1>, <Item: 6>, <Item: 4>]

In [4]: ra.items[1::2] = [models.Item.objects.get(name=n)
   ...:                   for n in '97']

In [5]: ra.items[:]
Out[5]: [<Item: 3>, <Item: 9>, <Item: 6>, <Item: 7>]

In [6]: rb = models.Bin.objects.get(name='B')

In [7]: rb.items[:]
Out[7]: [<Item: 2>, <Item: 1>, <Item: 0>]

In [8]: rb.items.sort(reverse=True)

In [9]: rb.items[:]
Out[9]: [<Item: 0>, <Item: 1>, <Item: 2>]

In [10]: ra.items = rb.items

In [11]: ra.items[:]
Out[11]: [<Item: 0>, <Item: 1>, <Item: 2>]