Thursday, August 14, 2008

Python: reverse enumerate

This morning, I was needed to iterate trough a list from the end to the start and getting the index in addition to the value.

For getting the index, it's easy: we use the builtin function enumerate.
>>> a = ['a', 'b', 'c']
>>> it = enumerate(a)
>>> it
<enumerate object at ...>
>>> it.next()
(0, 'a')
>>> it.next()
(1, 'b')


For iterate in the other sens, it's also easy; there is the builtin function reversed.
>>> a = ['a', 'b', 'c']
>>> it = reversed(a)
>>> it
<listreverseiterator object at ...>
>>> it.next()
'c'
>>> it.next()
'b'


Therefore, getting the index with the element in the reverse order would be easy as the combination of the two functions:
>>> a = ['a', 'b', 'c']
>>> it = reversed(enumerate(a))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: argument to reversed() must be a sequence


Argh! We must transform the enumerate object to a list before passing it to reversed:
>>> a = ['a', 'b', 'c']
>>> it = reversed(list(enumerate(a)))
>>> it.next()
(2, 'c')


It works.
Yes, it works, but there is a problem. It's not evident for everybody but this technique isn't optimized. First we create an iterator object with enumerate, then we convert this object to a list before recreate another iterator object. With a tiny list of three elements like in my examples it's not a pain, but with big lists, it consume a lot of memory by doing a copy of the initial list with the indexes.

The solution is no more complicated but don't use the method enumerated. We simulate his behavior with the method izip from the itertools module and the method xrange. Both methods create iterator, which is our goal.
The itertools.izip method take n iterators in parameters and yield the mth element of each iterator. Example
>>> from itertools import izip
>>> a = ['a', 'b', 'c']
>>> b = ['d', 'e', 'f']
>>> it = izip(a, b)
>>> it.next()
('a', 'd')
>>> it.next()
('b', 'e')


Thus, we can rewrite the enumerate method like this:
>>> a = ['a', 'b', 'c']
>>> it = izip(xrange(len(a)), a)
>>> it.next()
(0, 'a')
>>> it.next()
(1, 'b')


Now, to reverse the order we just reverse all the izip arguments
>>> a = ['a', 'b', 'c']
>>> it = izip(xrange(len(a)-1, -1, -1), reversed(a))
>>> it.next()
(2, 'c')
>>> it.next()
(1, 'b')


To finish we hide this behind a lambda to have a more readable code...
>>> reverse_enumerate = lambda l: izip(xrange(len(l)-1, -1, -1), reversed(l))
>>> a = ['a', 'b', 'c']
>>> it = reverse_enumerate(a)
>>> it.next()
(2, c)


This simple code has been implemented into openobject-server with the revision 964.

4 comments:

eryksun said...

def r_enumerate(iterable):
"""enumerator for reverse iteration of an iterable"""
enum = enumerate(reversed(iterable))
last = len(iterable)-1
return ((last - i, x) for i,x in enum)

Frafra said...

def reverseEnumerate(obj, start=0):
for index in xrange(len(obj)-start-1, -1, -1):
yield index, obj[index]

eryksun said...

Courtesy of earthboundkid on reddit (about a week ago):

def r_enumerate(container):
    i = len(container)
    for item in reversed(container):
        i = i - 1
        yield i, item

This fails if an object implements __reversed__ without also implementing __len__. If needed, you could test for the latter case and first exhaust reversed(container) to get the length.

P.S. Why doesn't Blogger's comment system support a mechanism for pre-formatted text? I had to manually convert spaces to HTML non-breaking spaces.

ΤΖΩΤΖΙΟΥ said...

Um... the xrange bla -1 -1 can be substituted with reverse(xrange(len(iterable))) :)

>>> import itertools
>>> r_enumerate= lambda iterable: itertools.izip(reversed(xrange(len(iterable))), reversed(iterable))
>>> list(r_enumerate(['zero', 'one', 'two']))
[(2, 'two'), (1, 'one'), (0, 'zero')]