Thursday, August 14, 2008

New new domain notation

Last week, I told you about the new domain notation. After a little peer reviewing, it appears that this notation is not as good as it seems. The inability to negate an expression (and not just a leaf) is more problematic that I have imagined. Thus, after a tiny discussion with Fabien, we decided to implement a (nearly) true polish notation for the domains. From now, the domains look more like the older ones that in the last week notation. This means there is no more extra parenthesis.
In addition of the classic leafs, you can add operators.

« Wait... what do you name leafs? »
A leaf, is a tuple of 3 elements composed of:
  1. a field name
  2. an operator (as string)
  3. a value
Example:

('active', '=', True)
('foo', '!=', 'bar')
('id', 'in', [7, 13, 42, 666])
('id', '=', False)
('name', 'like', 'tiny')


« And which operators are allowed? »
Like the last week notation, the AND ('&') and OR ('|') operators are allowed . Unlike the last week notation, the arity of these operators is fixed to 2.
And yes, the one you have all been waiting for is now in: The NOT ('!') operator. This one has an arity of 1.
For compatibility with existing domains, the AND operator is applied by default.

« So, what looks like this new new notation? »
Some examples:

['&', ('active', '=', True), ('value', '!=', 'foo')]
['|', ('active', '=', True), ('state', 'in', ['open', 'draft'])
['&', ('active', '=', True), '|', '!', ('state', '=', 'closed'), ('state', '=', 'draft')]
['|', '|', ('state', '=', 'open'), ('state', '=', 'closed'), ('state', '=', 'draft')]
['!', '&', '!', ('id', 'in', [42, 666]), ('active', '=', False)]


PS: Tanks to Najlâa and Olivier for the review.

Python: reverse enumerate

This morning, I was needed to iterate trough a list from the end to the start and getting the index in addition to the value.

For getting the index, it's easy: we use the builtin function enumerate.
>>> a = ['a', 'b', 'c']
>>> it = enumerate(a)
>>> it
<enumerate object at ...>
>>> it.next()
(0, 'a')
>>> it.next()
(1, 'b')


For iterate in the other sens, it's also easy; there is the builtin function reversed.
>>> a = ['a', 'b', 'c']
>>> it = reversed(a)
>>> it
<listreverseiterator object at ...>
>>> it.next()
'c'
>>> it.next()
'b'


Therefore, getting the index with the element in the reverse order would be easy as the combination of the two functions:
>>> a = ['a', 'b', 'c']
>>> it = reversed(enumerate(a))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: argument to reversed() must be a sequence


Argh! We must transform the enumerate object to a list before passing it to reversed:
>>> a = ['a', 'b', 'c']
>>> it = reversed(list(enumerate(a)))
>>> it.next()
(2, 'c')


It works.
Yes, it works, but there is a problem. It's not evident for everybody but this technique isn't optimized. First we create an iterator object with enumerate, then we convert this object to a list before recreate another iterator object. With a tiny list of three elements like in my examples it's not a pain, but with big lists, it consume a lot of memory by doing a copy of the initial list with the indexes.

The solution is no more complicated but don't use the method enumerated. We simulate his behavior with the method izip from the itertools module and the method xrange. Both methods create iterator, which is our goal.
The itertools.izip method take n iterators in parameters and yield the mth element of each iterator. Example
>>> from itertools import izip
>>> a = ['a', 'b', 'c']
>>> b = ['d', 'e', 'f']
>>> it = izip(a, b)
>>> it.next()
('a', 'd')
>>> it.next()
('b', 'e')


Thus, we can rewrite the enumerate method like this:
>>> a = ['a', 'b', 'c']
>>> it = izip(xrange(len(a)), a)
>>> it.next()
(0, 'a')
>>> it.next()
(1, 'b')


Now, to reverse the order we just reverse all the izip arguments
>>> a = ['a', 'b', 'c']
>>> it = izip(xrange(len(a)-1, -1, -1), reversed(a))
>>> it.next()
(2, 'c')
>>> it.next()
(1, 'b')


To finish we hide this behind a lambda to have a more readable code...
>>> reverse_enumerate = lambda l: izip(xrange(len(l)-1, -1, -1), reversed(l))
>>> a = ['a', 'b', 'c']
>>> it = reverse_enumerate(a)
>>> it.next()
(2, c)


This simple code has been implemented into openobject-server with the revision 964.

Tuesday, August 5, 2008

New domain notation

During the last Community Days, somebody (it was Joël from C2C, I think) complain about the simplicity of the domains. For example, you can't include a OR condition, or negate the domain.

Negate a domain is a false problem: you may just invert the operator. So, [('active', '=', 1)] become [('active', '<>', 1)]

But, for the OR operator, the problem is really there. An implicit AND operation is made between each part of the domain. [('active', '=', 1), ('parent_id', '=', 42)] is translate into a SQL query that look like WHERE active = 1 AND parent_id = 42

Yes, adding a OR condition, is not a bad idea, but, how to include it without breaking all the domains already written ? After some discussions, we decide to uses a pseudo polish notation, always with a implicit AND operator. Now you can prefix domain parts with a '&' or a '|' and group part using parenthesis.

Here are some example of domains and the resulting SQL where clause:

[('foo', '=', 'bar')]
foo = 'bar'

[('id', 'in', [1,2,3])]
id in (1, 2, 3)

[('field', '=', 'value'), ('field', '<>', 42)]
( field = 'value' AND field <> 42 )

[('&', ('field', '<', 'value'), ('field', '>', 'value'))]
( field < 'value' AND field > 'value' )

[('|', ('field', '=', 'value'), ('field', '=', 'value'))]
( field = 'value' OR field = 'value' )

[('&', ('field1', '=', 'value'), ('field2', '=', 'value'), ('|', ('field3', '<>', 'value'), ('field4', '=', 'value')))]
( field1 = 'value' AND field2 = 'value' AND ( field3 <> 'value' OR field4 = 'value' ) )

[('&', ('|', ('a', '=', 1), ('b', '=', 2)), ('|', ('c', '=', 3), ('d', '=', 4)))]
( ( a = 1 OR b = 2 ) AND ( c = 3 OR d = 4 ) )

[('|', (('a', '=', 1), ('b', '=', 2)), (('c', '=', 3), ('d', '=', 4)))]
( ( a = 1 AND b = 2 ) OR ( c = 3 AND d = 4 ) )


Don't hesitate to ask questions if you don't understand.

Stay tuned.


PS: Thank you to Stéphane for the base code of the tree.