[PEAK] Re: PEAK-rules question
PJ Eby
2013-08-30 19:17:37 UTC
On Fri, Aug 30, 2013 at 10:59 AM, Sébastien de Menten wrote:
Hello,
I have two questions about your great PEAK-Rules package (thanks a lot for
it!) where I would be more than happy to have your opinion/advice.
Regarding the @when(gf, "condition in a string") decorator, I find the
condition-in-a-string syntax slightly annoying, as it is not recognised by
the IDE (PyCharm) or other code inspectors (flake, etc.). For the IDE, it is
not very readable, since there is no syntax colouring, and autocompletion
(and other refactoring features) do not work well. For the code inspectors,
they falsely report some imports as unused when they are in fact used only
inside the string. Would you have some trick to avoid this? I was thinking
along the lines of
[proposal deleted]
In your opinion, is this meaningful/useful? But also doable?
Neither. First, if you put the condition in the code, it's going to
be executed a second time. About the best you could do here would be
to use a lambda in place of a string, provided that you had a
foolproof way to get the source. And of course, such a method won't
work unless the source is available, or you can reverse-engineer the
bytecode to a condition.

Either way, it's not something I have sufficient interest in; if you
want syntax highlighting of strings and import detection in your IDE,
it seems the simplest way to fix it is to tell the IDE that the
strings need highlighting and parsing, perhaps via some sort of
pragma. For example:

@when(
    gf,
    "condition here"  # pragma: eval
)

If your tools are built for Python, it seems it would be good for them
to support some limited understanding of dynamic code patterns. ;-)

(Of course, the actual spelling of the pragma might be simpler, and in
keeping with whatever pragma facilities are already supported by the
tools.)

All that being said, if someone wants to implement the lambda idea, it
is quite possible by registering appropriate methods with the
PEAK-Rules core for parsing the "function" type. In peak.rules.core,
type and tuple handlers are registered, and in peak.rules.predicates,
a string handler is registered. So by registering a handler for
lambdas, you could implement that feature yourself.

(Personally, I think it would probably be easier to get the other
tools to recognize certain strings as code than to reverse-engineer a
condition from a lambda or try to extract its source, but it's up
to you.)
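The per-type handler registration described above can be sketched in plain Python; the names below (parses, parse_condition) are illustrative only, not the actual PEAK-Rules internals:

```python
import types

# Illustrative registry mapping a condition's *type* to a parser,
# mimicking how PEAK-Rules registers per-type condition handlers
# (types/tuples in peak.rules.core, strings in peak.rules.predicates).
_condition_parsers = {}

def parses(cond_type):
    """Register a parser for conditions of the given type."""
    def register(fn):
        _condition_parsers[cond_type] = fn
        return fn
    return register

@parses(str)
def _parse_string(cond):
    # Strings are already condition source text.
    return cond

@parses(types.FunctionType)
def _parse_lambda(cond):
    # A lambda handler would have to recover source (or decompile
    # bytecode) here -- the hard part discussed above.
    raise NotImplementedError("source recovery left as an exercise")

def parse_condition(cond):
    # Dispatch on the condition's type.
    return _condition_parsers[type(cond)](cond)
```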
My second question is related to my use case for PEAK-Rules.
I am using PEAK-Rules in a project essentially to replace standard object
methods in a system with a lot of object composition combined with
inheritance.
So I have a class Operation whose instances have two attributes that are
instances of two other classes, Source and Destination. So the "abstract"
Operation looks like:
source = None (but should be an instance of Source)
destination = None (but should be an instance of Destination)
then I can subclass Operation to add new attributes/meaning to the class,
and the same for Source and Destination.
After that, I have an engine that must interpret (= do calculations on) the
Operation with its Source and Destination, and the interpretation depends
on the 3 objects (operation, source and destination).
I use peak.rules to dispatch to the correct algorithm as a function of the
3 objects:

@abstract
def interpret(operation):
    pass

@when(interpret, "isinstance(operation, SubOperationOfTypeZ) and "
                 "isinstance(operation.source, SubTypeWOfSource) and "
                 "operation.source.date_of_creation.month < 5")
def interpret_variant(operation):
    return "blabla"
Is this usage of peak.rules OK? Would you do it differently? My feeling is
that this is one of the uses of multi-methods... but it is my first use of
the concept in a real project.
I don't know how to answer your question because the description is
too abstract. My feeling is that it depends a lot on why you are
representing these things this way in the first place, and what the
things are. What sort of operations, sources, and destinations are we
talking about? What does the system do, and why did you choose this
structure? Without knowing those things, I can only shrug in reply to
the question.
Sébastien de Menten
2013-09-01 21:15:21 UTC
Hello,

After some research on the topic, I found a simple way to do what I wanted
with macropy.

I define the following macro in a module

# macro_module.py
from macropy.core.macros import Macros, Str, unparse

macros = Macros()

@macros.expr
def s(tree, **kw):
    return Str(unparse(tree))
#===========

and then use it as:

# test.py
from macro_module import macros, s
from peak.rules import abstract, when

@abstract
def foo(a, b):
    pass

foo.when = lambda condition: when(foo, condition)  # <== this could be
# better implemented in the abstract decorator

@foo.when(s[a > 4 and b < 56])
def foo_cond1(a, b):
    return a + b
#===========

The code is processed automatically, with every s[...] transformed into
"...", like:
@foo.when("a>4 and b<56")

which is exactly what I needed (even though it adds a dependency on
macropy).

For my second question, explaining the real domain space would need pages,
so I must find a "to-the-point" analogy that captures the gist of the
problem. So here is one that does not use domain-specific terms but maps
100% onto the problem.
I have different kinds of cars (pickup, sedan, convertible, etc.) which have
different transmissions (4x4, front-wheel drive, rear-wheel drive) and
different tires (all-season, snow, high-performance, etc.), and all
combinations of Car x Transmission x Tires are possible.
Now the generic functions I want to write are like :
- "optimal_acceleration(car, weather)"
- "maintenance_planning(car)"
- "customer_match(car, customer_profile)"
These functions may depend on the Car(Transmission, Tire) combination, and
while not all combinations may be supported, I want to be able to add
support for any of them. Moreover, if a user adds a new kind of
transmission, car, or tire, I want to be able to add the functions for
these new combinations (if they make sense... otherwise, a
NoApplicableMethods exception is perfect!).

I hope this analogy is clearer than my original description.
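The analogy above can be sketched as a self-contained multi-method toy (a linear-scan registry, not PEAK-Rules itself, which additionally indexes rules and orders them by specificity; all class and rule names here are hypothetical):

```python
class NoApplicableMethods(Exception):
    pass

# Hypothetical domain classes for the analogy.
class Car: pass
class Pickup(Car): pass
class Tire: pass
class SnowTire(Tire): pass

_rules = []  # list of (predicate, implementation) pairs

def rule(pred):
    """Register an implementation guarded by a predicate."""
    def register(fn):
        _rules.append((pred, fn))
        return fn
    return register

def optimal_acceleration(car, weather):
    # Toy linear scan; PEAK-Rules instead builds an indexed dispatch
    # tree and picks the most *specific* applicable rule.
    for pred, fn in _rules:
        if pred(car, weather):
            return fn(car, weather)
    raise NoApplicableMethods(optimal_acceleration, car, weather)

@rule(lambda car, weather: isinstance(car, Pickup)
      and isinstance(car.tires, SnowTire) and weather == "snow")
def _pickup_snow(car, weather):
    return "gentle throttle, early upshifts"

pickup = Pickup()
pickup.tires = SnowTire()
print(optimal_acceleration(pickup, "snow"))
```

Unsupported combinations fall through to NoApplicableMethods, matching the behaviour asked for above.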


Sebastien
PJ Eby
2013-09-03 01:33:23 UTC
Post by Sébastien de Menten
foo.when = lambda condition: when(foo, condition) #<== this could be
better implemented in the abstract decorator
Better still might be to implement @when() and other such APIs as
macros that transform just the second argument; that way your code
would not need the extra "s" function. Still, it's definitely a way
around the IDE issues, and doesn't suffer from the usual issues of
access to source code (I assume, anyway).
Post by Sébastien de Menten
These functions may depend on the Car(Transmission, Tire) combination, and
while not all combinations may be supported, I want to be able to add
support for any of them. Moreover, if a user adds a new kind of
transmission, car, or tire, I want to be able to add the functions for
these new combinations (if they make sense... otherwise, a
NoApplicableMethods exception is perfect!).
Sounds just like what PEAK-Rules is intended for.

One thing you might want to watch out for in de-structuring rules like
this (ones that depend on attributes of a parameter), is to remember
that PEAK-Rules generally applies tests to parameters before testing
attributes. This doesn't affect rule *precedence* (as far as what's
most specific), but it does affect *evaluation order*.

For example, if all your rules for a given function read in this order:

isinstance(car.tires, Foo) and isinstance(car.transmission, Bar)

That is, if they all check the tires before checking the transmission,
then PEAK-Rules will conservatively assume that it's not safe to
access the transmission attribute until after the tires are checked.
This limits the shapes that the dispatch tree will be built in, even
though it doesn't affect the answers given. Under some circumstances,
this might create larger or slower dispatch trees.
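A minimal illustration of why the written order is preserved, using a hypothetical class that lacks one of the attributes:

```python
class Sled:
    """Hypothetical vehicle: has tires but no transmission attribute."""
    tires = "snow"

s = Sled()

# Left-to-right short-circuiting makes the written order safe: the first
# test is False, so s.transmission is never evaluated.
safe = isinstance(s.tires, int) and isinstance(s.transmission, str)
print(safe)  # False, and no AttributeError raised

# Reversing the tests would evaluate s.transmission first and raise
# AttributeError -- hence the conservative choice to keep the written order.
try:
    isinstance(s.transmission, str) and isinstance(s.tires, int)
except AttributeError:
    print("reversed order raises AttributeError")
```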

In contrast, tests based directly on parameters are assumed to be
independent of each other (e.g. "param1>1 and param2>2" is assumed to
allow checking either condition first) and so can be used at any level
of the dispatch tree, and so the builder is free to use whatever shape
works best for that subtree (at a cost of a little more setup time to
decide which shape is best.)

I guess what I'm trying to say is that if you have a relatively flat
rule structure, based mainly on car, car.tires, and car.transmission,
you may find it useful to make the tires and transmissions direct
parameters, or else to deliberately vary the order they appear in your
rules, so that the rule system will have maximum freedom to construct
dispatch trees.

I wouldn't even bother to mention this, except that from your
description I don't have any way to know just how complex your
dispatch trees will be. In general, I find that generic functions
tend to follow one of two patterns:

1. "Registry" functions -- not heavily dependent on predicates, more
dependent on types of a few key parameters, relatively simple rules
with occasional extra criteria
2. "Pattern matching" or "compiler" functions -- while one or two
parameter types may be involved, the bulk of the rules are predicates,
often deeply nested predicates on component objects; often found in
compilers or compiler-like systems (such as PEAK-Rules itself) to
pattern-match subtrees with complex conditions.

In pattern #2, evaluation order may be rather important, but the tree
size is fairly limited by the fact that most of the actual predicates
exist for only a few rules. In pattern #1, the bulk of the indexing
is for that handful of parameters, so as long as they're
freely-orderable (i.e., can be tested independently), you can't get a
blowup of tree size. (And in pattern #1, the tests are usually done
directly on the parameters, not on attributes or methods or formulas
using the parameters.)
From your vague description, I can't tell if your use case is more
like #1 or #2, or some sort of hybrid. If it's basically #1 but using
lots of independent attribute-level checks, and you have very large
rulesets, then you may want to promote the attributes to parameters
(or access those attributes in different order some of the time), in
order to allow PEAK-Rules full tree reordering freedom. It will not
affect the answers you get, only memory usage (and possibly speed).
Even then, such a blowup is unlikely in the kinds of uses I'm familiar
with, but since I'm not familiar with your case, I am being
extra-thorough and cautious. ;-)
Sébastien de Menten
2013-09-05 21:09:14 UTC
Post by PJ Eby
@abstract
pass
foo.when = lambda condition: when(foo, condition) #<== this could be
better implemented in the abstract decorator
Better still might be to implement @when() and other such APIs as
macros that transform just the second argument; that way your code
would not need the extra "s" function. Still, it's definitely a way
around the IDE issues, and doesn't suffer from the usual issues of
access to source code (I assume, anyway).
Indeed, good idea!
Post by PJ Eby
From your vague description, I can't tell if your use case is more
like #1 or #2, or some sort of hybrid. If it's basically #1 but using
lots of independent attribute-level checks, and you have very large
rulesets, then you may want to promote the attributes to parameters
(or access those attributes in different order some of the time), in
order to allow PEAK-Rules full tree reordering freedom. It will not
affect the answers you get, only memory usage (and possibly speed).
Even then, it's unlikely to do it in the kinds of uses I'm familiar
with, but since I'm not familiar with your case, I am being
extra-thorough and cautious. ;-)
It is definitely more the #2 (pattern matching). Is there some
introspection ability on the decision tree PEAK-Rules builds? Otherwise,
I'll first check the overhead to see whether it is significant before
optimising.
Thank you for your detailed description of these mechanisms!
PJ Eby
2013-09-05 23:05:40 UTC
Post by Sébastien de Menten
It is definitely more the #2 (pattern matching).
Well, that's good; the base algorithm can usually perform at least as
well for that case as a human-optimized decision tree, in the case
where the human doesn't know anything about how often different
branches are taken. ;-)

(In theory I could add dynamic branch optimizations, but in most
applications it might use more time than it saves.)
Post by Sébastien de Menten
Is there some introspection
ability on the decision tree PEAK-rules build ?
Not at present, no.
Post by Sébastien de Menten
Otherwise, I'll first check
the overhead to see whether it is significant before optimising.
Thank you for your detailed description of these mechanisms!
There is even more description in the various .txt files in the
source, which document and test various internals, if you want to know
more.
