[PEAK] recursion limit

Discussion:

nicky van foreest

2011-07-15 18:44:26 UTC

Hi,

I ran into a RuntimeError: maximum recursion depth exceeded in cmp
with Trellis in this (simple) piece of code. I use it to compute the
arrival time, waiting time, etc, for a set of jobs in a simulation.
(These jobs are served on a single server.)

from peak.events import trellis

class Job(trellis.Component):
trellis.attrs(
a = 0, # interarrival time (to previous job)
b = 0, # processing time
)

def __init__(self, a, b, prev = None):
self.a = a
self.b = b
self.prev = prev

def __repr__(self):
ret = str(self.a) + " "
ret += str(self.b) + " "
ret += str(self.arTime) + " "
ret += str(self.waitTime) + " "
ret += str(self.sojourn) + " "
return ret

trellis.compute.attrs(
arTime = lambda self: self.prev.arTime + self.a if self.prev else 0,
waitTime = lambda self: max(self.prev.sojourn - self.a if
self.prev else 0, 0),
sojourn = lambda self: self.waitTime + self.b
)

prev = Job(0,4)
sim = [prev]
for i in range(30):
nxt = Job(1,8,prev)
sim.append(nxt)
prev = nxt

for j in sim:
print j

I think I can repair this by resetting the recursion depth. However, I
suppose this will not work if I run a simulation with a 10e6 jobs. Why
actually does trellis run into this problem? It makes me a bit
suspicious about the scaleability of trellis, or should I not worry
about this?

thanks for any hints.

Nicky

P.J. Eby

2011-07-15 19:42:00 UTC

Permalink

Post by nicky van foreest
Hi,
I ran into a RuntimeError: maximum recursion depth exceeded in cmp
with Trellis in this (simple) piece of code.

I ran the code and it does not produce an error for me; it just runs.

Post by nicky van foreest
I think I can repair this by resetting the recursion depth. However, I
suppose this will not work if I run a simulation with a 10e6 jobs. Why
actually does trellis run into this problem?

Since I'm not able to reproduce the problem, I couldn't
say. However, if you are getting recursion in a comparison, it's
possible that it's because you are using a __cmp__ or other
comparison method in your code that's not in the sample you sent me,
wherein that comparison relies on trellis properties. (OTOH, if that
were the case, then changing the recursion depth wouldn't fix it.)

Post by nicky van foreest
It makes me a bit
suspicious about the scaleability of trellis, or should I not worry
about this?

I don't think so. The recalculation algorithm is only recursive if
you request calculation of a value that has not previously been
calculated. That is, if you set up a chain of a million items, and
only then request some info about the last one, *and* the values it
depends on are lazy (that is, they don't get automatically computed
just by initializing their owner object), then yes, it will have to
recurse to one million depth (times a constant).

So, if you are going to set up a heavily chained calculation, you can
make it non-recursive simply by ensuring that the desired property is
calculated eagerly, e.g. by using trellis.maintain(). This will both
amortize the initial calculation over your setup time, *and* ensure
that any subsequent read operations can't end up being recursive.

Post by nicky van foreest
thanks for any hints.

FYI, there is potentially a bug in your code if a Job's 'prev'
attribute can be changed. As your code stands, changing a job's
'prev' attribute will leave its calculations linked to the old
'prev', until something causes the rules to be recalculated. (At
which time they'll link to the new 'prev'.) You should make 'prev' a
Trellis attribute if it's not a constant. Then, changing it will
cause the appropriate recalcuations to occur.

P.J. Eby

2011-07-15 23:05:09 UTC

Permalink

For me this works too, but if you change the range from 30 to 3000,
say, I suspect you will get an error. In fact, at my machine I already
get an error when the range is 100.

Ok, confirmed at 100, and switching the 'compute' attrs to 'maintain'
attrs makes it go away, even up to 10000.

Ok. I get this. So the trick is to prevent this, like you propose with
using trellis.maintain.

Yes. Lazy computation is for things that you may or may not need to
compute. If you know you are going to need it from the start, then
you should use maintain.

I have some more theoretical questions about trellis. If you don't
have time, please skip them. First, if I were to start something like
trellis, I think I would use some topological sort on a graph. I
cannot find this in the trellis code.

Cell objects have a level attribute that's used to indicate their
distance from a dependency graph root; is that what you're asking about?

Second, I would like to have a
program/parser that figures out by itself the dependency graph of the
attributes and functions. Now I have to make explicit the
trellis.attrs and the trellis.compute.attrs/trellis.maintain.attrs,
which I would prefer not to do by hand, but let the computer figure it
out. Why is this is not implemented in trellis? (I must admit that I
don't quite know how to do this; it is just a question out of
interest.)

I don't understand what feature you're asking for here. The
distinction between compute, maintain, and attrs is part of the
design of your program, not something that a parser can figure
out. By stating that something is a maintained value, it means that
it may have side-effects and also that it should be able to access
the previously computed value. By saying it's computed, you are
saying there are no side-effects to the computation and that it
should only be calculated as needed. Otherwise, it is just an
observable variable.

There is no way in a language like Python to determine by mere
code-parsing whether something has side-effects, so that distinction
at least *cannot* be implemented in an automatic way, no matter how
good a programmer you are. ;-)

In any case, the declarations of your attributes and functions serve
not only as implementation instructions for the Trellis, but also as
documentation of your intentions to the reader of your code.

Nicky

P.S. When replying to a message that's on a mailing list like this,
it's best to use "Reply To All" so that the mailing list is
included. (i.e., your message went only to me, and not to the list.)

nicky van foreest

2011-07-16 21:48:55 UTC

Permalink

HI Philip,

Thanks for your explanations.

Ok, confirmed at 100, and switching the 'compute' attrs to 'maintain' attrs
makes it go away, even up to 10000.

Here it also works at 10000 too, but it becomes noticebly slow.
However, this is as expected, given the amount of work.

Cell objects have a level attribute that's used to indicate their distance
from a dependency graph root; is that what you're asking about?

Yes.

?Second, I would like to have a
program/parser that figures out by itself the dependency graph of the
attributes and functions. Now I have to make explicit the
trellis.attrs and the trellis.compute.attrs/trellis.maintain.attrs,
which I would prefer not to do by hand, but let the computer figure it
out. Why is this is not implemented in trellis? (I must admit that I
don't quite know how to do this; it is just a question out of
interest.)

I don't understand what feature you're asking for here. ?The distinction
between compute, maintain, and attrs is part of the design of your program,
not something that a parser can figure out. ?By stating that something is a
maintained value, it means that it may have side-effects and also that it
should be able to access the previously computed value. ?By saying it's
computed, you are saying there are no side-effects to the computation and
that it should only be calculated as needed. ?Otherwise, it is just an
observable variable.

Ok. I did not understand this distinction well enough up to now, at
least not in the sense that this distinction can (and should) be used
as a part of the design of a program.

Nicky