Hey, so this is a fairly long back-and-forth taking place on the
babel-users list, but I think it's something our devs are going to
want to keep an eye on:
http://lists.alioth.debian.org/pipermail/babel-users/2015-December/002183.h…
The gist of it is that the wlanslovenija project switched to babeld some
time ago and is starting to encounter a few issues. One in particular may
be related to their topology (which is similar to ours): a central node
with many neighbours and an enormous routing table might not be a use case
that babeld's current implementation supports well.
It's not clear that there is a single identified cause of the issues with
the Slovenian network, but this exchange is somewhat illustrative:
> The amount of state that a Babel node maintains is proportional to v*r,
> where v is the number of neighbours and r the number of routes. Your
> network is somewhat unusual in that it has some very central nodes -- 75
> neighbours max, I believe --, which is something that Babel doesn't like
> very much. The protocol should be able to deal with that (75 * 500 is
> less than 40000), but the implementation will likely need some tuning.
> I'm hoping that you can help me do the tuning.
> > Or are we now the largest network using it and this is why we are
> > getting in all this trouble?
> You are the largest Babel network right now. I'm very excited about your
> deployment, and I'm looking forward to tuning the babeld implementation to
> work well enough for your needs.
> > So this is just another academic project which looks good on the paper
> > but in practice it is not really production grade?
> Most academic projects produce no useful software, just simulation. We
> are doing our best to provide production-quality software, and as a matter
> of fact babeld is running right now in a production network of 200 nodes.
> However, Nexedi's network has been designed with Babel in mind, and it
> doesn't have any central nodes -- all nodes have roughly the same number
> of neighbours.
> > We had to turn of Babel in the network and go back to OLSRv1.
> Which is a reasonable thing to do in order to solve your short-term
> issues. I hope that you'll remain open to working with me to get babeld
> to scale to your needs -- I assure you that it can be done, but I need
> profiling data in order to do that.
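
To put the v*r figure from the quoted exchange in concrete terms, here is a
rough sketch of the arithmetic. The function name is made up for
illustration, and it only counts (neighbour, route) pairs; it says nothing
about how babeld actually stores them or how much memory each one takes.
The 75 and 500 are the numbers from the quote above.

```python
def babel_state_entries(neighbours: int, routes: int) -> int:
    """Rough estimate of per-node state: one entry per (neighbour, route) pair.

    Hypothetical helper for illustration only; not part of babeld.
    """
    if neighbours < 0 or routes < 0:
        raise ValueError("neighbours and routes must be non-negative")
    return neighbours * routes


# Figures taken from the quoted message: 75 neighbours max, 500 routes.
central_node = babel_state_entries(neighbours=75, routes=500)
print(central_node)  # 37500, which is under the 40000 mentioned in the quote
```

Plugging in our own neighbour and route counts for the exit server would
give us a rough idea of how far we are from the Slovenian case.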
We haven't run into any issues yet, but our network is much smaller than
the Slovenian one. We should keep an eye out for any resolution of this
issue on the babel-users list and watch for error messages from our nodes,
in particular our exit server.
Max