Hey, there's a fairly long back-and-forth taking place on the babel-users list right now, and I think it's something our devs are going to want to keep an eye on:

http://lists.alioth.debian.org/pipermail/babel-users/2015-December/002183.html


The gist of it is that the wlanslovenija project switched to babeld some time ago and is starting to run into a few issues. One in particular may be related to their topology (which is similar to ours): a central node with an enormous routing table might not be a use case that babeld, in its current implementation, supports well.

It's not clear that any single cause has been identified for the issues on the Slovenian network, but this exchange is somewhat illustrative:

The amount of state that a Babel node maintains is proportional to v*r,
where v is the number of neighbours and r the number of routes.  Your
network is somewhat unusual in that it has some very central nodes -- 75
neighbours max, I believe --, which is something that Babel doesn't like
very much.  The protocol should be able to deal with that (75 * 500 is
less than 40000), but the implementation will likely need some tuning.
I'm hoping that you can help me do the tuning.

> Or are we now the largest network using it and this is why we are
> getting in all this trouble?

You are the largest Babel network right now.  I'm very excited about your
deployment, and I'm looking forward to tuning the babeld implementation to
work well enough for your needs.

> So this is just another academic project which looks good on the paper
> but in practice it is not really production grade?

Most academic projects produce no useful software, just simulation.  We
are doing our best to provide production-quality software, and as a matter
of fact babeld is running right now in a production network of 200 nodes.
However, Nexedi's network has been designed with Babel in mind, and it
doesn't have any central nodes -- all nodes have roughly the same number
of neighbours.

> We had to turn off Babel in the network and go back to OLSRv1.

Which is a reasonable thing to do in order to solve your short-term
issues.  I hope that you'll remain open to working with me to get babeld
to scale to your needs -- I assure you that it can be done, but I need
profiling data in order to do that.
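
To put rough numbers on the v*r point from the quoted reply, here's a quick back-of-the-envelope sketch in Python. The function name is something I made up for illustration, and the result is only the proportionality figure from the reply (neighbours times routes), not a measurement of what babeld actually allocates:

```python
# Back-of-the-envelope sketch, not babeld's actual accounting.
# Assumption: per-node state grows as v * r, as stated in the quoted reply.

def babel_state_estimate(neighbours: int, routes: int) -> int:
    """Return v * r, a rough proxy for the state one node has to track."""
    return neighbours * routes

# Figures taken from the quoted reply: 75 neighbours max, 500 routes.
central_node = babel_state_estimate(neighbours=75, routes=500)
print(central_node)          # 37500
print(central_node < 40000)  # True
```

If we want a feel for where we stand, we could plug in the neighbour and route counts from our own exit server and see how close we get to that figure.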

We haven't run into any issues yet, but our network is much smaller than the Slovenian one. We should keep an eye on the babel-users list for any resolution of this issue, and watch for error messages from our nodes, in particular our exit server.


Max