On Mon, Mar 30, 2015 at 3:07 AM, Marc Juul <juul(a)labitat.dk> wrote:
Oh look: It's another very long email from juul!
I've been working on the config for the N750 + antenna-node configuration
and here are my thoughts so far.
We should _not_ let the LAN ports be one big ethernet interface for use
both as a wired version of open0 _and_ a way to connect antenna-nodes. We
never want to bridge two interfaces with attached antenna-nodes but
treating the multiple LAN ethernet interfaces as one interface is
effectively the same as bridging.
Example scenario: You have two nanobridges on your roof linking you to two
parts of the mesh. They are both plugged into LAN ports on your N750 and
since the LAN ports are treated as one interface then the two nanobridges
are able to communicate on layer 2. The nodes at the houses to which you
are connecting with your nanobridges have the same setup. Now we
effectively have the spanning tree protocol as our mesh protocol instead of
babel.
It was Alex who pointed this out when we were talking about bridging but
for some reason I hadn't connected the dots and recognized that the same is
true when telling the built-in switch to treat the interfaces as one.
This means that each physical port on the N750 that we wish to connect to
an antenna-node must be its own interface and should have a /32 netmask.
These interfaces can still have the same node IP as open0, etc, without any
problems.
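
To make the scenario above concrete, here is a rough sketch that counts layer 2 broadcast domains for the two-nanobridge example. The port names, the topology and the broadcast_domains helper are all made up for illustration, and it assumes the nanobridges simply bridge ethernet and wifi.

```python
# Sketch: which ports end up in the same layer 2 broadcast domain?
# All device and port names below are hypothetical.

def broadcast_domains(links, bridged_ports):
    """Group ports into layer 2 domains using union-find.

    links: pairs of ports joined by a cable or a radio link.
    bridged_ports: groups of ports that one device treats as one interface.
    """
    parent = {}

    def find(p):
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in links:
        union(a, b)
    for group in bridged_ports:
        for port in group[1:]:
            union(group[0], port)

    domains = {}
    for port in list(parent):
        domains.setdefault(find(port), set()).add(port)
    return list(domains.values())


# One N750 ("home") with a nanobridge on each of two LAN ports.
LINKS = [
    ("home.lan1", "nb1.eth"), ("nb1.wifi", "far1.wifi"),
    ("home.lan2", "nb2.eth"), ("nb2.wifi", "far2.wifi"),
]
# Assumption: each nanobridge bridges its ethernet and wifi interfaces.
ANTENNAS = [["nb1.eth", "nb1.wifi"], ["nb2.eth", "nb2.wifi"]]

switched = broadcast_domains(LINKS, ANTENNAS + [["home.lan1", "home.lan2"]])
separate = broadcast_domains(LINKS, ANTENNAS)
print(len(switched), len(separate))
```

With the LAN ports treated as one interface, everything collapses into a single domain that reaches both remote houses. With one interface per port, each link stays its own domain and babel gets to make the routing decisions.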
I suggest we allocate two of the four ports for this purpose until we can
make something that intelligently changes the config when an antenna-node
has been connected.
We now have two remaining LAN ports that can act as a single interface. We
could then bridge this LAN interface to open0. However, we want to avoid
channel interference when sending from/to open0 but there is no reason we
should try to avoid channel interference for traffic coming from the wired
interfaces. If we bridge them we cannot treat them differently, so we
should not bridge.
It is not clear if babel channel diversity also takes into account channel
information for manually published routes such as the open0 route we
currently publish with this rule:
redistribute if open0 metric 128
I'll have to look at the source code to check. If the functionality is not
there then it should be fairly easy to add.
Aaaanyway: Since we don't want to bridge open0 and LAN, this complicates
things because we then need two subnets per node (otherwise we have the
same /26 on both LAN and open0, which is not going to work). Anything smaller
than a /26 on either interface gets iffy, so it seems like we'll need to have two
/26 subnets per node now.
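
For a feel of the numbers, here is a small sketch using Python's ipaddress module. The 10.0.0.0/25 block is only an example and not an actual allocation; the point is just that two /26 subnets per node amount to one /25 per node.

```python
import ipaddress

# Hypothetical per-node block; the real allocation scheme may differ.
node_block = ipaddress.ip_network("10.0.0.0/25")

# One /26 for open0 and one /26 for the wired LAN interface.
open0_net, lan_net = node_block.subnets(new_prefix=26)


def usable_hosts(net):
    # Subtract the network and broadcast addresses.
    return net.num_addresses - 2


print(open0_net, usable_hosts(open0_net))  # 10.0.0.0/26 62
print(lan_net, usable_hosts(lan_net))      # 10.0.0.64/26 62

# For comparison, dropping to a /27 leaves only 30 usable addresses.
print(usable_hosts(ipaddress.ip_network("10.0.0.0/27")))
```

The node's own IP on each interface also comes out of those 62 addresses.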
One last but important thing I realized: If the antenna-nodes have their
wifi and ethernet interfaces bridged, then we will have problems. Imagine
the following setup:
(N750 A) ------ (nanobridge A) ~~~~ (nanobridge B) ----- (N750 B)
Where "-----" means ethernet and "~~~~" means wifi.
If both nanobridge A and nanobridge B are simply bridging their ethernet
and wifi interfaces, then we have the following problems:
1. If e.g. nanobridge A sends a DHCP request, then it will be received by
both N750 A and N750 B and they will have no reliable way of knowing whether the
request was sent by "their" nanobridge or the remote nanobridge.
2. When managing e.g. nanobridge A via the N750 A web admin interface it
will be impossible for nanobridge A to know whether it should grant admin
access to N750 A or N750 B.
I can think of only two solutions to this:
1. Pre-configure all antenna-nodes with static IPs and knowledge of which
N750 node is their parent.
2. Don't bridge the ethernet and wifi interfaces on antenna-nodes and
instead run babel on them.
There's a third solution: On boot-up the antenna-nodes do not bridge their
interfaces. They then get an IP from the N750 and run a hook script that
bridges the two interfaces. Another hook script is run to unbridge upon
physical ethernet disconnect.
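
The logic of those two hook scripts boils down to a tiny state machine. Here is a sketch of just that logic; the event names are invented for illustration, and the actual bridging commands are left out since they depend on the firmware.

```python
def next_state(bridged, event):
    """Return whether wifi and ethernet should be bridged after `event`.

    The event names are hypothetical placeholders for whatever the real
    hook scripts would be triggered by.
    """
    if event == "lease_acquired":
        return True   # the parent N750 is now known, so bridge
    if event == "ethernet_down":
        return False  # parent lost, go back to the unbridged state
    return bridged    # any other event leaves the state unchanged


state = False  # on boot-up the interfaces are not bridged
for event in ["lease_acquired", "ethernet_down"]:
    state = next_state(state, event)
    print(event, "->", "bridged" if state else "unbridged")
```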
Unfortunately it seems like it's not possible to use dnsmasq as the dhcp
server on these interfaces since the netmask will be /32 on the N750 and
dnsmasq figures out which interface to use based on the subnet. It might be
better to make a very very simple dhcp-like server that uses a different
port and only ever gives out one specific IP for each port. This might not
be a bad thing since it will prevent normal DHCP clients from getting an IP
when they connect to the N750 ethernet ports dedicated to antenna-nodes.
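
As a rough idea of how small such a server could be, here is a sketch. Everything in it is an assumption: the port number, the interface names, the addresses (taken from the 192.0.2.0/24 documentation range) and the made-up "LEASE?" message format. It picks the address purely from the interface the request arrived on, so the /32 netmask never comes into play.

```python
import socket

# Hypothetical values, not taken from any existing config.
LEASE_PORT = 6700
ADDRESS_FOR_INTERFACE = {
    "eth0.10": "192.0.2.2",
    "eth0.11": "192.0.2.3",
}


def build_reply(interface, request):
    """Return the reply bytes for a request seen on `interface`, or None."""
    if request.strip() != b"LEASE?":
        return None
    address = ADDRESS_FOR_INTERFACE.get(interface)
    if address is None:
        return None
    return b"LEASE " + address.encode("ascii")


def serve(interface):
    """Answer lease requests on one interface (Linux only, needs root)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    # SO_BINDTODEVICE ties the socket to a single interface, so the
    # interface itself identifies which port the request came from.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                    interface.encode())
    sock.bind(("", LEASE_PORT))
    while True:
        request, _peer = sock.recvfrom(64)
        reply = build_reply(interface, request)
        if reply is not None:
            # The client has no address yet, so reply by broadcast.
            sock.sendto(reply, ("255.255.255.255", LEASE_PORT))


print(build_reply("eth0.10", b"LEASE?"))
```

One serve() process per antenna-node port would be enough, and a normal DHCP client plugged into one of those ports would never get an answer.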
I like solution 2 much better since it's easier for both us and node
operators. Not sure if we'll see a performance hit if we don't bridge. We
have a few nanobridges though so we could easily test this.
What do y'all say?
PS: Obviously babel channel diversity doesn't apply to antenna nodes since
babel sees them as ethernet interfaces but since they are mostly
directional and far away from the N750 it is fine to treat them as having
no interference, which is the default for ethernet interfaces anyway.
PPS: Did you know that the default STP delay when adding an interface to a
bridge is apparently 30 seconds? Not sure if OpenWRT has a different
default or maybe uses RSTP (rapid spanning tree protocol) to deal with
this, but if not then it is something to be aware of in order to not go
insane when troubleshooting. From:
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge#Doe…
--
marc/juul