[mesh-dev] Thinking through LAN port usage, or: Why bridges are bad news

Marc Juul juul at labitat.dk
Mon Mar 30 03:07:19 PDT 2015


Oh look: It's another very long email from juul!

I've been working on the config for the N750 + antenna-node configuration
and here are my thoughts so far.

We should _not_ let the LAN ports be one big ethernet interface for use
both as a wired version of open0 _and_ a way to connect antenna-nodes. We
never want to bridge two interfaces with attached antenna-nodes but
treating the multiple LAN ethernet interfaces as one interface is
effectively the same as bridging.

Example scenario: You have to nanobridges on your roof linking you to two
parts of the mesh. They are both plugged into LAN ports on your N750 and
since the LAN ports are treated as one interface then the two nanobridges
are able to communicate on layer 2. The nodes at the houses to which you
are connecting with your nanobridges have the same setup. Now we
effectively have the spanning tree protocol as our mesh protocol instead of
babel.

It was Alex who pointed this out when we were talking about bridging but
for some reason I hadn't connected the dots and recognized that the same is
true when telling the built-in switch to treat the interfaces as one.

This means that each physical port on the N750 that we wish to connect to
an antenna-node must be its own interface and should have a /32 netmask.
These interfaces can still have the same node IP as open0, etc, without any
problems.

I suggest we allocate two of the four ports for this purpose until we can
make something that intelligently changes the config when an antenna-node
has been connected.

We now have two remaining LAN ports that can act as a single interface. We
could then bridge this LAN interface to open0. However, we want to avoid
channel interference when sending from/to open0 but there is no reason we
should try to avoid channel interference for traffic coming from the wired
interfaces. If we bridge them we cannot treat them differently, so we
should not bridge.

It is not clear if babel channel diversity also takes into account channel
information for manually published routes such as the open0 route we
currently publish with this rule:

  redistribute  if open0  metric 128

I'll have to look at the source code to check. If the functionality is not
there then it should be fairly easy to add.

Aaaanyway: Since we don't want to bridge open0 and LAN, this complicates
things because we then need two subnets per node (otherwise we have the
same /26 on both LAN and open0, which is not going to work). Less than a
/26 on either interface gets iffy, so it seems like we'll need to have two
/26 subnets per node now.

One last but important thing I realized: If the antenna-nodes have their
wifi and ethernet interfaces bridged, then we will have problems. Imagine
the following setup:

(N750 A) ------ (nanobridge A) ~~~~ (nanobridge B) ----- (N750 B)

Where "-----" means ethernet and "~~~~" means wifi.

If both nanobridge A and nanobridge B are simply bridging their ethernet
and wifi interfaces, then we have the following problems:

1. If e.g. nanobridge A sends a DHCP request, then it will be received by
both N750 A and N750 B and they will have no reliable way of know if the
request was sent by "their" nanobridge or the remote nanobridge.

2. When managing e.g. nanobridge A via the N750 A web admin interface it
will be impossible for nanobridge A to know whether it should grant admin
access to N750 A or N750 B.

I can think of only two solutions to this:

1. Pre-configure all antenna-nodes with static IPs and knowledge of which
N750 node is their parent.

2. Don't bridge the ethernet and wifi interfaces on antenna-nodes and
instead run babel on them.

I like solution 2 much better since it's easier for both us and node
operators. Not sure if we'll see a performance hit if we don't bridge. We
have a few nanobridges though so we could easily test this.


What do y'all say?


PS: Obviously babel channel diversity doesn't apply to antenna nodes since
babel sees them as ethernet interfaces but since they are mostly
directional and far away from the N750 it is fine to treat them as having
no interference, which is the default for ethernet interfaces anyway.

PPS: Did you know that the default STP delay when adding an interface to a
bridge is apparently 30 seconds? Not sure if OpenWRT has a different
default or maybe uses RSTP (rapid spanning tree protocol) to deal with
this, but if not then it is something to be aware of in order to now go
insane when troubleshooting. From:
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge#Does_DHCP_work_over.2Fthrough_a_bridge.3F


-- 
marc/juul
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://sudoroom.org/lists/private/mesh-dev/attachments/20150330/2249799f/attachment.html>


More information about the mesh-dev mailing list