At Omni we're using TP-Link dual-band home routers (N750) and modifying
them for PoE.
These routers have gigabit ethernet.
Unfortunately, gigabit ethernet and PoE at the same time require special
ethernet transformers, and these routers don't have that type of transformer.
This means that they drop to 100 mbit when we modify them for PoE (but only
on the port we modify).
I was looking into how to get around this problem.
I thought maybe there was a way to do gigabit half-duplex on two or three
pairs, but it seems like that's not possible.
BUT! It looks like TP-Link makes a PoE splitter which can apparently handle
gigabit:
The TL-POE10R
Here it is for $12:
http://www.ebay.com/itm/TP-LINK-TL-POE10R-Gigabit-PoE-Splitter-Adapter-IEEE…
I ordered one so we can try it out. If it works we should just do that to
all of the routers. No reason not to have gigabit if it's only $12 extra
per router. It's still only $77 total for router + PoE splitter.
Oh, and btw, someone is selling dirt-cheap Ubiquiti UniFi UAP devices:
http://www.ebay.com/itm/New-Ubiquiti-UniFi-UAP-802-11n-300-Mbps-Wireless-Ac…
Caution: They are 2.4 GHz only.
I ordered one for the mesh, just so we have it to test our firmware on.
--
marc/juul
Here's what went down.
# Added functionality to our babeld fork
* -x for dynamically removing interfaces (we only had -a to add them)
* -F to enable the dynamic functionality (fungible mode)
* -i to print the "kill -USR1" information for the running babeld
babeld no longer requires any interfaces to be specified when initially
started if fungible mode is enabled
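For illustration, here's my guess at how these combine in practice (the
exact invocation may differ; the fork's README is authoritative):
```
# start babeld in fungible mode with no interfaces at all
babeld -F &
# print the "kill -USR1" info for the babeld that is already running
babeld -i
# dynamically add, and later remove, an interface
babeld -a adhoc0
babeld -x adhoc0
```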
# Switched our firmwares to using our babeld fork
Here it is:
https://github.com/sudomesh/sudowrt-packages/tree/master/net/babeld-sudowrt
We were only using it on the VPuN (exit) server before.
I haven't tried to recompile the firmware with this package added. Maybe
someone else can test that this compiles correctly?
Max: be aware that VPuN servers will now have to start the new version with
-F to get the dynamic functionality.
# Completed extender-node functionality
Everything now works as expected with babeld running on the extender nodes.
The extender nodes come up automatically and both the open and adhoc
networks work.
Due to feedback from Dave Täht, I abandoned adding avahi-daemon as a
reflector on the extender nodes and pushed forwarding of mDNS traffic to
the milestone for a future release.
The one thing left to do for the extender nodes is to re-compile both
firmwares from scratch, flash two nodes and test that it all comes up as
expected. I've tried hard to bring the repositories in line with the
working configuration on my two test nodes, but I may have missed something.
# Added milestones and issues on github
Milestones:
https://github.com/sudomesh/sudowrt-firmware/milestones
Issues for upcoming version 0.2:
https://github.com/sudomesh/sudowrt-firmware/milestones/0.2%20-%20initial%2…
Please add any issues I may have missed. Also, please change things if you
disagree :) I just did what I thought made sense but I'm not married to
anything.
Yay progress!
--
marc/juul
# The problem
Our extender-nodes can't bridge their adhoc wifi interface to their
ethernet interface since bridging adhoc is not possible. A layer 3
pseudo-bridge using relayd should work but both Babel and mDNS rely on
multicast which is not handled by relayd.
# Solutions
It looks like we have two options:
1. Forward both babel and mDNS IPv4 and IPv6 multicast messages between
interfaces selectively based on the multicast addresses of these protocols
2. Have extender-nodes forward / route multicast traffic in general
(pimd/pim6sd).
Running babeld on the extender-nodes is a solution, but not ideal, since it
makes Babel treat extender-nodes as real nodes, adding signalling traffic
that should be unnecessary and complicating our network graphs.
## Selective multicast routing/forwarding
mDNS has existing solutions available, such as:
* avahi with the enable-reflector option (kinda bulky; config snippet below)
* mdns-repeater (no openwrt package exists)
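If we do go the avahi route, the relevant bit of avahi-daemon.conf is just
this, per the avahi documentation:
```
[reflector]
enable-reflector=yes
```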
For Babel we are not so lucky. Stripping babeld down so it is only a dumb
forwarder for Babel multicast traffic and then running that on the
extender-nodes might be a way to go.
Another option is to use a general-purpose IPv4/IPv6 multicast
reflector/forwarder that can be configured to relay or route traffic for a
set of multicast addresses and ports.
RFC 6621, which is implemented in nrlsmf, is something like that, but:
* It is a userland process that sniffs packets in promiscuous mode (shit
performance)
* It forwards _all_ multicast traffic (which would be fine for our use case
if it wasn't doing this in userland)
* It has no specified license, though the source is available
mcproxy (similar to igmpproxy but more modern) is not too far from what we
need. It seems likely that mcproxy could be fairly easily modified for
selective forwarding. mcproxy is nice because it doesn't intercept data in
userland in order to forward (it uses the kernel multicast routing
features). Currently it listens for multicast subscriptions from the
downstream network and then begins forwarding multicast traffic on the
subscribed IP to the downstream interface.
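For reference, mcproxy's existing config is roughly this shape (syntax as I
understand it from the mcproxy docs, with our usual interface names;
untested):
```
protocol IGMPv3;
pinstance mesh: eth0.1 ==> adhoc0;
```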
What we need is a simpler daemon that doesn't listen for subscriptions but
instead forwards in both directions based on the following configuration
file options:
* IP version / protocol (IGMPv2, IGMPv3, MLDv1, etc.)
* Multicast IP to forward for (e.g. 224.0.0.111)
* Interfaces to forward between
So you could tell it, e.g.:
"forward all 224.0.0.111 and ff02::1:6 traffic between eth0.1 and adhoc0"
(babel traffic)
or:
"forward all 224.0.0.251 and ff02::fb traffic between eth0.1 and adhoc0"
(mDNS traffic)
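A config file for such a daemon might look something like this (entirely
hypothetical, since the daemon doesn't exist yet):
```
# babel (UDP port 6696)
forward group 224.0.0.111 group ff02::1:6 port 6696 between eth0.1 adhoc0
# mDNS (UDP port 5353)
forward group 224.0.0.251 group ff02::fb port 5353 between eth0.1 adhoc0
```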
# General-purpose multicast routing/forwarding
Instead of selectively forwarding we could run real multicast routing
daemons and simply configure them to forward all multicast traffic. The two
standard packages for this seem to be pimd for IPv4 and pim6sd for IPv6.
Attitude Adjustment has a pim6sd package but seems to be missing a package
for plain pimd. Even worse, it looks like Chaos Calmer drops the pim6sd
package as well (I really wish there was a central place where one could
read about all the packages OpenWRT drops/adds and the reasons for doing so).
We could try to run these two daemons, but we'd have to make a package for
pimd and figure out why they're dropping pim6sd. Doesn't seem like the
worst chore.
If we wanted to avoid pimd, then we could disable mDNS on IPv4, which
shouldn't present a problem since all mesh-connected devices will have IPv6
addresses anyway, but it's probably unlikely that babeld can function on
both IPv4 and IPv6 without IPv4 multicast (unless we change a bunch of
code). Totally worth checking if it already has that ability though, since
avoiding IPv4 multicast would be a bonus.
# Tricks
If you run "mcproxy -c" it will check if all relevant kernel config options
are enabled for IPv4 and IPv6 multicast routing. It looks like default
Attitude Adjustment has all but one feature enabled. Here's the output:
```
# mcproxy -c
Check the currently available kernel features.
- root privileges: Ok!
- ipv4 multicast: Ok!
- ipv4 multiple routing tables: Ok!
- ipv4 routing tables: Ok!
- ipv4 mcproxy was able to join 40+ groups successfully (no limit found)
- ipv4 mcproxy was able to set 40+ filters successfully (no limit found)
- ipv6 multicast: Ok!
ERROR: failed to set kernel table! Error: Protocol not available errno: 99
- ipv6 multiple routing tables: Failed!
- ipv6 routing tables: Ok!
- ipv6 mcproxy was able to join 40+ groups successfully (no limit found)
- ipv6 mcproxy was able to set 40+ filters successfully (no limit found)
```
It's unclear if we need support for multiple IPv6 routing tables (probably
not), but it's probably trivial to enable the kernel option if we do.
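If we do need it, the symbol behind that failed check is presumably the
IPv6 multicast-routing multiple-tables option, i.e. something like this in
the kernel config (an assumption, not verified against the mcproxy source):
```
CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
```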
# Conclusion
I propose that we run babeld on the extender nodes and use avahi as a
reflector for now. We can then revisit this for e.g. version 0.4.
What do you all say?
# Ponderings
Based on this preliminary research it looks like the long-term solution
involving the smallest amount of work is probably running pim6sd and pimd.
This is gratifying since it would be really cool to have a mesh that does
real actual multicast routing.
I'm especially excited about this since the film-maker collective at Omni
(Optik Allusions) seems to be growing and because I have been working hard
on video streaming solutions for Omni. It would be really cool if we could
multicast event video over the mesh and to other local radical spaces, e.g.
having places like the Long Haul or other hackerspaces function as overflow
seating/participation for workshops/lectures (and vice versa), or streaming
performances into local pubs instead of whatever corrupt pro sports they
normally offer!
--
marc/juul
As I am rolling out a bunch of new babel nodes, I decided to get a
cluster (2 nanos and a pico) up in the lab, where I have good
connectivity to the rest of the network, to replace an aging cluster
by the pool.
So I booted it up and configured it for the right channels and a new
set of IP addresses... it didn't have good LED support at all (RSSI does
not seem to do anything)...
I got blinkenlights to sort of work, and they were lit up, kind of
solid, for some reason... [1]
...people started wandering by to complain about the network...
naturally I didn't notice because I was even closer to the exit points
than anyone else...
...only to discover that I was offering the shortest path to the exit
nodes, and thus had bypassed the two existing ~50mbit links into the lab
with links that were located indoors and going through a thousand+ meters
of trees... a path that was barely doing a megabit with 800+ms of delay.
(channel diversity not working did not help either)
After that experience, I decided that I would make the firmware for
unconfigured nodes export a 512 metric, and use a high rxcost until
they were fully configured AND in place. I might disable ipv4 entirely
in favor of the autoconfigured ula openwrt has, and just start
configuring stuff based on the appearance of new ulas in the network.
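In babeld.conf terms, the 512 metric / high rxcost idea would presumably be
something like this (untested; check the babeld man page for the exact
keywords):
```
# make links to an unconfigured node expensive
default rxcost 512
# and export its routes with a poor metric
redistribute metric 512
```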
[1] if you come up with a useful LED config for nanostations and
picostations, let me know.
--
Dave Täht
worldwide bufferbloat report:
http://www.dslreports.com/speedtest/results/bufferbloat
And:
What will it take to vastly improve wifi for everyone?
https://plus.google.com/u/0/explore/makewififast
I am not normally on this list, but for some reason a search turned it
up this morning.
I run a babel-based mesh network down in Los Gatos and perhaps I have
an insight or two to help.
0) Bridging over wifi sucks, particularly when mdns or other multicast
traffic is present. It takes very little multicast to mess up your
whole day
1) Distributing ipv4 addresses and prefixes is a pita. The new hnetd
protocol intends to try and make that more automagic, but it has a
desire to be god-like over everything and every daemon. Still, more
eyeballs on it and perhaps it could be made to work.
See: http://www.homewrt.org/doku.php?id=overview
DNS is a pita, also. - some hope for mdns-proxy...
1.5) I used to use AHCP for interconnecting mesh nodes, which worked
quite well, particularly for ipv6. It did not distribute prefixes,
however, which led to (the thus far abortive) hnetd.
2) Babel's diversity algorithm seems to be suffering from an old
system call which only works in adhoc mode, not AP mode. It also
contains some bridge disambiguation logic (but failing to get the
channel right in that case does not help). It does sort of do the
right thing, even without getting the right info, shipping "interfering"
around... and it looks like short work to get it to pick up the right
channel automatically, given example code in the iw utility and olsrv2
- but I lack the time, personally, to do that right now.
Relevant thread and pointers here:
http://lists.alioth.debian.org/pipermail/babel-users/2015-June/002056.html
3) openwrt, at least, has moved away from dnsmasq being the dhcp
server in favor of their own server implementation. In neither case
does dhcp handle ptp /32s, so you need another address distribution
mechanism for this anyway.
Given the headaches in doing this automagically I lean towards static
assignment for ipv4.
4) babeld-1.6.0 had a bug in source specific routing, now fixed in
babeld-1.6.1. At the moment I am exploring various facets of that to
leverage building up the prefix distribution scheme.
--
Dave Täht
worldwide bufferbloat report:
http://www.dslreports.com/speedtest/results/bufferbloat
And:
What will it take to vastly improve wifi for everyone?
https://plus.google.com/u/0/explore/makewififast
We tested the extender node / home node auto-configuration and verified
that it works for extending the open network (peoplesopen.net ssid) but not
the adhoc/mesh network. The adhoc is more difficult since adhoc networks
cannot be bridged.
Max got the extender node firmware working on a Nanobridge, worked on
porting the TP-Link home-node configuration over to a WD My Net, and found
a couple of notdhcp bugs.
I worked on getting the adhoc extending to work. It's not quite there yet
but it seems like a combination of relayd and igmpproxy should simulate a
bridge adequately (rough config sketch below). This adds a bit of complexity
but it doesn't seem like
there's any way around it. This pseudo-bridge setup is the last crucial
part of the whole notdhcp / extender node adventure. I am hopeful that we
can finish it this Thursday. A fallback solution is to just run babeld on
the extender nodes.
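For the relayd + igmpproxy combination, the config would look roughly like
this (interface/network names are placeholders and none of it is tested):
```
# /etc/config/network -- relayd layer 3 pseudo-bridge between two networks
config interface 'mesh_relay'
        option proto   'relay'
        option network 'mesh_eth mesh_wlan'

# /etc/igmpproxy.conf -- forward multicast between the same two interfaces
phyint eth0.10 upstream ratelimit 0 threshold 1
phyint adhoc0 downstream ratelimit 0 threshold 1
```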
--
marc/juul
Chris J worked on a circuit for the garden node that will prevent it from
booting when the voltage is below a certain minimum.
I worked on extender node hook scripts and added features to makenode and
notdhcp for the extender nodes. My laptop doesn't have 5 GHz support so I
couldn't actually connect to the extender node and test, but it really
seems like everything is getting auto-configured correctly.
I expect that we'll need to tweak the meshrouting script and babel
configuration slightly to get everything playing nicely together.
Max: It'd be great to sit down and test and tweak this some day/evening
soon?
Next steps:
* Testing and tweaking current setup
* Ensure that it's also working for the Nanostation, Nanobridge and WD My Net
Then we could call what we have 0.2 beta, deploy a bunch of nodes and test?
Btw, the TP-Links for the Sunday nodes have arrived. For Sunday we need two
things:
* Hammers
* These things:
https://encrypted-tbn3.gstatic.com/shopping?q=tbn:ANd9GcRp3RcfPtQYgNMnkCkQv…
Actually it'd be good to have some black and some white cable clips. We can
do a home depot run on Sunday if nothing else.
--
marc/juul
Hello mesh adventurers!
I recently moved into a house that has multiple 100 - 125ft Redwood trees
in the back yard located in the heart of Downtown Berkeley.
I'd like to do something akin to:
https://www.youtube.com/watch?v=wutvDSTfEk4
Anyone interested in helping me get a couple nodes setup?
snakewrangler:
* Tested the sudowrt rebuild script and it seemed to work
* Sketched a design for ubus forwarder and started implementing
chris:
* Started working on voltage level cutoff circuit for the ESP8266 garden
nodes
* Probably other stuff that I was too distracted to keep up with
juul:
* Got notdhcp hook script close to final state
I ran into a few issues that need resolution:
* notdhcp needs to send the VLAN ID (small change)
* notdhcp has a gaping security hole due to lacking validation (easy to
fix though)
* It is not possible to bridge an adhoc wifi interface...
What?! No really. It's true: No bridging of adhoc on Linux. I can't help
but think that this would have become a serious problem if we were still
using batman-adv
As it is, we have the option of doing a "layer 3 bridge". But how? I've
never actually done this before. Here's what I ended up with. Haven't
tested it yet though:
# stuff in mesh_wlan routing table goes out to eth0.10
ip route add default dev eth0.10 table mesh_wlan
# stuff coming in from adhoc0 goes to the mesh_wlan routing table
ip rule add iif adhoc0 lookup mesh_wlan
# stuff in mesh_eth routing table goes out to adhoc0
ip route add default dev adhoc0 table mesh_eth
# stuff coming in from eth0.10 goes to the mesh_eth routing table
ip rule add iif eth0.10 lookup mesh_eth
eth0.10 is the ethernet VLAN that carries the adhoc/babel network
adhoc0 is the wireless adhoc network
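One detail the above assumes: the custom routing tables have to be declared
before ip will accept them by name, e.g. (table numbers arbitrary):
```
echo "100 mesh_wlan" >> /etc/iproute2/rt_tables
echo "101 mesh_eth" >> /etc/iproute2/rt_tables
```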
Is this crazy? It really does need to just forward everything coming in on
one interface out another interface and vice-versa. Since we can't bridge,
and we don't really need to do this below layer 3, this was the best I
could come up with. Thoughts? Suggestions? Facepalms?
Assuming some solution for an adhoc bridging alternative works out in the
next few days I should be able to finalize this by Tuesday and switch to
working on the new web UI.
--
marc/juul
Oh look: It's another very long email from juul!
I've been working on the config for the N750 + antenna-node configuration
and here are my thoughts so far.
We should _not_ let the LAN ports be one big ethernet interface for use
both as a wired version of open0 _and_ a way to connect antenna-nodes. We
never want to bridge two interfaces with attached antenna-nodes but
treating the multiple LAN ethernet interfaces as one interface is
effectively the same as bridging.
Example scenario: You have two nanobridges on your roof linking you to two
parts of the mesh. They are both plugged into LAN ports on your N750, and
since the LAN ports are treated as one interface, the two nanobridges
are able to communicate on layer 2. The nodes at the houses to which you
are connecting with your nanobridges have the same setup. Now we
effectively have the spanning tree protocol as our mesh protocol instead of
babel.
It was Alex who pointed this out when we were talking about bridging but
for some reason I hadn't connected the dots and recognized that the same is
true when telling the built-in switch to treat the interfaces as one.
This means that each physical port on the N750 that we wish to connect to
an antenna-node must be its own interface and should have a /32 netmask.
These interfaces can still have the same node IP as open0, etc, without any
problems.
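As a sketch, splitting a single port out into its own /32 interface would
look roughly like this in /etc/config/network (switch name and port numbers
are device-specific and this is untested):
```
config switch_vlan
        option device 'switch0'
        option vlan   '3'
        option ports  '2 0t'   # physical LAN port 2 plus the CPU port (tagged)

config interface 'antenna1'
        option ifname  'eth0.3'
        option proto   'static'
        option ipaddr  '<node IP, same as open0>'
        option netmask '255.255.255.255'
```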
I suggest we allocate two of the four ports for this purpose until we can
make something that intelligently changes the config when an antenna-node
has been connected.
We now have two remaining LAN ports that can act as a single interface. We
could then bridge this LAN interface to open0. However, we want to avoid
channel interference when sending from/to open0 but there is no reason we
should try to avoid channel interference for traffic coming from the wired
interfaces. If we bridge them we cannot treat them differently, so we
should not bridge.
It is not clear if babel channel diversity also takes into account channel
information for manually published routes such as the open0 route we
currently publish with this rule:
redistribute if open0 metric 128
I'll have to look at the source code to check. If the functionality is not
there then it should be fairly easy to add.
Aaaanyway: Since we don't want to bridge open0 and LAN, this complicates
things because we then need two subnets per node (otherwise we have the
same /26 on both LAN and open0, which is not going to work). Less than a
/26 on either interface gets iffy, so it seems like we'll need to have two
/26 subnets per node now.
One last but important thing I realized: If the antenna-nodes have their
wifi and ethernet interfaces bridged, then we will have problems. Imagine
the following setup:
(N750 A) ------ (nanobridge A) ~~~~ (nanobridge B) ----- (N750 B)
Where "-----" means ethernet and "~~~~" means wifi.
If both nanobridge A and nanobridge B are simply bridging their ethernet
and wifi interfaces, then we have the following problems:
1. If e.g. nanobridge A sends a DHCP request, then it will be received by
both N750 A and N750 B and they will have no reliable way of knowing whether
the request was sent by "their" nanobridge or the remote nanobridge.
2. When managing e.g. nanobridge A via the N750 A web admin interface it
will be impossible for nanobridge A to know whether it should grant admin
access to N750 A or N750 B.
I can think of only two solutions to this:
1. Pre-configure all antenna-nodes with static IPs and knowledge of which
N750 node is their parent.
2. Don't bridge the ethernet and wifi interfaces on antenna-nodes and
instead run babel on them.
I like solution 2 much better since it's easier for both us and node
operators. Not sure if we'll see a performance hit if we don't bridge. We
have a few nanobridges though so we could easily test this.
What do y'all say?
PS: Obviously babel channel diversity doesn't apply to antenna nodes since
babel sees them as ethernet interfaces but since they are mostly
directional and far away from the N750 it is fine to treat them as having
no interference, which is the default for ethernet interfaces anyway.
PPS: Did you know that the default STP delay when adding an interface to a
bridge is apparently 30 seconds? Not sure if OpenWRT has a different
default or maybe uses RSTP (rapid spanning tree protocol) to deal with
this, but if not then it is something to be aware of in order to not go
insane when troubleshooting. From:
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge#Doe…
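If it does bite, the forwarding delay is easy to check and lower with
bridge-utils, e.g.:
```
brctl showstp br-lan | grep "forward delay"
brctl setfd br-lan 2   # seconds
```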
--
marc/juul