mesh-dev June 2015

mesh-dev@sudoroom.org

7 participants
12 discussions

by Marc Juul

At Omni we're using TP-Link dual-band home routers (N750) and modifying them for PoE. These routers have gigabit ethernet. Unfortunately, gigabit ethernet and PoE at the same time requires special ethernet transformers, and these routers don't have that type of transformer. This means that they drop to 100 Mbit/s when we modify them for PoE (but only on the port we modify).

I was looking into how to get around this problem. I thought maybe there was a way to do gigabit half-duplex on two or three pairs, but it seems like that's not possible.

BUT! It looks like TP-Link makes a PoE splitter which can apparently handle gigabit: the TL-POE10R.

Here it is for $12: http://www.ebay.com/itm/TP-LINK-TL-POE10R-Gigabit-PoE-Splitter-Adapter-IEEE…

I ordered one so we can try it out. If it works we should just do that to all of the routers. No reason not to have gigabit if it's only $12 extra per router. It's still only $77 total for router + PoE splitter.

Oh and btw, someone is selling dirt cheap Ubiquiti Unifi UAP devices: http://www.ebay.com/itm/New-Ubiquiti-UniFi-UAP-802-11n-300-Mbps-Wireless-Ac…

Caution: They are 2.4 GHz only. I ordered one for the mesh, just so we have it to test our firmware on.

-- marc/juul

11 years

Hacknight progress

by Marc Juul

Here's what went down.

# Added functionality to our babeld fork

  -x for dynamically removing interfaces (we only had -a to add them)
  -F to enable the dynamic functionality (fungible mode)
  -i to print the "kill -USR1" information for the running babeld

babeld no longer requires any interfaces to be specified when initially started if fungible mode is enabled

# Switched our firmwares to using our babeld fork

Here it is: https://github.com/sudomesh/sudowrt-packages/tree/master/net/babeld-sudowrt

We were only using them on the VPuN (exit) server before. I haven't tried to recompile the firmware with this package added. Maybe someone else can test that this compiles correctly?

max: be aware that VPuN servers will now have to start new versions with -F to get the dynamic functionality

# Completed extender-node functionality

Everything now works as expected with babeld running on the extender nodes. The extender nodes come up automatically and both the open and adhoc networks work.

Due to feedback by Dave Taht I abandoned adding avahi-daemon as a reflector on the extender nodes and pushed forwarding of mDNS traffic to the milestone for a future release.

The one thing left to do for the extender nodes is to re-compile both firmwares from scratch, flash two nodes and test that it all comes up as expected. I've tried hard to bring the repositories in line with the working configuration on my two test nodes, but I may have missed something.

# Added milestones and issues on github

Milestones: https://github.com/sudomesh/sudowrt-firmware/milestones

Issues for upcoming version 0.2: https://github.com/sudomesh/sudowrt-firmware/milestones/0.2%20-%20initial%2…

Please add any issues I may have missed. Also, please change things if you disagree :) I just did what I thought made sense but I'm not married to anything.

Yay progress!

-- marc/juul

11 years

Multicast forwarding for layer 3 pseudo-bridges

by Marc Juul

# The problem

Our extender-nodes can't bridge their adhoc wifi interface to their ethernet interface since bridging adhoc is not possible. A layer 3 pseudo-bridge using relayd should work but both Babel and mDNS rely on multicast which is not handled by relayd.

# Solutions

It looks like we have two options:

1. Forward both babel and mDNS IPv4 and IPv6 multicast messages between interfaces selectively based on the multicast addresses of these protocols
2. Have extender-nodes forward / route multicast traffic in general (pimd/pim6sd).

Running babeld is a solution, but not ideal since it makes Babel deal with extender-nodes as real nodes, adding more signalling traffic to the network which should be unnecessary and complicates our network graphs.

## Selective multicast routing/forwarding

mDNS has existing solutions available, such as:

* avahi with enable-reflector option (kinda bulky)
* mdns-repeater (no openwrt package exists)

For Babel we are not so lucky. Stripping babeld down so it is only a dumb forwarder for Babel multicast traffic and then running that on the extender-nodes might be a way to go.

Another option is to use a general-purpose IPv4/IPv6 multicast reflector/forwarder that can be configured to relay or route traffic for a set of multicast addresses and ports.

RFC 6621 which is implemented in nrlsmf is something like that but:

* It is a userland process that sniffs packets in promiscuous mode (shit performance)
* It forwards _all_ multicast traffic (which would be fine for our use case if it wasn't doing this in userland)
* It has no specified license though source is available

mcproxy (similar to igmpproxy but more modern) is not too far from what we need. It seems likely that mcproxy could be fairly easily modified for selective forwarding.

mcproxy is nice because it doesn't intercept data in userland in order to forward (it uses the kernel multicast routing features). Currently it listens for multicast subscriptions from the downstream network and then begins forwarding multicast traffic on the subscribed IP to the downstream interface.

What we need is a simpler daemon that doesn't listen for subscriptions but instead forwards in both directions based on the following configuration file options:

* IP version / Protocol (IGMPv2, IGMPv3, MLDv1, etc)
* Multicast IP to forward for (e.g. 224.0.0.111)
* Interfaces to forward between

So you could tell it, e.g: "forward all 224.0.0.111 and ff02::1:6 traffic between eth0.1 and adhoc0" (babel traffic) or: "forward all 224.0.0.251 and ff02::fb traffic between eth0.1 and adhoc0" (mDNS traffic)

## General-purpose multicast routing/forwarding

Instead of selectively forwarding we could run real multicast routing daemons and simply configure them to forward all multicast traffic. The two standard packages for this seem to be pimd for IPv4 and pim6sd for IPv6. Attitude Adjustment has a pim6sd package but seems to be missing a package for the normal pimd. Even worse, it looks like Chaos Calmer drops the pim6sd package as well (I really wish there was a central place where one could read about all OpenWRT dropped/added packages and reasons for doing so).

We could try to run these two daemons, but we'd have to make a package for pimd and figure out why they're dropping pim6sd. Doesn't seem like the worst chore.

If we wanted to avoid pimd, then we could disable mDNS on IPv4, which shouldn't present a problem since all mesh-connected devices will have IPv6 addresses anyway, but it's probably unlikely that babeld can function on both IPv4 and IPv6 without IPv4 multicast (unless we change a bunch of code). Totally worth checking if it already has that ability though, since avoiding IPv4 multicast would be a bonus.

# Tricks

If you run "mcproxy -c" it will check if all relevant kernel config options are enabled for IPv4 and IPv6 multicast routing. It looks like default Attitude Adjustment has all but one feature enabled. Here's the output:

```
# mcproxy -c
Check the currently available kernel features.
 - root privileges: Ok!
 - ipv4 multicast: Ok!
 - ipv4 multiple routing tables: Ok!
 - ipv4 routing tables: Ok!
 - ipv4 mcproxy was able to join 40+ groups successfully (no limit found)
 - ipv4 mcproxy was able to set 40+ filters successfully (no limit found)
 - ipv6 multicast: Ok!
ERROR: failed to set kernel table! Error: Protocol not available errno: 99
 - ipv6 multiple routing tables: Failed!
 - ipv6 routing tables: Ok!
 - ipv6 mcproxy was able to join 40+ groups successfully (no limit found)
 - ipv6 mcproxy was able to set 40+ filters successfully (no limit found)
```

It's unclear if we need support for multiple IPv6 routing tables (probably not), but it's probably trivial to enable the kernel option if we do.

# Conclusion

I propose that we run babeld on the extender nodes and use avahi as a reflector for now. We can then revisit this for e.g. version 0.4.

What do you all say?

# Ponderings

Based on this preliminary research it looks like the long-term solution involving the smallest amount of work is probably running pim6sd and pimd. This is gratifying since it would be really cool to have a mesh that does real actual multicast routing. I'm especially excited about this since the film-maker collective at Omni (Optik Allusions) seems to be growing and because I have been working hard on video streaming solutions for Omni. It would be really cool if we could multicast event video over the mesh and to other local radical spaces. E.g. having places like the Long Haul or other hackerspaces function as overflow seating/participation for workshops/lectures and vice-versa, or streaming performances into local pubs instead of whatever corrupt pro sports they normally offer!

-- marc/juul
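The "simpler daemon" proposed under "Selective multicast routing/forwarding" boils down to a small rule table: a set of multicast groups plus a set of interfaces to forward between. Here is a minimal Python sketch of just that decision logic. The names `ForwardRule`, `make_rule` and `egress_interfaces` are made up for illustration and are not part of mcproxy or any existing package; a real daemon would install kernel multicast routes rather than decide per packet in userland.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ForwardRule:
    """Hypothetical rule: forward these multicast groups between these interfaces."""
    groups: frozenset      # multicast group addresses (IPv4 and/or IPv6)
    interfaces: frozenset  # names of the interfaces to forward between


def make_rule(groups, interfaces):
    """Parse and validate the group addresses, then build a rule."""
    addrs = frozenset(ipaddress.ip_address(g) for g in groups)
    for addr in addrs:
        if not addr.is_multicast:
            raise ValueError(f"{addr} is not a multicast address")
    return ForwardRule(addrs, frozenset(interfaces))


# The two example rules quoted in the email.
RULES = [
    make_rule(["224.0.0.111", "ff02::1:6"], ["eth0.1", "adhoc0"]),  # Babel
    make_rule(["224.0.0.251", "ff02::fb"], ["eth0.1", "adhoc0"]),   # mDNS
]


def egress_interfaces(rules, in_iface, dst):
    """Return the interfaces a packet to `dst` arriving on `in_iface` is copied to."""
    dst = ipaddress.ip_address(dst)
    out = set()
    for rule in rules:
        if dst in rule.groups and in_iface in rule.interfaces:
            # Forward in both directions, but never back out the ingress interface.
            out |= rule.interfaces - {in_iface}
    return sorted(out)
```

With these rules, Babel traffic arriving on adhoc0 is copied to eth0.1 and vice versa, while any multicast group that is not listed is simply not forwarded, which is the "selective" part that nrlsmf lacks.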

11 years

a cautionary note on setting up new babel nodes

by Dave Taht

As I am rolling out a bunch of new babel nodes, I decided to get a cluster (2 nanos and a pico) up in the lab, where I have good connectivity to the rest of the network, to replace an aging cluster by the pool.

So I booted it up and configured it for the right channels and a new set of ip addresses... didn't have good LED support at all (RSSI does not seem to do anything)... I got blinkenlights to sort of work, and they were lit up, kind of solid, for some reason... [1]

...people started wandering by to complain about the network... naturally I didn't notice because I was even closer to the exit points than anyone else...

...to discover that I was offering the shortest path to the exit nodes, and thus had bypassed the two existing ~50mbit links in favor of lab links that were located indoors and going through a thousand+ meters of trees... that were barely doing a megabit with 800+ms of delay. (channel diversity not working did not help either)

After that experience, I decided that I would make the firmware for unconfigured nodes export a 512 metric, and use a high rxcost until they were fully configured AND in place.

I might disable ipv4 entirely in favor of the autoconfigured ula openwrt has, and just start configuring stuff based on the appearance of new ulas in the network.

[1] if you come up with a useful LED config for nanostations and picostations, let me know.

--
Dave Täht
worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat
And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
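To see why exporting a 512 metric from unconfigured nodes avoids the problem described in this note, here is a small Python sketch of lowest-metric route selection. It assumes a plain additive metric (advertised metric plus link cost), and every number except the 512 penalty is invented for illustration; none of them are measurements from this network.

```python
def best_route(candidates):
    """Pick the neighbour offering the lowest total metric to the exit node.

    Each candidate is (name, advertised_metric, link_cost); the total is
    their sum, which is how an additive routing metric composes.
    """
    return min(candidates, key=lambda c: c[1] + c[2])[0]


# Illustrative numbers only (assumptions, not real measurements).
existing_link = ("existing-50mbit-link", 256, 96)
new_node_default = ("new-lab-node", 0, 96)          # unconfigured node, no penalty
new_node_penalised = ("new-lab-node", 0 + 512, 96)  # same node exporting metric 512

print(best_route([existing_link, new_node_default]))    # new node wins: 96 < 352
print(best_route([existing_link, new_node_penalised]))  # existing link wins: 352 < 608
```

The new node looks attractive only because it sits close to the exit points; the penalty keeps it from being selected until someone has confirmed it is configured and in place.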

11 years

meshing/bridging and babel

by Dave Taht

I am not normally on this list, but for some reason a search turned it up this morning. I run a babel-based mesh network down in Los Gatos and perhaps I have an insight or two to help.

0) Bridging over wifi sucks, particularly when mdns or other multicast traffic is present. It takes very little multicast to mess up your whole day.

1) Distributing ipv4 addresses and prefixes is a pita. The new hnetd protocol intends to try and make that more automagic, but it has a desire to be god-like over everything and every daemon. Still, more eyeballs on it and perhaps it could be made to work. See: http://www.homewrt.org/doku.php?id=overview

DNS is a pita, also. - some hope for mdns-proxy...

1.5) I used to use AHCP for interconnecting mesh nodes, which worked quite well, particularly for ipv6. It did not distribute prefixes, however, which led to (the thus far abortive) hnetd.

2) Babel's diversity algorithm seems to be suffering from an old system call which only works in adhoc mode, not AP mode. It also contains some bridge disambiguation logic (but failing to get the channel right in that case does not help). It does sort of do the right thing, even not getting the right info, shipping "interfering" around... and it looks like short work to get it to get the right channel automatically given example code in the iw utility and olsrv2 - but I lack the time, personally, to do that right now. Relevant thread and pointers here: http://lists.alioth.debian.org/pipermail/babel-users/2015-June/002056.html

3) openwrt, at least, has moved away from dnsmasq being the dhcp server in favor of their own server implementation. In neither case does dhcp handle ptp /32s, so you need another address distribution mechanism for this anyway. Given the headaches in doing this automagically I lean towards static assignment for ipv4.

4) babeld-1.6.0 had a bug in source specific routing, now fixed in babeld-1.6.1. At the moment I am exploring various facets of that to leverage building up the prefix distribution scheme.

--
Dave Täht
worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat
And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast

11 years, 1 month

Hacknight progress

by Marc Juul

We tested the extender node / home node auto-configuration and verified that it works for extending the open network (peoplesopen.net ssid) but not the adhoc/mesh network. The adhoc is more difficult since adhoc networks cannot be bridged.

Max got the extender node firmware working on a Nanobridge, worked on porting the TP-Link home-node configuration over to a WD My Net and found a couple of notdhcp bugs.

I worked on getting the adhoc extending to work. It's not quite there yet but it seems like a combination of relayd and igmpproxy should simulate a bridge adequately. This adds a bit of complexity but it doesn't seem like there's any way around it.

This pseudo-bridge setup is the last crucial part of the whole notdhcp / extender node adventure. I am hopeful that we can finish it this Thursday. A fallback solution is to just run babeld on the extender nodes.

-- marc/juul

11 years, 1 month

Hacknight progress

by Marc Juul

Chris J worked on a circuit for the garden node that will prevent it from booting when the voltage is below a certain minimum.

I worked on extender node hook scripts and added features to makenode and notdhcp for the extender nodes. My laptop doesn't have 5 GHz support so I couldn't actually connect to the extender node and test, but it really seems like everything is getting auto-configured correctly.

I expect that we'll need to tweak the meshrouting script and babel configuration slightly to get everything playing nicely together. Max: It'd be great to sit down and test and tweak this some day/evening soon?

Next steps:

* Testing and tweaking current setup
* Ensure that it's also working for nanostation, nanobridge and My Net

Then we could call what we have 0.2 beta, deploy a bunch of nodes and test?

btw the TP-Links for the Sunday nodes have arrived. For Sunday we need two things:

* Hammers
* These things: https://encrypted-tbn3.gstatic.com/shopping?q=tbn:ANd9GcRp3RcfPtQYgNMnkCkQv…

Actually it'd be good to have some black and some white cable clips. We can do a Home Depot run on Sunday if nothing else.

-- marc/juul

11 years, 1 month

Let's climb a Redwood Tree!

by Mo Balaa

Hello mesh adventurers! I recently moved into a house that has multiple 100 - 125 ft Redwood trees in the back yard, located in the heart of Downtown Berkeley. I'd like to do something akin to: https://www.youtube.com/watch?v=wutvDSTfEk4 Anyone interested in helping me get a couple of nodes set up?

11 years, 1 month

Hacknight progress

by Marc Juul

snakewrangler:

* Tested the sudowrt rebuild script and it seemed to work
* Sketched a design for ubus forwarder and started implementing

chris:

* Started working on voltage level cutoff circuit for the ESP8266 garden nodes
* Probably other stuff that I was too distracted to keep up with

juul:

* Got notdhcp hook script close to final state

I ran into a few issues that need resolution:

* notdhcp needs to send the VLAN ID (small change)
* notdhcp has a gaping security hole due to lacking validation (easy to fix though)
* It is not possible to bridge an adhoc wifi interface...

What?! No really. It's true: No bridging of adhoc on Linux. I can't help but think that this would have become a serious problem if we were still using batman-adv.

As it is, we have the option of doing a "layer 3 bridge". But how? I've never actually done this before. Here's what I ended up with. Haven't tested it yet though:

  # stuff in mesh_wlan routing table goes out to eth0.10
  ip route add default dev eth0.10 table mesh_wlan

  # stuff coming in from adhoc0 goes to the mesh_wlan routing table
  ip rule add iif adhoc0 lookup mesh_wlan

  # stuff in mesh_eth routing table goes out to adhoc0
  ip route add default dev adhoc0 table mesh_eth

  # stuff coming in from eth0.10 goes to the mesh_eth routing table
  ip rule add iif eth0.10 lookup mesh_eth

eth0.10 is the ethernet VLAN that carries the adhoc/babel network
adhoc0 is the wireless adhoc network

Is this crazy? It really does need to just forward everything coming in on one interface out another interface and vice-versa. Since we can't bridge, and we don't really need to do this below layer 3, this was the best I could come up with.

Thoughts? Suggestions? Facepalms?

Assuming some solution for an adhoc bridging alternative works out in the next few days I should be able to finalize this by Tuesday and switch to working on the new web UI.

-- marc/juul
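The four commands in this email are symmetric: each interface gets its own routing table whose default route points out the other interface, plus a rule sending its incoming traffic to that table. A small Python sketch that generates them for any pair of interfaces makes the symmetry explicit. `pseudo_bridge_commands` is a hypothetical helper, not part of sudowrt, and like the original commands this has not been tested on a node. Named tables such as mesh_wlan also need entries in /etc/iproute2/rt_tables before `ip` will accept them.

```python
def pseudo_bridge_commands(iface_a, table_a, iface_b, table_b):
    """Build the ip(8) commands that send everything arriving on one
    interface out the other, and vice versa, using two routing tables."""
    return [
        # traffic looked up in table_a leaves via iface_b
        f"ip route add default dev {iface_b} table {table_a}",
        # traffic arriving on iface_a is looked up in table_a
        f"ip rule add iif {iface_a} lookup {table_a}",
        # traffic looked up in table_b leaves via iface_a
        f"ip route add default dev {iface_a} table {table_b}",
        # traffic arriving on iface_b is looked up in table_b
        f"ip rule add iif {iface_b} lookup {table_b}",
    ]


commands = pseudo_bridge_commands("adhoc0", "mesh_wlan", "eth0.10", "mesh_eth")
print("\n".join(commands))
```

Called with the interface and table names from the email, this prints the same four commands in the same order.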

11 years, 1 month

Thinking through LAN port usage, or: Why bridges are bad news

by Marc Juul

Oh look: It's another very long email from juul!

I've been working on the config for the N750 + antenna-node configuration and here are my thoughts so far.

We should _not_ let the LAN ports be one big ethernet interface for use both as a wired version of open0 _and_ a way to connect antenna-nodes. We never want to bridge two interfaces with attached antenna-nodes but treating the multiple LAN ethernet interfaces as one interface is effectively the same as bridging.

Example scenario: You have two nanobridges on your roof linking you to two parts of the mesh. They are both plugged into LAN ports on your N750 and since the LAN ports are treated as one interface then the two nanobridges are able to communicate on layer 2. The nodes at the houses to which you are connecting with your nanobridges have the same setup. Now we effectively have the spanning tree protocol as our mesh protocol instead of babel.

It was Alex who pointed this out when we were talking about bridging but for some reason I hadn't connected the dots and recognized that the same is true when telling the built-in switch to treat the interfaces as one.

This means that each physical port on the N750 that we wish to connect to an antenna-node must be its own interface and should have a /32 netmask. These interfaces can still have the same node IP as open0, etc, without any problems. I suggest we allocate two of the four ports for this purpose until we can make something that intelligently changes the config when an antenna-node has been connected.

We now have two remaining LAN ports that can act as a single interface. We could then bridge this LAN interface to open0. However, we want to avoid channel interference when sending from/to open0 but there is no reason we should try to avoid channel interference for traffic coming from the wired interfaces. If we bridge them we cannot treat them differently, so we should not bridge.

It is not clear if babel channel diversity also takes into account channel information for manually published routes such as the open0 route we currently publish with this rule:

  redistribute if open0 metric 128

I'll have to look at the source code to check. If the functionality is not there then it should be fairly easy to add.

Aaaanyway: Since we don't want to bridge open0 and LAN, this complicates things because we then need two subnets per node (otherwise we have the same /26 on both LAN and open0, which is not going to work). Less than a /26 on either interface gets iffy, so it seems like we'll need to have two /26 subnets per node now.

One last but important thing I realized: If the antenna-nodes have their wifi and ethernet interfaces bridged, then we will have problems. Imagine the following setup:

  (N750 A) ------ (nanobridge A) ~~~~ (nanobridge B) ----- (N750 B)

Where "-----" means ethernet and "~~~~" means wifi.

If both nanobridge A and nanobridge B are simply bridging their ethernet and wifi interfaces, then we have the following problems:

1. If e.g. nanobridge A sends a DHCP request, then it will be received by both N750 A and N750 B and they will have no reliable way of knowing if the request was sent by "their" nanobridge or the remote nanobridge.
2. When managing e.g. nanobridge A via the N750 A web admin interface it will be impossible for nanobridge A to know whether it should grant admin access to N750 A or N750 B.

I can think of only two solutions to this:

1. Pre-configure all antenna-nodes with static IPs and knowledge of which N750 node is their parent.
2. Don't bridge the ethernet and wifi interfaces on antenna-nodes and instead run babel on them.

I like solution 2 much better since it's easier for both us and node operators. Not sure if we'll see a performance hit if we don't bridge. We have a few nanobridges though so we could easily test this.

What do y'all say?

PS: Obviously babel channel diversity doesn't apply to antenna nodes since babel sees them as ethernet interfaces but since they are mostly directional and far away from the N750 it is fine to treat them as having no interference, which is the default for ethernet interfaces anyway.

PPS: Did you know that the default STP delay when adding an interface to a bridge is apparently 30 seconds? Not sure if OpenWRT has a different default or maybe uses RSTP (rapid spanning tree protocol) to deal with this, but if not then it is something to be aware of in order to not go insane when troubleshooting. From: http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge#Doe…

-- marc/juul
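The addressing consequence of not bridging open0 and LAN (two /26 subnets per node, plus /32 interfaces for the antenna-node ports) can be checked with a few lines of Python. This is only a sketch: the 10.0.0.0/25 block is a made-up placeholder since the thread does not give the real mesh prefix, and picking the first host address as the node IP is likewise an assumption.

```python
import ipaddress

# Hypothetical per-node allocation; the real mesh prefix is not given in this thread.
node_block = ipaddress.ip_network("10.0.0.0/25")

# Splitting a /25 yields exactly two /26 subnets: one for open0, one for the LAN ports.
open0_net, lan_net = node_block.subnets(new_prefix=26)


def usable_hosts(net):
    """Addresses left after removing the network and broadcast addresses."""
    return net.num_addresses - 2


# Each antenna-node port reuses the node IP with a /32 netmask (a single address).
# Taking the first host of the open0 subnet as the node IP is an assumption.
node_ip = ipaddress.ip_interface(f"{open0_net.network_address + 1}/32")

print(open0_net, lan_net, usable_hosts(open0_net), node_ip)
```

Each /26 leaves 62 usable addresses, and the two halves do not overlap, so open0 and LAN clients can be routed separately while the /32 antenna-node interfaces claim no extra address space.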

11 years, 1 month
