Mesh/Firmware
Documentation for the sudo mesh firmware.
ToDo
Stuff we _need_ before beta launch:
- The production exit nodes intermittently stop working.
- Until we figure out why, the script /root/scripts/check.sh is executed by cron every minute, and if no nodes show up with "batctl o" the server reboots. It also saves the syslog in /root/LAST_SYSLOG
- Get watchdog working on ar71xx (?)
- Better interface and security for node database
- Make analog hardware watchdog for old chipset and solder to devices.
- Set up a remote monitoring solution
- Run a node with serial monitoring to see if we can learn why it crashes.
- Detect ethernet being plugged/unplugged and send DHCP requests accordingly.
- Deal with situations where internet is not shared or not present. (juul)
- Deal with situations where the node's eth0 is plugged into a 10.0.0.0/8 network (juul)
- Implement remote password reset (via h.sudomesh.org)
- Implement error reporting in web admin interface
- Implement a remote logging/monitoring solution (how far along is nodewatcher 2?)
- Maybe with a script that checks for issues and reports back, e.g. "Ethernet is connected and DHCP server is present but DHCP server does not want to give me (the node) an IP. DHCP server probably has MAC address filtering enabled."
- Select a wifi channel, both 2.4 GHz and 5 GHz.
- Test if batman Internet gateway selection is working correctly (juul)
- Currently DHCP hands out exit node IP as gateway, which means that the bandwidth specified in the batman gw_mode will be taken into account for DHCP server selection but not for gateway selection. Gateway selection will just happen based on normal batman-adv routing.
- DHCP servers should maybe hand out node IP as gateway, but since both are on same subnet this will cause ICMP redirect messages to be generated.
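The interim exit-node check from the top of this list (reboot when "batctl o" shows no nodes) could be sketched roughly as below. This is not the actual check.sh. It assumes that every originator line in the "batctl o" output starts with a MAC address, and the function names are made up for illustration.

```python
import re

# Assumption: a line describing an originator starts with a MAC address,
# optionally preceded by whitespace or a "*" marker. Header lines and
# "no nodes in range" messages do not match this pattern.
MAC_LINE = re.compile(r"^\s*\*?\s*([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})\b")


def count_originators(batctl_output: str) -> int:
    """Count the lines of `batctl o` output that begin with a MAC address."""
    return sum(1 for line in batctl_output.splitlines() if MAC_LINE.match(line))


def should_reboot(batctl_output: str) -> bool:
    """Mirror the check.sh rule: reboot when no mesh nodes are visible."""
    return count_originators(batctl_output) == 0
```

A cron job would feed the captured output of "batctl o" into should_reboot(), copy the syslog to /root/LAST_SYSLOG and reboot only when it returns True.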
Issues for later versions:
- IPv6 support (possibly switching to IPv6 entirely)
- Figure out how to legally use lower 5 GHz frequencies
- Figure out a solution for MAC address anonymizing.
- Set up OpenVPN on exit node.
- Implement statistics for web admin interface.
- Manual (or automatic?) speed and output power selection via web interface.
- Find better solution for batman/tunneldigger related MTU issues (Juul).
- Currently setting client MTU with DHCP (does not work on e.g. Win 7) and using TCP MSS clamping on exit node (obviously only works for TCP).
- Support TDMA on Linux (Adri is working on FreeBSD support, maybe we can port).
- Make it possible to contribute to html/css without installing a dev environment (e.g. for luci)
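For the MTU item above, the arithmetic behind the two current workarounds (client MTU via DHCP, TCP MSS clamping on the exit node) can be sketched as follows. The 100-byte overhead used in the usage note is a placeholder, not a measured value for our batman-adv plus tunneldigger setup, and the function names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def inner_mtu(link_mtu: int, encapsulation_overhead: int) -> int:
    """MTU left for client packets once mesh/tunnel headers are added."""
    if encapsulation_overhead < 0 or encapsulation_overhead >= link_mtu:
        raise ValueError("overhead must be between 0 and link_mtu - 1")
    return link_mtu - encapsulation_overhead


def clamped_mss(mtu: int) -> int:
    """Largest TCP payload that fits in a single IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - TCP_HEADER
```

With an assumed 1500-byte uplink and 100 bytes of encapsulation overhead, inner_mtu(1500, 100) gives the MTU to hand out via DHCP and clamped_mss() of that value gives the MSS to clamp to on the exit node.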
Firmware generation features
It should be easy to generate a new firmware with the following custom config:
- Location and ownership information.
- Contact info should be saved in a secure database but maybe not on the node itself?
- Randomly generated passwords set for wpa2, admin interface and ssh.
- The SSH password should be stored securely and a couple of stickers with the wpa2 and admin password should be printed for the user.
- Web interface
- ssh key generation
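A minimal sketch of the random password generation listed above, assuming a 16-character alphanumeric format. The length, the alphabet and the function names are illustrative choices and are not taken from the node configurator.

```python
import secrets
import string

# Letters and digits only, so the passwords are easy to print on a sticker.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 16) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if not 8 <= length <= 63:
        # WPA2 passphrases must be 8 to 63 characters long.
        raise ValueError("length must be between 8 and 63")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_node_credentials() -> dict:
    """Generate independent passwords for wpa2, the admin interface and ssh."""
    return {name: generate_password() for name in ("wpa2", "admin", "ssh")}
```

The wpa2 and admin values would go on the printed stickers, while the ssh value would only be stored in the secure database.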
Freifunk Meshkit is pretty neat!
We'll be dividing the image generation and node configuration aspects into two parts.
Sudomesh Firmware Image Builder Github Repo has our image builder and
Sudomesh Node Configurator Github Repo is our node configurator.
Sudomesh OpenWrt Packages has all of the sudomesh openwrt packages that we're using/we've written.
We flash nodes with the sudomesh image and then we use the node configurator to set them up with networking configs, ssh keys, etc. We also use the node-configurator to write pertinent info to a database.
Status:
Pretty much finished! We're testing the last few issues!
Stuff the firmware should have
Ranked from most to least important
InternetIsDownRedirect
When the node doesn't have internet access, it will redirect traffic to our mesh hosted Splash Page.
We need something hosted on the node that can check if it has access to the internet. There's a bit of an issue where certain OSes won't connect to APs that don't have internet access. Juul will look into building a hack that properly manages these requests and redirects them to our node-hosted site.
InternetIsDownRedirect may also have to fake the expected captive portal detection responses? We need to figure out if android/iOS/Mac/Windows will connect to a wifi that does not have internet access.
Status: Implemented except for OS-specific captive portal requests.
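Recognizing OS-specific captive portal requests could start from a lookup like the one below. The probe hosts and paths are commonly reported ones and are an assumption here. They differ between OS versions and should be verified against real client traffic before being relied on.

```python
from urllib.parse import urlsplit

# Assumed probe endpoints (verify against captured traffic before use).
KNOWN_PROBES = {
    ("captive.apple.com", "/hotspot-detect.html"): "apple",
    ("connectivitycheck.gstatic.com", "/generate_204"): "android",
    ("www.msftncsi.com", "/ncsi.txt"): "windows",
}


def classify_probe(url: str):
    """Return the OS family for a known captive portal probe URL, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return KNOWN_PROBES.get((host, parts.path))
```

Requests that classify as a known probe could be handled specially, and everything else would go through the normal InternetIsDownRedirect path.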
Splash page
We can capture OS specific probes in order to specifically redirect captive portal requests without affecting any other network traffic.
Features:
- Brief info on the mesh
- Link to our website?
Status:
maxb has pretty much finished implementing this. We still need a services list page.
SSH server
The SSH server should be reachable from any interface. It should initially allow root access using a randomly generated password that the mesh group has and that the node owner can get and change if they are so inclined.
Status: Implemented. Mostly openwrt stock but we've added keygen features for the node-configurator
Mesh Protocol
BATMAN-adv was the protocol that we had assumed we'd be able to use. Unfortunately, it looks as though it won't support tunneling over the internet, which is one of the primary features we had been hoping to implement. See our mailing list convo.
We're now looking into bmx6 and babel. We're currently experimenting with both of these protocols to better understand their appropriateness for our project.
Status: In progress
Multiple virtual network interfaces with their own SSIDs
- One ad-hoc mode, unencrypted interface for the mesh nodes, e.g. sudomesh-backchannel
- One access point mode, unencrypted interface, for non-mesh devices to connect to the mesh, e.g. sudomesh.
- One access point mode, private interface with WPA2, for the people who own the nodes. [optional]
Traffic on the private interface should be completely separated from traffic on the non-private interfaces unless a client connected to the private interface requests an IP on the mesh.
Maybe the last one is optional because some people may not need that feature (they already have another access point and they want to keep it), but then how do people administer the router?
In order to serve a secure web admin config to home users, we'll probably always serve 3 APs with one private WPA encrypted home link so that users can access their admin page.
Status: Implemented
Web admin interface
Development information should be put in Web Admin Dev. This section can remain a wish-list.
A very simple one-page interface. It should do at least the following:
- Display some set of user statistics
- Ideally we could list/graph the number of people who have associated with your mesh node.
- We could also just list/graph the up/down data of people who have been using the mesh.
- LightSquid (used by pfSense)
- Set location, name, description.
- But do you want to know the location centrally as well so that you can display nodes on the map? Will people enter this information twice or will you pull this information from nodes and then display it on the map? Same for name and description. I would suggest that information is stored only once, in your case on the node itself. Then you can probably pull this information through nodewatcher scripts on the nodes and display the nodes on the map. It really should not require people to enter or maintain information in two places because it desyncs very fast. Mitar (talk) 22:20, 24 July 2013 (PDT)
- Let people select how much bandwidth they share.
- They always share 100% when they're not using the connection themselves.
- This works if people are using their private SSIDs on the node. But if the node is connected to their existing home network you might not easily configure such sharing. But maybe there is a way to detect that the host network is free and that limits can be increased. Mitar (talk) 22:20, 24 July 2013 (PDT)
- Do any ISPs have bandwidth caps around here? If so, let people specify how many MB to share per month.
- Maybe also a button to temporarily tighten the limits (make them more restrictive), which are then automatically restored after some time.
- Let people change the admin password and the private wifi wpa2 password.
- Probably private SSID as well.
- Donate / "buy routers as presents for your friends"-button.
- One idea we had (but this is probably better for splash screen) is "adopt a node". Where a neighbor who uses a node a lot and depends on the node can donate some money to keep it up, but can then give a nickname or avatar to the node. Or something. Mitar (talk) 22:20, 24 July 2013 (PDT)
Status: Maxb is implementing. Pretty much finished. Still needs graphs, etc. but has most of the other functionality including bandwidth shaping controls.
Source here:
https://github.com/sudomesh/luci-app-peopleswifi
Remote Updates
Once we deploy nodes, we'll want a way to update them as appropriate. The already built node configurator operates along similar lines, but we'd need to do some tweaking in order to make it work on the mesh. Also, we'd want to give the users the options to turn remote updates off. A somewhat decentralized system would be nice as well.
Watchdog script
The node tests itself to see if it has connectivity, etc., and resets itself if necessary. OpenWrt supports the hardware watchdog on our PicoStations without any additional hacking, yay!
By default the hardware watchdog will automatically hard-reset the AP if /dev/watchdog is not written to at least once every 60 seconds. A Lua library has been written to interface with the batman-adv kernel module through the batctl command line utility. We need to identify a list of conditions that require a hard-reset and work them into the Lua watchdog script in the openwrt-firmware repository.
The Freifunk group has an awesome watchdog setup, details: http://wiki.freifunk.net/Kamikaze/LuCI/Watchdog
List of possible reset conditions: high sustained load, cron goes down, sshd goes down.
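The reset conditions listed above could be combined into a single "is it safe to feed the watchdog" decision. The sketch below is in Python for illustration only (the real script is planned in Lua), and the load threshold and all names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold; what counts as "high sustained load" is undecided.
MAX_SUSTAINED_LOAD = 4.0


@dataclass
class NodeHealth:
    load_average: float  # a long-window load average, e.g. 15 minutes
    cron_running: bool
    sshd_running: bool


def should_feed_watchdog(health: NodeHealth,
                         max_load: float = MAX_SUSTAINED_LOAD) -> bool:
    """Return True only if none of the listed reset conditions apply."""
    return (health.cron_running
            and health.sshd_running
            and health.load_average <= max_load)
```

The caller would write to /dev/watchdog well within the 60 second window only while this returns True. Once it stops writing, the hardware watchdog performs the hard reset on its own.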
Potential use of Quilt to update nodes.
Quality of Service (QoS)
We want those who own a node to decide how much internet they share with the network. This software allows users to shape their bandwidth based on traffic type. There's a paper regarding layer 7 traffic shaping too.
Our supported hardware needs very lightweight software, which is why we've been using tc (traffic control). It only allows users to determine how much internet they share with the network.
Complete Distributions
These have firewall and network management tools included with the distribution.
- pfSense - a widely used firewall distribution, but there are most definitely difficulties with it.
- Zentyal - a firewall distribution with easy to use graphical interface.
- m0n0wall - a lightweight firewall distribution meant for embedded systems.
Packages
These are tools often used in network management distributions.
- netfilter/iptables - a set of hooks inside the Linux kernel that allows kernel modules to register callback functions with the network stack.
- iproute2 - a collection of utilities for controlling TCP / IP networking and traffic control.
- l7-filter (p2p filtering) - identifies packets based on application layer data. It classifies packets to be used with a bandwidth shaper.
- ipp2p (p2p filtering) - identifies peer-to-peer (P2P) data in IP traffic.
- Suricata - a high performance network intrusion detection system (IDS), intrusion prevention system (IPS), and network security monitoring engine.
- ipfirewall (ipfw) - a FreeBSD firewall that uses dummynet.
- dummynet - a FreeBSD traffic shaper and bandwidth manager.
- ipfw-classifyd - an application layer classifier for the ipfw firewall on FreeBSD.
Virtual Private Network (VPN)
The firmware should tunnel all internet traffic from the mesh through a VPN server, unless this feature is specifically disabled. This should not be a single server, as that would be a single point of failure.
- TunnelDigger - a lightweight tunneling client/server.
- OpenMesher - another option, but not ideal because of memory constraints on embedded systems.
Here is our Network Topology.
Mesh VPN
If the mesh does not see any other nodes (and maybe even if it does?), and it has internet, then it should connect to another node or two over VPN. The easy solution is to use the same VPN servers as for the internet.
Status: Implemented
DHCP and batman-adv gateway mode
Nodes with an internet connection should run DHCP and batman-adv gateway mode. We want to detect whether the node can connect to a relay. If it can, it should configure itself as a batman-adv gateway server. Otherwise it should configure itself as a batman-adv gateway client.
Status: Implemented
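The gateway decision described in this section boils down to something like the sketch below. It assumes relay reachability has already been tested elsewhere, and that only gateway servers run the DHCP server, which is one reading of the text above. The function name is made up.

```python
def gateway_config(can_reach_relay: bool) -> dict:
    """Pick the batman-adv gateway mode and DHCP role for a node."""
    mode = "server" if can_reach_relay else "client"
    return {
        # Arguments for the batctl gw_mode setting (off, client or server).
        "batctl_args": ["batctl", "gw_mode", mode],
        # Assumption: only nodes acting as gateway servers hand out leases.
        "run_dhcp_server": can_reach_relay,
    }
```

For example, gateway_config(True) selects server mode and enables the DHCP server, while gateway_config(False) selects client mode and leaves it off.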
Location and status reporting
Something that reports location and status when polled.
We developed this format to make it easy to publish status data from nodes for our nodewatcher. OpenWrt packages are here. Mitar (talk) 22:02, 11 July 2013 (PDT)
Nice to have:
- Status info: How many nodes is your node connected to? Is the internet link working?
- An "I don't know what my internet bandwidth is, test it for me"-function.
- Usage statistics (so people can see how many people they helped get internet!)
- This is the most important thing! Mitar (talk) 22:20, 24 July 2013 (PDT)
- You should also add graphs of how much bandwidth was consumed by the node. This is useful when hosts see that their Internet is slow and believe that it is because of the node. Then they can check and see if it is really the node (which it often is not) or maybe just the ISP having problems. Important because people like to attribute issues they have to nodes they don't understand. Mitar (talk) 22:20, 24 July 2013 (PDT)
- Let people put up a bit of info about their node / house / co-op, on a simple web page that people can access only if they're connected to that node. It could be shown as part of the splash page.
Status: Waiting for nodewatcher project to finish
Intelligent Wifi Channel Switching
It would be nice if the network could intelligently determine which wifi channels to use.
IPv6 support
We should have IPv6 support, but I am ok with launching the mesh with only IPv4 and adding in IPv6 later. (Juul (talk))
Stuff the firmware could have
DNS server
Each node could run its own (caching) DNS server.
For now, if you're logged into the private network on a node, going to http://my.node will take you to the web admin interface
Status:
Implemented web admin URL, but no caching DNS server yet.
RSSI Testing and Logging
At intervals, the nodes could conduct RSSI tests and log them with some way to compare and visualize signal strengths over time.
Caching web proxy
We could use Polipo to improve people's browsing experience. Not sure how much CPU and memory this would need. We may not be able to run it on the routers with less than 32 MB RAM (e.g. the Bullet 2 HPs).
Block ads and tracking
We could use e.g. Polipo with the sources from both adblock plus and ghostery. If we implement this, it should be an optional (default off) feature that you can select on the splash page, with a "remember this" that remembers either using a cookie or using your MAC (but then we'd be logging people's MAC addresses :-S). The block should probably be time-limited (e.g. 30 days).
Compatible devices
We should have ready-made images for:
- One really cheap indoor router (with 3G usb stick support?) like TP-Link TL-WR703N
- One nice high-speed indoor router (300 Mbps 802.11n)
- Ubiquiti hardware. Most of the AirMAX stuff.