Mesh/Monitoring
Latest revision as of 13:36, 30 December 2015
See Also: Technical Documentation, Bandwidth Quotas
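This page is currently out of date. The monitor server now in use runs on a VPS that is tunneled into the network. It runs Cacti (https://monitor.sudomesh.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=45) and Smokeping (https://monitor.sudomesh.org/smokeping/smokeping.cgi?target=Mesh), and it also gathers logs.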
When I have the time, I'll update the details here, but I've documented the setup as provisioning code here: https://github.com/max-b/mesh-playbooks - User:Maxb
The Basics
See Also: Mesh/Icinga 2
The Hardware
Small frame Dell PC, service tag: 2FDSGC1, green tape with info on the front.
The Software
- OS: Linux monitor 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux
Use It
ssh to:
- monitor or monitor.local (as of 10.30.2014, not reachable by this host name; DNS may not be configured correctly)
- 192.168.50.15
- login: sudoroom:sudomesh
Icinga web UI:
- http://192.168.50.15/icinga2-classicui (login: icingaadmin:mojave)
Charting:
- Graphite - http://192.168.50.15/
- Grafana - http://192.168.50.15/grafana
- Mesh Charts - http://192.168.50.15/grafana/#/dashboard/db/sudomesh (works best in firefox)
GitHub repo:
- https://github.com/TinajaLabs/sudomesh_icinga
List of hosts
Run nmap (09.18.2014) to see a list of hosts which we might want to monitor...
Most of these are probably laptops connected to the internet.
nmap -sn 192.168.50.0/24
- 192.168.50.15 - monitoring server
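To turn the scan into a host list, the text output of `nmap -sn` can be parsed. This is a minimal sketch and not part of the installed setup: the function name and the sample output (including the 192.168.50.1 entry) are made up for illustration, and only the "Nmap scan report for ..." lines are used.

```python
import re

# Matches both "Nmap scan report for 192.168.50.15" and
# "Nmap scan report for monitor.local (192.168.50.15)".
REPORT_LINE = re.compile(
    r"^Nmap scan report for (?:(?P<name>\S+) \()?"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)?$"
)


def parse_ping_scan(output):
    """Return (hostname_or_None, ip) tuples found in `nmap -sn` text output."""
    hosts = []
    for line in output.splitlines():
        match = REPORT_LINE.match(line.strip())
        if match:
            hosts.append((match.group("name"), match.group("ip")))
    return hosts


# Hypothetical sample output, shaped like a real ping scan report.
SAMPLE = """\
Starting Nmap 6.00 ( http://nmap.org ) at 2014-09-18 20:00 PDT
Nmap scan report for 192.168.50.1
Host is up (0.0012s latency).
Nmap scan report for monitor.local (192.168.50.15)
Host is up (0.00010s latency).
Nmap done: 256 IP addresses (2 hosts up) scanned in 2.50 seconds
"""
```

Calling `parse_ping_scan(SAMPLE)` yields one tuple per responding host; the hostname is `None` when nmap could not resolve a name.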
Icinga
Interesting article: Setting up Icinga on a Debian Server to Remotely Monitor an OpenWrt Router (http://www.smallbusinesstech.net/more-complicated-instructions/nagios/setting-up-nagios-on-a-debian-server-to-remotely-monitor-an-openwrt-router)
Icinga is a Nagios fork which, as of Fall 2013, has more active development. Icinga is the central system that polls other systems such as OpenWrt routers. It gathers the data, tracks it, and can send notifications when values drift beyond normal tolerances. Each remote host needs nrpe and a basic set of nrpe plugins installed. The article referenced above shows how to install nrpe on OpenWrt through the OpenWrt web interface; after that, one can ssh into the router and configure it.
Once the router is configured, the central Icinga server must be configured with:
- the IP address of each node it will track
- the host groups
- the services that are to be monitored.
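The three items above map onto Nagios-style object definitions, which Icinga 1.x reads. The following is only a sketch of generating them: the helper names, the `monitor` host name and the `mesh-nodes` host group are hypothetical, and the `generic-host` / `generic-service` templates and the `check_ping` arguments are assumed to exist as in the stock sample configuration.

```python
def render_object(kind, fields):
    """Render one Nagios-style object definition (the format Icinga 1.x reads)."""
    width = max(len(key) for key in fields)
    body = "\n".join(
        f"    {key.ljust(width)}  {value}" for key, value in fields.items()
    )
    return f"define {kind} {{\n{body}\n}}\n"


def render_node(name, address, hostgroup,
                check_command="check_ping!100.0,20%!500.0,60%"):
    """Render a host plus one PING service for a single mesh node."""
    host = render_object("host", {
        "use": "generic-host",          # assumed template name
        "host_name": name,
        "address": address,             # the IP address of the node
        "hostgroups": hostgroup,        # the host group it belongs to
    })
    service = render_object("service", {
        "use": "generic-service",       # assumed template name
        "host_name": name,
        "service_description": "PING",  # the service to be monitored
        "check_command": check_command,
    })
    return host + "\n" + service
```

For example, `print(render_node("monitor", "192.168.50.15", "mesh-nodes"))` prints one `define host` block and one `define service` block that could be saved under the Icinga object configuration directory. The host group itself still has to be defined separately.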
Icinga 2
Icinga 2 is a rewrite of Icinga with a cleaner implementation and configuration structure. It will be able to make SNMP calls to nodes that run mini-snmpd and to send performance data to charting apps like Graphite.
See Install notes at: Icinga 2
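For reference, performance data reaches Graphite as plain text lines of the form "metric.path value timestamp" sent to the carbon listener. Below is a minimal sketch of that format, not the Icinga 2 implementation: the function names are made up, and the host and port (2003 is carbon's default plaintext port) are assumptions about this install.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one data point in Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="192.168.50.15", port=2003, timestamp=None):
    """Send one data point to a carbon listener (host and port are assumptions)."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))
```

A metric path such as `icinga.localhost.ping4.rta` then shows up in the Graphite tree under `icinga > localhost > ping4 > rta`.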
Access to remote hosts
Icinga can use various methods to monitor remote hosts: NRPE, SNMP, etc.
ICMP
The ping service, used to determine whether a host is reachable.
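A reachability check of this kind can be sketched as a thin wrapper around the system `ping` command. This is an illustration only, assuming the Linux iputils `ping` flags (`-c` for count, `-W` for reply timeout in seconds); the function names are hypothetical, and the `run` parameter exists only so the check can be exercised without touching the network.

```python
import subprocess


def ping_command(host, count=1, timeout_s=2):
    """Build the ping invocation (Linux iputils flags assumed)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def is_reachable(host, run=subprocess.run):
    """Return True when ping exits with status 0, i.e. a reply was received."""
    result = run(
        ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, `is_reachable("192.168.50.15")` reports whether the monitoring server answers a single echo request.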
SNMP
We will probably start off with simple SNMP monitoring. It returns only very basic info, but it does not require much setup on the remote hosts.
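As an example of the basic info SNMP returns, the uptime of a node can be read with the Net-SNMP `snmpget` tool and converted to seconds. This is a sketch under assumptions: the community string `public` is a placeholder, the function names are made up, and the node's SNMP agent is assumed to answer for sysUpTime.

```python
import re

SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"  # SNMPv2-MIB sysUpTime.0


def snmpget_command(host, oid=SYS_UPTIME_OID, community="public"):
    """Build a Net-SNMP snmpget call (community string is an assumption)."""
    return ["snmpget", "-v", "2c", "-c", community, host, oid]


def parse_timeticks(output):
    """Convert a 'Timeticks: (N)' reply to seconds (ticks are 1/100 s)."""
    match = re.search(r"Timeticks: \((\d+)\)", output)
    if match is None:
        raise ValueError("no Timeticks value in snmpget output")
    return int(match.group(1)) / 100.0
```

The command list can be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured stdout handed to `parse_timeticks`.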
OpenWRT Package: nrpe
NRPE requires a daemon running on the remote host and a number of mostly bash scripts for specialized info. Bash scripts can be developed to read disk stats, memory usage, etc. There are about 30 scripts which come with NRPE.
Package: nrpe
Version: 2.12-4
Depends: libc, librt, libpthread, libopenssl, libwrap
Source: feeds/packages/admin/nrpe
SourceFile: nrpe-2.12.tar.gz
SourceURL: @SF/nagios
Section: admin
Architecture: ib42x0
Installed-Size: 19018
Filename: nrpe_2.12-4_ib42x0.ipk
Size: 19801
MD5Sum: f36019344c747a1e88f5aab50776bd4e
Description:  The NRPE addon is designed to allow you to execute Nagios plugins on
 remote Linux/Unix machines.  The main reason for doing this is to allow
 Nagios to monitor "local" resources (like CPU load, memory usage, etc.)
 on remote machines.  Since these public resources are not usually
 exposed to external machines, an agent like NRPE must be installed on
 the remote Linux/Unix machines.
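The scripts NRPE runs follow the Nagios plugin convention: print one status line (optionally followed by `|` and performance data) and exit with 0 for OK, 1 for WARNING, 2 for CRITICAL or 3 for UNKNOWN. The sketch below shows that convention for a memory check. It is written in Python for illustration rather than bash, the thresholds and names are made up, and "used" is simplified to MemTotal minus MemFree, which ignores buffers and cache.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a Nagios status code."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def memory_used_percent(meminfo_text):
    """Compute used memory in percent from the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key] = int(rest.split()[0])
    total = fields["MemTotal"]
    # Simplification: buffers and cache are counted as used.
    return 100.0 * (total - fields["MemFree"]) / total


def check_memory(meminfo_text, warn=80.0, crit=90.0):
    """Return (exit_code, status_line) in Nagios plugin format."""
    used = memory_used_percent(meminfo_text)
    status = evaluate(used, warn, crit)
    line = (f"MEMORY {LABELS[status]} - {used:.1f}% used"
            f" | used_pct={used:.1f}%;{warn};{crit}")
    return status, line


def main():
    """Run the check against the local machine and return the exit code."""
    try:
        with open("/proc/meminfo") as handle:
            code, line = check_memory(handle.read())
    except (OSError, KeyError, ValueError) as exc:
        code, line = UNKNOWN, f"MEMORY UNKNOWN - {exc}"
    print(line)
    return code
```

Saved as a script that ends with `sys.exit(main())`, this could be registered as a command in the NRPE configuration on the remote host, provided the host has a Python interpreter, which small OpenWrt nodes often do not.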
Alternative tools
RRD
- http://oss.oetiker.ch/rrdtool/
- also look at whisper, a light weight rrd - http://graphite.wikidot.com/whisper
collectd
- http://collectd.org/
Sensu - Deprecated
See our notes about the Sensu install: Sensu Page
Requires a client application service and not useful for our needs for monitoring mesh nodes...
- The sensu web page can be accessed internally at:
- http://192.168.42.65:8080/# (as of 2014.07.25 no services defined)
- user: admin
- pw: secret
 
About Sensu - From Sensu site
Sensu is often described as the "monitoring router". Essentially, Sensu takes the results of "check" scripts run across many systems, and if certain conditions are met; passes their information to one or more "handlers". Checks are used, for example, to determine if a service like Apache is up or down. Checks can also be used to collect data, such as MySQL query statistics or Rails application metrics. Handlers take actions, using result information, such as sending an email, messaging a chat room, or adding a data point to a graph. There are several types of handlers, but the most common and most powerful is "pipe", a script that receives data via standard input. Check and handler scripts can be written in any language, and the community repository continues to grow!
Fun Sensu facts:
- Written in Ruby, using EventMachine.
- Has great test coverage with continuous integration via Travis CI.
- Can use existing Nagios plugins.
- Configuration is all in JSON.
- Has a message-oriented architecture, using RabbitMQ and JSON payloads.
- Packages are "omnibus", for consistency, isolation, and low-friction deployment.
- Sensu is designed for modern infrastructures and to be driven by configuration management tools, designed for the "cloud".
Charting
Graphite
- http://www.slideshare.net/reyjrar/graphite-overview
- http://graphite.wikidot.com/whisper
- http://graphite.wikidot.com/
- http://graphite.readthedocs.org/en/latest/config-local-settings.html
- graphite admin: http://192.168.50.15/admin - sudomesh:sudomesh
- 09.18.2014, ChrisJ started installing this on the monitor server. Not finished...
- 10.02.2014 - another night... continuing.
- installed on host: monitor:/opt/graphite/
- install tutorial: https://tipstricks.itmatrix.eu/installing-icinga2-in-debian-wheezy/
Sample live charts rendered from Graphite
 
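Round trip average ping over the last 2 hours:
http://192.168.50.15/render?width=400&from=-2hours&until=now&height=250&target=icinga.localhost.ping4.rta&_uniq=0.43809576937928796&title=icinga.localhost.ping4.rta&.png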
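Graphite's render endpoint takes the chart settings as query parameters (`target`, `from`, `until`, `width`, `height`, `title`), so chart links can be generated instead of copied by hand. A minimal sketch follows, assuming the monitor server address used elsewhere on this page; the function name and defaults are made up.

```python
from urllib.parse import urlencode


def render_url(target, base="http://192.168.50.15", hours=2,
               width=400, height=250):
    """Build a Graphite render URL for one metric over the last N hours."""
    query = urlencode({
        "target": target,
        "from": f"-{hours}hours",
        "until": "now",
        "width": width,
        "height": height,
        "title": target,
    })
    return f"{base}/render?{query}"
```

For example, `render_url("icinga.localhost.ping4.rta")` gives a link to the round trip average chart for the last 2 hours.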
Grafana
http://grafana.org/
- a cool dashboard that you can install under Apache (maybe nginx) to create a dashboard of data streamed from Graphite
- limited to line charts; no meters, gauges, etc.
Installed in: /opt/grafana
As of 10.30.2014 - not yet working... has an issue connecting to Elasticsearch to save settings in Grafana. https://github.com/grafana/grafana/issues/330
Setup
http://192.168.50.15/grafana
http://192.168.50.15/grafana/#/dashboard/db/sudomesh - works best in Firefox...
Elasticsearch
- requires elasticsearch - https://xenforo.com/community/threads/how-to-basic-elasticsearch-installation-debian-ubuntu.26163/
http://192.168.50.15:9200 - returns something like this:
{
  "status" : 200,
  "name" : "Exterminator",
  "version" : {
    "number" : "1.0.3",
    "build_hash" : "61bfb72d845a59a58cd9910e47515665f6478a5c",
    "build_timestamp" : "2014-04-16T14:43:11Z",
    "build_snapshot" : false,
    "lucene_version" : "4.6"
  },
  "tagline" : "You Know, for Search"
}
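When debugging the Grafana connection it can help to check this response from a script. The sketch below parses the banner and returns the version as a tuple. It assumes the response has the shape shown above (a top-level "status" field and "version.number"), and the function name is made up.

```python
import json


def elasticsearch_version(body):
    """Return the Elasticsearch version as a tuple of ints from the banner JSON."""
    info = json.loads(body)
    if info.get("status") != 200:
        raise RuntimeError(f"unexpected status: {info.get('status')!r}")
    return tuple(int(part) for part in info["version"]["number"].split("."))


# Trimmed copy of the response shown above.
SAMPLE_BANNER = """
{
  "status" : 200,
  "name" : "Exterminator",
  "version" : { "number" : "1.0.3", "lucene_version" : "4.6" },
  "tagline" : "You Know, for Search"
}
"""
```

The body can be fetched with `urllib.request.urlopen("http://192.168.50.15:9200").read()` and passed straight to `elasticsearch_version`.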
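Graphene
- https://github.com/jondot/graphene
- http://jondot.github.io/graphene - demo
Cricket
- http://cricket.sourceforge.net/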
motd
We can set a login message in the /etc/motd file.
These samples are from http://patorjk.com/software/taag/#p=testall&h=0&v=0&f=Mer&t=SudoMesh
   ____           __        __  ___             __ 
  / __/ __ __ ___/ / ___   /  |/  / ___   ___  / / 
 _\ \  / // // _  / / _ \ / /|_/ / / -_) (_-< / _ \
/___/  \_,_/ \_,_/  \___//_/  /_/  \__/ /___//_//_/
______________________________________________________
    SudoMesh Monitoring Server - Fall, 2014
 _______            __         _______                __    
|     __|.--.--..--|  |.-----.|   |   |.-----..-----.|  |--.
|__     ||  |  ||  _  ||  _  ||       ||  -__||__ --||     |
|_______||_____||_____||_____||__|_|__||_____||_____||__|__|
    SudoMesh Monitoring Server - Fall, 2014                                                          
 ____                  __                                          __         
/\  _`\               /\ \             /'\_/`\                    /\ \        
\ \,\L\_\    __  __   \_\ \     ___   /\      \      __     ____  \ \ \___    
 \/_\__ \   /\ \/\ \  /'_` \   / __`\ \ \ \__\ \   /'__`\  /',__\  \ \  _ `\  
   /\ \L\ \ \ \ \_\ \/\ \L\ \ /\ \L\ \ \ \ \_/\ \ /\  __/ /\__, `\  \ \ \ \ \ 
   \ `\____\ \ \____/\ \___,_\\ \____/  \ \_\\ \_\\ \____\\/\____/   \ \_\ \_\
    \/_____/  \/___/  \/__,_ / \/___/    \/_/ \/_/ \/____/ \/___/     \/_/\/_/
     SudoMesh Monitoring Server - Fall, 2014
  ____                _           __  __                _     
 / ___|   _   _    __| |   ___   |  \/  |   ___   ___  | |__  
 \___ \  | | | |  / _` |  / _ \  | |\/| |  / _ \ / __| | '_ \ 
  ___) | | |_| | | (_| | | (_) | | |  | | |  __/ \__ \ | | | |
 |____/   \__,_|  \__,_|  \___/  |_|  |_|  \___| |___/ |_| |_|
     SudoMesh Monitoring Server - Fall, 2014