20130222

How to monitor a non-centralized, best-effort, BGP-based wireless network

Some thoughts on how to monitor the health of a non-centralized, best-effort, amateur, mostly-BGP wireless network.

Why not ask all the backbone node operators to send a detailed email with their peering IP addresses or, even better, give them the password to a Nagios instance so they can enter the addresses themselves?

Unfortunately, this is not going to work, not in Greece, not for AWMN at least.

Why not use the Wireless Node Database as input to the monitoring system?

Well, the Wireless Node Database contains information `maintained` by the backbone operators. Unfortunately, even though backbone operators usually put their links in there (a sign of status), they forget to take them off once the links are dead, and they usually do not enter peering IP addresses.

In a best-effort, non-professional internet one cannot rely on the operators for accurate information. Even in a professional Internet one cannot rely on human operators. This is something that needs to be automated.

In a BGP-based network, finding all the routers along with their peering IP addresses and checking them from monitoring systems placed in different parts of the network should be sufficient.

I thought of a couple of ways of finding all the peering IP addresses, and both start with a BGP feed: a "show ip bgp" on a router or a Quagga daemon.

This way I get the advertised routes along with their AS paths and originating AS numbers.
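A minimal sketch of what can be pulled out of such a feed, assuming the "show ip bgp" output has already been reduced to (prefix, AS path) pairs with plain AS numbers only (no AS sets). The function name `summarize_feed` and the sample prefixes and AS numbers are made up for illustration.

```python
def summarize_feed(routes):
    """Derive origin ASes and AS-level adjacencies from (prefix, as_path) pairs.

    `routes` is an iterable of (prefix, as_path) tuples, where as_path is a
    space-separated string of AS numbers (an assumed, pre-parsed format).
    """
    origins = {}          # prefix -> originating AS (last AS in the path)
    adjacencies = set()   # unordered pairs of ASes seen next to each other
    for prefix, as_path in routes:
        hops = [int(asn) for asn in as_path.split()]
        if not hops:
            continue  # empty path: route originated by the local router
        origins[prefix] = hops[-1]
        for left, right in zip(hops, hops[1:]):
            if left != right:  # ignore AS-path prepending
                adjacencies.add(tuple(sorted((left, right))))
    return origins, adjacencies


if __name__ == "__main__":
    # Invented sample data, not taken from a real feed.
    sample = [
        ("10.2.0.0/24", "3298 1234"),
        ("10.3.0.0/24", "3298 3298 77"),
    ]
    origins, adjacencies = summarize_feed(sample)
    print(origins)
    print(sorted(adjacencies))
```

The adjacencies only say which ASes are neighbours on some path. They do not give the peering IP addresses, which is why one of the two approaches below is still needed.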

Great! Now one could go two ways and the first one sounds easier at first.

1) Scan the advertised IP space for daemons listening on TCP port 179 (BGP). This is nearly trivial with a scanner like nmap or a little Perl script. Some `passive` scanning could tell unique routers apart, but I would still need to figure out the site (the side of a link, the peering) for each router. Also, a daemon listening on port 179 does not necessarily mean the edge of a link.

2) Traceroute, one to each route. A `traditional` traceroute shows only one side of each link, so only half of the peering IP addresses can be seen from one point, assuming all paths are symmetric. For the purposes of a monitoring system that should be good enough. Knowing all the peering IP addresses would be even better, since problems inside a node could be spotted more easily by the monitoring. Some claim to have written reverse traceroute tools, but I have not been able to find any sources.
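Both approaches can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a finished tool: the helper names (`candidate_hosts`, `bgp_port_open`, `hop_addresses`) are hypothetical, the scan is a plain TCP connect to port 179, and the traceroute parser assumes numeric output such as that produced by `traceroute -n`.

```python
import ipaddress
import re
import socket

BGP_PORT = 179
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def candidate_hosts(prefix, limit=256):
    """Return up to `limit` host addresses of an advertised prefix, as strings."""
    network = ipaddress.ip_network(prefix, strict=False)
    hosts = []
    for address in network.hosts():
        if len(hosts) >= limit:
            break
        hosts.append(str(address))
    return hosts


def bgp_port_open(host, timeout=2.0):
    """Approach 1: True if a TCP connect to port 179 on `host` succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, BGP_PORT)) == 0


def hop_addresses(traceroute_output):
    """Approach 2: collect the hop addresses from numeric traceroute output.

    Lines that do not start with a hop number (the header, blank lines) are
    skipped, and each address is kept once, in order of first appearance.
    """
    hops = []
    for line in traceroute_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue
        for address in IPV4.findall(line):
            if address not in hops:
                hops.append(address)
    return hops
```

Feeding each advertised prefix through `candidate_hosts` and `bgp_port_open` gives the list of listening daemons for the first approach. Running one traceroute per route and passing the output to `hop_addresses` gives the near-side address of each link for the second.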

It's late and I need to sleep. END of Part 1.