I have a CentOS box set up as a gateway with iptables (among many other things, such as a proxy cache using Squid). My LAN consists of ~30 machines, all of which reach the internet through the gateway's eth0; the gateway is the only machine connected to the DSL modem, via eth1.
Facing a monthly download quota, I installed bandwidthd on the gateway to monitor each PC's bandwidth use, and set it to monitor eth0.
All the network traffic arriving at the gateway is redirected to port 3128 for Squid to handle, but bandwidthd doesn't seem to be accurate. For instance, a 3.3 MB download on one of the computers (made after starting bandwidthd) shows up as 4.8 MB for that computer under HTTP. I understand that by monitoring eth0 I'm monitoring all network activity and not just internet usage (port 80, etc.), but isn't the HTTP tab in bandwidthd supposed to reflect packets on port 80, i.e. internet usage?
I need to know how much of the internet download quota each IP used on a daily basis.
What to do :) ?
The problem I find with counter applications is that once you have usage counts, or graphs, the inevitable question becomes "what is that" (when pointing at a total or peak on a graph).
I find the best way to investigate these things is with netflows and NfSen. A netflow is a record of a conversation: source, destination, ports, byte counts, time. Think of it as a wire capture where you don't care about the actual bytes transferred, just the aggregate information. By using something like NfSen to analyze a netflow collection, you can see who is talking to whom, and by looking at ports you can usually make good guesses as to what they were doing. Best of all, you can go back in time to look at old conversations.
Here are my notes for installing nfsen on CentOS.
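Once flows are being collected, NfSen's command-line backend nfdump can answer the "who used the quota" question directly. A sketch, assuming a hypothetical flow-archive path (adjust to wherever your collector writes):

```shell
# Top 10 source IPs by bytes for one day, read recursively from the
# flow archive (the path and date are placeholders for your setup):
nfdump -R /data/nfsen/flows -s srcip/bytes -n 10 \
       -t 2009/06/01.00:00-2009/06/01.23:59
```

Swap `srcip` for `dstip` to rank by download destination instead of source.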
Another old but trusty tool is iptraf. It comes with pretty much every distribution and is small and quick to install. It's nice that it's ncurses-based, so you can run it from the command line quickly to get real-time info about the flow of traffic on that box.
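For reference, iptraf can jump straight into a specific view from the command line instead of going through its menu (interface name is a placeholder):

```shell
iptraf -i eth0   # live IP traffic monitor on eth0
iptraf -d eth0   # detailed per-interface statistics
iptraf -s eth0   # TCP/UDP traffic broken down by port
```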
You could probably set up a specific iptables allow rule for port 3128, since that carries all of your outbound traffic, and then use "/sbin/iptables -L -v", which shows you the bytes and number of packets that have come through each rule.
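Taking that idea one step further, you can get per-IP counters by adding one match-only rule per LAN host: a rule with no `-j` target still counts everything that matches it. A sketch with placeholder addresses and chain name:

```shell
# Hypothetical accounting chain -- adjust IPs/interfaces to your network.
iptables -N TRAFFIC_ACCT
iptables -I FORWARD -j TRAFFIC_ACCT
iptables -A TRAFFIC_ACCT -s 192.168.1.10   # upload from one LAN host
iptables -A TRAFFIC_ACCT -d 192.168.1.10   # download to the same host

# Read the counters (-x prints exact byte counts; -Z zeroes them,
# e.g. from a nightly cron job to get daily totals):
iptables -L TRAFFIC_ACCT -v -n -x
```

Note that traffic REDIRECTed to a local Squid on 3128 traverses INPUT/OUTPUT rather than FORWARD, so where you hook the accounting chain depends on your NAT setup.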
Trust bandwidthd. A 3.3 MB file COULD conceivably use 4.8 MB of bandwidth; I don't think you are accounting for overhead. Each packet of data from that file has information attached to it that helps it get to its destination.
I don't know how to calculate the amount of overhead you should see. Hopefully someone can chime in on that and add to this answer.
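As a rough back-of-the-envelope estimate: for a full-size Ethernet frame the fixed headers (20 B IPv4 + 20 B TCP + 18 B Ethernet framing) sit around a 1460 B payload, which works out to only a few percent. Smaller packets (ACKs, interactive traffic) push the ratio up, so header overhead alone rarely explains a 45% gap; retransmissions or other traffic from the same host would have to account for the rest.

```shell
# Header overhead fraction for a full-size Ethernet frame:
awk 'BEGIN { payload = 1460; hdrs = 20 + 20 + 18
             printf "%.1f%% overhead\n", 100 * hdrs / (payload + hdrs) }'
# prints: 3.8% overhead
```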
I decided on using ntop; it's rather simple to install, with a very rich web interface. It will most probably do the job :)
Edit: my problem's solved. Under HTTP in ntop, while monitoring the eth0 interface to which all traffic is routed, I can accurately see the size of everything downloaded, per IP.
What is the best tool to monitor/analyze network traffic on an entire network (several subnets)?
I'm looking for something that will help me troubleshoot bandwidth problems when, for instance, users start complaining that the "network is slow".
I'm assuming you have a commercial router/switch; it most likely supports SNMP, which you can combine with MRTG for a nice traffic graph.
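A quick sanity check that the device answers SNMP before pointing MRTG at it (community string and address here are placeholders for your own):

```shell
# List interfaces and their input-byte counters over SNMP:
snmpwalk -v 2c -c public 192.168.0.1 IF-MIB::ifDescr
snmpwalk -v 2c -c public 192.168.0.1 IF-MIB::ifInOctets

# cfgmaker ships with MRTG and generates the per-interface config:
cfgmaker --global 'WorkDir: /var/www/mrtg' \
         --output /etc/mrtg/mrtg.cfg public@192.168.0.1
```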
I think your best bet is going to be a mixture of Cacti and Ntop.
ntop is going to give you information about the traffic on your network, like which hosts are consuming the most, what traffic is causing slowdowns, etc.
Cacti is going to give you long-term trends in your bandwidth consumption, so you can tell how your network's traffic has changed over time.
When you have users reporting 'network issues', the problem could relate to a multitude of issues (routing, switching, host configuration, unicast, multicast, security policy, hardware failure). It's very unlikely that you'll find one piece of software to monitor all your different potential problems.
Instead, focus on two things:
Instrumentation: come up with a monitoring strategy that allows you to proactively monitor for those faults that occur regularly. See this previous answer for more detail.
Troubleshooting: come up with a quick, standard series of tests that you can run to immediately try and isolate where the problem might be, and publish it to your users.
Some example tests:
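A few quick first-pass checks, sketched in shell (host names and addresses are placeholders for your own gateway and servers):

```shell
ping -c 4 192.168.0.1          # can we reach the default gateway?
ping -c 4 fileserver           # can we reach the server in question?
nslookup www.example.com       # is DNS resolving?
traceroute www.example.com     # where does the path slow down or die?
```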
These kinds of simple diagnostics can often point you very quickly in the right direction. Finally, if you can, always get a source IP, a destination IP, and a destination port. Try to educate your users; ambiguous complaints like 'the network is slow' can't be easily diagnosed.
Try MRTG and/or ntop.
I have been using Smoothwall at home with great success; it does a great job monitoring traffic and a ton more.
It comes in a corporate edition as well that does some more fancy stuff.
I was trying to figure out why I kept running out of bandwidth (in Australia we have limits); it turns out it was my fault :)
I'm working at an organization that has a small-to-medium-sized network (~500 users) and about a dozen /24 subnets (and a handful of smaller ones behind NAT). We use a variety of monitoring software that allows us to keep tabs on remote parts of the network and respond to problems proactively.
Check out the products from VSS Monitoring. They have several different in-line fail safe products for monitoring network traffic remotely. Once you have them peered into your network(s) and on the backbone, it is as good as being there.
If you have a router capable of reporting netflows, look into a netflow handler. Where MRTG will give you link utilization, netflows report the IP and protocol usage flowing through the router. So, instead of "Suzy in accounting is using a lot of traffic" or "The port the WAP is on has high utilization", you could see "Suzy in accounting is 10% LAN traffic, 40% streaming media, and 50% internet HTTP traffic."
Unfortunately I don't have a recommendation for a free flow aggregator. After a net monitoring company tried to sell my company a solution and I determined that their whole product was based on netflows, I made a note to research them. Before I got around to it we bought another NOC solution that also included a flow aggregator.
I've been using Wireshark for years. Love it.
First of all, are the users complaining about your local network?
The fileserver is slow!
or are they complaining about remote websites ?
Facebook is slow! I can't do my work!
If it's the former, then I would start with the fileserver in question and work backwards. First check the fileserver: is its utilization out of the ordinary? Check the interface that user traffic flows over. Is it pegged? Is auto-negotiation enabled? Is it enabled on both ends?
If everything looks OK there and the server is not under any undue load, try the routers and switches in the path between the user and the server. Are they overloaded? Is auto-negotiation enabled? Check the interface counters for errors.
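On a Linux box in the path, the duplex/negotiation and error-counter checks above boil down to two commands (interface name is a placeholder):

```shell
ethtool eth0           # speed, duplex, and 'Auto-negotiation: on/off'
ip -s link show eth0   # RX/TX byte, error, and drop counters
```

A duplex mismatch typically shows up as a climbing error count on one side of the link.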
If there appears to be nothing wrong there either, then the problem may be local to the user's workstation. Is it under undue load? Are there any hardware errors (disk errors causing blocking while the firmware retries)? Is the machine low on real memory (Firefox paging hard)?
This usually solves 99% of the problems.
Depending on the frequency you have to deal with these requests you may prefer to reverse the order of these steps.
Alternatively, if it's a problem with a remote site: after checking your network and the user's workstation, try tools like mtr to detect packet loss between you and the remote site. If the problem is not local to your network, then your options are probably limited to logging a case with your provider, or waiting until the remote site gets over whatever tizzy it's having.
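mtr in report mode is handy to attach to a provider ticket: it sends a fixed number of probes per hop and prints one summary table showing loss and latency at each hop (the hostname is a placeholder):

```shell
mtr --report --report-cycles 100 www.example.com
```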