Sunday, November 16, 2014

Monitor Outgoing Internet Connections - #1 (Starting)

Originally posted on the OpenWRT forum at https://forum.openwrt.org/viewtopic.php?id=54048

Hi folks!
I want to share my experience and get some feedback and advice. My objective is to log and analyze/visualize all Internet connections from my local network, so I can see how individual devices use the Internet.
Setup: cable modem (100MB down, 6MB up) with a Fritzbox 7390 for WLAN connectivity, and a surprising number of networked devices (iPad, Chromebook, smartphone, Internet radio, Skype appliance, home PC, work laptop, Blu-ray player, little NAS, …).
I got a TP-Link WDR3600 for 44€ at my local Mediamarkt and installed OpenWRT, which worked flawlessly. Nice.
Now, how do I capture/log all Internet traffic?

Option 1 – transparent proxy with TinyProxy
Installation worked fine (opkg install tinyproxy luci-app-tinyproxy). Before changing the routing for a transparent proxy, I manually configured one client to use the proxy. Web performance was noticeably slower. It also proxies HTTP only; for HTTPS, do I have to install my own certificates on all devices? Is there a simpler way? And honestly, routing on OpenWRT is a piece of work. By default, the logs show the URL without the hostname.
The reference I found (http://www.farville.com/home-networks-a-transparent-proxy-to-monitor-kids/) requires another server for log processing (yes, I already have a Raspberry lying around, but no…).

Option 2 – TCPDUMP
If setting up a transparent proxy and logging is so complex, why not listen to the traffic directly? Wireshark/Tshark is not available on OpenWRT, but tcpdump is.
Installing tcpdump is very easy: “opkg install tcpdump”.
Then use “tcpdump -D” to show all network interfaces, or “tcpdump -q -tttt” to show all connections with timestamps.
tcpdump -q -i br-lan -tttt
2014-11-14 16:51:16.282484 IP iPadwwwi2daycom.lan.53516 > ea-in-f94.1e100.net.https: tcp 0
2014-11-14 16:51:16.282818 IP iPadwwwi2daycom.lan.53516 > ea-in-f94.1e100.net.https: tcp 0
But holy moly, it logs every fart in the air. Every ping, every ARP, everything. Piping the simplified output (without the actual packet data) to a file generates a few MB within a few minutes. And without some magic it only shows the host, not the visited URL.
tcpdump -q -i br-lan -tttt > tcpdump.log
2014-11-14 19:11:04.356741 IP iPadwwwi2daycom.lan.54532 > 195.10.18.43.https: tcp 0
2014-11-14 19:11:04.357311 IP iPadwwwi2daycom.lan.54533 > 195.10.18.43.https: tcp 0
2014-11-14 19:11:04.400291 IP 195.10.18.43.https > iPadwwwi2daycom.lan.54532: tcp 0
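One way to cut the noise might be a capture filter that keeps only the opening packet of each new TCP connection, so pings, ARP and the rest of each conversation never reach the log. This is a sketch that has not been tried on the WDR3600; the variable and function names are made up for illustration, and the filter expression only matches IPv4 traffic.

```shell
# Hypothetical helper, not from the original post.
# pcap filter: TCP packets with SYN set and ACK clear,
# i.e. the opening packet of each new connection (IPv4 only).
NEW_CONN_FILTER='tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'

capture_new_connections() {
    # $1: interface to listen on (br-lan in this post)
    # $2: file that receives the one-line-per-packet output
    tcpdump -q -i "$1" -tttt "$NEW_CONN_FILTER" > "$2"
}

# Example (on the router): capture_new_connections br-lan tcpdump.log
```

With a filter like this each connection should show up as a single line instead of one line per packet.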

I guess I have to invest some time in setting up proper filtering.
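Filtering can also happen after the capture. The sketch below condenses a tcpdump.log like the one above into "count source > destination" lines with awk. The function name is hypothetical, and it assumes the "-q -tttt" line layout shown in this post, where TCP lines end in "tcp <length>" and every host carries a ".port" suffix.

```shell
# Hypothetical helper, not from the original post.
# Reads "tcpdump -q -tttt" text output (files given as arguments, or stdin)
# and prints how many TCP packets were seen per source > destination pair.
summarise_tcpdump() {
    awk '$3 == "IP" && $7 == "tcp" {
        src = $4
        dst = $6
        sub(/\.[^.]*$/, "", src)   # drop the trailing ".port" from the source
        sub(/:$/, "", dst)         # drop the trailing colon from the destination
        sub(/\.[^.]*$/, "", dst)   # drop the trailing ".port" from the destination
        count[src " > " dst]++
    }
    END { for (pair in count) print count[pair], pair }' "$@" | sort -rn
}

# Example: summarise_tcpdump tcpdump.log
```

A few MB of raw lines should shrink to one line per pair of hosts, busiest first.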
But here is the showstopper: I could not get tcpdump to keep running in the background. When the SSH session ends, tcpdump stops too (even when started with a trailing “&”). I did not find a working recipe on the Internet (I tried “screen” too).
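One more thing that might be worth trying is to start the command so that it ignores the hang-up signal and has no connection to the SSH terminal at all (stdin, stdout and stderr redirected). This is only a sketch with a made-up helper name; whether it actually keeps tcpdump alive on this router is an assumption and has not been verified.

```shell
# Hypothetical helper, not from the original post: start a command so that it
# ignores the hang-up signal and holds no reference to the SSH terminal.
run_detached() {
    # $1: file for stdout and stderr; remaining arguments: the command to run
    log=$1
    shift
    (
        trap '' HUP                       # ignored signals stay ignored in the child
        "$@" </dev/null >"$log" 2>&1 &    # no terminal on stdin, stdout or stderr
    )
}

# Example (on the router): run_detached /tmp/tcpdump.log tcpdump -q -i br-lan -tttt
```

The subshell returns immediately, so the started command is no longer a job of the login shell.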

Option 3 – DNS logging
Another direction is to monitor all DNS requests coming from the local network. OpenDNS offers some services/functionality here. Of course, this will only show the Internet host and not the full URL either.
OpenWRT ships with dnsmasq for DHCP and DNS. The manpage at http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html shows how to give DNS answers a short Time-To-Live, so that every website visit triggers a new DNS query (even if the client would otherwise remember the correct IP from a minute ago).
First install the text editor nano (an alternative to vi) via “opkg install nano”. Now configure dnsmasq with “nano /etc/dnsmasq.conf”.
Add the following lines:
# Set the TTL value returned in answers from the authoritative server.
max-ttl=0
auth-ttl=0
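A note on logging: dnsmasq only writes "query[A] …" lines to the syslog when its log-queries option is enabled, and the lines above do not set it. If no query lines show up, a sketch like the following could add the option once. The helper name is made up, and it assumes that /etc/dnsmasq.conf is read by the running dnsmasq, as on a stock OpenWRT install.

```shell
# Hypothetical helper, not from the original post.
# Appends "log-queries" to a dnsmasq config file unless the line is already there.
enable_query_logging() {
    # $1: path to the config file, e.g. /etc/dnsmasq.conf
    conf=$1
    grep -qx 'log-queries' "$conf" 2>/dev/null || echo 'log-queries' >> "$conf"
}

# Example (on the router): enable_query_logging /etc/dnsmasq.conf
```

Running it twice leaves a single log-queries line in the file.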
Finally, restart dnsmasq with “reboot”, or with “killall dnsmasq” followed by “/etc/init.d/dnsmasq start”.
Now you can read every DNS query in the syslog with “logread | grep "query\[A"” (run plain logread to see every message). And with “logread -f | grep "query\[A" >> dnsmasq.log &” all new entries are written to a logfile in the background (you can disconnect from the SSH session). The logfile is only a few dozen kB after an hour and looks like this:
Sun Nov 16 18:14:56 2014 daemon.info dnsmasq[11659]: query[A] www.facebook.com from 192.168.1.244
Sun Nov 16 18:14:56 2014 daemon.info dnsmasq[11659]: query[AAAA] www.facebook.com from 192.168.1.244
Sun Nov 16 18:14:57 2014 daemon.info dnsmasq[11659]: query[A] farm.plista.com from 192.168.1.244
Sun Nov 16 18:14:57 2014 daemon.info dnsmasq[11659]: query[AAAA] farm.plista.com from 192.168.1.244
Sun Nov 16 18:14:57 2014 daemon.info dnsmasq[11659]: query[A] csi.gstatic.com from 192.168.1.244
Sun Nov 16 18:14:57 2014 daemon.info dnsmasq[11659]: query[AAAA] csi.gstatic.com from 192.168.1.244
Sun Nov 16 18:14:58 2014 daemon.info dnsmasq[11659]: query[A] pubads.g.doubleclick.net from 192.168.1.244
Sun Nov 16 18:14:58 2014 daemon.info dnsmasq[11659]: query[AAAA] pubads.g.doubleclick.net from 192.168.1.244
Sun Nov 16 18:14:58 2014 daemon.info dnsmasq[11659]: query[A] www.google-analytics.com from 192.168.1.244
Sun Nov 16 18:14:58 2014 daemon.info dnsmasq[11659]: query[AAAA] www.google-analytics.com from 192.168.1.244
Sun Nov 16 18:14:58 2014 daemon.info dnsmasq[11659]: query[AAAA] partnerad.l.doubleclick.net from 192.168.1.244
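Every lookup appears twice in this log, once as query[A] and once as query[AAAA]. A small awk sketch can reduce the log to one "time client host" line per A query. The function name is made up, and the field positions assume exactly the logread layout shown here.

```shell
# Hypothetical helper, not from the original post.
# Prints "time client host" for every IPv4 (A) lookup in a dnsmasq query log
# (files given as arguments, or stdin). AAAA lines are skipped.
dns_visits() {
    awk '$8 == "query[A]" && $10 == "from" { print $4, $11, $9 }' "$@"
}

# Example: dns_visits dnsmasq.log
```

For the first two log lines above this would print a single line: the time 18:14:56, the client 192.168.1.244 and the host www.facebook.com.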
Nice, excellent! Small logfile, every web visit in there, with timestamp and IP address of the local host. Now I just have to filter out all the advertising BS associated with every website (you could even build your own adblocker by manipulating the DNS records for all these trackers).
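Filtering out the advertising hosts could be as simple as a blocklist file and grep. This is a sketch under assumptions: the helper name and the blocklist file are hypothetical, and which domains count as advertising is up to you (the example uses domains from the log above).

```shell
# Hypothetical helper, not from the original post.
# Drops every log line that mentions a domain listed in a blocklist file.
# $1: file with one domain per line (no empty lines); the log is read from stdin.
drop_ads() {
    grep -v -F -f "$1"
}

# Example:
#   printf '%s\n' doubleclick.net google-analytics.com plista.com > adhosts.txt
#   drop_ads adhosts.txt < dnsmasq.log
```

Because the match is a plain substring match, an entry like doubleclick.net also removes pubads.g.doubleclick.net and partnerad.l.doubleclick.net.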
After that I have to download the logfile and display connections over time (still have to figure this one out).
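As a first step toward "connections over time", the log could be bucketed into A queries per client per hour, which gives a small table that is easy to chart in a spreadsheet. Again only a sketch: the function name is made up and the field positions assume the logread layout shown above.

```shell
# Hypothetical helper, not from the original post.
# Counts A queries per day, hour and client in a dnsmasq query log
# (files given as arguments, or stdin).
# Output columns: month, day, hour, client IP, number of queries.
queries_per_hour() {
    awk '$8 == "query[A]" {
        hour = substr($4, 1, 2)              # "18:14:56" -> "18"
        n[$2 " " $3 " " hour ":00 " $11]++
    }
    END { for (key in n) print key, n[key] }' "$@" | sort
}

# Example: queries_per_hour dnsmasq.log
```

Each output line is one bucket, for example all A queries from one device between 18:00 and 19:00.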


Summary:
I am surprised that there are so few posts on the Internet for this specific use case (why has nobody posted a simple how-to?).
I am surprised how much traffic a simple website generates (yes, I knew this before, but seeing it is different).
I am curious whether this forum can give me some new pointers/ideas that actually work!
