# Tracking internet activity

## The Doctor

I want to monitor all sites my box visits. To be clear I'm not trying to monitor another (legitimate) user. My reason is to track (and block) ad/tracking sites or, in the worst case, detect malware calling home.

My ideal format would be a list of human-readable URLs from which I can filter out trusted sites like forums.gentoo.org and eliminate duplicates. The list would be e-mailed daily by a cron job. That last part is fairly easy, so once I have the log it shouldn't be a problem. What I can't seem to find is a good Unix way to generate said log. Any tips and tricks? Did I miss something obvious? Google has not been helpful.

Thanks.

----------

## saturnalia0

 *The Doctor wrote:*   

>  My reason is to track (and block) ad/tracking sites

 

Besides the usual extensions such as uBlock Origin, uMatrix/NoScript, etc., you can use lists such as this: https://github.com/StevenBlack/hosts

I have found that particularly useful (using /etc/hosts for blocking ads) on my old CyanogenMod phone, since ads were blocked in apps as well.

As to your actual question, would it suffice to log DNS queries? If so, `tcpdump dst port 53` plus some grepping should suffice.
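The "some grepping" step could look roughly like the sketch below. It assumes tcpdump prints queries in its usual one-line form (for example `... > 10.0.0.1.53: 4242+ A? forums.gentoo.org. (35)`); the function name, the sample lines and the trusted-domain list are made up for illustration.

```python
import re

# Matches the question part of a tcpdump DNS line, e.g. "A? forums.gentoo.org."
# Only A and AAAA lookups are considered here (an assumption of this sketch).
TCPDUMP_QUERY_RE = re.compile(r"\b(?:A|AAAA)\? (\S+)")


def queried_names(lines, trusted=()):
    """Return sorted, de-duplicated hostnames, skipping trusted domains."""
    names = set()
    for line in lines:
        match = TCPDUMP_QUERY_RE.search(line)
        if not match:
            continue
        name = match.group(1).rstrip(".").lower()
        # A name is trusted if it equals a trusted domain or is a subdomain of it.
        if any(name == t or name.endswith("." + t) for t in trusted):
            continue
        names.add(name)
    return sorted(names)


# Hypothetical sample lines in tcpdump's text output format.
TCPDUMP_SAMPLE = [
    "12:00:01.000001 IP 10.0.0.2.40000 > 10.0.0.1.53: 4242+ A? forums.gentoo.org. (35)",
    "12:00:02.000001 IP 10.0.0.2.40001 > 10.0.0.1.53: 4243+ AAAA? ads.example.net. (33)",
    "12:00:03.000001 IP 10.0.0.2.40002 > 10.0.0.1.53: 4244+ A? ads.example.net. (33)",
]
```

Something like `tcpdump -l -n dst port 53` redirected to a file would produce the input; the script would then read that file line by line and pass the lines to `queried_names`.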

----------

## khayyam

The Doctor ...

one possible solution is to use a caching proxy like net-proxy/polipo; you would then set logLevel=4 (for LOGGING_MAX). The advantage might be that, as it's cached, you would be able to do some diagnostics should you need to (though you would probably want to rotate the logs on a regular basis).

@saturnalia0 ... I don't recommend /etc/hosts for ad blocking and the like; something like this is better managed by a DNS proxy such as net-dns/unbound (using an 'include' file generated via unbound-block-hosts).
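To illustrate the idea of an unbound 'include' file, here is a hand-rolled sketch that turns hosts-format blocklist lines into `local-zone` / `local-data` entries. It is not the output of unbound-block-hosts, whose exact format is not shown in this thread; the function name and the sink address are assumptions.

```python
def hosts_to_unbound(hosts_lines, sink="127.0.0.1"):
    """Convert hosts-format blocklist lines into unbound local-zone entries."""
    skip = {"localhost", "localhost.localdomain", "local", "0.0.0.0"}
    seen = set()
    out = []
    for raw in hosts_lines:
        # Drop comments, then split into "address name [name ...]".
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 2 or fields[0] not in ("0.0.0.0", "127.0.0.1"):
            continue
        for name in fields[1:]:
            name = name.lower()
            if name in skip or name in seen:
                continue
            seen.add(name)
            # "redirect" makes the zone and all its subdomains answer with local-data.
            out.append(f'local-zone: "{name}" redirect')
            out.append(f'local-data: "{name} A {sink}"')
    return out
```

The resulting lines would be written to a file and pulled into the `server:` section of unbound.conf with an `include:` directive.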

best ... khay

----------

## cboldt

I use dnsmasq to return 127.0.0.1 for ad servers, and get (monthly update) a list of adserver names from https://pgl.yoyo.org/adservers/serverlist.php?hostformat=dnsmasq

Edit to add: dnsmasq can also directly log inquiries made to it. The logging function is quite flexible.

Last edited by cboldt on Tue May 02, 2017 11:59 am; edited 1 time in total
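With `log-queries` enabled (optionally with `log-facility=` pointing at a file), dnsmasq writes lines of the form `query[A] forums.gentoo.org from 127.0.0.1`. A minimal sketch for counting the names in such a log follows; the function name and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches dnsmasq query lines such as "query[A] forums.gentoo.org from 127.0.0.1".
DNSMASQ_QUERY_RE = re.compile(
    r"query\[(?P<type>[A-Za-z0-9]+)\] (?P<name>\S+) from (?P<client>\S+)"
)


def count_queries(log_lines):
    """Count how often each hostname was asked for; other log lines are ignored."""
    counts = Counter()
    for line in log_lines:
        match = DNSMASQ_QUERY_RE.search(line)
        if match:
            counts[match.group("name").lower()] += 1
    return counts


# Hypothetical excerpt of a dnsmasq log.
DNSMASQ_SAMPLE = [
    "May  2 11:59:01 dnsmasq[1234]: query[A] forums.gentoo.org from 127.0.0.1",
    "May  2 11:59:01 dnsmasq[1234]: forwarded forums.gentoo.org to 192.0.2.53",
    "May  2 11:59:02 dnsmasq[1234]: query[AAAA] ads.example.net from 127.0.0.1",
    "May  2 11:59:03 dnsmasq[1234]: query[A] ads.example.net from 127.0.0.1",
]
```

`count_queries(...).most_common()` would then give a list sorted by frequency, which is a convenient starting point for a daily report.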

----------

## pietinger

 *The Doctor wrote:*   

> [...] My reason is to track (and block) ad/tracking sites or, in the worst case, detect malware calling home.

 

I have been using "privoxy" for years and it filters out all ads. You can set the log level just as you want.

----------

## Zucca

I don't monitor the traffic (but I should).

Instead I mainly use /etc/hosts and update it from here. While it's not intended for that use case, it's perhaps the simplest way.

 *pietinger wrote:*   

> I have been using "privoxy" for years and it filters out all ads. You can set the log level just as you want.

 Can it filter https served ads too?

----------

## szatox

Iptables offers a LOG target. You can use it to catch new connections and define the information you want to keep in the log (e.g. IP and port number).

Not a very clever solution, but it certainly does let you collect some stats.
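As a rough sketch of what collecting stats from the LOG target could look like: the kernel writes `KEY=value` fields such as `DST=`, `PROTO=` and `DPT=` into each logged line, so a small parser is enough to reduce the log to unique destinations. The rule in the comment, the `NEW_OUT: ` prefix and the sample lines are assumptions, not something taken from this thread.

```python
import re

# Assumed rule that produced the log (one possible way to log new outgoing connections):
#   iptables -A OUTPUT -m conntrack --ctstate NEW -j LOG --log-prefix "NEW_OUT: "

# Kernel LOG lines carry upper-case KEY=value pairs (the value may be empty, e.g. "IN=").
KV_RE = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_log_line(line):
    """Return the KEY=value fields of one LOG line as a dict."""
    return dict(KV_RE.findall(line))


def unique_destinations(lines):
    """Return sorted unique (destination IP, protocol, destination port) tuples."""
    seen = set()
    for line in lines:
        fields = parse_log_line(line)
        if "DST" in fields:
            seen.add((fields["DST"], fields.get("PROTO", ""), fields.get("DPT", "")))
    return sorted(seen)


# Hypothetical sample lines in the kernel's LOG format.
IPTABLES_SAMPLE = [
    "May  2 12:00:00 box kernel: NEW_OUT: IN= OUT=eth0 SRC=10.0.0.2 DST=192.0.2.10 "
    "LEN=60 TTL=64 ID=1 DF PROTO=TCP SPT=51234 DPT=443 WINDOW=29200 SYN URGP=0",
    "May  2 12:00:05 box kernel: NEW_OUT: IN= OUT=eth0 SRC=10.0.0.2 DST=192.0.2.10 "
    "LEN=60 TTL=64 ID=2 DF PROTO=TCP SPT=51240 DPT=443 WINDOW=29200 SYN URGP=0",
    "May  2 12:00:09 box kernel: NEW_OUT: IN= OUT=eth0 SRC=10.0.0.2 DST=198.51.100.7 "
    "LEN=70 TTL=64 ID=3 PROTO=UDP SPT=40000 DPT=53 LEN=50",
]
```

Because duplicates collapse into one tuple, even a noisy log shrinks to a short list of distinct endpoints.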

----------

## The Doctor

Thanks for the replies.

I neglected to mention that I've already got uBlock and some /etc/hosts blocking going. My problem is simply figuring out what I might want to block in my case.

net-proxy/polipo looks close to what I want, but it is dead upstream. This led me to squid and I spent most of the day playing with it. Maybe I'll make some progress with it tomorrow.

Actually the problem seems much harder than I thought it would be. Redirecting system-wide traffic to the proxy is straightforward, but it can't handle HTTPS traffic, which leaves a lot of activity unmonitored and, with my uber impressive computer skills, more than half the internet inaccessible.

I considered the iptables solution but I'm hesitant because it might get spammy and it only gets IPs. Perhaps there is an idea for a project here: write a utility that takes an iptables log and converts it into something readily usable. I could use `host` to convert the IPs back to names, then a little shell magic to make the log. I think I'll play with that tomorrow as well.
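The "convert the IPs back" step could be sketched as below, using Python's `socket.gethostbyaddr` in place of the `host` command. The function names are made up, and the resolver is passed in as a parameter so the lookup can be swapped out or tested offline. One caveat: reverse (PTR) records often name the hosting provider or CDN rather than the site that was actually visited, so the result is only a hint.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or the ip itself if the lookup fails."""
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        return resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return ip


def label_ips(ips, resolver=socket.gethostbyaddr):
    """Map each unique IP address to a hostname (or to itself when unresolved)."""
    return {ip: reverse_lookup(ip, resolver) for ip in sorted(set(ips))}
```

Feeding the destination addresses from an iptables log through `label_ips` would give a lookup table to print next to each logged connection.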

----------

## pietinger

 *Zucca wrote:*   

> Can it filter https served ads too?

 

yes.

----------

## Zucca

 *The Doctor wrote:*   

> net-proxy/polipo looks close to what I want, but it is dead upstream. This led me to squid and I spent most of the day playing with it. Maybe I'll make some progress with it tomorrow.
> 
> Actually the problem seems much harder than I thought it would be. Redirecting system-wide traffic to the proxy is straightforward, but it can't handle HTTPS traffic, which leaves a lot of activity unmonitored and, with my uber impressive computer skills, more than half the internet inaccessible.

I had squid as a transparent proxy. One of its jobs was to do "pre-emptive" ad-blocking where /etc/hosts was not suitable. I had a large list of regexp URL patterns handling the task. To make it work I had to learn a bit of iptables and port forwarding.

I never tried it, but IIRC Squid has HTTPS support. I'm not sure if it's official, but I read that somewhere.

As I just learned that privoxy also has https support (thanks pietinger), I'm keen to try it out. If it fails I'll "fall back" to Squid eventually.

I think it was about ten years ago that I last had a privoxy install. Back then it seemed complex... I'll give it a go... after I have sorted out a few other things.

----------

