# In Defence of Firefox: some Harvesting by Referal Decrypted

## miroR

---

First posted as "In Defence of Firefox: some Harvesting by Referal Decrypted", formatted for phpBB.

This is an analysis of (mostly) just one particular network conversation, of my machine with the Schmoog (i.e. Google), of the kind that I don't like: the harvesting of data. The PCAP of that conversation was extracted from a larger PCAP of network conversations (of my machine with mostly GitHub).

I am explaining what I have learned, and I'm learning as I analyze. Later I'll also try to put some much more advanced questions, so that any wizards who read here can give us advice.

Later though. It's a slow process for me. Not simple. It's hard work, for me, to get enough clarity to post those questions. Only a hint for now: they will be about extracting a certificate from the PCAP and finding it in /etc/ssl/certs/ on my Gentoo (but it applies generally on *nix, I believe). And maybe other questions.

I'll also try to make this analysis into an exercise for (newer than me) newbies to SSL decryption. So, advanced users, bear with me with a little patience at the occasional step of mine with suggestions for newbies, as I write this analysis/exercise.

PREPARATION: You need to have your Wireshark compiled with gnutls and gcrypt to run the commands that I'll offer further below.

Grep it with the line below, and I'll also show the output that it produces in my machine:

```

$ wireshark -v | grep -E "with GnuTLS|with Gcrypt"

without ADNS, with Lua 5.1, with GnuTLS 3.3.20, with Gcrypt 1.6.4, without

1.7.4, with libz 1.2.8, with GnuTLS 3.3.21, with Gcrypt 1.6.4.

$

```

(If you want to know why the two lines, for GnuTLS and for Gcrypt, respectively, run just the 'wireshark -v' and see, or I can tell you, the first is for 'Compiled ... with ...', the second is for "Running ... with ...")

If you don't have those, recompile Wireshark with the right flags (if on Gentoo; or get your devs to do the package right/make the package right yourself, if on some other distro). Then check again, and once it has the right "with ..." lines, proceed.
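If you'd rather have a yes/no answer than eyeball the grep output, the same check can be wrapped in a small function. This is only a sketch: it assumes the `wireshark -v` wording shown above ("with GnuTLS", "with Gcrypt"), and `has_decrypt_support` is a name made up for this example, not anything that ships with Wireshark.

```shell
# has_decrypt_support: hypothetical helper name, not part of Wireshark.
# Reads the text of `wireshark -v` on stdin and succeeds only if both
# libraries are reported as present ("without GnuTLS" does not match,
# because "with" is then followed by "out", not by a space).
has_decrypt_support() {
    awk '
        /with GnuTLS/ { gnutls = 1 }
        /with Gcrypt/ { gcrypt = 1 }
        END { exit !(gnutls && gcrypt) }
    '
}

# Usage, once Wireshark is installed:
#   wireshark -v | has_decrypt_support && echo "OK" || echo "recompile needed"
```

It only looks for the two strings anywhere in the output, so it does not tell the "Compiled with" line from the "Running with" line.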

Download:

http://www.croatiafidelis.hr/foss/cap/cap-160207-Github-Sch/dLo.sh

into a directory where you have full permissions, and do:

```

$ chmod 755 dLo.sh 

```

and run it:

```

$ ./dLo.sh 

```

Then:

```

$ cd cap-160207-Github-Sch/

```

and check the sums, and verify them if you know how.
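For newbies, checking the sums might look like the sketch below. It assumes the download carries a checksum list in the usual `sha256sum` format; I'm not naming that file here, so you pass its path as the argument, and `verify_sums` is a made-up helper name. If the list is also signed, verifying the signature is a separate step not shown.

```shell
# verify_sums: hypothetical helper.
#   $1  path to a checksum list in sha256sum format ("<hash>  <filename>")
# Runs the check from inside the list's own directory, because the
# file names in such a list are normally relative to it.
verify_sums() {
    ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1")" )
}

# Usage (replace the argument with the actual checksum file):
#   verify_sums ./cap-160207-Github-Sch/<checksum-list>
```

`sha256sum -c` prints one `OK` or `FAILED` line per file and returns non-zero if anything fails.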

Now you should be able to run the commands further below.

They're further below because I first have to explain how I extracted the publishable PCAP. I can't make available the entire screencast/dump pair, produced by uncenz and stowed in my archive; primarily, I can't just show you my entire decryptable network dump of that period of time of mine online.

The entire (decrypted) dump you could only see if I decided to go through the trouble of changing my GitHub password. Namely, for the dump not to have its record layer (the data sent to and fro, not the SYNs, ACKs, handshakes etc.) almost completely obscured (undecryptable binary mumbo-jumbo darkness), I'd need to post the keyfile with the corresponding keys for decrypting that dump, but then any geek downloading from the link above would also be able to get their hands on my GitHub password. So I'm posting neither the entire PCAP nor the entire set of keys. Instead I'm posting just the PCAP with the packets of the network conversation that I will dedicate most of this analysis to, along with the SSL keys to decrypt it.

But first I have to explain how I reached the point that I will analyze further below (the one containing the little bit of harvesting of data).

I chose a filter from:

http://www.askapache.com/software/sniff-http-to-debug-apache-htaccess-and-httpdconf.html

( a note: while it talks about checking on your own server, most of the filters can be used just fine like I used the one below, for other purposes )

and I ran it on the above mentioned with-my-password-and-so-unpublishable PCAP (entire capinfos in the textfile dump_160207_1329_g0n_github.pcap.capinfos in the download, here are just some lines):

```

File name:           dump_160207_1329_g0n_github.pcap

...

Number of packets:   6,643

...

Capture duration:    1257.407431969 seconds

First packet time:   2016-02-07 13:29:21.552417719

Last packet time:    2016-02-07 13:50:18.959849688

...

```

And so here's the filter that I chose to run on that PCAP (tshark is more or less Wireshark's command-line version; newbies, see 'man tshark'):

```

tshark -r dump_160207_1329_g0n_github.pcap -Y 'http.request.method == "POST" || http.request.method == "PUT"'

```

and the output (a few lines from it only here, the entire output is in the dump_160207_1329_g0n_github.pcap.POST textfile of 31 POST-lines, in the download):

```

405 42.053208348  192.168.1.2 -> 192.30.252.126 HTTP 778 POST /_private/browser/stats HTTP/1.1  (application/json)

1204 611.542155907  192.168.1.2 -> 192.30.252.125 HTTP 433 POST /_private/browser/stats HTTP/1.1  (application/json)

1291 627.303332501  192.168.1.2 -> 192.30.252.128 HTTP 696 POST /session HTTP/1.1  (application/x-www-form-urlencoded)

...

2886 793.834237827  192.168.1.2 -> 208.117.229.249 OCSP 497 Request

2895 793.913826633  192.168.1.2 -> 208.117.229.219 HTTP 839 POST /collect HTTP/1.1  (text/plain)

3554 796.866143872  192.168.1.2 -> 192.30.252.127 HTTP 454 POST /_private/browser/stats HTTP/1.1  (application/json)

3764 822.916616412  192.168.1.2 -> 192.30.252.131 HTTP 1143 POST /users/follow?target=blueness HTTP/1.1  (application/x-www-form-urlencoded)

...

4793 977.206063010  192.168.1.2 -> 54.69.216.138 HTTP 620 POST /v3/links/view HTTP/1.1  (text/plain)

5038 979.549264512  192.168.1.2 -> 192.30.252.125 HTTP 744 POST /_private/browser/stats HTTP/1.1  (application/json)

...

```

Let's get on somewhat clearer terms ('man awk', 'man sort' newbies):

```

$ cat dump_160207_1329_g0n_github.pcap.POST | awk '{ print $5 }' | sort -u

192.30.252.124

192.30.252.125

192.30.252.126

192.30.252.127

192.30.252.128

192.30.252.131

208.117.229.219

208.117.229.249

54.69.216.138

```

That's only 9 POST destination addresses, from 31 lines. (The source address is always my machine. To explain: my provider gives the machines attached to the router no direct access to the router's dynamic public address; any machine accessing the internet sees only its own local address on the router's local network, 192.168.1.0/24. Newbies, search for these terms on www.wikipedia.org.)

So that's only 9 POST destination addresses, but actually only 3 networks. I can show that by grepping for those addresses in the dump_160207_1329_g0n_github.pcap.hosts file, which I got with this command:

```

$ tshark -r dump_160207_1329_g0n_github.pcap -qz hosts > dump_160207_1329_g0n_github.pcap.hosts

```

(You can find it in the download.)

Showing it to you with the output (newbies, that's just a for loop along with the command substitution, try 'man bash'):

```

$ for i in $(cat dump_160207_1329_g0n_github.pcap.POST | awk '{ print $5 }' | sort -u); do grep $i dump_160207_1329_g0n_github.pcap.hosts; done;

192.30.252.124   api.github.com

192.30.252.125   api.github.com

192.30.252.126   api.github.com

192.30.252.127   api.github.com

192.30.252.128   github.com

192.30.252.131   github.com

208.117.229.219   www.google-analytics.com

208.117.229.249   clients.l.google.com

54.69.216.138   tiles.r53-2.services.mozilla.com

```
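The two steps above (unique destinations, then names) can also be done in one pass, with a count of POST lines per destination thrown in. This is a sketch under assumptions: the POST file has the destination address in field 5, as in the lines shown earlier, the hosts file has "address name" pairs with any comment lines starting with `#`, and `posts_by_host` is a made-up name. It compares whole addresses as plain strings, whereas an unquoted `grep $i` treats the dots as regex wildcards and also matches substrings.

```shell
# posts_by_host: hypothetical helper name.
#   $1  hosts file, "IP<whitespace>name" per line (as saved from `tshark -qz hosts`)
#   $2  saved tshark POST lines, destination address in field 5
# Prints "count ip name", busiest destination first.
posts_by_host() {
    awk '
        # First file: remember the name for each address, skip comments.
        NR == FNR { if ($1 !~ /^#/ && NF >= 2) name[$1] = $2; next }
        # Second file: count lines per destination address.
        { count[$5]++ }
        END {
            for (ip in count)
                printf "%d %s %s\n", count[ip], ip, ((ip in name) ? name[ip] : "unknown")
        }
    ' "$1" "$2" | sort -k1,1nr -k2,2
}

# Usage:
#   posts_by_host dump_160207_1329_g0n_github.pcap.hosts dump_160207_1329_g0n_github.pcap.POST
```

One caveat of the `NR == FNR` idiom: it misbehaves if the hosts file is empty, so check that first.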

Anything 192.30.252.xxx is the GitHub network (and I have to say they're marvelous, so I can only hope they don't transgress against privacy further than the venial level seen here). The bottom line is from the Mozilla networks (I have started to trust Mozilla recently, so I just didn't investigate those packets much). And there you can see two lines from Schmoog the Octopus' own network. The Schmoog's blah-blah.249 is just an OCSP Request (Online Certificate Status Protocol; nothing much in there, all regular). But the most interesting little bit of POST traffic (the bit of harvesting of data) is, as you'll see soon below, in the conversation with the Schmoog's blah-blah.219 address, www.google-analytics.com. That is the one among the Schmoog's addresses that disconnect.me (

https://disconnect.me/

) bans from users' machines, if they know how to employ it or are helped to. And with Firefox, its private browsing, its tracking protection, its selective cookie policies and stuff, I think I've somewhat learned, and have been helped. Thank you, old guard Mozilla team!

So how did this conversation not get disconnected? Because the referer was github.com and so it was, from the client's (the browser's, the Moz FF's) point of view, legitimate. As I said in "Should firefox be removed from portage?", it's just a venial sin.

I'm not mad at Github.com for this (although I really don't like it).

Now, setting this filter:

```

ip.addr==208.117.229.219

```

in Wireshark with my non-publishable PCAP, then clicking File > Export Specified Packets and saving, got me the PCAP file dump_160207_1329_g0n_github_Schmoog.pcap (in the download).

I got bit-for-bit the same output as with Wireshark by running tshark:

```

tshark -r dump_160207_1329_g0n_github.pcap -Y "ip.addr==208.117.229.219" -w dump_160207_1329_g0n_github_Schmoog-Yip.pcap

```

(the infix '-Yip' is from the filter, nothing other than mnemonic by-infix-naming of new files in the work)

With tshark, this:

tshark -r dump_160207_1329_g0n_github.pcap -Y "tcp.stream==99" -w dump_160207_1329_g0n_github_Schmoog-s099.pcap

also gets the exact same PCAP out (dump_160207_1329_g0n_github_Schmoog-s099.pcap == dump_160207_1329_g0n_github_Schmoog-Yip.pcap == dump_160207_1329_g0n_github_Schmoog.pcap), so I just kept dump_160207_1329_g0n_github_Schmoog.pcap and deleted the other two.
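A claim like "the exact same PCAP" is easy to check for yourself before deleting anything. A minimal sketch with `cmp`, where `all_identical` is a name made up for this example:

```shell
# all_identical: hypothetical helper; succeeds only if every file given
# is byte-for-byte identical to the first one.
all_identical() {
    first=$1
    shift
    for f in "$@"; do
        cmp -s "$first" "$f" || return 1
    done
    return 0
}

# Usage:
#   all_identical dump_160207_1329_g0n_github_Schmoog.pcap \
#                 dump_160207_1329_g0n_github_Schmoog-Yip.pcap \
#                 dump_160207_1329_g0n_github_Schmoog-s099.pcap \
#       && echo "all the same, safe to keep just one"
```

Comparing `sha256sum` output does the same job and leaves you with something you can publish.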

You can now run:

```

$ capinfos dump_160207_1329_g0n_github_Schmoog.pcap

```

and see for yourself the publishable little PCAP.

These are a few lines from my dump_160207_1329_g0n_github_Schmoog.pcap.capinfos (not in the download, because it's trivial to get the exact same output). The indented, commented lines are ones that I interpolated into this text, for comparison:

```

File name:           dump_160207_1329_g0n_github_Schmoog.pcap

...

Number of packets:   71

   # the file it was extracted from was: Number of packets:   6,643

...

...

Capture duration:    186.025799643 seconds

First packet time:   2016-02-07 13:42:35.257984041

   # the file it was extracted from was: 2016-02-07 13:29:21.552417719

Last packet time:    2016-02-07 13:45:41.283783684

   # the file it was extracted from was: 2016-02-07 13:50:18.959849688

```

So the conversation to Schmoog was some three minutes, in the second half of some 20 minutes of the unpublishable PCAP.
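The "three minutes, in the second half" can be worked out from the capinfos times quoted above. A small sketch, dropping the fractions of seconds; `to_seconds` is a made-up helper name:

```shell
# to_seconds: hypothetical helper; converts HH:MM:SS to seconds since midnight.
# awk reads the fields as decimal numbers, so "08" or "09" cause no trouble.
to_seconds() {
    echo "$1" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + int($3) }'
}

big_first=$(to_seconds 13:29:21)    # first packet, unpublishable PCAP
big_last=$(to_seconds 13:50:18)     # last packet, unpublishable PCAP
small_first=$(to_seconds 13:42:35)  # first packet, Schmoog PCAP
small_last=$(to_seconds 13:45:41)   # last packet, Schmoog PCAP

echo "Schmoog conversation starts $((small_first - big_first)) s into the capture"
echo "and lasts $((small_last - small_first)) s out of $((big_last - big_first)) s"
```

That gives a start at 794 s, a length of 186 s and a total of 1257 s, which agrees with the capinfos durations and with the POST /collect frame 2895 sitting at about 793.9 s in the tshark listing.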

And finally, the command you can run, and see if I'm credible (or if you're apt to deal with this somewhat advanced matter).

Run:

```

$ wireshark -o "ssl.keylog_file: dump_160207_1329_SSLKEYLOGFILE.txt" dump_160207_1329_g0n_github_Schmoog.pcap

```

or, leaner, but not colorful and much less helpful for beginners' understanding (the Wireshark with its panes, and its analytics and statistics and all its functionalities can never be replaced by tshark only; the tshark however, is needed for tedious repetitive tasks such as extracting [lots of] streams), issue a command like:

```

$ tshark -V -o "ssl.keylog_file: dump_160207_1329_SSLKEYLOGFILE.txt" -r dump_160207_1329_g0n_github_Schmoog.pcap  | less

```

( tshark can more easily be used to present in text a particular info, though )

and you'll see the entire network conversation of my machine with the Schmoog, referred to from GitHub (the Referer will be shown further below; I'm just repeating it so you don't blame Firefox for it).

A note now on the SSL key file.

How did I get that CLIENT_RANDOM line (look up dump_160207_1329_SSLKEYLOGFILE.txt, if you haven't)? If you have debugging set (if you don't have it set yet, see https://wiki.wireshark.org/SSL), like I have set it to ~/.ssldbg.log on my machine, you can get a lot of explanation of how the SSL decryption is done, which is kind of reverse engineering of how the encryption is done.

I have unclear points and questions regarding debugging and decryption, but I won't indulge in asking just yet (I need more clarity myself first, so more studying, which is a slow process here), as I already mentioned at the top...

Still I can tell I figured out, after perusing the debugging log, from this line in that debugging log:

```

  checking keylog line: CLIENT_RANDOM 65c94458386bf7a7e8485465fd80c28bb7b118279a06a4d7fa7c4a03ae827db5 60f5be47745bb533a0978a2100baeb475e37b4964fd573449f040bd8b69b0473cef46a8b1ecd5e98bb577c1315299281

```

that I only need to give you the key strings from that line and you'll be able to decrypt the little data-harvesting PCAP (from the download). Over 200 such lines were written by Mozilla's (and others') NSS (Network Security Services) library during that day's, or rather afternoon's, online session of mine, when I worked on GitHub.com and maybe did some other surfing or work, but not very much, most of it to do with the unpublishable my-password-containing PCAP. Only this exact line is needed for you to decrypt this little PCAP.
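Once you know the client random of the session, picking its one line out of a key log of 200-odd lines is a small filtering job. The sketch below is not necessarily how I did it; it assumes the NSS key log layout visible in the debug line above (the word CLIENT_RANDOM, then 64 hex digits of client random, then 96 hex digits of master secret), and `pick_keylog_line` is a made-up name.

```shell
# pick_keylog_line: hypothetical helper name.
#   $1  the 64-hex-digit client random of the session you want
#   $2  an NSS key log file (the kind SSLKEYLOGFILE points to)
# Prints only the matching, well-formed CLIENT_RANDOM line(s).
pick_keylog_line() {
    awk -v want="$1" '
        $1 == "CLIENT_RANDOM" &&
        length($2) == 64 && length($3) == 96 &&
        tolower($2) == tolower(want)
    ' "$2"
}

# Usage (the first argument is the client random from the debug log):
#   pick_keylog_line 65c94458386bf7a7e8485465fd80c28bb7b118279a06a4d7fa7c4a03ae827db5 \
#       full-keylog.txt > dump_160207_1329_SSLKEYLOGFILE.txt
```

`full-keylog.txt` stands for your own complete key log; publishing only the selected line leaves every other session in the big capture undecryptable for readers.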

Some of this I have to ask about, either the Mozilla devs or on the Wireshark ML. Of course, only after I've studied enough to make clear queries. A hint: it'll be about why exactly there are so many frames in this PCAP in light reddish foreground on black background, which appears to me to mean failed connections. Were some conversations actually cut by my Firefox, just as I wanted them not to happen, by my settings in the Preferences?

The crux of this article is coming for your eyes soon, next. The long expected harvesting of data lines.

We need to extract ('follow' is the term used for it in Wireshark) the SSL stream. There is only one in this little PCAP: it was stream number 99 in the unpublishable my-password-containing mother-file, and here it is number 0.

In Wireshark, in the packets pane (the top one), right-click on any packet that displays "TLSv1.2" in the Protocol column, and in the menu that pops up select Follow > SSL Stream. In the dialog that opens, first select "Raw" from the Save data as dropdown menu, then click on Save as, and save it as:

```

dump_160207_1329_g0n_github_Schmoog_s000-ssl-W.bin

```

And you can also repeat the process, doing everything exactly the same, except for selecting Save data as "ASCII" and saving it as:

```

dump_160207_1329_g0n_github_Schmoog_s000-ssl-W.txt

```

instead.

With tshark it's more complicated, but I made a script for exactly these record layer tshark extractions. Get tshark-streams.sh from:

https://github.com/miroR/tshark-streams

The script is incomplete, pls. run it first without arguments and read carefully the warnings:

```

$ tshark-streams.sh

```

and if you trust there is no subterfuge, and that nothing should really break as a consequence (in which case you accept it to be entirely on you, not on me), run:

```

$ tshark-streams.sh -r dump_160207_1329_g0n_github_Schmoog.pcap -k dump_160207_1329_SSLKEYLOGFILE.txt

```

That really should get you four files.

Two are plain TCP, one text (dump_160207_1329_g0n_github_Schmoog_s000.txt), the other binary (dump_160207_1329_g0n_github_Schmoog_s000.bin), as if you didn't have the SSL keys file and couldn't possibly decrypt the record layer. Those you can delete.

And two are SSL-decrypted files, one text, the other binary. Aaah! It's those that we need! Zucker kommt zuletzt (the sugar comes last)! These:

```

$ ls -l dump_160207_1329_g0n_github_Schmoog_s000-ssl*

-rw-r--r-- 1 miro miro 4735 2016-02-12 00:03 dump_160207_1329_g0n_github_Schmoog_s000-ssl.bin

-rw-r--r-- 1 miro miro 4995 2016-02-12 00:03 dump_160207_1329_g0n_github_Schmoog_s000-ssl.txt

```

They're in the download (same download from the top of the page).

The binaries from either Wireshark or Tshark are the same:

```

$ ls -l dump_160207_1329_g0n_github_Schmoog_s000-ssl*bin

-rw-r--r-- 1 miro miro 4735 2016-02-12 00:03 dump_160207_1329_g0n_github_Schmoog_s000-ssl.bin

-rw-r--r-- 1 miro miro 4735 2016-02-11 23:52 dump_160207_1329_g0n_github_Schmoog_s000-ssl-W.bin

$ sha256sum dump_160207_1329_g0n_github_Schmoog_s000-ssl*bin

cfd4ce4ec303e3c746d80e2a45541b4c2daed14cf0b844c618c53331c0528fbf  dump_160207_1329_g0n_github_Schmoog_s000-ssl.bin

cfd4ce4ec303e3c746d80e2a45541b4c2daed14cf0b844c618c53331c0528fbf  dump_160207_1329_g0n_github_Schmoog_s000-ssl-W.bin

$
```

But the texts are not:

```

$ ls -l dump_160207_1329_g0n_github_Schmoog_s000-ssl*txt

-rw-r--r-- 1 miro miro 4995 2016-02-12 00:03 dump_160207_1329_g0n_github_Schmoog_s000-ssl.txt

-rw-r--r-- 1 miro miro 4735 2016-02-12 00:14 dump_160207_1329_g0n_github_Schmoog_s000-ssl-W.txt

$ diff -b -B dump_160207_1329_g0n_github_Schmoog_s000-ssl*txt

1,6d0

< ===================================================================

< Follow: ssl,ascii

< Filter: tcp.stream eq 0

< Node 0: 192.168.1.2:39106

< Node 1: 208.117.229.219:443

< 742

18,20c12,13

< v=1&_v=j30&a=425936500&t=event&ni=0&_s=2&dl=https%3A%2F%2Fgithub.com%2F%3Cuser-name%3E&ul=en-us&de=UTF-8&dt=miroR%20(Miroslav%20Rovis)&sd=24-bit&sr=1024x768&vp=1006x580&je=0&ec=Header&ea=go%20to%20notifications&el=icon%3Aunread&_u=eACAAAQFM~&jid=&cid=181680827.1454848883&tid=UA-3769691-2&cd1=Logged%20In&z=650049128

<    430

< HTTP/1.1 200 OK

---

> 

> v=1&_v=j30&a=425936500&t=event&ni=0&_s=2&dl=https%3A%2F%2Fgithub.com%2F%3Cuser-name%3E&ul=en-us&de=UTF-8&dt=miroR%20(Miroslav%20Rovis)&sd=24-bit&sr=1024x768&vp=1006x580&je=0&ec=Header&ea=go%20to%20notifications&el=icon%3Aunread&_u=eACAAAQFM~&jid=&cid=181680827.1454848883&tid=UA-3769691-2&cd1=Logged%20In&z=650049128HTTP/1.1 200 OK

32,34c25,26

< GIF89a.............,...........D..;

< 728

< POST /collect HTTP/1.1

---

> 

> GIF89a.............,...........D..;POST /collect HTTP/1.1

45,47c37,38

... [ 14 lines, and some are long like some of the above lines, snipped here ]...

114d101

< ===================================================================

$

```

Anyway, this is the piece of Schmoog the evil Octopus' data harvesting. Let me guess what the percent-encoded characters (the %3A, %2F, %20 and so on) translate to, and make some sense of that one line, which appears above in two slightly differently formatted versions (in Wireshark's, some newlines are removed).

Let me extract from that line what I probably, or certainly, understand:

```

https://github.com/ user-name miroR (Miroslav Rovis)

sd=24-bit

sr=1024x768

vp=1006x580

...

Logged In

```
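For anyone who would rather not guess at the encoding: the payload is a plain `key=value&key=value` string with percent-encoding, so it can be split and decoded mechanically. A sketch, where `ga_decode` is a made-up name and the decoder handles printable ASCII escapes only (anything else is left as it is):

```shell
# ga_decode: hypothetical helper name.
# Reads a URL-encoded "k=v&k=v" payload on stdin, prints one decoded
# "k=v" per line. Splitting on "&" happens before decoding, so an
# encoded "&" (%26) inside a value cannot break a pair apart.
ga_decode() {
    tr '&' '\n' | awk '
        BEGIN {
            # Lookup table for printable ASCII only: "20" -> " ", "2F" -> "/", ...
            for (i = 32; i < 127; i++) hex[sprintf("%02X", i)] = sprintf("%c", i)
        }
        {
            s = $0; out = ""
            while (match(s, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
                code = toupper(substr(s, RSTART + 1, 2))
                out = out substr(s, 1, RSTART - 1) ((code in hex) ? hex[code] : substr(s, RSTART, 3))
                s = substr(s, RSTART + 3)
            }
            print out s
        }
    '
}

# Usage, on a shortened piece of the payload shown above:
printf '%s\n' 'dl=https%3A%2F%2Fgithub.com%2F%3Cuser-name%3E&sd=24-bit&sr=1024x768&cd1=Logged%20In' | ga_decode
```

On that shortened payload it prints four lines: `dl=https://github.com/<user-name>`, `sd=24-bit`, `sr=1024x768` and `cd1=Logged In`.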

The last line tells the Schmoog, the World's Super-Spy, that I just managed to log in to GitHub as user miroR, and that my name is Miroslav Rovis. True, but what has the Schmoog got to do with that? It's none of your business, you Octopus! Aaah...

The sd, probably screen depth, of 24-bit tells what it is: I'm going online with this machine watching it on the screen of a really old monitor (you know those old cumbersome ones, heavy because made with lots of lead, that, if you're from a wealthier country, maybe only some of your really poor friends still use), and the 24-bit depth is its maximum, as is the sr (screen resolution) of 1024x768. Since I've been slowly losing my sight in recent years, I can only splash my Firefox across most of the screen, hiding almost everything else (and I use Alt-Tab'ing to recall other windows, of terminals and programs, to the top).

Peruse those yourself. They are genuine indeed!

Not much more is left for this article to be complete, for even the newbies to understand it.

One thing is the Referer string. Where's the Referer that I talked about?

Oh it's there, it's there:

```

$ grep Referer dump_160207_1329_g0n_github_Schmoog_s000-ssl.txt

Referer: https://github.com/miroR

Referer: https://github.com/miroR

Referer: https://github.com/miroR

Referer: https://github.com/miroR/tshark-streams

$
```

But I'm not posting this to blame GitHub. It's a venial sin, like I said, and I only hope they won't go further and deeper into harvesting of data, or even betraying users, as is well known to have happened with many big brands.

No. I'm posting this so Gentooers and possibly other visiting *nixers can see that, as it appears to me, Mozilla Firefox now cares for privacy.

Whether that will last is another matter. Monica Chew left Mozilla because, in her time, Firefox was commercializing on ads and things, and data harvesting, to an extent that she didn't find acceptable.

But this piece of harvesting is not on Zilla FF, it's on Github, not on FF. Fox had to allow it because from the point of view of the client program, which Firefox browser is, it was legitimate, because the Referer asked for it. Pls. do correct me if I'm wrong. And do so in any of my findings/understandings that I wrote in this article.

And, peruse it for yourself in any text editor, or plain less (or even cat):

```

$ less dump_160207_1329_g0n_github_Schmoog_s000-ssl.txt

```

The Referer is in the header preceding every one of those four lines like the one above that I somewhat analyzed and explained.

Be convinced now.

And the other thing left to do, as far as the newbies go, is the:

Screen_160207_1329_g0n_github_FRACT.mkv

(in the download)

It's only two minutes, starting just before my logging into the Github.

Play it with some really good *nix video player like my friend MPlayer. And here are some comments for your viewing of it, with the minutes:seconds (M:SS) within those two minutes.

Attempt at Login. Failed because, GitHub literally tells me:

```

0:05   Cookies must be enabled to use GitHub.

```

And that would be, in my case, so, for any website except those I allow in. I think Firefox does what I tell her in the:

Firefox Preferences (that I open now).

The Security Tab

```

0:21   I don't use:

      _  Warn me when sites try to install add-ons

      _  Block reported attack sites

      _  Block reported web forgeries

```

Because I browse so little! So little, as yet, that I don't need it! Because I have decided I am going to control my machine when I'm online, and nobody is going to own me like they did, like they really did, well, not completely, but enough to be a real nuisance:

System attacked, Konqueror went on window-popping spree!

https://forums.gentoo.org/viewtopic-t-905472.html

, and continued ever again at various times later with ever less success:

Postfix smtp/TLS, Bkp/Cloning Mthd, Censorship/Intrusion

https://forums.gentoo.org/viewtopic-t-999436.html

and I am not an expert to be able to do so browsing like mad! No way!

```

       _  Remember logins for sites

       _  Use a master password

```

Of course FF needs to have these. But these are idiot settings. Passwords are kept either really properly encrypted and in a safe place, or not in digital form but on paper in a secret drawer, or solely in your mind. But you are free to disagree.

But of course FF needs to have these. For the less understanding masses.

The Privacy Tab (picking just the most important entries, and using =!= for selected)

```

0:23   =!=   Use Tracking Protection in Private Windows

       =!=   Always use private browsing mode

       _  Accept cookies from sites                                     Exceptions

```

So, Tracking Protection in Private Windows is enabled. And I always use private browsing mode.

And I don't accept cookies from sites. And if FF is set to not accept cookies from sites, it does appear to me that it will honor that setting.

For example: I wasn't able to log into GitHub, because it wasn't enabled (I hadn't been on GitHub for a few months, and had probably messed up the settings a little; I had previously allowed GitHub already).

So if FF is set to not accept cookies from sites, it does appear to me that it will honor that setting. (And I'll report to the Mozilla-dev ML if I find any cookies from sites that I didn't allow.)

But I have a list of exceptions, else no banking, no other stuff...

So here I click on Exceptions.

```

0:28   < the Exceptions-Cookies window and I add github.com and save it >

```

[[ You can skip some 50 seconds here if you're in a rush. I didn't know I'd be publishing this, so I was managing this new entry really slowly. ]]

```

1:35   Closing Preferences, refreshing the Github Login Firefox Tab.

```

and

```

1:50   I'm logged in. (To Flowstamp. Flowstamp is, incidentally and not-related to this topic otherwise, my BSD-licensed FOSS (primitive) program that can get a fine transparent flowing stamp over your videos, and more.)

```

This explanation of how to use cookies is in response to a proposal/suggestion to remove Firefox from portage because of an uninformed notion of how Firefox deals with cookies:

Should firefox be removed from portage?

There are advanced issues related to this article that I would like to post about, based on the hereby already published PCAP. But later.

And later can mean even much later with me. Not sloppiness or other fault, but (lack) of aptness and (various kinds of) ability/ableness will be the factors for the soonness or belatedness of that event.

Regards to all the gentle readers!

----------

