Django debates user tracking
Django debates user tracking
Posted Dec 1, 2016 9:19 UTC (Thu) by misc (subscriber, #73730)Parent article: Django debates user tracking
In fact, first, we wanted to do that over a tor hidden service (now called onion services), but this would trigger too much IDS alert, and likely be blocked in some countries.
We need a specific domain, and a dns server. Each thing we want to count would trigger a json or whatever encoded blob to be sent over dns (in the host part), encrypted with a public/private key scheme.
The tracking party never see the ip, because the request is relayed by the dns at the isp level (or google dns, or public dns around), who do hide it. The log of DNS requests is supposed can be public, hence encryption. Given that the limited value of the data over time, we do think that this should be safe enough. And because we did envision the system to be shared (long term) among projects who did wish to have user counting, indication of something making a request to that dns domain wouldn't reveal anything to anyone, since it could come from homebrew or django or whoever use that. The idea is that someone wshing to count would take all string, attempts to decrypt using their key and see if it bring something useful or not (ie, proper json if encoding json, etc), and discard all crap as "not for me".
And the central dns server would just purge data on a regular basis. This doesn't prevent someone from doing a copy of course, but do prevent opportunistic attack in case of key compromission.
But we didn't publish the notes of our discussion, we did discuss that with Remy Decausemaker right after the bus, but Ralph changed role inside the company, and Remy left RH, so this didn't went anywhere.
There is a few shortcoming on the proposal, such as "handling the infra for that", likely crypto and security "details" such as getting it right, not hitting dns limit. And the usual downside of counting downloads, as I did tried to explain also in the past in http://community.redhat.com/blog/2015/09/lies-damned-lies...
And all the things that 2 engineers in a post FOSDEM week at the back of a Czech bus for 3h couldn't think about, of course.
But I truly think that something can be done, and so far, this proposal would prevent user identification by tracking IP and various way inspired by tor. The infra could be operated by LF or any others orgs. And there is a bunch of DNS that do not track people ( https://diyisp.org/dokuwiki/doku.php?id=technical:dnsreso... , https://servers.opennicproject.org/ ), some handled by non profit in different juridictions to avoid issue of "governement are asking broad access".
Then, the main remaining issue is the uuid handling as central to have a proper count, and see the difference between CI job and long term usage, etc, etc.
I also suspect there is things to do with homomorphic encryption, but I am not knowledgeable enough to tell what :)
(and the proposal only focus on "getting info in a database", the whole "prepare a proper UI" is left as a exercise to the user)