LWN: Comments on "Lightweight alternatives to Google Analytics" https://lwn.net/Articles/822568/ This is a special feed containing comments posted to the individual LWN article titled "Lightweight alternatives to Google Analytics". en-us Thu, 02 Oct 2025 17:48:32 +0000 Thu, 02 Oct 2025 17:48:32 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Lightweight alternatives to Google Analytics https://lwn.net/Articles/825935/ https://lwn.net/Articles/825935/ anarcat <div class="FormattedComment"> <font class="QuotedText">&gt; I record people for 7 days on my site, and shove the data into <a href="https://goaccess.io/">https://goaccess.io/</a> which sends me the result in a crappy email summary, without IP addresses. So I&#x27;m effectively keeping zero in the long term, although I *am* retaining *some* data in the short term.</font><br> <p> An interesting addition to this...<br> <p> It turns out that goaccess *does* record the IP addresses of visitors after processing, when the `VISITORS` panel is enabled. It shows the per-IP top visitors and therefore does keep PII, contrary to what I first believed. I tried to disable that panel (which I do not find very interesting anyway) but then it breaks visitor tracking in the rest of the reports, so that&#x27;s definitely a problem.<br> <p> I am, again, really interested in trying out GoatCounter, then. :)<br> </div> Sat, 11 Jul 2020 21:10:22 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/824775/ https://lwn.net/Articles/824775/ ihucos <div class="FormattedComment"> If I may do some shameless advertising for my own product: <a rel="nofollow" href="https://simple-web-analytics.com/">https://simple-web-analytics.com/</a><br> <p> For me, one interesting aspect for better privacy is how unique views are tracked. GoatCounter seems to do something like sessions &quot;the right way&quot;, or at least mitigates their effect on privacy. 
Plausible, on the other hand (and many others), uses the hashed user agent and IP as a session ID and stores that permanently. In my opinion that is even worse than cookies, which are more transparent, more easily controlled by users, and usually just some random ID that gets forgotten.<br> <p> Simple Analytics (not to be confused with &quot;my&quot; product - similar naming - they were first) does something quite interesting, which is to simply inspect the `document.referrer`. If it&#x27;s not the site being tracked, it must be a new visitor. &quot;My&quot; product uses the HTTP cache to ensure each user is only counted once a day, but additionally relies on `sessionStorage` for more accuracy.<br> <p> From my naive understanding of the GDPR, you cannot have any session IDs (so also no fingerprinting) if you want to avoid consent banners. Which providers let you skip GDPR consent banners is, in my opinion, also an interesting but nebulous and difficult topic.<br> <p> </div> Tue, 30 Jun 2020 12:08:44 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/824129/ https://lwn.net/Articles/824129/ anarcat <div class="FormattedComment"> <font class="QuotedText">&gt; IIRC for SSH the solutions to this are either separate SSH agents per identity or the IdentitiesOnly option.</font><br> <p> I suspect almost no one does this...<br> <p> <font class="QuotedText">&gt; I guess if web browsers wanted to they could easily mitigate this by pinning each cert to the domain it was created for and only ever sending it to that domain.</font><br> <p> Assuming they cared about client certs at all...<br> <p> <font class="QuotedText">&gt; Also, I wonder if the client cert is in the clear in the TLS handshake, or if Encrypted Client Hello (new name for ESNI) is needed to hide them.</font><br> <p> I would assume the worst. 
;)<br> </div> Mon, 22 Jun 2020 14:58:14 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/824045/ https://lwn.net/Articles/824045/ pabs <div class="FormattedComment"> IIRC for SSH the solutions to this are either separate SSH agents per identity or the IdentitiesOnly option.<br> <p> I guess if web browsers wanted to they could easily mitigate this by pinning each cert to the domain it was created for and only ever sending it to that domain.<br> <p> Also, I wonder if the client cert is in the clear in the TLS handshake, or if Encrypted Client Hello (new name for ESNI) is needed to hide them.<br> </div> Mon, 22 Jun 2020 03:29:17 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/824014/ https://lwn.net/Articles/824014/ anarcat <div class="FormattedComment"> I'm not super familiar with the details, but there's a similar problem with SSH, I believe. When you authenticate to a server with public key authentication, either the server or the client at some point needs to disclose which public keys are authorized or which ones it will try to authorize with. When we do server authentication (i.e. regular HTTPS) this doesn't matter: the site is public and it's not trying to hide its identity, it's trying to *prove* it to the world!<br> <p> But when you're a client, you have different tradeoffs. You don't want to send that certificate everywhere all the time, because it acts as a unique token that can be used to track you across websites. Firefox has rudimentary protection against this: when I go on a site that wants access to my TLS client cert, it first prompts whether I want to actually authenticate with my cert. But that UI is terrible: it pops open all the time, at random moments, and doesn't remember the "yes I trust this site" checkbox, which seems to do nothing.<br> <p> It's also not clear to me whether the server actually knows about my client cert at this point or whether the dialog is actually effective in not disclosing my identity. 
And that's just on Firefox, which has some support for TLS client certs. I suspect the situation could be catastrophically worse on other servers.<br> <p> I will also note that SSH does not have those protections *at all*. It will happily send *all* the public keys it knows about when trying to log in to a random server, which is kind of disturbing when you think about it:<br> <p> $ ssh -v lwn.net<br> [...]<br> debug1: Next authentication method: publickey<br> debug1: Offering public key: cardno:N RSA SHA256:XXXX agent<br> debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic<br> debug1: Offering public key: rsa w/o comment RSA SHA256:XXXX agent<br> debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic<br> [...]<br> <p> <p> No confirmation prompt whatsoever here. And they would be annoying too... I guess SSH expects you to divulge your public key identity when you connect to a server... but in the wild wild web, it seems like a delicate thing to do, so I wonder if a good usability trade-off is possible at all here.<br> </div> Sun, 21 Jun 2020 18:21:00 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823669/ https://lwn.net/Articles/823669/ Lennie <div class="FormattedComment"> I run a pretty large website with sign-ups, forum comments, etc., and I need a way to see how the website is used.<br> <p> We need to keep logs for the first things I mentioned to deal with abuse and spam.<br> <p> The second part is very important to know how to improve things and what does and doesn't work. Now I would prefer more tools which throw more away and just keep the statistics. 
But more importantly, as long as it's just us running the website who have the data, and not some company like Google or Facebook tracking you on multiple sites, that's a very large privacy difference.<br> </div> Fri, 19 Jun 2020 11:29:59 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823601/ https://lwn.net/Articles/823601/ pabs <div class="FormattedComment"> I'm interested in the privacy issues you mention with TLS client certs. It seems to me that they are basically the ideal auth mechanism in terms of privacy, since with the right browser implementation you could make the authentication choice on a per-request basis, allowing you to only authenticate POST requests or only authenticate URLs the user clicked on and not things the page loads.<br> </div> Fri, 19 Jun 2020 01:17:35 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823560/ https://lwn.net/Articles/823560/ ddevault <div class="FormattedComment"> Thank you for saying this, I agree entirely. We should not be encouraging people to quit Google Analytics for something else, we should be encouraging them to quit analytics entirely. It's not right to spy on your users. 9 times out of 10, analytics exist only to provide a dopamine fix to the web admin - ask anyone you know to tell you exactly how their changes are informed by analytics data, and you'll likely hear crickets.<br> </div> Thu, 18 Jun 2020 20:43:48 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823539/ https://lwn.net/Articles/823539/ Cyberax <div class="FormattedComment"> Trackers on individual sites don't suffer from blocked cookies, as users are typically logged in and have a unique session ID. 
Blocked trackers are more problematic, but even server-side tracking is usually better than none.<br> </div> Thu, 18 Jun 2020 17:50:44 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823525/ https://lwn.net/Articles/823525/ anarcat <div class="FormattedComment"> <font class="QuotedText">&gt; you can at least avoid intentionally storing PII yourself, or at least calculate the visit count within the request handler without storing any info on the corresponding individual request events.</font><br> <p> I agree! I don't find my solution to be particularly interesting, especially from a privacy perspective. It was just *simple*... :)<br> <p> That said, that very requirement is why I find projects like GoatCounter interesting: it makes a special effort to count events *without* storing PII! It has a pretty elegant design from that perspective. So I would definitely consider it as an alternative to goaccess...<br> <p> <font class="QuotedText">&gt; Visit count is not a direct indicator of value. I visit plenty of websites and blog posts that I don't find useful after the fact. There are various ways to artificially inflate visit count without providing value or providing negative value (such as clickbait and/or false headlines on social media).</font><br> <p> Granted. I would consider this an attack vector like any other, though. It doesn't mean there is *no* value in visitor count. Sure, you could count only "hits" and decide what's useful and what's not. Or you could just pretend all this stuff doesn't matter. But there are plenty of users who like to see those stats, and I believe in "harm reduction": providing safer tools by default rather than pretending that requirement does not exist...<br> <p> <font class="QuotedText">&gt; Password based logins should be replaced with cryptographic logins (WebAuthn, TLS client certs or Tor onion client auth, for example), which presumably solves the brute-force issue too.</font><br> <p> Ha! 
I would like to believe that too. But to break it apart: 1. WebAuthn is definitely useful right now, but it's generally used for 2FA, not for primary authentication. After all, you don't really want people to just log in with a "key" (something that you own) because once that is stolen you are totally screwed (you also need "something you know"). 2. TLS client certs would be great if clients implemented them in any meaningful way. But unfortunately, they are going more and more towards the trashbin. And they definitely have their own user-tracking concerns, at least in the current implementation, maybe even worse than regular cookies. 3. Tor is not ubiquitous (yet) so I wouldn't assume it's a good replacement for password authentication just yet.<br> <p> I hear you: passwords suck. But they're still a thing and hard to get rid of! And even if you got rid of them, I would still argue for rate-limiting of authentication attempts, even with public key authentication.<br> </div> Thu, 18 Jun 2020 16:23:34 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823521/ https://lwn.net/Articles/823521/ sarunas <div class="FormattedComment"> I would tend to disagree. 
Apart from ethical concerns about planting tracking cookies, given the widespread use of tracker blockers and now browsers themselves starting to block trackers, the data collected must be skewed to the point of being worthless or even misleading...<br> <p> </div> Thu, 18 Jun 2020 15:59:04 +0000 Matomo https://lwn.net/Articles/823390/ https://lwn.net/Articles/823390/ dw <div class="FormattedComment"> It would appear two decades around enterprise software have damaged my definition of lightweight ;) Stood next to the typical 342 KiB of script payload on a modern Google search home page, the 23 KiB gzipped Matomo tracker JS might at least still be considered lightweight by some reasonable standard.<br> </div> Thu, 18 Jun 2020 04:41:31 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823388/ https://lwn.net/Articles/823388/ pabs <div class="FormattedComment"> It is improbable that you can be on the internet without intermediaries, increasingly improbable that those intermediaries are dumb pipes (and thus trustworthy), and software leaking PII is indeed hard to control. But you can at least avoid intentionally storing PII yourself, or at least calculate the visit count within the request handler without storing any info on the corresponding individual request events.<br> <p> Visit count is not a direct indicator of value. I visit plenty of websites and blog posts that I don't find useful after the fact. 
There are various ways to artificially inflate visit count without providing value or providing negative value (such as clickbait and/or false headlines on social media).<br> <p> Password based logins should be replaced with cryptographic logins (WebAuthn, TLS client certs or Tor onion client auth, for example), which presumably solves the brute-force issue too.<br> </div> Thu, 18 Jun 2020 04:21:17 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823382/ https://lwn.net/Articles/823382/ anarcat <div class="FormattedComment"> I record people for 7 days on my site, and shove the data into <a href="https://goaccess.io/">https://goaccess.io/</a> which sends me the result in a crappy email summary, without IP addresses. So I'm effectively keeping zero in the long term, although I *am* retaining *some* data in the short term.<br> <p> I find that's a reasonable tradeoff: yes, it's better if you don't store any personally identifiable information at all. But once you start doing that, you realize it's actually incredibly difficult. Your uplink might keep track of those streams. Those IP addresses and other PII do land in your computer memory whether you like it or not, and that means it can end up on disk (thanks to swap). <br> <p> So I prefer to assume that there is *some* leakage, make *some* use of it, and limit it over time. Because it *does* have some use. Maybe it's just vanity, but I do like to get feedback on which of the articles I am writing are valuable to my readers, and while comments and direct feedback are a measure of that, there are way too few of those to provide a meaningful measure. Visits, on the other hand, are a direct metric I can use.<br> <p> And that's without starting on abuse control, for which IP address tracking is kind of invaluable. For example, if you have a site requiring a login and you are not rate limiting password-guessing attempts, you are doing it wrong. 
And I don't know how you would do that *other* than by logging *some* IP addresses...<br> </div> Thu, 18 Jun 2020 03:02:51 +0000 Matomo https://lwn.net/Articles/823381/ https://lwn.net/Articles/823381/ anarcat <div class="FormattedComment"> <font class="QuotedText">&gt; I find the absence of Matomo inexplicable, it's by far the most feature-complete (and feature-comparable) alternative to Google Analytics around </font><br> <p> I think the key part of the title you might have missed is "Lightweight". I wouldn't qualify Matomo as lightweight if only because it's primarily designed as a (relatively fat) JavaScript client (~200KB) that talks to a fairly large PHP web app which does a ton of stuff. The tools evaluated here seem much more lightweight. :)<br> </div> Thu, 18 Jun 2020 02:58:32 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823379/ https://lwn.net/Articles/823379/ Cyberax <div class="FormattedComment"> Visitor tracking helps to find problematic areas on a website and, for commercial websites, to understand who is actually using it. The only sites where it might not matter are simple content websites (like blogs).<br> <p> Having good alternatives to the ever-present GA is a good thing.<br> </div> Thu, 18 Jun 2020 02:55:09 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823377/ https://lwn.net/Articles/823377/ pabs <div class="FormattedComment"> As a visitor to websites, I think a better alternative is to not track your visitors at all. Don't log their visits anywhere, don't record anything about them.<br> </div> Thu, 18 Jun 2020 02:07:19 +0000 Privacy-preserving Google Analytics https://lwn.net/Articles/823354/ https://lwn.net/Articles/823354/ dw It's worth mentioning the possibility of removing some of the sting from Google Analytics using the <a href="https://developers.google.com/analytics/devguides/collection/protocol/v1">measurement protocol</a> and a local copy of analytics.js. 
You host a proxy script that forwards the hit on to GA, after making any desired privacy-preserving changes, such as lopping off some of the IP address (rather than relying on the equivalent Google setting). On the client, configuring analytics.js with a custom <a href="https://developers.google.com/analytics/devguides/collection/analyticsjs/tasks">sendHitTask</a> delivers data to the script. <p> For completeness, the client juju is simply: <pre>
ga('create', 'UA-XXXXXXX-1', 'auto');
ga(function(tracker) {
    // Replace the default transport: POST each hit to our own proxy
    // script instead of sending it straight to Google.
    tracker.set('sendHitTask', function(model) {
        var xhr = new XMLHttpRequest();
        xhr.open('POST', '/wrapper-script');
        xhr.send(model.get('hitPayload'));
    });
});
</pre> <p> This also creates an opportunity for logging the hit data, so you get the best of both worlds: the hassle-free nature of GA, with all the raw GA data preserved should you wish to migrate to another solution in future. <p> Finally, since you control the entirety of the data received by Google, it's even possible, if you're sufficiently paranoid, to anonymize the domain being tracked. Wed, 17 Jun 2020 21:32:00 +0000 Matomo https://lwn.net/Articles/823357/ https://lwn.net/Articles/823357/ jake <div class="FormattedComment"> <font class="QuotedText">&gt; I find the absence of Matomo inexplicable, </font><br> <p> Stay tuned :)<br> <p> jake<br> </div> Wed, 17 Jun 2020 21:20:54 +0000 Matomo https://lwn.net/Articles/823348/ https://lwn.net/Articles/823348/ dw I find the absence of <a href="https://matomo.org/">Matomo</a> inexplicable, it's by far the most feature-complete (and feature-comparable) alternative to Google Analytics around Wed, 17 Jun 2020 21:13:02 +0000 Lightweight alternatives to Google Analytics https://lwn.net/Articles/823328/ https://lwn.net/Articles/823328/ ibukanov <div class="FormattedComment"> Given how many sites WordPress runs on, it would be nice to have a review of its analytics plugins. 
Installation and configuration of those on a self-hosted WordPress is trivial.<br> </div> Wed, 17 Jun 2020 19:09:48 +0000