More alternatives to Google Analytics
Last week, we introduced the privacy concerns with using Google Analytics (GA) and presented two lightweight open-source options: GoatCounter and Plausible. Those tools are useful for site owners who need relatively basic metrics. In this second article, we present several heavier-weight GA replacements for those who need more detailed analytics. We also look at some tools that produce analytics data based on web-server-access logs, GoAccess, in particular.
Matomo
One of the most popular heavyweight offerings is the open-core Matomo. It was formerly called "Piwik" and was created in 2007; LWN looked at Piwik way back in 2010. It's a full-featured alternative to Google Analytics, so companies that need the power of GA can transition to it, but still get the privacy and transparency benefits that come from using an open-source and self-hosted option. As an example, web agency Isotropic recently switched to Matomo:
We chose to do this as we wanted to respect our users privacy, and felt that hosting statistics on our own server was better for both us and them. [...] We needed something that rivaled the functionality of Google Analytics, or was even better than it. The solution needed to offer real-time analytics, geo-location, advertising campaign tracking, heat maps, and be open source.
Even though Matomo is the most popular open-source analytics tool and has been around the longest, it's still only used on 1.4% of the top one million web sites, roughly 2% of GA's market share — it's hard for even well-known open-source software to compete with the $600-billion gorilla.
Like GA, Matomo provides a summary dashboard with a few basic numbers and charts, as well as many detailed reports, including location maps, referral information, and so on. Additionally, Matomo has a feature called "content tracking" that allows automatically tracking users' interactions with the content (clicks and impressions) without writing code, unlike GA, which requires writing JavaScript or installing a third-party plugin. The self-hosted version of Matomo has all of these features, but site owners can also pay for and install various plugins such as funnel measurement, single-sign-on support, and even a rather invasive plugin that records full user sessions including mouse movements.
Matomo is written in PHP and uses MySQL as its data store; installation is straightforward: simply copy the files to a web server with PHP and MySQL installed. It's licensed under the GPLv3; it supports self-hosting for free (standalone or as a WordPress plugin), two relatively low-cost cloud options, and enterprise pricing. Matomo seems like a well-run project and has a fairly active community support forum; it also provides business-level support plans for companies using the self-hosted version.
Open Web Analytics
A similar but less popular tool is Open Web Analytics (OWA), which is also written in PHP and licensed under the GPLv2. OWA uses a donation-based development model rather than having monthly pricing options for a hosted service. Of all the open-source tools, OWA is the one that feels most like a clone of Google Analytics; even its dashboard looks similar to GA's — so it may be a good option for users who are familiar with GA's interface.
OWA is not as feature-rich as Matomo, but still has all the basics: an overview dashboard, web statistics, visitor locations on a map overlay, and referrer tracking. Like Matomo, it comes with a WordPress integration to analyze visitors on those types of sites. It also provides various ways to extend the built-in functionality, including an API, the ability to add new "modules" (plugins), and the ability to hook into various tracking events (e.g. a visitor's first page view or a user's mouse clicks).
OWA is maintained by a single developer, Peter Adams, and has had periods of significant inactivity. Recently, development seems to have picked up, with Adams shipping several new releases in early 2020. Some of the warnings on recent releases, such as those for the 1.6.9 release, may be a bit worrisome, however ("! IMPORTANT: The API endpoint has changed!"). Installation is again straightforward, and just requires copying the PHP files to a web server and having a MySQL database available.
Countly
Another open-core option, Countly, was founded in 2013; it is relatively feature-rich and has many dashboard types. Of the tools we are covering, though, it is the one that feels the most like a "web startup", complete with a polished video on its home page and sleek dashboards in its UI. Countly advertises that it is "committed to giving you control of your analytics data".
Countly has a clear distinction between its enterprise edition (relatively expensive, starting at $5000 annually) and its self-hosted community edition, with the latter limited to "basic Countly plugins" and "aggregated data". Countly's core source code is licensed under the GNU AGPL, with the server written using Node.js (JavaScript), and SDKs for Android and iOS written in Java and Objective-C.
Countly's basic plugins provide typical analytics metrics such as simple statistics and referrers for web and mobile devices, but also some more advanced features like scheduling email-based reports and recording JavaScript and mobile app crashes. However, its enterprise edition brings in a wide range of plugins (made either by Countly or by third-party developers) that provide advanced features such as HTTP performance monitoring, funnels with goals and completion rates, A/B testing, and so on. Overall, Countly's community edition is a reasonably rich offering for companies with mobile apps or that are selling products online, and it provides the option to upgrade to the enterprise version later if more is needed.
Snowplow
A more generalized event-analytics system is Snowplow Analytics, founded in 2012 and marketed as "the enterprise-grade event data collection platform". Snowplow provides the data collection part of the equation, but it is up to the installer to determine how to model and display the data. It is useful for larger companies that want control over how they model sessions or want to enrich the data with business-specific fields.
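To make "modeling sessions" concrete, here is a minimal sketch of one common approach: a new session starts whenever a user has been inactive for longer than some timeout. This is not Snowplow's code or data model; the 30-minute timeout, the `assign_sessions()` helper, and the `(user_id, timestamp)` event shape are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumed inactivity timeout; a site owner could pick any value.
SESSION_TIMEOUT = timedelta(minutes=30)


def assign_sessions(events, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp) events into per-user sessions.

    Returns a list of (user_id, session_number, timestamp) tuples, where
    session_number starts at 1 and increases whenever the gap since the
    user's previous event exceeds the timeout.
    """
    last_seen = {}   # user_id -> timestamp of that user's previous event
    session_no = {}  # user_id -> current session number
    result = []
    for user_id, ts in sorted(events, key=lambda e: e[1]):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous > timeout:
            session_no[user_id] = session_no.get(user_id, 0) + 1
        last_seen[user_id] = ts
        result.append((user_id, session_no[user_id], ts))
    return result


# Hypothetical events for two users.
events = [
    ("alice", datetime(2020, 6, 25, 9, 0)),
    ("alice", datetime(2020, 6, 25, 9, 10)),
    ("bob", datetime(2020, 6, 25, 9, 15)),
    ("alice", datetime(2020, 6, 25, 10, 30)),  # more than 30 minutes later
]
sessions = assign_sessions(events)
for row in sessions:
    print(row)
```

Owning the raw events means a company can change a rule like this timeout later and recompute its sessions, rather than accepting whatever definition a hosted tool applies.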
Setting up an installation of Snowplow is definitely not for the faint of heart; it requires configuring the various components, along with significant Amazon Web Services (AWS) setup; it may be possible, but not easy, to install it outside of AWS. However, there is a comprehensive AWS setup guide on the GitHub wiki (and the company does offer for-pay hosted options). Companies can set it up to insert events into PostgreSQL, AWS's columnar Redshift database, or leave the data in Amazon S3 for further processing. Typically a business-intelligence tool like Looker or ChartIO is used to view the data, but Snowplow does not prescribe that aspect.
Snowplow is a collection of tools written in a number of languages, notably Scala (via Spark) and Ruby. It is available under the Apache 2.0 license. Snowplow is used by almost 3% of the top 10,000 web sites, so it may be a reasonable option for larger companies that want full control over their data pipeline.
Analytics using web access logs
All of the systems described above use JavaScript-based tracking: the benefit of that approach is that it provides richer information (for example, screen resolution) and doesn't require access to web logs. However, if server-access logs are available, it may be preferable to feed those logs directly into analysis software. There are a number of open-source tools that do this: three tools that have all been around for over 20 years are AWStats, Analog, and Webalizer. AWStats is written in Perl and is the most full-featured and actively maintained of the bunch; Analog and Webalizer are both written in C, but neither is actively maintained.
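For readers who have not looked inside an access log, the following sketch shows what a log-based tool has to work with. It parses one line in the widely used Apache/nginx "combined" format; the regular expression, the `parse_line()` helper, and the sample line are illustrative assumptions and are not taken from any of the tools mentioned here. The sketch does not handle escaped quotes inside fields.

```python
import re
from datetime import datetime

# Apache/nginx "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Parse one combined-format line into a dict, or return None."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    # The request field looks like 'GET /path HTTP/1.1'; malformed
    # requests may have fewer parts.
    parts = entry["request"].split()
    if len(parts) >= 2:
        entry["method"], entry["path"] = parts[0], parts[1]
    else:
        entry["method"], entry["path"] = None, None
    return entry


# Hypothetical sample line.
sample = (
    '203.0.113.7 - - [25/Jun/2020:08:25:00 +0000] '
    '"GET /Articles/ HTTP/1.1" 200 5120 '
    '"https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'
)
entry = parse_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["agent"])
```

Note what is and isn't there: the referrer and user agent are available, but client-side details such as screen resolution are not, which is the tradeoff described above.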
A more recent contender is the MIT-licensed GoAccess, which was designed first as a terminal-based log analyzer, but also has a nice-looking HTML view. GoAccess is written in C with only an ncurses dependency, and supports all of the common access-log formats, including those from cloud services such as Amazon S3 and CloudFront.
GoAccess is definitely the most modern-looking and well-maintained access-log tool, and it generates all of the basic metrics: hit and visitor count by page URL, breakdowns by operating system and browser type, referring sites and URLs, and so on. It also has several metrics that aren't typically included in JavaScript-based tools, for example page-not-found URLs, HTTP status codes, and server response time.
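The metrics listed above are simple aggregations over log records, as the following sketch suggests. It is not GoAccess code; the sample records are made up, and counting a "visitor" as a distinct combination of IP address, date, and user agent is an assumption made here for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed log records:
# (client IP, date, user agent, path, HTTP status)
records = [
    ("203.0.113.7", "2020-06-25", "Firefox", "/", 200),
    ("203.0.113.7", "2020-06-25", "Firefox", "/about", 200),
    ("203.0.113.7", "2020-06-25", "Firefox", "/", 200),
    ("198.51.100.4", "2020-06-25", "Chrome", "/", 200),
    ("198.51.100.4", "2020-06-25", "Chrome", "/old-page", 404),
]

hits = Counter()             # requests per path
visitors = defaultdict(set)  # distinct (IP, date, agent) per path
statuses = Counter()         # requests per HTTP status code
not_found = Counter()        # paths that returned 404

for ip, date, agent, path, status in records:
    hits[path] += 1
    visitors[path].add((ip, date, agent))
    statuses[status] += 1
    if status == 404:
        not_found[path] += 1

for path, count in hits.most_common():
    print(f"{path}: {count} hits, {len(visitors[path])} visitors")
print("status codes:", dict(statuses))
print("not found:", dict(not_found))
```

The status-code and not-found counts come for free from the server's own records, which is why they show up in log analyzers but rarely in JavaScript-based tools: a page that fails to load never runs the tracking script.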
GoAccess's default mode outputs a static report, but it also has an option that updates the data in real time: it updates every 200 milliseconds in terminal mode, or every second in HTML mode (using its own little WebSocket server). GoAccess's design seems well thought-out, with options for incremental log parsing (using data structures stored to disk) and support for parsing large log files using fast parsing code and in-memory hash tables.
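The general idea behind incremental log parsing can be sketched in a few lines: remember how far into the log the previous run got, persist the running totals, and only read what has been appended since. The code below is a simplified illustration of that idea, not GoAccess's implementation; the `update_counts()` function and its JSON state file are hypothetical.

```python
import json
import os
import tempfile
from collections import Counter


def update_counts(log_path, state_path):
    """Count hits per path, reading only lines added since the last run.

    The byte offset and running counts are kept in a JSON file at
    state_path (a hypothetical format chosen for this sketch).
    """
    state = {"offset": 0, "hits": {}}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)

    hits = Counter(state["hits"])  # in-memory hash table of running totals
    offset = state["offset"]
    # If the log shrank (e.g. it was rotated), start again from the top.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        for raw in log:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; retry next run
            fields = raw.decode("utf-8", "replace").split('"')
            if len(fields) > 1:
                request = fields[1].split()  # e.g. ['GET', '/', 'HTTP/1.1']
                if len(request) >= 2:
                    hits[request[1]] += 1
            offset += len(raw)

    with open(state_path, "w") as f:
        json.dump({"offset": offset, "hits": dict(hits)}, f)
    return hits


# Demonstration with a temporary log file and made-up log lines.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "access.log")
    state_path = os.path.join(tmp, "state.json")
    line = '203.0.113.7 - - [25/Jun/2020:08:25:00 +0000] "GET / HTTP/1.1" 200 512\n'
    with open(log_path, "w") as f:
        f.write(line * 2)
    first = update_counts(log_path, state_path)
    with open(log_path, "a") as f:
        f.write(line)
    second = update_counts(log_path, state_path)
    print(first["/"], second["/"])
```

The second call reads only the one appended line and adds it to the totals saved by the first, so the cost of each run depends on how much the log has grown rather than on its total size.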
The tool is easy to install on most systems, with pre-built packages for all the major Linux package managers, and a Homebrew version for macOS users. It even works on Windows using Cygwin or through the Windows Subsystem for Linux on Windows 10.
Wrapping up
All in all, there are several good open-source options for those who need more powerful analytics or a system similar to GA. For those running e-commerce sites, or in need of features like funnel analysis, Matomo and Countly seem like good choices. Enterprises that need direct control over how their events are stored and modeled should perhaps consider a Snowplow installation. For those who have access to their web logs or just don't want to use JavaScript-based tracking, GoAccess seems like a good choice for web-log analysis in 2020.
Index entries for this article | |
---|---|
GuestArticles | Hoyt, Ben |
Posted Jun 29, 2020 16:17 UTC (Mon)
by vegge (guest, #6926)
https://github.com/c-amie/analog-ce
It has useful filtering capabilities and configurable HTML output.
Posted Jan 13, 2022 14:09 UTC (Thu)
by EddieM (guest, #156215)
Snowplow recently eased the setup of their open source implementation quite considerably - you can find out more at https://docs.snowplowanalytics.com/docs/open-source-quick...
In brief, Snowplow have built a set of Terraform modules that automate setting up and deploying the infrastructure and applications required for an operational Snowplow open-source pipeline, with just a handful of input variables required on your side.
There is also a trial version of the commercial offering at https://try.snowplowanalytics.com/
HTH,
Eddie