Too often, developers like me treat analytics as a sideshow: a minor concern that can be bolted on later in the form of Google Analytics. However, while GA offers critical insights into your traffic levels, user behavior, and audience, it often makes it too difficult to measure the things that matter to your business.
Our solution to this problem was to roll our own simple analytics system. Essentially, whenever you take an action in the app, like reading a story, that action is recorded in Firebase. This way, we can keep track of things like:
- Pageviews on stories
- Time spent reading
- Words published
And many more metrics, with ease. How does this work, though?
The system I initially designed ended up being very close to what we’re using now. In the database, every user has their own “fragment” document, which holds that user’s activity. When you land on a story page, we record a pageview in your fragment. When you leave, we record the time you spent reading. The same goes for every other action we track.
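To make the fragment idea concrete, here is a minimal in-memory sketch. It is not our actual schema: the `Fragment` shape, the field names, and the helpers `emptyFragment`, `recordPageview`, and `recordReadingTime` are all hypothetical, and a real version would write to Firebase instead of mutating a plain object.

```typescript
// Hypothetical per-user fragment. The field names are assumptions for
// illustration, not the real document layout.
type StoryActivity = { pageviews: number; secondsRead: number };

interface Fragment {
  userId: string;
  stories: Record<string, StoryActivity>; // keyed by story ID
}

function emptyFragment(userId: string): Fragment {
  return { userId, stories: {} };
}

// Look up (or lazily create) this user's activity entry for one story.
function activityFor(fragment: Fragment, storyId: string): StoryActivity {
  let activity = fragment.stories[storyId];
  if (activity === undefined) {
    activity = { pageviews: 0, secondsRead: 0 };
    fragment.stories[storyId] = activity;
  }
  return activity;
}

// Called when the user lands on a story page.
function recordPageview(fragment: Fragment, storyId: string): void {
  activityFor(fragment, storyId).pageviews += 1;
}

// Called when the user leaves the story page.
function recordReadingTime(
  fragment: Fragment,
  storyId: string,
  seconds: number
): void {
  activityFor(fragment, storyId).secondsRead += seconds;
}
```

Every write in this sketch touches only the fragment that is passed in, which is the property the rest of the design leans on.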
This has the nice side effect of confining any bad actor’s activity to their own little document. But how are these documents gathered together into meaningful metrics? We use what we call an “aggregate” document, which gathers the activity under a particular story or a particular user and sums the fragments. Since every user is scoped to their own fragment, we can simply subtract a bad actor’s activity from the overall aggregate records, and voilà! No muss, no fuss.
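The sum-then-subtract step can be sketched like this, again in memory and under assumptions: the `Activity` and `Fragment` shapes and the function names `sumStory` and `withoutUser` are made up for this example, and the real aggregation over Firebase documents may well look different.

```typescript
// Hypothetical shapes, assumed for illustration only.
type Activity = { pageviews: number; secondsRead: number };

interface Fragment {
  userId: string;
  stories: Record<string, Activity>; // keyed by story ID
}

// Build the aggregate for one story by summing every user's fragment.
function sumStory(fragments: Fragment[], storyId: string): Activity {
  const total: Activity = { pageviews: 0, secondsRead: 0 };
  for (const fragment of fragments) {
    const activity = fragment.stories[storyId];
    if (activity === undefined) continue; // this user never opened the story
    total.pageviews += activity.pageviews;
    total.secondsRead += activity.secondsRead;
  }
  return total;
}

// Remove one user's contribution from an existing aggregate.
// Returns a new object and leaves the inputs untouched.
function withoutUser(
  aggregate: Activity,
  badActor: Fragment,
  storyId: string
): Activity {
  const activity = badActor.stories[storyId];
  if (activity === undefined) {
    return { pageviews: aggregate.pageviews, secondsRead: aggregate.secondsRead };
  }
  return {
    pageviews: aggregate.pageviews - activity.pageviews,
    secondsRead: aggregate.secondsRead - activity.secondsRead,
  };
}
```

Because each fragment holds exactly one user’s numbers, subtracting it gives the same totals you would get by re-summing everyone else from scratch.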
This raises many questions though, such as: How do we count the activity of anonymous users vs. authenticated users? How do we tell them apart? How can we detect bad actors ahead of time and be proactive rather than reactive?
I can’t go into all of the details here, as we’re trying to mitigate the bad actors, not give them the keys to the kingdom. I can say this, though: we’re taking every measure possible to prevent bad apples from ruining the bunch. One inspiration has been Discourse’s “trust level” system. If, for example, you verify your email, read for an hour, and publish at least 1,000 words, we can be reasonably sure you’re not a bot or a spammer creating thousands of accounts.
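As a rough illustration of that kind of check, here is a sketch that uses only the example thresholds mentioned above. It is not our real rule set: the `UserActivity` shape, the constant names, and `looksTrusted` are hypothetical, and the actual signals are deliberately left out.

```typescript
// Hypothetical summary of one user's activity; field names are assumptions.
interface UserActivity {
  emailVerified: boolean;
  secondsRead: number;
  wordsPublished: number;
}

// Example thresholds taken from the text: an hour of reading, 1,000 words.
const MIN_SECONDS_READ = 60 * 60;
const MIN_WORDS_PUBLISHED = 1000;

// True only when all three example conditions hold.
function looksTrusted(user: UserActivity): boolean {
  return (
    user.emailVerified &&
    user.secondsRead >= MIN_SECONDS_READ &&
    user.wordsPublished >= MIN_WORDS_PUBLISHED
  );
}
```

A brand-new account with a verified email but no reading or writing history would fail this check, which is the point: the cost of looking trustworthy goes up with every account a spammer creates.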
There are many more aspects of this system I’d love to discuss, but at this point, I’d like to open this up to questions! What do you want to know about this analytics system?