Hacker Noon 2.0: Our Custom Analytics System

Too often, developers like myself treat analytics as a sideshow, a minor concern, something that can be bolted on later in the form of Google Analytics. However, while GA offers critical insights into your traffic levels, user behavior, and audience, it often makes it too difficult to measure the things that matter to your business.

Our solution to this problem was to roll our own simple analytics system. Essentially, whenever you take an action in the app, like reading a story, that will be recorded in Firebase. This way, we can keep track of things like:

  • Pageviews on stories
  • Time spent reading
  • Words published

And many more metrics, with ease. How does this work, though?

The system I initially designed ended up being very close to what we’re using now. In the database, every user has their own “fragment” document, which records that user’s activity. When you land on a story page, we record a pageview. When you leave, we record time spent reading. And so on.
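To make the fragment idea concrete, here is a minimal in-memory sketch. It is not our actual Firebase code: the nested-dictionary layout, the field names (`pageviews`, `seconds_read`), and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# In-memory stand-in for the database (hypothetical layout):
# user id -> story id -> counters for that user's activity on that story.
fragments = defaultdict(
    lambda: defaultdict(lambda: {"pageviews": 0, "seconds_read": 0.0})
)


def record_pageview(user_id, story_id):
    """Record one pageview when a user lands on a story page."""
    fragments[user_id][story_id]["pageviews"] += 1


def record_time_spent(user_id, story_id, seconds):
    """Record reading time when a user leaves a story page."""
    fragments[user_id][story_id]["seconds_read"] += seconds


# Example: one user opens the same story twice and reads for 90 seconds total.
record_pageview("alice", "story-1")
record_time_spent("alice", "story-1", 60.0)
record_pageview("alice", "story-1")
record_time_spent("alice", "story-1", 30.0)
print(dict(fragments["alice"]["story-1"]))
```

Because each write only touches the acting user's own entry, nothing one user does can overwrite another user's counts.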

This has the nice side effect of confining any bad actors to their own little document. But how are these documents gathered together into meaningful metrics? We use what we call an “aggregate” document, which gathers together activity under a particular story, or a particular user, and sums the fragments. Since every user is scoped to their own fragment, we can simply subtract the activity of any bad actors from the overall aggregate records, and voila! No muss, no fuss.
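As a rough illustration of summing fragments and then backing out a bad actor, here is a small sketch using plain Python counters. The sample data, the user names, and the helper names `aggregate_pageviews` and `remove_bad_actor` are hypothetical; the real aggregate documents live in the database and may be shaped differently.

```python
from collections import Counter

# Hypothetical fragments: user id -> {story id: pageviews by that user}.
fragments = {
    "alice": {"story-1": 3, "story-2": 1},
    "bob": {"story-1": 2},
    "spambot": {"story-1": 500},
}


def aggregate_pageviews(fragments):
    """Sum every user's fragment into per-story totals."""
    totals = Counter()
    for per_story in fragments.values():
        totals.update(per_story)  # adds counts story by story
    return totals


def remove_bad_actor(totals, fragments, user_id):
    """Return new totals with one user's fragment subtracted."""
    cleaned = Counter(totals)
    cleaned.subtract(fragments.get(user_id, {}))
    return cleaned


totals = aggregate_pageviews(fragments)
cleaned = remove_bad_actor(totals, fragments, "spambot")
print(totals["story-1"], cleaned["story-1"])
```

The point of the design is that removing a bad actor never requires recounting everyone else: the aggregate is a sum, so one fragment can be subtracted on its own.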

This raises many questions though, such as: How do we count the activity of anonymous users vs. authenticated users? How do we tell them apart? How can we detect bad actors ahead of time and be proactive rather than reactive?

I can’t go into all of the details there, as we’re trying to mitigate the bad actors, not give them the keys to the kingdom. I can say this, though: we’re taking every measure possible to prevent bad apples from ruining the bunch. One inspiration has been Discourse’s “trust level” system. If, for example, you were to verify your email, read for an hour, and publish at least 1,000 words, we can be reasonably sure you’re not a bot, or a spammer creating thousands of accounts.
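The example above could be sketched roughly like this. The thresholds come straight from that example (verified email, an hour of reading, 1,000 words published); the `UserActivity` shape and the function name are assumptions for illustration, and this is deliberately not a description of our real checks.

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    """Hypothetical summary of one user's activity."""
    email_verified: bool
    seconds_read: int
    words_published: int


# Thresholds taken from the example in the post, not from production rules.
MIN_SECONDS_READ = 60 * 60
MIN_WORDS_PUBLISHED = 1_000


def is_probably_human(activity):
    """True only when all three example conditions are met."""
    return (
        activity.email_verified
        and activity.seconds_read >= MIN_SECONDS_READ
        and activity.words_published >= MIN_WORDS_PUBLISHED
    )


trusted = UserActivity(email_verified=True, seconds_read=4000, words_published=1500)
fresh = UserActivity(email_verified=True, seconds_read=120, words_published=0)
print(is_probably_human(trusted), is_probably_human(fresh))
```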

There are many more aspects of this system I’d love to discuss, but at this point, I’d like to open this up to questions! What do you want to know about this analytics system?


I’m Nostradaming these questions from a lot of users:

  • How is this better than Medium stats?
  • Will it be like a cool dashboard with gauges?
  • Can this system actually help me understand what users like more, with advice like: “Hey, you have a long article, people get bored and leave it”?
  • Can it help predict something in the long run? E.g., if you post 2 articles per week, we assume that your follower base will grow 2x.
  • Monetization is just a black hole for discussion. I mean, with HN 2.0, will it give some ability to help advertise both the HN company and small companies as well?

Excellent questions! Let’s tackle them one-by-one:

This is better because, for one, we listen! We’ll give the people what they want. That, and we’ll be way more transparent in how we calculate those stats, so you actually know what’s going on behind the numbers.

Here’s a sneak peek of what we have planned :slight_smile:

So this is a really cool idea, and the system won’t be that smart from day 1, but you will be able to infer that information. Given your example, you’ll be able to see words written for a particular story, compared to reading time. This will give you a rough idea of how a given story is performing. Please note that what we have planned right now are overall stats, not stats for a particular story, but the latter will be close behind the overall stats!

This is related to my last answer, but in short, not from day 1. That said, we’re definitely talking about how to make the most of the data we’re aggregating, to give authors the best insights possible. Questions like this directly affect those conversations, so: thanks for this! :slight_smile: I’d personally love to see this – it’s just calculus, right? :wink:

@David has talked about our vision for monetization for authors a bit in this thread, but in short: monetization for authors is a high-end goal, a long-term goal, not something we’re going to have early on. That said, you will be able to add your own call-to-action to your profile – sky’s the limit there! Of course, if you’re just talking monetization in general, we have big plans for sponsors in Hacker Noon 2.0… more to come in a future Product thread… :wink:


What about making it modular/open source, so you can use the power of the community and turn it into a system with plugins?

Interesting idea, but it opens up a lot of questions and issues, such as generalizing the analytics system enough to be useful to other people, maintaining a large open source project as a company, and maintaining security while being open (I absolutely abhor security by obscurity, but on the other hand, the NSA doesn’t release all of its security practices and algorithms, not by half).

So while I don’t see this happening in the near future, who knows what may happen down the road! In the meantime, we’ll definitely be collecting feedback, implementing the best ideas, and writing about what we’re doing so folks know what’s behind the numbers.

When this thread grows, it should be published on HN as a teaser, 100%.

Maybe it’d be cool for the team, the community, and you to create some sort of release plan for this dashboard? It would help to solidify tasks/features and show how it’s doing. The project manager inside me won’t give up trying to help :slight_smile:

And I think we underestimate the power of this community. BetaFeedback shows that one head is good, but 1+1 = 11. Maybe someone will decide to create a service that will be used at HN, or knows some cool code/module that can speed things up. As a lazy person, I’m always looking for alternatives in order to speed things up.

Yes, I remember that chat from the features thread. Just a few more cents about monetization.

5 years ago I saw a client dashboard for AdRoll. It was very informative and simpler than GA. It had a small amount of visual data, but with good UI/logic, so anyone was able to understand how the ads were working.

And I remember David or Linh (Smooks Mafia) shared HN stats on the Investments page, but how can an advertiser get more information from that data? Use case: one good article from a copywriter costs $100, plus a CTA at the end. How many clicks, or how much ROI, can it bring per user, per day, or per month?

I think asking questions like this will help to understand the main reason behind these stats… Maybe it’s worth trying to define some goals for how these stats can be used in order to accomplish XX or YY.

I mean that from a developer’s perspective it’s just numbers and a d3 chart, but what about the business logic behind it…

Oh yeah, I wouldn’t say that the developer’s perspective should ever be just numbers and a d3 chart. It should always, always be about the underlying use case and what the people actually need from the numbers.

That said, I think we’re getting into the weeds here – this is almost talking about a whole new analytics product, right? We often joke about pivots, as a team, but that would be a big one.

However, you talk about goals – what are the key metrics for authors? I think for me, the ratio of words published to time read is super important. It tells me quite directly how effective my writing and its distribution are (or you could compare time reading to pageviews, and then compare the two ratios). What would you want to see as an author? Let’s get concrete! :slight_smile:
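For the sake of being concrete, here is a small sketch of those two ratios. The function name, the argument names, and the choice of units (words per minute read, seconds read per pageview) are assumptions for illustration rather than the stats we will actually ship.

```python
def author_ratios(words_published, seconds_read, pageviews):
    """Compute two rough effectiveness ratios for an author.

    Returns None for a ratio whose denominator is zero.
    """
    minutes_read = seconds_read / 60
    return {
        # Words published relative to total time readers spent reading.
        "words_per_minute_read": (
            words_published / minutes_read if minutes_read else None
        ),
        # Average reading time each pageview produced.
        "seconds_read_per_pageview": (
            seconds_read / pageviews if pageviews else None
        ),
    }


# Example: 1,200 words published, 600 seconds read across 10 pageviews.
print(author_ratios(words_published=1200, seconds_read=600, pageviews=10))
```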

Ok, but this is a very easy way :frowning: I got it - slowing down.

  • total views
  • reads (people who reach the end of the story)
  • clicks on the profile, social links, and the CTA in a story
  • how many people follow me after an article
  • which category is trending most for views

But! For reads, don’t count a read when people just scroll to the end.
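One possible way to do that, sketched under loud assumptions: only count a read when the reader reached the end and also stayed for some fraction of the expected reading time. The 250 words-per-minute speed, the 0.5 cutoff, and the function name are all made up for illustration, and nothing here reflects what the HN team has decided.

```python
# Assumptions for illustration only; tune or replace as needed.
ASSUMED_WORDS_PER_MINUTE = 250
MIN_FRACTION_OF_EXPECTED_TIME = 0.5


def counts_as_read(word_count, seconds_on_page, reached_end):
    """Count a read only if the reader reached the end and did not rush."""
    if not reached_end:
        return False
    expected_seconds = word_count / ASSUMED_WORDS_PER_MINUTE * 60
    return seconds_on_page >= expected_seconds * MIN_FRACTION_OF_EXPECTED_TIME


# A 1,000-word story is expected to take about 240 seconds at this speed.
print(counts_as_read(1000, 5, True))    # scrolled straight to the end
print(counts_as_read(1000, 200, True))  # spent a plausible amount of time
```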

Can we at least have writing tips based on that data? Pleeze.
Or is it like fitness trackers: showing you your data, but not preventing you from eating ice cream?

I think there should be a poll for these things, so people can vote for what they want to see first.

Great stuff, thanks Arthur! I’d love to have all of the things you discuss, especially writing tips, but writing tips in particular are capital-H Hard to implement well. We do want to do these things…it’ll just take time. Will we have it on day 1? No way. But a year from now? It’s certainly possible. I’d be happy to get into why writing tips are such a difficult problem, but really, it’s worth a blog post or thread all its own.

I imagine that better insights are something a lot of people want, and with all the data we’ll be pulling in with the analytics system, it would be foolish not to do it at some point. However, other priorities may take precedence. Polls are a good idea, but we have to be careful how we phrase the poll. If you ask people if they want better insights into how their stories are performing, almost everyone will say “yes”. It’s a no-brainer. It’s hard to phrase these questions, and suss out what people really want (and what they really need). Anyway, the system will become smarter over time. But we have to crawl before we can walk, and walk before we can run, right?

As for the other stats you mentioned, I’m currently working on pageviews, reading time, words published, and a few other stats in the pipeline. Tracking clicks on links within a story is a great idea, one I honestly hadn’t thought of – I’ll bring that up with the team for sure. CTA clicks (at least on the profile) will definitely have to happen as well. As for followers, we don’t currently have a follower system in place, but that’s coming. Regarding which categories (or tags, in this case) are trending, we will have analytics of some kind on which tags are performing the best for your stories. Stay tuned for an update on that last bit… :wink:

tl;dr we will have writing tips eventually, but not from day 1, due to the sheer difficulty of doing that well. As for the stats you mentioned, we’re working on some now, some later. :slight_smile:

Time to dig deeper into the BuzzSumo API :slight_smile: