SamSuka
Touhou-Project.com



Statistics and how useful they are

 If you’re running a website, having a good handle on statistics can be very useful. Being able to tell how many visitors you have, what’s brought them to the site, what they look at and for how long allows you to cater to your community’s needs and address issues. Not only that—statistics help determine bandwidth usage and plan accordingly around possible technical bottlenecks.

THP is no exception to that reality and, pretty much for as long as the site has existed, I’ve kept records of some sort. The primary way I’ve done that is through a program called AWStats, which is an open source analytics tool. Without getting too technical about it, the web server generates logs of all requests it handles (files served, external links leading to it, etc.) and records data such as IP address and time with each entry. AWStats looks at these logs before they’re rotated (to prevent logs from getting too big, new ones are started after some time and old ones are stored for a set amount of time) and breaks down the information in a more human-intelligible way. The end result is a set of charts, tables and other visual aids that can tell me how many unique visitors the website gets in a specific time period or what percentage of users are on which web browser.
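
To make the idea of turning raw logs into stats a bit more concrete, here is a minimal Python sketch. It is not how AWStats works internally and it does not use THP's real logs: the log format (the common "combined" format), the sample entries and the function names are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Assumption: the server writes the common "combined" log format.
# The real server configuration may well differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample entries using documentation-range IPs, not real traffic.
SAMPLE_LOG = """\
192.0.2.10 - - [01/Mar/2019:10:00:00 +0000] "GET /th/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"
192.0.2.10 - - [01/Mar/2019:10:00:05 +0000] "GET /th/res/1.html HTTP/1.1" 200 20480 "https://example.org/th/" "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"
203.0.113.7 - - [01/Mar/2019:11:30:00 +0000] "GET / HTTP/1.1" 200 1024 "https://search.example/?q=thp" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0 Safari/537.36"
"""


def parse_line(line):
    """Return the fields of one log entry as a dict, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def browser_family(agent):
    """Very crude browser guess. Order matters: Chrome agents also say 'Safari'."""
    for name in ("Firefox", "Chrome", "Safari"):
        if name in agent:
            return name
    return "Other"


def summarize(log_text):
    """Aggregate hits, unique visitor IPs, bytes served and browser counts."""
    entries = [e for e in map(parse_line, log_text.splitlines()) if e]
    return {
        "hits": len(entries),
        "unique_visitors": len({e["ip"] for e in entries}),
        "bandwidth_bytes": sum(int(e["size"]) for e in entries if e["size"].isdigit()),
        "browsers": Counter(browser_family(e["agent"]) for e in entries),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LOG))
```

Note that "unique visitors" here just means distinct IP addresses, which is only a rough stand-in for actual people.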

See thp.moe/teru/statistics1.png for a yearly breakdown of various stats.

I’ve kept actual numbers out of the image because of a few flaws in methodology. Chief among them is that stats are updated weekly rather than monthly because of issues we had way in the past. This means that figures like unique visitors are necessarily inflated, since each update treats its batch of logs as brand new information. Other flaws have to do with the web server configuration itself, the different ways logs are stored and the specific versions of the software. It’s something that I’ll be sorting out whenever the next release of Debian is out (likely within the next few months), as fixing things necessitates breaking from previous methodology. In other words, new data can’t be compared to old data directly, so I might as well change a whole bunch of other variables, such as software (different versions or programs altogether), while I’m at it.
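
The inflation problem can be shown with a toy example. This is a simplified model of the flaw, not a reproduction of what AWStats actually does: the weekly batches and IP addresses below are invented, and the two function names are hypothetical.

```python
# Hypothetical visitor IPs seen in each weekly batch of logs for one month.
weekly_batches = [
    {"192.0.2.1", "192.0.2.2", "192.0.2.3"},
    {"192.0.2.1", "192.0.2.2"},
    {"192.0.2.2", "192.0.2.4"},
    {"192.0.2.1", "192.0.2.3", "192.0.2.4"},
]


def naive_unique(batches):
    """Treat every batch as brand new information and add up its visitors."""
    return sum(len(batch) for batch in batches)


def deduplicated_unique(batches):
    """Count each visitor once across the whole period."""
    return len(set().union(*batches))


if __name__ == "__main__":
    print("summed per batch:", naive_unique(weekly_batches))
    print("actually distinct:", deduplicated_unique(weekly_batches))
```

With these made-up batches, only four distinct visitors exist, but adding up the weekly counts gives ten, because returning visitors are counted again in every batch they appear in.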

If you read the previous technical post, you’ll know that I’m a privacy-focused and free/libre and open source software (FLOSS) sort of guy. In practical terms it means that I try to use tools which I can inspect and modify to my satisfaction and that I don’t trust proprietary software to handle data. How is that relevant? Well, more accurate methods of getting data involve using things like cookies to track users as they move around the website (sometimes even on other websites) and creating persistent databases of activity. I’m against this or anything that strips away anonymity from users without their consent or can be, if leaked, used for malicious purposes by third parties. Likewise, some of the more popular tools like Google Analytics for website traffic analysis are not to be trusted because they involve third parties who may use this data for advertising or whatever else.

With things like adblock becoming ubiquitous, along with NoScript and users rejecting cookies, it’s also increasingly difficult to get a complete impression of who your users are. This is why I’ve focused more on the general trends as well as on what links to THP. The sad truth of the former is that raw activity (hits, bandwidth, visitors) has fallen to less than half of what it was in 2014, and that referrers (search engines, links elsewhere) have always accounted for a very tiny share of incoming traffic. Basically, people who are on THP access it through a bookmark or by typing out the address; people don’t really just stumble upon the site.

This is why I think that, in the medium and long term, THP needs more people and visibility to keep generating activity. I’ll talk about more concrete plans sometime down the line, but it’s clear that at least part of the solution will necessarily involve advertising. It’s my intention to do a few long-planned reforms (think mystery box) and collect something of a more sizable war chest (which is part of the reason we have a Patreon and stretch goals to begin with!) before actually dealing with attracting more people directly. I’ll just add as an aside: if you ever have an idea on how to get more new blood, do let me know by messaging me on Patreon or talking to me on IRC. I’ll be happy to take any and all ideas into consideration.

Hope it won’t be too long before the next one of these posts and that I’ll be able to share some more details of things I’ve been working on.

P.S. The board software does generate “stats” in the admin page, but they’re of limited usefulness as they’re mostly just the number of posts made across the boards in the last day, week or lifetime. They don’t show what people read or what parts of the site they access. In thp.moe/teru/statistics2.png you can see just how slow the boards have gotten: the number of posts on /th/ in the last week used to be the number of daily posts there some time ago.

Edit: Forgot that hotlinking to images is disabled on THP, so I changed the links to in-text addresses that you have to paste into your browser. That way you won't get a 403 error.


