Doug's Dynamic Drivel

"Religion is what the common people see as true, the wise people see as false, and the rulers see as useful." Seneca



Lies, damned lies, and statistics

23 December, 2009 (16:35) | Computer, Technology

Everywhere you go on the web you see visitor counts touted by site owners. Have you ever stopped to consider how misleading those can be? If you go somewhere and see a counter of some sort that states they’ve had 1,500,000 visitors, the average non-techie person is going to think: wow, this is a popular site, it’s had over a million people visit it. Unfortunately there is so much wrong with that, not the least of which is the rapidly increasing number of bots that scour websites on a constant basis, or that it presents very misleading eyeball figures to potential advertisers.

Currently on this site there are approximately 10 bot hits to every real hit, although I have most of those hits filtered out of the results, and that takes a lot of work. I have plenty of /24 ranges (the old Class C: 256 IP addresses in a continuous block) and lots of referrers, user agents, etc. all being ignored by my main stats program, FireStats. I doubt very much that most hosts even make the attempt, or know that it’s needed.
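To make the /24 idea concrete, here is a minimal sketch of that kind of filtering in Python. It is not how FireStats does it internally; the names `IGNORED_NETWORKS`, `is_ignored` and `filter_hits` are made up for illustration, and the address blocks are reserved documentation ranges rather than real bot networks.

```python
import ipaddress

# Hypothetical ignore list. These are reserved documentation ranges,
# standing in for whatever /24 blocks you have decided to ignore.
IGNORED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_ignored(ip):
    """Return True if the address falls inside any ignored block."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in IGNORED_NETWORKS)


def filter_hits(hits):
    """Keep only the hits whose source address is not on the ignore list."""
    return [ip for ip in hits if not is_ignored(ip)]


if __name__ == "__main__":
    sample = ["192.0.2.77", "203.0.113.5", "198.51.100.1"]
    print(filter_hits(sample))
    # A /24 covers 256 addresses in one continuous block.
    print(IGNORED_NETWORKS[0].num_addresses)
```

The work is not in the check itself but in building and maintaining the list, which is why it takes so much effort to keep up.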

One problem is that many bots don’t exactly advertise themselves as such. They don’t put any identifying information in the user agent string, so one has to analyze their behavior to catch them (such as going to numerous pages on the site and spending exactly the same amount of time on each page before proceeding to the next one, dropping off for a bit, then coming back and doing it again on another 10 posts or so). Ones that do that find their address ranges in my deny list in iptables pretty quickly, along with a sharp rebuke sent to their hosting provider. That’s very bad manners.
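The "same amount of time on each page" pattern can be checked mechanically. The sketch below is only one rough way to do it, assuming you already have the request times (in seconds) for a single IP address within one burst of visits. The function name `looks_automated` and the `min_pages` and `tolerance` thresholds are my own assumptions, not settings from any stats program.

```python
def looks_automated(timestamps, min_pages=5, tolerance=1.0):
    """Flag a visitor whose page requests are almost perfectly evenly spaced.

    timestamps: request times in seconds for one IP address.
    min_pages:  ignore visitors with fewer requests than this (assumed value).
    tolerance:  largest allowed spread between gaps, in seconds (assumed value).
    """
    if len(timestamps) < min_pages:
        return False
    ordered = sorted(timestamps)
    # Time spent on each page is the gap between consecutive requests.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) - min(gaps) <= tolerance


if __name__ == "__main__":
    print(looks_automated([0, 30, 60, 90, 120]))   # evenly spaced requests
    print(looks_automated([0, 12, 95, 110, 400]))  # irregular, human-like gaps
```

A real visitor lingers on some pages and skips past others, so the gaps vary widely. A check like this will still miss bots that randomize their delays, so treat it as one signal among several.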

Bots suck up bandwidth and, more importantly, they eat up RAM/CPU resources. Were I actually paying for this server, they would be costing me serious money, as I would find myself having to add more and more resources to the server so that legitimate users could visit here and not be faced with a long load time.

Yesterday I added a different stats program, Visitor Maps, to the blog. It does a different job than FireStats, and between them my stats will be much more reliable, if not as impressive as some others are. Here is an example of what it shows me:

It’s good at detecting most bots, but it missed a number of MSN ones. Still, I got the IP addresses for those, so I can expand the range of IPs to overlook (MSN, Yahoo, and Google are all welcome to index my site as often as they want; others not so often). Nearly all of the hits in that 35-minute span of time were bots. Pretty eye-opening, isn’t it :)

