Count search engine referrals using web server logs and shell commands

Suppose you want to know how much of your website traffic comes from Google or another search engine. This is easy with AWStats or Google Analytics, but what if you haven't configured these tools? With only the web server log files and a few shell commands, we can quickly parse the logs and count how many referrals came from a search engine.

For the purposes of this tutorial, we assume an Apache web server and count the referrals that come from Google.

How it works

When Apache logs in the combined format (the default on many distributions), it writes the referring URL to the log file as part of each logged request. Referrals that come from a Google search results page have a referring URL beginning with "http://www.google, and it is this phrase that we will search for to determine the number of referrals from Google.

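The double quote at the beginning of the search phrase is necessary because Apache encloses referrers in double quotes. Including it means we only match URLs that start a quoted field, not URLs that merely contain the phrase somewhere else.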
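
To see where the referrer sits in a log entry, here is a small sketch. It assumes the combined log format; the sample line, including its IP address and URLs, is made up for illustration and does not come from a real log.

```shell
# A made-up log entry in Apache's combined format (illustrative only).
sample_line='203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://www.google.com/search?q=example" "Mozilla/5.0"'

# Splitting the line on double quotes puts the request in field 2,
# the referrer in field 4 and the user agent in field 6.
referrer=$(printf '%s\n' "$sample_line" | awk -F'"' '{ print $4 }')

echo "$referrer"
```

Because the referrer is the text right after an opening double quote, a search phrase that starts with that quote lines up with the beginning of the field.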

Which commands to use

The shell command to parse the log is:

cat /path/to/logs/access.log | grep -c "\"http://www.google"

When we run that command, the output is the number of referrals we have had from Google search engine result pages.

We first read the contents of the log file, then pipe it to grep, which counts the lines containing "http://www.google. Since each line in the log records one request, this gives the number of referred requests. The -c flag tells grep to print the number of matching lines instead of the lines themselves.
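
Google referrals may also arrive with an https:// referrer. As a minimal sketch, assuming you want to count both schemes together, the search phrase can be written as an extended regular expression. The function name count_google_referrals is hypothetical, chosen only for this example.

```shell
# count_google_referrals LOGFILE  (hypothetical helper name)
# Prints the number of lines whose quoted field starts with
# http://www.google or https://www.google.
count_google_referrals() {
    # -E enables extended regular expressions: "s?" makes the "s" optional
    # and "\." matches a literal dot. -c prints the count of matching lines.
    grep -c -E '"https?://www\.google' "$1"
}

# Example usage (the log path is a placeholder):
# count_google_referrals /path/to/logs/access.log
```

Here grep reads the file directly from its argument, so no separate command is needed to feed it the log.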

If you need to modify the above command to count the referrals from another search engine, just identify the beginning of the URL that the referrals will be coming from, and alter it in the command accordingly.
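
One way to make that substitution easier is to wrap the command in a small function that takes the URL prefix as an argument. This is only a sketch: the function name count_referrals and its argument order are assumptions made for this example, and the Bing prefix in the usage line assumes its referrals begin with http://www.bing.

```shell
# count_referrals PREFIX LOGFILE  (hypothetical helper name)
# Prints how many lines of LOGFILE contain a quoted field starting with PREFIX.
count_referrals() {
    # -c prints the number of matching lines. -F treats the pattern as a
    # fixed string, so dots in the URL are not read as regex wildcards.
    # The leading double quote ties the match to the start of a quoted field.
    grep -c -F -- "\"$1" "$2"
}

# Example usage (the log path is a placeholder):
# count_referrals "http://www.google" /path/to/logs/access.log
# count_referrals "http://www.bing" /path/to/logs/access.log
```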

Caveats

Note that the referrer can be spoofed when making an HTTP request, so these results can easily be manipulated. For the most part, though, this method works quite well.

Conclusion

This method can be easily automated to count referrals from search engines at regular intervals for those among us who absolutely need to be updated with the latest website statistics.
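
As a rough sketch of such automation, the function below appends the current date and the Google referral count to a statistics file each time it is called. The function name log_google_referrals, the file arguments and the script path in the crontab comment are all hypothetical placeholders.

```shell
# log_google_referrals LOGFILE STATSFILE  (hypothetical helper name)
# Appends one line of the form "<date> <count>" to STATSFILE.
log_google_referrals() {
    # grep -c exits with status 1 when the count is 0, so "|| true" keeps
    # the function from failing in scripts that run with "set -e".
    count=$(grep -c "\"http://www.google" "$1" || true)
    printf '%s %s\n' "$(date +%Y-%m-%d)" "$count" >> "$2"
}

# Example crontab entry (placeholder path) that runs a script calling
# the function every day at midnight:
# 0 0 * * * /path/to/count-referrals.sh
```

Keeping one line per run makes the statistics file easy to inspect later with the same shell tools.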

If you need more advanced statistics, try Google Analytics or AWStats. We hope you enjoyed this tutorial.
