Avid readers may have noticed a new feature I added relatively silently this week, the how-on-earth-did-anyone-find-this-article feature. For example, if you go to this article about coffee, you will see that someone found it by googling "hvordan lage aprikoslikør" (Norwegian for "how to make apricot liqueur"), at least at the time of writing. So how can I know that?
When you open a web page, what actually happens is that your browser sends something called an HTTP request to the web server. This request contains a bunch of information, most importantly which page you want to open, but also some other stuff. Some of that information is required, and if it is missing, the result is an error, whereas some is optional. For example, most browsers let you select which languages you prefer to see websites in. Not that many web pages are actually available in more than one language, or at least not in both Norwegian and English, but calcuttagutta currently is, so you can test this if you want.
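To make the language preference a bit more concrete: the browser sends it as an Accept-Language header, a comma-separated list of language tags with optional weights. The following is only a minimal sketch of how a server could pick a language from that header, not how calcuttagutta does it. The function name preferred_language and the default of "en" are made up for the illustration, and it is written for Python 3.

```python
def preferred_language(accept_language, available, default="en"):
    """Pick the best available language from an Accept-Language header value."""
    choices = []
    for position, item in enumerate(accept_language.split(",")):
        item = item.strip()
        if not item:
            continue
        # Each item looks like "nb-NO" or "en;q=0.8"
        tag, _, params = item.partition(";")
        quality = 1.0  # no q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Negative quality so that sorting puts the most preferred first,
        # with header order breaking ties.
        choices.append((-quality, position, tag.strip().lower()))

    for neg_quality, _, tag in sorted(choices):
        if neg_quality == 0:
            continue  # q=0 means "not acceptable"
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # "nb-no" falls back to "nb"
        if primary in available:
            return primary
    return default


print(preferred_language("nb-NO,nb;q=0.9,en;q=0.8", ["en", "nb"]))
```

With a site available in "en" and "nb", a browser set to prefer Norwegian gets the Norwegian version, and a browser asking only for a language the site lacks falls back to the default.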
Another piece of optional information, and the one which concerns us here, is the http_referer. If you click a link to go to a new page, the HTTP request will contain information about which page you came from. If you don't want to give up this information, it is possible to disable it, but most people probably neither know nor care. So for example, if you googled "hvordan lage aprikoslikør", your http_referer would look like this:
http://www.google.com/search?hl=en&source=hp&q=hvordan%20lage%20aprikoslik%C3%B8r&aq=f&aqi=&aql=&oq=&gs_rfai=
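Broken into pieces, the first part identifies the site you came from:
http://www.google.com/
Since Google has many national domains, it is enough to check that the referer starts with
http://www.google.
The rest of the address is the search request itself, a list of name=value pairs separated by "&":
search?hl=en&source=hp&q=hvordan%20lage%20aprikoslik%C3%B8r&aq=f&aqi=&aql=&oq=&gs_rfai=
and the pair we are interested in is the one named q, which holds the search term:
q=hvordan%20lage%20aprikoslik%C3%B8r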
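If you want to try pulling the address apart yourself, here is a small sketch using Python's standard library. It uses Python 3's urllib.parse, and the variable names are just for illustration. The interesting step is unquote_plus, which turns the percent-encoded %20 and %C3%B8 back into a space and an "ø".

```python
from urllib.parse import urlsplit, unquote_plus

referer = ("http://www.google.com/search?hl=en&source=hp"
           "&q=hvordan%20lage%20aprikoslik%C3%B8r&aq=f&aqi=&aql=&oq=&gs_rfai=")

parts = urlsplit(referer)
site = parts.scheme + "://" + parts.netloc + "/"  # the part before "search"
query_string = parts.query                        # everything after the "?"

# Walk through the name=value pairs and decode the one named "q"
search_term = None
for pair in query_string.split("&"):
    key, _, value = pair.partition("=")
    if key == "q":
        search_term = unquote_plus(value)

print(site)
print(search_term)
```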
For those interested, the actual code looks like this:
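    # Python 2 import; on Python 3 this is: from urllib.parse import unquote_plus
    from urllib import unquote_plus

    # request is the Django request; c is assumed to be the dict holding the article
    if 'HTTP_REFERER' in request.META:
        referer = request.META['HTTP_REFERER']
        if referer.startswith('http://www.google.'):
            # partition() avoids an IndexError when the referer has no '?'
            queries = referer.partition('?')[2].split('&')
            for query in queries:
                if query.split('=')[0] == 'q':
                    # Count the hit and store the decoded search term
                    c['article'].google_count += 1
                    c['article'].last_google_hit = unquote_plus(query.split('=')[1])
                    c['article'].save()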
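For anyone who wants to play with the idea outside of Django, the same check can be sketched as a small standalone function. This is not the code running on the site. The name google_search_term is made up, a plain dict stands in for request.META, and it uses Python 3's parse_qs to do the splitting and decoding in one go.

```python
from urllib.parse import urlsplit, parse_qs


def google_search_term(meta):
    """Return the Google search term from a META-like dict, or None."""
    referer = meta.get("HTTP_REFERER", "")
    if not referer.startswith("http://www.google."):
        return None
    # parse_qs splits on "&" and "=" and decodes percent-encoding;
    # each name maps to a list of values.
    terms = parse_qs(urlsplit(referer).query).get("q")
    return terms[0] if terms else None


fake_meta = {
    "HTTP_REFERER": "http://www.google.com/search?hl=en&q=hvordan%20lage%20aprikoslik%C3%B8r&aq=f"
}
print(google_search_term(fake_meta))
```

A visitor without a referer, or one coming from somewhere other than Google, simply gives None, so the counter would be left alone.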
-Tor Nordam