Google Analytics is not a statistics package!

As regular readers will know, I’m a big fan of using Google Analytics with repositories to see what your visitors are doing: what they look at, which links they follow, where they come from, how many of them visit the site, and so on.

However, from time to time I come across concerns about the data that Google Analytics does not capture. This includes users who do not allow JavaScript or cookies, and visitors who click directly on ‘files’ (e.g. PDF files). In this second case, the data isn’t tracked because no web page is shown from which to run the Google Analytics tracking code. To help collect some of this information I have used a script by Patrick H. Lauke which triggers when a user clicks to download a file from a metadata jump-off page. It registers the click with Google Analytics, so the download is recorded. But as I said, it doesn’t record direct hits to a file that did not first go via the repository.

Is this a problem? Personally I don’t think so:

  • At least some of the data is now being recorded, which is better than none. It might not be numerically accurate, but hopefully it is still representative of user behaviour.
  • Remember that Google Analytics is an analytics package, not a statistics package. It does not claim to record every click; it is intended to help with analysing and improving the user experience (e.g. “Do I get more file downloads if I place the list of files above the metadata or below it?” or “Do users that land on a browse page download more files than those that arrive directly on an item page?”).
  • If you want raw download figures, use a proper statistics system that works from web server logs (e.g. IRStats or a common web stats system such as AWStats). Most likely you’ll want to use both.
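
To give a feel for what the log-based approach involves, here is a minimal Python sketch that counts successful file downloads straight from web server log lines. It is only an illustration, not how IRStats or AWStats work internally. It assumes the logs are in Apache Common or Combined Log Format, and the sample paths (`/bitstream/...`), the `count_downloads` function and the `.pdf` filter are all hypothetical.

```python
import re
from collections import Counter

# Picks out the request and status fields of an Apache Common/Combined
# log line, e.g.  "GET /bitstream/1/1/paper.pdf HTTP/1.1" 200 52344
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def count_downloads(lines, extensions=(".pdf",)):
    """Count successful GET requests per file path for the given extensions."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a parsable request line
        if match["method"] != "GET" or match["status"] != "200":
            continue  # skip non-GET requests and anything that is not a 200
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if path.lower().endswith(tuple(extensions)):
            counts[path] += 1
    return counts


# Hypothetical log lines: the first request has no referrer, i.e. a direct
# hit on the file that never went via a repository page.
sample_log = [
    '192.0.2.1 - - [10/Mar/2009:10:00:00 +0000] "GET /bitstream/1/1/paper.pdf HTTP/1.1" 200 52344 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [10/Mar/2009:10:01:00 +0000] "GET /handle/1/1 HTTP/1.1" 200 8120 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [10/Mar/2009:10:01:05 +0000] "GET /bitstream/1/1/paper.pdf HTTP/1.1" 200 52344 "http://repo.example.org/handle/1/1" "Mozilla/5.0"',
    '192.0.2.3 - - [10/Mar/2009:10:02:00 +0000] "GET /bitstream/1/2/missing.pdf HTTP/1.1" 404 310 "-" "Mozilla/5.0"',
    '192.0.2.4 - - [10/Mar/2009:10:03:00 +0000] "GET /bitstream/2/1/thesis.PDF?sequence=1 HTTP/1.1" 200 91822 "-" "Mozilla/5.0"',
]

download_counts = count_downloads(sample_log)
for path, hits in download_counts.most_common():
    print(path, hits)
```

Note that the direct hit on `paper.pdf` is counted here even though no repository page was shown first, which is exactly the case the click-tracking script misses. A real statistics system would also need to deal with robots, repeat requests and partial (206) responses, none of which this sketch attempts.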


5 thoughts on “Google Analytics is not a statistics package!”

  1. patrick h. lauke

    you could set up some fancy url rewriting rules in apache to pipe direct downloads/links via some custom, analytics-hitting script page first…but that may be overkill in most cases.

  2. Montserrat

    Good morning.
    I’m Montse.
    Last year I installed DSpace 1.4.2 and the monthly statistics do not appear, as you can see at the URL http://rabida.uhu.es/dspace/statistics

    Only the month and year that I set in the initial year and month variables appear.

    When I run the statistics scripts manually, the following errors appear:

    1.- [dspace@rabida dspace]$ cd /usr/local/dspace/bin/

    [dspace@rabida bin]$ ./stat-initial
    Exception in thread “main” java.sql.SQLException: ERROR: invalid input syntax for type timestamp: “2009”
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1501)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1283)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:186)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:392)
    ……….
    ……….

    2.- [dspace@rabida bin]$ ./stat-report-initial
    Failed to read input file
    Failed to read input file
    ……….
    ……….

    Do you know how I can fix this?
    Thank you.
