Google Analytics is not a statistics package!
As everyone knows, I’m a big fan of using Google Analytics with repositories to see what visitors are doing: what they are looking at, which links they are following, where they are coming from, how many people are visiting the site, and so on.
However, from time to time I come across views about the data that Google Analytics does not capture. Such data includes users who do not allow JavaScript or cookies, and visitors who click directly on ‘files’ (e.g. PDF files). In this second case, the data isn’t tracked because no web page is shown from which to run the Google Analytics tracking code. In an attempt to collect some of this information I have used a script by Patrick H. Lauke which triggers when a user clicks to download a file from a metadata jump-off page. It registers the click with Google Analytics and the download is recorded. But as I said, it doesn’t record direct hits to a file that did not first go via the repository.
Is this a problem? Personally I don’t think so:
- At least some of the data is now being recorded, which is better than none. It might not be numerically accurate, but hopefully it is still representative of user behaviour.
- Remember that Google Analytics is an analytics package, not a statistics package. It does not claim to record every click; it is intended to help with analysing and improving the user experience (e.g. “Do I get more file downloads if I place the list of files above the metadata or below it?” or “Do users that land on a browse page download more files than those that arrive directly on an item page?”).
- If you want raw download figures, use a proper statistics system that works from web server logs (e.g. IRStats or a common web stats system such as AWStats). Most likely you’ll want to use both.
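To make the log-based approach concrete, here is a minimal sketch of counting file downloads straight from web server logs. It is not how IRStats or AWStats work internally; it only assumes the logs are in Apache's common or combined format and that downloads can be recognised by a `.pdf` extension. The names `count_downloads` and `SAMPLE_LOG`, and the sample paths, are made up for illustration.

```python
import re
from collections import Counter

# Matches the request path and status code of an Apache common/combined log line.
LOG_LINE = re.compile(r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def count_downloads(lines, extensions=(".pdf",)):
    """Count successful GET requests for files with the given extensions."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a GET request, or not a recognisable log line
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if match.group("status") == "200" and path.lower().endswith(extensions):
            counts[path] += 1
    return counts

# Hypothetical log lines: two direct PDF hits, one page view, one failed request.
SAMPLE_LOG = [
    '192.0.2.1 - - [07/Aug/2008:14:35:00 +0100] "GET /bitstream/123/1/paper.pdf HTTP/1.1" 200 51234 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [07/Aug/2008:14:36:10 +0100] "GET /handle/123 HTTP/1.1" 200 8042 "http://example.org/" "Mozilla/5.0"',
    '192.0.2.3 - - [07/Aug/2008:14:37:45 +0100] "GET /bitstream/123/1/paper.pdf HTTP/1.1" 200 51234 "-" "Mozilla/5.0"',
    '192.0.2.4 - - [07/Aug/2008:14:38:02 +0100] "GET /bitstream/123/2/missing.pdf HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for path, hits in count_downloads(SAMPLE_LOG).most_common():
        print(hits, path)
```

Unlike the JavaScript approach, this counts hits that never touched a repository page, but it also counts robots and says nothing about how visitors moved around the site, which is why the two approaches complement each other.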

on August 7, 2008 at 2:35 pm
You could set up some fancy URL rewriting rules in Apache to pipe direct downloads/links via some custom, analytics-hitting script page first… but that may be overkill in most cases.
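A rough sketch of what such a script page might look like, written as a small Python WSGI application. It assumes the rewrite rules (not shown) already route file requests to it. `make_download_app`, `file_root` and `record_hit` are hypothetical names; `record_hit` stands in for whatever call would actually notify the analytics service.

```python
import mimetypes
import os

def make_download_app(file_root, record_hit):
    """Return a WSGI app that reports each download via record_hit, then serves it."""
    root = os.path.realpath(file_root)

    def app(environ, start_response):
        rel = environ.get("PATH_INFO", "").lstrip("/")
        full = os.path.realpath(os.path.join(root, rel))
        # Refuse paths that escape the file root or do not point at a file.
        if not full.startswith(root + os.sep) or not os.path.isfile(full):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]
        # Hypothetical hook: tell the analytics system about the hit first.
        record_hit(rel, environ.get("HTTP_REFERER", ""))
        ctype = mimetypes.guess_type(full)[0] or "application/octet-stream"
        with open(full, "rb") as handle:
            body = handle.read()
        start_response("200 OK", [("Content-Type", ctype),
                                  ("Content-Length", str(len(body)))])
        return [body]

    return app
```

Reading the whole file into memory keeps the sketch short; a real deployment would stream large files instead.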
on August 20, 2009 at 9:13 pm
Good morning.
I'm Montse.
Last year I installed DSpace 1.4.2 and the monthly statistics do not appear, as you can see at the URL http://rabida.uhu.es/dspace/statistics
Only the month and year that I set in the initial year and month variables appear.
When I run the statistics scripts manually, the following errors appear:
1.- [dspace@rabida dspace]$ cd /usr/local/dspace/bin/
[dspace@rabida bin]$ ./stat-initial
Exception in thread "main" java.sql.SQLException: ERROR: invalid input syntax for type timestamp: "2009"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1501)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1283)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:186)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:392)
……….
……….
2.- [dspace@rabida bin]$ ./stat-report-initial
Failed to read input file
Failed to read input file
……….
……….
Do you know how I can fix this?
Thanks.
on August 20, 2009 at 9:54 pm
Good morning,
Could you please send the full stack trace to the dspace-tech mailing list http://wiki.dspace.org/index.php/DSpaceResources#Mailing_Lists, where you will receive help.
Thanks,
Stuart
on August 21, 2009 at 1:07 am
OK.
I have sent a file with the errors to this address:
dspace-tech-request@lists.sourceforge.net
Thanks.
on August 21, 2009 at 9:25 am
Try dspace-tech@lists.sourceforge.net rather than dspace-tech-request@lists.sourceforge.net.