Surfacing Google Analytics stats in DSpace
In the recent survey asking the DSpace community for their top 3 feature requests for DSpace 1.6, the number one most requested feature was statistics. As you’ll know from previous posts, I’m a big fan of Google Analytics.
For the uninitiated, you insert a small bit of JavaScript in your web pages, and Google provide a very rich and powerful analytics service for viewing your site statistics.
Recently Google announced the launch of an analytics API that allows you to remotely query and download the statistics it holds about your site.
I like playing with APIs, so I thought I’d write a solution that downloads item splashscreen view statistics from Google Analytics and displays them on the item page.

The solution is quite simple. It requires the addition of one Java class to DSpace. This class should be run daily to download the statistics. The same class is used by the user interface to display the statistics. If you want to implement this solution, follow the instructions below:
- Create a new directory (java package) at [dspace-src]/dspace-api/src/main/java/org/dspace/app/googleanalytics
- Download the code shown at the bottom of this post, and save it as GoogleAnalyticsHitCounter.java in the directory that you just created.
- Edit [dspace-src]/dspace-api/pom.xml to add in the dependencies on the Google API libraries:
<dependency>
  <groupId>com.google.gdata</groupId>
  <artifactId>gdata-core</artifactId>
  <version>1.0</version>
</dependency>
<dependency>
  <groupId>com.google.gdata</groupId>
  <artifactId>gdata-analytics</artifactId>
  <version>1.0</version>
</dependency>
<dependency>
  <groupId>com.google.collect</groupId>
  <artifactId>google-collect</artifactId>
  <version>1.0</version>
</dependency>
- Then download gdata-src.java-1.32.1.zip and extract (to somewhere handy) the jar files gdata-core-1.0.jar, gdata-analytics-1.0.jar and google-collect-1.0.jar (named google-collect-1.0-rc1.jar in the zip file).
- Install each of these by running the following Maven commands, adjusting paths as appropriate:
- mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-core -Dversion=1.0 -Dfile=gdata-core-1.0.jar -Dpackaging=jar
- mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-analytics -Dversion=1.0 -Dfile=gdata-analytics-1.0.jar -Dpackaging=jar
- mvn install:install-file -DgroupId=com.google.collect -DartifactId=google-collect -Dversion=1.0 -Dfile=google-collect-1.0.jar -Dpackaging=jar
- Next, edit [dspace-src]/dspace-jspui/dspace-jspui-webapp/src/main/webapp/display-item.jsp, and somewhere in the code (choose where you want it), add the following code:
<%
// See if we can display a counter
String path = "/handle/" + item.getHandle();
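// Fully qualified so that no extra page import is needed in display-item.jsp
String count = org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter.getPageCount(path);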
if ((count != null) && (!"".equals(count)))
{
%>
<table align="center" class="miscTable">
<tr>
<td class="oddRowEvenCol" align="center">
This item has been viewed <strong><%= count %></strong> times
</td>
</tr>
</table>
<%
}
%>
- If you don’t deploy your user interface as the ROOT webapp, then you’ll have to add the context path in the line: String path = "/handle/" + item.getHandle();
- Now build and deploy DSpace as you would normally (mvn package; ant update; etc…)
- Edit dspace.cfg and add in the following entries:
- googleanalytics.username = your-google-analytics@email.address.com
- googleanalytics.password = your-google-analytics-password
- googleanalytics.siteid = 123456789
- googleanalytics.filename = analyticscounts.properties
- googleanalytics.startdate = 2007-07-17
- Adjust the email address and password as appropriate.
- Log in to Google Analytics and find out the first date that you have statistics for. Set this in the start date entry, in the form of yyyy-mm-dd
- View the dashboard of your Google Analytics account, and look at the URL. Part of it will include ‘id=nnnnnnn’. Copy the id number and enter it in the dspace.cfg siteid entry.
- Download and compile your statistics by running (from [dspace]/bin/)
- dsrun org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter
- If everything worked as it should, you should now have a file [dspace]/analyticscounts.properties. If you look in this file, you should find entries in the form of ‘/handle/xxxx/yyyy=55’.
- Now start Tomcat, view an item, and if the handle appears in the downloaded stats, you should see the item count!
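To make the last two steps concrete, here is a minimal sketch of how an entry such as ‘/handle/xxxx/yyyy=55’ in the properties file maps to the count shown on the item page. The class name HitCountLookupSketch, the sample handle 123456789/42 and the /jspui context path are made up for illustration; the real lookup is done by GoogleAnalyticsHitCounter.getPageCount() in the listing at the end of this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative only: mimics the lookup against analyticscounts.properties.
class HitCountLookupSketch {

    /** Builds the key the way display-item.jsp does: context path + "/handle/" + handle. */
    static String pagePath(String contextPath, String handle) {
        return contextPath + "/handle/" + handle;
    }

    /** Returns the stored count for a page path, or "" when the path is unknown. */
    static String lookup(String propertiesText, String path) {
        Properties counts = new Properties();
        try {
            counts.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string
            throw new IllegalStateException(e);
        }
        return counts.getProperty(path, "");
    }

    public static void main(String[] args) {
        // Hypothetical file contents for a UI deployed under /jspui
        String file = "/jspui/handle/123456789/42=55\ntotal=55\n";
        String path = pagePath("/jspui", "123456789/42");
        System.out.println(path + " -> " + lookup(file, path));
    }
}
```

If the user interface is deployed as the ROOT webapp, the context path is simply the empty string and the key starts at /handle/.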
As with the DSpace video player solution I wrote about earlier this week, the code is not perfect, and needs to be improved a bit to make it solid, but is a good start if you wanted to use this type of solution. Enjoy!
package org.dspace.app.googleanalytics;
import java.io.IOException;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Properties;
import java.util.Calendar;
import java.util.Date;
import java.text.SimpleDateFormat;
import com.google.gdata.client.analytics.AnalyticsService;
import com.google.gdata.data.analytics.DataEntry;
import com.google.gdata.data.analytics.DataFeed;
import com.google.gdata.data.analytics.Metric;
import com.google.gdata.util.AuthenticationException;
import com.google.gdata.util.ServiceException;
import org.dspace.core.ConfigurationManager;
import org.apache.log4j.Logger;
public class GoogleAnalyticsHitCounter {
/** log4j category */
private static Logger log = Logger.getLogger(GoogleAnalyticsHitCounter.class);
/** Hit counter */
private static Properties counts;
/** When was the counter last loaded? */
private static Date lastloaded;
/** The filename of the counter file */
private static String filename;
/**
* Initialise the system
*/
public static void init()
{
// Load the properties file
Calendar yesterday = Calendar.getInstance();
yesterday.add(Calendar.DATE, -1);
lastloaded = yesterday.getTime();
filename = ConfigurationManager.getProperty("dspace.dir") +
System.getProperty("file.separator") +
ConfigurationManager.getProperty("googleanalytics.filename");
counts = new Properties();
loadCounter();
}
/**
* Get the count for a particular page (e.g. /handle/123/456)
*
* @param page The page path
* @return The count. Empty String if unknown
*/
public static String getPageCount(String page)
{
// Check we're initialised
if (lastloaded == null)
{
init();
}
// Reload the hits
loadCounter();
// Get the value
if (page == null)
{
page = "";
}
String count = counts.getProperty(page);
// Return the value
if (count != null)
{
return count;
}
return "";
}
/**
* (Re)load the counter. It is reloaded every hour.
*/
private static void loadCounter()
{
// Do we need to load it?
Calendar hourago = Calendar.getInstance();
hourago.add(Calendar.HOUR, -1);
if (lastloaded.before(hourago.getTime()))
{
try
{
counts.load(new FileReader(filename));
lastloaded = Calendar.getInstance().getTime();
}
catch (Exception e)
{
// Pass the exception along so the cause ends up in the log
log.warn("Unable to load google hit counter from " + filename, e);
}
}
}
/**
* Command line method to collect the statistics from Google Analytics.
*
* @param args No arguments used
*/
public static void main(String args[])
{
// Set up the variables
String username = ConfigurationManager.getProperty("googleanalytics.username");
String password = ConfigurationManager.getProperty("googleanalytics.password");
String siteid = ConfigurationManager.getProperty("googleanalytics.siteid");
String startdate = ConfigurationManager.getProperty("googleanalytics.startdate");
String handle = ConfigurationManager.getProperty("handle.prefix");
String root = ConfigurationManager.getProperty("dspace.url");
String filename = ConfigurationManager.getProperty("dspace.dir") +
System.getProperty("file.separator") +
ConfigurationManager.getProperty("googleanalytics.filename");
// Get the local path
String path = "";
try
{
URL localURL = new URL(root);
path = localURL.getPath();
if (path.endsWith("/"))
{
path = path.substring(0, path.length() - 1);
}
}
catch (MalformedURLException e)
{
System.err.println("Invalid dspace.url URL (" + root + ")");
return;
}
AnalyticsService as = new AnalyticsService("gaExportAPI_acctSample_v1.0");
String baseUrl = "https://www.google.com/analytics/feeds/";
// Login to Google
try {
as.setUserCredentials(username, password);
} catch (AuthenticationException e) {
System.err.println("Authentication failed : " + e.getMessage());
return;
}
// The results
Properties counts = new Properties();
// Keep requesting pages of results from Google until a blank page is found
// pages of 1,000 results at a time
URL queryUrl;
int i = 1;
boolean found = true;
int total = 0;
// Get stats up until yesterday
Calendar yesterday = Calendar.getInstance();
yesterday.add(Calendar.DATE, -1);
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
String enddate = format.format(yesterday.getTime());
while (found)
{
found = false;
try {
String q = baseUrl +
"data?start-index=" + i +
"&ids=ga:" + siteid +
"&start-date=" + startdate +
"&end-date=" + enddate +
"&metrics=ga:pageviews" +
"&dimensions=ga:pagePath" +
"&filters=ga:pagePath%3D~" + path + "/handle/" + handle + "/[0-9]%2B$";
queryUrl = new URL(q);
} catch (MalformedURLException e) {
System.err.println("Malformed URL: " + baseUrl);
return;
}
// Send our request to the Analytics API and wait for the results to come back
DataFeed dataFeed;
try {
dataFeed = as.getFeed(queryUrl, DataFeed.class);
} catch (IOException e) {
System.err.println("Network error trying to retrieve feed: " + e.getMessage());
return;
} catch (ServiceException e) {
System.err.println("Analytics API responded with an error message: " + e.getMessage());
return;
}
for (DataEntry entry : dataFeed.getEntries()) {
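// The page path follows a fixed-length prefix in the entry id; the offset of 70
// only holds while that prefix keeps the same length
String id = entry.getId().substring(70);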
id = id.substring(0, id.indexOf('&'));
for (Metric metric : entry.getMetrics()) {
counts.put(id, metric.getValue());
total = total + Integer.parseInt(metric.getValue());
}
found = true;
}
i = i + 1000;
}
// Save the properties file
counts.put("total", "" + total);
try
{
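// Close the stream explicitly so the file is flushed and released
FileOutputStream out = new FileOutputStream(filename);
counts.store(out, null);
out.close();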
System.out.println("Saved " + total + " total hits in " + filename);
}
catch (IOException e)
{
System.err.println("Error saving results to file: " + filename);
return;
}
}
}
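One fragile spot in the listing above is entry.getId().substring(70): the fixed offset only lines up while everything before the page path keeps the same length. A minimal sketch of a less brittle alternative follows, assuming (as the original code implies by searching for the next ‘&’) that the entry id carries the page path as one of several ‘&’-separated parts, and additionally assuming that the part is introduced by a ga:pagePath= marker. The class and method names are hypothetical, and the sample id in main is invented rather than captured from the API.

```java
// Illustrative only: extracts the page path from an entry id by searching for a marker
// instead of relying on a fixed character offset.
class EntryIdParserSketch {

    // Assumed marker that precedes the page path inside the entry id
    private static final String MARKER = "ga:pagePath=";

    /** Returns the page path embedded in an entry id, or null if the marker is absent. */
    static String pagePathFromEntryId(String entryId) {
        int start = entryId.indexOf(MARKER);
        if (start < 0) {
            return null;
        }
        start += MARKER.length();
        // The path runs up to the next '&', or to the end of the id if there is none
        int end = entryId.indexOf('&', start);
        return end < 0 ? entryId.substring(start) : entryId.substring(start, end);
    }

    public static void main(String[] args) {
        // Invented sample id, shaped like the one the original code appears to expect
        String sample = "http://www.google.com/analytics/feeds/data?ids=ga:1234567"
                + "&ga:pagePath=/handle/123456789/42&start-date=2007-07-17";
        System.out.println(pagePathFromEntryId(sample));
    }
}
```

Returning null for an unexpected id lets the caller skip that entry instead of storing a mangled key in the properties file.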
on May 29, 2009 at 9:58 am
Great work Stuart !
Is your experience with the API that it responds quickly or slowly?
Would be interested to compare whether it becomes slower for big numbers.
on May 29, 2009 at 6:56 pm
Hi Bram, The stats are downloaded in pages of 1,000 at a time, and this is done ‘offline’ by a daily cron job. So speed of response from the API isn’t really a problem. (At the moment, it seems to take about a second or so per 1,000 results.)
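The paging described in this reply can be sketched as below. This is an illustration rather than the code from the post: PageSource is a hypothetical stand-in for the call to the Analytics API, and the loop simply advances the start index by 1,000 until a page comes back empty.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the paging pattern, with the API call replaced by a fake source.
class PagedDownloadSketch {

    static final int PAGE_SIZE = 1000;

    /** Hypothetical stand-in for one API request; startIndex is 1-based. */
    interface PageSource {
        List<String> fetch(int startIndex, int maxResults);
    }

    /** Requests pages until an empty one is returned and collects every row. */
    static List<String> downloadAll(PageSource source) {
        List<String> all = new ArrayList<String>();
        int startIndex = 1;
        while (true) {
            List<String> page = source.fetch(startIndex, PAGE_SIZE);
            if (page.isEmpty()) {
                break;
            }
            all.addAll(page);
            startIndex += PAGE_SIZE;
        }
        return all;
    }

    /** In-memory source that serves consecutive slices of a fixed number of rows. */
    static PageSource fakeSource(final int totalRows) {
        return new PageSource() {
            public List<String> fetch(int startIndex, int maxResults) {
                List<String> page = new ArrayList<String>();
                for (int i = startIndex; i < startIndex + maxResults && i <= totalRows; i++) {
                    page.add("row" + i);
                }
                return page;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(downloadAll(fakeSource(2500)).size());
    }
}
```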
on September 18, 2009 at 6:14 pm
Hello Sir,
I am running DSpace locally.
I have successfully implemented the code.
But I am getting zero hits.
In my Google Analytics account profile
I do have Website URL: http://localhost:8080/dspace
on September 24, 2009 at 6:57 am
Have you set ‘googleanalytics.siteid = 123456789’ appropriately to your site id?
on September 24, 2009 at 5:55 pm
Hello Sir,
I have set googleanalytics.siteid = UA-10719208-1
and I do have profile id = 21619613
Which one do I need to set?
Q 2: Does the code work locally?
on September 28, 2009 at 9:24 am
For putting statistics on your site you use the site ID. Unfortunately I don’t think Google Analytics works when running your web site as http://localhost/
on October 27, 2009 at 10:16 pm
This is very interesting!
Although it fails to compile on my server – I get the error message “[INFO] Compilation failure/…../GoogleAnalyticsHitCounter.java:[96,6] load(java.io.InputStream) in java.util.Properties cannot be applied to (java.io.FileReader)”
It refers to the loadCounter() static method, but I am not quite sure what to look for here…
Java version is jdk1.5.0_15
on October 28, 2009 at 9:52 am
Hi Urban,
I think loading the contents of a Properties file using a FileReader was only introduced in Java 1.6.
Try changing the line:
counts.load(new FileReader(filename));
to
counts.load(new FileInputStream(new File(filename)));
You’ll also need extra import lines at the top of the file:
import java.io.File;
import java.io.FileInputStream;
Thanks,
Stuart
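Put together, the suggested change might look like the sketch below, which also closes the stream once the file has been read. The class and method names are invented for the example; only Properties.load(InputStream) and the two java.io classes come from the fix above.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Illustrative only: loading the counts file in a way that also compiles on Java 1.5.
class Java5PropertiesLoadSketch {

    /** Loads a properties file through an InputStream and always closes the stream. */
    static Properties load(String filename) throws IOException {
        Properties counts = new Properties();
        InputStream in = new FileInputStream(new File(filename));
        try {
            counts.load(in);
        } finally {
            in.close();
        }
        return counts;
    }

    /** Writes a one-line temporary counts file, loads it back and returns the count. */
    static String demo() {
        try {
            File tmp = File.createTempFile("analyticscounts", ".properties");
            tmp.deleteOnExit();
            FileWriter out = new FileWriter(tmp);
            out.write("/handle/123456789/42=55\n");
            out.close();
            return load(tmp.getPath()).getProperty("/handle/123456789/42");
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```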
on October 28, 2009 at 10:55 am
Works like a dream. Many thanks!
on October 30, 2009 at 11:04 pm
Hi,
[I think this is the right place to ask...],
where can I have a look at public DSpace GA stats?
on October 31, 2009 at 9:01 am
Hi Alessandra,
I don’t know if there are any public DSpace instances running this code.
Thanks,
Stuart
on December 17, 2009 at 5:13 pm
When I get the updated analyticscounts.properties, do I need to restart the jspui service in Tomcat to get the updated count?
on December 17, 2009 at 7:50 pm
Hi Gary,
I wrote the code a while ago, and haven’t looked at it for a while. Looking at it again, I think it should reload the data every hour, although it was never fully tested (was more of a proof of concept) so it might not work fully.
Thanks,
Stuart
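The hourly reload mentioned here boils down to a timestamp comparison. A small sketch of that check follows, using a hypothetical isStale helper rather than the post’s loadCounter() method:

```java
import java.util.Calendar;
import java.util.Date;

// Illustrative only: the "older than one hour" test behind the reload.
class HourlyReloadSketch {

    /** True when lastLoaded is more than one hour before now. */
    static boolean isStale(Date lastLoaded, Date now) {
        Calendar hourAgo = Calendar.getInstance();
        hourAgo.setTime(now);
        hourAgo.add(Calendar.HOUR, -1);
        return lastLoaded.before(hourAgo.getTime());
    }

    public static void main(String[] args) {
        Date now = new Date();
        Date twoHoursAgo = new Date(now.getTime() - 2L * 60 * 60 * 1000);
        Date tenMinutesAgo = new Date(now.getTime() - 10L * 60 * 1000);
        System.out.println("Loaded two hours ago, stale: " + isStale(twoHoursAgo, now));
        System.out.println("Loaded ten minutes ago, stale: " + isStale(tenMinutesAgo, now));
    }
}
```

With a check like this, a freshly written counts file should be picked up within an hour of the previous load, without restarting the webapp.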
on February 6, 2010 at 12:35 pm
Dear Stuart
I have adapted this code to use with the XMLUI interface.
I actually store the hit counts in a dc field so I can browse items by hit count. Recently the code stopped working, giving me an error: Authentication failed : Captcha required
The code still works on an instance of DSpace on my MacBook but not on the server of the Institute.
Any hints on how to solve this?
Paulo
on February 6, 2010 at 12:44 pm
Hi Paulo,
I’ve not seen this error myself before. http://code.google.com/apis/gdata/docs/auth/clientlogin.html#Examples looks useful. I think it may be that the API is really intended to present data to a user, rather than to a system that uses it, so a user would be able to re-authenticate and solve the Captcha.
Another Google help page also suggests:
If a user supplies an incorrect username or password, or a similar error occurs, the AuthenticationException is thrown. If your application uses ClientLogin to authorize, and a program requests a token too frequently, the user is presented with a captcha challenge response. (links to the URL above).
I hope that helps,
Stuart
on February 10, 2010 at 1:06 am
Hi Paulo and others looking at incorporating the Google Analytics API into DSpace. Starting from some of the code on this page we at OpenRepository.com have been able to get a pretty nice result for the statistics of our repositories. For some background on what and how we did it, click through to here: http://openrepository.com/products/enhanced-statistics From there you can click through to the demo repository and see some examples of the API in action. Hope this gives you all some impetus and hope as to what can be achieved.
Bests,
Michael