Surfacing Google Analytics stats in DSpace

In the recent survey asking the DSpace community for their top 3 feature requests for DSpace 1.6, the number one most requested feature was statistics. As you’ll know from previous posts, I’m a big fan of Google Analytics.

For the uninitiated, you insert a small bit of JavaScript in your web pages, and Google provide a very rich and powerful analytics service for viewing your site statistics.

Recently Google announced the launch of an analytics API that allows you to remotely query and download the statistics it holds about your site.

I like playing with APIs, so I thought I’d write a solution that downloads item splashscreen view statistics from Google Analytics and displays them on the item page:

[Screenshot: gajspui]

The solution is quite simple. It requires the addition of one Java class to DSpace. This class should be run daily to download the statistics. The same class is used by the user interface to display the statistics. If you want to implement this solution, follow the instructions below:

  • Create a new directory (java package) at [dspace-src]/dspace-api/src/main/java/org/dspace/app/googleanalytics
  • Download the code shown at the bottom of this post, and save it as GoogleAnalyticsHitCounter.java in the directory that you just created.
  • Edit [dspace-src]/dspace-api/pom.xml to add in the dependencies on the Google API libraries:
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>gdata-core</artifactId>
<version>1.0</version>
</dependency>

<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>gdata-analytics</artifactId>
<version>1.0</version>
</dependency>

<dependency>
<groupId>com.google.collect</groupId>
<artifactId>google-collect</artifactId>
<version>1.0</version>
</dependency>
  • Then download gdata-src.java-1.32.1.zip and extract (somewhere handy) the jar files: gdata-core-1.0.jar, gdata-analytics-1.0.jar, google-collect-1.0.jar (it appears in the zip file as google-collect-1.0-rc1.jar)
  • Install each of these by running the following Maven commands, adjusting paths as appropriate:
    • mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-core -Dversion=1.0 -Dfile=gdata-core-1.0.jar -Dpackaging=jar
    • mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-analytics -Dversion=1.0 -Dfile=gdata-analytics-1.0.jar -Dpackaging=jar
    • mvn install:install-file -DgroupId=com.google.collect -DartifactId=google-collect -Dversion=1.0 -Dfile=google-collect-1.0.jar -Dpackaging=jar
  • Next, edit [dspace-src]/dspace-jspui/dspace-jspui-webapp/src/main/webapp/display-item.jsp, and somewhere in the code (choose where you want it), add the following code:
<%
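    // See if we can display a counter. The class name is fully qualified
    // because display-item.jsp does not import the googleanalytics package.
    String path = "/handle/" + item.getHandle();
    String count = org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter.getPageCount(path);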
    if ((count != null) && (!"".equals(count)))
    {
%>
        <table align="center" class="miscTable">
            <tr>
                <td class="oddRowEvenCol" align="center">
                    This item has been viewed <strong><%= count %></strong> times
                </td>
            </tr>
        </table>
<%
    }
%>
  • If you don’t deploy your user interface as the ROOT webapp, then you’ll have to add the context path to the line: String path = "/handle/" + item.getHandle();
  • Now build and deploy DSpace as you would normally (mvn package; ant update; etc…)
  • Edit dspace.cfg and add in the following entries:
    • googleanalytics.username = your-google-analytics@email.address.com
    • googleanalytics.password = your-google-analytics-password
    • googleanalytics.siteid = 123456789
    • googleanalytics.filename = analyticscounts.properties
    • googleanalytics.startdate = 2007-07-17
  • Adjust the email address and password as appropriate.
  • Log in to Google Analytics and find out the first date that you have statistics for. Set this in the start date entry, in the form of yyyy-mm-dd
  • View the dashboard of your Google Analytics, and look at the URL. Part of it will include ‘id=nnnnnnn’. Copy the id number and enter it in the dspace.cfg siteid entry.
  • Download and compile your statistics by running (from [dspace]/bin/)
    • dsrun org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter
  • If everything worked as it should, you should now have a file [dspace]/analyticscounts.properties. If you look in this file, you will find entries in the form of ‘/handle/xxxx/yyyy=55’.
  • Now start tomcat, view an item, and if the handle appears in the downloaded stats, you should see the item count!
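
To make the last few steps concrete, here is a minimal sketch of how the path passed to getPageCount lines up with the keys in analyticscounts.properties. It is not part of the class below: the class name HitCountLookupSketch, both helper methods, and the sample counts are invented for illustration. It assumes the keys Google returns include the webapp context (for example /jspui), which in a JSP would normally come from request.getContextPath().

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class HitCountLookupSketch {

    /** Build the lookup key: webapp context ("" for ROOT) + "/handle/" + handle. */
    public static String buildPagePath(String contextPath, String handle) {
        String context = (contextPath == null) ? "" : contextPath;
        // Drop a trailing slash so we never produce "//handle/..."
        if (context.endsWith("/")) {
            context = context.substring(0, context.length() - 1);
        }
        return context + "/handle/" + handle;
    }

    /** Look up a count, returning an empty String when the page is unknown. */
    public static String lookup(Properties counts, String pagePath) {
        return counts.getProperty(pagePath, "");
    }

    public static void main(String[] args) throws IOException {
        // Invented sample data in the same key=value form as the downloaded file
        String sample = "/handle/123/456=55\n"
                      + "/jspui/handle/123/789=7\n"
                      + "total=62\n";
        Properties counts = new Properties();
        counts.load(new StringReader(sample));

        // Deployed as ROOT: the context is empty
        System.out.println(lookup(counts, buildPagePath("", "123/456")));
        // Deployed as /jspui: the context must be part of the key
        System.out.println(lookup(counts, buildPagePath("/jspui", "123/789")));
        // Unknown handle: empty String, so the JSP would show no counter
        System.out.println("[" + lookup(counts, buildPagePath("", "123/000")) + "]");
    }
}
```

If the counter never appears for any item, comparing the key built this way with the entries in the properties file is a quick first check.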

As with the DSpace video player solution I wrote about earlier this week, the code is not perfect, and needs to be improved a bit to make it solid, but is a good start if you wanted to use this type of solution. Enjoy!

package org.dspace.app.googleanalytics;

import java.io.IOException;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Properties;
import java.util.Calendar;
import java.util.Date;
import java.text.SimpleDateFormat;

import com.google.gdata.client.analytics.AnalyticsService;
import com.google.gdata.data.analytics.DataEntry;
import com.google.gdata.data.analytics.DataFeed;
import com.google.gdata.data.analytics.Metric;
import com.google.gdata.util.AuthenticationException;
import com.google.gdata.util.ServiceException;
import org.dspace.core.ConfigurationManager;
import org.apache.log4j.Logger;

public class GoogleAnalyticsHitCounter {

    /** log4j category */
    private static Logger log = Logger.getLogger(GoogleAnalyticsHitCounter.class);

    /** Hit counter */
    private static Properties counts;

    /** When was the counter last loaded? */
    private static Date lastloaded;

    /** The filename of the counter file */
    private static String filename;

    /**
     * Initialise the system
     */
    public static void init()
    {
        // Pretend the last load was yesterday so that the first call to
        // loadCounter() reads the properties file
        Calendar yesterday = Calendar.getInstance();
        yesterday.add(Calendar.DATE, -1);
        lastloaded = yesterday.getTime();
        filename = ConfigurationManager.getProperty("dspace.dir") +
                   System.getProperty("file.separator") +
                   ConfigurationManager.getProperty("googleanalytics.filename");
        counts = new Properties();
        loadCounter();
    }

    /**
     * Get the count for a particular page (e.g. /handle/123/456)
     *
     * @param page The page path
     * @return The count. Empty String if unknown
     */
    public static String getPageCount(String page)
    {
        // Check we're initialised
        if (lastloaded == null)
        {
            init();
        }

        // Reload the hits
        loadCounter();

        // Get the value
        if (page == null)
        {
            page = "";
        }
        String count = counts.getProperty(page);

        // Return the value
        if (count != null)
        {
            return count;
        }
        return "";
    }

    /**
     * (Re)load the counter. It is reloaded every hour.
     */
    private static void loadCounter()
    {
        // Do we need to load it?
        Calendar hourago = Calendar.getInstance();
        hourago.add(Calendar.HOUR, -1);
        if (lastloaded.before(hourago.getTime()))
        {
            try
            {
                counts.load(new FileReader(filename));
                lastloaded = Calendar.getInstance().getTime();
            }
            catch (Exception e)
            {
                log.warn("Unable to load google hit counter from " + filename);
            }
        }
    }

    /**
     * Command line method to collect the statistics from Google Analytics.
     *
     * @param args No arguments used
     */
    public static void main(String args[])
    {
        // Set up the variables
        String username = ConfigurationManager.getProperty("googleanalytics.username");
        String password = ConfigurationManager.getProperty("googleanalytics.password");
        String siteid = ConfigurationManager.getProperty("googleanalytics.siteid");
        String startdate = ConfigurationManager.getProperty("googleanalytics.startdate");
        String handle = ConfigurationManager.getProperty("handle.prefix");
        String root = ConfigurationManager.getProperty("dspace.url");
        String filename = ConfigurationManager.getProperty("dspace.dir") +
                          System.getProperty("file.separator") +
                          ConfigurationManager.getProperty("googleanalytics.filename");

        // Get the local path (the context part of dspace.url, without a trailing slash)
        String path = "";
        try
        {
            URL localURL = new URL(root);
            path = localURL.getPath();
            if (path.endsWith("/"))
            {
                path = path.substring(0, path.length() - 1);
            }
        }
        catch (MalformedURLException e)
        {
            System.err.println("Invalid dspace.url URL (" + root + ")");
            return;
        }

        AnalyticsService as = new AnalyticsService("gaExportAPI_acctSample_v1.0");
        String baseUrl = "https://www.google.com/analytics/feeds/";

        // Login to Google
        try {
            as.setUserCredentials(username, password);
        } catch (AuthenticationException e) {
            System.err.println("Authentication failed : " + e.getMessage());
            return;
        }

        // The results
        Properties counts = new Properties();

        // Keep requesting pages of results from Google until a blank page is found
        // pages of 1,000 results at a time
        URL queryUrl;
        int i = 1;
        boolean found = true;
        int total = 0;

        // Get stats up until yesterday
        Calendar yesterday = Calendar.getInstance();
        yesterday.add(Calendar.DATE, -1);
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        String enddate = format.format(yesterday.getTime());

        while (found)
        {
            found = false;
            try {
                // Only item pages are requested: [path]/handle/[prefix]/[digits]
                String q = baseUrl +
                           "data?start-index=" + i +
                           "&ids=ga:" + siteid +
                           "&start-date=" + startdate +
                           "&end-date=" + enddate +
                           "&metrics=ga:pageviews" +
                           "&dimensions=ga:pagePath" +
                           "&filters=ga:pagePath%3D~" + path + "/handle/" + handle + "/[0-9]%2B$";
                queryUrl = new URL(q);
            } catch (MalformedURLException e) {
                System.err.println("Malformed URL: " + baseUrl);
                return;
            }

            // Send our request to the Analytics API and wait for the results to come back
            DataFeed dataFeed;
            try {
                dataFeed = as.getFeed(queryUrl, DataFeed.class);
            } catch (IOException e) {
                System.err.println("Network error trying to retrieve feed: " + e.getMessage());
                return;
            } catch (ServiceException e) {
                System.err.println("Analytics API responded with an error message: " + e.getMessage());
                return;
            }

            for (DataEntry entry : dataFeed.getEntries()) {
                // The page path starts at a fixed offset in the entry id
                // and runs up to the next '&'
                String id = entry.getId().substring(70);
                id = id.substring(0, id.indexOf('&'));
                for (Metric metric : entry.getMetrics()) {
                    counts.put(id, metric.getValue());
                    total = total + Integer.parseInt(metric.getValue());
                }
                found = true;
            }

            // Move on to the next page of 1,000 results
            i = i + 1000;
        }

        // Save the properties file
        counts.put("total", "" + total);
        try
        {
            counts.store(new FileOutputStream(filename), null);
            System.out.println("Saved " + total + " total hits in " + filename);
        }
        catch (IOException e)
        {
            System.err.println("Error saving results to file: " + filename);
            return;
        }
    }
}
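
One fragile spot in the class above is entry.getId().substring(70), which relies on the page path always starting at character 70 of the entry id. A longer or shorter prefix (for example, a site id with a different number of digits) would presumably shift it. Below is a minimal sketch of a less position-dependent alternative. It assumes the entry id contains a ga:pagePath= marker followed by the path and then an '&'; that format is inferred from the substring logic above, not taken from Google's documentation. The class name, method name and sample ids are all made up for illustration.

```java
public class EntryIdPathSketch {

    // Assumed marker that precedes the page path inside the entry id
    private static final String MARKER = "ga:pagePath=";

    /** Return the page path from an entry id, or null if the marker is missing. */
    public static String extractPagePath(String entryId) {
        if (entryId == null) {
            return null;
        }
        int start = entryId.indexOf(MARKER);
        if (start < 0) {
            return null;
        }
        start += MARKER.length();
        // The path ends at the next '&', or at the end of the id if there is none
        int end = entryId.indexOf('&', start);
        return (end < 0) ? entryId.substring(start) : entryId.substring(start, end);
    }

    public static void main(String[] args) {
        // Invented ids in the assumed format, with 7 and 8 digit site ids
        String sevenDigits = "http://www.google.com/analytics/feeds/data?ids=ga:1234567"
                + "&ga:pagePath=/handle/123/456&start-date=2007-07-17&end-date=2009-05-01";
        String eightDigits = "http://www.google.com/analytics/feeds/data?ids=ga:12345678"
                + "&ga:pagePath=/handle/123/456&start-date=2007-07-17&end-date=2009-05-01";

        System.out.println(extractPagePath(sevenDigits));
        System.out.println(extractPagePath(eightDigits));
    }
}
```

In the download loop, a null result could simply be skipped rather than stored, so that an unexpected id format does not write a garbled key into the properties file.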

28 thoughts on “Surfacing Google Analytics stats in DSpace”

  1. Bram Luyten

    Great work Stuart !
    Is your experience with the API that it responds quickly or slowly?

    Would be interested to compare whether it becomes slower for big numbers.

  2. stuart Post author

    Hi Bram, the stats are downloaded in pages of 1,000 at a time, and this is done ‘offline’ by a daily cron job. So the speed of response from the API isn’t really a problem. (At the moment, it seems to take about a second or so per 1,000 results.)

  3. Hardik Mishra

    Hello Sir,

    I have set googleanalytics.siteid = UA-10719208-1
    and i do have profile id = 21619613

    Which one do I need to set?

    Q 2: Does the code work locally?

  4. Urban Andersson

    This is very interesting!

    Although it fails to compile on my server – I get the error message “[INFO] Compilation failure/…../GoogleAnalyticsHitCounter.java:[96,6] load(java.io.InputStream) in java.util.Properties cannot be applied to (java.io.FileReader)”

    It refers to the loadCounter() static method, but I am not quite sure what to look for here…
    Java version is jdk1.5.0_15

  5. Stuart Post author

    Hi Urban,

    I think loading the contents of a Properties file using a FileReader was only introduced in Java 1.6.

    Try changing the line:

    counts.load(new FileReader(filename));

    to

    counts.load(new FileInputStream(new File(filename)));

    You’ll also need extra import lines at the top of the file:

    import java.io.File;
    import java.io.FileInputStream;

    Thanks,

    Stuart

  6. Gary

    When I have the updated analyticscounts.properties, do I need to restart the jspui service in Tomcat to get the updated count?

  7. Stuart Post author

    Hi Gary,

    I wrote the code a while ago, and haven’t looked at it for a while. Looking at it again, I think it should reload the data every hour, although it was never fully tested (was more of a proof of concept) so it might not work fully.

    Thanks,

    Stuart

  8. Paulo Jobim

    Dear Stuart
    I have adapted this code to use with the xmlui interface.
    I actually store the hit counts in a dc field so I can browse items by hit counts. Recently the code stopped working, giving me an error: Authentication failed : Captcha required
    The code still works on an instance of DSpace on my MacBook but not on the server of the Institute.
    Any hints on how to solve this?
    Paulo

  9. Stuart Post author

    Hi Paulo,

    I’ve not seen this error myself before. http://code.google.com/apis/gdata/docs/auth/clientlogin.html#Examples looks useful. I think it may be that the API is really intended to present data to a user, rather than to a system that uses it, so a user would be able to re-authenticate and solve the Captcha.

    Another Google help page also suggests:

    If a user supplies an incorrect username or password, or a similar error occurs, the AuthenticationException is thrown. If your application uses ClientLogin to authorize, and a program requests a token too frequently, the user is presented with a captcha challenge response. (links to the URL above).

    I hope that helps,

    Stuart

  10. Michael Guthrie

    Hi Paulo and others looking at incorporating the Google Analytics API into DSpace. Starting from some of the code on this page we at OpenRepository.com have been able to get a pretty nice result for the statistics of our repositories. For some background on what and how we did it, click through to here: http://openrepository.com/products/enhanced-statistics From there you can click through to the demo repository and see some examples of the API in action. Hope this gives you all some impetus and hope as to what can be achieved.
    Bests,
    Michael

  11. Librarian

    hi,
    I was trying to add this, and while running the mvn commands I get the following build error.

    After creating the directory I downloaded GoogleAnalyticsHitCounter.java, saved the gdata-src zip in the [dspace-src]/dspace-api/src/main/java/org/dspace/app/ folder, and extracted gdata-core-1.0.jar to the same app folder. Then I tried to run the command:
    C:\dspace\bin>mvn install:install-file -DgroupId=com.google.gdata -DartifactId=g
    data-core -Dversion=1.0 -Dfile=gdata-core-1.0.jar -Dpackaging=jar
    [INFO] Scanning for projects…
    [INFO] Searching repository for plugin with prefix: ‘install’.
    [INFO] ————————————————————————
    [INFO] Building Maven Default Project
    [INFO] task-segment: [install:install-file] (aggregator-style)
    [INFO] ————————————————————————
    [INFO] [install:install-file]
    [INFO] Installing C:\dspace\bin\gdata-core-1.0.jar to C:\Documents and Settings\
    Administrator\.m2\repository\com\google\gdata\gdata-core\1.0\gdata-core-1.0.jar
    [INFO] ————————————————————————
    [ERROR] BUILD ERROR
    [INFO] ————————————————————————
    [INFO] Error installing artifact ‘com.google.gdata:gdata-core:jar': Error instal
    ling artifact: File C:\dspace\bin\gdata-core-1.0.jar does not exist

    [INFO] ————————————————————————
    [INFO] For more information, run Maven with the -e switch
    [INFO] ————————————————————————
    [INFO] Total time: 1 second
    [INFO] Finished at: Fri Apr 02 15:53:09 GMT+05:30 2010
    [INFO] Final Memory: 3M/254M
    [INFO] ————————————————————————
    C:\dspace\bin>

    Kindly let me know as early as possible what could be the reason I am getting this error.

    Thanking you in advance
    with regards

  12. Stuart Post author

    Hi,

    You need to adjust this command:

    C:\dspace\bin>mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-core -Dversion=1.0 -Dfile=gdata-core-1.0.jar -Dpackaging=jar

    Change the part that reads ‘-Dfile=gdata-core-1.0.jar’ and add a path, e.g. ‘-Dfile=C:\dspace-src\dspace-api\src\main\java\org\dspace\app\gdata-core-1.0.jar’ (adjust it as appropriate). However, you don’t need to unzip the files in your dspace source directory as they don’t really belong there. Unzip them in a temporary location, then you can delete them once you have run the ‘mvn install:install-file’ command.

    Thanks,

    Stuart

  13. Shashidhar Chaturvedi

    Hi Sir,
    I am using DSpace 1.6. I want to implement captcha in my JSPUI interface.
    Please guide me on how I can implement captcha in the new user registration page in the JSPUI interface.

    Thanks
    Shashidhar Chaturvedi

  14. Nikhil George

    Hello ,
    I made all the files and added all the necessary configuration.
    Everything went fine until I ran
    dsrun org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter

    I get the following error:

    Error in launcher.xml: Invalid class name: org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter.java

    I’m running dspace1.6 and JSPUI interface
    Please help to rectify this error..

    Thanks in advance

  15. Stuart Post author

    Hi Nikhil. Did you run ‘mvn package’, ‘cd dspace/target/dspace-1.6.0-SNAPSHOT-build.dir/’, then ‘ant update’ ?

  16. Nikhil George

    Yes, I first installed all 3 jar files using mvn install, then I ran mvn package and ant update; both were successful. I don’t have a clue what the problem is. Please advise.

  17. Nikhil George

    Hello,
    Thank you Stuart for giving me the previous hint.
    I rebuilt DSpace and the dsrun command now runs fine, but I have run into another problem: whenever I click on an item it shows “Internal Server Error” instead of the item description.
    I checked the dspace log and it looks like this:

    An error occurred at line: 286 in the jsp file: /display-item.jsp
    GoogleAnalyticsHitCounter cannot be resolved
    283:

    I have my dspace in …/tomcat/web2/ROOT.
    Please help..

  18. Nikhil George

    Hello Stuart,
    I think you didn’t understand my error, so I’ll try to make myself clearer. When I click on an item title to view it, I get “Internal Server Error”. I checked the dspace log and found this error:
    An error occurred at line: 286 in the jsp file: /display-item.jsp
    GoogleAnalyticsHitCounter cannot be resolved
    285:String path = “http://dspace.jubileemission.in/handle/” + item.getHandle();
    286:String count = GoogleAnalyticsHitCounter.getPageCount(path);

    It says that DSpace can’t resolve GoogleAnalyticsHitCounter in the line String count = GoogleAnalyticsHitCounter.getPageCount(path);

    Please help. I think I’m just a step away from implementing Google Analytics in my DSpace. Please advise.

  19. Stuart Post author

    Hi Nikhil,

    I’m afraid that I’m not sure what is causing your error. Since the dsrun command is working, GoogleAnalyticsHitCounter must exist, so without access to your server I can’t really look to see what is going wrong. If you want to check it exists in your JSPUI, you can look in [tomcat]/webapps/jspui/WEB-INF/lib/, and run ‘zipinfo dspace-api-*-SNAPSHOT.jar | grep GoogleAnalyticsHitCounter’ to check that the class exists in the dspace-api jar file.

    Thanks,

    Stuart

  20. Elvi

    Hello Stuart,

    How can you apply the code for display-item.jsp in item-view.xsl if I am using xmlui instead of jspui?

    Thanks.

  21. Pingback: Some thoughts on developing my first Confluence plugin -
