Monthly Archives: May 2009

Surfacing Google Analytics stats in DSpace

In the recent survey asking the DSpace community for their top 3 feature requests for DSpace 1.6, the number one most requested feature was statistics. As you’ll know from previous posts, I’m a big fan of Google Analytics.

For the uninitiated, you insert a small bit of JavaScript in your web pages, and Google provide a very rich and powerful analytics service for viewing your site statistics.

Recently Google announced the launch of an analytics API that allows you to remotely query and download the statistics it holds about your site.

I like playing with APIs, so thought I’d write a solution that downloads item splashscreen view statistics from Google Analytics and displays them on the item page:

[Screenshot: the view count displayed on a DSpace item page]

The solution is quite simple. It requires the addition of one Java class to DSpace. This class should be run daily to download the statistics. The same class is used by the user interface to display the statistics. If you want to implement this solution, follow the instructions below:

  • Create a new directory (java package) at [dspace-src]/dspace-api/src/main/java/org/dspace/app/googleanalytics
  • Download the code shown at the bottom of this post, and save it as GoogleAnalyticsHitCounter.java in the directory that you just created.
  • Edit [dspace-src]/dspace-api/pom.xml to add in the dependencies on the Google API libraries:
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>gdata-core</artifactId>
<version>1.0</version>
</dependency>

<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>gdata-analytics</artifactId>
<version>1.0</version>
</dependency>

<dependency>
<groupId>com.google.collect</groupId>
<artifactId>google-collect</artifactId>
<version>1.0</version>
</dependency>
  • Then download gdata-src.java-1.32.1.zip, and extract the following jar files to somewhere handy: gdata-core-1.0.jar, gdata-analytics-1.0.jar, and google-collect-1.0.jar (which appears in the zip file as google-collect-1.0-rc1.jar)
  • Install each of these by running the following Maven commands, adjusting paths as appropriate:
    • mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-core -Dversion=1.0 -Dfile=gdata-core-1.0.jar -Dpackaging=jar
    • mvn install:install-file -DgroupId=com.google.gdata -DartifactId=gdata-analytics -Dversion=1.0 -Dfile=gdata-analytics-1.0.jar -Dpackaging=jar
    • mvn install:install-file -DgroupId=com.google.collect -DartifactId=google-collect -Dversion=1.0 -Dfile=google-collect-1.0.jar -Dpackaging=jar
  • Next, edit [dspace-src]/dspace-jspui/dspace-jspui-webapp/src/main/webapp/display-item.jsp, and somewhere in the code (choose where you want it), add the following code:
<%@ page import="org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter" %>
<%
    // See if we can display a counter
    String path = "/handle/" + item.getHandle();
    String count = GoogleAnalyticsHitCounter.getPageCount(path);
    if ((count != null) && (!"".equals(count)))
    {
%>
        <table align="center" class="miscTable">
            <tr>
                <td class="oddRowEvenCol" align="center">
                    This item has been viewed <strong><%= count %></strong> times
                </td>
            </tr>
        </table>
<%
    }
%>
  • If you don’t deploy your user interface as the ROOT webapp, then you’ll have to add the context path in the line: String path = "/handle/" + item.getHandle();
  • Now build and deploy DSpace as you would normally (mvn package; ant update; etc…)
  • Edit dspace.cfg and add in the following entries:
    • googleanalytics.username = your-google-analytics@email.address.com
    • googleanalytics.password = your-google-analytics-password
    • googleanalytics.siteid = 123456789
    • googleanalytics.filename = analyticscounts.properties
    • googleanalytics.startdate = 2007-07-17
  • Adjust the email address and password as appropriate.
  • Log in to Google Analytics and find out the first date that you have statistics for. Set this in the startdate entry, in the form yyyy-mm-dd.
  • View the dashboard of your Google Analytics account, and look at the URL. Part of it will include ‘id=nnnnnnn’. Copy the id number and enter it in the dspace.cfg siteid entry.
  • Download and compile your statistics by running (from [dspace]/bin/)
    • dsrun org.dspace.app.googleanalytics.GoogleAnalyticsHitCounter
  • If everything worked as it should, you should now have a file [dspace]/analyticscounts.properties. If you look in this file, you will find entries in the form of ‘/handle/xxxx/yyyy=55’.
  • Now start Tomcat, view an item, and if the handle appears in the downloaded stats, you should see the item count!
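The lookup that the JSP snippet relies on can be sketched in isolation. This is a minimal sketch rather than the code from the class listed in this post: the class name HitCountLookup and its methods are hypothetical, and the sample entries are made up for illustration. It parses entries in the ‘/handle/xxxx/yyyy=55’ form with java.util.Properties, and shows where the context path would be prepended when the user interface is not deployed as the ROOT webapp.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, for illustration only (not part of DSpace)
class HitCountLookup {

    // Parse text in the same key=value form as analyticscounts.properties
    static Properties parse(String text) {
        Properties counts = new Properties();
        try {
            counts.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string
            throw new IllegalStateException(e);
        }
        return counts;
    }

    // Build the lookup key: context path ("" for the ROOT webapp) + /handle/ + handle
    static String pagePath(String contextPath, String handle) {
        return contextPath + "/handle/" + handle;
    }

    // Return the stored count, or an empty string if the page is unknown
    static String lookup(Properties counts, String contextPath, String handle) {
        return counts.getProperty(pagePath(contextPath, handle), "");
    }

    public static void main(String[] args) {
        // Made-up sample data in the format of the downloaded file
        Properties counts = parse("/handle/123/456=55\n/dspace/handle/123/789=7\n");
        System.out.println(lookup(counts, "", "123/456"));        // prints 55
        System.out.println(lookup(counts, "/dspace", "123/789")); // prints 7
    }
}
```

Returning an empty string for unknown pages mirrors the check in the JSP snippet, which only shows the counter when the count is neither null nor empty.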

As with the DSpace video player solution I wrote about earlier this week, the code is not perfect, and needs to be improved a bit to make it solid, but is a good start if you wanted to use this type of solution. Enjoy!

package org.dspace.app.googleanalytics;

import java.io.IOException;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Properties;
import java.util.Calendar;
import java.util.Date;
import java.text.SimpleDateFormat;

import com.google.gdata.client.analytics.AnalyticsService;
import com.google.gdata.data.analytics.DataEntry;
import com.google.gdata.data.analytics.DataFeed;
import com.google.gdata.data.analytics.Metric;
import com.google.gdata.util.AuthenticationException;
import com.google.gdata.util.ServiceException;
import org.dspace.core.ConfigurationManager;
import org.apache.log4j.Logger;

public class GoogleAnalyticsHitCounter {

/** log4j category */
private static Logger log = Logger.getLogger(GoogleAnalyticsHitCounter.class);

/** Hit counter */
private static Properties counts;

/** When the counter was last loaded */
private static Date lastloaded;

/** The filename of the counter file */
private static String filename;

/**
* Initialise the system
*/
public static void init()
{
// Load the properties file
Calendar yesterday = Calendar.getInstance();
yesterday.add(Calendar.DATE, -1);
lastloaded = yesterday.getTime();
filename = ConfigurationManager.getProperty("dspace.dir") +
System.getProperty("file.separator") +
ConfigurationManager.getProperty("googleanalytics.filename");
counts = new Properties();
loadCounter();
}

/**
* Get the count for a particular page (e.g. /handle/123/456)
*
* @param page The page path
* @return The count. Empty String if unknown
*/
public static String getPageCount(String page)
{
// Check we're initialised
if (lastloaded == null)
{
init();
}

// Reload the hits
loadCounter();

// Get the value
if (page == null)
{
page = "";
}
String count = counts.getProperty(page);

// Return the value
if (count != null)
{
return count;
}
return "";
}

/**
* (Re)load the counter. It is reloaded every hour.
*/
private static void loadCounter()
{
// Do we need to load it?
Calendar hourago = Calendar.getInstance();
hourago.add(Calendar.HOUR, -1);
if (lastloaded.before(hourago.getTime()))
{
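FileReader reader = null;
try
{
reader = new FileReader(filename);
counts.load(reader);
lastloaded = Calendar.getInstance().getTime();
}
catch (Exception e)
{
log.warn("Unable to load google hit counter from " + filename, e);
}
finally
{
// Always release the file handle, even if loading failed
if (reader != null)
{
try { reader.close(); } catch (IOException ioe) { /* nothing more we can do */ }
}
}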
}
}

/**
* Command line method to collect the statistics from Google Analytics.
*
* @param args No arguments used
*/
public static void main(String args[])
{
// Set up the variables
String username = ConfigurationManager.getProperty("googleanalytics.username");
String password = ConfigurationManager.getProperty("googleanalytics.password");
String siteid = ConfigurationManager.getProperty("googleanalytics.siteid");
String startdate = ConfigurationManager.getProperty("googleanalytics.startdate");
String handle = ConfigurationManager.getProperty("handle.prefix");
String root = ConfigurationManager.getProperty("dspace.url");
String filename = ConfigurationManager.getProperty("dspace.dir") +
System.getProperty("file.separator") +
ConfigurationManager.getProperty("googleanalytics.filename");

// Get the local path
String path = "";
try
{
URL localURL = new URL(root);
path = localURL.getPath();
if (path.endsWith("/"))
{
path = path.substring(0, path.length() - 1);
}
}
catch (MalformedURLException e)
{
System.err.println("Invalid dspace.url URL (" + root + ")");
return;
}

AnalyticsService as = new AnalyticsService("gaExportAPI_acctSample_v1.0");
String baseUrl = "https://www.google.com/analytics/feeds/";

// Login to Google
try {
as.setUserCredentials(username, password);
} catch (AuthenticationException e) {
System.err.println("Authentication failed : " + e.getMessage());
return;
}

// The results
Properties counts = new Properties();

// Keep requesting pages of results from Google until a blank page is found
// pages of 1,000 results at a time
URL queryUrl;
int i = 1;
boolean found = true;
int total = 0;

// Get stats up until yesterday
Calendar yesterday = Calendar.getInstance();
yesterday.add(Calendar.DATE, -1);
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
String enddate = format.format(yesterday.getTime());

while (found)
{
found = false;
try {
String q = baseUrl +
"data?start-index=" + i +
"&ids=ga:" + siteid +
"&start-date=" + startdate +
"&end-date=" + enddate +
"&metrics=ga:pageviews" +
"&dimensions=ga:pagePath" +
"&filters=ga:pagePath%3D~" + path + "/handle/" + handle + "/[0-9]%2B$";
queryUrl = new URL(q);
} catch (MalformedURLException e) {
System.err.println("Malformed URL: " + baseUrl);
return;
}

// Send our request to the Analytics API and wait for the results to come back
DataFeed dataFeed;
try {
dataFeed = as.getFeed(queryUrl, DataFeed.class);
} catch (IOException e) {
System.err.println("Network error trying to retrieve feed: " + e.getMessage());
return;
} catch (ServiceException e) {
System.err.println("Analytics API responded with an error message: " + e.getMessage());
return;
}

for (DataEntry entry : dataFeed.getEntries()) {
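// The page path is read from a fixed offset in the entry ID, up to the next '&'.
// The offset of 70 is hard-coded, so check it against the IDs returned for your profile.
String id = entry.getId().substring(70);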
id = id.substring(0, id.indexOf('&'));
for (Metric metric : entry.getMetrics()) {
counts.put(id, metric.getValue());
total = total + Integer.parseInt(metric.getValue());
}
found = true;
}

i = i + 1000;
}

// Save the properties file
counts.put("total", "" + total);
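FileOutputStream out = null;
try
{
out = new FileOutputStream(filename);
counts.store(out, null);
System.out.println("Saved " + total + " total hits in " + filename);
}
catch (IOException e)
{
System.err.println("Error saving results to file: " + filename);
return;
}
finally
{
// Always close the output file so the counts are flushed to disk
if (out != null)
{
try { out.close(); } catch (IOException ioe) { /* nothing more we can do */ }
}
}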
}
}
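The class above reads the page path out of each entry ID with a fixed offset (substring(70)), which presumably only works while everything in front of the path has a fixed length. A more tolerant approach is to search for the parameter name instead. The sketch below is an illustration under assumptions, not part of the original class: the class name PagePathExtractor is hypothetical, and the layout of the entry ID (a URL containing ‘ga:pagePath=’ followed by the path and then ‘&’) is inferred from the substring calls in the listing rather than taken from the API documentation.

```java
// Hypothetical helper, for illustration only (not part of DSpace or the Google API)
class PagePathExtractor {

    // Assumed name of the parameter that carries the page path inside the entry ID
    private static final String MARKER = "ga:pagePath=";

    // Return the page path embedded in an entry ID, or null if the marker is absent
    static String extract(String entryId) {
        int start = entryId.indexOf(MARKER);
        if (start < 0) {
            return null;
        }
        start += MARKER.length();
        // The path runs up to the next '&', or to the end of the ID if there is none
        int end = entryId.indexOf('&', start);
        return (end < 0) ? entryId.substring(start) : entryId.substring(start, end);
    }

    public static void main(String[] args) {
        // Made-up entry ID following the assumed layout
        String id = "http://www.google.com/analytics/feeds/data?ids=ga:1234567"
                + "&ga:pagePath=/handle/123/456&start-date=2007-07-17&end-date=2009-05-20";
        System.out.println(extract(id)); // prints /handle/123/456
    }
}
```

Searching for the marker means the result does not depend on the number of digits in the site id or on the length of the feed URL.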

Easy pseudo-video streaming for DSpace repositories

A few days ago someone posted an enquiry to the dspace-general email list asking how to embed a video player in DSpace web pages. This was followed up by a lot of replies along the lines of “it would be great if DSpace could do that!”.

I wrote a quick reply saying how I thought it had been implemented, and described the solution as “quick and easy”. I thought I’d better put my money where my mouth is, and prove that it really is quick and easy. So I spent the last hour of my working day making it work, and here is how to do it:

  • Download the JW FLV media player from http://www.longtailvideo.com/players/jw-flv-player/
  • Unzip the download, and copy player.swf and swfobject.js into [dspace-src]/dspace/modules/jspui/src/main/webapp/
  • Add the following code to the bottom of [dspace-src]/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java (before the final ‘}’):
private void showMediaPlayer() throws IOException
{
	try
	{
		Bundle[] bundles = item.getBundles("ORIGINAL");
		if (bundles.length > 0)
		{
			Bitstream[] bitstreams = bundles[0].getBitstreams();
			boolean found = false;
			for (Bitstream bitstream : bitstreams)
			{
				if (!found)
				{
					if ("video/x-flv".equals(bitstream.getFormat().getMIMEType()))
					{
						// We found one, don't search for any more
						found = true;
						
						// Display the player
						HttpServletRequest request = (HttpServletRequest)pageContext.getRequest();
						String url = request.getContextPath() + 
									"/bitstream/" + item.getHandle() + "/" +
									bitstream.getSequenceID() + "/" +
									UIUtil.encodeBitstreamName(bitstream.getName(), Constants.DEFAULT_ENCODING);
						JspWriter out = pageContext.getOut();
						out.println("<script type=\"text/javascript\" src=\"" + request.getContextPath() +
									"/swfobject.js\"></script>\n" +
									"<center><div id=\"player\">Video</div></center>" +
								"<script type=\"text/javascript\">\nvar so = new SWFObject('" +
								request.getContextPath() + "/player.swf','mpl','320','240','9');\n" +
								"so.addParam('allowscriptaccess','always');\n" +
								"so.addParam('allowfullscreen','true');\n" +
								"so.addParam('flashvars','&file=" + url + "&autostart=true');\n" +
								"so.write('player');\n" +
								"</script>");
					}
				}
			}
		}
	}
	catch (SQLException sqle)
	{
		// Do nothing
	}
}

  • In the same file, find the line that reads “private void render() throws IOException” and, straight after the opening brace ‘{’, add a new line that reads:
showMediaPlayer();
  • Rebuild and redeploy DSpace as you would normally (mvn package; ant update; etc)
  • Log in to your DSpace instance as an administrator and go to the bitstream format registry.
  • Enter a new format with the mime type video/x-flv and the file extension flv
  • Now grab yourself an flv video. A quick way of doing this is to use http://keepvid.com/ and to enter the URL of a YouTube video. It will then download this as an flv video.
  • Create a new item in DSpace, and upload this file. It should recognise it as a flash video file.
  • Now view the item. If the code is working correctly, it will detect that a video exists and bring up the video player.

[Screenshot: the video player embedded in a DSpace item page]

As I said, quick and easy! Now I didn’t say the solution was beautiful, efficient, or written in the best way possible; this is just a proof of concept.

Whilst this solution doesn’t give you proper video streaming, it does give you a halfway house that integrates nicely with DSpace.

Perhaps we should make this into a pluggable system for DSpace 1.6, where you can register classes that can render file types, with a configurable option to map viewers to file types? Thoughts?
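To make the idea concrete, here is a minimal sketch of what such a registry might look like. Everything in it is hypothetical: ViewerRegistry, Viewer, and their methods do not exist in DSpace, and the HTML fragment is a stand-in for real player markup. It only illustrates the shape of the idea: viewers are registered against MIME types, and the item page asks the registry to render whatever it finds.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a viewer registry; none of these names exist in DSpace
class ViewerRegistry {

    // A viewer turns a bitstream URL into an HTML fragment
    interface Viewer {
        String render(String bitstreamUrl);
    }

    private final Map<String, Viewer> viewers = new LinkedHashMap<String, Viewer>();

    // Register a viewer for a MIME type (in practice this mapping could come from configuration)
    void register(String mimeType, Viewer viewer) {
        viewers.put(mimeType, viewer);
    }

    // Render with the registered viewer, or return null when no viewer is registered
    String render(String mimeType, String bitstreamUrl) {
        Viewer viewer = viewers.get(mimeType);
        return (viewer == null) ? null : viewer.render(bitstreamUrl);
    }

    public static void main(String[] args) {
        ViewerRegistry registry = new ViewerRegistry();
        registry.register("video/x-flv", new Viewer() {
            public String render(String url) {
                // Placeholder markup standing in for the real player embed code
                return "<div id=\"player\" data-file=\"" + url + "\">Video</div>";
            }
        });
        // A registered type yields markup; an unregistered type yields null
        System.out.println(registry.render("video/x-flv", "/bitstream/123/456/1/demo.flv"));
        System.out.println(registry.render("application/pdf", "/bitstream/123/456/2/paper.pdf"));
    }
}
```

With something like this in place, the item page would no longer need to know about FLV files specifically; supporting a new file type would mean registering another viewer.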

DSpace 1.6 survey results

Well, the results of the recent DSpace 1.6 survey in which we asked the DSpace community to list the top three features they would like to see in version 1.6 have now been published. The results will probably come as no surprise, but here are the top three features:

  1. Better statistics
  2. An embargo facility
  3. Batch metadata editing

We have now assigned a ‘point person’ to each of these who will drive the process forward to decide how we go about achieving these goals. Obviously this is not an exhaustive list of features that will be in 1.6, but they will help to guide development efforts.

The full results of the survey can be seen in a ‘wordle’ at http://www.wordle.net/gallery/wrdl/794098/DSpace_1.6_survey_results

[Image: wordle of the DSpace 1.6 survey results]

We will also be using Twitter (http://twitter.com/dspacetweets) and its RSS feed (http://twitter.com/statuses/user_timeline/37160113.rss) to provide updates on version 1.6 as it develops. It will be an interesting experiment to see whether this proves a useful way of disseminating development activities as they occur.