On DSpace development

Over the past 9 months I’ve had the privilege to hold the position of ‘DSpace Release Coordinator’. This has meant that I’ve been able not only to work with a group of dedicated and talented repository developers and to act as liaison with the user community, but also to watch the development process happen from close quarters.

With each release of DSpace a release coordinator is elected to manage the process of releasing the software. In most (all?) cases, however, there has only been one volunteer, so no vote has been needed. Each release coordinator brings with them their own unique style of working, collaborating, and decision-making. These inputs mean that the development process has changed slowly over time from the beginnings when it was a funded project hosted by MIT and HP Research Labs, to the truly community-driven open source project that it is today.

My contribution to the change has come through the form of a survey at the start of the 1.6 development process to collect the top three new features that the community wanted in the release. It was my desire that these three features should be completed before 1.6 was released. I am pleased to say that this has happened. Prior to 1.6 the way in which the feature set for the next release was decided was usually just based on the effort available within the community, and the interests of those with the effort available. Whilst this has inevitably continued to happen (and to great effect – 1.6 will include many excellent features not in the top 3 list) it did give us a focus for our developments.

This blog post outlines some of the ways that the DSpace development process has changed over the past year, and some ways in which I think it should continue to change.

On the past year:

Almost a year ago we started holding weekly development meetings using IRC (Internet Relay Chat). These meetings were spearheaded by our technical director Bradley McLean and have given us much more co-ordination between developers. Before the weekly meetings any committer could commit any code they chose to, and there was little discussion about the general direction that any release would take. To some extent this freedom continues; however, it has become much more the norm for developers to discuss their plans, and to get the approval of their fellow developers before doing any work.

We have started to use a new issue tracking system called JIRA. JIRA allows us more flexibility than our previous trackers that were provided by SourceForge. The biggest change, however, has not come through the direct use of JIRA, as much of its functionality is identical to SourceForge’s, but more through the interaction between the weekly development meetings and JIRA. The first 15 minutes of each development meeting is typically devoted to reviewing all new issues (bugs, suggestions, patches) and deciding what to do with them. Sometimes this involves us closing the issue immediately (e.g. if it is out of scope for DSpace), asking for further input from the contributor, or assigning a developer to work with the contributor to resolve the issue (fix the bug / apply the patch / diagnose the problem). One of the problems we used to suffer from with the SourceForge tracker (a problem with our lack of processes rather than with the software), and for which we used to get bad publicity, was the average time for which issues stayed on the tracker. The average was usually over two years! It should be stressed that this didn’t mean contributions took that long to be assessed, just that no one ever cleared out old issues, which meant that the average was somewhat skewed.

The final change during the past year has been the addition of some more committers. The most notable of the new committers has been Jeffrey Trimble. His addition to the group is notable as he is the first non-developer to join the group. He was invited to join as he has been an active member of the community for many years, and has been working with us as a ‘documentation gardener’ to ensure that our documentation gets the attention it needs.

I hope that the result of these changes will be that the community finds DSpace 1.6 to be a better piece of software, more in line with their needs, and with better documentation. In addition contributors should have received better and quicker feedback on their contributions.

But what about the future? Where next? I’m sure no one thinks that we have the perfect development community and processes. These are my thoughts on what we could change, although many people and discussions have influenced them heavily, so I can’t claim most of them to be my original thoughts. Some of these thoughts have been expressed elsewhere and by others so won’t be new, others perhaps will be new to you.

On the release coordinator role:

I’m quite often asked, “What does the release coordinator do?” This is perhaps hard to define as each release coordinator has their own style, but I’ll explain what it has meant to me. Primarily it has involved coordination. This isn’t a technical job; it is more akin to project management. It means knowing what needs doing, how far along we are with each of those tasks, and who is doing what, as well as finding volunteers to undertake tasks and reporting progress to the community. Performing the role has also cost me time and, although this was purely a personal choice, money, as I have paid for the testathon.net server that allows users to test the release.

Being a developer myself has no doubt helped in this process, but I don’t think the role needs to be held by a developer. The developers of DSpace have often been criticised unfairly for not listening to users of the software when deciding what features to develop. Perhaps having a non-developer in this role could help, as their actions are less likely to be perceived this way? I know from experience at recent DSpace conferences, and after requests for feedback, that very often repository managers do not respond to requests for input. If this were due to any perceived barriers between the development and user communities, perhaps this would help break them down?

Traditionally the release coordinator has been chosen once the previous release has been made. It would be more effective if they were chosen three or four months earlier. This would give the benefit of them learning the ropes from the current release coordinator, but more importantly the current and next release coordinators could work together to help decide what is in and out of scope for the current and next versions. This means that even before a release is made, we know where we’re heading with the next version, which may influence some decisions we make today.

On the decision making process:

The introduction of the weekly development meetings has helped the decision-making process in a dramatic way. Where developers once worked in isolation, they now work together much more closely. However, the day-to-day decisions are still made by developers and the release coordinator. We need to find a way to involve the wider community in these decisions. For example, it was a personal decision of mine that we should wait until all three of the requested new features were completed before we released 1.6. This has undoubtedly delayed the release of 1.6. If the community had decided that two of these features would have been sufficient to make a new release, with the third feature waiting for the subsequent release, then the software could perhaps have been released three months earlier.

How do we get this extra input? It would be impractical to involve the whole community in all decisions, so we need a representative sample of the entire community to form a team who can make these decisions. The team needs to include developers, users, and DuraSpace staff. A team of 8 to 12 should ensure enough breadth of experience whilst remaining small enough to be effective. DuraSpace would decide the DuraSpace members, and elections could be held for the two categories of other members (developers and users).

On committers:

First off I want to express the privilege I feel of being a DSpace committer and being trusted to help steer the development of the most widely used repository platform. I say this first because I’m about to launch into a mini rant!

Committers often get criticised for a lot of different things. We get criticised for not listening to users enough (we listen, although often they don’t talk!), we get criticised for following our own agendas (most of the time we follow the agenda of our employers), we get criticised for developing the software too slowly (most of us have day jobs, and DSpace development is only a small part of our roles, and at any given time there is usually only a third of the committers active due to other work pressures). One of the email lists that the developers use keeps us up to date with when code changes are made in our code repository. I know all the committers, and what time zones they operate in, and without exception a good percentage of these code changes occur outside of each committer’s working day. We work hard to give as much of our time as we can to DSpace, often at the expense of our own time. Just ask my wife how often I’m working on DSpace code either before or after work, or at weekends and holidays! Committers try their best, are a very friendly bunch, and don’t deserve the criticism they sometimes get. Rant over – time for some more productive thoughts!

It is useful to explain how the committers group has evolved over time. The first committers were members of the original HP / MIT project team who initially developed DSpace. The next group of developers who became committers were typically members of funded projects to create some of the first installations of DSpace. They developed some of the early features added to the application. Later still, the new committers were usually asked to join the group because they spent a large amount of their time working with DSpace. These days we do not have so many developers who devote so much of their time to DSpace. This is probably because it is no longer such a big job to install and configure an institutional repository. So how does this affect the committers group?

As I mentioned in my rant, it means that at any time, there is probably only a third of the group ‘active’ and able to give development time and effort, and even then it is likely that most will only be able to give a very small amount of time (not nearly enough to develop large new features). We need to adjust the way the committers group is composed to account for that.

Something else I’d like to note is perhaps the perceived ‘separateness’ of the committers group. Because the committers group isn’t open like the rest of the community, and because there is no way to ‘become’ a committer (other than contribute over time, and wait to be invited) there is probably a perceived barrier. Some developers may think there is no chance of them becoming a committer so will not get involved at all. This is a loss to the community and something we need to address.

My thoughts are that if a community decision-making team existed, then the committers group could change its direction. At the moment being a ‘committer’ conflates the original meaning, which is having the rights to add code to DSpace in our code repository, with the role of decision-making. If this decision-making role is given to the newly formed group, then being a ‘committer’ goes back to the traditional meaning. We can then open this group up to anyone who asks for (and needs) commit rights.

Of course allowing anyone to commit code to the code repository comes with the potential for trouble, and this would have to be managed with processes. For example, developers who wanted to be granted commit rights would first need to have contributed three patches. Then, for the first six months, they would have to get the express permission of the decision-making group before applying patches. After that they would have finally earned their wings and be granted the full freedom the current committers have.

On the release schedule:

Releases have traditionally happened when everything was ready, and have not followed any prescribed timelines. This has ensured the software evolves at its own pace, but has many negative points: users cannot predict when they’ll need technical effort in the future for upgrades, and some features that could have been released earlier have been held back.

Our development practice has so far been to make a large release every year or two, and to make two or three minor releases between them. In a traditional software development model minor releases are only used for small changes or bug fixes. Because our releases are so spread out, minor DSpace releases tend to include much more.

I’d like to see us move to more regular releases, and to keep to only those minor fixes and features for minor releases. If we were to do that, then we should start work on the development of 1.7 as soon as 1.6 is released, whereas previously we would have gone on to 1.6.1. No doubt we’ll need a 1.6.1, but in coding terms this should be developed on a branch of our code repository, not in trunk.

On other roles:

So far I’ve concentrated on the technical development of DSpace. However there are many other roles that could be filled to improve the community and software further. Whilst we’ve been lucky to have Jeff working on the documentation for 1.6, if we had a small team of documentation specialists working on it, the result would be wonderful documentation complete with screenshots, howtos, usage tips etc. The same goes for other areas such as help screens in DSpace, translations, publicity, training materials, screencasts etc. We need to find good ways of encouraging more contributions from the community, playing to people’s strengths and interests. If all 700+ institutions could donate a small amount of effort, just think what we could do!

These are my thoughts. I’m sure yours are different in some or all aspects. I’m open to any comments, and no doubt my views will continue to change as this subject is discussed further and we get more people’s input. I’d love to know what you think!
