How does the Facebook SWORD client actually work?

I’ve been asked a few questions recently about how SWORD clients work, and in particular how the SWORD Facebook client works. The Facebook client is one of the most complete demonstration clients available, and as such ‘hides’ a lot of the work that goes on behind the scenes. This post explains how a SWORD deposit from within Facebook actually works:

  • First off, the user has to select the repository they wish to deposit into. This can either be done by selecting from a dropdown list of known demo SWORD repositories, or by manually entering the URL of a service document:

[Screenshot: 1-select-repo]

  • Most repositories require their users to authenticate with a username and password. These are typically passed to the SWORD server using HTTP Basic authentication. Optionally, an ‘on-behalf-of’ user can be specified (see the SWORD specification for what this means):

[Screenshot: 2-username-password]

  • When this initial form is submitted, the client visits the SWORD server and requests the service document by performing an HTTP GET of the service document URL. The SWORD Facebook application is written using the SWORD PHP library, which uses cURL to retrieve the service document. Using a third-party library such as the PHP library makes it *really* easy to do this. Here is the required PHP:
require("swordappclient.php");
$sac = new SWORDAPPClient();
$sdr = $sac->servicedocument($url, $user, $password, $onbehalfof);
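
To make the HTTP side of this step concrete, here is a minimal Python sketch of the kind of request the library sends on your behalf. It is not the PHP library's code. The `X-On-Behalf-Of` header name is an assumption taken from the SWORD 1.3 profile, and the function names are hypothetical.

```python
import base64
import urllib.request


def sword_headers(user, password, on_behalf_of=None):
    """Build the HTTP headers for an authenticated SWORD request."""
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}"}
    if on_behalf_of:
        # Assumed header name (SWORD 1.3 profile); check your server's version.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return headers


def service_document_request(url, user, password, on_behalf_of=None):
    """Prepare (but do not send) the GET request for a service document."""
    return urllib.request.Request(
        url, headers=sword_headers(user, password, on_behalf_of), method="GET"
    )
```

Passing the prepared request to `urllib.request.urlopen()` would perform the GET and return the service document XML.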
  • Hopefully (assuming a valid service document URL, username and password) the client will receive a service document back from the SWORD server. The service document specifies which collections a user may submit to and provides some details about each collection (e.g. name, URL to deposit to, policy, preferred packaging types etc). Different repository platforms interpret the meaning of ‘collection’ differently. In DSpace, these map to DSpace collections, whereas in EPrints, they relate to workspaces within the user’s account.

[Screenshot: 3-service-document]
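
As a rough illustration of what the client does with the service document, the Python sketch below lists the collections it offers. The sample document is invented for this example, and the element names (`sword:collectionPolicy`, `sword:acceptPackaging`) and namespaces are assumptions based on the SWORD 1.3 profile, so a real server's response may differ.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
    "sword": "http://purl.org/net/sword/",  # assumed SWORD 1.3 namespace
}

# Illustrative service document; the repository and collection are invented.
SAMPLE = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom"
         xmlns:sword="http://purl.org/net/sword/">
  <workspace>
    <atom:title>Example Repository</atom:title>
    <collection href="http://repository.example.org/sword/deposit/123/1">
      <atom:title>Research Papers</atom:title>
      <accept>application/zip</accept>
      <sword:collectionPolicy>Deposits are reviewed.</sword:collectionPolicy>
      <sword:acceptPackaging q="1.0">http://purl.org/net/sword-types/METSDSpaceSIP</sword:acceptPackaging>
    </collection>
  </workspace>
</service>"""


def list_collections(service_document_xml):
    """Return one dict per collection found in the service document."""
    root = ET.fromstring(service_document_xml)
    collections = []
    for col in root.findall("app:workspace/app:collection", NS):
        collections.append({
            "title": col.findtext("atom:title", default="", namespaces=NS),
            "deposit_url": col.get("href"),
            "policy": col.findtext("sword:collectionPolicy", default="", namespaces=NS),
            "packaging": [p.text for p in col.findall("sword:acceptPackaging", NS)],
        })
    return collections
```

Calling `list_collections(SAMPLE)` yields the data a client needs to build the collection picker: a title to show the user and a deposit URL to post to.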

  • Once you have selected a collection into which you wish to deposit an item, you are presented with a form requesting metadata. You are prompted for the type of item, its peer-review status, title, abstract, and first author. You can optionally add second and third author names, and an existing URL for the item.

[Screenshot: 4-deposit-form]

  • So what happens with this metadata you enter? In a nutshell, it all gets crosswalked and wrapped up in a METS document, encoded in SWAP. To see what I mean, look at the following example:
<?xml version="1.0" encoding="utf-8" standalone="no" ?>
<mets ID="sort-mets_mets" OBJID="sword-mets" LABEL="DSpace SWORD Item" PROFILE="DSpace METS SIP Profile 1.0" xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd">
<metsHdr CREATEDATE="2008-09-04T00:00:00"><agent ROLE="CUSTODIAN" TYPE="ORGANIZATION">
<name>Stuart Lewis</name>
</agent>
</metsHdr>
<dmdSec ID="sword-mets-dmd-1" GROUPID="sword-mets-dmd-1_group-1">
<mdWrap LABEL="SWAP Metadata" MDTYPE="OTHER" OTHERMDTYPE="EPDCX" MIMETYPE="text/xml">
<xmlData>
<epdcx:descriptionSet xmlns:epdcx="http://purl.org/eprint/epdcx/2006-11-16/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://purl.org/eprint/epdcx/2006-11-16/ http://purl.org/eprint/epdcx/xsd/2006-11-16/epdcx.xsd">
<epdcx:description epdcx:resourceId="sword-mets-epdcx-1">
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/elements/1.1/type" epdcx:valueURI="http://purl.org/eprint/entityType/ScholarlyWork" />
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/elements/1.1/title">
<epdcx:valueString>Item Title</epdcx:valueString>
</epdcx:statement>
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/terms/abstract">
<epdcx:valueString>Item Abstract</epdcx:valueString>
</epdcx:statement>
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/elements/1.1/creator">
<epdcx:valueString>Lewis, Stuart</epdcx:valueString>
</epdcx:statement>
<epdcx:statement epdcx:propertyURI="http://purl.org/eprint/terms/isExpressedAs" epdcx:valueRef="sword-mets-expr-1" />
</epdcx:description>
<epdcx:description epdcx:resourceId="sword-mets-expr-1">
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/elements/1.1/type" epdcx:valueURI="http://purl.org/eprint/entityType/Expression" />
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/elements/1.1/language" epdcx:vesURI="http://purl.org/dc/terms/RFC3066">
<epdcx:valueString>en</epdcx:valueString>
</epdcx:statement>
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/elements/1.1/type" epdcx:vesURI="http://purl.org/eprint/terms/Type" epdcx:valueURI="http://purl.org/eprint/entityType/Expression" />
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/terms/available">
<epdcx:valueString epdcx:sesURI="http://purl.org/dc/terms/W3CDTF">2009-04-28</epdcx:valueString>
</epdcx:statement>
<epdcx:statement epdcx:propertyURI="http://purl.org/eprint/terms/Status" epdcx:vesURI="http://purl.org/eprint/terms/Status"  epdcx:valueURI="http://purl.org/eprint/status/PeerReviewed" />
<epdcx:statement epdcx:propertyURI="http://purl.org/eprint/terms/copyrightHolder">
<epdcx:valueString>Stuart Lewis</epdcx:valueString>
</epdcx:statement>
</epdcx:description>
</epdcx:descriptionSet>
</xmlData>
</mdWrap>
</dmdSec>
</mets>
  • If you examine the XML, you’ll see the metadata in the METS document. For example, the title is on line 14, the abstract on line 17, the author on line 20, and the deposit date on line 31. The mapping of metadata elements from the form fields to the METS/SWAP is fixed in the software.
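
To show what a fixed mapping from form fields to SWAP statements might look like, here is a small Python sketch. It is not the PHP library's implementation. The property URIs are copied from the example document above, while the function name and the shape of the `fields` argument are hypothetical.

```python
import xml.etree.ElementTree as ET

EPDCX = "http://purl.org/eprint/epdcx/2006-11-16/"

# Form field -> property URI, copied from the example METS document.
PROPERTY_URIS = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "abstract": "http://purl.org/dc/terms/abstract",
    "creator": "http://purl.org/dc/elements/1.1/creator",
}


def build_description(fields, resource_id="sword-mets-epdcx-1"):
    """Turn form fields into an epdcx:description element."""
    ET.register_namespace("epdcx", EPDCX)
    description = ET.Element(
        f"{{{EPDCX}}}description", {f"{{{EPDCX}}}resourceId": resource_id}
    )
    for name, values in fields.items():
        # A field may hold one value or a list (e.g. several creators).
        for value in values if isinstance(values, list) else [values]:
            statement = ET.SubElement(
                description,
                f"{{{EPDCX}}}statement",
                {f"{{{EPDCX}}}propertyURI": PROPERTY_URIS[name]},
            )
            ET.SubElement(statement, f"{{{EPDCX}}}valueString").text = value
    return description
```

For example, `ET.tostring(build_description({"title": "Item Title", "creator": ["Lewis, Stuart"]}), encoding="unicode")` serialises one statement per value, in the same shape as the statements in the example document.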
  • The next stage of the submission is to upload a file, which is combined with the metadata to make the package to deposit. The details of the file you choose are added to the METS document, using the fileSec and structMap portions of the METS standard:
<fileSec>
<fileGrp ID="sword-mets-fgrp-1" USE="CONTENT">
<file GROUPID="sword-mets-fgid-0" ID="sword-mets-file-0" MIMETYPE="application/pdf">
<FLocat LOCTYPE="URL" xlink:href="SWORD Ariadne Jan 2008.pdf" />
</file>
</fileGrp>
</fileSec>
<structMap ID="sword-mets-struct-1" LABEL="structure" TYPE="LOGICAL">
<div ID="sword-mets-div-1" DMDID="sword-mets-dmd-1" TYPE="SWORD Object">
<div ID="sword-mets-div-2" TYPE="File">
<fptr FILEID="sword-mets-file-0" />
</div>
</div>
</structMap>
  • The METS file (mets.xml) and the uploaded file are then put into a zip file and deposited to the repository. Again, these two steps are very easy using the PHP library:
require('packager_mets_swap.php');
// Create a new package with the root and directory of the input files, and the root and directory of the created package
$package = new PackagerMetsSwap($rootin, $dirin, $rootout, $fileout);

// Add metadata to the package
$package->setType($test_type);
$package->setTitle($title);
$package->setAbstract($abstract);
foreach ($creators as $creator) {
$package->addCreator($creator);
}

// Add a file to the package
$package->addFile($filename, $mimetype);

// Now deposit the package
require("swordappclient.php");
$sac = new SWORDAPPClient();
$dr = $sac->deposit($depositurl, $username, $password, $onbehalfof, $filename, $packageformat, $packagecontenttype);
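
For readers who want to see what packaging and depositing amount to underneath the library, here is a minimal Python sketch. It zips mets.xml with the uploaded file and builds the headers for the deposit POST. The `X-Packaging` header name and the METSDSpaceSIP packaging URI are assumptions based on the SWORD 1.3 profile and common DSpace usage, and both function names are hypothetical.

```python
import io
import zipfile


def build_package(mets_xml, filename, file_bytes):
    """Zip mets.xml and the uploaded file together, in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("mets.xml", mets_xml)
        archive.writestr(filename, file_bytes)
    return buffer.getvalue()


def deposit_headers(package, packaging="http://purl.org/net/sword-types/METSDSpaceSIP"):
    """Headers for POSTing the package to a collection's deposit URL."""
    return {
        "Content-Type": "application/zip",
        # An explicit length means the body is not sent with chunked encoding.
        "Content-Length": str(len(package)),
        "X-Packaging": packaging,  # assumed SWORD 1.3 header name and value
    }
```

The package bytes would then be sent in an HTTP POST to the deposit URL taken from the service document, with the same authentication as before.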
  • Once the package is sent to the repository, it is up to the repository to decide how to handle it. In the case of DSpace, two things happen. First, a raw copy of the original package is archived in a new item, allowing us to see precisely what was deposited. This is hidden from users though. Second, the package is opened up and processed. The file is added to the item, and the metadata is crosswalked using XSL to Dublin Core as used by DSpace. An example of part of the XSL used is shown below (mapping the Dublin Core creator element to dc.contributor.author):
<!-- creator element: dc.contributor.author -->
<xsl:if test="./@epdcx:propertyURI='http://purl.org/dc/elements/1.1/creator'">
<dim:field mdschema="dc" element="contributor" qualifier="author">
<xsl:value-of select="epdcx:valueString"/>
</dim:field>
</xsl:if>
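
The same crosswalk idea can be sketched in Python for readers less familiar with XSL. This is an illustration only, not what DSpace runs. The creator mapping comes from the XSL excerpt above, while the title and abstract targets are assumptions about typical DSpace field names.

```python
import xml.etree.ElementTree as ET

EPDCX = "http://purl.org/eprint/epdcx/2006-11-16/"

CROSSWALK = {
    # From the XSL excerpt above.
    "http://purl.org/dc/elements/1.1/creator": "dc.contributor.author",
    # Assumed targets, for illustration only.
    "http://purl.org/dc/elements/1.1/title": "dc.title",
    "http://purl.org/dc/terms/abstract": "dc.description.abstract",
}


def crosswalk(description_set_xml):
    """Map epdcx statements to (dspace_field, value) pairs."""
    root = ET.fromstring(description_set_xml)
    fields = []
    for statement in root.iter(f"{{{EPDCX}}}statement"):
        target = CROSSWALK.get(statement.get(f"{{{EPDCX}}}propertyURI"))
        value = statement.findtext(f"{{{EPDCX}}}valueString")
        # Skip unmapped properties and statements that carry only a valueURI.
        if target and value:
            fields.append((target, value))
    return fields
```

Statements with no mapping are silently dropped here; a real ingest process would need to decide whether to ignore, log, or reject them.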

I hope this post clears up a bit of what goes on ‘behind the scenes’ of a SWORD client. I hope it also shows how easy it can be to create a SWORD client using the PHP library, which provides code not only to request service documents and deposit items, but also to create packages in a format that is accepted by DSpace and EPrints. Any questions?

13 thoughts on “How does the Facebook SWORD client actually work?”

  1. Jason Fowler

    Stuart,

    Nice work on this project. This is the type of project that has the potential for extending the use of DSpace exponentially.

    I’ve been testing the app with my own repository, and I get the following message whenever the upload takes place. My connection is good. Any idea what could be causing it?

    ——

    Fatal error: Uncaught exception ‘Exception’ with message ‘Error parsing response entry (String could not be parsed as XML)’ in /var/www/fb.swordapp.org/swordapp-php-library-0.9/swordappclient.php:129 Stack trace: #0 /var/www/fb.swordapp.org/htdocs/deposit/process/deposit.php(35): SWORDAPPClient->deposit(”, ‘myusername@myhost.com’, ‘mypassword’, ”, ‘/var/www/fb.swo…’, ‘http://purl.org…’, ‘application/zip’) #1 {main} thrown in /var/www/fb.swordapp.org/swordapp-php-library-0.9/swordappclient.php on line 129

  2. Jason Fowler

    That’s it. I’m still on 1.5.1. Will try again when I update.

    Really, really nice app, though. Love the work you’re doing with SWORD. All of it has serious potential at integrating a DSpace submission step into the regular workflow of normal people.

  3. Jason Fowler

    Stuart,

    I have updated to 1.5.2 and after adjusting a few options in my config file, I have gotten close with getting the Facebook app to work with my repository. I can now submit the package, and I can see the atom entry generated when I tail the logs. Unfortunately, I now get a code 202 error from the app.

    Any suggestions?

  4. Jason Fowler

    Stuart,

    A little further information on the problem. Actually, the package is being uploaded, and I see it when I look at my submissions within DSpace. But I still get the 200 error code in the Sword Facebook App, and I cannot see my submission from within it.

    Thanks,
    Jason

  5. Stuart Post author

    Hi Jason,

    The application has been playing up slightly this morning. I’ve done two things which may help (changed the way the transfer works to not use chunked transfer, and set a Content-Length header). I’ve also restarted the web server which seems to have made it a bit better.

    Could you email me a screen shot of the page that fails if it still doesn’t work for you?

    Thanks,

    Stuart

  6. Mark Jordan

    Hi Stuart,

    I have a question about the use of SWAP. Neither the SWORD 1.3 spec nor the DSpace METS SIP Profile mention it, yet it appears to be handled by all the SWORD server implementations I have looked at (DSpace, ePrints, Fedora). Where is the use of SWAP described in relation to the SWORD spec?

  7. Mark Jordan

    Thanks for the pointer. Sorry to be thick — and please let me know if there’s a mailing list where I can ask these questions — but seems to me that the example METS document you provide, and the METS file in Example 5 included in the dspace-sword-1.3.1 plugin, don’t conform to the DSpace METS Document Profile for Submission Information Packages described at http://wiki.dspace.org/confluence/display/DSPACE/DSpaceMETSSIPProfile, since they don’t contain a MODS dmdSec, they only contain SWAP dmdSecs.

    On the other hand, none of the Structural Requirements in the DSpace profile actually say that conforming documents MUST contain a MODS description; requirement 13 says “at least one dmdSec containing the metadata record for the entire DSpace item” but the closest the profile comes to being prescriptive about MODS is in the Descriptive Metadata section, where it says “As declared in Structural Requirements #16, DSpace requires just one MODS record that describes the entire item” (requirement 16 deals with how DSpace implementations should deal with sourceMD attributes, not about descriptive metadata so it is possible this reference is an editorial slip).

    The reason I am digging out this requirement is that I am considering developing a proof of concept SWORD server for CONTENTdm, and I’d like to know if I need to worry about crosswalking submissions from MODS or just SWAP (at the beginning, anyway).

  8. Stuart Post author

    Hi Mark,

    The best option for an email list might be the SWORD-APP-TECH email list:

    https://lists.sourceforge.net/lists/listinfo/sword-app-tech

    Packaging in SWORD is a big can of worms, and one that personally I think we need to raise again. For example OAI-PMH works well, because you know that you can always get unqualified Dublin Core out of it (at a minimum). SWORD needs a documented and agreed simple packaging standard that we can all support as a minimum. Without that, if we accept different packaging formats, we’ll never achieve the levels of interoperability that we desire. It just so happens that DSpace / Fedora / EPrints all accept the SWAP/METS mix after it was first developed for SWORD in DSpace, but there is no compulsion for repositories to support it.

    Thanks,

    Stuart

  9. Mark Jordan

    Stuart, thanks for the pointer to the list, I’ll check it out. Also, you’ve given me all the info I need to move ahead with my SWORD server. If the standard repo platforms implement SWAP-in-METS as their basic (de facto) packaging format, that combination should be a safe choice to start with.

    Thanks very much,

    Mark

  10. Divino Ignacio

    Hi Stuart,

    Where is the SWORD plugin for Facebook? The address fb.swordapp.org does not return a valid page. Do you know if this plug-in is still available? Could you point me to a SWORD client application that is ready for use?

    Thank you for your attention,

    Divino Ignácio.

  11. Stuart Post author

    Hi Divino,

    The SWORD Facebook application is no longer available. Due to the way Facebook constantly changes, it was taking too long to keep it up to date. Sorry!

    Stuart
