Progress Report No.4

Anna Sexton
4 November 2002 - 2 May 2003


1. LEADERS user survey

As reported in Progress Report No.3, the LEADERS team have conducted a survey of archive users across different archive repositories as a means of establishing a profile of the ‘typical’ archive user. This profile is acting as a basis for selecting a sample of users who can provide more detailed information about their needs and give feedback on the LEADERS demonstration application.

The survey was divided into two 8-week phases. The first phase ran from 7 October to 6 December 2002 and the second phase ran from 13 January to 14 March 2003. The participants were:

Phase One
* The National Archives [formerly named Public Record Office] (National Archive)
* University College London Special Collections (Specialist Repository)
* Wellcome Institute Archive and Manuscript Service (Specialist Repository)
Phase Two
* Dorset Record Office (Local Government Archive)
* Birmingham City Archives (Local Government Archive)
* University of Glasgow Archive Service (A Specialist Repository that specialises in collecting Business Archives)

The table below shows the number of completed questionnaires returned by each repository:

Repository | No. of Completed Questionnaires
The National Archives (formerly named Public Record Office) | 404
Dorset Record Office | 109
Wellcome Institute Archive and Manuscript Service | 51
University College London Special Collections | 22
University of Glasgow Archive Service | 18
Birmingham City Archives | 13

These numbers make it clear that if the LEADERS results were taken as they stand and used to build a ‘typical archive user’ profile, the result would be skewed in favour of the National Archives (404 of the 617 completed questionnaires) and could not be taken as representative of the average user. The LEADERS team have developed a methodology for weighting the survey results to overcome this potential problem.

The results have been analysed and are discussed in a preliminary report entitled ‘LEADERS User Survey Results’ to be tabled at the Management Committee meeting (12 May). The team also plan to disseminate these findings at the ACH/ALLC 2003 conference and the Society of Archivists 2003 conference as well as publishing one or more papers in recognised journals.

2. Building the demonstrator application

The LEADERS team have now reached an agreement with BookMARC to help with the development work involved in producing some aspects of the LEADERS toolkit and the LEADERS demonstration application.

To help BookMARC visualise the kinds of presentation of digital forms of archive documents that the team want to create through the integration of TEI, EAD and EAC, the team have produced example screens and stylesheets for one item within the Orwell Collection. These screens have been shown at a number of talks given to students and archivists, and positive feedback has been received on each occasion.

BookMARC have created a web site with a back-end content management system which aids collaboration by giving the team immediate visibility of BookMARC’s progress in the development process. Tools on the site allow the team to check, add and edit internal notes, draft reports, test sites and the web log.

Following technical discussions between BookMARC and the team, an architecture for the application has been developed (see Appendix 1). Initially the team thought that the demonstrator would be developed on the Microsoft .NET framework using ASP.NET and, if necessary, C#. However, it has become clear that an open source Java-based application which can run on Apache in both Windows and Linux environments is preferable. Such an application should do more to promote adoption by potential end users because of its very low infrastructure costs and its portability.

BookMARC have already begun work on building the search and retrieval functionality of the demonstrator, and the team are confident that there will be something useful to show at this year’s round of conferences, beginning with ALLC/ACH at the end of May.

3. The LEADERS toolkit

BookMARC’s involvement in the LEADERS Project extends beyond developing the demonstrator application in that the web services and stylesheets created for the demonstrator will also form a fundamental part of the final toolkit. The main deliverables included in the toolkit are now seen as being:

* LEADERS Schema (based on EAD 2002 but incorporating elements from the TEI and NISO MIX)
* TEI for archives DTD/Schema
* EAC DTD/Schema
* Set of XSLT stylesheets
* Set of web services
* Documentation for the set of tools including recommendations on the use of server technology and search engines

The development of the LEADERS Schema is currently in progress and has involved a substantial review of the metadata options available in TEI, EAD, EAC and the NISO MIX standard. The team have developed a structure for the schema which is based on EAD 2002 with item level descriptions being enriched through the introduction of TEI and NISO MIX elements within EAD’s <altformavail> element referenced through namespaces. This structure allows both the original archive document and the digital forms of the archive document to be adequately described. Appendix 2 contains an example of item level metadata (in XML) for one document within the Orwell Collection which illustrates how the description of the original archive document, the TEI transcript and the digitized images can be brought together within an EAD-based finding aid through the use of the LEADERS Schema. Having completed the intellectual thinking behind the Schema, the actual building of it is now underway. The team have decided to base the Schema on the W3C Schema Standard and are currently testing preliminary results.
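As a minimal sketch of the namespace mechanism described above (not the LEADERS Schema itself), the fragment below declares prefixes on an item-level component so that TEI and NISO MIX elements can sit alongside EAD elements inside <altformavail>. The xlink URI is the standard W3C one; the tei and mix URIs are placeholders assumed purely for illustration, since this report does not state which namespace URIs the schema uses. The element content is taken from the Orwell example in Appendix 2.

```xml
<!-- Illustrative sketch only. The xlink URI is the standard W3C one;
     the tei and mix URIs are placeholders assumed for this example. -->
<c level="item"
   xmlns:xlink="http://www.w3.org/1999/xlink"
   xmlns:tei="http://example.org/placeholder/tei"
   xmlns:mix="http://example.org/placeholder/mix">
  <did>
    <unittitle>Notes on the Spanish Militias</unittitle>
    <unitid>GB 0103 ORWELL A/3/b</unitid>
  </did>
  <!-- Digital form 1: the transcript, described with EAD plus TEI elements -->
  <altformavail type="transcription">
    <altformavail>
      <unittitle>Notes on the Spanish Militias: An Electronic Transcript</unittitle>
      <unitid xlink:type="simple" xlink:href="../teidocs/GB 0103 ORWELL A-3-b-t.xml">GB 0103 ORWELL A/3/b/t</unitid>
      <tei:encodingDesc>
        <tei:samplingDesc>
          <p>Only the first three pages of the original are included in this electronic transcription.</p>
        </tei:samplingDesc>
      </tei:encodingDesc>
    </altformavail>
  </altformavail>
  <!-- Digital form 2: an image, described with EAD plus NISO MIX elements -->
  <altformavail type="online images">
    <altformavail>
      <unittitle>Notes on the Spanish Militias: On-line Digital Image No.1</unittitle>
      <mix:MIMEType>image/jpeg</mix:MIMEType>
      <mix:ImageWidth>958</mix:ImageWidth>
      <mix:ImageLength>1196</mix:ImageLength>
    </altformavail>
  </altformavail>
</c>
```

Declaring the prefixes once on the enclosing component keeps each borrowed element traceable to its source standard, which is one way the description of the original and of its digital forms could be held in a single finding aid.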

4. Dissemination

4.1 Conferences and meetings

Chris Turner attended a briefing meeting with BookMARC in Portugal on 4 December 2002.

The team’s proposal for a session (3 related papers) at the ALLC/ACH 2003 Conference has been accepted. The session is called ‘Integrating TEI and EAD to Create Usable and Re-usable Archival Resources’ and the papers are:
* Focusing on the needs of the end user community (Elizabeth Hallam-Smith)
* Integrating EAD and TEI: the resolution of metadata overlaps (Anna Sexton)
* Developing a generic toolkit: architecture and technology issues (Chris Turner)
The conference will take place on 29 May - 2 June in Athens, Georgia, USA; Susan Hockey is also attending.

The Society of Archivists have asked the team to present one paper at the September 2003 conference (Southampton) entitled ‘Developing new technologies in-line with user needs’. The team have also put in a proposal for the DRH 2003 Conference which will take place in September in Cheltenham.

4.2 Talks

Recent talks given by the team are outlined in the table below:

Speaker | Talk Title | Audience | Date
Anna Sexton and Chris Turner | Introduction to the LEADERS Project | Society of Archivists’ EAD/Data Exchange Group | 14 November 2002
Chris Turner | Open Archive Initiative Briefing | SLAIS Research Group | 5 February 2003
Anna Sexton and Chris Turner | LEADERS Update | Society of Archivists’ EAD/Data Exchange Group | 13 March 2003
Chris Turner | Web Services – An Overview | SLAIS postgraduates on MA in Electronic Communication and Publishing (with invited guests from UCL’s Information Systems and History Department) | 27 March 2003
Anna Sexton | EAD, EAC, TEI and Integrated Access to Archives | SLAIS postgraduates on MA in Archive and Records Management | 28 March 2003
Anna Sexton and Chris Turner | Using EAD and TEI to create usable and reusable archive resources | Humanities Research Institute, University of Sheffield | 7 May 2003

4.3 Publications

Anna Sexton and Chris Turner’s article entitled ‘Expanding the role of EAD: providing adequate metadata for digital as well as original archive documents’ has been published in Vine (Vol.32, No.4, 2003).

The team’s paper from the DRH 2002 conference has been accepted for publication in the Conference Proceedings (published by Oxford University Press).

Chris Turner and Anna Sexton have also been invited to write an article for the Journal of the Society of Archivists (submission September 2003 for publication in April 2004).

5. Advising UCL’s history department

UCL’s History Department are currently putting in a bid to the AHRB for a follow-on to their current English Monastic Archives Project. Chris Turner and Anna Sexton were approached by the department for advice on the technical architecture needed for the second project. This involved attending three meetings and reviewing the existing database used in the first project. This collaboration and communication between SLAIS and the History Department is a useful development.

6. Training

Chris Turner attended an OAI Forum workshop in Lisbon (5-7 December 2002) and has also attended a series of lectures on Web Services run by the British Computer Society’s Object Orientated Programming Section (29 January-19 March 2003).

7. Follow up project

The team have been working towards a viable project proposal that will build on the work completed by the LEADERS team by the project’s finishing date of March 2004. The team are looking to work with the National Archives to test the validity of the toolkit in a real production environment, working with a complete collection. This would involve some initial development and expansion of the existing tools before managing the digitisation and encoding of the chosen archival material. The team are currently reviewing the funding bodies to which an application could be made and the ways in which the proposal could be adapted to fit the requirements and funding limits of each organisation.

Appendix 1: Architecture for LEADERS demonstrator application

Appendix 2: XML encoding for one item within the Orwell Collection (document entitled ‘Notes on the Spanish Militias’) according to the principles of the LEADERS Schema

<c level="item">
<did>
<unittitle>Notes on the Spanish Militias</unittitle>
<unitdate certainty="medium" normal="1938-1939">c.1938-1939</unitdate>
<unitid>GB 0103 ORWELL A/3/b</unitid>
<physdesc>
<extent>1 item (8 sheets)</extent>
<physfacet type="material">Typescript on paper sheets with manuscript additions and corrections in the margins and in the body of the text; back of sheets left blank.</physfacet>
</physdesc>
<abstract>Notes giving autobiographical account of Orwell's time as a fighter in a Spanish Militia called the POUM</abstract>
</did>
<controlaccess><!-- Index terms go in here for names, places, topics and dates -->
</controlaccess>
<scopecontent><!-- Scope and content info goes in here --></scopecontent>
<!-- In EAD 2002 there is no provision for recording rich metadata for digital representations, which is essential to LEADERS. Our schema will expand the elements available within <altformavail> using XML namespaces to reference TEI and NISO MIX elements as well as allowing some EAD elements to be repeated. This is illustrated below -->
<altformavail type="transcription">
<altformavail>
<unittitle>Notes on the Spanish Militias: An Electronic Transcript</unittitle>
<unitid xlink:type="simple" xlink:href="../teidocs/GB 0103 ORWELL A-3-b-t.xml">GB 0103 ORWELL A/3/b/t</unitid>
<unitdate type="creation">April-May 2002</unitdate>
<extent>18 Kbytes</extent>
<phystech><p>an XML file encoded according to TEI P4</p>
</phystech>
<origination role="transcriber/encoder">
<name>Anna Sexton</name>
<address><addressline>The LEADERS Project</addressline><addressline>School of Library, Archive and Information Studies</addressline><addressline>University College London</addressline>
</address>
</origination>
<tei:funder>Arts and Humanities Research Board
<address><addressline>Whitefriars</addressline><addressline>Lewins Mead</addressline><addressline>Bristol</addressline><addressline>BS1 2AE</addressline>
</address>
</tei:funder>
<tei:publicationStmt>
<tei:authority>The Orwell Trust</tei:authority>
<address></address>
<tei:availability>
<p>Copyright in the encoding belongs to University College London.</p>
<p>Copyright in the intellectual content belongs to The Orwell Trust, from whom permission must be obtained for any reproduction of any part of this transcription</p>
</tei:availability>
</tei:publicationStmt>
<tei:encodingDesc>
<tei:samplingDesc>
<p>Only the first three pages of the original are included in this electronic transcription.</p>
</tei:samplingDesc>
<tei:editorialDecl><!-- desc. of editorial decisions goes in here -->
</tei:editorialDecl>
</tei:encodingDesc>
</altformavail>
</altformavail>
<altformavail type="online images">
<altformavail>
<unittitle>Notes on the Spanish Militias: On-line Digital Image No.1</unittitle>
<unitid xlink:type="simple" xlink:href="../images/Orwell/jpegs/GB 0103 ORWELL A-3-b-oi1.jpg">GB 0103 ORWELL A/3/b/oi1</unitid>
<unitdate>2003-02-25</unitdate>
<extent>144.7Kbytes</extent>
<mix:MIMEType>image/jpeg</mix:MIMEType>
<mix:CompressionScheme>JPEG</mix:CompressionScheme>
<mix:ImageWidth>958</mix:ImageWidth>
<mix:ImageLength>1196</mix:ImageLength>
<mix:DeviceSource>Adobe Photoshop 7.0</mix:DeviceSource>
<mix:SourceType>TIFF file image of original document created by digital camera</mix:SourceType>
<mix:SourceID>GB 0103 ORWELL A-3-b-ri1.tif</mix:SourceID>
<origination>
<name>Anna Sexton</name>
<address><addressline>The LEADERS Project</addressline><addressline>School of Library, Archive and Information Studies</addressline><addressline>University College London</addressline>
</address>
</origination>
</altformavail>
<altformavail>
<unittitle>Notes on the Spanish Militias: On-line Digital Image No.2</unittitle>
<unitid xlink:type="simple" xlink:href="../images/Orwell/jpegs/GB 0103 ORWELL A-3-b-oi2.jpg">GB 0103 ORWELL A/3/b/oi2</unitid>
<unitdate>2003-02-25</unitdate>
<extent>144.7Kbytes</extent>
<mix:MIMEType>image/jpeg</mix:MIMEType>
<mix:CompressionScheme>JPEG</mix:CompressionScheme>
<mix:ImageWidth>958</mix:ImageWidth>
<mix:ImageLength>1196</mix:ImageLength>
<mix:DeviceSource>Adobe Photoshop 7.0</mix:DeviceSource>
<mix:SourceType>TIFF file image of original document created by digital camera</mix:SourceType>
<mix:SourceID>GB 0103 ORWELL A-3-b-ri2.tif</mix:SourceID>
<origination>
<name>Anna Sexton</name>
<address><addressline>The LEADERS Project</addressline><addressline>School of Library, Archive and Information Studies</addressline><addressline>University College London</addressline>
</address>
</origination>
</altformavail>
<altformavail>
<unittitle>Notes on the Spanish Militias: On-line Digital Image No.3</unittitle>
<unitid xlink:type="simple" xlink:href="../images/Orwell/jpegs/GB 0103 ORWELL A-3-b-oi3.jpg">GB 0103 ORWELL A/3/b/oi3</unitid>
<unitdate>2003-02-25</unitdate>
<extent>144.7Kbytes</extent>
<mix:MIMEType>image/jpeg</mix:MIMEType>
<mix:CompressionScheme>JPEG</mix:CompressionScheme>
<mix:ImageWidth>958</mix:ImageWidth>
<mix:ImageLength>1196</mix:ImageLength>
<mix:DeviceSource>Adobe Photoshop 7.0</mix:DeviceSource>
<mix:SourceType>TIFF file image of original document created by digital camera</mix:SourceType>
<mix:SourceID>GB 0103 ORWELL A-3-b-ri3.tif</mix:SourceID>
<origination>
<name>Anna Sexton</name>
<address><addressline>The LEADERS Project</addressline> <addressline>School of Library, Archive and Information Studies</addressline><addressline>University College London</addressline></address>
</origination>
</altformavail>
</altformavail>
</c>


Copyright © UCL 2002