How to Build, Display and Find METS Objects
Nate Trail, Digital Project Coordinator
Network Development and MARC Standards Office, Library of Congress
ALA, June 25, 2007
Outline of Topics » A little about the technology » How we build and display digital objects » Searching » Other Tools » Future Directions » Conclusions
Technologies Used
» XML documents
• METS Files
• MODS Files
» XSLT processes
• Transformation
• Display
» MySQL Database
» Cocoon framework
• And a little Java
XSLT Transformation
» A stylesheet “transforms” the MODS XML into a simpler structure of elements, labels and values.
[Figure: MODS XML for Beaux Arts Trio recording]
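The idea of flattening MODS into labels and values can be sketched outside XSLT. The following is a minimal Python stand-in, not the stylesheet used on the site; the sample record, its values, and the LABELS mapping are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical, hand-written MODS fragment; not a real LC record.
SAMPLE_MODS = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Beaux Arts Trio concert</title></titleInfo>
  <name><namePart>Beaux Arts Trio</namePart></name>
  <typeOfResource>sound recording</typeOfResource>
</mods>
"""

# Assumed mapping from display labels to MODS element paths.
LABELS = [
    ("Title", "mods:titleInfo/mods:title"),
    ("Name", "mods:name/mods:namePart"),
    ("Type", "mods:typeOfResource"),
]


def simplify(mods_xml):
    """Flatten a MODS record into a list of (label, value) pairs."""
    root = ET.fromstring(mods_xml.strip())
    pairs = []
    for label, path in LABELS:
        for node in root.findall(path, MODS_NS):
            if node.text and node.text.strip():
                pairs.append((label, node.text.strip()))
    return pairs
```

Calling simplify(SAMPLE_MODS) yields one pair per matched element, in the order the labels are listed.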
XSLT for Display
» XSLT “transforms” the METS XML into an HTML or XHTML display for the screen
[Figure: METS XML for Beaux Arts Trio recording]
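The display step ends in markup for the browser. As a rough Python illustration (again a stand-in for XSLT, and starting from already-extracted label/value pairs rather than from METS itself), a page could be assembled like this; the function name and the table layout are assumptions.

```python
from html import escape


def render_html(title, pairs):
    """Render (label, value) pairs as a simple HTML table page."""
    rows = "\n".join(
        f"    <tr><th>{escape(label)}</th><td>{escape(value)}</td></tr>"
        for label, value in pairs
    )
    return (
        "<html>\n  <head><title>" + escape(title) + "</title></head>\n"
        "  <body>\n  <table>\n" + rows + "\n  </table>\n  </body>\n</html>"
    )
```

Escaping every value keeps characters such as "&" or "<" in titles from breaking the page.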
Web Framework Concepts » User input via URL » Create small programs for easy-to-modify parts » Separate Data from Action and Design
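Cocoon matches the requested URL against patterns and hands the request to a small pipeline. A minimal Python sketch of that dispatch idea follows; the URL shapes and pipeline names are hypothetical, since the slides do not give the site's actual patterns.

```python
import re

# Hypothetical URL patterns and pipeline names, for illustration only.
ROUTES = [
    (re.compile(r"^/object/(?P<id>[\w.]+)/mets\.xml$"), "build_mets"),
    (re.compile(r"^/object/(?P<id>[\w.]+)/default\.html$"), "display"),
]


def dispatch(path):
    """Return (pipeline_name, params) for the first matching pattern, else None."""
    for pattern, pipeline in ROUTES:
        match = pattern.match(path)
        if match:
            return pipeline, match.groupdict()
    return None
```

Keeping the patterns in one table is what makes the "easy-to-modify parts" easy to modify: a new view is a new row, not a new program.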
Key Elements in Describing an Object » Bibliographic data » Files » Object type (sheetMusic, printMaterial, audio, video, compactDisc, recordedEvent, bibRecord, photo, Article, Score…)
Bibliographic Data (1) » Harvested from Voyager ILS (SRU/Z39.50) » Inserted into MySQL database
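An SRU harvest is an HTTP GET with a handful of standard parameters. The sketch below builds such a searchRetrieve URL in Python; the base URL is a placeholder, and the assumption that the server offers a "mods" record schema is mine, not stated in the slides.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, schema="mods", max_records=1):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "recordSchema": schema,          # assumed to be supported by the server
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Placeholder host; substitute the real SRU endpoint.
url = sru_search_url("http://sru.example.org/voyager", 'dc.title = "trio"')
```

The XML that comes back would then be parsed and its fields inserted into the database.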
Bibliographic Data (2) » Direct data entry for items not previously cataloged
METS Maker Pipeline – Step 1 » Bibliographic data is extracted from the database and converted to MODS
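Step 1 can be pictured as "one row in, one MODS record out". The Python sketch below uses an in-memory SQLite table as a stand-in for the MySQL database; the table name, columns, and sample row are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

MODS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS)


def row_to_mods(title, creator):
    """Build a minimal MODS element from two database columns."""
    mods = ET.Element(f"{{{MODS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS}}}title").text = title
    name = ET.SubElement(mods, f"{{{MODS}}}name")
    ET.SubElement(name, f"{{{MODS}}}namePart").text = creator
    return mods


def mods_for(conn, object_id):
    """Look up one object and return its MODS as a string, or None."""
    row = conn.execute(
        "SELECT title, creator FROM bib WHERE object_id = ?", (object_id,)
    ).fetchone()
    if row is None:
        return None
    return ET.tostring(row_to_mods(*row), encoding="unicode")


# Hypothetical schema and data, standing in for the real MySQL tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bib (object_id TEXT PRIMARY KEY, title TEXT, creator TEXT)")
conn.execute("INSERT INTO bib VALUES ('demo.1', 'Concert recording', 'Beaux Arts Trio')")
```

A real record would carry many more MODS elements; the point is only that the database stays the source and MODS is generated from it.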
METS Maker Pipeline – Step 2
» Combine queries of:
• Object type (SheetMusic, RecordedEvent etc.)
• Files on the server
• MODS data
• Rights metadata
» Creates a virtual file called “pre-mets”
[Diagram: Files on Server, Object Type, MODS MD and Rights MD feed the Web Pipeline, which produces the premets file]
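The four inputs of Step 2 can be thought of as being gathered under a single temporary root element. Here is a minimal Python sketch of that aggregation; the element and attribute names inside "premets" are invented for illustration and are not the site's actual intermediate format.

```python
import xml.etree.ElementTree as ET


def build_premets(object_type, file_names, mods_xml, rights_text):
    """Combine object type, file list, MODS and rights into one XML tree."""
    premets = ET.Element("premets", {"objectType": object_type})
    files = ET.SubElement(premets, "files")
    for name in sorted(file_names):
        ET.SubElement(files, "file", {"name": name})
    # Embed the parsed MODS record rather than its text.
    ET.SubElement(premets, "descriptive").append(ET.fromstring(mods_xml))
    ET.SubElement(premets, "rights").text = rights_text
    return premets
```

Because the result exists only in the pipeline, nothing is written to disk until the final METS file is built.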
METS Maker Pipeline – Step 3 » The METS Maker XSLT processes the file using the object’s profile and builds a METS file
METS file for Beaux Arts Trio concert recording
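To make the shape of the output concrete, the following Python sketch assembles a skeletal METS document with a descriptive section, a file section, and a structural map. It is a stand-in for the METS Maker XSLT, not a copy of it; the profile string, the ID scheme, and the one-div-per-file layout are assumptions.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_mets(profile, object_type, file_names, mods_element):
    """Build a skeletal METS tree: dmdSec, fileSec and structMap."""
    mets = ET.Element(m("mets"), {"PROFILE": profile, "TYPE": object_type})
    dmd = ET.SubElement(mets, m("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "MODS"})
    ET.SubElement(wrap, m("xmlData")).append(mods_element)
    file_grp = ET.SubElement(ET.SubElement(mets, m("fileSec")), m("fileGrp"))
    struct = ET.SubElement(mets, m("structMap"))
    top = ET.SubElement(struct, m("div"), {"TYPE": object_type, "DMDID": "DMD1"})
    for number, name in enumerate(file_names, start=1):
        file_id = f"FILE{number}"  # assumed ID scheme
        f = ET.SubElement(file_grp, m("file"), {"ID": file_id})
        ET.SubElement(f, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK}}}href": name})
        part = ET.SubElement(top, m("div"), {"ORDER": str(number)})
        ET.SubElement(part, m("fptr"), {"FILEID": file_id})
    return mets
```

In practice the profile decides how the structural map is laid out (tracks for a recording, pages for sheet music); this sketch hard-codes the simplest case.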
Display – Step 1 » Within-object navigation
Display – Step 2 » Descriptive information about the object
Display – Step 3 » Page-turning or playlists (pre-display virtual file)
Display – Step 4 » Add page framing and breadcrumbs
• LC Presents header and high-level navigation
• Breadcrumb navigation
• LC Presents footer (not in view)
Veterans History Project Displays
• Veterans History Project Experiencing War header
• Standard Veterans History Project header
Searching
» Lucene is open source and scalable to large indexes: Australia has 16 million items indexed
» Lucene indexes are easy to configure using XSLT
[Diagram: METS]
Searching in LC Presents » Single box search offered on home page » Collection-level search offered for subcollections » Searches may also be limited to certain fields or object types
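Field-limited search rests on indexing each metadata field separately. The toy inverted index below illustrates the idea in Python; it is a stand-in for Lucene, not its API, and the field names and sample objects are assumptions.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


class TinyIndex:
    """A toy inverted index keyed by (field, term)."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of object ids

    def add(self, object_id, fields):
        """Index every term of every field for one object."""
        for field, value in fields.items():
            for term in tokenize(value):
                self.postings[(field, term)].add(object_id)

    def search(self, term, field=None):
        """Return sorted ids matching a term, optionally limited to one field."""
        term = term.lower()
        hits = set()
        for (f, t), ids in self.postings.items():
            if t == term and (field is None or f == field):
                hits |= ids
        return sorted(hits)


idx = TinyIndex()
idx.add("obj1", {"title": "Beaux Arts Trio concert", "name": "Beaux Arts Trio"})
idx.add("obj2", {"title": "Trio sonata", "objectType": "sheetMusic"})
```

A single-box search corresponds to calling search with no field; limiting by field or object type is the same lookup with the field argument set.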
Searching in LC Presents » Canned search to create “virtual collections”
Searching in LC Presents » New browsing capability
Browse by Subject, Name or Title
Searching in Veterans History
Limit by conflict, branch, gender, POW status, etc.
Browsing in Veterans History
Browse by last name, state, race/ethnicity and war/branch
Searching in Minerva (across collections)
» Choose among collections harvested.
» Choose MODS terms to search.
Other Web Tools
» SQL – query any Oracle or MySQL tables from the browser
• Holdings queries (beyond Z39.50/SRU)
» Site administration
• Convert from MARC to MODS or MODS to MARC
• Convert HTML to XML
• Link checking
Future Directions
» New Behaviors, Profiles
• Article OCR
• Multivolume Monograph
• Generating RSS data on the fly
» Integrate JHOVE file inspection tool (and MIX metadata) into our METS objects
» Move to JPEG2000 file format for images
Conclusions » The METS standard is flexible enough to describe multiple kinds of complex objects.
» Profiles in METS help define an object and its range of behaviors
» METS and MODS play well together
Conclusions, cont. » MODS is extremely powerful as a structural tool, not just a bibliographic tool
» Consistent, authoritative, structured metadata makes search and display for an object persistent into future software and hardware systems
Conclusions, cont. » Standards matter. Without standardized structures, we can’t program a site to retrieve or display items with any reliability.
» Open Source software is a major step forward for the ability of libraries and other lower-budget institutions to enhance the availability and use of their digital assets.
Questions? www.loc.gov/lcp (LC Presents Music, Theater, and Dance) www.loc.gov/vets (Veterans History Project)
Nate Trail
[email protected] Library of Congress