Will the NAA revolutionize the way we maintain Australia's official historical record?
Let your thoughts drift for a moment, if you will, to the leafy streets of central Canberra, where the engines of Commonwealth government are nestled one against the next and the deafening silence of the wheels of government almost drowns out the birds in the trees.
In one such department - we'll call it Department X, but it could be any similar agency - workers write documents on their computers, then e-mail the files to each other as drafts gradually progress to final copy. Spreadsheets fill computer screens, and PowerPoint is everywhere. Yet while documents come to life on screen, when it comes time to archive them a seemingly archaic method is used: documents are printed, then filed according to age-old rules for records management.
With electronic document creation and publishing ubiquitous, you would think that the digital content our government agencies produce would naturally be stored in some sort of digital archive. Yet more than a decade after government agencies stopped buying typewriters, the situation described above remains surprisingly common.
Department X's archiving strategies manager is unrepentant. "Electronic storage and backup is just for that - for backup purposes," says the manager, who asked not to be named. "People think what we're doing is a little backward but I've seen it go wrong - and to me it's something that's too important to do wrong.
"Once you've gone electronic it's very hard to go back. I've worked in agencies where electronic record keeping has failed, and I need to feel confident that these issues can be resolved. So for now, we're going to stick with the tried and true method."
It may sound anachronistic and more than a little counterproductive, but the printing of digital records on paper for long-term archiving remains the de facto method of archiving across many sectors of government. We have, simply, been dealing with paper for far too long to give it up now.
Handling the Digital Explosion
Reliance on paper remains a challenge for the National Archives of Australia (NAA), whose role as manager of long-term historical government records has become increasingly difficult in recent years as the volume of information government produces continues to rise.
Legislation governing long-term records archiving (the Archives Act 1983) has long delegated responsibility for managing records to individual departments, which were seen as best able to utilize the information and so best suited to retain custody of it. The NAA was simply expected to archive whatever the departments sent to it - but this became increasingly difficult as the explosion of digital document production, and matching paper copies, forced the agency to reconsider its archiving policies.
In March 2000, largely as a result of this growth, the NAA fundamentally changed its archiving policy and stopped accepting just any sort of information. Rather, the long-term value of information is assessed to decide what information should be classified as national archives. This approach differentiates between "temporary records" that are of no use after a specific period of time, and "archival records" - those that "have a value that outweighs the business need for which they were created", in NAA's words. Whether a record qualifies as worthy of being archived depends on its conformance with at least one of five NAA objectives (see "NAA archiving objectives", below).
That is a change from conventional archiving regulations, which allow departments to archive anything they want as long as their records management procedures meet the requirements of the international ISO 15489 standard (previously Australian Standard 4390-1996). This standard, adopted in March 2002, has clarified expectations for government agencies' internal archiving processes, as well as providing a common target for future development - which in the NAA's case now includes the creation of a single digital archiving system capable of storing anything that is sent its way.
The catch: the system not only needs to be intelligent and broad enough to accept content in any format the departments are likely to use, but also must preserve the structure of that information so it can be reconstructed at any given point in the future. This makes it inherently more complex than paper-based archives, where the right environmental conditions and knowledge about size and weight parameters make it relatively straightforward to keep records for centuries.
As long as it is physically intact, paper will always be legible. Reconstruction of digital content, however, requires both a data file and instructions for reading it; without those instructions, most data will in 25 or 50 years' time be as useless to us as Egyptian hieroglyphics were until Frenchman Jean-Francois Champollion broke their code in 1822, chiefly through deciphering the markings on the Rosetta Stone.
Deciphering the hieroglyphics was the life's work of the brilliant Champollion, whose work laid the foundations of modern Egyptology. His experiences also offer a valuable lesson for government agencies that have yet to appreciate the importance of framing data structures in terms of both short-term operational uses and long-term archiving applications.
Preserving both the content and the context of digital information, therefore, has become a top priority for the NAA. "Every government agency confronts these issues at one point or another," says Stephen Ellis, assistant director-general for digital government with the NAA.
"They have to create records of their activities for business reasons, and do this not to make a historical record but for quite targeted business, transactional and accountability reasons. The key issue now is the need to sustain the accessibility of the underlying data, which forms the basis of the records in the system, over extended periods of time."
XML: The New Rosetta Stone
While Ellis proposed such a system as far back as 1993, it is only with the advent of XML (eXtensible Markup Language) and its many derivatives that record-keeping bodies have gained the ability to pair content with meaning in the necessary way.
XML, for example, is the basis of AGLS (www.naa.gov.au/recordkeeping/gov_online/agls/metadata_element_set.html), a standardized metadata set that became AS 5044 in December 2002. AGLS, which is being taken up by archiving agencies at various levels of government, provides 19 descriptive elements that form a common vocabulary for description of digital content and online services. It is an important step towards the interoperability of information across government, and by extension sets a precedent for the gradual rationalization of data formats with an eye to eventual archiving by the NAA.
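To make the idea of a common descriptive vocabulary concrete, here is a minimal sketch of how a handful of descriptors might be serialized as XML. The element names (title, creator, date, function) and the record layout are illustrative assumptions only; the authoritative element set and its encoding rules are those defined in AS 5044, not this example.

```python
import xml.etree.ElementTree as ET


def build_metadata_record(fields):
    """Return an XML string with one <element> per descriptive field.

    The layout is hypothetical and is not the official AGLS encoding.
    """
    record = ET.Element("record")
    for name, value in fields.items():
        element = ET.SubElement(record, "element", attrib={"name": name})
        element.text = value
    return ET.tostring(record, encoding="unicode")


# Hypothetical descriptors for a single document.
example = build_metadata_record({
    "title": "Annual report 2003-04",
    "creator": "Department X",
    "date": "2004-10-01",
    "function": "Reporting",
})
print(example)
```

The point of the sketch is simply that once every agency describes its content with the same named fields, any other system can read those fields without knowing which application produced the document.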
Because it is largely intended for standardizing information on external-facing Web sites, AGLS is only a small part of the bigger whole - more the entree than the mains when it comes to content interoperability. Efforts to build a broader framework for the standardization and archiving of digital content fall within the scope of the Digital Recordkeeping Initiative (DRI), a multi-jurisdictional project that is working to develop a suite of standards to ensure the efficient long-term preservation of digital content.
DRI's scope spans both Australia and New Zealand, giving it a breadth that, it is hoped, will resolve ongoing discrepancies in records management policy between the various levels of government.
While the concepts behind the DRI are becoming clear and well documented, much of its technical direction is now under the auspices of the NAA's Digital Preservation Project (DPP) (at www.naa.gov.au/recordkeeping/preservation/digital/summary.html). Kicked off in mid-2001, the DPP embodies the technological efforts involved in bringing DRI into reality.
Much of the DPP's work centres around automating the conversion from proprietary data formats into XML. Of considerable assistance in this effort has been Xena (http://sourceforge.net/projects/xena/), an open source tool that converts PST, Word, Excel, RTF, PNG and database files into an XML-based file format for ongoing storage and preservation. Xena's ability to accept a broad variety of source materials resolves the NAA's largest technological hurdle in delivering on the promise of the DRI, providing a direction forward for whatever unified repository eventually takes shape.
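As a rough illustration of what pairing a proprietary file with the information needed to interpret it can look like, the sketch below wraps arbitrary bytes in an XML envelope that records the original filename, the source format and a checksum. The envelope layout, the element names and the functions wrap_for_preservation and unwrap are assumptions made for this example; they are not Xena's actual output format or API.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap_for_preservation(filename, data, source_format):
    """Wrap raw bytes in a simple XML envelope (illustrative layout only)."""
    envelope = ET.Element("preserved_object")
    meta = ET.SubElement(envelope, "metadata")
    ET.SubElement(meta, "filename").text = filename
    ET.SubElement(meta, "source_format").text = source_format
    ET.SubElement(meta, "sha256").text = hashlib.sha256(data).hexdigest()
    # Base64 keeps binary content safe inside a text-based XML document.
    content = ET.SubElement(envelope, "content", attrib={"encoding": "base64"})
    content.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(envelope, encoding="unicode")


def unwrap(xml_text):
    """Recover the original bytes and verify them against the stored checksum."""
    envelope = ET.fromstring(xml_text)
    data = base64.b64decode(envelope.findtext("content"))
    if hashlib.sha256(data).hexdigest() != envelope.findtext("metadata/sha256"):
        raise ValueError("checksum mismatch: content may be corrupted")
    return data


# Hypothetical usage with a stand-in for a real word-processing file.
original = b"Minutes of the meeting held on 3 March"
wrapped = wrap_for_preservation("minutes.doc", original, "Word")
restored = unwrap(wrapped)
```

A real preservation tool would go further and convert the content itself into an open representation rather than merely wrapping it, but the envelope shows the basic principle: the data travels together with a description of what it is.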
Use of open source tools and open standards will be essential to engender support for the DRI across the government, says Ellis: "We recognize that in terms of uptake, we have to reduce the cost of this. We have to reduce the effort involved. If people have to reformat everything, every time they do anything for us, they're not going to do it. That's why one of the ideals we would hope to move towards is having everything natively stored in a standardized format."
Learn to Crawl, Then Learn to Archive
Even as the NAA pursues its dream of a unified data archive, individual government agencies - many of which have their hands full running some of Australia's most complex and high-volume IT infrastructures - have their own priorities. This has led to a difficult situation for advocates of the DRI: smaller departments like our Department X, which might normally be good candidates for a limited-scope pilot of new technology, are keeping a watch-and-wait mentality until they can see some runs on the board.
For this reason, many government organizations will be closely watching plans by the Health Insurance Commission (HIC) to embrace the DRI. The HIC, which for four years has maintained its own archive of electronic documents using Tower Software's TRIM environment, plans to use the NAA-sponsored DIRKS methodology - an eight-step process for auditing and revising business processes for ISO 15489 compliance - to push itself towards more consistent representation of its data.
"HIC corporate records are managed according to the Archives Act regardless of their format," says Ellen Dunne, general manager of the HIC's Information and Payments Services Division. "A project for sentencing electronic records will be undertaken this year, and in the scope of this project will be the management of permanent retention records. Most likely, these will be converted to XML and transferred to NAA."
Given its key role in the delivery of Australian health care, the HIC plays an important role in the historical record that NAA is charged with creating. It is also an excellent test subject for the progression of DRI from philosophy to actionable technology solution. But will it fly with government archivists at other organizations?
Maybe. "When we have a clear picture of how [other agencies] have been able to implement those systems, that's when we'll start moving forward," says our contact at Department X. Other agencies will likely remain similarly sceptical, pushing back the time frame for broader adoption of the DRI by years. Since the NAA lacks the power to impose particular standards on departments - beyond, of course, the requirements of ISO 15489, which does not mandate electronic document storage - individual agency decision makers effectively have all the time in the world to make the shift.
Is the NAA, then, a toothless dragon that will have to depend on CIOs' good graces to develop a unified national digital archive? Perhaps. But the situation here is still far better than that for the NAA's counterparts in the US and the UK, where problems of adoption and less technological progress have kept digital record keeping in relatively early stages.
"We haven't been able to build on a very strong tradition across central government, of understanding what this is all about and having a comprehensive set of policies and the means to support them," confesses Malcolm Todd, project manager for Sustainability of Electronic Records with the UK National Archives. "Implementing electronic records management almost feels like implementing records management for the first time: it exposes a whole lot of issues that were there before but never appreciated."
Impetus for the move to digital archiving may come in the UK in the fallout from a recent review by Peter Gershon, until recently the chief executive of the Office of Government Commerce, who concluded that creation of a common infrastructure for information processing would reduce costs and improve efficiency across the government.
Acting on those findings, however, has proved to be a protracted process. "We've really had to change gear, which has meant an awful lot of engagement with details and standards work, and question about how far we take the role," says Todd. "We don't have many formal powers. We are recognized as experts in records management in the UK government, but there's a certain amount of uncertainty about how it's all going to play itself out."
Equally uncertain are efforts by the US National Archives & Records Administration (NARA), which is spearheading a major project to build a content-agnostic technical framework for digital records retention. Throughout 2005, NARA will run a series of seminars including industry, university, computer science and government representatives to nail down the specifications of what is eventually expected to become a $US100 million project to be delivered progressively over two years by either Lockheed Martin or Harris Corporation.
That the US government would turn to massive military contractors to develop its digital record-keeping system reflects both the complexity and the importance of the initiative. "We have a great deal of interest in finding ways of making it easy to retain records over long periods of time and across generations of technology," says Lewis Bellardo, deputy archivist of the US with NARA, who recently met Todd, NAA representatives and others in Sydney to review progress on ISO 15489.
The NARA system will be built in what Bellardo calls a "modular and extensible approach" that allows new formats to be accommodated by developing add-ins. Throughout 2005 and beyond, Bellardo expects that NARA will begin pushing out templates that individual departments can begin tailoring for their own use according to their archiving formats. Yet even with the immense funding going into the project, NARA's power to enforce digital record keeping is little more than that of its peers in the UK and Australia.
"We're not going to force agencies to use this technology; we think it will be an attractive option and that they'll be interested in it," Bellardo says. "We have already had some agencies tell us they really can't wait for us to develop this. We anticipate considerable collaboration, and there are going to be occasions where we're just going to have to help [departments] do it by working with them to get control of their data and record types."
The Way Forward
While they confer regularly, the different approaches by archival bodies in the US, the UK and Australia reflect the ongoing diversity of opinion about how an accessible, secure, relevant digital archive can be maintained in the long run. There is clearly no simple answer as to how to make this work smoothly.
Interestingly enough, many requirements of the efforts by the NAA, NARA and the UK National Archives have already been satisfied by the Public Record Office of Victoria's VERS project (Victorian Electronic Records Strategy, at www.prov.vic.gov.au/vers/vers/default.htm), which links digital content with descriptors of the representation standard (currently PDF) in which it is stored. VERS is already in use at a variety of Victorian government agencies, and in 2002 the PROV established a VERS Centre of Excellence to further its penetration.
Given the already considerable success of VERS, broader and competing efforts may be reinventing the wheel to some extent. Yet over time, the task confronting national archiving agencies will become only more intense - and agencies will require a flexible infrastructure capable of keeping up. If VERS ultimately serves as little more than a proof of concept for the technologies the NAA is seeking to implement, it will have played an important role.
Government IT decision makers will, over time, play an equally important role. While it may be premature to advocate a wholesale shift away from proprietary data formats, it is hardly premature for IT executives within various agencies to become well apprised of the NAA's efforts and to begin plotting the best way to enable the automated archiving of digital content. After all, while the NAA is aiming to be able to convert incoming content to XML for long-term storage, agencies and departments can significantly improve the quality of archival information by gradually assuming the burden of creating information in meaningful formats.
Just how much departments take up the call remains to be seen. At Department X and dozens like it, paper will continue to be the currency of the land for many years. But over time, early adherence to DRI goals within new projects will help extend record keeping best practice into the digital realm.
"There's no question that we're trying to turn around an ocean liner," says Ellis. "Government agencies are enormously complex, have a high level of accountability, and have to be responsive to changes in policy. We're always aware that you have to have a practical way of implementing [structured record keeping] within normal agency operations. If you can get in at the front, at the requirements stage, you're in a much better position to influence the actual final outcomes. This is basically a business requirement to which we're seeking to find a technical solution."
NAA archiving objectives
In assessing whether records should be preserved in long-term archives, the National Archives of Australia uses five criteria, of which records should meet at least one to be preserved:
1. To preserve concise evidence of the deliberations, decisions and actions of the Commonwealth and Commonwealth institutions relating to key functions and programs and significant issues faced in governing Australia.
2. To preserve evidence of the source of authority, foundation and machinery of the Commonwealth and Commonwealth institutions.
3. To preserve records containing information that is considered essential for the protection and future wellbeing of Australians and their environment.
4. To preserve records that have a special capacity to illustrate the condition and status of Australia and its people, the impact of Commonwealth government activities on them, and the interaction of people with the government.
5. To preserve records that have substantial capacity to enrich knowledge and understanding of aspects of Australia's history, society, culture and people.
In addition, some types of records may be kept because the Australian community holds them, or the information they contain, in high esteem. This may be evident, for example, from continuing high usage rates or by the community expressing its concerns to the responsible authorities.
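For agencies that want to build this test into their own systems, the appraisal rule can be sketched in a few lines. The Record class, its field names and the classify function are hypothetical, and treating community esteem as sufficient on its own is an assumption made for illustration; in practice the objectives call for an archivist's judgment rather than a simple flag.

```python
from dataclasses import dataclass, field

VALID_OBJECTIVES = {1, 2, 3, 4, 5}


@dataclass
class Record:
    """A hypothetical record awaiting appraisal."""
    title: str
    objectives_met: set = field(default_factory=set)  # objective numbers 1-5
    community_esteem: bool = False


def classify(record):
    """Return 'archival' if at least one objective is met, else 'temporary'."""
    if record.objectives_met & VALID_OBJECTIVES or record.community_esteem:
        return "archival"
    return "temporary"


# Hypothetical examples.
cabinet_minutes = Record("Cabinet minutes", objectives_met={1, 2})
canteen_roster = Record("Canteen roster")
print(classify(cabinet_minutes), classify(canteen_roster))
```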