The OAIS reference model (Reference Model for an Open Archival Information System) is the result of work by the CCSDS, the Consultative Committee for Space Data Systems (an international standards organization of space agencies), undertaken at the request of ISO.
This work, which involved representatives of institutional libraries and archives, led in 2002 to a document that specifies a very general logical architecture and the functionality of an archiving system, and that is today an international ISO standard (ISO 14721).
Version 2 [ http://public.ccsds.org/publication…] of OAIS was published in August 2012.
This new version brings in several changes:
- it takes risk management into account more explicitly;
- a new category of information is identified: access rights information on documents;
- it mandates the existence of a reversibility plan (recovery of archived data);
- it also provides for the destruction of data under certain conditions;
- finally, the concept of “information property” appears, which makes it possible to describe the parts of the information that we particularly wish to preserve for a given purpose (especially during a format migration, to single out the information that must be preserved as a priority).
A French translation of this version has been available since 2017 (https://public.ccsds.org/Pubs/650x0m2%28F%29.pdf).
OAIS is an abstract model. It defines terminology and concepts. It identifies the actors, describes the functions and information flows, and proposes an information model particularly suited to the problem of digital archiving, even if it does not prejudge the nature of the objects to be archived.
OAIS is not a collection of technical specifications intended to be implemented directly. It is a guide that frames the problem as a whole and compels its readers to ask all the right questions.
As a conceptual reference model, OAIS is now widely used internationally, including by most institutional actors in digital archiving.
The OAIS model identifies four key roles in an archiving system:
- An internal actor, the “archive”, that is to say, the operator of the archiving system,
- Three external actors: “management”, the “producers” and the “users”.
- “Management” acts as policy maker. For management, the archiving system is only one component among others of an overall strategic plan. It is management’s responsibility to support the system politically, financially, and over the very long term.
- The “producers” are the people, or more likely the organizations, that provide the objects to be archived. The digital objects that producers prepare for archiving are SIPs (Submission Information Packages). Once stored, they become AIPs (Archival Information Packages), which are objects internal to the archive.
- The “users”, meanwhile, are the organizations and people who have access to archived copies of these objects. The digital objects made available to users are DIPs (Dissemination Information Packages). OAIS identifies a particular class of users (the “target user community”) as the priority beneficiaries of the archiving service. The services to be provided will differ depending on whether the target user community is large or small, expert or “general public”, etc.
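The three package types and the transitions between them can be sketched as follows. This is only a minimal illustration in Python, not something defined by OAIS: the names `PackageKind`, `InformationPackage`, `ingest` and `disseminate` are hypothetical, and the validation and enrichment that a real archive performs at each step are left out.

```python
from dataclasses import dataclass, replace
from enum import Enum


class PackageKind(Enum):
    SIP = "Submission Information Package"
    AIP = "Archival Information Package"
    DIP = "Dissemination Information Package"


@dataclass(frozen=True)
class InformationPackage:
    kind: PackageKind
    identifier: str
    payload: bytes


def ingest(sip: InformationPackage) -> InformationPackage:
    """Turn a producer's SIP into an AIP held inside the archive."""
    if sip.kind is not PackageKind.SIP:
        raise ValueError("only a SIP can be ingested")
    return replace(sip, kind=PackageKind.AIP)


def disseminate(aip: InformationPackage) -> InformationPackage:
    """Derive a DIP for a user; the AIP itself stays in the archive."""
    if aip.kind is not PackageKind.AIP:
        raise ValueError("only an AIP can be disseminated")
    return replace(aip, kind=PackageKind.DIP)


sip = InformationPackage(PackageKind.SIP, "report-001", b"...")
aip = ingest(sip)
dip = disseminate(aip)
print(sip.kind.name, "->", aip.kind.name, "->", dip.kind.name)
```

Making the package immutable and returning a new object at each step is a design choice of this sketch: it keeps the submitted, archived and disseminated forms distinct, as the roles above suggest.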
The OAIS model breaks an archiving system down into six major functional entities.
- The “ingest” entity receives, checks and validates the objects to be archived. The objects themselves are passed to the “storage” entity, while the information needed to describe and manage them over time is passed to the “data management” entity.
- The “storage” entity ensures the physical preservation of archived objects. It keeps the archived objects available to the “access” entity. In accordance with the rules established by the “administration” entity, it is responsible for making multiple copies and for renewing ageing media.
- The “data management” entity maintains all the internal data (the database) necessary for long-term preservation. It provides the other entities of the system (including the “access” entity) with the descriptive information about archived objects and with any technical and archival management information required.
- The “administration” entity provides overall coordination of the system. It establishes the rules. It oversees the overall quality of the service and its improvement. It reports to management.
- The “preservation planning” entity is the system’s monitoring and planning unit. It watches the external environment and recommends the changes that become necessary, technological developments in particular. It prepares and plans these changes. It is also responsible for tracking changes in the “target user community” to ensure that the access service remains consistent with users’ changing expectations.
- The “access” entity encompasses all the services that interface directly with users. In addition to access control, its main purpose is to let users search the catalog of archived objects and to deliver the objects they request.
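The routing between entities described above (ingest validates a submission, storage keeps the objects, data management keeps the descriptions, access serves the users) can be sketched as a toy in-memory archive. The `Archive` class and its method names are assumptions made for illustration only; the administration and planning entities are not modeled, and a real system would of course not keep everything in Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Toy in-memory archive; each part stands for one functional entity."""
    storage: dict = field(default_factory=dict)          # "storage": the objects
    data_management: dict = field(default_factory=dict)  # "data management": descriptions

    def ingest(self, identifier: str, content: bytes, description: str) -> None:
        # "ingest": check the submission, then route its two halves.
        if not content:
            raise ValueError("empty submission rejected")
        if identifier in self.storage:
            raise ValueError(f"{identifier} is already archived")
        self.storage[identifier] = content
        self.data_management[identifier] = description

    def search(self, term: str) -> list:
        # "access": search the catalog held by data management.
        return sorted(
            ident
            for ident, desc in self.data_management.items()
            if term.lower() in desc.lower()
        )

    def retrieve(self, identifier: str) -> bytes:
        # "access": deliver the requested object from storage.
        return self.storage[identifier]


archive = Archive()
archive.ingest("map-1850", b"\x89PNG...", "Cadastral map, 1850")
archive.ingest("letter-07", b"Dear Sir", "Letter to the mayor, 1851")
hits = archive.search("map")
print(hits, archive.retrieve(hits[0])[:4])
```

Keeping the objects and their descriptions in two separate stores mirrors the split between the “storage” and “data management” entities: a search never touches the archived objects themselves.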
The different types of information
In the OAIS model, each object to be archived is contained in an information package, which includes several types of information that must be maintained to ensure long-term preservation.
Any archived object (the content data) is closely linked to its representation information, which allows it to be translated into more meaningful concepts (for the most part, this is the specification of the file format).
Representation information gives meaning to the bit string that makes up the archived object. It makes it possible to render the content and helps to understand and interpret it. Together with the digital object, it forms what the OAIS model calls the content information.
Representation information can be:
- Structure information, which explains how other information is organized (e.g. mapping tables between file names and page numbers for a digitized book).
- Or semantic information, which gives additional detail on the specific meaning of each piece of structure information (e.g. if there is text in the archive, the semantic information explains which language is used).
To be useful, representation information must be adapted to the knowledge of the present or future archive users (i.e. the “Designated Community” of the OAIS).
Let us take the example of a fonds of 14th-century Chinese literature, archived after being digitized. The archiving service must make sure that the designated community (the potential users) knows the Chinese script of the 14th century, as well as the conventions of this literature and Chinese history. If not, the service will have to make sure that this information is available somewhere on a sustainable basis and provide a link to it, or it will have to archive this information alongside the digitized documents.
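The reasoning in this example amounts to a set difference: whatever is needed to interpret the content, minus what the designated community already knows, is the representation information the archive must supply or point to. A minimal sketch in Python, where the function name `missing_knowledge` and the topic labels are purely illustrative assumptions:

```python
def missing_knowledge(required: set, community_knows: set) -> set:
    """Knowledge needed to interpret the content that the community lacks."""
    return required - community_knows


required = {"14th-century Chinese script", "literary conventions", "Chinese history"}
community_knows = {"Chinese history"}

to_provide = missing_knowledge(required, community_knows)
for topic in sorted(to_provide):
    print("archive or link to:", topic)
```

In practice the community’s knowledge changes over time, so such a check would have to be repeated rather than run once at ingest.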
This set, called the content information, must be accompanied by preservation description information (PDI), which is essential to understanding the archived object over time. This preservation description information is of several kinds:
- provenance information, which documents the history of the content information;
- context information, which details the links between the content information and its environment;
- reference information, which identifies the content information unambiguously;
- fixity information, which describes the mechanisms that guarantee the integrity of the content information.
These items (content information + preservation description information) are bound together by means of packaging information.
The descriptive information (the description of the information package) is transmitted to the “data management” entity to support searching, ordering and retrieving the information preserved in the archive system. It is used to build the repository’s reference database.
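The structure of an information package described above can be sketched as nested records. This is a hedged illustration, not a schema taken from the standard: the class and field names are hypothetical, and the use of a SHA-256 digest as fixity information is an assumption of this sketch, since OAIS does not prescribe a particular mechanism.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ContentInformation:
    data_object: bytes   # the archived bit string
    representation: str  # e.g. a pointer to the file format specification


@dataclass
class PreservationDescription:
    provenance: str  # history of the content information
    context: str     # links to its environment
    reference: str   # unambiguous identifier
    fixity: str      # assumption: hex SHA-256 digest of the data object


@dataclass
class ArchivalPackage:
    content: ContentInformation
    pdi: PreservationDescription
    packaging: str    # how content and PDI are bound together
    description: str  # descriptive information for "data management"


def build_package(data, representation, provenance, context, reference,
                  packaging, description) -> ArchivalPackage:
    """Assemble a package, computing the fixity digest from the data."""
    digest = hashlib.sha256(data).hexdigest()
    return ArchivalPackage(
        ContentInformation(data, representation),
        PreservationDescription(provenance, context, reference, digest),
        packaging,
        description,
    )


def verify_fixity(package: ArchivalPackage) -> bool:
    """Recompute the digest and compare it with the recorded fixity value."""
    actual = hashlib.sha256(package.content.data_object).hexdigest()
    return actual == package.pdi.fixity


package = build_package(
    data=b"Once upon a time",
    representation="plain text, UTF-8",
    provenance="digitized by the producer",
    context="part of a collection of tales",
    reference="tale-0001",
    packaging="single container file",
    description="Tale, first page",
)
print(verify_fixity(package))
package.content.data_object = b"Once upon a tim3"  # simulate a corrupted copy
print(verify_fixity(package))
```

The first check succeeds and the second fails, which is the role fixity information plays: it lets the archive detect that the bit string no longer matches what was ingested.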
After this brief presentation of the OAIS model, it may be useful to clarify a point of terminology that is sometimes confusing. Although the acronyms OAIS and OAI are very similar, they refer to things that are quite independent of each other.
OAI covers all the projects and achievements in support of the Open Access Initiative, i.e. free access to “academic” information. The acronym OAI is sometimes expanded as Open Archives Initiative (“archives ouvertes” in French), which inevitably fosters confusion with OAIS.
The term “Open” in OAIS simply means that the model, that is to say the content of the reference document, is not copyrighted and is therefore freely distributable and usable. However, the OAIS model makes no assumption about access control to archived information and is therefore completely independent of OAI initiatives.