CINES - Centre Informatique National de l’Enseignement Supérieur

Conversions

A file format conversion is the transformation of a set of files stored in a format considered non-permanent into another format considered permanent, to ensure that the information they contain remains readable.
It may take place upstream of archiving, if the files were produced in formats that are not sustainable, or during storage, if the archived format has become obsolete.

At CINES, we chose to adopt the term “logic migration”, which covers the format conversion itself as well as all the preliminary stages of analysis and testing.
This is explained in the “logic migration” process detailed in the definition of our business processes.

See mapping.

Before embarking on any actual conversion, it is very important to conduct a test phase, accompanied by a feasibility study, in order to keep the operation fully under control. This includes information:

Any format migration generally results in some degradation of the information to be retained, in its form and/or its content. The important thing is to preserve the fidelity of the document.

Rules adopted at CINES on logic migration

The goal is to perform as few migrations as possible. That is why CINES rigorously selects the formats it accepts for archiving.
For more information on the selection of file formats, see the section “Selection”.

Any format conversion must take place with the explicit agreement of the submitting service. However, once authorization has been given, the operation is transparent to that service, and CINES is responsible for maintaining the link between the originally submitted files and their migrated forms.

Migrated files are archived in PAC in the same way as for any archiving project, and the original archived file is not deleted.
In accordance with ISO 27001, the last two versions of the migrated files are stored in addition to the original file:

If F(0) is the file in its original format and F(n) the file in its most recently migrated form, PAC retains F(0), F(n) and F(n-1).

Thus, during the third migration of a file, it is necessary to destroy the first migrated form, and so on.
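
As an illustration only, the retention rule above can be sketched in a few lines of Python. The function names and the integer indexing of versions (0 for the original, k for the k-th migrated form) are assumptions made for this example; they do not describe how PAC is actually implemented.

```python
def retained_versions(n: int) -> list[int]:
    """Indices of the versions kept once n migrations have been performed.

    Index 0 stands for the original file F(0); index k stands for the
    k-th migrated form F(k). (Hypothetical indexing, for illustration.)
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Keep the original plus the two most recent migrated forms.
    candidates = {0, n, n - 1}
    return sorted(k for k in candidates if k >= 0)


def versions_to_destroy(n: int) -> list[int]:
    """Indices that stop being retained when the n-th migration is performed."""
    if n <= 0:
        return []
    before = set(retained_versions(n - 1))
    after = set(retained_versions(n))
    return sorted(before - after)
```

With this sketch, retained_versions(3) gives [0, 2, 3] and versions_to_destroy(3) gives [1], which matches the statement that the first migrated form is destroyed at the third migration.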

When an AIP consists of several files, only the file or files whose format is considered unsustainable are migrated. To avoid redundant storage, the AIP is not duplicated in its entirety. The links between the files that made up the initial AIP and the migrated AIP are recorded in index tables.
The archiving of migrated files in PAC nevertheless respects the unity of the initial AIP.
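
The selective migration and its index table could look like the following sketch. Everything specific in it is hypothetical: the function name, the dictionary layout, the AIP identifier and the example formats ("doc", "wpd", "pdf") are assumptions chosen for illustration, not the structure actually used in PAC.

```python
def plan_migration(aip_id: str, files: dict[str, str],
                   targets: dict[str, str]) -> dict:
    """Build an index table for the files of an AIP that need migration.

    files   maps each file name in the AIP to its format (hypothetical layout).
    targets maps each unsustainable format to its replacement format.
    Files whose format is not listed in targets are left untouched and are
    not copied into the migrated AIP.
    """
    index = []
    for name, fmt in sorted(files.items()):
        if fmt in targets:
            stem = name.rsplit(".", 1)[0]
            index.append({
                "original_aip": aip_id,    # link back to the initial AIP
                "original_file": name,
                "migrated_file": f"{stem}.{targets[fmt]}",
            })
    return {"source_aip": aip_id, "index": index}


# Example with made-up data: only the two obsolete files are migrated.
plan = plan_migration(
    "aip-001",
    {"report.doc": "doc", "data.csv": "csv", "notes.wpd": "wpd"},
    {"doc": "pdf", "wpd": "pdf"},
)
```

The index entries carry the identifier of the initial AIP, so the initial AIP stays the reference and the migrated files can always be traced back to it.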

The sip.xml accompanying these migrated files is a reduced form of the “classic” sip.xml: it contains only the metadata necessary to identify it and to connect it to the initial AIP. The initial AIP remains the “reference” for users.
In addition, a certain amount of information about the “logic migration” operation must be thoroughly documented, preserved over time, and linked directly to the migrated files:

This information is compiled into an XML metadata file. There is one metadata file per migration operation, and all files migrated in the same operation refer to the same metadata file.

To date, no logic migration has been carried out on the archives kept in PAC.
The specifications described above provide the framework for conducting such an operation.

Two tools from the European program “Planets” help with preservation planning.

Plato

A tool for analyzing different scenarios that gives solid, well-documented recommendations to help choose a preservation planning strategy according to predefined criteria.

Open Preservation Foundation

A dedicated research and experimentation environment where results can be evaluated and shared with the community at large.