A file format conversion is the transformation of a set of files stored in a format considered non-sustainable into another format considered sustainable, in order to keep the information they contain readable.
It may take place before archiving, if the files were produced in formats that are not sustainable, or during storage, if the archived format has become obsolete.
At CINES, we chose to adopt the term “logic migration”, which covers the format conversion itself as well as all the preliminary stages of analysis and testing.
This is explained in the “logic migration” process that we detailed in the definition of our business processes.
Before embarking on any actual conversion, it is very important to conduct a test phase, accompanied by a feasibility study, in order to keep the operation fully under control.
This study determines:
- which information in the document must be preserved as a priority;
- which processes will be used (automatic or manual, depending on the number of files to migrate);
- whether migration software already exists (free or paid);
- and what the effects of migration on the files will be: acceptable losses, no loss, too many losses …
Any format migration generally results in some degradation of the information to be retained, in its form and/or its content. The important thing is to preserve the fidelity of the document.
Rules adopted at CINES on logic migration
The goal is to perform as few migrations as possible. That is why CINES rigorously selects the formats it accepts for archiving.
For more information on the selection of file formats, see the section “Selection”.
Any format conversion must take place with the explicit agreement of the submitting service. However, once authorization has been given, the operation is transparent to that service, and CINES is responsible for maintaining the link between the original submitted files and their migrated forms.
Migrated files are archived in PAC in the same way as for any archiving project, and the original archived file is not deleted.
In accordance with ISO 27001, the last two versions of migrated files will be stored in addition to the original file:
If F(0) is the file in the original format and F(n) is the file in its latest migrated form, PAC retains F(0), F(n) and F(n-1).
Thus, during the third migration of a file, it is necessary to destroy the first migrated form, and so on.
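The retention rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not a description of how PAC implements it; the function names `retained_versions` and `versions_to_destroy` are hypothetical.

```python
def retained_versions(n: int) -> list[int]:
    """Return the version indices kept after the n-th migration.

    Index 0 stands for the original file F(0); index k stands for the
    k-th migrated form F(k). Hypothetical helper, for illustration only.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # F(0) is always kept, together with F(n-1) and F(n) when they exist.
    return sorted({0, max(n - 1, 0), n})


def versions_to_destroy(n: int) -> list[int]:
    """Return the migrated forms that are no longer retained after the n-th migration."""
    if n <= 0:
        return []
    kept_before = set(retained_versions(n - 1))
    kept_now = set(retained_versions(n))
    return sorted(kept_before - kept_now)
```

With this sketch, `retained_versions(3)` yields the indices 0, 2 and 3, and `versions_to_destroy(3)` yields index 1, which matches the case of the third migration described above.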
When an AIP consists of several files, only the file or files whose format is considered not sustainable are migrated. To avoid redundant storage, the AIP is not duplicated in its entirety. The links between the files that made up the initial AIP and the migrated AIP are recorded in index tables.
The archiving of migrated files in PAC nevertheless preserves the unity of the initial AIP.
The sip.xml accompanying these migrated files is a reduced form of the “classic” sip.xml: it contains only the metadata necessary for its identification and its connection to the initial AIP. The initial AIP remains the “reference” for the users.
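As a rough illustration of partial migration and index tables, the sketch below builds one index row per file of an initial AIP and fills in the migrated side only for files in a non-sustainable format. The actual structure of the PAC index tables is not described here, so the `IndexEntry` fields, the `build_index` function and the rule for naming migrated files are all assumptions.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class IndexEntry:
    """One row of a hypothetical index table linking initial and migrated AIPs."""
    initial_aip: str
    initial_file: str
    migrated_aip: Optional[str] = None   # None: the file was not migrated
    migrated_file: Optional[str] = None


def build_index(initial_aip, file_formats, unsustainable, migrated_aip, target_suffix):
    """Build index rows for an AIP; only files in unsustainable formats are migrated.

    file_formats maps each file name to a format label; unsustainable is the
    set of format labels to migrate. The migrated file name is derived by
    swapping the suffix, which is an assumption made for this sketch.
    """
    rows = []
    for name, fmt in file_formats.items():
        if fmt in unsustainable:
            new_name = str(PurePosixPath(name).with_suffix(target_suffix))
            rows.append(IndexEntry(initial_aip, name, migrated_aip, new_name))
        else:
            # Sustainable files stay only in the initial AIP and are not duplicated.
            rows.append(IndexEntry(initial_aip, name))
    return rows
```

Keeping a row for every file of the initial AIP, migrated or not, is one way to keep the initial AIP as the single reference while still pointing to the migrated forms.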
In addition, a certain amount of information about the “logic migration” operation must be carefully documented and preserved, and linked directly to the migrated files:
- name and date of the migration operation;
- name and version of the original and target formats;
- number of files involved;
- name of the conversion software and of the format control software used;
- list of the characteristics of the file contents that had to be preserved.
This information is compiled into an XML metadata file. There is one metadata file per migration operation, and all files migrated in the same operation refer to the same metadata file.
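To make the content of such a metadata file more concrete, here is a minimal sketch that serializes the listed items with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`migration`, `sourceFormat`, `controlSoftware` and so on) are invented for this example and do not reflect the actual CINES schema.

```python
import xml.etree.ElementTree as ET


def build_migration_metadata(name, date, source_format, source_version,
                             target_format, target_version, file_count,
                             conversion_software, control_software,
                             preserved_characteristics) -> str:
    """Serialize one migration operation as an XML string.

    All element and attribute names are hypothetical.
    """
    root = ET.Element("migration")
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "date").text = date
    # Original and target formats, each with its version as an attribute.
    ET.SubElement(root, "sourceFormat", version=source_version).text = source_format
    ET.SubElement(root, "targetFormat", version=target_version).text = target_format
    ET.SubElement(root, "fileCount").text = str(file_count)
    ET.SubElement(root, "conversionSoftware").text = conversion_software
    ET.SubElement(root, "controlSoftware").text = control_software
    # One child element per content characteristic that had to be preserved.
    characteristics = ET.SubElement(root, "preservedCharacteristics")
    for item in preserved_characteristics:
        ET.SubElement(characteristics, "characteristic").text = item
    return ET.tostring(root, encoding="unicode")
```

Since every file migrated in the same operation refers to the same metadata file, a function like this would be called once per operation rather than once per file.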
Currently, no logic migration has been conducted on the archives kept in PAC.
The specifications described above make it possible to plan for such an operation.
Two tools from the European program “Planets” can help with preservation planning.
A tool for analyzing different scenarios, which gives solid, well-documented recommendations to help choose a preservation strategy according to predefined criteria.
Open Preservation Foundation
A dedicated research and experimentation environment where results can be evaluated and shared with the community at large.