Much of the challenge of long-term preservation lies in file formats and whether they can still be interpreted in the distant future.
Condition No. 1 for a format to be archivable is that it must be usable in its entirety and indefinitely.
For this, there must exist an open specification that describes all of its features.
The format and its specification must be free of any exploitation rights, with no time limit.
The goal is to find a set of criteria to ensure Condition No. 1.
The format specification must be open. If the specification is associated with a standard, that standard ensures it is correctly described.
If there is no associated standard, the format must be widely used, because wide use suggests that the specification has been exercised enough to be well written.
Note that, as a consequence, such a format may be proprietary.
To be archived at CINES, a format must meet three criteria:
- Published
- Widely used (or expected to become so)
- Standardized (if possible)
This selection is necessary for:
- Checking format validity,
- Migration (conversion to another format),
- Reading and understanding the format.
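As an illustration only, the three criteria can be read as a simple eligibility check. The sketch below follows one possible reading of the text: publication is mandatory, and either a standard or wide use vouches for the quality of the specification. The names `FormatProfile` and `meets_criteria`, and the example formats, are hypothetical and do not describe any actual CINES tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatProfile:
    """Hypothetical description of a file format under evaluation."""
    name: str
    published: bool      # an open specification is publicly available
    widely_used: bool    # broadly adopted, or expected to become so
    standardized: bool   # the specification is backed by a standard


def meets_criteria(fmt: FormatProfile) -> bool:
    """Assumed reading: published AND (standardized OR widely used)."""
    return fmt.published and (fmt.standardized or fmt.widely_used)


# Made-up examples, not real formats.
examples = [
    FormatProfile("format-a", published=True, widely_used=False, standardized=True),
    FormatProfile("format-b", published=True, widely_used=True, standardized=False),
    FormatProfile("format-c", published=False, widely_used=True, standardized=False),
    FormatProfile("format-d", published=True, widely_used=False, standardized=False),
]

eligible = [f.name for f in examples if meets_criteria(f)]
```

Under this reading, a published proprietary format that is widely used passes the check, which matches the remark that an archivable format may be proprietary.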
The general policy of the Archive at CINES is to minimize the number of formats collected in order to, among other things:
- Facilitate the management of the logical format migration process,
- Enable a better technology watch on these formats.
Format monitoring at CINES
The criteria outlined above form the basis of CINES's thinking on the selection of file formats for archiving.
Format monitoring is organized around a set of lists that record the different file formats followed by the PAC team.
There are five such lists:
- The list of formats under study contains the formats proposed by the submission services and the emerging formats detected by the technology watch.
- The list of potentially archivable formats includes the formats judged to meet the criteria described above.
- The list of archivable formats consists of formats considered relevant for archiving at PAC and validated by management. Formats on this list can be stored on the CINES platform. Only this list is public. An archivable format may nevertheless never actually be archived in PAC if no service submits files in that format.
- The list of formats approaching obsolescence contains archivable formats for which the monitoring unit has detected a risk of obsolescence. Negotiations are then undertaken with the services to stop submissions in these formats.
- The list of obsolete formats includes formats that were on the archivable formats list and that, under the eligibility criteria of the PAC platform, are now considered obsolete. These formats are no longer archivable in PAC and must be logically migrated to preserve the readability of their contents.
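The five lists can be pictured as the stages of a format's life cycle. The following is a minimal sketch, assuming a format moves forward from one list to the next. The text only states that formats approaching obsolescence and obsolete formats come from the archivable list, so the other transitions are an assumption. All names (`WatchList`, `ALLOWED_MOVES`, `move`, `is_public`, `needs_migration`) are hypothetical.

```python
from enum import Enum


class WatchList(Enum):
    UNDER_STUDY = "under study"
    POTENTIALLY_ARCHIVABLE = "potentially archivable"
    ARCHIVABLE = "archivable"
    BECOMING_OBSOLETE = "approaching obsolescence"
    OBSOLETE = "obsolete"


# Assumed forward-only transitions between lists.
ALLOWED_MOVES = {
    WatchList.UNDER_STUDY: {WatchList.POTENTIALLY_ARCHIVABLE},
    WatchList.POTENTIALLY_ARCHIVABLE: {WatchList.ARCHIVABLE},
    WatchList.ARCHIVABLE: {WatchList.BECOMING_OBSOLETE, WatchList.OBSOLETE},
    WatchList.BECOMING_OBSOLETE: {WatchList.OBSOLETE},
    WatchList.OBSOLETE: set(),
}


def move(current: WatchList, target: WatchList) -> WatchList:
    """Return the new list, or raise if the move is not allowed."""
    if target not in ALLOWED_MOVES[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def is_public(lst: WatchList) -> bool:
    """Only the archivable formats list is published."""
    return lst is WatchList.ARCHIVABLE


def needs_migration(lst: WatchList) -> bool:
    """Obsolete formats must be logically migrated."""
    return lst is WatchList.OBSOLETE
```

For example, `move(WatchList.ARCHIVABLE, WatchList.BECOMING_OBSOLETE)` succeeds, while `move(WatchList.UNDER_STUDY, WatchList.ARCHIVABLE)` raises `ValueError` because, in this sketch, a format must first pass through the potentially archivable list.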