Frequently Asked Questions

Access to the data

  • Yes, a private URL gives access to files in a draft dataset that has not yet been published even if these files are restricted.

  • Yes, if the metadata provides additional information to that entered in the Recherche Data Gouv repository.

    Note: If the other repository assigns a DOI, do not create a dataset in the Recherche Data Gouv repository.

  • You can restrict access to a file in a dataset. The Recherche Data Gouv repository does not automatically manage the duration of this access restriction (embargo); it is up to the depositor of the dataset to remove the restriction when the time comes. 

    To restrict access to a dataset as a whole, it must not have been published. Access to this dataset will then only be possible to authorized persons or via its private URL.

    Access to a published Dataverse collection cannot be restricted.

  • The "Link to data" metadata indicates the direct link to the data. A link to a database as a whole does not allow users to locate precisely the data described by the dataset. The database from which the dataset is extracted can be mentioned in the "Data sources" metadata.

  • Yes, the data are indexed by Google Dataset Search.

  • The accessibility of the datasets remains the same. However, you should keep a contact person for the dataset (the administrator of the parent collection for example) and also possibly assign rights to modify the dataset or give access to restricted files.

Account management

  • After logging in, go to your user profile and click on "My data" at the top right of the screen. In the facets, just select "Dataverses" and the "Administrator" role.

  • There is no validation of registrations for people who create an external account. This external account does not open any rights by default.

  • You will no longer be able to log into the institutional account linked to the institution you are leaving. If this is the case, you will need to create a new account. This may either be an institutional account associated with your new institution, an ORCID account or an external account. Then, please contact the Repository-registry resource centre (support-recherchedatagouv@inrae.fr) to request the merger of the two accounts and the conservation of the rights to previously created datasets.

  • Yes, it is possible to link an email alias to a user account.

Administration of the Dataverse collections

  • Yes, optional metadata can be made obligatory. This has no impact on datasets that have already been published as long as these are not modified. However, when a modification is made, the obligatory metadata needs to be filled in.

  • No, the Recherche Data Gouv platform makes no specific recommendations on graphics, although some technical constraints are indicated in the collection settings (Theme + Widget menu).

    Nonetheless you may consider it advisable to check the specific documentation of the institutional space you are working in to see if there are any particular recommendations.

  • No. However, it IS possible to identify datasets in a collection that were created on a specific date and remain unpublished.

    For example: enter dateOfDeposit:[* TO 2020] as your query, then select "unpublished" datasets via the facets (you need to log in to use this facet).

  • Most of the collection functionality is available through the Dataverse APIs, including assigning roles to set up a curation process. A dataset can be sent for review via the API and will only be published if the publish command is executed. A dataset created by the API can also be modified via the user interface.
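The search query and the API actions mentioned in the answers above can be combined in a small script. The following is a minimal Python sketch, not an official client: it only builds the request URLs. It assumes the production base URL https://entrepot.recherche.data.gouv.fr, a hypothetical collection alias and DOI, and that the "unpublished" facet corresponds to the publicationStatus:Unpublished filter of the Dataverse Search API. The submitForReview and actions/:publish paths follow the Dataverse native API and should be checked against the guide for the installed version.

```python
from urllib.parse import urlencode

# Assumption: production instance of the repository.
BASE_URL = "https://entrepot.recherche.data.gouv.fr"


def _query(params: dict) -> str:
    # Keep ":" and "/" readable; everything else is percent-encoded.
    return urlencode(params, safe=":/")


def search_unpublished_url(collection_alias: str,
                           query: str = "dateOfDeposit:[* TO 2020]") -> str:
    """URL listing unpublished datasets of one collection that match `query`.

    Assumption: the "unpublished" facet maps to fq=publicationStatus:Unpublished.
    """
    params = {
        "q": query,
        "type": "dataset",
        "subtree": collection_alias,
        "fq": "publicationStatus:Unpublished",
    }
    return f"{BASE_URL}/api/search?{_query(params)}"


def submit_for_review_url(persistent_id: str) -> str:
    """URL (to be called with POST) that sends a draft dataset for review."""
    path = "/api/datasets/:persistentId/submitForReview"
    return f"{BASE_URL}{path}?{_query({'persistentId': persistent_id})}"


def publish_url(persistent_id: str, version_type: str = "major") -> str:
    """URL (to be called with POST) that publishes a dataset."""
    if version_type not in ("major", "minor"):
        raise ValueError("version_type must be 'major' or 'minor'")
    path = "/api/datasets/:persistentId/actions/:publish"
    params = {"persistentId": persistent_id, "type": version_type}
    return f"{BASE_URL}{path}?{_query(params)}"


if __name__ == "__main__":
    # Hypothetical alias and DOI, for illustration only.
    print(search_unpublished_url("my-collection"))
    print(submit_for_review_url("doi:10.1234/ABCD"))
    print(publish_url("doi:10.1234/ABCD"))
```

All three requests need an API token, sent in the X-Dataverse-key header, because unpublished datasets are only visible to authorized users. The publish URL is kept separate from the review URL to mirror the answer above: a dataset sent for review is only published once the publish command is executed.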

Choosing a data repository

  • Only research data produced by research teams with at least one collaborator affiliated to an institution belonging to the French public research community are accepted by right in the Recherche Data Gouv repository.

  • A dataset needs to meet the following criteria to be published in the Recherche Data Gouv repository:

    • The data has to have been produced in a research context;

    • The data should preferably be structured and deposited in a machine-readable format.

  • Yes, you are strongly recommended to deposit your research data in a reference repository for your thematic community. If this is not possible, the data can be deposited in a repository housed by your institution or in the Recherche Data Gouv repository.

  • Yes, all the data from this kind of project can be deposited in the Recherche Data Gouv repository. If one of the partners possesses an institutional space, then the data should be deposited there in the project's dedicated collection. If this is not the case, the data should be deposited in the generic space.

    If the datasets have already been deposited elsewhere, please indicate the link to these datasets in the "Related datasets" metadata.

    Please remember to specify each producer in the "Producer" metadata.

Curation

  • No. Depositors can take no action on their datasets while these are being reviewed.

  • Most of the collection functionality is available through the Dataverse APIs, including assigning roles to set up a curation process. A dataset can be sent for review via the API and will only be published if the publish command is executed. A dataset created by the API can also be modified via the user interface.

Data documentation

  • Final versions of the data management plan (DMP) can be deposited in the Recherche Data Gouv repository in the same collection as the data it was used to present and will thus be assigned a DOI. Links in both directions between the datasets and this DMP must be set up via the Related Dataset metadata. If there is only one dataset, the DMP will be one of the files associated with it.

Data papers

  • Yes, a private URL gives access to files in a draft dataset that has not yet been published even if these files are restricted.

  • No, generating a data paper using the dedicated functionality requires you to use a dataset's DOI.

  • Either may be published first. This is a question of scientific strategy and is the authors' responsibility. If authors are concerned about the use of the data, the data paper can be written first but published after the scientific article has been published. If in doubt, please contact the editor.

Dataset templates

  • The Recherche Data Gouv dataset template only pre-fills in the information linked to the disseminator and the Open/Etalab licence. Collection administrators are strongly recommended to create new and more complete templates to make it easier to fill in as many metadata fields as possible when depositing, for example in the context of a project or thematic collection.

    To create a dataset template, please consult: 

    https://recherche.data.gouv.fr/en/category/27/guide/modifying-the-parameters-of-a-collection#Creating-dataset-templates

  • No, a template can only be used to create new datasets.

  • Collection: Yes, it is possible to have several templates in a collection including those of the parent collection (e.g. the INRAE template).

    Dataset: No, it is not possible to choose several templates for a dataset or to change your template once the dataset has been created.

  • Dataset templates are created by the administrator in a collection and apply to the datasets created in that collection.

Datasets

  • No, a dataset is always created in a collection. This collection can be the generic space or a collection in one of the institutional spaces.

  • Depositors are recommended to publish their datasets as soon as possible because the Recherche Data Gouv repository aims to serve as a repository for data access.

    It is advisable not to take more than one year to publish data which are not linked to a publication. After this period, an administrator of the collection the data is linked to may contact the depositor to alert him or her.

  • Yes, a dataset can be deleted as long as it has not yet been published. Only draft versions can be deleted.

  • The accessibility of a dataset does not depend on files being present. However, if all the files in a dataset are deleted then you should specify the link to the data in the "Link to data" metadata.

  • There is no limit to the size of a dataset.

    Please see "What is the size of an institutional space?" and "What is the maximum size for a file?".

Dataverse collections

  • A collection can only be linked to one parent collection, the collection in which it was created. One collection can be linked to another so that it appears as a sub-collection. To link a collection, please get in touch with the Repository-registry resource centre: support-recherchedatagouv@inrae.fr.

  • Yes, but only the Recherche Data Gouv repository resource center can move a dataset from one collection to another. Linking the dataset is preferable to moving it.

  • There is no maximum size for a collection other than the volume of the institutional space that contains it. Please see "What is the size of an institutional space?"

  • Only the functional administrators of the Recherche Data Gouv repository can do this and not the administrator of a given collection. This means you should think carefully about the positioning of a collection before creating it.

    It is possible to link a collection to another collection to make it appear as a sub-collection but the contents of the latter will not be visible in the target collection.

    To move or link a collection please contact the Repository-registry resource centre at support-recherchedatagouv@inrae.fr.

DOI

  • Yes, the DOI is based on the Handle identifier system and is ISO certified (ISO 26324, Digital Object Identifier System). Uniqueness is ensured by the fact that suffixes are unique for a given prefix.

  • When a Scientific Collective Infrastructure (SCI) has contributed to the dataset, it can be indicated in the Contributor metadata by specifying the "Type" and choosing "DOI" in "Contributor Identifier Scheme". The DOI itself is entered in "Contributor Identifier".

  • The sandbox publishes datasets only on the DataCite test environment, which is a closed system: it is the only way to directly find the resource corresponding to the DOI. The DOI does not lead to any landing page for the dataset.

  • The DOI of a dataset is generated and reserved when the dataset is created. It is activated when the dataset is published.

    As a reminder, data deposited in the Recherche Data Gouv repository must be published within 12 months.

  • No, it is not recommended to create a new dataset in the Recherche Data Gouv repository as this would mean a new DOI would be assigned to it. Please see the depositing guide:

    https://recherche.data.gouv.fr/en/category/9/guide/before-depositing

  • No, the DOI stays the same if the version changes. Please see the depositing guide:

    https://recherche.data.gouv.fr/en/category/9/guide/modifying-and-managing-versions-of-a-published-dataset
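The prefix/suffix structure mentioned in the first answer of this section can be illustrated with a short Python sketch. It is only an illustration of how a DOI string is split at its first slash; the function name and the example DOI are hypothetical, and the check on the prefix is deliberately loose.

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash.

    Accepts a bare DOI, a "doi:" identifier or a https://doi.org/ link.
    """
    cleaned = doi.strip()
    for lead in ("https://doi.org/", "doi:"):
        if cleaned.lower().startswith(lead):
            cleaned = cleaned[len(lead):]
            break
    prefix, sep, suffix = cleaned.partition("/")
    # DOI prefixes start with "10."; the suffix may itself contain slashes.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


if __name__ == "__main__":
    # Hypothetical DOI, for illustration only.
    print(split_doi("doi:10.1234/ABCD/EFGH"))
```

Because the suffix is unique within its prefix, the pair returned here identifies a single resource, whichever of the three notations was used as input.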

Files

  • The repository interface of the Recherche Data Gouv repository limits the upload to 1000 files at a time. Beyond that, it is possible to use the DVUploader tool or the native API.

  • Files can be exported in their original format. Tabular data can also be exported in RData and tab-delimited formats.

  • A file modification creates a new draft dataset. A new publication of this dataset makes such changes public and updates the version.

  • Recommendations for naming and organizing data files are available on the Doranum website (https://doranum.fr/tags/nommage-fichier/). In addition, specific recommendations can be proposed by the administrator of a Dataverse collection and integrated into the data management plan associated with it.

  • A file can only be deposited in one dataset. However, it is possible to refer to a file deposited in another dataset via the "Related Datasets" metadata.

  • This is not possible from the user interface. You have to use the native API (Accessing (downloading) files), indicating the DOI of the file, with the command https://entrepot.recherche.data.gouv.fr/api/datafile/:persistentId/?persistentId=doi:{file DOI}

  • It is not possible to modify the contents of a file in the user interface. To update a file, it needs to be replaced. Please note that this would involve a version change. The dataset must be republished for the new file to be accessed.

  • Yes, all file formats are accepted. If your dataset has a large tree structure, we strongly recommend you use DVUploader (https://recherche.data.gouv.fr/en/category/33/guide/dv-uploader-1) to upload all your files and the tree structure rather than an archive.

  • Please consult the guide to identify the cause of the error (https://recherche.data.gouv.fr/en/category/9/guide/depositing-a-dataset#Tabulated+data+files), then replace the problem file with the corrected file.

  • The maximum size for a file is 50 GB.

  • No, it is not possible to sort files by number of downloads. You can sort the files in a dataset by name, date of deposit, size and category (file type).

  • No, if the file contains several sheets, it can be uploaded to the Recherche Data Gouv repository but only the first sheet is ingested and thus transformed into a .tab file. 

    For ingestion to succeed, it is therefore advisable to upload only files containing a single sheet, with the variables on the first line (column headers) and one observation per line (see Case of tabulated data files).

  • It is possible to use anonymization tools, such as Amnesia (https://amnesia.openaire.eu/), which allows anonymized data to be sent directly to the Recherche Data Gouv repository.
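Several answers in this section give figures that can be checked before depositing (1000 files per upload in the interface, 50 GB per file) and one gives a URL pattern for downloading a file by its DOI. The following Python sketch shows how these could be handled; it is an illustration under assumptions, not a repository tool. "50 GB" is read here as 50 × 10^9 bytes, the function names and example values are hypothetical, and the endpoint path is copied as it appears in the answer above, so it should be checked against the Dataverse Data Access API documentation before use.

```python
from urllib.parse import urlencode

MAX_FILES_PER_UPLOAD = 1000        # limit of the repository interface
MAX_FILE_SIZE_BYTES = 50 * 10**9   # assumption: "50 GB" means decimal gigabytes

# Assumption: base URL and path as quoted in the answer on file download.
BASE_URL = "https://entrepot.recherche.data.gouv.fr"
DOWNLOAD_PATH = "/api/datafile/:persistentId/"


def upload_batches(paths: list[str],
                   size: int = MAX_FILES_PER_UPLOAD) -> list[list[str]]:
    """Split a list of file paths into batches the interface can accept."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]


def oversized_files(sizes: dict[str, int],
                    limit: int = MAX_FILE_SIZE_BYTES) -> list[str]:
    """Return the names of files whose size in bytes exceeds the limit."""
    return sorted(name for name, n_bytes in sizes.items() if n_bytes > limit)


def file_download_url(file_doi: str) -> str:
    """Build the download URL for a file identified by its DOI."""
    query = urlencode({"persistentId": f"doi:{file_doi}"}, safe=":/")
    return f"{BASE_URL}{DOWNLOAD_PATH}?{query}"


if __name__ == "__main__":
    # Hypothetical file names, sizes and DOI, for illustration only.
    names = [f"file_{i}.csv" for i in range(2500)]
    print([len(batch) for batch in upload_batches(names)])
    print(oversized_files({"small.csv": 10_000, "huge.nc": 60 * 10**9}))
    print(file_download_url("10.1234/ABCD/EFGH"))
```

Beyond 1000 files, batching in the interface is only one option; the answer above also mentions DVUploader and the native API, which avoid the manual split.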

General terms and conditions

  • The secure hosting and availability of the data is guaranteed for a minimum of 5 years renewable after publication. This period is a minimum, not a maximum period of data retention, which may vary depending on the dataset.

  • Yes, all the data from this kind of project can be deposited in the Recherche Data Gouv repository. If one of the partners possesses an institutional space, then the data should be deposited there in the project's dedicated collection. If this is not the case, the data should be deposited in the generic space.

    If the datasets have already been deposited elsewhere, please indicate the link to these datasets in the "Related datasets" metadata.

    Please remember to specify each producer in the "Producer" metadata.

  • Secure hosting and availability of the data is guaranteed by the Recherche Data Gouv repository representative for a minimum of 5 years renewable after the dataset is published.

    The link between DOI and dataset description page in the Recherche Data Gouv repository is guaranteed. This follows the rules set by DataCite, the agency managing the DOI. 

Harvesting

  • No, Zenodo harvesting is not supported by the current version of the Dataverse software.

Institutional repositories

  • Information on how to request the creation of an institutional repository is available on the "Joining the ecosystem" page.

  • An institutional space has 5 TB when it is created.

  • By default, the volume of an institutional space is 5 TB.

Metadata

  • Most of the metadata comes from the Dataverse software (see Metadata References) and is compliant with the Data Documentation Initiative (DDI), Dublin Core, DataCite and ISA-Tab standards. Other metadata has been created for the specific needs of an institution, such as the Semantic resource block developed by INRAE.

    The values of some metadata can come from external repositories or from the Recherche Data Gouv repository.

  • The author is the person responsible for the dataset and is cited in the dataset. The authorName metadata associated with the DOI is obligatory in the Recherche Data Gouv repository.

    A contributor is a person or organisation who/which took part in the collection, management, distribution of the dataset or contributed to it in any other way. Contributors do not appear in dataset citations.

  • When a Scientific Collective Infrastructure (SCI) has contributed to the dataset, it can be indicated in the Contributor metadata by specifying the "Type" and choosing "DOI" in "Contributor Identifier Scheme". The DOI itself is entered in "Contributor Identifier".

  • Yes, the DOI is based on the Handle identifier system and is ISO certified (ISO 26324, Digital Object Identifier System). Uniqueness is ensured by the fact that suffixes are unique for a given prefix.

  • Information about the journal in which the data is published should be entered here.

  • The Data Documentation Initiative (DDI) is a standard created by the DDI Alliance which enables users to document survey and observational data in the social, behavioural, economic and health sciences.

  • Yes, it is perfectly possible to complete the metadata for a dataset after this has been published. A new (minor or major) version will thus be created and must be published in turn.

  • No, the choice of language depends on the target audience for your publication. Different "Description" metadata can be created by specifying the language of each.

  • No, this functionality does not exist from the user interface. It is however possible to use APIs to perform this action. For more information, contact the Recherche Data Gouv repository resource center.

  • It is not possible for a user to add metadata to the form. It is possible, however, to upload a documentation file containing the specific metadata alongside the data file.

    To suggest that metadata be added to the input form, contact the Recherche Data Gouv repository resource center.

  • Yes, if the metadata provides additional information to that entered in the Recherche Data Gouv repository.

    Note: If the other repository assigns a DOI, do not create a dataset in the Recherche Data Gouv repository.

  • The ORCID identifier is recommended in the second French Plan for Open Science to identify authors and contributors.

    However, you can use other identifiers in the Recherche Data Gouv repository, knowing that it is only possible to indicate one identifier per author or contributor.
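Since only one identifier can be entered per author or contributor, it can be worth checking an ORCID iD before typing it into the metadata form. The Python sketch below verifies the final check character of an ORCID iD using the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers. It is a local format check only, written as an illustration: the function name is hypothetical and the check does not confirm that the iD is registered or belongs to the right person.

```python
def is_valid_orcid(orcid: str) -> bool:
    """Check the format and check character of an ORCID iD.

    Accepts "0000-0002-1825-0097" or the https://orcid.org/ form.
    """
    value = orcid.strip()
    lead = "https://orcid.org/"
    if value.startswith(lead):
        value = value[len(lead):]
    chars = value.replace("-", "")
    if len(chars) != 16 or not chars[:15].isdigit():
        return False
    # ISO 7064 MOD 11-2: accumulate over the first 15 digits.
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15].upper() == expected


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    print(is_valid_orcid("0000-0002-1825-0098"))
```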

Sandbox and production environments

  • No, these two environments are independent and there is no possibility to export a dataset from one to import it into the other.

  • Yes, there is a sandbox: https://demo.recherche.data.gouv.fr/

    All authorised users can create datasets or collections while in the "Travaux Pratiques" collection.

  • When you create an account in the sandbox environment, you then have the default permissions of a dataset and collection creator in the "Travaux Pratiques" collection.

  • The sandbox publishes datasets only on the DataCite test environment, which is a closed system: it is the only way to directly find the resource corresponding to the DOI. The DOI does not lead to any landing page for the dataset.

The generic space

  • No, a dataset can only be deposited in the generic space if neither the depositor nor any of her/his team members has access to an institutional space.

  • Yes, it is possible to deposit data in the generic space rather than waiting for an institutional space to be set up. However, if one of the team members has an institutional space then the dataset must be deposited there.

Types of content

  • Final versions of the data management plan (DMP) can be deposited in the Recherche Data Gouv repository in the same collection as the data it was used to present and will thus be assigned a DOI. Links in both directions between the datasets and this DMP must be set up via the Related Dataset metadata. If there is only one dataset, the DMP will be one of the files associated with it.

User access and rights management