Synthetic Data Examples – Realistic – using AI (SYNDERAI), pronounced /ˈsɪn.də.raɪ/

SYNDERAI Synthetic Data Generation and Use Policy

FHIR context

Applies to: All FHIR resources and datasets published under the SYNDERAI Synthetic Dataset Collection

Effective Date: 2025-10-01

Version: 1.2

Maintained by: SYNDERAI Project, a task of work package 3.3 of the xShare Project

Purpose

This policy defines the conditions under which synthetic (non-real) healthcare data are generated, validated, published, and used within the SYNDERAI environment.

Its purpose is to ensure that no real patient, provider, or organization can be re-identified or inferred from the data while maintaining structural and semantic fidelity to HL7 FHIR resources.

Definition

Synthetic data refers to data that are entirely artificially generated and not derived from any identifiable individual. Synthetic datasets may be:

No original patient data are used, and no record linkage to real systems is possible.

Generation Process

Data are generated by designated software components (FHIR Device resources identified in Provenance).
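As an illustration of how a generating component could be identified, the following minimal sketch builds a FHIR R4 Provenance resource whose agent points to a Device. It is not taken from the SYNDERAI tooling: the function name `build_provenance` and the resource ids `Patient/example-synthetic-patient` and `Device/example-generator` are hypothetical, and only the elements that FHIR R4 requires on Provenance (target, recorded, agent.who) are filled in.

```python
import json
from datetime import datetime, timezone


def build_provenance(target_refs, device_ref, recorded=None):
    """Return a minimal FHIR R4 Provenance dict naming a Device as agent.

    target_refs: references to the generated resources (hypothetical ids).
    device_ref:  reference to the generating Device (hypothetical id).
    recorded:    FHIR instant string; defaults to the current UTC time.
    """
    if recorded is None:
        recorded = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return {
        "resourceType": "Provenance",
        "target": [{"reference": ref} for ref in target_refs],
        "recorded": recorded,
        # agent.who identifies the software component that produced the data
        "agent": [{"who": {"reference": device_ref}}],
    }


if __name__ == "__main__":
    provenance = build_provenance(
        ["Patient/example-synthetic-patient"],  # hypothetical target
        "Device/example-generator",             # hypothetical generator
    )
    print(json.dumps(provenance, indent=2))
```

A consumer of a dataset could then follow `agent[0].who.reference` from the Provenance resource to the Device that generated the referenced records.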

Generation models are validated for structural correctness (FHIR Shorthand / JSON) and semantic plausibility (code system integrity, terminology consistency).
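The two kinds of checks named above can be pictured with a deliberately small sketch. It assumes a resource is available as a parsed JSON dict and only tests that a `resourceType` is present and that every entry in a `coding` list carries both a `system` and a `code`. The helper names `check_resource` and `iter_codings` are hypothetical, and this is not the project's actual validation; real structural and terminology validation would normally rely on a FHIR validator and a terminology service.

```python
def iter_codings(node, path=""):
    """Yield (path, coding) for every dict found inside a 'coding' list."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == "coding" and isinstance(value, list):
                for i, item in enumerate(value):
                    if isinstance(item, dict):
                        yield f"{child}[{i}]", item
            else:
                yield from iter_codings(value, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from iter_codings(item, f"{path}[{i}]")


def check_resource(resource):
    """Return a list of problems found in a FHIR resource given as a dict."""
    problems = []
    # Structural check: every FHIR resource must declare its type
    if not isinstance(resource.get("resourceType"), str):
        problems.append("missing resourceType")
    # Terminology check (shallow): each coding needs a system and a code
    for path, coding in iter_codings(resource):
        if not coding.get("system"):
            problems.append(f"{path}: coding has no system")
        if not coding.get("code"):
            problems.append(f"{path}: coding has no code")
    return problems


if __name__ == "__main__":
    observation = {  # illustrative resource, not from the SYNDERAI collection
        "resourceType": "Observation",
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": "8867-4"},
                {"code": "x"},  # deliberately incomplete
            ]
        },
    }
    for problem in check_resource(observation):
        print(problem)
```

An empty list means the resource passed these shallow checks; it does not mean that the codes exist in the named code systems.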

Data are labelled with:

Privacy and Legal Status

Synthetic data are not personal data under GDPR (Recital 26) because they do not relate to an identifiable natural person. They may therefore be freely shared for education, research, interoperability testing, and demonstration purposes, provided that redistribution retains this policy reference.

Reuse and Citation

Reuse of SYNDERAI synthetic data is permitted under the Creative Commons Attribution 4.0 International License (CC BY 4.0), provided that attribution includes:

Data generated under the SYNDERAI Synthetic Data Policy (v 1.2) https://synderai.net/synderai-synthetic-data-policy

Limitations

Synthetic data are not guaranteed to reflect clinical truth or epidemiological prevalence.

They must not be used for clinical decision-making or in any production environment.

Contact

For questions or verification of data generation methods: synthetic-data@synderai.net