Privacy-Preserving Data Synthesis Using Trusted Execution Environments
Date
2022
Publisher
Tartu Ülikool
Abstract
Data synthesis is the process of generating new synthetic data from existing data. Often
companies do not have the in-house competence to synthesize data themselves and
are willing to outsource the process. However, synthesis requires access to the original
data. Sharing data with a third party can be complex, especially if it contains sensitive
information or is considered personal data under regulations such as the GDPR.
The goal of this thesis is to develop a proof-of-concept privacy-preserving data
synthesis service showing that it is possible to use trusted execution environments to
perform data synthesis in a privacy-preserving manner. Such a service would enable
outsourcing the data synthesis process to an untrusted remote server by ensuring that
both the original and synthesized data are fully hidden from the untrusted server host
throughout the lifecycle of the service.
A prototype of the service was developed in the scope of an ongoing proof-of-concept
project. To achieve the required security goals, the service prototype uses trusted execution
environment technologies, specifically the Sharemind HI development platform, which is
in turn based on the Intel SGX platform. The developed service shows that synthesizing
data in a privacy-preserving manner is indeed feasible if trusted execution environments
are used. However, future work is needed to optimize the service to allow larger input
and output files, and to support additional data synthesis methods.
Keywords
Data synthesis, trusted execution environments, privacy-preserving technologies