Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/101583
Title: Generating Synthetic Missing Data: A Review by Missing Mechanism
Authors: Santos, Miriam Seoane 
Pereira, Ricardo Cardoso 
Costa, Adriana Fonseca 
Soares, Jastin Pompeu 
Santos, Joao
Abreu, Pedro Henriques 
Keywords: Data preprocessing; missing data; missing data generation; missing data mechanisms
Issue Date: 2019
Project: NORTE-01-0145-FEDER-000027 
FCT - SFRH/BD/138749/2018 
Serial title, monograph or event: IEEE Access
Volume: 7
Abstract: The performance evaluation of imputation algorithms often involves the generation of missing values. Missing values can be inserted in only one feature (univariate configuration) or in several features (multivariate configuration) at different percentages (missing rates) and according to distinct missing mechanisms, namely, missing completely at random, missing at random, and missing not at random. Since the missing data generation process defines the basis for the imputation experiments (configuration, missing rate, and missing mechanism), it is essential that it is appropriately applied; otherwise, conclusions derived from ill-defined setups may be invalid. The goal of this paper is to review the different approaches to synthetic missing data generation found in the literature and discuss their practical details, elaborating on their strengths and weaknesses. Our analysis revealed that creating missing at random and missing not at random scenarios in datasets comprising qualitative features is the most challenging issue in the related work and, therefore, should be the focus of future work in the field.
URI: https://hdl.handle.net/10316/101583
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2891360
Rights: openAccess
Appears in Collections: I&D CISUC - Artigos em Revistas Internacionais
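
Illustrative note: the three missing mechanisms named in the abstract can be sketched for a univariate configuration as below. This is a minimal sketch under stated assumptions, not a method taken from the paper: the function names (`mcar`, `mar`, `mnar`), the toy `age` and `income` features, and the simple rank-based rule (hide the entries with the lowest values) are all hypothetical choices made for illustration. Only the Python standard library is used.

```python
import random

def mcar(values, rate, rng):
    """MCAR: hide a fraction `rate` of entries chosen uniformly at random."""
    n_missing = round(rate * len(values))
    hidden = set(rng.sample(range(len(values)), n_missing))
    return [None if i in hidden else v for i, v in enumerate(values)]

def mar(values, observed, rate):
    """MAR (rank-based sketch): hide the entries of `values` whose
    companion `observed` feature has the lowest values."""
    n_missing = round(rate * len(values))
    order = sorted(range(len(values)), key=lambda i: observed[i])
    hidden = set(order[:n_missing])
    return [None if i in hidden else v for i, v in enumerate(values)]

def mnar(values, rate):
    """MNAR (rank-based sketch): missingness depends on the hidden
    values themselves, so the lowest entries of `values` are removed."""
    return mar(values, values, rate)

# Hypothetical toy data: two quantitative features for ten records.
age = [23, 35, 41, 29, 52, 47, 38, 60, 31, 44]
income = [3100, 1800, 2500, 4000, 2100, 3600, 2700, 4500, 3300, 2200]

rate = 0.3  # missing rate: 30% of the entries of `income`
income_mcar = mcar(income, rate, random.Random(0))
income_mar = mar(income, age, rate)
income_mnar = mnar(income, rate)

print(income_mcar)
print(income_mar)
print(income_mnar)
```

In this sketch the missing rate is identical across the three calls, so only the mechanism changes: under MCAR the hidden positions are unrelated to the data, under MAR they follow the observed `age` feature, and under MNAR they follow the `income` values being removed. A multivariate configuration would apply the same idea to several features.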

This item is licensed under a Creative Commons License.