Analysis and design of scalable pre-processing techniques of instances for imbalanced Big Data problems : Applications in humanitarian emergencies situations

The enormous volume of data from different sources, highly varied in type, generated and processed at great speed, is known as Big Data. The importance of data lies in extracting knowledge from it. Hence, being able to take advantage of a large amount of data allows us to explore and better...


Bibliographic Details
Main author: Basgall, María José
Format: Article, journal contribution
Language: English
Published: 2022
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/146935
Contributed by:
id I19-R120-10915-146935
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Big Data
Data preprocessing
Machine Learning
description Big Data refers to the enormous volume of data that comes from different sources, varies widely in type, and is generated and processed at great speed. The importance of data lies in extracting knowledge from it. Being able to take advantage of a large amount of data therefore allows us to explore and better understand problems and, a priori, to provide higher-quality solutions. To do this, applying Machine Learning to generate models is essential, as is Smart Data, so that these models reflect reality and support decision-making. However, the Machine Learning techniques that have offered good results until now cannot always handle Big Data because of scalability issues. They therefore need to be adapted to work in distributed environments, or new techniques or strategies need to be created for this new scenario. In addition, datasets often have undesired characteristics or complexities that interfere with the effectiveness of the knowledge extraction process, so they must be preprocessed, because most learning models assume that the data are free of those characteristics. Since few scalable solutions capable of handling Big Data exist for this topic, this thesis addresses the distributed and scalable pre-processing of Big Data sets in order to obtain good-quality data, known as Smart Data. In particular, it focuses on classification problems and on the following characteristics: (a) imbalanced data; (b) redundancy; (c) high dimensionality; and (d) overlapping.
format Article
Journal contribution
author Basgall, María José
author_facet Basgall, María José
author_sort Basgall, María José
title Analysis and design of scalable pre-processing techniques of instances for imbalanced Big Data problems : Applications in humanitarian emergencies situations
publishDate 2022
url http://sedici.unlp.edu.ar/handle/10915/146935
work_keys_str_mv AT basgallmariajose analysisanddesignofscalablepreprocessingtechniquesofinstancesforimbalancedbigdataproblemsapplicationsinhumanitarianemergenciessituations
bdutipo_str Repositories
_version_ 1764820460611043330