
Università degli Studi di Napoli Federico II

Computer Science

Bachelor's degree

Author

Giuseppe Luongo

2021

Design and development of a data augmentation tool to support the training of neural networks

Artificial Intelligence
Teoresi supervisors involved

Alessandro Serrapica

Academic supervisors

Prof. Del Riccio


Abstract

This tool builds a reliable dataset for training a machine learning model, with the aim of improving the accuracy of its predictions. To do this, we act on two fundamental characteristics of the dataset: quality and quantity. For quality, the tool selects the most informative samples, using entropy as a measure of the amount of information each sample contains. For quantity, it increases the number of samples so that the model has more examples to learn from. To this end we apply data augmentation, which generates samples that simulate scenarios likely to occur in the application context.
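To illustrate the quality criterion, the following is a minimal sketch of entropy-based filtering, not the implementation used in the thesis. It assumes grayscale images stored as nested lists of integer intensities and measures information as the Shannon entropy of the pixel histogram. The function names and the `min_bits` threshold are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(image):
    """Return the Shannon entropy, in bits, of a grayscale image.

    `image` is a list of rows, each a list of integer pixel intensities
    (an assumed representation chosen to keep the sketch dependency-free).
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    # H = sum over intensities of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def filter_by_entropy(images, min_bits):
    """Keep only the images whose entropy is at least `min_bits`.

    `min_bits` is a hypothetical threshold; a real tool would tune it.
    """
    return [img for img in images if shannon_entropy(img) >= min_bits]


flat = [[128, 128], [128, 128]]   # a single intensity: carries no information
varied = [[0, 64], [128, 255]]    # four equally frequent intensities

print(shannon_entropy(flat))
print(shannon_entropy(varied))
print(len(filter_by_entropy([flat, varied], min_bits=1.0)))
```

A uniform image scores zero bits and is discarded, while an image whose intensities are spread evenly scores higher and is kept. Real images would normally be handled as arrays through a library such as OpenCV rather than as nested lists.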

Objectives

Design and development of a data augmentation tool to support neural network training, built with Keras and OpenCV and using the SIFT and Canny algorithms.

Research methodology

The thesis develops a tool whose main objective is to build a dataset that is reliable in both quality and quantity. To this end, data augmentation operations are applied to increase the number of samples in the dataset, and entropy-based filtering operations are applied to make the dataset highly informative. This improves the training of the neural network and, consequently, its results.
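The augmentation step can be sketched as follows. This is an illustrative example under stated assumptions, not the set of operations implemented in the thesis: images are nested lists of pixel values, the transformations (flips and a 90-degree rotation) are assumed to preserve the label, and all function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return the original image followed by its transformed variants."""
    return [
        image,
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]


def augment_dataset(samples):
    """Expand a list of (image, label) pairs.

    Each variant inherits the label of its source image, which is only
    valid when the transformations do not change the class.
    """
    return [(variant, label) for image, label in samples for variant in augment(image)]


image = [[1, 2], [3, 4]]
dataset = augment_dataset([(image, "example")])
print(len(dataset))
for variant, label in dataset:
    print(label, variant)
```

Each source sample yields four labeled samples here. Whether a given transformation is safe depends on the task: a horizontal flip, for example, would change the meaning of an image containing text or a directional road sign.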

Conclusions

The tool builds an image dataset that is improved in both quality and quantity, using the functions implemented for data augmentation, in order to improve neural network training and performance.

Future developments

Automation of data labeling and image segmentation using CNNs or clustering.