“Unsupervised Entropy-Based Selection of Data Sets for Improved Model Fitting”


Pedro M. Ferreira

in 2016 International Joint Conference on Neural Networks (IJCNN) (World Congress on Computational Intelligence), Vancouver, Canada, Jul. 2016, pp. 3330–3337.

Abstract: A method based on the information theory concept of entropy is presented for selecting subsets of data for off-line model identification. By using entropy-based data selection instead of random equiprobable sampling before training models, significant improvements are achieved in parameter convergence, accuracy and generalisation ability. Furthermore, model evaluation metrics exhibit less variance, therefore allowing faster convergence when multiple modelling trials have to be executed. These features are experimentally demonstrated by the results of an extensive number of neural network predictive modelling experiments, where the single difference in the identification of pairs of models was the data set used to tune model parameters. Unlike most active learning and instance selection procedures, the method is not iterative, does not rely on an existing model, and does not require a specific modelling technique. Instead, it selects data points in one unsupervised step relying solely on Shannon’s information measure.
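
The abstract does not spell out the selection algorithm, but the idea is concrete enough to sketch. Below is a minimal, hypothetical Python illustration of unsupervised maximum-entropy subsampling: samples are assigned to multi-dimensional histogram bins, and a subset is chosen greedily so that its empirical bin distribution has high Shannon entropy. This is only one plausible reading of the abstract, not the paper's actual procedure; the equal-width binning, the greedy balancing heuristic, and all names here are illustrative assumptions.

import numpy as np


def shannon_entropy(counts):
    # Shannon entropy (in nats) of the empirical distribution given bin counts.
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)))


def bin_assignments(X, n_bins=10):
    # Map each sample to a flat multi-dimensional histogram bin id, using
    # equal-width bins per input dimension (an assumption; the paper may
    # discretize the input space differently).
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    ids = np.zeros(n, dtype=np.int64)
    for j in range(d):
        edges = np.linspace(X[:, j].min(), X[:, j].max(), n_bins + 1)
        idx = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, n_bins - 1)
        ids = ids * n_bins + idx
    # Compact occupied bins to ids 0..K-1.
    _, ids = np.unique(ids, return_inverse=True)
    return ids


def max_entropy_subset(ids, n_select, seed=0):
    # Greedily add the sample whose bin currently holds the fewest selected
    # points; keeping bin counts balanced maximizes the Shannon entropy of
    # the selected subset's bin distribution.
    rng = np.random.default_rng(seed)
    counts = np.zeros(int(ids.max()) + 1)
    remaining = list(rng.permutation(len(ids)))  # random tie-breaking order
    selected = []
    for _ in range(n_select):
        best = min(remaining, key=lambda i: counts[ids[i]])
        counts[ids[best]] += 1
        selected.append(best)
        remaining.remove(best)
    return np.array(selected)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(5000, 2))  # synthetic stand-in for a training set
    ids = bin_assignments(X, n_bins=10)
    sel = max_entropy_subset(ids, n_select=200)
    rand = rng.choice(len(X), size=200, replace=False)
    print("bin entropy, entropy-based selection:",
          shannon_entropy(np.bincount(ids[sel])))
    print("bin entropy, random equiprobable sample:",
          shannon_entropy(np.bincount(ids[rand])))

Run as-is, this should print a higher bin entropy for the greedy subset than for the random equiprobable sample, which is the contrast the abstract draws; the improvements reported in the paper concern the downstream model fits, not the entropy value itself.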

Download paper: https://doi.org/10.1109/IJCNN.2016.7727625

Export citation

BibTeX:

@inproceedings{ferreira2016unsupervised,
  author    = {Pedro M. Ferreira},
  title     = {Unsupervised Entropy-Based Selection of Data Sets for Improved Model Fitting},
  booktitle = {2016 International Joint Conference on Neural Networks (IJCNN) (World Congress on Computational Intelligence)},
  address   = {Vancouver, Canada},
  publisher = {IEEE},
  month     = jul,
  year      = {2016},
  pages     = {3330--3337},
  url       = {https://doi.org/10.1109/IJCNN.2016.7727625},
}

Research line(s): Timeliness and Adaptation in Dependable Systems (TADS)
