Data augmentation enhances plant-genomic-enabled predictions

By:

Montesinos-Lopez, O.A

Material type: Article

Language: English

Publication details: Basel (Switzerland) : MDPI, 2024.

ISSN: 2073-4425 (Online)

Online resources:

Open Access through DSpace

In: Genes v. 15, no. 3, art. 286

Summary: Genomic selection (GS) is revolutionizing plant breeding. However, its practical implementation remains challenging because many factors affect its accuracy. This research therefore explores data augmentation as a way to improve that accuracy. Deep neural networks with data augmentation (DA) generate synthetic data from the original training set to enlarge the training set and improve the prediction performance of any statistical or machine learning algorithm. There is much empirical evidence of their success in many computer vision applications. For this reason, DA was explored in the context of GS using 14 real datasets. We found empirical evidence that DA is a powerful tool for improving prediction accuracy, since it improved the prediction accuracy of the top lines in the 14 datasets under study. On average, across datasets and traits, the gain in prediction performance of the DA approach relative to the conventional method in the top 20% of lines in the testing set was 108.4% in terms of the NRMSE and 107.4% in terms of the MAAPE, but performance was worse on the whole testing set. We encourage more empirical evaluations to support our findings.
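The summary reports results with two error metrics, NRMSE and MAAPE, evaluated on the top 20% of lines in the testing set. The sketch below shows one way such an evaluation could be computed. It rests on assumptions not stated in this record: NRMSE is taken as the RMSE divided by the mean of the observed values, MAAPE as the mean arctangent of the absolute relative error, and the "top" lines are chosen by their observed values. The function names and the example numbers are hypothetical and are not taken from the article.

```python
import numpy as np


def nrmse(y_true, y_pred):
    """RMSE divided by the mean observed value (assumed normalization)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(rmse / np.mean(y_true))


def maape(y_true, y_pred):
    """Mean arctangent absolute percentage error.

    Assumes every observed value is nonzero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.arctan(np.abs((y_true - y_pred) / y_true))))


def top_fraction_indices(y_true, fraction=0.2):
    """Indices of the lines with the highest observed values.

    Selecting by observed value is an assumption of this sketch.
    """
    y_true = np.asarray(y_true, dtype=float)
    k = max(1, int(round(fraction * y_true.size)))
    # Sort descending by observed value and keep the first k indices.
    return np.argsort(y_true)[::-1][:k]


# Hypothetical observed and predicted trait values for ten testing lines.
observed = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], dtype=float)
predicted = np.array([12, 18, 33, 41, 47, 66, 65, 84, 81, 110], dtype=float)

top = top_fraction_indices(observed, fraction=0.2)
nrmse_top = nrmse(observed[top], predicted[top])
maape_top = maape(observed[top], predicted[top])
nrmse_all = nrmse(observed, predicted)
maape_all = maape(observed, predicted)
```

Both metrics are errors, so lower values indicate better predictions. Computing them separately for the top subset and for the whole testing set mirrors the contrast drawn in the summary, where the two evaluations led to different conclusions.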

Holdings
Item type	Current library	Collection	Call number	Status	Date due	Barcode	Item holds
Article	CIMMYT Knowledge Center: John Woolston Library	CIMMYT Staff Publications Collection		Available

Total holds: 0

Peer review

Open Access

Text in English

Montesinos-Lopez, O.A. : No CIMMYT Affiliation

International Maize and Wheat Improvement Center (CIMMYT) © Copyright 2021.
Carretera México-Veracruz. Km. 45, El Batán, Texcoco, México, C.P. 56237.
If you have any questions, please contact us at
CIMMYT-Knowledge-Center@cgiar.org