
Bayesian functional regression as an alternative statistical analysis of high‑throughput phenotyping data of modern agriculture [Electronic Resource]

By: Montesinos-Lopez, A.
Contributor(s): Montesinos-Lopez, O.A | De los Campos, G | Crossa, J | Burgueño, J | Luna-Vazquez, F.J.
Material type: Article
Publisher: London : BioMed Central, 2018
Other Relationship Entry (R): Correction to: Bayesian functional regression as an alternative statistical analysis of high-throughput phenotyping data of modern agriculture
Subject(s): Phenotypes | Economic activities | Statistical methods | Regression analysis | Bayesian theory | Data analysis
Online resources: Open Access through Dspace
In: Plant Methods v. 14, art. 46
Summary:
Background: Modern agriculture uses hyperspectral cameras that record hundreds of reflectance values at discrete narrow bands, measured in several environments. Recently, Montesinos-López et al. (Plant Methods 13(4):1–23, 2017a. https://doi.org/10.1186/s13007-016-0154-2; Plant Methods 13(62):1–29, 2017b. https://doi.org/10.1186/s13007-017-0212-4) proposed using functional regression analysis (as functional data analyses) to help reduce the dimensionality of the bands and thus decrease the computational cost. The purpose of this paper is to discuss the advantages and disadvantages that functional regression analysis offers when analyzing hyperspectral image data. We provide a brief review of functional regression analysis and examples that illustrate the methodology. We highlight critical elements of model specification: (i) type and number of basis functions, (ii) the degree of the polynomial, and (iii) the methods used to estimate regression coefficients. We also show how functional data analyses can be integrated into Bayesian models. Finally, we include an in-depth discussion of the challenges and opportunities presented by functional regression analysis.
Results: We used seven model-methods: one with the conventional model (M1), three using the B-splines model (M2, M4, and M6), and three using the Fourier basis model (M3, M5, and M7). The data set comprises 976 wheat lines under irrigated environments with 250 wavelengths. Under a Bayesian Ridge Regression (BRR), we compared the prediction accuracy and the implementation time (in seconds) of the seven proposed model-methods for different numbers of basis functions. Our results, as well as previously analyzed data (Montesinos-López et al. 2017a, 2017b), support that around 23 basis functions are enough. Concerning the degree of the polynomial in the context of B-splines, degree 3 approximates most of the curves very well. Two satisfactory types of basis are the Fourier basis for periodic curves and the B-splines model for non-periodic curves. Under nine different bases, the seven model-methods showed similar prediction accuracy. Regarding implementation time, results show that the fewer the basis functions, the shorter the implementation time. Methods M2, M3, M6, and M7 were around 3.4 times faster than methods M1, M4, and M5.
Conclusions: In this study, we promote the use of functional regression modeling for analyzing high-throughput phenotypic data and indicate the advantages and disadvantages of its implementation. In addition, many key elements needed to understand and implement this statistical technique appropriately are provided using a real data set. We provide details for implementing Bayesian functional regression using the developed genomic functional regression (GFR) package. In summary, we believe this paper is a good guide for breeders and scientists interested in using functional regression models for implementing prediction models when their data are curves.
Keywords: Hyperspectral data, Functional regression analysis, Bayesian functional regression, Functional data, Bayesian Ridge Regression.
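The dimension reduction described in the summary can be illustrated with a minimal sketch. It is not the GFR package and not the authors' code. It assumes equally spaced wavelengths, a Fourier basis with 23 functions, and a ridge penalty `lam` that is fixed in advance, whereas a full Bayesian Ridge Regression would sample the variance components. With the variances fixed, the ridge solution equals the BRR posterior mean of the coefficients. The function names, the wavelength range, and the simulated data are all hypothetical.

```python
import numpy as np


def fourier_basis(t, n_basis):
    """Evaluate a Fourier basis (constant plus sine/cosine pairs) at points t."""
    if n_basis % 2 == 0:
        raise ValueError("n_basis must be odd: one constant plus sine/cosine pairs")
    span = t.max() - t.min()
    u = 2.0 * np.pi * (t - t.min()) / span  # one full period over the observed range
    cols = [np.ones_like(u)]
    for k in range(1, (n_basis - 1) // 2 + 1):
        cols.append(np.sin(k * u))
        cols.append(np.cos(k * u))
    return np.column_stack(cols)  # shape: (len(t), n_basis)


def reduce_curves(X, t, n_basis=23):
    """Replace each curve x_i(t) by the integrals of x_i(t) * phi_l(t).

    The integrals are approximated by a Riemann sum, which assumes the
    wavelengths in t are equally spaced (an assumption of this sketch).
    """
    Phi = fourier_basis(t, n_basis)
    step = (t.max() - t.min()) / (len(t) - 1)
    return X @ Phi * step  # shape: (n_lines, n_basis)


def ridge_fit(Z, y, lam):
    """Ridge regression with an unpenalized intercept and fixed penalty lam."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    A = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    coef = np.linalg.solve(A, Zc.T @ (y - y_mean))
    intercept = y_mean - z_mean @ coef
    return intercept, coef


# Simulated stand-in for the data dimensions quoted in the summary
# (976 lines, 250 wavelengths); the values themselves are made up.
rng = np.random.default_rng(0)
t = np.linspace(400.0, 850.0, 250)               # hypothetical wavelengths (nm)
X = rng.normal(size=(976, 250)).cumsum(axis=1)   # rough, curve-like rows
step = (t.max() - t.min()) / (len(t) - 1)
beta_true = np.sin(2.0 * np.pi * (t - t.min()) / (t.max() - t.min()))
y = X @ beta_true * step + rng.normal(size=976)

Z = reduce_curves(X, t, n_basis=23)              # 250 columns reduced to 23
intercept, coef = ridge_fit(Z, y, lam=1.0)
y_hat = intercept + Z @ coef
```

The regression is then fitted on 23 derived covariates instead of 250 bands, which is the source of the lower computational cost mentioned in the summary. A B-spline basis could be substituted for `fourier_basis` when the curves are not periodic.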
Item type: Article
Current location: CIMMYT Knowledge Center: John Woolston Library
Lic. Jose Juan Caballero Flores
Collection: CIMMYT Staff Publications Collection
Status: Available
Total holds: 0

Open Access

Peer review


Text in English

CIMMYT Informa : 2019 (September 13, 2018)


International Maize and Wheat Improvement Center (CIMMYT) © Copyright 2015. Carretera México-Veracruz, Km. 45, El Batán, Texcoco, México, C.P. 56237.
Monday to Friday, 9:00 am to 5:00 pm. If you have any questions, please contact us at CIMMYT-Knowledge-Center@cgiar.org
