Optional Workshop
Multi-Omics Integration for Applied Prediction: Concepts, Foundations, and Strategies
This course focuses on developing and implementing prediction models for the private and public sectors in plant and animal sciences. Students will learn the basis for modeling responses by integrating multiple ‘omics’ data types under different approaches (parametric and non-parametric AI), using R with code explained in detail.
Meet the Instructor
Diego Jarquin
Assistant Professor
UF/IFAS Agronomy
Gainesville, FL
One of humanity's biggest challenges is to secure the current and future food supply for a growing population in a world facing more frequent and more extreme environmental variation. Dr. Jarquin's research program contributes to this goal by helping develop improved genotypes (more productive, more resistant to biotic and abiotic stress, etc.) through the integration of artificial intelligence (AI) methods and multi-omics analyses in plant breeding. The program focuses on developing interpretable AI methods, together with related disciplines (biostatistics, quantitative genetics, and modeling), that can be applied to large multi-omics data sets to answer complex questions in plant breeding and plant systems biology. More specifically, this work seeks new ways of driving genetic improvement and biological insight by designing and optimizing methods for plant breeding. It leverages information from multiple facets of plant biology, from physiology, agronomy, and biochemistry to quantitative genetics and multi-omics (genomics, transcriptomics, proteomics, metabolomics, and high-throughput phenotyping), and provides novel solutions for unraveling the biological basis of complex traits in plant breeding programs.
Attendees will:
- Identify the foundations of the different prediction paradigms: parametric (frequentist, Bayesian) and non-parametric artificial intelligence (AI) methods (Reproducing Kernel Hilbert Spaces, Random Forest, Artificial Neural Networks, and Deep Learning), as well as effective ways to combine results from multiple approaches using Stacking Ensemble Learning.
- Write R scripts that replicate the results of more elaborate functions.
- Identify the best prediction strategy to adopt according to the needs of the study.
- Identify the presence of genotype-by-environment (G×E) interaction and leverage it in prediction models.
- Integrate multiple layers of ‘omics’ information in prediction models.
- Identify and analyze when and how to implement the different cross-validation scenarios.
Application areas will include agriculture and natural resources.
Workshop Course Objectives and Information
The course emphasizes the practical aspects of applying parametric and non-parametric (AI) methods. The objective is to familiarize statistical programmers and practitioners with the essentials of the parametric and non-parametric paradigms through a series of worked examples that demonstrate sound practices for a variety of concepts of interest in plant and animal sciences.
Prerequisites:
- Background equivalent to an M.S. in applied statistics or related disciplines
- Prior exposure to parametric (penalized and Bayesian) and non-parametric (Artificial Intelligence) methods
Note: Hands-on exercises will be conducted using the statistical software R and related libraries.