This code accompanies the manuscript "Bioprocess optimisation via joint machine learning and metabolic modelling" by Zampieri, G., Sandner, V., Verma, S., Kraemer, J., Lennon, C., Occhipinti, A., McCreath, G., and Angione, C.
Three scripts must be run first, as they generate the data used by the subsequent analyses:
- parse_process_data.mlx takes raw offline/online metabolite concentrations and dry cell weights and generates the metabolic rates used as constraints in the genome-scale metabolic models, together with the associated experimental parameters.
- MM_pFBA.m performs metabolic modelling (MM) for all experimental conditions using parsimonious enzyme usage flux balance analysis (pFBA). The input is the data generated by parse_process_data.mlx along with critical enzyme temperatures, process titres, and genome-scale metabolic models for E. coli BL21 and W3110. The main outputs are metabolic fluxes simulated for the pre-induction phase, for the post-induction phase when ignoring end-of-process data such as total protein content, and for the post-induction phase when employing such data.
- MM_MEP.m performs MM for all experimental conditions using metabolic expectation propagation (MEP). Inputs and outputs are the same as for MM_pFBA.m.
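For orientation, pFBA is commonly written as a two-step optimisation. The formulation below is the generic textbook form, not a transcription of MM_pFBA.m. The symbols are assumptions introduced here: S is the stoichiometric matrix, v the flux vector, c the objective vector, and l and u the flux bounds (for example, those derived from the measured metabolic rates).

```latex
% Step 1: standard FBA, maximise the objective flux
Z^{*} = \max_{v} \; c^{\top} v
\quad \text{s.t.} \quad S v = 0, \qquad l \le v \le u

% Step 2: among all optimal solutions, minimise the total absolute flux
\min_{v} \; \sum_{i} \lvert v_{i} \rvert
\quad \text{s.t.} \quad S v = 0, \qquad c^{\top} v = Z^{*}, \qquad l \le v \le u
```

The second step selects, among the flux distributions that achieve the optimal objective value, one with minimal total flux, which is used as a proxy for minimal enzyme usage.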
Other scripts perform machine-learning and multivariate analysis:
- archetypal_analysis.m determines the metabolic archetypes of the considered experimental conditions.
- PCA_permutation_test.m tests the significance of metabolic principal components of the considered experimental conditions.
- SVM_classification.m runs repeated nested cross-validation on linear support vector machine (SVM) models using design-of-experiments (DoE) data, metabolite concentrations, and metabolic fluxes to classify low/high titre conditions.
- SVM_classification_with_permutation.m does the same with permuted input data.
- SVM_feature_importance.m extracts feature weights from the cross-validation results.
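The nested cross-validation scheme mentioned above can be illustrated with a minimal sketch that relies only on the Statistics and Machine Learning Toolbox. It is not the code in SVM_classification.m: the synthetic data, the variable names, the fold counts, and the box-constraint grid are all assumptions chosen for illustration.

```matlab
% Minimal sketch of nested cross-validation for a linear SVM.
% Synthetic data and all settings below are illustrative assumptions,
% not the inputs or settings used in SVM_classification.m.
rng(1);                                    % reproducible data and folds
nSamples = 60;
nFeatures = 5;
X = randn(nSamples, nFeatures);            % placeholder feature matrix
y = double(X(:, 1) + 0.5 * randn(nSamples, 1) > 0);  % placeholder low/high labels (0/1)

boxGrid = [0.01 0.1 1 10];                 % candidate box constraints (assumed grid)
outerCV = cvpartition(y, 'KFold', 5);      % stratified outer folds
outerAcc = zeros(outerCV.NumTestSets, 1);

for i = 1:outerCV.NumTestSets
    trIdx = training(outerCV, i);
    teIdx = test(outerCV, i);
    Xtr = X(trIdx, :);  ytr = y(trIdx);
    Xte = X(teIdx, :);  yte = y(teIdx);

    % Inner loop: select the box constraint using the training fold only
    innerCV = cvpartition(ytr, 'KFold', 3);
    innerLoss = zeros(size(boxGrid));
    for j = 1:numel(boxGrid)
        cvModel = fitcsvm(Xtr, ytr, 'KernelFunction', 'linear', ...
            'Standardize', true, 'BoxConstraint', boxGrid(j), ...
            'CVPartition', innerCV);
        innerLoss(j) = kfoldLoss(cvModel); % inner misclassification rate
    end
    [~, best] = min(innerLoss);

    % Refit on the whole outer-training fold, then score the held-out fold
    model = fitcsvm(Xtr, ytr, 'KernelFunction', 'linear', ...
        'Standardize', true, 'BoxConstraint', boxGrid(best));
    outerAcc(i) = mean(predict(model, Xte) == yte);
end

fprintf('Mean outer-fold accuracy: %.2f\n', mean(outerAcc));
```

Wrapping the outer loop in a further loop over different random partitions gives a repeated version of the scheme. For a linear kernel, the fitted model's Beta property holds one weight per feature, which is the kind of quantity a feature-importance analysis can build on.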
MATLAB dependencies:
- MATLAB R2017b
- the Symbolic Math Toolbox
- the Statistics and Machine Learning Toolbox
- the COBRA Toolbox v3.0
- Metabolic-EP
- PropError
- PCHA
- cbrewer
- alluvialflow
R dependencies:
- R 3.5
- InPosition
- TExPosition
- openxlsx
- plyr
- R.matlab
Other:
- the IBM CPLEX solver 12.8