DRAFT¶
Content in this section is based on a general ML project pipeline¶
- The tutorial covers lithofacies prediction from well log data
- It will run through a typical machine learning workflow/pipeline:
  - Getting data
  - Data cleaning
  - Data visualization and exploratory data analysis
  - Data preparation
  - Model training and prediction
  - Model evaluation
  - Model performance improvement with different techniques:
    - Data augmentation
    - Cross-validation
    - Model regularization for better training (e.g. weight decay)
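The workflow steps listed above can be sketched end to end as follows. This is only a minimal illustration, not the tutorial's actual code. The data are synthetic stand-ins for well logs, and the feature names (`gamma_ray`, `bulk_density`), the two facies labels, and the nearest-centroid classifier are all assumptions made for the example. It uses only the Python standard library so it runs anywhere. A real notebook would more likely use pandas and scikit-learn, and would add the visualization and performance-improvement steps that are omitted here.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# 1. Getting data: synthetic stand-in for well logs (NOT real FORCE data).
#    Each row is (gamma_ray, bulk_density, facies); None marks a missing reading.
CENTRES = {"sand": (40.0, 2.2), "shale": (110.0, 2.5)}  # assumed values

def make_row(label):
    gr = random.gauss(CENTRES[label][0], 10.0)
    rhob = random.gauss(CENTRES[label][1], 0.05)
    if random.random() < 0.05:  # simulate an occasional missing log value
        rhob = None
    return (gr, rhob, label)

rows = [make_row(random.choice(["sand", "shale"])) for _ in range(400)]

# 2. Data cleaning: drop rows with a missing reading.
clean = [r for r in rows if None not in r]

# 3. Data preparation: shuffle, split 80/20, and standardize the features
#    using statistics computed on the training part only.
random.shuffle(clean)
cut = int(0.8 * len(clean))
train, test = clean[:cut], clean[cut:]

def column_stats(data, i):
    values = [r[i] for r in data]
    return statistics.mean(values), statistics.stdev(values)

stats = [column_stats(train, i) for i in range(2)]

def scale(row):
    return tuple((row[i] - stats[i][0]) / stats[i][1] for i in range(2))

# 4. Model training: a nearest-centroid classifier (one centroid per facies).
centroids = {}
for label in CENTRES:
    points = [scale(r) for r in train if r[2] == label]
    centroids[label] = tuple(statistics.mean(p[i] for p in points) for i in range(2))

def predict(row):
    x = scale(row)
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )

# 5. Model evaluation: accuracy on the held-out rows.
accuracy = sum(predict(r) == r[2] for r in test) / len(test)
print(f"rows kept after cleaning: {len(clean)} of {len(rows)}")
print(f"test accuracy: {accuracy:.2f}")
```

Computing the scaling statistics on the training rows only keeps information from the held-out rows out of the model, which matters for the evaluation step.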
Content in this section is focused more on the FORCE competition¶
- It will cover key things to watch out for when working with large subsurface datasets in the format we have (the FORCE dataset), e.g.:
  - When using gradient-boosted trees, the model loss may fail to converge during training
    - How K-Fold cross-validation could help prevent that
    - How selecting a proper cross-validation technique helps in making more confident decisions
  - Preventing overfitting by:
    - Creating different validation sets (explaining why each method for creating them was chosen)
    - Evaluating properly on the validation sets with different metrics
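As a rough sketch of the last two points, the snippet below builds K-Fold validation sets that keep each well entirely on one side of the split, then scores every fold with two metrics. Splitting by well is an assumption here, one plausible way of creating validation sets and not necessarily the method the tutorial will choose. The well names, the synthetic labels, and the majority-class baseline (standing in for a real model such as gradient-boosted trees) are likewise hypothetical. Only the Python standard library is used.

```python
import random
import statistics
from collections import Counter

def group_k_fold(groups, k, seed=0):
    """Yield (train_idx, val_idx) so that no group (well) is on both sides."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    folds = [set(unique[i::k]) for i in range(k)]
    for held_out in folds:
        val = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, val

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return statistics.mean(scores)

# Synthetic, imbalanced facies labels for 6 hypothetical wells (NOT FORCE data).
rng = random.Random(1)
wells = [f"well_{w}" for w in range(6) for _ in range(50)]
labels = [rng.choices(["shale", "sand"], weights=[0.8, 0.2])[0] for _ in wells]

def majority_baseline(train_labels, n_val):
    """Placeholder model: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_val

acc_scores, f1_scores = [], []
for train_idx, val_idx in group_k_fold(wells, k=3):
    y_val = [labels[i] for i in val_idx]
    y_pred = majority_baseline([labels[i] for i in train_idx], len(y_val))
    acc_scores.append(accuracy(y_val, y_pred))
    f1_scores.append(macro_f1(y_val, y_pred))

for name, scores in (("accuracy", acc_scores), ("macro F1", f1_scores)):
    print(f"{name}: mean {statistics.mean(scores):.2f}, "
          f"std {statistics.stdev(scores):.2f} over {len(scores)} folds")
```

Reporting a mean and spread over folds, not a single score, is what supports more confident decisions. Comparing accuracy with macro F1 shows why more than one metric is useful: a model that ignores the rare class can still look acceptable on accuracy alone.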