DRAFT.ipynb

DRAFT

Content in this section is based on a general ML project pipeline

  • The tutorial covers lithofacies prediction from well log data
  • It will run through a typical machine learning workflow/pipeline
    • Getting data
    • Data Cleaning
    • Data Visualization and Exploratory Data Analysis
    • Data Preparation
    • Model Training and Prediction
    • Model Evaluation
    • Model Performance Improvement with different techniques
      • Data Augmentation
      • Cross-validation techniques
      • Model regularization for more stable training (e.g. weight decay)
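
The workflow steps above can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the data is synthetic (it is not the FORCE dataset), the three "log curves" and three "facies" classes are hypothetical, and the random forest is a stand-in rather than the model the tutorial will necessarily use.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Getting data: synthetic stand-in for well logs (an assumption, not FORCE data).
# Three hypothetical log curves and three hypothetical facies classes whose
# log responses differ by a constant shift.
n_samples, n_logs, n_facies = 600, 3, 3
y = rng.integers(0, n_facies, size=n_samples)
X = rng.normal(size=(n_samples, n_logs)) + y[:, None] * 2.0

# Real logs contain gaps; blank out roughly 5% of readings to mimic that.
X[rng.random(X.shape) < 0.05] = np.nan

# Data preparation: hold out a stratified test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Data cleaning and model training in one pipeline, so the imputation
# statistics are learned from the training split only.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

# Model evaluation on the held-out split.
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Wrapping the imputer and the classifier in a single pipeline is a design choice that keeps test-set information out of the cleaning step.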

Content in this section is focused more on the FORCE competition

  • It will cover key things to watch out for when working with large subsurface datasets in the format we have (the FORCE dataset), e.g.
    • Using gradient-boosted trees (the model loss may fail to converge during training)
    • How K-fold cross-validation could help prevent that
    • How choosing an appropriate cross-validation technique leads to more confident decisions
    • Preventing overfitting by:
      • Creating different validation sets (explaining why each method of creating them was chosen)
      • Evaluating properly on the validation sets with different metrics
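
One possible way to build validation sets and score them with more than one metric is sketched below. It rests on assumptions: the data is synthetic, the six wells are hypothetical, and grouping folds by well (scikit-learn's `GroupKFold`) is one option among several, not necessarily the method the tutorial settles on. The rationale for grouping is that neighbouring depth samples in a well are strongly correlated, so a random split can leak information between training and validation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GroupKFold, cross_validate

rng = np.random.default_rng(0)

# Synthetic stand-in (an assumption): 6 hypothetical wells, 100 samples each.
n_wells, n_per_well, n_logs, n_facies = 6, 100, 3, 3
wells = np.repeat(np.arange(n_wells), n_per_well)
y = rng.integers(0, n_facies, size=wells.size)
X = rng.normal(size=(wells.size, n_logs)) + y[:, None] * 1.5

# GroupKFold keeps all samples of a well in the same fold, so each
# validation fold contains only wells the model has not seen in training.
cv = GroupKFold(n_splits=3)

# Score every fold with two metrics: accuracy and macro-averaged F1,
# which weights rare facies as much as common ones.
scores = cross_validate(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    X, y, groups=wells, cv=cv,
    scoring=["accuracy", "f1_macro"],
)

mean_accuracy = scores["test_accuracy"].mean()
mean_f1_macro = scores["test_f1_macro"].mean()
print(f"accuracy per fold: {np.round(scores['test_accuracy'], 3)}")
print(f"macro F1 per fold: {np.round(scores['test_f1_macro'], 3)}")
```

A large spread between folds, or a large gap between accuracy and macro F1, is a hint that a single train/validation split would have given an over-optimistic picture.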