Seismic Facies Identification Challenge
[Explainer] Detectron2 & COCO Dataset 🔥 • Web Application & Visualizations • End-to-End Baseline & Tensorflow
So, I, Shubhamai, have come up with these 3 things -
COCO Dataset & using Detectron2, MMDetection
YES! I have converted this dataset into COCO format, on which we can train Mask R-CNN using Detectron2.
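Curious what the conversion looks like? Here's a minimal sketch (not the notebook's exact code) that turns one 2D label slice into COCO-style annotation dicts using imantics (installed later in the notebook) — slice_to_coco_annotations is just a name I made up:

import numpy as np
from imantics import Mask

def slice_to_coco_annotations(label_slice, image_id):
    # One annotation per facies class present in this slice
    annotations = []
    for class_id in np.unique(label_slice):
        binary = (label_slice == class_id).astype(np.uint8)
        polygons = Mask(binary).polygons()          # binary mask -> polygon contours
        ys, xs = np.where(binary)
        annotations.append({
            "image_id": image_id,
            "category_id": int(class_id),
            "segmentation": polygons.segmentation,  # COCO polygon format
            "bbox": [int(xs.min()), int(ys.min()),  # COCO [x, y, width, height]
                     int(xs.max() - xs.min()), int(ys.max() - ys.min())],
            "iscrowd": 0,
        })
    return annotations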
There we go boys - Colab Link
More things will be added, so like this post RIGHT NOW
Web Application & Visualisation
https://seismic-facies-identification.herokuapp.com/
But this time, I found that a great preprocessing pipeline can help the model find accurate features and increase overall accuracy. It just isn't quite as easy as it looks —
So I made a Web Application based on that which allows you to play/experiment with many of the image preprocessing functions/methods, changing parameters or writing custom image preprocessing functions to experiment.
And it also contains all the visualizations from the colab notebook.
I hope that it will help you in making the perfect preprocessing pipeline.
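For example, a custom preprocessing function you might drop in could look like this (illustrative only — the signature the app actually expects may differ):

import cv2
import numpy as np

def my_preprocess(img: np.ndarray) -> np.ndarray:
    # Rescale to 0-255, then boost contrast with histogram equalisation
    img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.equalizeHist(img)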
End-to-End Baseline & Tensorflow
https://colab.research.google.com/drive/1t1hF_Vs4xIyLGMw_B9l1G6qzLBxLB5eG?usp=sharing
I have made a complete colab notebook, from Data Exploration to Submitting Predictions. Here is a glimpse of the image visualization section!
And this 3D Plot!
Table of Contents -
- Setting our Workspace
- Data Exploration
- Image Preprocessing Techniques
- Creating our Dataset
- Creating our Model
- Training the Model
- Evaluating the model
- Testing on test Data
- Generate More Data + Some tips & tricks
The main libraries covered in this notebook are —
- Tensorflow 2.0 & Keras
- Plotly
- cv2
and much more…
The model that I am using is UNet, pretty much the standard in image segmentation. More is in the colab notebook!
I hope the colab notebook will help you get started in this competition or teach you something new. If the notebook did help you, make sure to like the post. lol.
https://colab.research.google.com/drive/1t1hF_Vs4xIyLGMw_B9l1G6qzLBxLB5eG?usp=sharing
Please like the topic if this helps in any way possible. I really appreciate that!
🌎 Facies Identification Challenge: 3D image interpretation by Machine Learning¶
In this challenge we need to identify facies in a 3D seismic image using Deep Learning, with various tools like tensorflow, keras, numpy, pandas, matplotlib, plotly and much much more…
Problem¶
Segmenting the 3D seismic image so that each pixel is classified into one of 6 labels, based on patterns in the image.
https://www.aicrowd.com/challenges/seismic-facies-identification-challenge#introduction
Dataset¶
We have two 3D arrays ( features X and labels Y ), both with shape 1006 × 782 × 590, with the axes corresponding to Z, X, Y.
https://www.aicrowd.com/challenges/seismic-facies-identification-challenge/dataset_files
We can say that we have a total of 2,378 training images ( 1,006 + 782 + 590 slices along the three axes ) with their corresponding labels, and we also have the same number of 2,378 testing images which we will predict labels for.
https://www.aicrowd.com/challenges/seismic-facies-identification-challenge#dataset
Evaluation¶
The evaluation metrics are the F1 score and accuracy.
https://www.aicrowd.com/challenges/seismic-facies-identification-challenge#evaluation-criteria
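To sanity-check your predictions locally, you can compute both metrics with scikit-learn on flattened arrays (a sketch — Y_val and predictions are placeholders for your own held-out labels and model outputs, and the organizers' exact averaging may differ):

from sklearn.metrics import accuracy_score, f1_score

y_true = Y_val.ravel()        # held-out ground-truth labels (placeholder)
y_pred = predictions.ravel()  # model predictions (placeholder)
print("F1 :", f1_score(y_true, y_pred, average='weighted'))
print("Acc:", accuracy_score(y_true, y_pred))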
Table of Contents¶
- Setting our Workspace 💼
  - Downloading our Dataset
  - Importing Necessary Libraries
- Data Exploration 🧐
  - Reading our Dataset
  - Image Visualisations
- Image Preprocessing Techniques 🧹
  - Image preprocessing
- Creating our Dataset 🔨
  - Loading data into memory
  - Making 2D Images
- Creating our Model 🏭
  - Creating UNet Model
  - Setting up hyperparameters
- Training the Model 🚂
  - Setting up Tensorboard
  - Start Training!
- Evaluating the model 🧪
  - Evaluating our Model
- Testing on test Data 💯
- Generate More Data + Some tips & tricks 💡
Setting our Workspace 💼¶
In this section we are going to download our dataset, install some libraries, and then import everything to get ready!
Downloading our Dataset¶
# Downloading training data ( Seismic Images | X )
!wget https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_train.npz
# Downloading training data ( Labels | Y )
!wget https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/labels_train.npz
# Downloading Testing Dataset
!wget https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_test_1.npz
Importing Necessary Libraries¶
!pip install git+https://github.com/tensorflow/examples.git
!pip install git+https://github.com/karolzak/keras-unet
# install dependencies (use cu101 because colab has CUDA 10.1)
!pip install -U torch==1.5 torchvision==0.6 -f https://download.pytorch.org/whl/cu101/torch_stable.html
!pip install cython pyyaml==5.1
!pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
# install detectron2:
!pip install detectron2==0.1.2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/index.html
!pip install imantics
# For data preprocessing & manipulation
import numpy as np
import pandas as pd
# For data visualisations & graphs
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
# utilities
from tqdm.notebook import tqdm
import datetime
from IPython.display import HTML
import os
# For Deep learning
import tensorflow as tf
from tensorflow_examples.models.pix2pix import pix2pix
import tensorflow_datasets as tfds
import tensorflow_addons as tfa
# For Image Preprocessing
import cv2
# Detectron2
import detectron2
from detectron2.utils.logger import setup_logger
from imantics import Polygons, Mask
setup_logger()
import random
# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2.structures import BoxMode
from pycocotools import mask
from skimage import measure
from detectron2.data import DatasetCatalog
from detectron2.engine import DefaultTrainer
# Setting a bigger figure size
plt.rcParams["figure.figsize"] = (20, 15)
Data Exploration 🧐¶
In this section we are going to explore our dataset: first load it, look at some arrays and categories, and then do image visualisations.
Reading Our Dataset¶
# Reading our Training dataset ( Seismic Images | X )
data = np.load("/content/data_train.npz",
allow_pickle=True, mmap_mode = 'r')
# Reading our Traning Dataset ( Labels | Y)
labels = np.load("/content/labels_train.npz",
allow_pickle=True, mmap_mode = 'r')
# Picking the actual data
X = data['data']
Y = labels['labels']
# Dimensions of features & labels
X.shape, Y.shape
# Showing the data
X[:, 6, :], Y[:, 6, :]
Here we are making a 2D image array by picking the 6th index of the X axis and looking at the Z and Y axis values!
Also, it looks like we have some negative values in X, but Y looks good!
np.unique(Y)
There are 6 different unique values in the labels; as said before, each pixel can be classified into one of 6 different labels.
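It's also worth checking how balanced those 6 classes are — a quick count (a small addition, not in the original cells):

# Per-class pixel counts give a feel for the class imbalance
values, counts = np.unique(Y, return_counts=True)
for v, c in zip(values, counts):
    print(f"label {v}: {c} pixels ({100 * c / Y.size:.2f}%)")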
Image Visualisations¶
# Making a subplot with 1 row and 2 column
fig = make_subplots(1, 2, subplot_titles=("Image", "Label"))
# Visualising a section of the 3D array
fig.add_trace(go.Heatmap(z=X[:, :, 70][:300, :300]), 1, 1)
fig.add_trace(go.Heatmap(z=Y[:, :, 70][:300, :300]), 1, 2)
fig.update_layout(height=600, width=1100, title_text="Seismic Image & Label")
HTML(fig.to_html())
# Making a subplot with 1 row and 2 column
fig = make_subplots(1, 2, subplot_titles=("Image", "Label"), specs=[[{"type": "Surface"}, {"type": "Surface"}]])
# Making a 3D Surface graph with image and corresponding label
fig.add_trace(go.Surface(z=X[:,75, :][:300, :300]), 1, 1)
fig.add_trace(go.Surface(z=Y[:,75, :][:300, :300]), 1, 2)
fig.update_layout(height=600, width=1100, title_text="Seismic Image & Label in 3D!")
HTML(fig.to_html())
# Making a subplot with 1 row and 2 column
fig = make_subplots(1, 2, subplot_titles=("Image", "Label"))
# Making a contour graph
fig.add_trace(go.Contour(
z=X[:,34, :][:300, :300]), 1, 1)
fig.add_trace(go.Contour(
z=Y[:,34, :][:300, :300]
), 1, 2)
fig.update_layout(height=600, width=1100, title_text="Seismic Image & Label with contours")
HTML(fig.to_html())
# Making a subplot with 2 row and 2 column
fig = make_subplots(2, 2, subplot_titles=("Image", "Label", "Label Histogram"))
# Making a contour graph
fig.add_trace(go.Contour(
z=X[:,34, :][:300, :300], contours_coloring='lines',
line_width=2,), 1, 1)
# Showing the label ( also the contour )
fig.add_trace(go.Contour(
z=Y[:,34, :][:300, :300]
), 1, 2)
# Showing histogram for the label column
fig.add_trace(go.Histogram(x=Y[:,34, :][:300, :300].ravel()), 2, 1)
fig.update_layout(height=800, width=1100, title_text="Seismic Image & Label with contours ( lines only )")
HTML(fig.to_html())
# Making a subplot with 2 row and 1 column
fig = make_subplots(2, 1, subplot_titles=("Image", "Label"))
# Making a contour graph
fig.add_trace(
go.Contour(
z=X[:,:, 56][:200, :200]
), 1, 1)
fig.add_trace(go.Contour(
z=Y[:,:, 56][:200, :200]
), 2, 1)
fig.update_layout(height=1000, width=1100, title_text="Seismic Image & Label with contours ( A Closer Look )")
HTML(fig.to_html())
Image Preprocessing Techniques 🧹¶
In this section we are going to take a look at some image preprocessing techniques to see how we can improve the features so that our model can reach higher accuracy!
# Reading a sample seismic image with label
img = X[:,:, 56]
label = Y[:, :, 56]
plt.imshow(img, cmap='gray')
plt.show()
plt.imshow(label)
# Image Thresholding
ret,thresh1 = cv2.threshold(img,0,255,cv2.THRESH_TOZERO)
plt.imshow(thresh1, cmap='gray')
# Sobel Y
sobely = cv2.Sobel(img,cv2.CV_64F, 0, 4,ksize=5)
plt.imshow(sobely, cmap='gray')
# Erosion
kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(img,kernel,iterations = 1)
plt.imshow(erosion, cmap='gray')
# Dilation
dilation = cv2.dilate(img,kernel,iterations = 1)
plt.imshow(dilation, cmap='gray')
# Sharpening Image
kernel = np.array([[0, -1, -1],[2, -1, 2],[-1, 2, -1]], np.float32)
sharp = cv2.filter2D(thresh1, -1, kernel)
plt.imshow(sharp, cmap='gray')
# Making a subplot containing all image preprocessing
fig, a = plt.subplots(4, 2)
fig.suptitle("All Image Processing")  # suptitle so the title actually shows above the grid
a[0][0].imshow(img , cmap='gray')
a[0][0].set_title('Original')
a[0][1].imshow(label)
a[0][1].set_title('Label')
a[1][0].imshow(thresh1, cmap='gray')
a[1][0].set_title('Threshold')
a[1][1].imshow(sobely, cmap='gray')
a[1][1].set_title('Sobel Y')
a[2][0].imshow(erosion, cmap='gray')
a[2][0].set_title('Erosion')
a[2][1].imshow(dilation, cmap='gray')
a[2][1].set_title('Dilation')
a[3][0].imshow(sharp, cmap='gray')
a[3][0].set_title('Sharpen')
fig.delaxes(a[3,1])
plt.show()
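If some of these look promising, you can chain them into one reusable pipeline — a sketch (which steps, and in which order, actually help is exactly what's worth experimenting with):

def preprocess_pipeline(img):
    # Threshold away the negatives, then sharpen (my choice of steps, not gospel)
    _, out = cv2.threshold(img, 0, 255, cv2.THRESH_TOZERO)
    kernel = np.array([[0, -1, -1], [2, -1, 2], [-1, 2, -1]], np.float32)
    return cv2.filter2D(out, -1, kernel)

plt.imshow(preprocess_pipeline(img), cmap='gray')
plt.show()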
Creating our Model 🏭¶
In this section we are going to create a UNet model using keras & tensorflow!
Creating UNet Model¶
The tensorflow guide on image segmentation https://www.tensorflow.org/tutorials/images/segmentation helped me a lot in implementing the UNet model. I really recommend taking a look at it before continuing!
# Making the Base Model First
# Using transfer learning ( MobileNetV2 Model )
base_model = tf.keras.applications.MobileNetV2(input_shape=[128, 128, 3], include_top=False)
# Use the activations of these layers
layer_names = [
'block_1_expand_relu', # 64x64
'block_3_expand_relu', # 32x32
'block_6_expand_relu', # 16x16
'block_13_expand_relu', # 8x8
'block_16_project', # 4x4
]
layers = [base_model.get_layer(name).output for name in layer_names]
# Creating the base Model
down_stack = tf.keras.Model(inputs=base_model.input, outputs=layers)
# Setting base model trainable to false
down_stack.trainable = False
up_stack = [
pix2pix.upsample(512, 3),
pix2pix.upsample(256, 3),
pix2pix.upsample(128, 3),
pix2pix.upsample(64, 3),
]
# Making the unet model
def unet_model():
    inputs = tf.keras.layers.Input(shape=[128, 128, 3])
    x = inputs

    # Downsampling through the model
    skips = down_stack(x)
    x = skips[-1]
    skips = reversed(skips[:-1])

    # Upsampling and establishing the skip connections
    for up, skip in zip(up_stack, skips):
        x = up(x)
        concat = tf.keras.layers.Concatenate()
        x = concat([x, skip])

    # This is the last layer of the model
    last = tf.keras.layers.Conv2DTranspose(
        1, 3, strides=2,
        padding='same')
    x = last(x)

    return tf.keras.Model(inputs=inputs, outputs=x)
# Creating the Model
model = unet_model()
tf.keras.utils.plot_model(model, show_shapes=True)
Setting up hyperparameters & Callbacks¶
Dice Loss¶
# From https://stackoverflow.com/questions/49012025/generalized-dice-loss-for-multi-class-segmentation-keras-implementation
def gen_dice(y_true, y_pred, eps=1e-6):
    """both tensors are [b, h, w, classes] and y_pred is in logit form"""
    # [b, h, w, classes]
    pred_tensor = tf.nn.softmax(y_pred)
    y_true_shape = tf.shape(y_true)

    # [b, h*w, classes]
    y_true = tf.reshape(y_true, [-1, y_true_shape[1]*y_true_shape[2], y_true_shape[3]])
    y_pred = tf.reshape(pred_tensor, [-1, y_true_shape[1]*y_true_shape[2], y_true_shape[3]])

    # [b, classes]
    # count how many of each class are present in each image;
    # if there are zero, then assign them a fixed weight of eps
    counts = tf.reduce_sum(y_true, axis=1)
    weights = 1. / (counts ** 2)
    weights = tf.where(tf.math.is_finite(weights), weights, eps)

    multed = tf.reduce_sum(y_true * y_pred, axis=1)
    summed = tf.reduce_sum(y_true + y_pred, axis=1)

    # [b]
    numerators = tf.reduce_sum(weights*multed, axis=-1)
    denom = tf.reduce_sum(weights*summed, axis=-1)
    dices = 1. - 2. * numerators / denom
    dices = tf.where(tf.math.is_finite(dices), dices, tf.zeros_like(dices))
    return tf.reduce_mean(dices)
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, write_images=True, update_freq='batch')
model.compile(optimizer='adam',
loss='mean_squared_error',
metrics=['accuracy'])
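Note that gen_dice above is defined but never wired in. If you one-hot encode the labels and change the model's last layer to output one channel per class, you could swap it in for MSE — an untested sketch:

model.compile(optimizer='adam',
              loss=gen_dice,         # expects one-hot [b, h, w, classes] labels
              metrics=['accuracy'])  # and a multi-channel (per-class) output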
Creating our Dataset 🔨¶
In this section we are going to load the whole dataset into memory/RAM ( because memory/RAM is faster than a hard drive/SSD 😄 ) and then get it into the right shape.
Loading data into memory¶
with np.load('/content/data_train.npz') as dataset:
    train_dataset = dataset['data']

with np.load('/content/labels_train.npz') as labels:
    train_labels = labels['labels']
train_dataset.shape
Making 2D Images¶
training_img_data = []
training_label_data = []

for i in tqdm(range(0, 580)):
    # Take one 2D slice along the third axis, plus its label
    img = train_dataset[:, :, i]
    label = train_labels[:, :, i]

    img = np.expand_dims(img, axis=2).astype('float32')
    label = np.expand_dims(label, axis=2).astype('float32')

    img = cv2.resize(img, (128, 128))
    label = cv2.resize(label, (128, 128))

    # Normalise to 0-1, quantise to 8-bit steps, then stack to 3 channels
    img = img/np.amax(img)
    img = np.clip(img, 0, 255)
    img = (img*255).astype(int)
    img = img/255.
    img = cv2.merge([img, img, img])

    training_img_data.append(img)
    training_label_data.append(label)
# Changing it into a numpy array
training_img_data = np.asarray(training_img_data)
training_label_data = np.asarray(training_label_data)
training_img_data.shape, training_label_data.shape
training_img_data[0, :, :, 0]
plt.imshow(training_img_data[0, :, :])
Training the Model 🚂¶
Setting up Tensorboard¶
%load_ext tensorboard
%tensorboard --logdir logs
Start Training!¶
model_history = model.fit(training_img_data, training_label_data,
validation_split=0.1,
epochs=20,
callbacks=[tensorboard_callback])
We have got overfitting, but I will leave it up to you how to improve that 🙂
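One quick way to see the overfitting for yourself is to plot the training vs validation curves from the history object (a small addition on top of the notebook):

plt.plot(model_history.history['loss'], label='train loss')
plt.plot(model_history.history['val_loss'], label='val loss')
plt.legend()
plt.show()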
Evaluating the model 🧪¶
pred_mask = model.predict(training_img_data)
plt.imshow(pred_mask[0, :, :, 0])
plt.imshow(training_label_data[0, :, :])
Testing on test Data 💯¶
In this section we are going to run the model on the testing set and then save all our predictions.
# Reading the test data
test = np.load("/content/data_test_1.npz",
allow_pickle=True, mmap_mode = 'r')
test_data = test['data']
# Function to preprocess the inputs to match the model input
def preprocess_input(data, axis):
    for i in range(0, axis):
        img = test_data[i, :, :]

        img = np.expand_dims(img, axis=2).astype('float32')
        img = cv2.resize(img, (128, 128))

        img = img/np.amax(img)
        img = np.clip(img, 0, 255)
        img = (img*255).astype(int)
        img = img/255.  # match the 0-1 scaling used for the training images
        img = cv2.merge([img, img, img])

        data.append(img)
    return data
test_image = []
# Preprocessing the inputs
test_image = preprocess_input(test_image, 1006)
# Converting it into a numpy array
test_image = np.asarray(test_image)
test_image.shape
# Predicting all images and converting each pixel value to an integer
test_predictions = model.predict(test_image).astype(int)
np.unique(test_predictions), test_predictions.shape
# Making the pixel values in range 1 - 6
test_predictions[test_predictions > 6] = 6
test_predictions[test_predictions < 1] = 1
np.unique(test_predictions)
# Function to resize the images to match the required output shape
def resize_img(data, shape):
    local_data = []
    for i in data:
        img = i[:, :, 0].astype('float32')
        img = cv2.resize(img, shape)
        local_data.append(img)
    return np.asarray(local_data)
# Resizing the image
test_predictions = resize_img(test_predictions, (251, 782))
# Making sure that the output matches
test_predictions.shape, test_predictions.dtype
# Converting into integers
test_predictions = test_predictions.astype(int)
test_predictions.dtype, test_predictions
# Saving the Predictions
np.savez_compressed(
"prediction.npz",
prediction=test_predictions
)
Generate More Data + Some tips & tricks 💡¶
- MSE loss is not normally used in Image Segmentation, try a different one!
- Data Augmentation isn't done here; we can try that, it should improve results significantly ( the Keras U-NET section below shows one way )
- I didn't train the model on the complete dataset; that can also be done ( slices along the X and Y axes can be trained on too! )
- Accuracy is not a great metric for image segmentation; try something like Dice or similar
Keras U-NET¶
# Labels are 1-6, so a fixed depth of 7 keeps every one-hot slice the same shape
num_classes = 7

one_hot_train_label_data = []
for img in training_label_data:
    img = img.astype(int)
    one_hot_train_label_data.append(np.eye(num_classes)[img])
one_hot_train_label_data = np.array(one_hot_train_label_data)
one_hot_train_label_data.shape
one_hot_train_label_data[0, :, :, 1]
from keras_unet.utils import plot_imgs
plot_imgs(org_imgs=training_img_data, mask_imgs=one_hot_train_label_data[:, :, :, 1], nm_img_to_plot=10, figsize=6)
from keras_unet.utils import get_augmented
train_gen = get_augmented(
    training_img_data,
    one_hot_train_label_data[:, :, :, 1].reshape(
        training_label_data.shape[0], training_label_data.shape[1],
        training_label_data.shape[2], 1),
    batch_size=2,
    data_gen_args=dict(
        rotation_range=15.,
        width_shift_range=0.05,
        height_shift_range=0.05,
        shear_range=50,
        zoom_range=0.2,
        horizontal_flip=True,
        vertical_flip=True,
        fill_mode='constant'
    ))
sample_batch = next(train_gen)
xx, yy = sample_batch
print(xx.shape, yy.shape)
from keras_unet.utils import plot_imgs
plot_imgs(org_imgs=xx, mask_imgs=yy, nm_img_to_plot=2, figsize=6)
from keras_unet.models import custom_unet
input_shape = training_img_data[0].shape
model = custom_unet(
input_shape,
use_batch_norm=True,
num_classes=1,
filters=64,
dropout=0.2,
output_activation='sigmoid'
)
model.summary()
from tensorflow.keras.callbacks import ModelCheckpoint
model_filename = 'segm_model_v0.h5'
callback_checkpoint = ModelCheckpoint(
model_filename,
verbose=1,
monitor='val_loss',
save_best_only=True,
)
from tensorflow.keras.optimizers import Adam, SGD
from keras_unet.metrics import iou, iou_thresholded
from keras_unet.losses import jaccard_distance
model.compile(
#optimizer=Adam(),
optimizer=SGD(lr=0.01, momentum=0.99),
loss='binary_crossentropy',
#loss=jaccard_distance,
metrics=[iou, iou_thresholded]
)
history = model.fit_generator(
    train_gen,
    steps_per_epoch=100,
    epochs=10,
    # validation_data=(x_val, y_val),
    callbacks=[callback_checkpoint]
)
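After training, you'd typically load the best checkpoint back before predicting — a small addition (model_filename is the ModelCheckpoint path defined above):

model.load_weights(model_filename)  # restore the best val_loss weights
pred_batch = model.predict(xx)      # xx: the sample batch pulled from train_gen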