Loading

Seismic Facies Identification Challenge

[Explainer]: Seismic Facies Identification Starter Pack

keras-image-segmentation package to produce a model, train it, and generate the final predictions.

santiactis

Just a simple notebook

Hi there! I am a recently graduated geophysicist from Argentina. I got into Data Science and Machine Learning just a few months ago, so I’m certainly an inexperienced little tiny deep learning practitioner as you may well guess.
Here are some questions you may ask yourselves…

What is this challenge?

This challenge tries to overcome something that has troubled the Oil&Gas industry for so many years: how can I interpret such huge amount of information in so little time?
Seismic facies are one of the abstractions that are regularly used to compress the amount of information. We define that certain patterns in the seismic image are defined by the response of a certain layer, this layer, when grouped with several other layers with the same characteristics, tend to generate a response that stands out from other patterns in the seismic image.
This common response helps grouping these layers that behave similarly, and thus helping reducing the amount of information on image.
But how do we group all these patterns together?! Well, Deep Learning to the rescue!

Let’s dive in:

What did you do?

The last few hours I’ve tried to put together a simple notebook that goes from showing some simple seismic attributes, to implementing a deep learning model. I hope that if you are starting just like me you’ll find it useful. And if you are an already experienced machine learning practitioner, you might find some insights on how to improve your results!

A couple of examples from the notebook:
TECVA seismic attribute

 

 

RMS seismic attribute

 

 

How did you do it?

I started by using my seismic attributes knowledge, then I tried to think of a way of implementing different kinds of information, and finally, I used the keras-image-segmentation package that you can find here: https://github.com/divamgupta/image-segmentation-keras to produce a model, train it, and generate the final predictions.

Sample image of the results of training the net (label set / trained set):

 

 

Well, but I could have done that on my own!..

Well, of course you could! But beware that I also try to give an insight on some facts over using this seismic dataset and how you may improve your results by taking this into account. Maybe it will help you!

I don’t really know about these seismic facies and stuff…

Well, maybe you just like to watch at some random guys’ notebook and you’ll probably like those nice images!

LINK TO THE NOTEBOOK: https://colab.research.google.com/drive/1935VS3tMKoJ1FbgR1AOkoAC2z0IO9HWr?usp=sharing

You can also get in touch via Linkedin (link to the post with this notebook)!:

Hope you guys find it useful, funny and maybe consider giving it a thumbs up!

See you around!

Just a simple notebook

Hi there! I am a recently graduated geophysicist from Argentina. I got into Data Science and Machine Learning just a few months ago, so I'm certainly an inexperienced little tiny deep learning practitioner as you may well guess. Here are some questions you may ask yourselves...

What did you do?

The last few hours I've tried to put together a simple notebook that goes from showing some simple seismic attributes, to implementing a deep learning model.

How did you do it?

I started by using my seismic attributes knowledge, then I tried to think of a way of implementing different kinds of information, and finally, I used the keras-image-segmentation package that you can find here: https://github.com/divamgupta/image-segmentation-keras to produce a model, train it, and test it.

Well, but I could have done that on my own!...

Well, of course you could! But beware that I also try to give an insight on some facts over using this seismic dataset and how you may improve your results by taking this into account. Maybe it will help you!

I don't really know about these seismic facies and stuff...

Well, maybe you just like to watch at some random guys' notebook and you'll probably like those nice images!

Hope you find it useful, funny and maybe consider giving it a thumbs up on the discourse!

Loading packages

This notebook uses keras_segmentation package to take advantage of other pre-trained models used in image segmentation.

It also uses:

  • NumPy (duh..)
  • matplotlib (duh..x2)
  • cv2 (dux..x3)
  • tensorflow (kind of duh..)
  • scipy (wait... why?)
In [ ]:
# Installing image_segmentation_keras
!pip install git+https://github.com/santiactis/image-segmentation-keras

# For data preprocessing & manipulation
import numpy as np

# FOr data visualisations & image processing
import matplotlib.pyplot as plt
import cv2
import scipy

# utilities
from tqdm.notebook import tqdm
import datetime

# For Deep learning
import tensorflow as tf
from tensorflow import keras
from keras_segmentation.models.unet import vgg_unet, resnet50_unet
from keras_segmentation.models.model_utils import get_segmentation_model
Collecting git+https://github.com/santiactis/image-segmentation-keras
  Cloning https://github.com/santiactis/image-segmentation-keras to /tmp/pip-req-build-2khdz18u
  Running command git clone -q https://github.com/santiactis/image-segmentation-keras /tmp/pip-req-build-2khdz18u
Requirement already satisfied: Keras>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from keras-segmentation==0.3.0) (2.4.3)
Collecting imageio==2.5.0
  Downloading https://files.pythonhosted.org/packages/af/0a/943c965d372dae0b1f1482677d29030ab834351a61a9a632fd62f27f1523/imageio-2.5.0-py3-none-any.whl (3.3MB)
     |████████████████████████████████| 3.3MB 4.6MB/s 
Requirement already satisfied: imgaug==0.2.9 in /usr/local/lib/python3.6/dist-packages (from keras-segmentation==0.3.0) (0.2.9)
Requirement already satisfied: opencv-python in /usr/local/lib/python3.6/dist-packages (from keras-segmentation==0.3.0) (4.1.2.30)
Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (from keras-segmentation==0.3.0) (4.41.1)
Requirement already satisfied: scipy>=0.14 in /usr/local/lib/python3.6/dist-packages (from Keras>=2.0.0->keras-segmentation==0.3.0) (1.4.1)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from Keras>=2.0.0->keras-segmentation==0.3.0) (2.10.0)
Requirement already satisfied: numpy>=1.9.1 in /usr/local/lib/python3.6/dist-packages (from Keras>=2.0.0->keras-segmentation==0.3.0) (1.18.5)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.6/dist-packages (from Keras>=2.0.0->keras-segmentation==0.3.0) (3.13)
Requirement already satisfied: pillow in /usr/local/lib/python3.6/dist-packages (from imageio==2.5.0->keras-segmentation==0.3.0) (7.0.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (from imgaug==0.2.9->keras-segmentation==0.3.0) (3.2.2)
Requirement already satisfied: Shapely in /usr/local/lib/python3.6/dist-packages (from imgaug==0.2.9->keras-segmentation==0.3.0) (1.7.1)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from imgaug==0.2.9->keras-segmentation==0.3.0) (1.15.0)
Requirement already satisfied: scikit-image>=0.11.0 in /usr/local/lib/python3.6/dist-packages (from imgaug==0.2.9->keras-segmentation==0.3.0) (0.16.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib->imgaug==0.2.9->keras-segmentation==0.3.0) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->imgaug==0.2.9->keras-segmentation==0.3.0) (1.2.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->imgaug==0.2.9->keras-segmentation==0.3.0) (2.4.7)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->imgaug==0.2.9->keras-segmentation==0.3.0) (2.8.1)
Requirement already satisfied: networkx>=2.0 in /usr/local/lib/python3.6/dist-packages (from scikit-image>=0.11.0->imgaug==0.2.9->keras-segmentation==0.3.0) (2.5)
Requirement already satisfied: PyWavelets>=0.4.0 in /usr/local/lib/python3.6/dist-packages (from scikit-image>=0.11.0->imgaug==0.2.9->keras-segmentation==0.3.0) (1.1.1)
Requirement already satisfied: decorator>=4.3.0 in /usr/local/lib/python3.6/dist-packages (from networkx>=2.0->scikit-image>=0.11.0->imgaug==0.2.9->keras-segmentation==0.3.0) (4.4.2)
Building wheels for collected packages: keras-segmentation
  Building wheel for keras-segmentation (setup.py) ... done
  Created wheel for keras-segmentation: filename=keras_segmentation-0.3.0-cp36-none-any.whl size=30880 sha256=8796c58f35b298520c6fa79392488741869fee3c3d399fa365ef94dea851e219
  Stored in directory: /tmp/pip-ephem-wheel-cache-jjqtr4ic/wheels/31/3a/95/c6b5f8623b2525bdce6b13b95bb254a5237fb7e6e947e5365a
Successfully built keras-segmentation
ERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.
Installing collected packages: imageio, keras-segmentation
  Found existing installation: imageio 2.4.1
    Uninstalling imageio-2.4.1:
      Successfully uninstalled imageio-2.4.1
Successfully installed imageio-2.5.0 keras-segmentation-0.3.0

Loading the data

We download the datasets for the competition using !wget and their corresponding URLs

In [ ]:
# Downloading training data
!wget https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_train.npz

# Downloading training data 
!wget https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/labels_train.npz

# Downloading testing data 
!wget https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_test_1.npz
--2020-10-21 14:18:37--  https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_train.npz
Resolving datasets.aicrowd.com (datasets.aicrowd.com)... 35.189.208.115
Connecting to datasets.aicrowd.com (datasets.aicrowd.com)|35.189.208.115|:443... connected.
HTTP request sent, awaiting response... 302 FOUND
Location: https://s3.us-west-002.backblazeb2.com/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_train.npz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=002ae2491b744be0000000002%2F20201021%2Fus-west-002%2Fs3%2Faws4_request&X-Amz-Date=20201021T141838Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=4a26457c18bef7c58841d48ca0ccf31a3029fcaf246f29af72a25b17f3a79fb4 [following]
--2020-10-21 14:18:38--  https://s3.us-west-002.backblazeb2.com/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_train.npz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=002ae2491b744be0000000002%2F20201021%2Fus-west-002%2Fs3%2Faws4_request&X-Amz-Date=20201021T141838Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=4a26457c18bef7c58841d48ca0ccf31a3029fcaf246f29af72a25b17f3a79fb4
Resolving s3.us-west-002.backblazeb2.com (s3.us-west-002.backblazeb2.com)... 206.190.215.254
Connecting to s3.us-west-002.backblazeb2.com (s3.us-west-002.backblazeb2.com)|206.190.215.254|:443... connected.
HTTP request sent, awaiting response... 200 
Length: 1715555445 (1.6G) [application/octet-stream]
Saving to: ‘data_train.npz’

data_train.npz      100%[===================>]   1.60G  27.2MB/s    in 49s     

2020-10-21 14:19:27 (33.7 MB/s) - ‘data_train.npz’ saved [1715555445/1715555445]

--2020-10-21 14:19:27--  https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/labels_train.npz
Resolving datasets.aicrowd.com (datasets.aicrowd.com)... 35.189.208.115
Connecting to datasets.aicrowd.com (datasets.aicrowd.com)|35.189.208.115|:443... connected.
HTTP request sent, awaiting response... 302 FOUND
Location: https://s3.us-west-002.backblazeb2.com/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/labels_train.npz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=002ae2491b744be0000000002%2F20201021%2Fus-west-002%2Fs3%2Faws4_request&X-Amz-Date=20201021T141928Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=55ed937229323c72c568ee1d811984a3d67e36f6cd99b669986819d73d446d0c [following]
--2020-10-21 14:19:28--  https://s3.us-west-002.backblazeb2.com/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/labels_train.npz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=002ae2491b744be0000000002%2F20201021%2Fus-west-002%2Fs3%2Faws4_request&X-Amz-Date=20201021T141928Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=55ed937229323c72c568ee1d811984a3d67e36f6cd99b669986819d73d446d0c
Resolving s3.us-west-002.backblazeb2.com (s3.us-west-002.backblazeb2.com)... 206.190.215.254
Connecting to s3.us-west-002.backblazeb2.com (s3.us-west-002.backblazeb2.com)|206.190.215.254|:443... connected.
HTTP request sent, awaiting response... 200 
Length: 7160425 (6.8M) [application/octet-stream]
Saving to: ‘labels_train.npz’

labels_train.npz    100%[===================>]   6.83M  23.4MB/s    in 0.3s    

2020-10-21 14:19:29 (23.4 MB/s) - ‘labels_train.npz’ saved [7160425/7160425]

--2020-10-21 14:19:29--  https://datasets.aicrowd.com/default/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_test_1.npz
Resolving datasets.aicrowd.com (datasets.aicrowd.com)... 35.189.208.115
Connecting to datasets.aicrowd.com (datasets.aicrowd.com)|35.189.208.115|:443... connected.
HTTP request sent, awaiting response... 302 FOUND
Location: https://s3.us-west-002.backblazeb2.com/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_test_1.npz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=002ae2491b744be0000000002%2F20201021%2Fus-west-002%2Fs3%2Faws4_request&X-Amz-Date=20201021T141930Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=bd0d97469d850f5429dcb93ab47982d4ac29d4bcf6dccf795aaed45f68df1a4d [following]
--2020-10-21 14:19:30--  https://s3.us-west-002.backblazeb2.com/aicrowd-public-datasets/seamai-facies-challenge/v0.1/public/data_test_1.npz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=002ae2491b744be0000000002%2F20201021%2Fus-west-002%2Fs3%2Faws4_request&X-Amz-Date=20201021T141930Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=bd0d97469d850f5429dcb93ab47982d4ac29d4bcf6dccf795aaed45f68df1a4d
Resolving s3.us-west-002.backblazeb2.com (s3.us-west-002.backblazeb2.com)... 206.190.215.254
Connecting to s3.us-west-002.backblazeb2.com (s3.us-west-002.backblazeb2.com)|206.190.215.254|:443... connected.
HTTP request sent, awaiting response... 200 
Length: 731382806 (698M) [application/octet-stream]
Saving to: ‘data_test_1.npz’

data_test_1.npz     100%[===================>] 697.50M  91.5MB/s    in 8.2s    

2020-10-21 14:19:38 (85.3 MB/s) - ‘data_test_1.npz’ saved [731382806/731382806]

Don't look in here.......

In [ ]:
#@title
import matplotlib.colors as mcolors
# Helpful functions
def hex_to_rgb(value):
    '''
    Converts hex to rgb colours
    value: string of 6 characters representing a hex colour.
    Returns: list length 3 of RGB values'''
    value = value.strip("#") # removes hash symbol if present
    lv = len(value)
    return tuple(int(value[i:i + lv // 3], 16) for i in range(0, lv, lv // 3))


def rgb_to_dec(value):
    '''
    Converts rgb to decimal colours (i.e. divides each value by 256)
    value: list (length 3) of RGB values
    Returns: list (length 3) of decimal values'''
    return [v/256 for v in value]

def get_continuous_cmap(hex_list, float_list=None):
    ''' creates and returns a color map that can be used in heat map figures.
        If float_list is not provided, colour map graduates linearly between each color in hex_list.
        If float_list is provided, each color in hex_list is mapped to the respective location in float_list. 
        
        Parameters
        ----------
        hex_list: list of hex code strings
        float_list: list of floats between 0 and 1, same length as hex_list. Must start with 0 and end with 1.
        
        Returns
        ----------
        colour map'''
    rgb_list = [rgb_to_dec(hex_to_rgb(i)) for i in hex_list]
    if float_list:
        pass
    else:
        float_list = list(np.linspace(0,1,len(rgb_list)))
        
    cdict = dict()
    for num, col in enumerate(['red', 'green', 'blue']):
        col_list = [[float_list[i], rgb_list[i][num], rgb_list[i][num]] for i in range(len(float_list))]
        cdict[col] = col_list
    cmp = mcolors.LinearSegmentedColormap('my_cmp', segmentdata=cdict, N=256)
    return cmp

We've got to talk about Seismic (Attributes)

We are going to make a gentle and very visual introduction to seismic attributes. These are widely used in the Oil&Gas industry to aid Geophysicists and Geologist to find new prospects fast and easy. Of course that afterwards, each potential prospect is further analyzed to confirm if it is just an anomaly in the seismic image or in fact, a potential prospect.

Visualizing the seismic slice over Y=380

We are going to visualize one of our seismic sections over which we are going to calculate differente seismic attributes.

In [ ]:
# Taking a seismic slice for calculating seismic attributes
data_seis = np.load('/content/data_train.npz', 
               allow_pickle=True, mmap_mode = 'r')['data'][:,:,380]
In [ ]:
# Setting figure size
plt.rcParams["figure.figsize"] = (20, 10)
# We plot the slice
plt.imshow(data_seis, cmap='binary')
Out[ ]:
<matplotlib.image.AxesImage at 0x7fae78be29b0>

Instantaneous Attributes

They are defined by taking the analytic trace in consideration. This trace is obtained by taking the Hilbert transform over the real trace, and thus, obtaining the complex part of the analytic trace. By doing this, the seismic trace is composed by a real part (the real trace) and a complex part (its Hilbert transform.

This trace now allows us to calculate the so called 'Instantaneous Attributes'. These seismic attributes were of the first ever used when looking for DHI (Direct Hydrocarbon Indicators), generally by looking at the Envelope searching for amplitude anomalies.

The analytic trace $u(t)$ is defined as: $u(t) = x(t) + i y(t)$ , where $x(t)$ is the real trace and $y(t)$ is the Hilbert transform of $x(t)$ ( $H[x(t)]$ )

In [ ]:
# We calculate the Hilbert transform for every trace in the slice to obtain the analytic traces
def calc_hil(data):
  H = []
  for i in range(data.shape[1]):
    H.append(np.imag(scipy.signal.hilbert(data[:,i])))
  H = np.asarray(H)
  H = np.swapaxes(H, 0,1)
  return H

H = calc_hil(data_seis)
In [ ]:
# We plot the slice
plt.imshow(H, cmap='binary')
Out[ ]:
<matplotlib.image.AxesImage at 0x7fae786d0278>

Ok, so at first glance this might seem the same thing... So let's go a little further

Envelope

This seismic attribute contributes information about the strength of a reflection, therefore giving information about "how different" are two geologic formations on the subsurface. The envelope is defined as: $E(t)=\sqrt{x^2 + y^2}$

In [ ]:
# We calculate the Envelope
def calc_env(data):
  env = data.copy()*0.0
  for i in range(data.shape[1]):
    for j in range(data.shape[0]):
      env[j,i] = np.sqrt((data[j,i])**2 + (H[j,i])**2)
  return env

env = calc_env(data_seis)
# We plot the slice
plt.imshow(env, cmap='jet')
Out[ ]:
<matplotlib.image.AxesImage at 0x7fae76e3c358>

Well... This looks a little bit more promising now, doesn't it? I can already assure you that some geophysicist or geologist are looking into those bright anomalies next to that main fault drooling over the potential fields... This could be enhanced visually by changing the middle values of the palette. Let's keep going...

Instantaneous Phase

This one is kind of a trickier one. It generally helps to define the edges of the reflectors and also helps on defining discontinuities. Its meaning it's a little difficult to define, some say that it has to do with the wave front, thus, the values that have a certain continuity correspond to the same wave front, and since we are looking at reflections, well, we could be seeing the same reflector if we follow a certain pattern. It is defined as: $ \theta = atan\left (\frac{x(t)}{y(t)} \right)$ . This generates some troubles over certain values, therefore it's useful to take the Cosine of that function, which is what I will do here.

In [ ]:
# We calculate the Cosine of the Instantaneous Phase
def cos_ph(data):
  ph = data.copy()*0.0
  for i in range(data.shape[1]):
    for j in range(data.shape[0]):
      ph[j,i] = np.cos((data[j,i])/(H[j,i]))
  return ph

ph = cos_ph(data_seis)
# We plot the slice
hex_list = ['#000000', '#ffffff', '#000000']
plt.imshow(ph, cmap=get_continuous_cmap(hex_list))
Out[ ]:
<matplotlib.image.AxesImage at 0x7fae76da5358>