
Environment Classification

Solution for submission 156842

A detailed solution for submission 156842 submitted for challenge Environment Classification

konstantin_diachkov

Environment Classification

In this challenge, you are given images from a self-driving car moving through a town in different weather conditions. The goal is to classify the environment into 5 classes using unsupervised methods: 1 means the weather is very good for a self-driving car, while 5 means the weather is very challenging for it.

  • Unsupervised Image Classification

Image clustering using Transfer learning

Resnet50 + Kmeans based image clustering model

https://towardsdatascience.com/image-clustering-using-transfer-learning-df5862779571
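The clustering half of this approach can be illustrated without any images. Below is a minimal NumPy sketch of k-means (Lloyd's algorithm) on synthetic feature vectors. It is not the scikit-learn implementation used later in the notebook: the function name kmeans_sketch, the farthest-first initialisation and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def kmeans_sketch(features, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    # Farthest-first initialisation keeps this sketch deterministic
    # (scikit-learn's KMeans defaults to k-means++ instead).
    centroids = [features[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(features - c, axis=1) for c in centroids], axis=0)
        centroids.append(features[nearest.argmax()])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Distance of every point to every centroid, shape (n_points, k)
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if the cluster is empty)
        new_centroids = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic stand-in for image features: two well-separated groups of 8-d vectors
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 8))
group_b = rng.normal(loc=5.0, scale=0.1, size=(20, 8))
features = np.vstack([group_a, group_b])

labels, centroids = kmeans_sketch(features, k=2)
print(labels)
```

In the notebook itself the feature vectors come from ResNet50 and the clustering is done by sklearn.cluster.KMeans with 5 clusters.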

In [1]:
!pip install -q aicrowd-cli
%load_ext aicrowd.magic
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.26.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
In [2]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/qBNvIq0YzlT6GTsihXKlzHlxe7hIACPEFHrs6maKIgQ
API Key valid
Saved API Key successfully!
In [3]:
# Downloading the Dataset
!rm -rf data
!mkdir data
%aicrowd ds dl -c environment-classification -o data
In [4]:
# Unzipping and Organising the datasets
!unzip data/images.zip  -d data/images > /dev/null
In [5]:
import os
import csv 
from pathlib import Path
import random
import time

import pandas as pd
import numpy as np
In [6]:
DATA_DIR = "data/images/"

Model

In [7]:
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import Sequential

resnet = ResNet50(include_top=False, pooling='avg', weights='imagenet')
my_new_model = Sequential()
my_new_model.add(resnet)
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94773248/94765736 [==============================] - 1s 0us/step
94781440/94765736 [==============================] - 1s 0us/step
In [8]:
# Freeze the first layer (the ResNet50 base): it is already trained
my_new_model.layers[0].trainable = False
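With include_top=False and pooling='avg', the network ends in a global average pooling step, so each image is reduced to one vector with one value per channel of the last convolutional block (2048 for ResNet50). A small NumPy sketch of that pooling step follows. The 7x7 spatial size is an assumption (it is what a 224x224 input would produce), and global_average_pool is an illustrative helper, not a Keras function.

```python
import numpy as np

def global_average_pool(feature_map):
    """Average a (height, width, channels) feature map over its spatial axes."""
    return feature_map.mean(axis=(0, 1))

# Hypothetical final feature map: 7x7 spatial grid, 2048 channels
feature_map = np.ones((7, 7, 2048), dtype=np.float32)
feature_map[..., 0] = 3.0  # make the first channel stand out

vector = global_average_pool(feature_map)
print(vector.shape)  # (2048,)
```

Because the spatial axes are averaged away, the length of the vector does not depend on the input image size.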

Images Preprocessing

In [9]:
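%%time
from tensorflow.keras.applications.resnet50 import preprocess_input
import cv2 
import numpy as np

# Extract one ResNet50 feature vector per image
resnet_feature_list = []
images = os.listdir(DATA_DIR)
for image in images:
    file = os.path.join(DATA_DIR, image)
    #print(file)
    # Note: cv2.imread returns channels in BGR order, while preprocess_input expects RGB input
    im = cv2.imread(file)
    #im = cv2.resize(im,(256,256))
    # Add a batch dimension and apply the ResNet50 preprocessing
    img = preprocess_input(np.expand_dims(im.copy(), axis=0))
    resnet_feature = my_new_model.predict(img)
    resnet_feature_np = np.array(resnet_feature)
    resnet_feature_list.append(resnet_feature_np.flatten())

# Stack the per-image vectors into one (n_images, n_features) array
array = np.array(resnet_feature_list)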
CPU times: user 1min 13s, sys: 2.45 s, total: 1min 15s
Wall time: 1min 39s
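One detail of the preprocessing is worth spelling out: OpenCV loads images in BGR channel order, whereas the ResNet50 preprocess_input function expects RGB and converts to BGR internally before subtracting the ImageNet channel means. The NumPy sketch below mimics that behaviour on a single pixel. It is not the Keras code itself; the helper names are illustrative and the mean values are the commonly quoted ImageNet means, which should be checked against the Keras source.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by caffe-style preprocessing
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def bgr_to_rgb(image):
    """Reverse the last (channel) axis, turning BGR into RGB or vice versa."""
    return image[..., ::-1]

def caffe_style_preprocess(rgb_image):
    """Sketch of caffe-style preprocessing: RGB -> BGR, then subtract channel means."""
    bgr = rgb_image[..., ::-1].astype(np.float32)
    return bgr - IMAGENET_MEAN_BGR

# A single pure-blue pixel as OpenCV would load it: (B, G, R) = (255, 0, 0)
pixel_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
pixel_rgb = bgr_to_rgb(pixel_bgr)
processed = caffe_style_preprocess(pixel_rgb)
print(pixel_rgb[0, 0], processed[0, 0])
```

Applied to the notebook, this would mean reversing the channel axis of im before calling preprocess_input; whether that changes the resulting clusters has not been tested here.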
In [10]:
array.shape
Out[10]:
(700, 2048)
In [11]:
from sklearn.cluster import KMeans 

# random_state=None: cluster assignments and numbering can differ between runs
kmeans = KMeans(n_clusters=5, random_state=None, n_init=50, max_iter=1000).fit(array)

print(kmeans.labels_)
[3 4 3 4 4 4 3 1 4 4 3 0 0 4 4 4 3 4 2 4 2 4 3 3 1 4 4 2 4 4 1 4 3 0 4 4 0
 3 3 3 1 2 3 3 4 3 1 4 4 3 2 2 3 4 1 2 3 3 4 3 2 3 4 2 3 0 2 1 2 3 2 4 4 4
 4 4 4 2 4 3 4 4 4 0 3 2 3 1 2 4 4 4 1 4 3 1 3 3 1 1 3 4 4 1 1 3 4 2 3 3 3
 3 3 4 4 4 4 3 2 3 3 1 4 2 3 3 3 4 4 4 4 3 3 3 0 3 2 3 3 4 4 3 3 1 3 4 2 1
 3 3 2 2 4 2 2 4 4 4 3 3 2 3 4 4 3 2 2 4 4 3 1 1 3 4 1 2 2 1 3 4 3 3 3 3 3
 2 2 4 3 1 1 3 3 4 4 4 2 2 3 4 4 4 1 1 4 3 4 2 3 4 4 4 2 3 2 1 3 4 2 2 4 2
 1 1 1 3 3 1 3 3 2 4 3 4 2 2 4 1 4 4 4 4 4 4 4 4 2 2 4 4 4 3 3 1 1 4 2 2 1
 3 1 4 4 2 4 4 3 4 3 3 4 3 2 4 4 4 2 2 3 1 4 4 3 4 1 3 3 4 4 1 4 1 4 2 4 2
 2 4 2 1 3 2 4 3 4 2 4 4 4 3 4 1 1 3 3 1 4 1 3 1 4 1 4 1 2 1 2 3 3 4 2 3 1
 2 2 3 1 1 3 4 4 3 1 2 1 2 4 1 2 2 3 4 3 3 2 4 1 3 3 3 1 3 2 3 2 0 1 4 1 2
 3 4 4 1 0 3 1 1 4 3 4 4 3 4 3 1 4 3 4 0 4 4 3 4 4 3 3 4 0 4 2 4 3 3 3 4 0
 2 2 1 4 3 3 3 4 3 3 4 2 3 4 1 1 3 4 1 2 2 3 3 4 4 2 3 4 2 4 4 3 4 4 4 3 2
 4 4 2 2 3 4 3 2 3 1 4 3 3 4 3 3 4 1 3 1 0 4 1 2 3 4 1 4 3 2 1 4 4 4 3 4 1
 3 2 4 3 4 3 2 3 1 4 1 3 4 4 1 1 1 3 3 3 2 3 3 3 1 3 2 4 1 1 1 4 3 4 4 3 1
 4 2 3 1 4 2 3 3 0 3 3 3 1 4 4 3 4 4 3 3 3 4 1 4 4 2 3 3 1 4 2 1 4 3 2 0 4
 3 4 2 4 1 4 0 2 4 1 3 3 0 0 3 3 3 1 2 3 2 4 2 4 1 1 2 3 3 3 4 4 3 1 2 4 3
 3 3 2 1 3 4 3 3 3 2 3 1 2 1 3 4 4 2 4 3 1 3 4 4 3 1 3 4 3 4 3 4 4 2 2 3 1
 3 4 2 0 3 4 2 1 1 3 1 1 3 4 2 4 1 4 1 0 4 4 3 4 2 4 4 4 4 1 3 1 3 3 4 4 4
 4 3 1 4 3 3 3 4 1 3 3 2 3 4 3 3 3 3 3 3 4 4 2 2 2 3 3 3 3 4 3 1 3 2]

Submission

In [12]:
img_ids_list = [f[:-4] for f in images]
In [15]:
img_ids_list[0]
Out[15]:
'200'
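The slice f[:-4] assumes every file name ends in a four-character suffix such as ".jpg". A slightly more robust variant, sketched below, uses pathlib to drop the extension whatever its length. The file names are hypothetical examples and image_id is an illustrative helper.

```python
from pathlib import Path

def image_id(filename):
    """Return the file name without its extension."""
    return Path(filename).stem

# Hypothetical file names with extensions of different lengths
ids = [image_id(name) for name in ["200.jpg", "17.jpeg", "5.png"]]
print(ids)  # ['200', '17', '5']
```

With the slice, "17.jpeg" would become "17." and the later conversion to int would fail.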
In [23]:
pre_sub = {'ImageID':img_ids_list, "label":kmeans.labels_}
pre_sub = pd.DataFrame(pre_sub)

pre_sub = pre_sub.astype(int)
pre_sub = pre_sub.sort_values(by=['ImageID'])  

pre_sub
Out[23]:
     ImageID  label
160        0      2
127        1      4
410        2      4
484        3      3
324        4      2
..       ...    ...
528      695      3
94       696      3
175      697      2
249      698      4
97       699      3

700 rows × 2 columns
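The astype(int) call before sort_values matters: the IDs are extracted from file names as strings, and strings sort lexicographically rather than numerically. A short illustration with made-up IDs:

```python
# Made-up image IDs, as they would come out of the file names (strings)
ids_as_text = ["10", "9", "100", "2"]

text_order = sorted(ids_as_text)
numeric_order = sorted(int(i) for i in ids_as_text)

print(text_order)     # ['10', '100', '2', '9']
print(numeric_order)  # [2, 9, 10, 100]
```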

In [24]:
pre_sub.label.value_counts()
Out[24]:
4    231
3    224
2    114
1    111
0     20
Name: label, dtype: int64

Cluster 0 contains only 20 images, so we treat them as misclassified, remove them, and repeat the feature extraction and clustering.
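The decision above was made by eye from the value counts. It could also be expressed as a simple rule, for example flagging every cluster that holds less than a chosen share of the images. The sketch below uses the cluster sizes reported above; the 5% threshold and the helper small_clusters are assumptions for illustration, not part of the original solution.

```python
from collections import Counter

def small_clusters(labels, min_fraction=0.05):
    """Return the cluster ids whose share of all points is below min_fraction."""
    counts = Counter(labels)
    total = len(labels)
    return sorted(c for c, n in counts.items() if n / total < min_fraction)

# Rebuild a label list with the cluster sizes reported above
labels = [4] * 231 + [3] * 224 + [2] * 114 + [1] * 111 + [0] * 20

print(small_clusters(labels))  # [0]
```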

In [33]:
to_del = np.array(pre_sub[pre_sub.label == 0].ImageID)
to_del = set(to_del)
images_clean = []
for image in images:
    if int(image[:-4]) not in to_del:
        images_clean.append(image)
len(images_clean)
Out[33]:
680
In [34]:
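%%time
from tensorflow.keras.applications.resnet50 import preprocess_input
import cv2 
import numpy as np

# Repeat the feature extraction, this time only for the images that were kept
resnet_feature_list = []
# images = [f for f in os.listdir(DATA_DIR)]
for image in images_clean:
    file = os.path.join(DATA_DIR, image)
    #print(file)
    # Note: cv2.imread returns channels in BGR order, while preprocess_input expects RGB input
    im = cv2.imread(file)
    #im = cv2.resize(im,(256,256))
    # Add a batch dimension and apply the ResNet50 preprocessing
    img = preprocess_input(np.expand_dims(im.copy(), axis=0))
    resnet_feature = my_new_model.predict(img)
    resnet_feature_np = np.array(resnet_feature)
    resnet_feature_list.append(resnet_feature_np.flatten())

# Stack the per-image vectors into one (n_images, n_features) array
array = np.array(resnet_feature_list)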
CPU times: user 1min 10s, sys: 1.56 s, total: 1min 11s
Wall time: 1min 10s
In [36]:
array.shape
Out[36]:
(680, 2048)
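The second pass recomputes ResNet50 features for images whose features were already available from the first pass. An alternative, which is not what the notebook does, would be to filter the existing feature array with a boolean mask built from the first clustering. A minimal sketch with a tiny stand-in array and made-up file names:

```python
import numpy as np

# Tiny stand-ins for the (700, 2048) feature array, the first-pass labels and the file names
features = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.array([3, 0, 4, 4, 0, 1])
image_names = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg", "f.jpg"]

# Keep every row that was not assigned to the small cluster 0
keep = labels != 0
features_clean = features[keep]
names_clean = [name for name, kept in zip(image_names, keep) if kept]

print(features_clean.shape)  # (4, 2)
print(names_clean)           # ['a.jpg', 'c.jpg', 'd.jpg', 'f.jpg']
```

This relies on the rows of the feature array and the list of file names being in the same order, which holds in the notebook because both are built in the same loop.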
In [37]:
from sklearn.cluster import KMeans 

# random_state=None: cluster assignments and numbering can differ between runs
kmeans = KMeans(n_clusters=5, random_state=None, n_init=50, max_iter=1000).fit(array)
In [38]:
img_ids_list_clean = [f[:-4] for f in images_clean]
In [84]:
pre_sub_2 = {'ImageID':img_ids_list_clean, "label":kmeans.labels_}
pre_sub_2 = pd.DataFrame(pre_sub_2)

pre_sub_2 = pre_sub_2.astype(int)

# Give each of the removed images a random label in 0..4
rnd_labels = [random.randint(0, 4) for _ in range(len(to_del))]
# Note: missing_labels is defined here but not used below
missing_labels = [3, 3, 2, 1, 1, 0, 1, 2, 1, 2, 2, 2, 3, 2, 1, 4, 0, 2, 0, 3]

ending = {'ImageID':list(to_del), 'label':rnd_labels}
ending = pd.DataFrame(ending)

# Combine the clustered and the randomly labelled images into one table
submission = pd.concat([pre_sub_2, ending], axis=0)
submission = submission.sort_values(by=['ImageID'])
Out[84]:
20
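Because random.randint draws from the global, unseeded generator, the labels given to the 20 removed images change on every run. If a reproducible submission is wanted, a dedicated seeded generator is one option. The sketch below is an assumption about how that could look; the helper random_labels and the seed value are illustrative, not part of the original solution.

```python
import random

def random_labels(n, n_classes=5, seed=0):
    """Draw n labels in 0..n_classes-1 from a private, seeded generator."""
    rng = random.Random(seed)
    return [rng.randint(0, n_classes - 1) for _ in range(n)]

first = random_labels(20)
second = random_labels(20)
print(first == second)  # True: same seed, same labels
```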
In [65]:
!rm -rf assets
!mkdir assets

submission.to_csv(os.path.join("assets", "submission.csv"), index=False)
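Before uploading, it can help to sanity-check the written file. The sketch below reads CSV text with the standard csv module and reports a few obvious problems. The expected header (ImageID, label) and the label range 0..4 are taken from what this notebook writes; whether the challenge requires exactly this is an assumption, and check_submission is an illustrative helper.

```python
import csv
import io

def check_submission(csv_text, expected_rows, n_classes=5):
    """Return a list of problems found in the submission text (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    problems = []
    if reader.fieldnames != ["ImageID", "label"]:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    ids = [row["ImageID"] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("duplicate ImageID values")
    for row in rows:
        value = row.get("label") or ""
        if not value.isdigit() or int(value) >= n_classes:
            problems.append(f"bad label for ImageID {row['ImageID']}: {value!r}")
    return problems

# Small made-up examples
print(check_submission("ImageID,label\n0,2\n1,4\n", expected_rows=2))  # []
print(check_submission("ImageID,label\n0,2\n0,7\n", expected_rows=2))
```

For the real file, the text could be read with open(os.path.join("assets", "submission.csv")).read() and checked with expected_rows=700.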
In [ ]:

Making a Direct Submission through the AIcrowd CLI

In [66]:

/usr/local/lib/python3.7/dist-packages/aicrowd/notebook/helpers.py:361: UserWarning: `%aicrowd` magic command can be used to save the notebook inside jupyter notebook/jupyterLab environment and also to get the notebook directly from the frontend without mounting the drive in colab environment. You can use magic command to skip mounting the drive and submit using the code below:
 %load_ext aicrowd.magic
%aicrowd notebook submit -c environment-classification -a assets --no-verify
  warnings.warn(description + code)
Mounting Google Drive 💾
Your Google Drive will be mounted to access the colab notebook
Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.activity.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fexperimentsandconfigs%20https%3a%2f%2fwww.googleapis.com%2fauth%2fphotos.native&response_type=code

Enter your authorization code:
4/1AX4XfWgbhNxSVaym8w-m_l37MSNBND4uLZlPzKW86wtP8nl4v_QZSpgZpkk
Mounted at /content/drive
Using notebook: BlitzXI_ResNet50_Kmeans_Cluster.ipynb for submission...
Scrubbing API keys from the notebook...
Collecting notebook...
submission.zip ━━━━━━━━━━━━━━━━━━━━━━ 100.0% 25.9/24.2 KB 1.4 MB/s 0:00:00
                                                       ╭─────────────────────────╮                                                       
                                                       │ Successfully submitted! │                                                       
                                                       ╰─────────────────────────╯                                                       
                                                             Important links                                                             
┌──────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│  This submission │ https://www.aicrowd.com/challenges/ai-blitz-xi/problems/environment-classification/submissions/156841              │
│                  │                                                                                                                    │
│  All submissions │ https://www.aicrowd.com/challenges/ai-blitz-xi/problems/environment-classification/submissions?my_submissions=true │
│                  │                                                                                                                    │
│      Leaderboard │ https://www.aicrowd.com/challenges/ai-blitz-xi/problems/environment-classification/leaderboards                    │
│                  │                                                                                                                    │
│ Discussion forum │ https://discourse.aicrowd.com/c/ai-blitz-xi                                                                        │
│                  │                                                                                                                    │
│   Challenge page │ https://www.aicrowd.com/challenges/ai-blitz-xi/problems/environment-classification                                 │
└──────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘