
Food Recognition Benchmark 2022

πŸ• Food Recognition Benchmark: Data Exploration & Baseline

Dataset exploration and `detectron2` baseline training code

shivam

🍕 Food Recognition Benchmark

Credits: This notebook is a fork of a notebook created by @shubhamai for a previous iteration of the challenge. You can find the original notebook here.


Problem Statement

Detecting & segmenting various kinds of food in an image. For example: someone walks into a new restaurant and is served a dish they have never seen before; our DL model comes to the rescue by identifying which of the classes it was trained on the dish belongs to!

Dataset

We will be using data from the Food Recognition Challenge - a benchmark for image-based food recognition that has been running since 2020.

https://www.aicrowd.com/challenges/food-recognition-benchmark-2022#datasets

We have a total of 39,962 training images, along with a 1,000-image validation set and a 3,000-image public test set (see the dataset listing below). All the images are RGB and the annotations are in MS-COCO format.

Evaluation

The evaluation metric is IoU, aka Intersection over Union (more about that later).

The actual score is computed by averaging the precision and recall values over IoU thresholds greater than 0.5.

https://www.aicrowd.com/challenges/food-recognition-challenge#evaluation-criteria
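
To build some intuition (this is just a minimal sketch, not the official scoring code), the IoU of a predicted mask and a ground-truth mask can be computed like this:

import numpy as np

def mask_iou(pred_mask, gt_mask):
    # Intersection over Union of two boolean masks of the same shape
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union > 0 else 0.0

# Toy example: two overlapping 6x6 squares on a 10x10 grid
pred = np.zeros((10, 10), dtype=bool); pred[2:8, 2:8] = True
gt = np.zeros((10, 10), dtype=bool); gt[4:10, 4:10] = True
print(mask_iou(pred, gt))  # 16 / 56 ≈ 0.29, which would not count at the 0.5 threshold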

What does this notebook contain?

  1. Setting our Workspace 💼

  2. Data Exploration 🧐

    • Reading Dataset
    • Data Visualisations
  3. Image Visualisation 🖼️

    • Reading Images
  4. Creating our Dataset 🔨

    • Fixing the Dataset
    • Creating our dataset
  5. Creating our Model 🏭

    • Creating R-CNN Model
    • Setting up hyperparameters
  6. Training the Model 🚂

    • Setting up Tensorboard
    • Start Training!
  7. Evaluating the model 🧪

    • Evaluating our Model
  8. Testing the Model 💯

    • Testing the Model
  9. Submitting our predictions 📝

  10. Generate More Data + Some tips & tricks 💡

Setting our Workspace 💼

In this section we will download our dataset, unzip it, install the detectron2 library, and import all the libraries that we will be using.

Downloading & Unzipping our Dataset

In [1]:
# Login to AIcrowd
!pip install aicrowd-cli > /dev/null
!aicrowd login
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.26.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
Please login here: https://api.aicrowd.com/auth/NnwCEee07iHXtuxG0ByOQmZge6Jb-VfXvENxWAjJhls
/usr/bin/xdg-open: 851: /usr/bin/xdg-open: www-browser: not found
/usr/bin/xdg-open: 851: /usr/bin/xdg-open: links2: not found
/usr/bin/xdg-open: 851: /usr/bin/xdg-open: elinks: not found
/usr/bin/xdg-open: 851: /usr/bin/xdg-open: links: not found
/usr/bin/xdg-open: 851: /usr/bin/xdg-open: lynx: not found
/usr/bin/xdg-open: 851: /usr/bin/xdg-open: w3m: not found
xdg-open: no method available for opening 'https://api.aicrowd.com/auth/NnwCEee07iHXtuxG0ByOQmZge6Jb-VfXvENxWAjJhls'
API Key valid
Saved API Key successfully!
In [2]:
# List dataset for this challenge
!aicrowd dataset list -c food-recognition-benchmark-2022

# Download dataset
!aicrowd dataset download -c food-recognition-benchmark-2022
                          Datasets for challenge #962                           
┌───┬────────────────────────────────┬────────────────────────────────┬────────┐
│ # │ Title                          │ Description                    │   Size │
├───┼────────────────────────────────┼────────────────────────────────┼────────┤
│ 0 │ public_validation_set_2.0.tar… │ Validation Dataset (contains   │    59M │
│   │                                │ 1000 images and 498            │        │
│   │                                │ categories, with annotations)  │        │
│ 1 │ public_test_release_2.0.tar.gz │ [Public] Testing Dataset       │   197M │
│   │                                │ (contains 3000 images and 498  │        │
│   │                                │ categories, without            │        │
│   │                                │ annotations)                   │        │
│ 2 │ public_training_set_release_2… │ Training Dataset (contains     │ 2.14GB │
│   │                                │ 39962 images and 498           │        │
│   │                                │ categories)                    │        │
└───┴────────────────────────────────┴────────────────────────────────┴────────┘
public_validation_set_2.0.tar.gz: 100% 62.4M/62.4M [00:05<00:00, 11.6MB/s]
public_test_release_2.0.tar.gz: 100% 207M/207M [00:12<00:00, 16.2MB/s]
public_training_set_release_2.0.tar.gz: 100% 2.30G/2.30G [02:38<00:00, 14.5MB/s]
In [3]:
# Create data directory
!mkdir -p data/ data/train data/val data/test
!cp *test* data/test && cd data/test && echo "Extracting test dataset" && tar -xvf *test* > /dev/null
!cp *val* data/val && cd data/val && echo "Extracting val dataset" &&  tar -xvf *val* > /dev/null
!cp *train* data/train && cd data/train && echo "Extracting train dataset" &&  tar -xvf *train* > /dev/null
Extracting test dataset
Extracting val dataset
Extracting train dataset

So, the data directory is something like this:
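
(Reconstructed from the paths used later in this notebook; the copied archives are omitted.)

data/
├── train/
│   ├── annotations.json
│   └── images/
├── val/
│   ├── annotations.json
│   └── images/
└── test/
    └── images/   (no annotations)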

Importing Necessary Libraries

In [4]:
# Making sure that we are using GPUs
!nvidia-smi
Tue Dec 21 06:43:52 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P8    27W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
In [1]:
# Colab has CUDA 11.1 pre-installed nowadays; downgrading torch to 1.9 (cu111 build) so it matches the detectron2 wheels
!pip install -U torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
!pip install cython pyyaml==5.1
!pip install -U pycocotools
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Requirement already satisfied: torch==1.9.0+cu111 in /usr/local/lib/python3.7/dist-packages (1.9.0+cu111)
Requirement already satisfied: torchvision==0.10.0+cu111 in /usr/local/lib/python3.7/dist-packages (0.10.0+cu111)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch==1.9.0+cu111) (3.10.0.2)
Requirement already satisfied: pillow>=5.3.0 in /usr/local/lib/python3.7/dist-packages (from torchvision==0.10.0+cu111) (7.1.2)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torchvision==0.10.0+cu111) (1.19.5)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
Requirement already satisfied: cython in /usr/local/lib/python3.7/dist-packages (0.29.24)
Requirement already satisfied: pyyaml==5.1 in /usr/local/lib/python3.7/dist-packages (5.1)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
Requirement already satisfied: pycocotools in /usr/local/lib/python3.7/dist-packages (2.0.3)
Requirement already satisfied: setuptools>=18.0 in /usr/local/lib/python3.7/dist-packages (from pycocotools) (57.4.0)
Requirement already satisfied: cython>=0.27.3 in /usr/local/lib/python3.7/dist-packages (from pycocotools) (0.29.24)
Requirement already satisfied: matplotlib>=2.1.0 in /usr/local/lib/python3.7/dist-packages (from pycocotools) (3.2.2)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (3.0.6)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (0.11.0)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (2.8.2)
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (1.19.5)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (1.3.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib>=2.1.0->pycocotools) (1.15.0)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
1.9.0+cu111 True
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

In [2]:
!nvidia-smi
Tue Dec 21 07:20:35 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P8    27W / 149W |      3MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
In [3]:
# install detectron2:
!pip install -U detectron2==0.6+cu111 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/index.html
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
Looking in links: https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/index.html
Collecting detectron2==0.6+cu111
  Downloading https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.9/detectron2-0.6%2Bcu111-cp37-cp37m-linux_x86_64.whl (6.9 MB)
     |████████████████████████████████| 6.9 MB 811 kB/s 
Requirement already satisfied: tqdm>4.29.0 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (4.62.3)
Requirement already satisfied: pycocotools>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (2.0.3)
Requirement already satisfied: Pillow>=7.1 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (7.1.2)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (3.2.2)
Requirement already satisfied: tensorboard in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (2.7.0)
Requirement already satisfied: fvcore<0.1.6,>=0.1.5 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (0.1.5.post20211023)
Requirement already satisfied: iopath<0.1.10,>=0.1.7 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (0.1.9)
Requirement already satisfied: yacs>=0.1.8 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (0.1.8)
Requirement already satisfied: black==21.4b2 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (21.4b2)
Requirement already satisfied: pydot in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (1.3.0)
Requirement already satisfied: hydra-core>=1.1 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (1.1.1)
Requirement already satisfied: tabulate in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (0.8.9)
Requirement already satisfied: cloudpickle in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (1.3.0)
Requirement already satisfied: termcolor>=1.1 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (1.1.0)
Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (0.16.0)
Requirement already satisfied: omegaconf>=2.1 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.6+cu111) (2.1.1)
Requirement already satisfied: mypy-extensions>=0.4.3 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (0.4.3)
Requirement already satisfied: typed-ast>=1.4.2 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (1.5.1)
Requirement already satisfied: regex>=2020.1.8 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (2021.11.10)
Requirement already satisfied: pathspec<1,>=0.8.1 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (0.9.0)
Requirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (3.10.0.2)
Requirement already satisfied: click>=7.1.2 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (7.1.2)
Requirement already satisfied: toml>=0.10.1 in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (0.10.2)
Requirement already satisfied: appdirs in /usr/local/lib/python3.7/dist-packages (from black==21.4b2->detectron2==0.6+cu111) (1.4.4)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from fvcore<0.1.6,>=0.1.5->detectron2==0.6+cu111) (1.19.5)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.7/dist-packages (from fvcore<0.1.6,>=0.1.5->detectron2==0.6+cu111) (5.1)
Requirement already satisfied: importlib-resources in /usr/local/lib/python3.7/dist-packages (from hydra-core>=1.1->detectron2==0.6+cu111) (5.4.0)
Requirement already satisfied: antlr4-python3-runtime==4.8 in /usr/local/lib/python3.7/dist-packages (from hydra-core>=1.1->detectron2==0.6+cu111) (4.8)
Requirement already satisfied: portalocker in /usr/local/lib/python3.7/dist-packages (from iopath<0.1.10,>=0.1.7->detectron2==0.6+cu111) (2.3.2)
Requirement already satisfied: setuptools>=18.0 in /usr/local/lib/python3.7/dist-packages (from pycocotools>=2.0.2->detectron2==0.6+cu111) (57.4.0)
Requirement already satisfied: cython>=0.27.3 in /usr/local/lib/python3.7/dist-packages (from pycocotools>=2.0.2->detectron2==0.6+cu111) (0.29.24)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.6+cu111) (3.0.6)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.6+cu111) (0.11.0)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.6+cu111) (2.8.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.6+cu111) (1.3.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->detectron2==0.6+cu111) (1.15.0)
Requirement already satisfied: zipp>=3.1.0 in /usr/local/lib/python3.7/dist-packages (from importlib-resources->hydra-core>=1.1->detectron2==0.6+cu111) (3.6.0)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (1.0.1)
Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (0.12.0)
Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (0.37.0)
Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (2.26.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (0.4.6)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (0.6.1)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (1.8.0)
Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (1.35.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (3.3.6)
Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (1.42.0)
Requirement already satisfied: protobuf>=3.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.6+cu111) (3.17.3)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->detectron2==0.6+cu111) (4.2.4)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->detectron2==0.6+cu111) (4.8)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->detectron2==0.6+cu111) (0.2.8)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard->detectron2==0.6+cu111) (1.3.0)
Requirement already satisfied: importlib-metadata>=4.4 in /usr/local/lib/python3.7/dist-packages (from markdown>=2.6.8->tensorboard->detectron2==0.6+cu111) (4.8.2)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard->detectron2==0.6+cu111) (0.4.8)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.6+cu111) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.6+cu111) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.6+cu111) (2.0.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.6+cu111) (1.24.3)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard->detectron2==0.6+cu111) (3.1.1)
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
Installing collected packages: detectron2
  Attempting uninstall: detectron2
    WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
    Found existing installation: detectron2 0.6+cu101
    Uninstalling detectron2-0.6+cu101:
      Successfully uninstalled detectron2-0.6+cu101
WARNING: Ignoring invalid distribution -orch (/usr/local/lib/python3.7/dist-packages)
Successfully installed detectron2-0.6+cu111
In [4]:
# You may need to restart your runtime prior to this, to let your installation take effect
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import pandas as pd
import cv2
import json
from tqdm.notebook import tqdm

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import ColorMode
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

# For reading annotations file
from pycocotools.coco import COCO

# utilities
from pprint import pprint # For beautiful print!
from collections import OrderedDict
import os 

# For data visualisation
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import plotly.express as px
from google.colab.patches import cv2_imshow

Data Exploration 🧐

In this section we are going to read our dataset & do some data visualisation.

Reading Data

In [5]:
# Reading annotations.json
TRAIN_ANNOTATIONS_PATH = "data/train/annotations.json"
TRAIN_IMAGE_DIRECTIORY = "data/train/images/"

VAL_ANNOTATIONS_PATH = "data/val/annotations.json"
VAL_IMAGE_DIRECTIORY = "data/val/images/"

train_coco = COCO(TRAIN_ANNOTATIONS_PATH)
loading annotations into memory...
Done (t=8.87s)
creating index...
index created!
In [6]:
# Reading the annotation files
with open(TRAIN_ANNOTATIONS_PATH) as f:
  train_annotations_data = json.load(f)

with open(VAL_ANNOTATIONS_PATH) as f:
  val_annotations_data = json.load(f)
In [7]:
train_annotations_data['annotations'][0]
Out[7]:
{'area': 5059.0,
 'bbox': [39.5, 39.5, 167.0, 92.0],
 'category_id': 1352,
 'id': 184135,
 'image_id': 131094,
 'iscrowd': 0,
 'segmentation': [[115.0,
   206.5,
   98.0,
   204.5,
   74.5,
   182.0,
   65.0,
   167.5,
   47.5,
   156.0,
   39.5,
   137.0,
   39.5,
   130.0,
   51.0,
   118.5,
   62.00000000000001,
   112.5,
   76.0,
   113.5,
   121.5,
   151.0,
   130.5,
   169.0,
   131.5,
   185.0,
   128.5,
   195.0]]}

Data Format 🔍

Our COCO annotation data is structured something like this -

"info": {...},
"categories": [...],
"images": [...],
"annotations": [...],

Here categories looks like this:

[
  {'id': 2578,
  'name': 'water',
  'name_readable': 'Water',
  'supercategory': 'food'},
  {'id': 1157,
  'name': 'pear',
  'name_readable': 'Pear',
  'supercategory': 'food'},
  ...
  {'id': 1190,
  'name': 'peach',
  'name_readable': 'Peach',
  'supercategory': 'food'}
]

Info is empty (not sure why).

images looks like this:

[
  {'file_name': '065537.jpg', 
  'height': 464, 
  'id': 65537, 
  'width': 464},
  {'file_name': '065539.jpg', 
  'height': 464, 
  'id': 65539, 
  'width': 464},
 ...
  {'file_name': '069900.jpg', 
  'height': 391, 
  'id': 69900, 
  'width': 392},
]

Each entry in annotations looks like this:

{'area': 44320.0,
 'bbox': [86.5, 127.49999999999999, 286.0, 170.0],
 'category_id': 2578,
 'id': 102434,
 'image_id': 65537,
 'iscrowd': 0,
 'segmentation': [[235.99999999999997,
   372.5,
   169.0,
   372.5,
   ...
   368.5,
   264.0,
   371.5]]}
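
As a quick sanity check (a minimal sketch using the `train_coco` object and `train_annotations_data` loaded above), we can map an annotation's `category_id` back to its human-readable name:

# Build a lookup from category id to its human-readable name
id_to_name = {c["id"]: c["name_readable"] for c in train_coco.loadCats(train_coco.getCatIds())}

ann = train_annotations_data['annotations'][0]
print(id_to_name[ann['category_id']])  # readable class name of the first annotation
print(ann['bbox'])                     # [x, y, width, height] in pixels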
In [9]:
# Reading all classes
category_ids = train_coco.loadCats(train_coco.getCatIds())
category_names = [_["name_readable"] for _ in category_ids]

print("## Categories\n-", "\n- ".join(category_names))
## Categories
- Bread, wholemeal
- Jam
- Water
- Bread, sourdough
- Banana
- Soft cheese
- Ham, raw
- Hard cheese
- Cottage cheese
- Bread, half white
- Coffee, with caffeine
- Fruit salad
- Pancakes
- Tea
- Salmon, smoked
- Avocado
- Spring onion / scallion
- Ristretto, with caffeine
- Ham
- Egg
- Bacon, frying
- Chips, french fries
- Juice, apple
- Chicken
- Tomato, raw 
- Broccoli
- Shrimp, boiled
- Beetroot, steamed, without addition of salt
- Carrot, raw
- Chickpeas
- French salad dressing
- Pasta, Hörnli
- Sauce, cream
- Meat balls
- Pasta
- Tomato sauce
- Cheese
- Pear
- Cashew nut
- Almonds
- Lentils
- Mixed vegetables
- Peanut butter
- Apple
- Blueberries
- Cucumber
- Cocoa powder
- Greek Yaourt, yahourt, yogourt ou yoghourt
- Maple syrup (Concentrate)
- Buckwheat, grain peeled
- Butter
- Herbal tea
- Mayonnaise
- Soup, vegetable
- Wine, red
- Wine, white
- Green bean, steamed, without addition of salt
- Sausage
- Pizza, Margherita, baked
- Salami
- Mushroom
- (bread, meat substitute, lettuce, sauce)
- Tart
- Tea, verveine
- Rice
- White coffee, with caffeine
- Linseeds
- Sunflower seeds
- Ham, cooked
- Bell pepper, red, raw 
- Zucchini
- Green asparagus
- Tartar sauce
- Lye pretzel (soft)
- Cucumber, pickled 
- Curry, vegetarian
- Yaourt, yahourt, yogourt ou yoghourt, natural
- Soup of lentils, Dahl (Dhal)
- Soup, cream of vegetables
- Balsamic vinegar
- Salmon
- Salt cake (vegetables, filled)
- Bacon
- Orange
- Pasta, noodles
- Cream
- Cake, chocolate
- Pasta, spaghetti
- Black olives
- Parmesan
- Spaetzle
- Salad, lambs' ear
- Salad, leaf / salad, green
- Potatoes steamed
- White cabbage
- Halloumi
- Beetroot, raw
- Bread, grain
- Applesauce, unsweetened, canned
- Cheese for raclette
- Mushrooms
- Bread, white
- Curds, natural, with at most 10% fidm
- Bagel (without filling)
- Quiche, with cheese, baked, with puff pastry
- Soup, potato
- Bouillon, vegetable
- Beef, sirloin steak
- Taboulé, prepared, with couscous
- Eggplant
- Bread
- Turnover with meat (small meat pie, empanadas)
- Mungbean sprouts
- Mozzarella
- Pasta, penne
- Lasagne, vegetable, prepared
- Mandarine
- Kiwi
- French beans
- Tartar (meat)
- Spring roll (fried)
- Pork, chop
- Caprese salad (Tomato Mozzarella)
- Leaf spinach
- Roll of half-white or white flour, with large void
- Pasta, ravioli, stuffing
- Omelette, plain
- Tuna
- Dark chocolate
- Sauce (savoury)
- Dried raisins
- Ice tea
- Kaki
- Macaroon
- Smoothie
- Crêpe, plain
- Chicken nuggets
- Chili con carne, prepared
- Veggie burger
- Cream spinach
- Cod
- Chinese cabbage
- Hamburger (Bread, meat, ketchup)
- Soup, pumpkin
- Sushi
- Chestnuts
- Coffee, decaffeinated
- Sauce, soya
- Balsamic salad dressing
- Pasta, twist
- Bolognaise sauce
- Leek
- Fajita (bread only)
- Potato-gnocchi
- Beef, cut into stripes (only meat)
- Rice noodles/vermicelli
- Tea, ginger
- Tea, green
- Bread, whole wheat
- Onion
- Garlic
- Hummus
- Pizza, with vegetables, baked
- Beer
- Glucose drink 50g
- Chicken, wing
- Ratatouille
- Peanut
- High protein pasta (made of lentils, peas, ...)
- Cauliflower
- Quiche, with spinach, baked, with cake dough
- Green olives
- Brazil nut
- Eggplant caviar
- Bread, pita
- Pasta, wholemeal
- Sauce, pesto
- Oil
- Couscous
- Sauce, roast
- Prosecco
- Crackers
- Bread, toast
- Shrimp / prawn (small)
- Panna cotta
- Romanesco
- Water with lemon juice
- Espresso, with caffeine
- Egg, scrambled, prepared
- Juice, orange
- Ice cubes
- Braided white loaf
- Emmental cheese
- Croissant, wholegrain
- Hazelnut-chocolate spread(Nutella, Ovomaltine, Caotina)
- Tomme
- Water, mineral
- Hazelnut
- Bacon, raw
- Bread, nut
- Black Forest Tart
- Soup, Miso
- Peach
- Figs
- Beef, filet
- Mustard, Dijon
- Rice, Basmati
- Mashed potatoes, prepared, with full fat milk, with butter
- Dumplings
- Pumpkin
- Swiss chard
- Red cabbage
- Spinach, raw
- Naan (indien bread)
- Chicken curry (cream/coconut milk. curry spices/paste))
- Crunch Müesli
- Biscuits
- Bread, French (white flour)
- Meatloaf
- Fresh cheese
- Honey
- Vegetable mix, peas and carrots
- Parsley
- Brownie
- Dairy ice cream
- Tea, black
- Carrot cake
- Fish fingers (breaded)
- Salad dressing
- Dried meat
- Chicken, breast
- Mixed salad (chopped without sauce)
- Feta
- Praline
- Tea, peppermint
- Walnut
- Potato salad, with mayonnaise yogurt dressing
- Kebab in pita bread
- Kolhrabi
- Alfa sprouts
- Brussel sprouts
- Bacon, cooking
- Gruyère
- Bulgur
- Grapes
- Pork, escalope
- Chocolate egg, small
- Cappuccino
- Zucchini, stewed, without addition of fat, without addition of salt
- Crisp bread, Wasa
- Bread, black
- Perch fillets (lake)
- Rosti
- Mango
- Sandwich (ham, cheese and butter)
- Müesli
- Spinach, steamed, without addition of salt
- Fish
- Risotto, without cheese, cooked
- Milk Chocolate with hazelnuts
- Cake (oblong)
- Crisps
- Pork
- Pomegranate
- Sweet corn, canned
- Flakes, oat
- Greek salad
- Cantonese fried rice
- Sesame seeds
- Bouillon
- Baked potato
- Fennel
- Meat
- Bread, olive
- Croutons
- Philadelphia
- Mushroom, (average), stewed, without addition of fat, without addition of salt
- Bell pepper, red, stewed, without addition of fat, without addition of salt
- White chocolate
- Mixed nuts
- Breadcrumbs (unspiced)
- Fondue
- Sauce, mushroom
- Tea, spice
- Strawberries
- Tea, rooibos
- Pie, plum, baked, with cake dough
- Potatoes au gratin, dauphinois, prepared
- Capers
- Vegetables
- Bread, wholemeal toast
- Red radish
- Fruit tart
- Beans, kidney
- Sauerkraut
- Mustard
- Country fries
- Ketchup
- Pasta, linguini, parpadelle, Tagliatelle
- Chicken, cut into stripes (only meat)
- Cookies
- Sun-dried tomatoe
- Bread, Ticino
- Semi-hard cheese
- Margarine
- Porridge, prepared, with partially skimmed milk
- Soya drink (soy milk)
- Juice, multifruit
- Popcorn salted
- Chocolate, filled
- Milk chocolate
- Bread, fruit
- Mix of dried fruits and nuts
- Corn
- Tête de Moine
- Dates
- Pistachio
- Celery
- White radish
- Oat milk
- Cream cheese
- Bread, rye
- Witloof chicory
- Apple crumble
- Goat cheese (soft)
- Grapefruit, pomelo
- Risotto, with mushrooms, cooked
- Blue mould cheese
- Biscuit with Butter
- Guacamole
- Pecan nut
- Tofu
- Cordon bleu, from pork schnitzel, fried
- Paprika chips
- Quinoa
- Kefir drink
- M&M's
- Salad, rocket
- Bread, spelt
- Pizza, with ham, with mushrooms, baked
- Fruit coulis
- Plums
- Beef, minced (only meat)
- Pizza, with ham, baked
- Pineapple
- Soup, tomato
- Cheddar
- Tea, fruit
- Rice, Jasmin
- Seeds
- Focaccia
- Milk
- Coleslaw (chopped without sauce)
- Pastry, flaky
- Curd
- Savoury puff pastry stick
- Sweet potato
- Chicken, leg
- Croissant
- Sour cream
- Ham, turkey
- Processed cheese
- Fruit compotes
- Cheesecake
- Pasta, tortelloni, stuffing
- Sauce, cocktail
- Croissant with chocolate filling
- Pumpkin seeds
- Artichoke
- Champagne
- Grissini
- Sweets / candies
- Brie
- Wienerli (Swiss sausage)
- Syrup (diluted, ready to drink)
- Apple pie
- White bread with butter, eggs and milk
- Savoury puff pastry
- Anchovies
- Tuna, in oil, drained
- Lemon pie
- Meat terrine, paté
- Coriander
- Falafel (balls)
- Berries
- Latte macchiato, with caffeine
- Faux-mage Cashew, vegan chers
- Beans, white
- Sugar Melon
- Mixed seeds
- Hamburger
- Hamburger bun
- Oil & vinegar salad dressing
- Soya Yaourt, yahourt, yogourt ou yoghourt
- Chocolate milk, chocolate drink
- Celeriac
- Chocolate mousse
- Cenovis, yeast spread
- Thickened cream (> 35%)
- Meringue
- Lamb, chop
- Shrimp / prawn (large)
- Beef
- Lemon
- Croque monsieur
- Chives
- Chocolate cookies
- Birchermüesli, prepared, no sugar added
- Fish crunchies (battered)
- Muffin
- Savoy cabbage, steamed, without addition of salt
- Pine nuts
- Chorizo
- Chia grains
- Frying sausage
- French pizza from Alsace, baked
- Chocolate
- Cooked sausage
- Grits, polenta, maize flour
- Gummi bears, fruit jellies, Jelly babies with fruit essence
- Wine, rosé
- Coca Cola
- Raspberries
- Roll with pieces of chocolate
- Goat, (average), raw
- Lemon Cake
- Coconut milk
- Rice, wild
- Gluten-free bread
- Pearl onions
- Buckwheat pancake
- Bread, 5-grain
- Light beer
- Sugar, glazing
- Tzatziki
- Butter, herb
- Ham croissant
- Corn crisps
- Lentils green (du Puy, du Berry)
- Cocktail
- Rice, whole-grain
- Veal sausage
- Cervelat
- Sorbet
- Aperitif, with alcohol, apérol, Spritz
- Dips
- Corn Flakes
- Peas
- Tiramisu
- Apricots
- Cake, marble
- Lamb
- Lasagne, meat, prepared
- Coca Cola Zero
- Cake, salted
- Dough (puff pastry, shortcrust, bread, pizza dough)
- Rice waffels
- Sekt
- Brioche
- Vegetable au gratin, baked
- Mango dried
- Processed meat, Charcuterie
- Mousse
- Sauce, sweet & sour
- Basil
- Butter, spread, puree almond
- Pie, apricot, baked, with cake dough
- Rusk, wholemeal
- Beef, roast
- Vanille cream, cooked, Custard, Crème dessert
- Pasta in conch form
- Nuts
- Sauce, carbonara
- Fig, dried
- Pasta in butterfly form, farfalle
- Minced meat
- Carrot, steamed, without addition of salt
- Ebly
- Damson plum
- Shoots
- Bouquet garni
- Coconut
- Banana cake
- Waffle
- Apricot, dried
- Sauce, curry
- Watermelon, fresh
- Sauce, sweet-salted (asian)
- Pork, roast
- Blackberry
- Smoked cooked sausage of pork and beef meat sausag
- bean seeds
- Italian salad dressing
- White asparagus
- Pie, rhubarb, baked, with cake dough
- Tomato, stewed, without addition of fat, without addition of salt
- Cherries
- Nectarine
In [10]:
# Getting the number of images for each category
no_images_per_category = {}

for n, i in enumerate(train_coco.getCatIds()):
  imgIds = train_coco.getImgIds(catIds=i)
  label = category_names[n]
  no_images_per_category[label] = len(imgIds)

img_info = pd.DataFrame(train_coco.loadImgs(train_coco.getImgIds()))
no_images_per_category = OrderedDict(sorted(no_images_per_category.items(), key=lambda x: -1*x[1]))

# Top 30 categories, based on number of images
i = 0
for k, v in no_images_per_category.items():
  print(k, v)
  i += 1
  if i > 30:
    break
Water 2928
Salad, leaf / salad, green 2002
Bread, white 1891
Tomato, raw  1865
Butter 1601
Carrot, raw 1482
Bread, wholemeal 1452
Coffee, with caffeine 1406
Rice 1024
Egg 1015
Mixed vegetables 892
Apple 892
Jam 797
Cucumber 742
Wine, red 728
Banana 654
Cheese 646
Potatoes steamed 644
Bell pepper, red, raw  549
Hard cheese 547
Espresso, with caffeine 547
Tea 516
Bread, whole wheat 504
Mixed salad (chopped without sauce) 498
Avocado 480
White coffee, with caffeine 470
Tomato sauce 466
Wine, white 430
Broccoli 421
Strawberries 412
Pasta, spaghetti 398

Data Visualisations

In [15]:
fig = go.Figure([go.Bar(x=list(no_images_per_category.keys())[:50], y=list(no_images_per_category.values())[:50])])
fig.update_layout(
    title="No of Image per class",)

fig.show()

fig = go.Figure([go.Bar(x=list(no_images_per_category.keys())[50:200], y=list(no_images_per_category.values())[50:200])])
fig.update_layout(
    title="No of Image per class",)

fig.show()

fig = go.Figure([go.Bar(x=list(no_images_per_category.keys())[200:], y=list(no_images_per_category.values())[200:])])
fig.update_layout(
    title="No of Image per class",)

fig.show()
In [16]:
pprint(f"Average number of image per class : { sum(list(no_images_per_category.values())) / len(list(no_images_per_category.values())) }")
pprint(f"Highest number of image per class is : { list(no_images_per_category.keys())[0]} of { list(no_images_per_category.values())[0] }")
pprint(f"Lowest number of image per class is : Veggie Burger of { sorted(list(no_images_per_category.values()))[0] }")
'Average number of image per class : 141.359437751004'
'Highest number of image per class is : Water of 2928'
'Lowest number of image per class is : Veggie Burger of 12'
In [17]:
fig = go.Figure(data=[go.Pie(labels=list(no_images_per_category.keys()), values=list(no_images_per_category.values()), 
                             hole=.3, textposition='inside', )], )
fig.update_layout(
    title="No of Image per class ( In pie )",)
fig.show()
In [18]:
fig = go.Figure()
fig.add_trace(go.Histogram(x=img_info['height']))
fig.add_trace(go.Histogram(x=img_info['width']))

# Overlay both histograms
fig.update_layout(barmode='stack', title="Histogram of Image width & height",)


fig.show()

Image Visualisation 🖼️

In this section we are going to do some image visualisation!

In [19]:
print(img_info)
           id   file_name  width  height
0      131094  131094.jpg    480     480
1      131097  131097.jpg    391     390
2      131098  131098.jpg    391     390
3      131100  131100.jpg    391     390
4      131101  131101.jpg    391     390
...       ...         ...    ...     ...
39957  131017  131017.jpg    480     480
39958  131021  131021.jpg    464     464
39959  131053  131053.jpg    391     390
39960  131066  131066.jpg    464     464
39961  131071  131071.jpg    464     464

[39962 rows x 4 columns]
In [20]:
# `n` is left over from the category loop above; this just inspects one arbitrary annotation
len(train_annotations_data['annotations'][n]['segmentation']), len(train_annotations_data['annotations'][n]['bbox'])
Out[20]:
(1, 4)
In [11]:
img_no = 7

annIds = train_coco.getAnnIds(imgIds=train_annotations_data['images'][img_no]['id'])
anns = train_coco.loadAnns(annIds)

# load and render the image
plt.imshow(plt.imread(TRAIN_IMAGE_DIRECTIORY+train_annotations_data['images'][img_no]['file_name']))
plt.axis('off')
# Render annotations on top of the image
train_coco.showAnns(anns)
In [22]:
w, h = 15, 15 # Setting width and height of the figure
rows, cols = 5, 5 # Setting the number of image rows & cols

fig = plt.figure(figsize=(w, h)) # Making the figure with size 

plt.title("Images") 
plt.axis('off')

# Going through every cell in rows and cols
for i in range(1, cols * rows + 1):
  annIds = train_coco.getAnnIds(imgIds=img_info['id'][i])
  anns = train_coco.loadAnns(annIds)

  fig.add_subplot(rows, cols, i)

  # Show the image

  img = plt.imread(TRAIN_IMAGE_DIRECTIORY+img_info['file_name'][i]).copy() # copy so we can draw on it
  for ann in anns:
    # COCO bboxes are [x, y, width, height]
    [x, y, bw, bh] = ann['bbox']
    cv2.rectangle(img, (int(x), int(y)), (int(x + bw), int(y + bh)), (255, 0, 0), 2)
  plt.imshow(img)

  # Render annotations on top of the image
  train_coco.showAnns(anns)

  # Setting the axis off
  plt.axis("off")

# Showing the figure
plt.show()