[ Baseline ] Age Prediction

Baseline for Age Prediction: we will use a random forest classifier on image pixels to predict age.

aditya_jha150402

Getting Started with Age Prediction

In this puzzle, we have to predict the age from the given human faces.

This starter kit explains how to download the data and submit directly from this notebook.

In this baseline, we reduce the number of pixels per image, collect them into a list, and fit a random forest classifier model.

Download the files 💾

Download AIcrowd CLI

We will first install aicrowd-cli, which will help you download the data and later make a submission directly from the notebook.

In [1]:
!pip install aicrowd-cli
%load_ext aicrowd.magic
Collecting aicrowd-cli
  Downloading aicrowd_cli-0.1.12-py3-none-any.whl (48 kB)
     |████████████████████████████████| 48 kB 2.5 MB/s 
Collecting pyzmq==22.1.0
  Downloading pyzmq-22.1.0-cp37-cp37m-manylinux1_x86_64.whl (1.1 MB)
     |████████████████████████████████| 1.1 MB 9.7 MB/s 
Collecting GitPython==3.1.18
  Downloading GitPython-3.1.18-py3-none-any.whl (170 kB)
     |████████████████████████████████| 170 kB 64.8 MB/s 
Collecting requests<3,>=2.25.1
  Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 1.5 MB/s 
Requirement already satisfied: click<8,>=7.1.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (7.1.2)
Collecting rich<11,>=10.0.0
  Downloading rich-10.16.2-py3-none-any.whl (214 kB)
     |████████████████████████████████| 214 kB 47.3 MB/s 
Requirement already satisfied: tqdm<5,>=4.56.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (4.62.3)
Collecting requests-toolbelt<1,>=0.9.1
  Downloading requests_toolbelt-0.9.1-py2.py3-none-any.whl (54 kB)
     |████████████████████████████████| 54 kB 2.5 MB/s 
Requirement already satisfied: python-slugify<6,>=5.0.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (5.0.2)
Requirement already satisfied: toml<1,>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.10.2)
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.9-py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 1.5 MB/s 
Requirement already satisfied: typing-extensions>=3.7.4.0 in /usr/local/lib/python3.7/dist-packages (from GitPython==3.1.18->aicrowd-cli) (3.10.0.2)
Collecting smmap<6,>=3.0.1
  Downloading smmap-5.0.0-py3-none-any.whl (24 kB)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.7/dist-packages (from python-slugify<6,>=5.0.0->aicrowd-cli) (1.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.0.11)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2021.10.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.24.3)
Collecting commonmark<0.10.0,>=0.9.0
  Downloading commonmark-0.9.1-py2.py3-none-any.whl (51 kB)
     |████████████████████████████████| 51 kB 6.0 MB/s 
Collecting colorama<0.5.0,>=0.4.0
  Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.6.1)
Installing collected packages: smmap, requests, gitdb, commonmark, colorama, rich, requests-toolbelt, pyzmq, GitPython, aicrowd-cli
  Attempting uninstall: requests
    Found existing installation: requests 2.23.0
    Uninstalling requests-2.23.0:
      Successfully uninstalled requests-2.23.0
  Attempting uninstall: pyzmq
    Found existing installation: pyzmq 22.3.0
    Uninstalling pyzmq-22.3.0:
      Successfully uninstalled pyzmq-22.3.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.27.1 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
Successfully installed GitPython-3.1.18 aicrowd-cli-0.1.12 colorama-0.4.4 commonmark-0.9.1 gitdb-4.0.9 pyzmq-22.1.0 requests-2.27.1 requests-toolbelt-0.9.1 rich-10.16.2 smmap-5.0.0

Login to AIcrowd ㊗

In [2]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/Yp-R9CnuHDOklXbF8VxyuIwc_MMPmr_QrKcK2bUoOAg
API Key valid
Gitlab access token valid
Saved details successfully!

Download Dataset

We will create a folder named data and download the files there.

In [3]:
import os
os.getcwd()
Out[3]:
'/content'
In [4]:
!mkdir data
%aicrowd ds dl -c age-prediction -o data
In [5]:
!unzip data/train.zip -d data/train > /dev/null
!unzip data/val.zip -d data/val > /dev/null
!unzip data/test.zip -d data/test > /dev/null
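
Outside a notebook shell, the same extraction step can be done from Python. The sketch below uses only the standard library; the helper name `extract_archive` is hypothetical, and the demo builds a throwaway archive so it runs without the challenge data.

```python
import os
import tempfile
import zipfile

def extract_archive(zip_path, out_dir):
    """Extract every member of zip_path into out_dir and return the member names."""
    os.makedirs(out_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()

# Self-contained demo with a throwaway archive. In the notebook the call would
# look like extract_archive("data/train.zip", "data/train").
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("93vu1.jpg", b"not a real image")
    names = extract_archive(demo_zip, os.path.join(tmp, "demo"))
    extracted = sorted(os.listdir(os.path.join(tmp, "demo")))
    print(names, extracted)
```

Returning the member names makes it easy to confirm that the number of extracted images matches the number of rows in the corresponding CSV.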

Importing Libraries:

In [6]:
import pandas as pd
import numpy as np
import os
from PIL import Image
from tqdm import tqdm
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

Diving into the dataset 🕵️‍♂️

In [7]:
train_df = pd.read_csv("data/train.csv")
val_df = pd.read_csv("data/val.csv")
test_df = pd.read_csv("data/sample_submission.csv")
In [8]:
print(train_df.head(3))
  ImageID     age
0   93vu1   30-40
1   yjifi   80-90
2   ldd2k  90-100

The number of datapoints in train is 4000, in valid is 2000, and in test is 3000.

In [9]:
print(train_df.shape[0])
print(val_df.shape[0])
print(test_df.shape[0])
4000
2000
3000

The target labels are '0-10' to '90-100'. So there are 10 target labels.

In [10]:
train_df.age.unique()
Out[10]:
array(['30-40', '80-90', '90-100', '40-50', '0-10', '60-70', '70-80',
       '20-30', '50-60', '10-20'], dtype=object)
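
The ten labels are age bins with a natural order, although `unique()` returns them in order of first appearance. As a small sketch, the bins can be sorted by their lower edge and mapped to integer ids. The helper `lower_bound` and the two mapping dictionaries are hypothetical names, and the baseline does not need them because the classifier accepts the string labels directly.

```python
# Age-bin labels as returned by train_df.age.unique() above.
labels = ['30-40', '80-90', '90-100', '40-50', '0-10',
          '60-70', '70-80', '20-30', '50-60', '10-20']

def lower_bound(label):
    """Return the lower edge of an age bin such as '30-40' as an int."""
    return int(label.split('-')[0])

# Sort the bins numerically by their lower edge.
ordered = sorted(labels, key=lower_bound)

# Hypothetical helper mappings between label strings and integer class ids.
label_to_id = {label: i for i, label in enumerate(ordered)}
id_to_label = {i: label for label, i in label_to_id.items()}

print(ordered)
print(label_to_id['30-40'])
```

An ordered mapping like this is mainly useful for plotting the class distribution or for inspecting how far off a wrong prediction is.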
In [11]:
train_df['ImageID'][0]
Out[11]:
'93vu1'

Modeling

We are going to use a very naive approach here.

Each image is converted to grayscale, resized to 190×190, and reduced to a single row of 190 pixel values; these lists of pixels are then fed to a random forest classifier model.

In [12]:
def preprocessor(image_path, dataframe):
  """Add an 'imgData' column holding a 190-value pixel vector per image."""
  imgdatas = []
  # Go through every image listed in the dataframe
  for i in tqdm(range(dataframe.shape[0]), total=len(dataframe)):
    # Read the image named by the ImageID column
    imgdata = Image.open(os.path.join(image_path, dataframe['ImageID'][i] + '.jpg'))

    # Convert to grayscale
    imgdata = imgdata.convert('L')

    # Resize the image to a fixed shape -> 190×190 (you can choose any shape)
    imgdata = imgdata.resize((190, 190))
    imgdata = np.asarray(imgdata)

    # Keep only the row of pixels at index 10 as the feature vector, shape (190,)
    imgdata = np.squeeze(imgdata[10, :])

    imgdatas.append(imgdata)
  # The column is added in place; the same dataframe is also returned
  dataframe['imgData'] = imgdatas
  return dataframe
In [13]:
base_path = 'data'
preprocessor(os.path.join(base_path,'train'), train_df)
preprocessor(os.path.join(base_path,'test'), test_df)
preprocessor(os.path.join(base_path,'val'), val_df)
100%|██████████| 4000/4000 [00:29<00:00, 137.61it/s]
100%|██████████| 3000/3000 [00:21<00:00, 141.23it/s]
100%|██████████| 2000/2000 [00:14<00:00, 135.74it/s]
Out[13]:
ImageID age imgData
0 444vl 40-50 [60, 56, 55, 53, 52, 53, 52, 53, 52, 51, 50, 5...
1 4eg4u 80-90 [77, 77, 76, 76, 75, 74, 75, 76, 74, 74, 73, 7...
2 8pk8y 40-50 [125, 123, 124, 125, 124, 125, 126, 127, 128, ...
3 qow33 90-100 [76, 76, 77, 78, 78, 79, 81, 83, 83, 84, 85, 8...
4 7ittd 20-30 [77, 76, 75, 73, 72, 72, 71, 70, 69, 67, 66, 6...
... ... ... ...
1995 0od5t 0-10 [106, 105, 104, 101, 99, 98, 97, 96, 93, 92, 9...
1996 do352 80-90 [74, 72, 68, 67, 65, 65, 64, 64, 65, 66, 68, 7...
1997 m58bc 90-100 [157, 158, 160, 163, 165, 166, 170, 172, 174, ...
1998 6xxax 50-60 [118, 114, 111, 109, 107, 105, 103, 101, 100, ...
1999 vd7h5 40-50 [126, 126, 127, 128, 127, 127, 126, 127, 125, ...

2000 rows × 3 columns

In [14]:
train_df['imgData'][12].shape
Out[14]:
(190,)
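
The shape `(190,)` follows from the indexing `imgdata[10,:]` in the preprocessor, which keeps one row of the resized image. The sketch below reproduces that on a synthetic array standing in for a 190×190 grayscale image, and contrasts it with flattening every pixel. The flattened variant is an illustrative alternative only and is not what this baseline uses.

```python
import numpy as np

# Synthetic stand-in for a 190x190 grayscale image (values 0-255).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(190, 190), dtype=np.uint8)

# What the preprocessor keeps: the single row at index 10.
row_features = np.squeeze(image[10, :])

# Illustrative alternative (not used in this baseline): flatten every pixel.
flat_features = image.reshape(-1)

print(row_features.shape)   # (190,)
print(flat_features.shape)  # (36100,)
```

Keeping one row makes training fast, at the cost of discarding most of the image.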
In [15]:
train_x = train_df.imgData
train_y = train_df.age
In [16]:
age_predictor = RandomForestClassifier(max_features=0.15, random_state=2)
age_predictor.fit(list(train_x),train_y)
Out[16]:
RandomForestClassifier(max_features=0.15, random_state=2)
In [17]:
print(age_predictor.score(list(train_x),train_y))
1.0
In [18]:
val_x = val_df.imgData
val_y = val_df.age
In [19]:
val_predict = age_predictor.predict(list(val_x))
In [20]:
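# Note: scikit-learn's signature is f1_score(y_true, y_pred); the arguments here are
# in the reverse order, which changes the class weights used by average='weighted'.
print(f1_score(val_predict,val_y,average='weighted'))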
0.21341366743272625
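
To make the metric concrete, here is a plain-Python sketch of a support-weighted F1 score. It is an illustration written for this walkthrough, not scikit-learn's implementation, and the function name `weighted_f1` and the toy labels are assumptions. It also shows that the result depends on which argument is treated as the ground truth, because the per-class weights come from the true labels.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (plain-Python sketch)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        # F1 = 2*TP / (#true + #predicted); taken as 0 when there are no hits.
        f1 = 2 * tp / (n_true + n_pred) if tp else 0.0
        score += (n_true / total) * f1
    return score

# Toy labels, made up for this example.
y_true = ['0-10', '0-10', '0-10', '10-20']
y_pred = ['0-10', '0-10', '10-20', '10-20']

print(round(weighted_f1(y_true, y_pred), 4))  # 0.7667
print(round(weighted_f1(y_pred, y_true), 4))  # 0.7333 (arguments swapped)
```

The per-class F1 values are the same in both calls; only the weights differ, which is why passing the true labels first matters for the weighted average.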

Generating Prediction File

Now that we have created the baseline predictions, let's submit them.

In [21]:
test_x = test_df.imgData
test_predict = age_predictor.predict(list(test_x))
In [22]:
submission = pd.read_csv('data/sample_submission.csv')
In [23]:
submission['age'] = test_predict
In [24]:
!rm -rf assets
!mkdir assets
submission.to_csv(os.path.join("assets", "submission.csv"))

Submitting our Predictions

Note: Please save the notebook before submitting it (Ctrl + S).

In [25]:
%aicrowd notebook submit -c age-prediction -a assets --no-verify
Using notebook: getting-started-notebook-for-age-prediction.ipynb for submission...
Scrubbing API keys from the notebook...
Collecting notebook...


                                                  ╭─────────────────────────╮                                                  
                                                  │ Successfully submitted! │                                                  
                                                  ╰─────────────────────────╯                                                  
                                                        Important links                                                        
┌──────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│  This submission │ https://www.aicrowd.com/challenges/ai-blitz-xiii/problems/age-prediction/submissions/172684              │
│                  │                                                                                                          │
│  All submissions │ https://www.aicrowd.com/challenges/ai-blitz-xiii/problems/age-prediction/submissions?my_submissions=true │
│                  │                                                                                                          │
│      Leaderboard │ https://www.aicrowd.com/challenges/ai-blitz-xiii/problems/age-prediction/leaderboards                    │
│                  │                                                                                                          │
│ Discussion forum │ https://discourse.aicrowd.com/c/ai-blitz-xiii                                                            │
│                  │                                                                                                          │
│   Challenge page │ https://www.aicrowd.com/challenges/ai-blitz-xiii/problems/age-prediction                                 │
└──────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
