Description¶
This solution is based on efficientnet-b3 (net was taken from https://github.com/lukemelas/EfficientNet-PyTorch).
The model was trained in the following order:
Criterion -MSELoss, optimizer - Adam
2 epoch - Learning rate - 0.001
4 epoch - Learning rate - 0.0005
4 epoch - Learning rate - 0.0001
4 epoch - Learning rate - 0.00005
4 epoch - Learning rate - 0.00001
2 epoch - Learning rate - 0.000005
2 epoch - Learning rate - 0.000001
Train took a long time on Colab. Due to limit restriction, model weights were saved on some checkpoint and after that training continued on another account. Final model weights were saved in file.
In [1]:
import torch
workDir='/home/data/'
#imSize=224
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
In [2]:
# this mounts your Google Drive to the Colab VM.
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
%cd '/home'
!mkdir 'data'
%cd '/home/data'
Mounted at /content/drive /home /home/data
Download data¶
In [3]:
!pip install --upgrade fastai
!pip install -U aicrowd-cli
Collecting fastai Downloading https://files.pythonhosted.org/packages/e8/79/e8a87e4c20238e114671314426227db8647d2b42744eab79e0917c59865e/fastai-2.3.1-py3-none-any.whl (194kB) |████████████████████████████████| 204kB 6.8MB/s Requirement already satisfied, skipping upgrade: scikit-learn in /usr/local/lib/python3.7/dist-packages (from fastai) (0.22.2.post1) Requirement already satisfied, skipping upgrade: torchvision>=0.8.2 in /usr/local/lib/python3.7/dist-packages (from fastai) (0.9.1+cu101) Requirement already satisfied, skipping upgrade: pyyaml in /usr/local/lib/python3.7/dist-packages (from fastai) (3.13) Requirement already satisfied, skipping upgrade: fastprogress>=0.2.4 in /usr/local/lib/python3.7/dist-packages (from fastai) (1.0.0) Requirement already satisfied, skipping upgrade: requests in /usr/local/lib/python3.7/dist-packages (from fastai) (2.23.0) Requirement already satisfied, skipping upgrade: spacy<4 in /usr/local/lib/python3.7/dist-packages (from fastai) (2.2.4) Requirement already satisfied, skipping upgrade: scipy in /usr/local/lib/python3.7/dist-packages (from fastai) (1.4.1) Requirement already satisfied, skipping upgrade: torch<1.9,>=1.7.0 in /usr/local/lib/python3.7/dist-packages (from fastai) (1.8.1+cu101) Requirement already satisfied, skipping upgrade: matplotlib in /usr/local/lib/python3.7/dist-packages (from fastai) (3.2.2) Requirement already satisfied, skipping upgrade: pillow>6.0.0 in /usr/local/lib/python3.7/dist-packages (from fastai) (7.1.2) Requirement already satisfied, skipping upgrade: pip in /usr/local/lib/python3.7/dist-packages (from fastai) (19.3.1) Requirement already satisfied, skipping upgrade: packaging in /usr/local/lib/python3.7/dist-packages (from fastai) (20.9) Collecting fastcore<1.4,>=1.3.8 Downloading https://files.pythonhosted.org/packages/d8/b0/f1fbf554e0bf3c76e1bdc3b82eedfe41fcf656479586be38c64421082b1b/fastcore-1.3.20-py3-none-any.whl (53kB) |████████████████████████████████| 61kB 8.7MB/s Requirement already satisfied, skipping upgrade: pandas in /usr/local/lib/python3.7/dist-packages (from fastai) (1.1.5) Requirement already satisfied, skipping upgrade: numpy>=1.11.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn->fastai) (1.19.5) Requirement already satisfied, skipping upgrade: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn->fastai) (1.0.1) Requirement already satisfied, skipping upgrade: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->fastai) (1.24.3) Requirement already satisfied, skipping upgrade: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->fastai) (3.0.4) Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->fastai) (2020.12.5) Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->fastai) (2.10) Requirement already satisfied, skipping upgrade: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (4.41.1) Requirement already satisfied, skipping upgrade: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (0.8.2) Requirement already satisfied, skipping upgrade: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (2.0.5) Requirement already satisfied, skipping upgrade: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (1.0.5) Requirement already satisfied, skipping upgrade: setuptools in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (56.1.0) Requirement already satisfied, skipping upgrade: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (1.0.5) Requirement already satisfied, skipping upgrade: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (3.0.5) Requirement already satisfied, skipping upgrade: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (1.0.0) Requirement already satisfied, skipping upgrade: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (1.1.3) Requirement already satisfied, skipping upgrade: thinc==7.4.0 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (7.4.0) Requirement already satisfied, skipping upgrade: blis<0.5.0,>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from spacy<4->fastai) (0.4.1) Requirement already satisfied, skipping upgrade: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch<1.9,>=1.7.0->fastai) (3.7.4.3) Requirement already satisfied, skipping upgrade: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->fastai) (2.4.7) Requirement already satisfied, skipping upgrade: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->fastai) (2.8.1) Requirement already satisfied, skipping upgrade: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->fastai) (1.3.1) Requirement already satisfied, skipping upgrade: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->fastai) (0.10.0) Requirement already satisfied, skipping upgrade: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas->fastai) (2018.9) Requirement already satisfied, skipping upgrade: importlib-metadata>=0.20; python_version < "3.8" in /usr/local/lib/python3.7/dist-packages (from catalogue<1.1.0,>=0.0.7->spacy<4->fastai) (4.0.1) Requirement already satisfied, skipping upgrade: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->fastai) (1.15.0) Requirement already satisfied, skipping upgrade: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spacy<4->fastai) (3.4.1) Installing collected packages: fastcore, fastai Found existing installation: fastai 1.0.61 Uninstalling fastai-1.0.61: Successfully uninstalled fastai-1.0.61 Successfully installed fastai-2.3.1 fastcore-1.3.20 Collecting aicrowd-cli Downloading https://files.pythonhosted.org/packages/a5/8a/fca67e8c1cb1501a9653cd653232bf6fdebbb2393e3de861aad3636a1136/aicrowd_cli-0.1.6-py3-none-any.whl (51kB) |████████████████████████████████| 61kB 5.0MB/s Collecting requests-toolbelt<1,>=0.9.1 Downloading https://files.pythonhosted.org/packages/60/ef/7681134338fc097acef8d9b2f8abe0458e4d87559c689a8c306d0957ece5/requests_toolbelt-0.9.1-py2.py3-none-any.whl (54kB) |████████████████████████████████| 61kB 10.8MB/s Collecting rich<11,>=10.0.0 Downloading https://files.pythonhosted.org/packages/6b/39/fbe8d15f0b017d63701f2a42e4ccb9a73cd4175e5c56214c1b5685e3dd79/rich-10.2.2-py3-none-any.whl (203kB) |████████████████████████████████| 204kB 52.0MB/s Collecting requests<3,>=2.25.1 Downloading https://files.pythonhosted.org/packages/29/c1/24814557f1d22c56d50280771a17307e6bf87b70727d975fd6b2ce6b014a/requests-2.25.1-py2.py3-none-any.whl (61kB) |████████████████████████████████| 61kB 10.6MB/s Requirement already satisfied, skipping upgrade: toml<1,>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.10.2) Collecting gitpython<4,>=3.1.12 Downloading https://files.pythonhosted.org/packages/27/da/6f6224fdfc47dab57881fe20c0d1bc3122be290198ba0bf26a953a045d92/GitPython-3.1.17-py3-none-any.whl (166kB) |████████████████████████████████| 174kB 57.8MB/s Collecting tqdm<5,>=4.56.0 Downloading https://files.pythonhosted.org/packages/72/8a/34efae5cf9924328a8f34eeb2fdaae14c011462d9f0e3fcded48e1266d1c/tqdm-4.60.0-py2.py3-none-any.whl (75kB) |████████████████████████████████| 81kB 13.0MB/s Collecting click<8,>=7.1.2 Downloading https://files.pythonhosted.org/packages/d2/3d/fa76db83bf75c4f8d338c2fd15c8d33fdd7ad23a9b5e57eb6c5de26b430e/click-7.1.2-py2.py3-none-any.whl (82kB) |████████████████████████████████| 92kB 16.4MB/s Requirement already satisfied, skipping upgrade: typing-extensions<4.0.0,>=3.7.4; python_version < "3.8" in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (3.7.4.3) Collecting colorama<0.5.0,>=0.4.0 Downloading https://files.pythonhosted.org/packages/44/98/5b86278fbbf250d239ae0ecb724f8572af1c91f4a11edf4d36a206189440/colorama-0.4.4-py2.py3-none-any.whl Requirement already satisfied, skipping upgrade: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.6.1) Collecting commonmark<0.10.0,>=0.9.0 Downloading https://files.pythonhosted.org/packages/b1/92/dfd892312d822f36c55366118b95d914e5f16de11044a27cf10a7d71bbbf/commonmark-0.9.1-py2.py3-none-any.whl (51kB) |████████████████████████████████| 51kB 9.2MB/s Requirement already satisfied, skipping upgrade: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.24.3) Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2020.12.5) Requirement already satisfied, skipping upgrade: chardet<5,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (3.0.4) Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.10) Collecting gitdb<5,>=4.0.1 Downloading https://files.pythonhosted.org/packages/ea/e8/f414d1a4f0bbc668ed441f74f44c116d9816833a48bf81d22b697090dba8/gitdb-4.0.7-py3-none-any.whl (63kB) |████████████████████████████████| 71kB 11.9MB/s Collecting smmap<5,>=3.0.1 Downloading https://files.pythonhosted.org/packages/68/ee/d540eb5e5996eb81c26ceffac6ee49041d473bc5125f2aa995cf51ec1cf1/smmap-4.0.0-py2.py3-none-any.whl ERROR: google-colab 1.0.0 has requirement requests~=2.23.0, but you'll have requests 2.25.1 which is incompatible. ERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible. Installing collected packages: requests, requests-toolbelt, colorama, commonmark, rich, smmap, gitdb, gitpython, tqdm, click, aicrowd-cli Found existing installation: requests 2.23.0 Uninstalling requests-2.23.0: Successfully uninstalled requests-2.23.0 Found existing installation: tqdm 4.41.1 Uninstalling tqdm-4.41.1: Successfully uninstalled tqdm-4.41.1 Found existing installation: click 8.0.0 Uninstalling click-8.0.0: Successfully uninstalled click-8.0.0 Successfully installed aicrowd-cli-0.1.6 click-7.1.2 colorama-0.4.4 commonmark-0.9.1 gitdb-4.0.7 gitpython-3.1.17 requests-2.25.1 requests-toolbelt-0.9.1 rich-10.2.2 smmap-4.0.0 tqdm-4.60.0
In [4]:
API_KEY = '52ab6eb031245b7028158e2f3e993174' #Please enter your API Key from [https://www.aicrowd.com/participants/me]
!aicrowd login --api-key $API_KEY
API Key valid Saved API Key successfully!
In [5]:
!aicrowd dataset download --challenge f1-speed-recognition
sample_submission.csv: 100% 97.8k/97.8k [00:00<00:00, 1.59MB/s] train.csv: 100% 407k/407k [00:00<00:00, 3.67MB/s] train.zip: 0% 0.00/385M [00:00<?, ?B/s] train.zip: 78% 302M/385M [00:03<00:00, 85.5MB/s] train.zip: 100% 385M/385M [00:04<00:00, 83.2MB/s] val.csv: 100% 36.7k/36.7k [00:00<00:00, 1.46MB/s] val.zip: 0% 0.00/37.8M [00:00<?, ?B/s] test.zip: 100% 96.9M/96.9M [00:07<00:00, 13.5MB/s] val.zip: 100% 37.8M/37.8M [00:03<00:00, 11.8MB/s]
In [6]:
!rm -rf data
!mkdir data
!unzip -q train.zip -d data/train
!unzip -q val.zip -d data/val
!unzip -q test.zip -d data/test
!mv train.csv data/train.csv
!mv val.csv data/val.csv
!mv sample_submission.csv data/sample_submission.csv
Prepare data¶
In [7]:
# Custom dataset class
import torch
from torch.utils.data import Dataset,DataLoader,RandomSampler
from torchvision import transforms as T
import pandas as pd
from PIL import Image
class ImageDataset(Dataset):
def __init__(self,ImageFold,df,transforms):
self.ImageFold=ImageFold
self.df=df
self.trans=transforms
def __len__(self):
return len(self.df)
def __getitem__(self,ind):
im=self.load_image(self.df.iloc[ind][0])
speed=self.df.iloc[ind][1]
im=self.trans(im)
return im, speed
def load_image(self,ind):
return Image.open(self.ImageFold+str(self.df.iloc[ind][0])+'.jpg')
In [8]:
trainTrans=T.Compose([
# T.Resize(imSize),
# transforms.RandomHorizontalFlip(),
T.ToTensor(),
T.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
])
df_train=pd.read_csv('data/train.csv')
ds_train=ImageDataset(workDir+'data/train/',df_train,trainTrans)
dl_train=DataLoader(ds_train,batch_size=32,shuffle=True,num_workers=4)
df_val=pd.read_csv('data/val.csv')
ds_val=ImageDataset(workDir+'data/val/',df_val,trainTrans)
dl_val=DataLoader(ds_val,batch_size=32,shuffle=True,num_workers=4)
dataloaders_dict={'train':dl_train,'val':dl_val}
Training loop¶
In [9]:
def train_model(model, dataloaders, criterion, optimizer, num_epochs=25):
since = time.time()
val_acc_history = []
best_loss = 10e30
for epoch in range(num_epochs):
print('Epoch {}/{}'.format(epoch, num_epochs - 1))
print('-' * 10)
# Each epoch has a training and validation phase
for phase in ['train', 'val']:
if phase == 'train':
model.train() # Set model to training mode
else:
model.eval() # Set model to evaluate mode
running_loss = 0.0
running_corrects = 0
i=0
# Iterate over data.
for inputs, labels in dataloaders[phase]:
inputs = inputs.to(device)
labels = labels.to(device)
# zero the parameter gradients
optimizer.zero_grad()
# forward
# track history if only in train
with torch.set_grad_enabled(phase == 'train'):
i+=128
outputs = model(inputs)
loss = criterion(outputs.float(), torch.reshape(labels,(len(labels),1)).float())
if(i % 8192 ==0):
print(loss)
# backward + optimize only if in training phase
if phase == 'train':
loss.backward()
optimizer.step()
running_loss += loss.detach().item()*len(labels)
epoch_loss = running_loss / (len(dataloaders[phase].dataset))
if phase == 'val' and epoch_loss < best_loss:
best_loss = epoch_loss
best_model_wts = copy.deepcopy(model.state_dict())
if phase == 'val':
val_acc_history.append(epoch_loss)
# statistics
print('{} Loss: {:.4f} '.format(phase, epoch_loss))
print()
time_elapsed = time.time() - since
print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
# load best model weights
model.load_state_dict(best_model_wts)
return model, val_acc_history
EfficientNet training¶
In [10]:
import random
import torchvision.models as models
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy
In [11]:
!pip install efficientnet_pytorch
Collecting efficientnet_pytorch Downloading https://files.pythonhosted.org/packages/2e/a0/dd40b50aebf0028054b6b35062948da01123d7be38d08b6b1e5435df6363/efficientnet_pytorch-0.7.1.tar.gz Requirement already satisfied: torch in /usr/local/lib/python3.7/dist-packages (from efficientnet_pytorch) (1.8.1+cu101) Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch->efficientnet_pytorch) (3.7.4.3) Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torch->efficientnet_pytorch) (1.19.5) Building wheels for collected packages: efficientnet-pytorch Building wheel for efficientnet-pytorch (setup.py) ... done Created wheel for efficientnet-pytorch: filename=efficientnet_pytorch-0.7.1-cp37-none-any.whl size=16443 sha256=b39755b3dc041d8a60548dd1479d328053a7914bca3a8eed7d08b1e9ae9dc035 Stored in directory: /root/.cache/pip/wheels/84/27/aa/c46d23c4e8cc72d41283862b1437e0b3ad318417e8ed7d5921 Successfully built efficientnet-pytorch Installing collected packages: efficientnet-pytorch Successfully installed efficientnet-pytorch-0.7.1
In [12]:
from efficientnet_pytorch import EfficientNet
model = EfficientNet.from_pretrained('efficientnet-b5',num_classes = 1)
model.load_state_dict(torch.load('/content/drive/MyDrive/weights_ef5_net_adam4.txt'))
model.to(device)
Downloading: "https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b5-b6417697.pth" to /root/.cache/torch/hub/checkpoints/efficientnet-b5-b6417697.pth
Loaded pretrained weights for efficientnet-b5
Out[12]:
EfficientNet( (_conv_stem): Conv2dStaticSamePadding( 3, 48, kernel_size=(3, 3), stride=(2, 2), bias=False (static_padding): ZeroPad2d(padding=(0, 1, 0, 1), value=0.0) ) (_bn0): BatchNorm2d(48, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_blocks): ModuleList( (0): MBConvBlock( (_depthwise_conv): Conv2dStaticSamePadding( 48, 48, kernel_size=(3, 3), stride=[1, 1], groups=48, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(48, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 48, 12, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 12, 48, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 48, 24, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(24, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (1): MBConvBlock( (_depthwise_conv): Conv2dStaticSamePadding( 24, 24, kernel_size=(3, 3), stride=(1, 1), groups=24, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(24, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 24, 6, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 6, 24, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 24, 24, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(24, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (2): MBConvBlock( (_depthwise_conv): Conv2dStaticSamePadding( 24, 24, kernel_size=(3, 3), stride=(1, 1), groups=24, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(24, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 24, 6, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 6, 24, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 24, 24, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(24, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (3): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(144, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 144, 144, kernel_size=(3, 3), stride=[2, 2], groups=144, bias=False (static_padding): ZeroPad2d(padding=(0, 1, 0, 1), value=0.0) ) (_bn1): BatchNorm2d(144, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 144, 6, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 6, 144, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 144, 40, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(40, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (4): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 240, 240, kernel_size=(3, 3), stride=(1, 1), groups=240, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 240, 10, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 10, 240, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(40, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (5): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 240, 240, kernel_size=(3, 3), stride=(1, 1), groups=240, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 240, 10, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 10, 240, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(40, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (6): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 240, 240, kernel_size=(3, 3), stride=(1, 1), groups=240, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 240, 10, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 10, 240, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(40, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (7): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 240, 240, kernel_size=(3, 3), stride=(1, 1), groups=240, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 240, 10, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 10, 240, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(40, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (8): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 240, 240, kernel_size=(5, 5), stride=[2, 2], groups=240, bias=False (static_padding): ZeroPad2d(padding=(1, 2, 1, 2), value=0.0) ) (_bn1): BatchNorm2d(240, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 240, 10, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 10, 240, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 240, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(64, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (9): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 384, 384, kernel_size=(5, 5), stride=(1, 1), groups=384, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 384, 16, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 16, 384, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(64, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (10): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 384, 384, kernel_size=(5, 5), stride=(1, 1), groups=384, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 384, 16, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 16, 384, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(64, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (11): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 384, 384, kernel_size=(5, 5), stride=(1, 1), groups=384, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 384, 16, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 16, 384, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(64, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (12): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 384, 384, kernel_size=(5, 5), stride=(1, 1), groups=384, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 384, 16, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 16, 384, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(64, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (13): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 384, 384, kernel_size=(3, 3), stride=[2, 2], groups=384, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(384, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 384, 16, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 16, 384, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (14): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(3, 3), stride=(1, 1), groups=768, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (15): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(3, 3), stride=(1, 1), groups=768, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (16): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(3, 3), stride=(1, 1), groups=768, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (17): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(3, 3), stride=(1, 1), groups=768, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (18): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(3, 3), stride=(1, 1), groups=768, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (19): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(3, 3), stride=(1, 1), groups=768, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(128, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (20): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 128, 768, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 768, 768, kernel_size=(5, 5), stride=[1, 1], groups=768, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(768, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 768, 32, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 32, 768, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 768, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (21): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=(1, 1), groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (22): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=(1, 1), groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (23): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=(1, 1), groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (24): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=(1, 1), groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (25): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=(1, 1), groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (26): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=(1, 1), groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 176, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(176, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (27): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 176, 1056, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1056, 1056, kernel_size=(5, 5), stride=[2, 2], groups=1056, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1056, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1056, 44, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 44, 1056, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1056, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (28): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (29): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (30): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (31): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (32): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (33): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (34): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (35): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(5, 5), stride=(1, 1), groups=1824, bias=False (static_padding): ZeroPad2d(padding=(2, 2, 2, 2), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 304, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(304, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (36): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 304, 1824, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 1824, 1824, kernel_size=(3, 3), stride=[1, 1], groups=1824, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(1824, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 1824, 76, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 76, 1824, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 1824, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(512, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (37): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 512, 3072, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(3072, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 3072, 3072, kernel_size=(3, 3), stride=(1, 1), groups=3072, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(3072, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 3072, 128, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 128, 3072, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 3072, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(512, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) (38): MBConvBlock( (_expand_conv): Conv2dStaticSamePadding( 512, 3072, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn0): BatchNorm2d(3072, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_depthwise_conv): Conv2dStaticSamePadding( 3072, 3072, kernel_size=(3, 3), stride=(1, 1), groups=3072, bias=False (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0) ) (_bn1): BatchNorm2d(3072, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_se_reduce): Conv2dStaticSamePadding( 3072, 128, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_se_expand): Conv2dStaticSamePadding( 128, 3072, kernel_size=(1, 1), stride=(1, 1) (static_padding): Identity() ) (_project_conv): Conv2dStaticSamePadding( 3072, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn2): BatchNorm2d(512, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_swish): MemoryEfficientSwish() ) ) (_conv_head): Conv2dStaticSamePadding( 512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False (static_padding): Identity() ) (_bn1): BatchNorm2d(2048, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True) (_avg_pooling): AdaptiveAvgPool2d(output_size=1) (_dropout): Dropout(p=0.4, inplace=False) (_fc): Linear(in_features=2048, out_features=1, bias=True) (_swish): MemoryEfficientSwish() )
In [14]:
criterion = nn.MSELoss(reduction='mean')
In [ ]:
num_epochs=3
optimizer =torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss(reduction='mean')
# Train and evaluate
model, hist = train_model(model, dataloaders_dict, criterion, optimizer, num_epochs=num_epochs)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef_net_adam1.txt')
Epoch 0/2 ---------- tensor(1126039.3750, device='cuda:0', grad_fn=<MseLossBackward>) tensor(928345., device='cuda:0', grad_fn=<MseLossBackward>) tensor(936889.3125, device='cuda:0', grad_fn=<MseLossBackward>) tensor(455660.5625, device='cuda:0', grad_fn=<MseLossBackward>) tensor(496136.3750, device='cuda:0', grad_fn=<MseLossBackward>) tensor(389292.8125, device='cuda:0', grad_fn=<MseLossBackward>) tensor(272891.2500, device='cuda:0', grad_fn=<MseLossBackward>) tensor(289251.8750, device='cuda:0', grad_fn=<MseLossBackward>) tensor(73087.3594, device='cuda:0', grad_fn=<MseLossBackward>) tensor(137523.5156, device='cuda:0', grad_fn=<MseLossBackward>) tensor(46132.3516, device='cuda:0', grad_fn=<MseLossBackward>) tensor(38502.5234, device='cuda:0', grad_fn=<MseLossBackward>) tensor(33976.2461, device='cuda:0', grad_fn=<MseLossBackward>) tensor(43707.4648, device='cuda:0', grad_fn=<MseLossBackward>) tensor(21662.8145, device='cuda:0', grad_fn=<MseLossBackward>) tensor(13052.1143, device='cuda:0', grad_fn=<MseLossBackward>) tensor(29430.6094, device='cuda:0', grad_fn=<MseLossBackward>) tensor(45079.5547, device='cuda:0', grad_fn=<MseLossBackward>) tensor(22912.9082, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 365233.9742 tensor(26557.7129, device='cuda:0') val Loss: 24195.2131 Epoch 1/2 ---------- tensor(15004.9551, device='cuda:0', grad_fn=<MseLossBackward>) tensor(41623.3789, device='cuda:0', grad_fn=<MseLossBackward>) tensor(14875.4102, device='cuda:0', grad_fn=<MseLossBackward>) tensor(10762.4434, device='cuda:0', grad_fn=<MseLossBackward>) tensor(12154.8115, device='cuda:0', grad_fn=<MseLossBackward>) tensor(12880.5088, device='cuda:0', grad_fn=<MseLossBackward>) tensor(23775.2578, device='cuda:0', grad_fn=<MseLossBackward>) tensor(18208.5820, device='cuda:0', grad_fn=<MseLossBackward>) tensor(10914.1836, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7584.8901, device='cuda:0', grad_fn=<MseLossBackward>) tensor(10787.8301, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5571.7715, device='cuda:0', grad_fn=<MseLossBackward>) tensor(8155.9551, device='cuda:0', grad_fn=<MseLossBackward>) tensor(42047.5703, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7673.6113, device='cuda:0', grad_fn=<MseLossBackward>) tensor(8544.3506, device='cuda:0', grad_fn=<MseLossBackward>) tensor(8655.7227, device='cuda:0', grad_fn=<MseLossBackward>) tensor(12288.6465, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4577.2969, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 17208.3828 tensor(6606.2314, device='cuda:0') val Loss: 7722.5844 Epoch 2/2 ---------- tensor(6238.8237, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7131.3369, device='cuda:0', grad_fn=<MseLossBackward>) tensor(10357.3066, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3803.6189, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3876.5813, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4091.0125, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5196.7378, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5517.7437, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4542.3687, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4858.5386, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4254.7061, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2551.3186, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6101.1807, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9434.6543, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6069.4492, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9532.6426, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9280.9502, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7547.0254, device='cuda:0', grad_fn=<MseLossBackward>) tensor(17375.4629, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 9178.6932 tensor(20476.8008, device='cuda:0') val Loss: 20709.3446 Training complete in 23m 29s
In [ ]:
num_epochs=3
optimizer =torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss(reduction='mean')
# Train and evaluate
model, hist = train_model(model, dataloaders_dict, criterion, optimizer, num_epochs=num_epochs)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef_net_adam1.txt')
Epoch 0/2 ---------- tensor(10052.4648, device='cuda:0', grad_fn=<MseLossBackward>) tensor(12248.3545, device='cuda:0', grad_fn=<MseLossBackward>) tensor(11553.3662, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9536.5107, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6805.2119, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3946.8884, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4307.1206, device='cuda:0', grad_fn=<MseLossBackward>) tensor(11555.7344, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6081.4370, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9544.8711, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6038.4731, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7909.7773, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9623.5273, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7681.7686, device='cuda:0', grad_fn=<MseLossBackward>) tensor(13583.6074, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6766.0176, device='cuda:0', grad_fn=<MseLossBackward>) tensor(44085.5547, device='cuda:0', grad_fn=<MseLossBackward>) tensor(21406.6367, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6607.8628, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 8830.4196 tensor(4242.4668, device='cuda:0') val Loss: 7569.7639 Epoch 1/2 ---------- tensor(2358.8093, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4619.2939, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5366.6167, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6242.6553, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6365.0977, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7157.5288, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4030.6250, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4272.6670, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7887.9902, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5049.1982, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3701.1465, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9798.5518, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4727.8506, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2130.8606, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6373.6592, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6868.5479, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6772.4072, device='cuda:0', grad_fn=<MseLossBackward>) tensor(52808.3047, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3554.6646, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 6916.7339 tensor(2898.0017, device='cuda:0') val Loss: 5475.4497 Epoch 2/2 ---------- tensor(5073.0679, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2010.2195, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2702.3062, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1689.2019, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1967.8163, device='cuda:0', grad_fn=<MseLossBackward>) tensor(15607.2031, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2933.4807, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2899.6685, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1216.6008, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3636.4316, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1919.4458, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4103.7808, device='cuda:0', grad_fn=<MseLossBackward>) tensor(10727.5586, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3671.7727, device='cuda:0', grad_fn=<MseLossBackward>) tensor(54774.6758, device='cuda:0', grad_fn=<MseLossBackward>) tensor(17357.9180, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3085.7314, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5277.2822, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2220.0400, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 4921.7047 tensor(4845.0122, device='cuda:0') val Loss: 5700.8937 Training complete in 23m 31s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.001), num_epochs=3)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef_net_adam.txt')
Epoch 0/2 ---------- tensor(3756.8237, device='cuda:0', grad_fn=<MseLossBackward>) tensor(29475.7520, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5094.0674, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6295.3311, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5429.8594, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4291.2773, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1631.7502, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3508.0024, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4047.3027, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2114.8535, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3424.5869, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3621.0417, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1410.2703, device='cuda:0', grad_fn=<MseLossBackward>) tensor(8485.8174, device='cuda:0', grad_fn=<MseLossBackward>) tensor(12587.5020, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1320.9062, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1703.9264, device='cuda:0', grad_fn=<MseLossBackward>) tensor(11241.8613, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1490.2014, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 4374.1856 tensor(1973.0693, device='cuda:0') val Loss: 4152.1512 Epoch 1/2 ---------- tensor(2710.7930, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1165.8480, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2882.6641, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7165.0986, device='cuda:0', grad_fn=<MseLossBackward>) tensor(19554.3711, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3448.8792, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6311.6880, device='cuda:0', grad_fn=<MseLossBackward>) tensor(11342.7393, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2990.7991, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4124.4336, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1306.5951, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3355.0696, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9820.8252, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4014.2439, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1952.2258, device='cuda:0', grad_fn=<MseLossBackward>) tensor(21104.1738, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1686.4866, device='cuda:0', grad_fn=<MseLossBackward>) tensor(8742.1846, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1790.6304, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 5313.5044 tensor(1568.4878, device='cuda:0') val Loss: 4118.8562 Epoch 2/2 ---------- tensor(5556.8369, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2046.1384, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3694.8086, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2693.8354, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2635.4187, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2626.2944, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3698.8608, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2047.4744, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1072.7540, device='cuda:0', grad_fn=<MseLossBackward>) tensor(31679.4277, device='cuda:0', grad_fn=<MseLossBackward>) tensor(19148.7070, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2523.5242, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1582.9904, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3232.7832, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1898.1096, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1492.3350, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1769.9034, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1758.5740, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3277.4487, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 3968.3156 tensor(1649.3699, device='cuda:0') val Loss: 3766.6464 Training complete in 23m 37s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.001), num_epochs=3)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef_net_adam.txt')
Epoch 0/2 ---------- tensor(2893.8232, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2540.3550, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2793.8267, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1273.9226, device='cuda:0', grad_fn=<MseLossBackward>) tensor(10621.8945, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2795.8535, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2583.6465, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2635.6509, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2483.7656, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4025.1831, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3121.3335, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5674.1597, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1881.5269, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1358.8624, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4545.9023, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2109.8594, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4350.8042, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4117.4824, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3638.9590, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 3465.6325 tensor(2120.0793, device='cuda:0') val Loss: 3390.0472 Epoch 1/2 ---------- tensor(2549.4038, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5393.9834, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1171.4164, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4540.5283, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2510.9795, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2191.5342, device='cuda:0', grad_fn=<MseLossBackward>) tensor(15916.8555, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1048.5153, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4479.8350, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3556.8018, device='cuda:0', grad_fn=<MseLossBackward>) tensor(8044.9575, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9865.1035, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3819.1584, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2176.3096, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4184.0244, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1476.4612, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3109.7568, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1412.5081, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7317.6582, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 5286.6663 tensor(2603.8047, device='cuda:0') val Loss: 5209.3836 Epoch 2/2 ---------- tensor(1135.2831, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2307.5278, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1328.0654, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5655.0630, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1303.7634, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1793.4749, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1842.1323, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1715.3237, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3047.8455, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4362.8013, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3897.4045, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2591.6873, device='cuda:0', grad_fn=<MseLossBackward>) tensor(14122.9609, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3938.4087, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4256.3164, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1841.2444, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2723.5317, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2049.1267, device='cuda:0', grad_fn=<MseLossBackward>) tensor(816.8058, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 3027.3775 tensor(2152.5381, device='cuda:0') val Loss: 4085.4634 Training complete in 23m 32s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.001), num_epochs=3)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef_net_adam.txt')
Epoch 0/2 ---------- tensor(944.5835, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2956.5759, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2518.6460, device='cuda:0', grad_fn=<MseLossBackward>) tensor(921.9926, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1383.4583, device='cuda:0', grad_fn=<MseLossBackward>) tensor(851.2384, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3022.0774, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1605.8302, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4411.5728, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2044.7336, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2150.5935, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3926.5879, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3243.9272, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3387.2542, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1822.9192, device='cuda:0', grad_fn=<MseLossBackward>) tensor(15316.8076, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2138.0022, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2559.8442, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3333.2493, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 3474.8711 tensor(820.5558, device='cuda:0') val Loss: 3297.7085 Epoch 1/2 ---------- tensor(1170.4021, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1383.9055, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1695.1956, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1008.1459, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3735.6287, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1142.3586, device='cuda:0', grad_fn=<MseLossBackward>) tensor(849.7682, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1390.8604, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1392.0842, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1132.6704, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4275.1284, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4095.9348, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2161.0684, device='cuda:0', grad_fn=<MseLossBackward>) tensor(7992.6460, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4371.9341, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3258.5166, device='cuda:0', grad_fn=<MseLossBackward>) tensor(933.4016, device='cuda:0', grad_fn=<MseLossBackward>) tensor(17391.5996, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1306.5269, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 2976.4617 tensor(3973.2588, device='cuda:0') val Loss: 3105.4816 Epoch 2/2 ---------- tensor(7788.2256, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5264.4912, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1637.3533, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2236.3511, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1994.0592, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1547.5469, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1526.9318, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1101.6261, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2773.1975, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1560.7758, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1127.9885, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4598.1025, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1120.0510, device='cuda:0', grad_fn=<MseLossBackward>) tensor(998.5631, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2984.2393, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1079.8763, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1140.3999, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1013.5466, device='cuda:0', grad_fn=<MseLossBackward>) tensor(756.6921, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 2053.4642 tensor(5339.7554, device='cuda:0') val Loss: 2380.7804 Training complete in 23m 32s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.001), num_epochs=4)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef_net_adam__.txt')
Epoch 0/3 ---------- tensor(1369.0421, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2469.8008, device='cuda:0', grad_fn=<MseLossBackward>) tensor(816.6846, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1022.6912, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1558.9275, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1719.9197, device='cuda:0', grad_fn=<MseLossBackward>) tensor(870.5128, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1680.1464, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4071.4688, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1899.3228, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6213.9316, device='cuda:0', grad_fn=<MseLossBackward>) tensor(841.3873, device='cuda:0', grad_fn=<MseLossBackward>) tensor(526.0975, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1590.5012, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1461.2119, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2190.3359, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1171.3905, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1528.1819, device='cuda:0', grad_fn=<MseLossBackward>) tensor(901.5940, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 1826.5465 tensor(983.1125, device='cuda:0') val Loss: 3399.9513 Epoch 1/3 ---------- tensor(1351.8992, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1063.3196, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2854.6934, device='cuda:0', grad_fn=<MseLossBackward>) tensor(795.0276, device='cuda:0', grad_fn=<MseLossBackward>) tensor(693.8467, device='cuda:0', grad_fn=<MseLossBackward>) tensor(600.2297, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1637.3705, device='cuda:0', grad_fn=<MseLossBackward>) tensor(918.0674, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1069.8854, device='cuda:0', grad_fn=<MseLossBackward>) tensor(913.1107, device='cuda:0', grad_fn=<MseLossBackward>) tensor(869.0249, device='cuda:0', grad_fn=<MseLossBackward>) tensor(762.0253, device='cuda:0', grad_fn=<MseLossBackward>) tensor(916.9150, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2927.7974, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2661.6038, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6437.0967, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1943.6055, device='cuda:0', grad_fn=<MseLossBackward>) tensor(581.9954, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3897.4131, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 2101.2526 tensor(1901.5352, device='cuda:0') val Loss: 5335.1697 Epoch 2/3 ---------- tensor(939.7086, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2388.7153, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4081.3081, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6266.9683, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2360.7947, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2506.6802, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1271.2316, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2022.6572, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1720.9237, device='cuda:0', grad_fn=<MseLossBackward>) tensor(826.6438, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3881.1245, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2798.7280, device='cuda:0', grad_fn=<MseLossBackward>) tensor(6873.8945, device='cuda:0', grad_fn=<MseLossBackward>) tensor(5100.8789, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3835.1125, device='cuda:0', grad_fn=<MseLossBackward>) tensor(9558.2217, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2525.8452, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2172.6116, device='cuda:0', grad_fn=<MseLossBackward>) tensor(957.8455, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 4171.2602 tensor(547.7123, device='cuda:0') val Loss: 3448.3992 Epoch 3/3 ---------- tensor(539.2725, device='cuda:0', grad_fn=<MseLossBackward>) tensor(639.7316, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1056.5583, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1610.9591, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1478.3091, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3785.8281, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1036.3065, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1049.7246, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1747.8057, device='cuda:0', grad_fn=<MseLossBackward>) tensor(869.0558, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2868.8157, device='cuda:0', grad_fn=<MseLossBackward>) tensor(764.2543, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1374.8667, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1191.6068, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1583.5319, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1037.2563, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1493.5741, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1078.2046, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2143.4243, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 1914.9701 tensor(6556.0278, device='cuda:0') val Loss: 4827.0036 Training complete in 31m 26s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.0003), num_epochs=8)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef5_net_adam2.txt')
Epoch 0/7 ---------- tensor(1021.0389, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1839.2002, device='cuda:0', grad_fn=<MseLossBackward>) tensor(974.2304, device='cuda:0', grad_fn=<MseLossBackward>) tensor(787.0370, device='cuda:0', grad_fn=<MseLossBackward>) tensor(957.6029, device='cuda:0', grad_fn=<MseLossBackward>) tensor(837.1511, device='cuda:0', grad_fn=<MseLossBackward>) tensor(605.2919, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1114.9446, device='cuda:0', grad_fn=<MseLossBackward>) tensor(810.6183, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1328.9108, device='cuda:0', grad_fn=<MseLossBackward>) tensor(821.8687, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1304.4150, device='cuda:0', grad_fn=<MseLossBackward>) tensor(844.1669, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1700.6825, device='cuda:0', grad_fn=<MseLossBackward>) tensor(775.7939, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1122.0782, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1049.3840, device='cuda:0', grad_fn=<MseLossBackward>) tensor(672.8219, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1247.9619, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 1217.7674 tensor(616.5368, device='cuda:0') val Loss: 1703.4802 Epoch 1/7 ---------- tensor(2461.0488, device='cuda:0', grad_fn=<MseLossBackward>) tensor(884.1342, device='cuda:0', grad_fn=<MseLossBackward>) tensor(827.2833, device='cuda:0', grad_fn=<MseLossBackward>) tensor(857.9091, device='cuda:0', grad_fn=<MseLossBackward>) tensor(598.8275, device='cuda:0', grad_fn=<MseLossBackward>) tensor(575.8914, device='cuda:0', grad_fn=<MseLossBackward>) tensor(872.0822, device='cuda:0', grad_fn=<MseLossBackward>) tensor(755.2753, device='cuda:0', grad_fn=<MseLossBackward>) tensor(476.9681, device='cuda:0', grad_fn=<MseLossBackward>) tensor(529.0455, device='cuda:0', grad_fn=<MseLossBackward>) tensor(706.0496, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1171.6233, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1085.1633, device='cuda:0', grad_fn=<MseLossBackward>) tensor(730.7100, device='cuda:0', grad_fn=<MseLossBackward>) tensor(995.0589, device='cuda:0', grad_fn=<MseLossBackward>) tensor(700.0463, device='cuda:0', grad_fn=<MseLossBackward>) tensor(593.3480, device='cuda:0', grad_fn=<MseLossBackward>) tensor(984.2280, device='cuda:0', grad_fn=<MseLossBackward>) tensor(670.2996, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 988.1423 tensor(4206.1934, device='cuda:0') val Loss: 1703.7215 Epoch 2/7 ---------- tensor(6867.0732, device='cuda:0', grad_fn=<MseLossBackward>) tensor(586.4392, device='cuda:0', grad_fn=<MseLossBackward>) tensor(667.1331, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1082.2075, device='cuda:0', grad_fn=<MseLossBackward>) tensor(400.5692, device='cuda:0', grad_fn=<MseLossBackward>) tensor(792.4590, device='cuda:0', grad_fn=<MseLossBackward>) tensor(549.9552, device='cuda:0', grad_fn=<MseLossBackward>) tensor(612.0175, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1472.9210, device='cuda:0', grad_fn=<MseLossBackward>) tensor(661.0810, device='cuda:0', grad_fn=<MseLossBackward>) tensor(434.9098, device='cuda:0', grad_fn=<MseLossBackward>) tensor(895.2047, device='cuda:0', grad_fn=<MseLossBackward>) tensor(484.2960, device='cuda:0', grad_fn=<MseLossBackward>) tensor(641.8995, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1248.6710, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3352.8350, device='cuda:0', grad_fn=<MseLossBackward>) tensor(382.3703, device='cuda:0', grad_fn=<MseLossBackward>) tensor(632.3837, device='cuda:0', grad_fn=<MseLossBackward>) tensor(771.4795, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 968.4276 tensor(2091.2371, device='cuda:0') val Loss: 1392.3256 Epoch 3/7 ---------- tensor(814.3128, device='cuda:0', grad_fn=<MseLossBackward>) tensor(517.4740, device='cuda:0', grad_fn=<MseLossBackward>) tensor(689.7659, device='cuda:0', grad_fn=<MseLossBackward>) tensor(642.2292, device='cuda:0', grad_fn=<MseLossBackward>) tensor(946.5966, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1230.7294, device='cuda:0', grad_fn=<MseLossBackward>) tensor(477.6659, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1273.8792, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1346.3496, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1803.0386, device='cuda:0', grad_fn=<MseLossBackward>) tensor(783.2508, device='cuda:0', grad_fn=<MseLossBackward>) tensor(915.9884, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1331.4509, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1211.1556, device='cuda:0', grad_fn=<MseLossBackward>) tensor(756.9850, device='cuda:0', grad_fn=<MseLossBackward>) tensor(858.8146, device='cuda:0', grad_fn=<MseLossBackward>) tensor(959.1106, device='cuda:0', grad_fn=<MseLossBackward>) tensor(886.3307, device='cuda:0', grad_fn=<MseLossBackward>) tensor(458.6521, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 892.6956 tensor(2397.1587, device='cuda:0') val Loss: 1431.3020 Epoch 4/7 ---------- tensor(361.2756, device='cuda:0', grad_fn=<MseLossBackward>) tensor(799.1633, device='cuda:0', grad_fn=<MseLossBackward>) tensor(439.0979, device='cuda:0', grad_fn=<MseLossBackward>) tensor(974.5688, device='cuda:0', grad_fn=<MseLossBackward>) tensor(749.9719, device='cuda:0', grad_fn=<MseLossBackward>) tensor(741.0807, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1098.5482, device='cuda:0', grad_fn=<MseLossBackward>) tensor(353.4684, device='cuda:0', grad_fn=<MseLossBackward>) tensor(758.0774, device='cuda:0', grad_fn=<MseLossBackward>) tensor(522.8766, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1405.2791, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1829.0254, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1301.3997, device='cuda:0', grad_fn=<MseLossBackward>) tensor(514.5048, device='cuda:0', grad_fn=<MseLossBackward>) tensor(904.0919, device='cuda:0', grad_fn=<MseLossBackward>) tensor(871.9928, device='cuda:0', grad_fn=<MseLossBackward>) tensor(490.3075, device='cuda:0', grad_fn=<MseLossBackward>) tensor(804.5316, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1133.4846, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 889.0178 tensor(194.1868, device='cuda:0') val Loss: 1736.2422 Epoch 5/7 ---------- tensor(481.0308, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1283.6229, device='cuda:0', grad_fn=<MseLossBackward>) tensor(677.6353, device='cuda:0', grad_fn=<MseLossBackward>) tensor(760.7186, device='cuda:0', grad_fn=<MseLossBackward>) tensor(779.5964, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1367.1487, device='cuda:0', grad_fn=<MseLossBackward>) tensor(938.6368, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1352.8971, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1010.0908, device='cuda:0', grad_fn=<MseLossBackward>) tensor(783.0493, device='cuda:0', grad_fn=<MseLossBackward>) tensor(824.5904, device='cuda:0', grad_fn=<MseLossBackward>) tensor(564.6649, device='cuda:0', grad_fn=<MseLossBackward>) tensor(595.4376, device='cuda:0', grad_fn=<MseLossBackward>) tensor(961.6151, device='cuda:0', grad_fn=<MseLossBackward>) tensor(687.7929, device='cuda:0', grad_fn=<MseLossBackward>) tensor(565.9882, device='cuda:0', grad_fn=<MseLossBackward>) tensor(611.3042, device='cuda:0', grad_fn=<MseLossBackward>) tensor(780.5587, device='cuda:0', grad_fn=<MseLossBackward>) tensor(724.2666, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 854.6765 tensor(28488.0293, device='cuda:0') val Loss: 1467.2215 Epoch 6/7 ---------- tensor(577.1844, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1276.5446, device='cuda:0', grad_fn=<MseLossBackward>) tensor(485.7233, device='cuda:0', grad_fn=<MseLossBackward>) tensor(634.8198, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1475.4982, device='cuda:0', grad_fn=<MseLossBackward>) tensor(979.6255, device='cuda:0', grad_fn=<MseLossBackward>) tensor(3637.7529, device='cuda:0', grad_fn=<MseLossBackward>) tensor(412.4253, device='cuda:0', grad_fn=<MseLossBackward>) tensor(674.1943, device='cuda:0', grad_fn=<MseLossBackward>) tensor(670.7603, device='cuda:0', grad_fn=<MseLossBackward>) tensor(635.9614, device='cuda:0', grad_fn=<MseLossBackward>) tensor(357.7485, device='cuda:0', grad_fn=<MseLossBackward>) tensor(736.0706, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1373.4121, device='cuda:0', grad_fn=<MseLossBackward>) tensor(769.7568, device='cuda:0', grad_fn=<MseLossBackward>) tensor(751.2875, device='cuda:0', grad_fn=<MseLossBackward>) tensor(582.5983, device='cuda:0', grad_fn=<MseLossBackward>) tensor(533.1037, device='cuda:0', grad_fn=<MseLossBackward>) tensor(821.2677, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 801.7934 tensor(206.9802, device='cuda:0') val Loss: 1561.1108 Epoch 7/7 ---------- tensor(509.8663, device='cuda:0', grad_fn=<MseLossBackward>) tensor(962.2800, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1014.6615, device='cuda:0', grad_fn=<MseLossBackward>) tensor(967.9111, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1116.2543, device='cuda:0', grad_fn=<MseLossBackward>) tensor(811.2045, device='cuda:0', grad_fn=<MseLossBackward>) tensor(512.4130, device='cuda:0', grad_fn=<MseLossBackward>) tensor(776.6465, device='cuda:0', grad_fn=<MseLossBackward>) tensor(324.5875, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1186.7432, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1423.6533, device='cuda:0', grad_fn=<MseLossBackward>) tensor(464.7187, device='cuda:0', grad_fn=<MseLossBackward>) tensor(808.8879, device='cuda:0', grad_fn=<MseLossBackward>) tensor(740.5737, device='cuda:0', grad_fn=<MseLossBackward>) tensor(359.7856, device='cuda:0', grad_fn=<MseLossBackward>) tensor(403.4703, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1341.2660, device='cuda:0', grad_fn=<MseLossBackward>) tensor(817.1873, device='cuda:0', grad_fn=<MseLossBackward>) tensor(566.6844, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 843.7192 tensor(248.4291, device='cuda:0') val Loss: 1408.3443 Training complete in 62m 50s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.0001), num_epochs=8)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef5_net_adam3.txt')
Epoch 0/7 ---------- tensor(1893.9414, device='cuda:0', grad_fn=<MseLossBackward>) tensor(491.1410, device='cuda:0', grad_fn=<MseLossBackward>) tensor(407.0396, device='cuda:0', grad_fn=<MseLossBackward>) tensor(451.3976, device='cuda:0', grad_fn=<MseLossBackward>) tensor(591.7089, device='cuda:0', grad_fn=<MseLossBackward>) tensor(620.9421, device='cuda:0', grad_fn=<MseLossBackward>) tensor(514.6200, device='cuda:0', grad_fn=<MseLossBackward>) tensor(685.8617, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1161.3086, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1103.7693, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1507.6575, device='cuda:0', grad_fn=<MseLossBackward>) tensor(894.1885, device='cuda:0', grad_fn=<MseLossBackward>) tensor(757.1544, device='cuda:0', grad_fn=<MseLossBackward>) tensor(611.4003, device='cuda:0', grad_fn=<MseLossBackward>) tensor(863.7394, device='cuda:0', grad_fn=<MseLossBackward>) tensor(524.9700, device='cuda:0', grad_fn=<MseLossBackward>) tensor(954.0220, device='cuda:0', grad_fn=<MseLossBackward>) tensor(510.5211, device='cuda:0', grad_fn=<MseLossBackward>) tensor(659.3458, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 849.5421 tensor(9896.1758, device='cuda:0') val Loss: 1439.7559 Epoch 1/7 ---------- tensor(1048.2744, device='cuda:0', grad_fn=<MseLossBackward>) tensor(357.3139, device='cuda:0', grad_fn=<MseLossBackward>) tensor(546.5439, device='cuda:0', grad_fn=<MseLossBackward>) tensor(549.2125, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1157.9297, device='cuda:0', grad_fn=<MseLossBackward>) tensor(822.9178, device='cuda:0', grad_fn=<MseLossBackward>) tensor(582.4576, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1010.5164, device='cuda:0', grad_fn=<MseLossBackward>) tensor(712.7899, device='cuda:0', grad_fn=<MseLossBackward>) tensor(707.3977, device='cuda:0', grad_fn=<MseLossBackward>) tensor(411.6434, device='cuda:0', grad_fn=<MseLossBackward>) tensor(654.1904, device='cuda:0', grad_fn=<MseLossBackward>) tensor(669.4155, device='cuda:0', grad_fn=<MseLossBackward>) tensor(925.7991, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1545.2308, device='cuda:0', grad_fn=<MseLossBackward>) tensor(949.5782, device='cuda:0', grad_fn=<MseLossBackward>) tensor(882.5598, device='cuda:0', grad_fn=<MseLossBackward>) tensor(454.5476, device='cuda:0', grad_fn=<MseLossBackward>) tensor(808.0522, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 800.4465 tensor(186.0694, device='cuda:0') val Loss: 1327.2884 Epoch 2/7 ---------- tensor(2975.1421, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1004.7396, device='cuda:0', grad_fn=<MseLossBackward>) tensor(731.6511, device='cuda:0', grad_fn=<MseLossBackward>) tensor(418.7595, device='cuda:0', grad_fn=<MseLossBackward>) tensor(830.8860, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1034.3137, device='cuda:0', grad_fn=<MseLossBackward>) tensor(638.9575, device='cuda:0', grad_fn=<MseLossBackward>) tensor(885.4769, device='cuda:0', grad_fn=<MseLossBackward>) tensor(729.4101, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1326.5658, device='cuda:0', grad_fn=<MseLossBackward>) tensor(538.0644, device='cuda:0', grad_fn=<MseLossBackward>) tensor(474.7394, device='cuda:0', grad_fn=<MseLossBackward>) tensor(443.2044, device='cuda:0', grad_fn=<MseLossBackward>) tensor(618.4781, device='cuda:0', grad_fn=<MseLossBackward>) tensor(729.8484, device='cuda:0', grad_fn=<MseLossBackward>) tensor(426.8477, device='cuda:0', grad_fn=<MseLossBackward>) tensor(406.6296, device='cuda:0', grad_fn=<MseLossBackward>) tensor(879.7312, device='cuda:0', grad_fn=<MseLossBackward>) tensor(444.2820, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 792.6099 tensor(123.6460, device='cuda:0') val Loss: 1356.0864 Epoch 3/7 ---------- tensor(746.5101, device='cuda:0', grad_fn=<MseLossBackward>) tensor(357.8317, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1103.2001, device='cuda:0', grad_fn=<MseLossBackward>) tensor(380.1281, device='cuda:0', grad_fn=<MseLossBackward>) tensor(676.4459, device='cuda:0', grad_fn=<MseLossBackward>) tensor(854.6159, device='cuda:0', grad_fn=<MseLossBackward>) tensor(562.9914, device='cuda:0', grad_fn=<MseLossBackward>) tensor(601.1359, device='cuda:0', grad_fn=<MseLossBackward>) tensor(821.3045, device='cuda:0', grad_fn=<MseLossBackward>) tensor(682.8887, device='cuda:0', grad_fn=<MseLossBackward>) tensor(899.6219, device='cuda:0', grad_fn=<MseLossBackward>) tensor(763.8655, device='cuda:0', grad_fn=<MseLossBackward>) tensor(987.2762, device='cuda:0', grad_fn=<MseLossBackward>) tensor(688.3177, device='cuda:0', grad_fn=<MseLossBackward>) tensor(838.7019, device='cuda:0', grad_fn=<MseLossBackward>) tensor(648.9834, device='cuda:0', grad_fn=<MseLossBackward>) tensor(588.0671, device='cuda:0', grad_fn=<MseLossBackward>) tensor(538.3985, device='cuda:0', grad_fn=<MseLossBackward>) tensor(839.3175, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 772.8894 tensor(106.5691, device='cuda:0') val Loss: 1420.0434 Epoch 4/7 ---------- tensor(649.2126, device='cuda:0', grad_fn=<MseLossBackward>) tensor(657.1530, device='cuda:0', grad_fn=<MseLossBackward>) tensor(572.6566, device='cuda:0', grad_fn=<MseLossBackward>) tensor(906.6370, device='cuda:0', grad_fn=<MseLossBackward>) tensor(801.1154, device='cuda:0', grad_fn=<MseLossBackward>) tensor(861.8510, device='cuda:0', grad_fn=<MseLossBackward>) tensor(439.6890, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2212.3347, device='cuda:0', grad_fn=<MseLossBackward>) tensor(807.9200, device='cuda:0', grad_fn=<MseLossBackward>) tensor(522.0710, device='cuda:0', grad_fn=<MseLossBackward>) tensor(278.9547, device='cuda:0', grad_fn=<MseLossBackward>) tensor(848.0947, device='cuda:0', grad_fn=<MseLossBackward>) tensor(588.2655, device='cuda:0', grad_fn=<MseLossBackward>) tensor(2505.2690, device='cuda:0', grad_fn=<MseLossBackward>) tensor(823.3610, device='cuda:0', grad_fn=<MseLossBackward>) tensor(899.9523, device='cuda:0', grad_fn=<MseLossBackward>) tensor(408.7603, device='cuda:0', grad_fn=<MseLossBackward>) tensor(843.5858, device='cuda:0', grad_fn=<MseLossBackward>) tensor(695.9468, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 744.6191 tensor(92.9794, device='cuda:0') val Loss: 1362.1422 Epoch 5/7 ---------- tensor(520.7456, device='cuda:0', grad_fn=<MseLossBackward>) tensor(579.5954, device='cuda:0', grad_fn=<MseLossBackward>) tensor(629.9301, device='cuda:0', grad_fn=<MseLossBackward>) tensor(466.5580, device='cuda:0', grad_fn=<MseLossBackward>) tensor(471.4761, device='cuda:0', grad_fn=<MseLossBackward>) tensor(703.1233, device='cuda:0', grad_fn=<MseLossBackward>) tensor(396.1379, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1097.0435, device='cuda:0', grad_fn=<MseLossBackward>) tensor(861.3225, device='cuda:0', grad_fn=<MseLossBackward>) tensor(748.8499, device='cuda:0', grad_fn=<MseLossBackward>) tensor(732.9205, device='cuda:0', grad_fn=<MseLossBackward>) tensor(656.7502, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1266.3496, device='cuda:0', grad_fn=<MseLossBackward>) tensor(380.9953, device='cuda:0', grad_fn=<MseLossBackward>) tensor(508.6892, device='cuda:0', grad_fn=<MseLossBackward>) tensor(902.3823, device='cuda:0', grad_fn=<MseLossBackward>) tensor(554.6791, device='cuda:0', grad_fn=<MseLossBackward>) tensor(876.3474, device='cuda:0', grad_fn=<MseLossBackward>) tensor(684.3326, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 732.9475 tensor(347.4896, device='cuda:0') val Loss: 1349.7754 Epoch 6/7 ---------- tensor(627.1010, device='cuda:0', grad_fn=<MseLossBackward>) tensor(681.6164, device='cuda:0', grad_fn=<MseLossBackward>) tensor(527.7015, device='cuda:0', grad_fn=<MseLossBackward>) tensor(849.9796, device='cuda:0', grad_fn=<MseLossBackward>) tensor(456.8881, device='cuda:0', grad_fn=<MseLossBackward>) tensor(623.4297, device='cuda:0', grad_fn=<MseLossBackward>) tensor(346.1087, device='cuda:0', grad_fn=<MseLossBackward>) tensor(509.2261, device='cuda:0', grad_fn=<MseLossBackward>) tensor(679.1948, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1266.7844, device='cuda:0', grad_fn=<MseLossBackward>) tensor(726.9844, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1187.8613, device='cuda:0', grad_fn=<MseLossBackward>) tensor(633.9756, device='cuda:0', grad_fn=<MseLossBackward>) tensor(825.3547, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1149.2385, device='cuda:0', grad_fn=<MseLossBackward>) tensor(747.3878, device='cuda:0', grad_fn=<MseLossBackward>) tensor(418.1491, device='cuda:0', grad_fn=<MseLossBackward>) tensor(777.0673, device='cuda:0', grad_fn=<MseLossBackward>) tensor(649.3445, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 702.8133 tensor(168.9305, device='cuda:0') val Loss: 1425.4564 Epoch 7/7 ---------- tensor(724.1061, device='cuda:0', grad_fn=<MseLossBackward>) tensor(981.2323, device='cuda:0', grad_fn=<MseLossBackward>) tensor(382.1414, device='cuda:0', grad_fn=<MseLossBackward>) tensor(477.4351, device='cuda:0', grad_fn=<MseLossBackward>) tensor(343.7090, device='cuda:0', grad_fn=<MseLossBackward>) tensor(506.9417, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1084.4341, device='cuda:0', grad_fn=<MseLossBackward>) tensor(611.7816, device='cuda:0', grad_fn=<MseLossBackward>) tensor(499.7361, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1124.6199, device='cuda:0', grad_fn=<MseLossBackward>) tensor(419.9083, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1132.3693, device='cuda:0', grad_fn=<MseLossBackward>) tensor(974.0189, device='cuda:0', grad_fn=<MseLossBackward>) tensor(910.7921, device='cuda:0', grad_fn=<MseLossBackward>) tensor(699.8044, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1127.5087, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1593.0378, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1084.7552, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1207.8103, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 711.7769 tensor(91.4766, device='cuda:0') val Loss: 1416.2851 Training complete in 62m 51s
In [ ]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.00003), num_epochs=8)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef5_net_adam4.txt')
Epoch 0/7 ---------- tensor(557.4513, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1275.9635, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1142.1570, device='cuda:0', grad_fn=<MseLossBackward>) tensor(701.0946, device='cuda:0', grad_fn=<MseLossBackward>) tensor(473.2738, device='cuda:0', grad_fn=<MseLossBackward>) tensor(707.8235, device='cuda:0', grad_fn=<MseLossBackward>) tensor(559.0863, device='cuda:0', grad_fn=<MseLossBackward>) tensor(639.2803, device='cuda:0', grad_fn=<MseLossBackward>) tensor(572.5847, device='cuda:0', grad_fn=<MseLossBackward>) tensor(675.5094, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1098.4247, device='cuda:0', grad_fn=<MseLossBackward>) tensor(776.4652, device='cuda:0', grad_fn=<MseLossBackward>) tensor(883.4783, device='cuda:0', grad_fn=<MseLossBackward>) tensor(604.8850, device='cuda:0', grad_fn=<MseLossBackward>) tensor(305.4053, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1144.2603, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1254.9189, device='cuda:0', grad_fn=<MseLossBackward>) tensor(715.5989, device='cuda:0', grad_fn=<MseLossBackward>) tensor(552.5999, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 779.6201 tensor(29519.7871, device='cuda:0') val Loss: 1324.6370 Epoch 1/7 ---------- tensor(550.5531, device='cuda:0', grad_fn=<MseLossBackward>) tensor(646.0627, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1208.2349, device='cuda:0', grad_fn=<MseLossBackward>) tensor(386.6393, device='cuda:0', grad_fn=<MseLossBackward>) tensor(973.1709, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1067.4120, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1206.8718, device='cuda:0', grad_fn=<MseLossBackward>) tensor(694.6691, device='cuda:0', grad_fn=<MseLossBackward>) tensor(605.0795, device='cuda:0', grad_fn=<MseLossBackward>) tensor(581.8184, device='cuda:0', grad_fn=<MseLossBackward>) tensor(872.8865, device='cuda:0', grad_fn=<MseLossBackward>) tensor(323.5513, device='cuda:0', grad_fn=<MseLossBackward>) tensor(479.0033, device='cuda:0', grad_fn=<MseLossBackward>) tensor(712.5197, device='cuda:0', grad_fn=<MseLossBackward>) tensor(848.7914, device='cuda:0', grad_fn=<MseLossBackward>) tensor(340.4974, device='cuda:0', grad_fn=<MseLossBackward>) tensor(751.3362, device='cuda:0', grad_fn=<MseLossBackward>) tensor(582.8821, device='cuda:0', grad_fn=<MseLossBackward>) tensor(762.6750, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 759.9753 tensor(105.5128, device='cuda:0') val Loss: 1379.5874 Epoch 2/7 ---------- tensor(1067.5759, device='cuda:0', grad_fn=<MseLossBackward>) tensor(718.7767, device='cuda:0', grad_fn=<MseLossBackward>) tensor(449.7345, device='cuda:0', grad_fn=<MseLossBackward>) tensor(564.8746, device='cuda:0', grad_fn=<MseLossBackward>) tensor(493.0490, device='cuda:0', grad_fn=<MseLossBackward>) tensor(641.6085, device='cuda:0', grad_fn=<MseLossBackward>) tensor(705.7722, device='cuda:0', grad_fn=<MseLossBackward>) tensor(545.9653, device='cuda:0', grad_fn=<MseLossBackward>) tensor(615.9921, device='cuda:0', grad_fn=<MseLossBackward>) tensor(769.7947, device='cuda:0', grad_fn=<MseLossBackward>) tensor(545.8456, device='cuda:0', grad_fn=<MseLossBackward>) tensor(704.1823, device='cuda:0', grad_fn=<MseLossBackward>) tensor(690.9713, device='cuda:0', grad_fn=<MseLossBackward>) tensor(424.7306, device='cuda:0', grad_fn=<MseLossBackward>) tensor(776.8778, device='cuda:0', grad_fn=<MseLossBackward>) tensor(756.0140, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1251.7290, device='cuda:0', grad_fn=<MseLossBackward>) tensor(516.2026, device='cuda:0', grad_fn=<MseLossBackward>) tensor(879.1694, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 764.1586 tensor(2414.6880, device='cuda:0') val Loss: 1367.2359 Epoch 3/7 ---------- tensor(982.4999, device='cuda:0', grad_fn=<MseLossBackward>) tensor(723.8063, device='cuda:0', grad_fn=<MseLossBackward>) tensor(479.2018, device='cuda:0', grad_fn=<MseLossBackward>) tensor(520.7103, device='cuda:0', grad_fn=<MseLossBackward>) tensor(638.4678, device='cuda:0', grad_fn=<MseLossBackward>) tensor(870.0285, device='cuda:0', grad_fn=<MseLossBackward>) tensor(361.3197, device='cuda:0', grad_fn=<MseLossBackward>) tensor(802.6613, device='cuda:0', grad_fn=<MseLossBackward>) tensor(359.7191, device='cuda:0', grad_fn=<MseLossBackward>) tensor(311.7228, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1256.6506, device='cuda:0', grad_fn=<MseLossBackward>) tensor(529.8437, device='cuda:0', grad_fn=<MseLossBackward>) tensor(813.0436, device='cuda:0', grad_fn=<MseLossBackward>) tensor(628.1933, device='cuda:0', grad_fn=<MseLossBackward>) tensor(401.0801, device='cuda:0', grad_fn=<MseLossBackward>) tensor(649.9983, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1080.6089, device='cuda:0', grad_fn=<MseLossBackward>) tensor(789.9310, device='cuda:0', grad_fn=<MseLossBackward>) tensor(963.3047, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 730.0842 tensor(103.3152, device='cuda:0') val Loss: 1347.2068 Epoch 4/7 ---------- tensor(848.6259, device='cuda:0', grad_fn=<MseLossBackward>) tensor(758.6728, device='cuda:0', grad_fn=<MseLossBackward>) tensor(418.3472, device='cuda:0', grad_fn=<MseLossBackward>) tensor(583.7791, device='cuda:0', grad_fn=<MseLossBackward>) tensor(649.3426, device='cuda:0', grad_fn=<MseLossBackward>) tensor(476.7333, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1369.1906, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1011.3356, device='cuda:0', grad_fn=<MseLossBackward>) tensor(587.9127, device='cuda:0', grad_fn=<MseLossBackward>) tensor(346.0956, device='cuda:0', grad_fn=<MseLossBackward>) tensor(761.3312, device='cuda:0', grad_fn=<MseLossBackward>) tensor(590.2959, device='cuda:0', grad_fn=<MseLossBackward>) tensor(905.4476, device='cuda:0', grad_fn=<MseLossBackward>) tensor(610.3942, device='cuda:0', grad_fn=<MseLossBackward>) tensor(706.6362, device='cuda:0', grad_fn=<MseLossBackward>) tensor(667.5505, device='cuda:0', grad_fn=<MseLossBackward>) tensor(713.1520, device='cuda:0', grad_fn=<MseLossBackward>) tensor(913.5762, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1111.5591, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 739.3353 tensor(399.8507, device='cuda:0') val Loss: 1322.6124 Epoch 5/7 ---------- tensor(785.0142, device='cuda:0', grad_fn=<MseLossBackward>) tensor(399.1312, device='cuda:0', grad_fn=<MseLossBackward>) tensor(446.4752, device='cuda:0', grad_fn=<MseLossBackward>) tensor(462.2217, device='cuda:0', grad_fn=<MseLossBackward>) tensor(310.0354, device='cuda:0', grad_fn=<MseLossBackward>) tensor(511.7921, device='cuda:0', grad_fn=<MseLossBackward>) tensor(414.0130, device='cuda:0', grad_fn=<MseLossBackward>) tensor(488.4758, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1072.3000, device='cuda:0', grad_fn=<MseLossBackward>) tensor(415.2217, device='cuda:0', grad_fn=<MseLossBackward>) tensor(459.3622, device='cuda:0', grad_fn=<MseLossBackward>) tensor(541.4727, device='cuda:0', grad_fn=<MseLossBackward>) tensor(378.8484, device='cuda:0', grad_fn=<MseLossBackward>) tensor(908.4778, device='cuda:0', grad_fn=<MseLossBackward>) tensor(559.8579, device='cuda:0', grad_fn=<MseLossBackward>) tensor(984.0187, device='cuda:0', grad_fn=<MseLossBackward>) tensor(849.6495, device='cuda:0', grad_fn=<MseLossBackward>) tensor(448.7370, device='cuda:0', grad_fn=<MseLossBackward>) tensor(602.8341, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 712.7997 tensor(127.7019, device='cuda:0') val Loss: 1327.9564 Epoch 6/7 ---------- tensor(875.3263, device='cuda:0', grad_fn=<MseLossBackward>) tensor(605.9894, device='cuda:0', grad_fn=<MseLossBackward>) tensor(313.8354, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1012.6488, device='cuda:0', grad_fn=<MseLossBackward>) tensor(471.2280, device='cuda:0', grad_fn=<MseLossBackward>) tensor(924.4403, device='cuda:0', grad_fn=<MseLossBackward>) tensor(603.6794, device='cuda:0', grad_fn=<MseLossBackward>) tensor(300.9218, device='cuda:0', grad_fn=<MseLossBackward>) tensor(663.2133, device='cuda:0', grad_fn=<MseLossBackward>) tensor(478.0306, device='cuda:0', grad_fn=<MseLossBackward>) tensor(887.3363, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1212.9299, device='cuda:0', grad_fn=<MseLossBackward>) tensor(765.0774, device='cuda:0', grad_fn=<MseLossBackward>) tensor(730.8415, device='cuda:0', grad_fn=<MseLossBackward>) tensor(549.9520, device='cuda:0', grad_fn=<MseLossBackward>) tensor(404.9858, device='cuda:0', grad_fn=<MseLossBackward>) tensor(863.1225, device='cuda:0', grad_fn=<MseLossBackward>) tensor(395.0944, device='cuda:0', grad_fn=<MseLossBackward>) tensor(910.2090, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 714.0271 tensor(131.5949, device='cuda:0') val Loss: 1311.3705 Epoch 7/7 ---------- tensor(692.8297, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1255.2542, device='cuda:0', grad_fn=<MseLossBackward>) tensor(612.1259, device='cuda:0', grad_fn=<MseLossBackward>) tensor(486.5991, device='cuda:0', grad_fn=<MseLossBackward>) tensor(664.1965, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1277.3528, device='cuda:0', grad_fn=<MseLossBackward>) tensor(924.0161, device='cuda:0', grad_fn=<MseLossBackward>) tensor(581.8583, device='cuda:0', grad_fn=<MseLossBackward>) tensor(887.3478, device='cuda:0', grad_fn=<MseLossBackward>) tensor(870.4787, device='cuda:0', grad_fn=<MseLossBackward>) tensor(425.2536, device='cuda:0', grad_fn=<MseLossBackward>) tensor(899.5668, device='cuda:0', grad_fn=<MseLossBackward>) tensor(712.8131, device='cuda:0', grad_fn=<MseLossBackward>) tensor(652.1807, device='cuda:0', grad_fn=<MseLossBackward>) tensor(429.2379, device='cuda:0', grad_fn=<MseLossBackward>) tensor(745.0189, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1241.1256, device='cuda:0', grad_fn=<MseLossBackward>) tensor(465.5525, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1108.3672, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 707.1351 tensor(66.7626, device='cuda:0') val Loss: 1347.7592 Training complete in 62m 36s
In [15]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.00001), num_epochs=5)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef5_net_adam5.txt')
Epoch 0/4 ---------- tensor(474.4205, device='cuda:0', grad_fn=<MseLossBackward>) tensor(520.7783, device='cuda:0', grad_fn=<MseLossBackward>) tensor(691.4502, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1915.4984, device='cuda:0', grad_fn=<MseLossBackward>) tensor(981.7435, device='cuda:0', grad_fn=<MseLossBackward>) tensor(660.6892, device='cuda:0', grad_fn=<MseLossBackward>) tensor(725.6124, device='cuda:0', grad_fn=<MseLossBackward>) tensor(658.2329, device='cuda:0', grad_fn=<MseLossBackward>) tensor(616.6148, device='cuda:0', grad_fn=<MseLossBackward>) tensor(665.7450, device='cuda:0', grad_fn=<MseLossBackward>) tensor(538.1611, device='cuda:0', grad_fn=<MseLossBackward>) tensor(398.0583, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1031.8822, device='cuda:0', grad_fn=<MseLossBackward>) tensor(390.9847, device='cuda:0', grad_fn=<MseLossBackward>) tensor(505.6226, device='cuda:0', grad_fn=<MseLossBackward>) tensor(610.5330, device='cuda:0', grad_fn=<MseLossBackward>) tensor(723.2950, device='cuda:0', grad_fn=<MseLossBackward>) tensor(615.5419, device='cuda:0', grad_fn=<MseLossBackward>) tensor(757.6169, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 699.4458 tensor(81.6333, device='cuda:0') val Loss: 1309.3945 Epoch 1/4 ---------- tensor(904.9321, device='cuda:0', grad_fn=<MseLossBackward>) tensor(679.7383, device='cuda:0', grad_fn=<MseLossBackward>) tensor(835.8688, device='cuda:0', grad_fn=<MseLossBackward>) tensor(737.5471, device='cuda:0', grad_fn=<MseLossBackward>) tensor(660.0707, device='cuda:0', grad_fn=<MseLossBackward>) tensor(538.5067, device='cuda:0', grad_fn=<MseLossBackward>) tensor(625.6597, device='cuda:0', grad_fn=<MseLossBackward>) tensor(345.7386, device='cuda:0', grad_fn=<MseLossBackward>) tensor(626.9946, device='cuda:0', grad_fn=<MseLossBackward>) tensor(785.5850, device='cuda:0', grad_fn=<MseLossBackward>) tensor(113.2039, device='cuda:0', grad_fn=<MseLossBackward>) tensor(429.0422, device='cuda:0', grad_fn=<MseLossBackward>) tensor(745.8002, device='cuda:0', grad_fn=<MseLossBackward>) tensor(587.3675, device='cuda:0', grad_fn=<MseLossBackward>) tensor(443.0291, device='cuda:0', grad_fn=<MseLossBackward>) tensor(672.5412, device='cuda:0', grad_fn=<MseLossBackward>) tensor(410.6427, device='cuda:0', grad_fn=<MseLossBackward>) tensor(551.1369, device='cuda:0', grad_fn=<MseLossBackward>) tensor(424.3795, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 692.0891 tensor(27287.0586, device='cuda:0') val Loss: 1327.4097 Epoch 2/4 ---------- tensor(617.1632, device='cuda:0', grad_fn=<MseLossBackward>) tensor(405.6962, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1203.4521, device='cuda:0', grad_fn=<MseLossBackward>) tensor(763.0702, device='cuda:0', grad_fn=<MseLossBackward>) tensor(777.2101, device='cuda:0', grad_fn=<MseLossBackward>) tensor(887.4746, device='cuda:0', grad_fn=<MseLossBackward>) tensor(4175.7495, device='cuda:0', grad_fn=<MseLossBackward>) tensor(721.0527, device='cuda:0', grad_fn=<MseLossBackward>) tensor(816.9531, device='cuda:0', grad_fn=<MseLossBackward>) tensor(534.9386, device='cuda:0', grad_fn=<MseLossBackward>) tensor(709.9523, device='cuda:0', grad_fn=<MseLossBackward>) tensor(613.1090, device='cuda:0', grad_fn=<MseLossBackward>) tensor(319.8206, device='cuda:0', grad_fn=<MseLossBackward>) tensor(498.8016, device='cuda:0', grad_fn=<MseLossBackward>) tensor(766.7651, device='cuda:0', grad_fn=<MseLossBackward>) tensor(605.1559, device='cuda:0', grad_fn=<MseLossBackward>) tensor(965.1078, device='cuda:0', grad_fn=<MseLossBackward>) tensor(649.0719, device='cuda:0', grad_fn=<MseLossBackward>) tensor(422.6246, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 698.7366 tensor(1426.4725, device='cuda:0') val Loss: 1332.8609 Epoch 3/4 ---------- tensor(810.1728, device='cuda:0', grad_fn=<MseLossBackward>) tensor(469.7860, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1107.9319, device='cuda:0', grad_fn=<MseLossBackward>) tensor(838.5443, device='cuda:0', grad_fn=<MseLossBackward>) tensor(393.2704, device='cuda:0', grad_fn=<MseLossBackward>) tensor(452.2599, device='cuda:0', grad_fn=<MseLossBackward>) tensor(864.8928, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1188.9148, device='cuda:0', grad_fn=<MseLossBackward>) tensor(689.8696, device='cuda:0', grad_fn=<MseLossBackward>) tensor(573.8225, device='cuda:0', grad_fn=<MseLossBackward>) tensor(654.5557, device='cuda:0', grad_fn=<MseLossBackward>) tensor(837.8419, device='cuda:0', grad_fn=<MseLossBackward>) tensor(690.6844, device='cuda:0', grad_fn=<MseLossBackward>) tensor(330.3257, device='cuda:0', grad_fn=<MseLossBackward>) tensor(496.6655, device='cuda:0', grad_fn=<MseLossBackward>) tensor(650.8167, device='cuda:0', grad_fn=<MseLossBackward>) tensor(554.5222, device='cuda:0', grad_fn=<MseLossBackward>) tensor(676.4238, device='cuda:0', grad_fn=<MseLossBackward>) tensor(578.6619, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 689.0244 tensor(162.0581, device='cuda:0') val Loss: 1317.7533 Epoch 4/4 ---------- tensor(398.6326, device='cuda:0', grad_fn=<MseLossBackward>) tensor(548.9901, device='cuda:0', grad_fn=<MseLossBackward>) tensor(580.1676, device='cuda:0', grad_fn=<MseLossBackward>) tensor(533.1057, device='cuda:0', grad_fn=<MseLossBackward>) tensor(504.1043, device='cuda:0', grad_fn=<MseLossBackward>) tensor(588.8623, device='cuda:0', grad_fn=<MseLossBackward>) tensor(655.1155, device='cuda:0', grad_fn=<MseLossBackward>) tensor(644.3137, device='cuda:0', grad_fn=<MseLossBackward>) tensor(983.9981, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1054.2576, device='cuda:0', grad_fn=<MseLossBackward>) tensor(560.1362, device='cuda:0', grad_fn=<MseLossBackward>) tensor(835.9634, device='cuda:0', grad_fn=<MseLossBackward>) tensor(892.5691, device='cuda:0', grad_fn=<MseLossBackward>) tensor(915.7896, device='cuda:0', grad_fn=<MseLossBackward>) tensor(839.8195, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1182.6750, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1090.5099, device='cuda:0', grad_fn=<MseLossBackward>) tensor(521.6654, device='cuda:0', grad_fn=<MseLossBackward>) tensor(490.2231, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 691.3856 tensor(3767.6106, device='cuda:0') val Loss: 1326.4231 Training complete in 72m 37s
In [16]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.000003), num_epochs=5)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef5_net_adam6.txt')
Epoch 0/4 ---------- tensor(459.1544, device='cuda:0', grad_fn=<MseLossBackward>) tensor(959.9667, device='cuda:0', grad_fn=<MseLossBackward>) tensor(755.1567, device='cuda:0', grad_fn=<MseLossBackward>) tensor(504.5496, device='cuda:0', grad_fn=<MseLossBackward>) tensor(783.6774, device='cuda:0', grad_fn=<MseLossBackward>) tensor(405.2543, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1557.6870, device='cuda:0', grad_fn=<MseLossBackward>) tensor(697.2979, device='cuda:0', grad_fn=<MseLossBackward>) tensor(538.7745, device='cuda:0', grad_fn=<MseLossBackward>) tensor(662.6500, device='cuda:0', grad_fn=<MseLossBackward>) tensor(566.3698, device='cuda:0', grad_fn=<MseLossBackward>) tensor(542.2593, device='cuda:0', grad_fn=<MseLossBackward>) tensor(756.8997, device='cuda:0', grad_fn=<MseLossBackward>) tensor(754.3699, device='cuda:0', grad_fn=<MseLossBackward>) tensor(590.2037, device='cuda:0', grad_fn=<MseLossBackward>) tensor(314.1022, device='cuda:0', grad_fn=<MseLossBackward>) tensor(851.4091, device='cuda:0', grad_fn=<MseLossBackward>) tensor(682.4498, device='cuda:0', grad_fn=<MseLossBackward>) tensor(914.5414, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 715.5992 tensor(143.4515, device='cuda:0') val Loss: 1318.2458 Epoch 1/4 ---------- tensor(608.9588, device='cuda:0', grad_fn=<MseLossBackward>) tensor(453.4824, device='cuda:0', grad_fn=<MseLossBackward>) tensor(743.2053, device='cuda:0', grad_fn=<MseLossBackward>) tensor(604.9442, device='cuda:0', grad_fn=<MseLossBackward>) tensor(890.0687, device='cuda:0', grad_fn=<MseLossBackward>) tensor(680.3484, device='cuda:0', grad_fn=<MseLossBackward>) tensor(692.8219, device='cuda:0', grad_fn=<MseLossBackward>) tensor(900.7077, device='cuda:0', grad_fn=<MseLossBackward>) tensor(522.8530, device='cuda:0', grad_fn=<MseLossBackward>) tensor(292.3668, device='cuda:0', grad_fn=<MseLossBackward>) tensor(496.1448, device='cuda:0', grad_fn=<MseLossBackward>) tensor(454.4898, device='cuda:0', grad_fn=<MseLossBackward>) tensor(846.2621, device='cuda:0', grad_fn=<MseLossBackward>) tensor(690.9159, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1043.9785, device='cuda:0', grad_fn=<MseLossBackward>) tensor(411.4757, device='cuda:0', grad_fn=<MseLossBackward>) tensor(782.8490, device='cuda:0', grad_fn=<MseLossBackward>) tensor(407.6249, device='cuda:0', grad_fn=<MseLossBackward>) tensor(423.6574, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 696.7618 tensor(223.3346, device='cuda:0') val Loss: 1313.4023 Epoch 2/4 ---------- tensor(688.7407, device='cuda:0', grad_fn=<MseLossBackward>) tensor(388.4598, device='cuda:0', grad_fn=<MseLossBackward>) tensor(705.8300, device='cuda:0', grad_fn=<MseLossBackward>) tensor(558.8851, device='cuda:0', grad_fn=<MseLossBackward>) tensor(825.5137, device='cuda:0', grad_fn=<MseLossBackward>) tensor(537.5566, device='cuda:0', grad_fn=<MseLossBackward>) tensor(464.5101, device='cuda:0', grad_fn=<MseLossBackward>) tensor(700.2145, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1293.0186, device='cuda:0', grad_fn=<MseLossBackward>) tensor(452.5471, device='cuda:0', grad_fn=<MseLossBackward>) tensor(969.3383, device='cuda:0', grad_fn=<MseLossBackward>) tensor(274.2343, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1131.5858, device='cuda:0', grad_fn=<MseLossBackward>) tensor(373.1921, device='cuda:0', grad_fn=<MseLossBackward>) tensor(882.5729, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1240.2336, device='cuda:0', grad_fn=<MseLossBackward>) tensor(490.7855, device='cuda:0', grad_fn=<MseLossBackward>) tensor(797.4427, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1054.9485, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 695.4912 tensor(4549.1040, device='cuda:0') val Loss: 1322.9624 Epoch 3/4 ---------- tensor(988.7066, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1485.9719, device='cuda:0', grad_fn=<MseLossBackward>) tensor(987.4691, device='cuda:0', grad_fn=<MseLossBackward>) tensor(593.1062, device='cuda:0', grad_fn=<MseLossBackward>) tensor(704.2724, device='cuda:0', grad_fn=<MseLossBackward>) tensor(388.2776, device='cuda:0', grad_fn=<MseLossBackward>) tensor(443.7773, device='cuda:0', grad_fn=<MseLossBackward>) tensor(373.9128, device='cuda:0', grad_fn=<MseLossBackward>) tensor(715.3870, device='cuda:0', grad_fn=<MseLossBackward>) tensor(896.2750, device='cuda:0', grad_fn=<MseLossBackward>) tensor(742.5311, device='cuda:0', grad_fn=<MseLossBackward>) tensor(753.6001, device='cuda:0', grad_fn=<MseLossBackward>) tensor(483.8671, device='cuda:0', grad_fn=<MseLossBackward>) tensor(747.6678, device='cuda:0', grad_fn=<MseLossBackward>) tensor(698.3331, device='cuda:0', grad_fn=<MseLossBackward>) tensor(598.5282, device='cuda:0', grad_fn=<MseLossBackward>) tensor(732.1938, device='cuda:0', grad_fn=<MseLossBackward>) tensor(696.3341, device='cuda:0', grad_fn=<MseLossBackward>) tensor(649.3403, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 699.6668 tensor(166.1459, device='cuda:0') val Loss: 1324.3819 Epoch 4/4 ---------- tensor(695.8196, device='cuda:0', grad_fn=<MseLossBackward>) tensor(351.6412, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1042.9663, device='cuda:0', grad_fn=<MseLossBackward>) tensor(433.0515, device='cuda:0', grad_fn=<MseLossBackward>) tensor(708.3881, device='cuda:0', grad_fn=<MseLossBackward>) tensor(896.5139, device='cuda:0', grad_fn=<MseLossBackward>) tensor(492.7296, device='cuda:0', grad_fn=<MseLossBackward>) tensor(703.6683, device='cuda:0', grad_fn=<MseLossBackward>) tensor(685.7272, device='cuda:0', grad_fn=<MseLossBackward>) tensor(773.7471, device='cuda:0', grad_fn=<MseLossBackward>) tensor(423.0814, device='cuda:0', grad_fn=<MseLossBackward>) tensor(507.9961, device='cuda:0', grad_fn=<MseLossBackward>) tensor(546.0551, device='cuda:0', grad_fn=<MseLossBackward>) tensor(714.9069, device='cuda:0', grad_fn=<MseLossBackward>) tensor(649.1038, device='cuda:0', grad_fn=<MseLossBackward>) tensor(673.2117, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1611.0286, device='cuda:0', grad_fn=<MseLossBackward>) tensor(377.8221, device='cuda:0', grad_fn=<MseLossBackward>) tensor(489.1741, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 703.2385 tensor(241.1334, device='cuda:0') val Loss: 1317.6488 Training complete in 72m 40s
In [17]:
model, hist = train_model(model, dataloaders_dict, criterion, torch.optim.Adam(model.parameters(), lr=0.000001), num_epochs=5)
torch.save(model.state_dict(), '/content/drive/MyDrive/weights_ef5_net_adam6.txt')
Epoch 0/4 ---------- tensor(374.0327, device='cuda:0', grad_fn=<MseLossBackward>) tensor(840.9709, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1089.3677, device='cuda:0', grad_fn=<MseLossBackward>) tensor(248.8978, device='cuda:0', grad_fn=<MseLossBackward>) tensor(857.8651, device='cuda:0', grad_fn=<MseLossBackward>) tensor(619.9756, device='cuda:0', grad_fn=<MseLossBackward>) tensor(376.7505, device='cuda:0', grad_fn=<MseLossBackward>) tensor(801.1678, device='cuda:0', grad_fn=<MseLossBackward>) tensor(324.1050, device='cuda:0', grad_fn=<MseLossBackward>) tensor(639.1308, device='cuda:0', grad_fn=<MseLossBackward>) tensor(641.3213, device='cuda:0', grad_fn=<MseLossBackward>) tensor(631.1763, device='cuda:0', grad_fn=<MseLossBackward>) tensor(369.9946, device='cuda:0', grad_fn=<MseLossBackward>) tensor(892.0739, device='cuda:0', grad_fn=<MseLossBackward>) tensor(471.9645, device='cuda:0', grad_fn=<MseLossBackward>) tensor(931.4033, device='cuda:0', grad_fn=<MseLossBackward>) tensor(542.2040, device='cuda:0', grad_fn=<MseLossBackward>) tensor(263.9974, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1062.1045, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 707.5611 tensor(173.5317, device='cuda:0') val Loss: 1325.3995 Epoch 1/4 ---------- tensor(789.0584, device='cuda:0', grad_fn=<MseLossBackward>) tensor(656.5427, device='cuda:0', grad_fn=<MseLossBackward>) tensor(878.6625, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1635.8201, device='cuda:0', grad_fn=<MseLossBackward>) tensor(423.2480, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1028.0320, device='cuda:0', grad_fn=<MseLossBackward>) tensor(395.3095, device='cuda:0', grad_fn=<MseLossBackward>) tensor(779.4139, device='cuda:0', grad_fn=<MseLossBackward>) tensor(641.7259, device='cuda:0', grad_fn=<MseLossBackward>) tensor(466.6595, device='cuda:0', grad_fn=<MseLossBackward>) tensor(836.4004, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1159.6316, device='cuda:0', grad_fn=<MseLossBackward>) tensor(422.2290, device='cuda:0', grad_fn=<MseLossBackward>) tensor(519.6357, device='cuda:0', grad_fn=<MseLossBackward>) tensor(500.0964, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1351.2784, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1635.4469, device='cuda:0', grad_fn=<MseLossBackward>) tensor(403.1234, device='cuda:0', grad_fn=<MseLossBackward>) tensor(986.4741, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 699.6118 tensor(69.8518, device='cuda:0') val Loss: 1326.8070 Epoch 2/4 ---------- tensor(410.2752, device='cuda:0', grad_fn=<MseLossBackward>) tensor(775.6812, device='cuda:0', grad_fn=<MseLossBackward>) tensor(922.7560, device='cuda:0', grad_fn=<MseLossBackward>) tensor(909.2839, device='cuda:0', grad_fn=<MseLossBackward>) tensor(577.0360, device='cuda:0', grad_fn=<MseLossBackward>) tensor(715.2623, device='cuda:0', grad_fn=<MseLossBackward>) tensor(444.8695, device='cuda:0', grad_fn=<MseLossBackward>) tensor(699.2564, device='cuda:0', grad_fn=<MseLossBackward>) tensor(861.4148, device='cuda:0', grad_fn=<MseLossBackward>) tensor(891.0555, device='cuda:0', grad_fn=<MseLossBackward>) tensor(377.3217, device='cuda:0', grad_fn=<MseLossBackward>) tensor(395.7995, device='cuda:0', grad_fn=<MseLossBackward>) tensor(456.5496, device='cuda:0', grad_fn=<MseLossBackward>) tensor(705.6714, device='cuda:0', grad_fn=<MseLossBackward>) tensor(971.9423, device='cuda:0', grad_fn=<MseLossBackward>) tensor(517.2520, device='cuda:0', grad_fn=<MseLossBackward>) tensor(627.2681, device='cuda:0', grad_fn=<MseLossBackward>) tensor(947.3318, device='cuda:0', grad_fn=<MseLossBackward>) tensor(457.4852, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 696.4207 tensor(205.4798, device='cuda:0') val Loss: 1319.8425 Epoch 3/4 ---------- tensor(517.2361, device='cuda:0', grad_fn=<MseLossBackward>) tensor(529.9508, device='cuda:0', grad_fn=<MseLossBackward>) tensor(995.4133, device='cuda:0', grad_fn=<MseLossBackward>) tensor(529.1346, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1294.5020, device='cuda:0', grad_fn=<MseLossBackward>) tensor(374.8243, device='cuda:0', grad_fn=<MseLossBackward>) tensor(675.4749, device='cuda:0', grad_fn=<MseLossBackward>) tensor(545.4537, device='cuda:0', grad_fn=<MseLossBackward>) tensor(826.8840, device='cuda:0', grad_fn=<MseLossBackward>) tensor(421.8890, device='cuda:0', grad_fn=<MseLossBackward>) tensor(857.6523, device='cuda:0', grad_fn=<MseLossBackward>) tensor(286.3632, device='cuda:0', grad_fn=<MseLossBackward>) tensor(604.4668, device='cuda:0', grad_fn=<MseLossBackward>) tensor(770.2843, device='cuda:0', grad_fn=<MseLossBackward>) tensor(739.8259, device='cuda:0', grad_fn=<MseLossBackward>) tensor(725.3483, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1055.2581, device='cuda:0', grad_fn=<MseLossBackward>) tensor(751.5704, device='cuda:0', grad_fn=<MseLossBackward>) tensor(911.6670, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 686.7360 tensor(114.2301, device='cuda:0') val Loss: 1322.7420 Epoch 4/4 ---------- tensor(582.1881, device='cuda:0', grad_fn=<MseLossBackward>) tensor(776.2437, device='cuda:0', grad_fn=<MseLossBackward>) tensor(429.7910, device='cuda:0', grad_fn=<MseLossBackward>) tensor(840.8004, device='cuda:0', grad_fn=<MseLossBackward>) tensor(865.9403, device='cuda:0', grad_fn=<MseLossBackward>) tensor(517.0648, device='cuda:0', grad_fn=<MseLossBackward>) tensor(345.3684, device='cuda:0', grad_fn=<MseLossBackward>) tensor(600.4232, device='cuda:0', grad_fn=<MseLossBackward>) tensor(361.8311, device='cuda:0', grad_fn=<MseLossBackward>) tensor(1064.9000, device='cuda:0', grad_fn=<MseLossBackward>) tensor(740.4256, device='cuda:0', grad_fn=<MseLossBackward>) tensor(538.8967, device='cuda:0', grad_fn=<MseLossBackward>) tensor(275.7245, device='cuda:0', grad_fn=<MseLossBackward>) tensor(582.6667, device='cuda:0', grad_fn=<MseLossBackward>) tensor(471.8724, device='cuda:0', grad_fn=<MseLossBackward>) tensor(418.3882, device='cuda:0', grad_fn=<MseLossBackward>) tensor(667.7050, device='cuda:0', grad_fn=<MseLossBackward>) tensor(716.4990, device='cuda:0', grad_fn=<MseLossBackward>) tensor(653.1924, device='cuda:0', grad_fn=<MseLossBackward>) train Loss: 697.6444 tensor(185.6894, device='cuda:0') val Loss: 1318.7830 Training complete in 72m 38s
Prediction and data uploading¶
In [ ]:
A=[[i for i in range(10000)],[0]*10000]
df=pd.DataFrame(A).transpose()
df.columns=['ImageID','label']
i=0
for f in os.listdir('data/test/'):
im=Image.open('data/test/'+f)
a=trainTrans(im)
tens=torch.reshape(a,(1,3,a.size(1),a.size(2)))
inputs = tens.to(device)
outputs = (model(inputs).detach().cpu().numpy())
df.iloc[int(f.split('.')[0]),1]=outputs[0][0]
df.to_csv('/content/drive/MyDrive/Colab Notebooks/submission.csv',index=False)
!aicrowd submission create -c f1-speed-recognition -f '/content/drive/MyDrive/Colab Notebooks/submission.csv'
Content
Comments
You must login before you can post a comment.