Data Science Machine Learning Challenge
Starter Notebook For Dwell Time Prediction Challenge
Starter code for the challenge.
What is the notebook about?¶
Use historical dwell time data at customer locations and create a predictive model to estimate dwell time (DWELL_TIME).
How to use this notebook? 📝¶
Instructions
- The notebook follows a particular format, please stick to it.
- Do not delete the header of the cell.
- Use only the environment variables defined in the AIcrowd Runtime Configuration section.
- During evaluation, the notebook will be run as is, so any error in the notebook will cause the evaluation to fail.
- Update the config parameters. You can define the common variables here:
Variable | Description |
---|---|
AICROWD_TRAIN_DATASET_PATH | Path to the file containing training data. |
AICROWD_TEST_DATASET_PATH | Path to the file containing test data. |
AICROWD_PREDICTIONS_PATH | Path to write the output to. |
AICROWD_API_KEY | In order to submit your code to AIcrowd, you need to provide your account's API key. This key is available at https://www.aicrowd.com/participants/me |
- Installing packages. Please use the Install packages 🗃 section to install the packages.
Setup AIcrowd Utilities 🛠¶
We use this to bundle the files for submission and create a submission on AIcrowd. Do not edit this block.
!pip install -U aicrowd-cli
AIcrowd Runtime Configuration 🧷¶
Define configuration parameters. Please include any files needed for the notebook to run under ASSETS_DIR. We will copy the contents of this directory to your final submission file.
import os
# Please use the absolute path for the location of the dataset.
# Or you can use a relative path with `os.path.join(os.getcwd(), "test_data/test.csv")`
AICROWD_TRAIN_DATASET_PATH = os.getenv("AICROWD_TRAIN_DATASET_PATH", "train.csv")
AICROWD_TEST_DATASET_PATH = os.getenv("AICROWD_TEST_DATASET_PATH", "test.csv")
AICROWD_PREDICTIONS_PATH = os.getenv("AICROWD_PREDICTIONS_PATH", "predictions.csv")
AICROWD_API_KEY = "" # Get your key from https://www.aicrowd.com/participants/me
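The configuration above relies on `os.getenv` falling back to a local default when a variable is not set, so the same cell works both locally and wherever the variables are provided. As a small sketch of that behaviour (the `resolve_path` helper is a hypothetical name, not part of the AIcrowd tooling), the lookup can be combined with `os.path.abspath` to always end up with an absolute path:

```python
import os


def resolve_path(var_name: str, default: str) -> str:
    """Return the path stored in an environment variable, or a local default.

    Hypothetical helper for illustration: the result is always absolute,
    which follows the notebook's advice to use absolute dataset paths.
    """
    return os.path.abspath(os.getenv(var_name, default))


# Falls back to "test.csv" in the current directory when the variable is unset.
test_path = resolve_path("AICROWD_TEST_DATASET_PATH", "test.csv")
print(test_path)
```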
Download datasets from AIcrowd¶
!aicrowd login --api-key $AICROWD_API_KEY
!aicrowd dataset download --challenge xpo-logistics-data-science-machine-learning-challenge
Install packages 🗃¶
Please add all package installations in this section.
!pip install numpy pandas scikit-learn
Define preprocessing code 💻¶
The code that is common between the training and the prediction sections should be defined here. During evaluation, we completely skip the training section. Please make sure to add any common logic between the training and prediction sections here.
Import common packages¶
Please import packages that are common for training and prediction phases here.
import numpy as np
import pandas as pd
# some preprocessing code
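Because the training section is skipped during evaluation, preprocessing that both phases rely on is easiest to keep as a stateless function defined in this section. The following is a minimal sketch, not the challenge's prescribed approach: `preprocess` is a hypothetical helper name, and keeping only numeric columns and filling gaps with 0 are placeholder assumptions, since the dataset's feature columns are not described in this notebook.

```python
import pandas as pd

TARGET_COLUMN = "DWELL_TIME"  # target name taken from the challenge description


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical shared preprocessing used by both training and prediction.

    Drops the target if present, keeps numeric columns only, and fills
    missing values with 0. No statistics are learned from the data, so the
    function behaves identically when the training section is skipped.
    """
    features = df.drop(columns=[TARGET_COLUMN], errors="ignore")
    features = features.select_dtypes(include="number")
    return features.fillna(0)
```

Any step that does learn from the training data (for example, an encoder or scaler) would need to be saved in the assets directory and reloaded in the prediction phase instead.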
Training phase ⚙️¶
You can define your training code here. This section will be skipped during evaluation.
# model = define_your_model
Load training data¶
# load your data
Train your model¶
# model.fit(train_data)
# some custom code block
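As one possible way to fill in the training placeholders, the sketch below fits a baseline regressor and saves it so that the prediction phase can reload it. Everything beyond the `DWELL_TIME` target is an assumption made for illustration: the choice of `RandomForestRegressor`, the use of numeric columns only, the `train_and_save` helper, and the `model.joblib` file name are not specified by the challenge.

```python
import os

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def train_and_save(train_df: pd.DataFrame, assets_dir: str = "assets") -> str:
    """Fit a baseline regressor on numeric columns and save it under assets_dir."""
    y = train_df["DWELL_TIME"]
    # Assumption: numeric columns are usable as features; gaps are filled with 0.
    X = train_df.drop(columns=["DWELL_TIME"]).select_dtypes(include="number").fillna(0)

    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X, y)

    # Store the feature names with the model so prediction can use the same columns.
    os.makedirs(assets_dir, exist_ok=True)
    model_path = os.path.join(assets_dir, "model.joblib")  # hypothetical file name
    joblib.dump({"model": model, "columns": list(X.columns)}, model_path)
    return model_path


# In the notebook this could be called as:
# train_and_save(pd.read_csv(AICROWD_TRAIN_DATASET_PATH))
```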
Prediction phase 🔎¶
Please make sure to save the weights from the training section in your assets directory and load them in this section
# some custom code
Load test data¶
test_data = pd.read_csv(AICROWD_TEST_DATASET_PATH)
test_data.head()
Generate predictions¶
# Random placeholder predictions, one value per test row; replace with your model's output.
predictions = {
"DWELL_TIME": np.random.rand(len(test_data))
}
predictions_df = pd.DataFrame.from_dict(predictions)
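The random values above are only a placeholder. A hedged sketch of replacing them with real model output is shown below. It assumes a file saved with `joblib` that holds a dict with a fitted `model` and the list of feature `columns` it was trained on; that bundle layout, the `load_and_predict` helper, and the `assets/model.joblib` path are illustrative assumptions rather than anything the challenge mandates.

```python
import joblib
import pandas as pd


def load_and_predict(test_df: pd.DataFrame, model_path: str) -> pd.DataFrame:
    """Load a saved model bundle and return predictions as a DWELL_TIME column."""
    bundle = joblib.load(model_path)
    # Select the training-time feature columns in order; absent ones become 0.
    X = test_df.reindex(columns=bundle["columns"]).fillna(0)
    return pd.DataFrame({"DWELL_TIME": bundle["model"].predict(X)})


# In the notebook this could be called as:
# predictions_df = load_and_predict(test_data, "assets/model.joblib")
```

Reindexing on the saved column list keeps the feature order identical to training even if the test file orders its columns differently.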
Save predictions 📨¶
predictions_df.to_csv(AICROWD_PREDICTIONS_PATH, index=False)
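Since any error surfaces only at evaluation time, it can help to sanity-check the predictions before submitting. The sketch below assumes the evaluator expects what this starter code produces, namely a single `DWELL_TIME` column with one row per test row; the `check_predictions` helper is hypothetical and the exact validation rules used by AIcrowd are not stated here.

```python
import pandas as pd


def check_predictions(predictions_df: pd.DataFrame, test_df: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if list(predictions_df.columns) != ["DWELL_TIME"]:
        problems.append(f"unexpected columns: {list(predictions_df.columns)}")
    if len(predictions_df) != len(test_df):
        problems.append(
            f"row count {len(predictions_df)} does not match test rows {len(test_df)}"
        )
    if "DWELL_TIME" in predictions_df.columns and predictions_df["DWELL_TIME"].isna().any():
        problems.append("DWELL_TIME contains missing values")
    return problems


# In the notebook this could be called as:
# print(check_predictions(pd.read_csv(AICROWD_PREDICTIONS_PATH), test_data))
```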
Submit to AIcrowd 🚀¶
NOTE: PLEASE SAVE THE NOTEBOOK BEFORE SUBMITTING IT (Ctrl + S)
!aicrowd login --api-key $AICROWD_API_KEY
!aicrowd notebook submit --dry-run \
--assets-dir "assets" \
--no-verify \
--challenge data-science-machine-learning-challenge