Data Science Machine Learning Challenge
Introduction
Less-Than-Truckload (LTL) is a shipping business that excels in the transportation of small freight. As part of operations, trips are created to deliver shipments to, or pick them up from, customer locations. Effective trip planning is key to the optimal operation of the LTL business.
For better trip planning, we need to estimate how long our drivers will spend at each customer location. This is done using Dwell Time. Dwell time is usually defined as the time elapsed between the cargo's arrival at the premises and the goods' departure from them. It is a key indicator of how efficient a shipping operation is and how quickly cargo flows through its stops.
Our aim is to estimate Dwell Time, since a good estimate helps optimise the operation. This is what allows your parcels to reach your doorstep just a few short days after you order them!
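To make the definition above concrete, here is a minimal sketch of dwell time computed from an arrival and a departure timestamp. The function name, the timestamps, and the choice of minutes as the unit are illustrative assumptions; the dataset's actual columns and units may differ.

```python
from datetime import datetime


def dwell_time_minutes(arrival, departure):
    """Return the minutes elapsed between arrival at and departure from a stop."""
    if departure < arrival:
        raise ValueError("departure precedes arrival")
    return (departure - arrival).total_seconds() / 60.0


# Hypothetical stop: the driver arrives at 09:15 and leaves at 09:52.
arrival = datetime(2023, 1, 5, 9, 15)
departure = datetime(2023, 1, 5, 9, 52)
print(dwell_time_minutes(arrival, departure))  # 37.0
```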
Task
Given the historical data for dwell time at customer locations, can you create a predictive model to estimate dwell time (DWELL_TIME) at a customer stop? Your solution should include the following steps:
- Data Preparation
- Choosing a model
- Training
- Evaluation
- Model performance improvement techniques
Conclusion - Explain your approach and findings. How can you further improve the models' accuracy? Through this task, we want to better understand your problem-solving approach and critical thinking. Please outline any other approach you'd like to try that isn't mentioned in your current solution.
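The steps above could be sketched end to end as follows. This is only an illustration under stated assumptions: the data is synthetic, STOP_TYPE and its values are hypothetical (only DWELL_TIME comes from the task description), and a per-group mean stands in for whatever model you actually choose. Comparing against a global-mean baseline is one simple way to check whether a model adds anything.

```python
import math
import random
from collections import defaultdict


def make_synthetic_records(n, seed=0):
    """Build stand-in data; STOP_TYPE and the numbers are hypothetical."""
    rng = random.Random(seed)
    base = {"PICKUP": 20.0, "DELIVERY": 35.0}
    records = []
    for _ in range(n):
        stop_type = rng.choice(list(base))
        dwell = base[stop_type] + rng.uniform(-8.0, 8.0)
        records.append({"STOP_TYPE": stop_type, "DWELL_TIME": dwell})
    return records


def prepare(records):
    """Data preparation: drop rows whose target is missing."""
    return [r for r in records if r.get("DWELL_TIME") is not None]


def split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and hold out a test set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_group_mean(train, key):
    """Training: learn the mean dwell time per group plus a global fallback."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in train:
        sums[r[key]] += r["DWELL_TIME"]
        counts[r[key]] += 1
    global_mean = sum(sums.values()) / sum(counts.values())
    group_means = {k: sums[k] / counts[k] for k in sums}
    return group_means, global_mean


def predict(model, record, key):
    """Predict the group mean, falling back to the global mean for unseen groups."""
    group_means, global_mean = model
    return group_means.get(record[key], global_mean)


def rmse(y_true, y_pred):
    """Evaluation: root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


records = prepare(make_synthetic_records(1000))
train, test = split(records)
model = fit_group_mean(train, "STOP_TYPE")

y_true = [r["DWELL_TIME"] for r in test]
baseline_rmse = rmse(y_true, [model[1]] * len(test))  # global mean only
model_rmse = rmse(y_true, [predict(model, r, "STOP_TYPE") for r in test])
print(f"baseline RMSE: {baseline_rmse:.2f}")
print(f"group-mean RMSE: {model_rmse:.2f}")
```

In a real submission the group-mean model would be replaced by a proper regressor, but the same split, fit, predict, and evaluate structure carries over.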
Dataset
- The Anchor Dataset attached to the email contains over 13,000 records.
- Each record in the dataset represents a customer stop.
- A detailed description of columns can be found here.
Evaluation Criteria
Root Mean Square Error (RMSE) will be used as a metric to validate your model.
Do not spend too much time on improving RMSE. We are more interested to see your approach, methodology, understanding and skills in Machine Learning.
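For reference, RMSE over n customer stops, with y_i the observed DWELL_TIME and \hat{y}_i the model's prediction, is conventionally defined as below. The symbols are notation introduced here, not taken from the challenge materials.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
```

As a hypothetical example, two stops with prediction errors of 3 and -4 give \sqrt{(9 + 16)/2} = \sqrt{12.5} \approx 3.54, expressed in the same units as DWELL_TIME.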
Submission
- This challenge accepts the notebook as a submission.
- During the evaluation, the notebook will be run as is, so please make sure it runs without any errors before submitting.
- The notebook follows a particular format; please stick to it.
- Do not delete the header of the cells in the notebook.
- The timeouts for the training phase and the inference phase are X and Y, respectively.
Check out the starter notebook here.
All the best for your submission! :)
Constraints
Language:
You can use Python or R, whichever you are comfortable with. Please check the starter notebook for more instructions.
Time Duration
The final submission must be made within 3.5 hours of first receiving this assignment.