Suppose you are a YouTuber, and you have an employee to edit your videos. At the beginning of each month, you decide his salary for that month. It can be either a low salary (2300/-) or a high salary (3000/-). Knowing his salary, the employee can decide to continue for that month or quit immediately. The probability that the employee will quit on a low salary is 0.6 and on a high salary is 0.2.
When the employee quits, you have to edit the videos yourself, which will cost you 4000/- per month. While you have no employee, you advertise each month for a new one. A new employee can start at the beginning of the following month under the same salary conditions described above. You choose an advertising budget, which can be either low (300/-) or high (600/-). The probability of finding a new employee is 0.7 with the low advertising budget and 0.9 with the high advertising budget.
Each month you have to decide which salary is offered to an employee, and if the employee quits then you have to choose the advertising budget.
Implement the following: (4 marks)
- Formulate this problem as a Markov decision process in which your objective is to minimize the total expected cost for the next 1 year.
- Using dynamic programming, find out the optimal policy and optimal value for each month.
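One possible way to set up the two tasks above is sketched below as a two-state, finite-horizon MDP solved by backward induction. The timing and cost conventions are assumptions, not part of the problem statement: an employee who quits is paid nothing that month, you pay the 4000/- self-editing cost in any month without a working employee, advertising starts in the month after a quit, and the terminal cost after month 12 is zero. The names `EMPLOYED`, `VACANT`, `ACTIONS`, and `backward_induction` are hypothetical, and the data files in the Resources section may define the environment differently.

```python
import numpy as np

# States (assumed encoding): 0 = employee on the payroll, 1 = no employee.
EMPLOYED, VACANT = 0, 1
HORIZON = 12        # months in one year
SELF_EDIT = 4000    # cost of editing the videos yourself for a month

# For each state: (label, expected cost this month, P(next state = EMPLOYED)).
# Assumption: a quitting employee is not paid and you edit that month yourself.
ACTIONS = {
    EMPLOYED: [
        ("low salary", 0.4 * 2300 + 0.6 * SELF_EDIT, 0.4),
        ("high salary", 0.8 * 3000 + 0.2 * SELF_EDIT, 0.8),
    ],
    VACANT: [
        ("low ads", SELF_EDIT + 300, 0.7),
        ("high ads", SELF_EDIT + 600, 0.9),
    ],
}


def backward_induction(horizon=HORIZON):
    """Return V[t, s] (expected cost-to-go) and policy[t, s] (action index)."""
    V = np.zeros((horizon + 1, 2))            # V[horizon] = 0: assumed terminal cost
    policy = np.zeros((horizon, 2), dtype=int)
    for t in range(horizon - 1, -1, -1):      # sweep from the last month to the first
        for s in (EMPLOYED, VACANT):
            q = [
                cost + p * V[t + 1, EMPLOYED] + (1 - p) * V[t + 1, VACANT]
                for _, cost, p in ACTIONS[s]
            ]
            policy[t, s] = int(np.argmin(q))  # cheapest action in this month/state
            V[t, s] = min(q)
    return V, policy


V, policy = backward_induction()
for t in range(HORIZON):
    row = ", ".join(
        f"{ACTIONS[s][policy[t, s]][0]} (cost-to-go {V[t, s]:.1f})"
        for s in (EMPLOYED, VACANT)
    )
    print(f"month {t + 1}: {row}")
```

Each printed row gives the chosen action and expected remaining cost for the employed state, then for the vacant state. Changing any of the assumed conventions only requires editing the entries of `ACTIONS`; the recursion itself stays the same.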
Answer the following (based on the data given above): (1 mark)
- Consider a policy where you always pay the employee the low salary and always choose the high advertising budget. Is it optimal? Justify your answer.
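A question like this can be checked numerically by evaluating the fixed policy over the 12 months and comparing its expected cost with the optimal cost-to-go. The sketch below is self-contained and rests on assumed conventions that the problem statement does not fix: a quitting employee is not paid, the self-editing cost of 4000/- applies in any month without a working employee, and the terminal cost is zero. The names `COST`, `P_EMP`, `evaluate`, and `optimal` are hypothetical.

```python
import numpy as np

# Rows: state (0 = employed, 1 = no employee). Columns: action (0 = low, 1 = high).
# Expected cost for the month under the assumed conventions.
COST = np.array([
    [0.4 * 2300 + 0.6 * 4000, 0.8 * 3000 + 0.2 * 4000],  # low / high salary
    [4000 + 300, 4000 + 600],                             # low / high ad budget
])
# Probability of having an employee at the start of the next month.
P_EMP = np.array([
    [0.4, 0.8],
    [0.7, 0.9],
])


def q_values(v_next):
    """Q[s, a] = cost this month + expected cost-to-go from next month."""
    return COST + P_EMP * v_next[0] + (1 - P_EMP) * v_next[1]


def evaluate(fixed_actions, horizon=12):
    """Expected total cost of always playing fixed_actions[s] in state s."""
    v = np.zeros(2)                        # assumed zero terminal cost
    for _ in range(horizon):
        v = q_values(v)[[0, 1], fixed_actions]
    return v


def optimal(horizon=12):
    """Optimal expected total cost by backward induction."""
    v = np.zeros(2)
    for _ in range(horizon):
        v = q_values(v).min(axis=1)
    return v


v_fixed = evaluate(np.array([0, 1]))       # low salary, high advertising budget
v_opt = optimal()
gap = v_fixed - v_opt                      # extra expected cost of the fixed policy
print("fixed policy:", v_fixed)
print("optimal     :", v_opt)
print("gap         :", gap)
```

A strictly positive entry in `gap` means the fixed policy costs more than the optimum from that starting state, so it is not optimal under these assumptions. A written justification can then point to the months and states where the two policies choose different actions.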
You will be writing your solutions & making a submission through a notebook. You can follow the instructions in the starter notebook.
In the Resources section, you will find data files that contain the environment parameters for this problem.
- Submissions will be made through a notebook following the instructions in the starter notebook.
- Each team can make 5 successful submissions and 5 failed submissions per day. Once the failed-submission limit is reached, any further failed submission counts toward the successful-submission limit.
- The submission limit will reset at 5:30 AM IST every day.
- At the end of the challenge, you will have to select 1 submission as the final one. You can select that here.
- RL TAs