(a) Formulate this problem as a Markov decision process by identifying the states and decisions and then finding the $C_{ik}$. (b) Identify all the (stationary deterministic) policies. For each one, find the transition matrix and write an expression for the (long-run) expected average cost per period in terms of the unknown steady-state probabilities $(\pi_0, \pi_1, \ldots, \pi_M)$. (c) Find these steady-state probabilities $(\pi_0, \pi_1, \ldots, \pi_M)$ for each policy. Then evaluate the expression obtained in part (b) to find the optimal policy by exhaustive enumeration.

12. Every Saturday night a man plays poker at his home with the same group of friends. If he
provides refreshments for the group (at an expected cost of $30) on any given Saturday night,
the group will begin the following Saturday night in a good mood with probability and in a bad
mood with probability. However, if he fails to provide refreshments, the group will begin the
following Saturday night in a good mood with probability and in a bad mood with probability,
regardless of their mood this Saturday. Furthermore, if the group begins the night in a bad mood
and then he fails to provide refreshments, the group will gang up on him so that he incurs
expected poker losses of $70. Under other circumstances, he averages no gain or loss on his
poker play. The man wishes to find the policy regarding when to provide refreshments that will
minimize his (long-run) expected average cost per week.
(a) Formulate this problem as a Markov decision process by identifying the states and decisions
and then finding the $C_{ik}$.
(b) Identify all the (stationary deterministic) policies. For each one, find the transition matrix
and write an expression for the (long-run) expected average cost per period in terms of the
unknown steady-state probabilities $(\pi_0, \pi_1, \ldots, \pi_M)$.
(c) Find these steady-state probabilities $(\pi_0, \pi_1, \ldots, \pi_M)$ for each policy. Then evaluate the
expression obtained in part (b) to find the optimal policy by exhaustive enumeration.
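
The exhaustive enumeration asked for in parts (a) to (c) could be sketched in Python roughly as follows. The costs ($30 for refreshments, $70 in losses when the group is in a bad mood and gets no refreshments, $0 otherwise) come from the problem statement. The transition probabilities are missing from the text above, so the values passed in the usage lines are placeholders and not the problem's data. The sketch also assumes that next week's mood depends only on this week's decision, and that every probability lies strictly between 0 and 1 so each policy has a unique steady-state distribution. All names (`COST`, `transition_matrix`, `steady_state`, `evaluate_policies`) are illustrative.

```python
from itertools import product

import numpy as np

GOOD, BAD = 0, 1                      # states: mood at the start of the night
PROVIDE, SKIP = "provide", "skip"     # decisions

# C_ik from the problem statement: cost of decision k in state i.
COST = {
    (GOOD, PROVIDE): 30.0,
    (BAD, PROVIDE): 30.0,
    (GOOD, SKIP): 0.0,
    (BAD, SKIP): 70.0,
}


def transition_matrix(policy, p_good):
    """Row i is the distribution of next week's mood under decision policy[i].

    p_good maps each decision to P(good mood next week); assumed to depend
    on the decision only, not on the current mood.
    """
    return np.array(
        [[p_good[policy[i]], 1.0 - p_good[policy[i]]] for i in (GOOD, BAD)]
    )


def steady_state(P):
    """Solve pi = pi P together with sum(pi) = 1 by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def evaluate_policies(p_good):
    """Return {policy: long-run expected average cost per week}.

    A policy is a pair (decision in good mood, decision in bad mood).
    """
    results = {}
    for policy in product((PROVIDE, SKIP), repeat=2):
        pi = steady_state(transition_matrix(policy, p_good))
        results[policy] = sum(pi[i] * COST[(i, policy[i])] for i in (GOOD, BAD))
    return results


if __name__ == "__main__":
    # Placeholder probabilities, NOT taken from the problem statement.
    placeholder = {PROVIDE: 0.75, SKIP: 0.25}
    costs = evaluate_policies(placeholder)
    for policy, cost in sorted(costs.items(), key=lambda kv: kv[1]):
        print(policy, round(cost, 4))
```

Substituting the problem's actual probabilities for the placeholder dictionary yields the four average costs to compare; the policy with the smallest value is the optimal one.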