
Question
5.
A store manages its inventory of a single product using a controlled Markov Chain
(Markov Decision Process). The store can decide whether to order new stock or not at the end
of each day. The inventory level can be between 0 and 3 units. Each day, the demand for the
product follows a random process. If the demand exceeds the available inventory, the excess
demand is lost (i.e., it results in lost sales). On the other hand, there is also a storage cost
on unsold items at the end of each day. The goal is to study the limiting distribution of the
inventory levels under a specific ordering policy, and the average daily cost.
States and Actions
• States: The state X_n of the system at time n is the inventory level, which can be 0, 1, 2, or 3. There is an additional state L for a lost sale; it also corresponds to an inventory of 0, but it records that on that day a sale was lost because the stock was insufficient. The state space is S = {L, 0, 1, 2, 3}.
• Actions: At the end of each day, the store can either
  – order one unit of the product, or
  – not order any new stock.
The action space is A = {0, 1}.
Demand Distribution: The daily demand D for the product follows the distribution:
P(D = 0) = 0.5
P(D = 1) = 0.25
P(D = 2) = 0.25
Cost:
We ignore the profit from sales and only consider the costs of lost sales and storage.
We assume that a day with lost sales costs 10 Euro (independent of the number of units lost). In addition, there is an overnight storage cost of 2 Euro whenever items are held in stock overnight (independent of the number of items). Items ordered in the evening only arrive the next morning and incur a storage cost only the following night, in case they are still in stock. The cost therefore depends only on X_n and is given by:
• Lost sale: C(L) = 10
• End-of-day inventory: C(1) = C(2) = C(3) = 2
• Empty inventory without a lost sale: C(0) = 0 (neither cost applies)
Note that the two distinct states 0 and L are needed so that the cost can be computed from the state alone, without requiring knowledge of the demand D.
Transition Probabilities: Let s be the current inventory level and a the action taken (0 for not ordering, 1 for ordering one unit). The next state s' is determined by the current state,
the action taken, and the demand D. For example,
P(s' = L | s = 0, a = 1) = P(D = 2) = 0.25,
or
P(s' = 3 | s = 2, a = 1) = P(D = 0) = 0.5.
Ordering Policy: Consider a simple policy where, at the end of the day, the store orders one
unit whenever the inventory level is 0 or L, and does not order otherwise.
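As an illustration (our own sketch, not part of the exercise text), the one-step transition matrix of the chain under this policy can be assembled numerically. The snippet below uses Python with NumPy; the state ordering [L, 0, 1, 2, 3] and the helper next_state are assumptions of the sketch.

```python
import numpy as np

# Sketch (illustration only): build the one-step transition matrix of X_n under
# the policy "order one unit iff the current state is L or 0".
# Assumed state order: [L, 0, 1, 2, 3].
states = ["L", 0, 1, 2, 3]
demand_pmf = {0: 0.5, 1: 0.25, 2: 0.25}   # P(D=0), P(D=1), P(D=2)

def next_state(stock, demand):
    """End-of-day state given the morning stock level and the day's demand."""
    if demand > stock:
        return "L"            # demand exceeded stock: a sale was lost
    return stock - demand     # leftover inventory (0, 1, 2 or 3)

P = np.zeros((len(states), len(states)))
for i, s in enumerate(states):
    inventory = 0 if s == "L" else s      # state L means zero units on hand
    order = 1 if s in ("L", 0) else 0     # the given ordering policy
    stock = inventory + order             # the ordered unit arrives next morning
    for d, prob in demand_pmf.items():
        j = states.index(next_state(stock, d))
        P[i, j] += prob

print(np.round(P, 2))
```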
(a) Define the Transition Matrix: Construct the transition matrix for the Markov Chain
under the given policy.
(b) Find the Limiting Distribution: Calculate the limiting distribution of the inventory
levels under this policy.
(c) Expected cost: For the above policy, calculate
lim_{n→∞} [C(X_1) + ... + C(X_n)] / n.
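For parts (b) and (c), a minimal sketch (assuming the matrix P and the state order [L, 0, 1, 2, 3] from the previous snippet) solves pi P = pi together with the normalisation constraint, and then weights the cost vector by the limiting distribution to obtain the long-run average cost.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 via least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])  # balance equations + normalisation
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Cost vector in the assumed state order [L, 0, 1, 2, 3]:
# C(L) = 10, C(0) = 0, C(1) = C(2) = C(3) = 2.
costs = np.array([10.0, 0.0, 2.0, 2.0, 2.0])

# pi = stationary_distribution(P)   # limiting distribution (part b)
# avg_cost = pi @ costs             # lim (C(X_1) + ... + C(X_n)) / n  (part c)
```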