APAI1001 Assignment 1 (Due: Oct 18 (Friday), 23:59pm) Q1 Breath first search (Path Searching) Franklin Philips, a professor of the University of Los Santos (B5), is going to drive to Vice City (E3) to attend a conference today. The map of this region with district division is showed as follows. However, Prof. Philips woke up late in the morning and he must arrive at Vice City as soon as possible. Answer the following questions to help Prof. Philips finding the shortest path. (Adjacent districts are interconnected if and only if they are divided by dotted line). 5 4 3 2 1 A Los Santos B Vice City E Consider using breath first search (under the graph search framework) to find the shortest path. Note that each district on the map has at most 4 successors, and most locations have less than 4 successors due to road blocks and map boundaries. We consider a default left-bottom-right- top order of the successors (when adding them to the frontier). (a) [CiC] Write down the adjacency lists of district (i) C4; (ii) D1. (b)[CiC] Complete the following search tree for shortest paths from Los Santos to Vice. A3 A5 A5 A4 B2 62 Los Santos (B5) 64 B4 C5 95 B3 15 D5 (c) Fill in the following table with first several steps of BFS to find the shortest paths from Los Santos to Vice City with respect to your search tree. Step Visiting node Set of explored nodes Frontier set 1 B5 {} {A5, B4, C5} 2 A5 {B5} {B4, C5,A4} 3 B4 4 C5 {B5,A5} {B5,A5,B4} {C5,A4,B3} {A4,B3,D5} 5 6 7 8 A* Search Just before his departure, Prof. Philips found a map in his pocket with more geographic information of this region. 5 4 3 2 1 A Los Santos A B C D Vice City A E Suppose the time cost to go through area with different geographic features are: Plain: 10 mins Bridge over river : 20 mins Mountain 40 mins Define function g(n) as the time cost from Los Santos to the current district and heuristic function h(n) as the Manhattan distance from the current district to Vice City times 10 mins and f(n) = g(n)+h(n). (d) Fill in the following table with first several steps of A* search to find the quickest path from Los Santos to Vice City with respect to your search tree. Step Visiting node Set of explored nodes (you can omit g, h, f) 1 B5 {} 2 B4 {B5} 3 4 5 6 (e) Write down the quickest path given by A* search. Frontier set with values of (g(n), h(n), f(n)) {A5:(10,60,70), B4:(10,40,50), C5:(40,40,80)} {A5:(10,60,70), C5:(40,40,80), B3:(20,30,50), A4:(20,50,70)} Q2 We have introduced in class that in the 8-puzzle, whether an initial state is solvable or unsolvable depends on the parity of the number of inversions, but we did not explain why. Here let us consider a general state of the form below: A1 A2 A3 A₂ АА A6 A7 A8 Here A1, A2,..., Ag can be an arbitrary permutation/arrangement of 1,2,...,8. Answer the following questions. (a) In the given state, the piles in row major order are apparently order(initial) = A1, A2, A3, A4, A5, A6, A7, A8. If the blank pile moves up (i.e., A₂ moves down), we can write the corresponding piles in row major order (we name the result order(up)) as order(up) = A₁, A3, A4, A2, A5, A6, A7, A8. Following this idea, give the corresponding piles in row major order when the blank pile moves down/left/right (we name the results order(down), order(left), order(right)). ...9 , A8, we cannot calculate the (b) Since we do not know the exact values of each of A1, A2, number of inversions for order(initial), order(up), order(down), order(left), order(right). However, we can compare them and try to figure out the differences between the number ons. Compare your results in (a) with order(initial). For each case (up/down/left/ right), discuss what is the key part in the sequence that determines the difference between the number of inversions compared to order(initial). of inver (c) Based on the above arguments, discuss why the parity of the number of inversions does not change in the 8-puzzle problem. Q3 (You are not required to give rigorous proofs in (b), (c),(e),(f), give some explanations of your reasoning is enough.) Consider using gradient descent to minimize the function f(x) = |x|, x ЄR. The derivative ('gradient') of f(x) is given as follows: d +1, if x* > 0, -f(x) = for all x ER. dx -1, if x* <0 x=x* The derivative of f(x) at x = 0 is undefined and you can manually use 0 as the "derivative" value if you need it. Now suppose that you use gradient descent d = x(t) — Nt⋅ -f(x) dx x=x(t) x(t+1) starting from the initial state x (0) = 2.5 to minimize f(x). (a) Suppose that the with learning rate n = 1. Write out the first 8 iterates x(1), x(2), x(3), x(4), x(5)¸ x(6), x(7), x(8) produced by gradient descent. Will x (1) converge to a specific value? Will f(x) be minimized? (b)Now suppose that the learning rate at the t-th iteration is set as 1/(t + 1), i.e., 1 d x(1) = x (0) • 0+1 dx = f(x) | x=x(0) etc. , Write out the first 8 iterates (use a calculator, round your result to 3 significant digits, e.g., 0.1234 0.123) x(1), x(2), x(3), x(4), x(5), x (6), x(7), x(8). Can you guess, will x (t) converge to a specific value? Will f(x) be minimized? Briefly explain your reasoning. Hint: ∞ Σ = lim 1 t→∞ t = 0. t=1 (c) What if we set the learning rate at the t-th iteration as 1/(t + 1)², i.e., x (1) = x(0) 1 d (0+1)2 dx etc? " x=x(0) Write out the first 8 iterates (use a calculator, round your result to 3 significant digits, e.g., 0.1234 ≈ 0.123) x(1), x(2), x(3), x(4), x(5), x(6), x(7), x(8). Can you guess, will ✗(t) converge to a specific value? Will f(x) be minimized? Briefly explain your reasoning. Hint: 2 1 Π ≈ 1.6449. t2 6 t=1 Now consider using gradient descent to minimize the function f(x) = 1—1—1 + x², x The derivative of f(x) is given as follows: d -f(x) dx x=x* x², x ЄR. = x* for all x* E R. Suppose that you use gradient descent to minimize f(x). (d)Consider gradient descent with constant learning rate n = 2. If we set the initial state x(0) = 2.5, write out the first 4 iterates x(¹), x(2), x(3), x (4); If we set the initial state x(0) = 1.73234, write out the first 4 iterates x (1), x (2), x(3), x(4). (e) Consider gradient descent with constant learning rate 0 < n < 2. Can you guess, will x(t) converge to a specific value? Will f(x) be minimized? Briefly explain your reasoning. (f) Consider gradient descent with constant learning rate ŋ > 2. Can you guess, will x(t) converge to a specific value? Will f(x) be minimized at that value? Briefly explain your reasoning.

APAI1001 Assignment 1 (Due: Oct 18 (Friday), 23:59pm) Q1 Breath first search (Path Searching) Franklin Philips, a professor of the University of Los Santos (B5), is going to drive to Vice City (E3) to attend a conference today. The map of this region with district division is showed as follows. However, Prof. Philips woke up late in the morning and he must arrive at Vice City as soon as possible. Answer the following questions to help Prof. Philips finding the shortest path. (Adjacent districts are interconnected if and only if they are divided by dotted line). 5 4 3 2 1 A Los Santos B Vice City E Consider using breath first search (under the graph search framework) to find the shortest path. Note that each district on the map has at most 4 successors, and most locations have less than 4 successors due to road blocks and map boundaries. We consider a default left-bottom-right- top order of the successors (when adding them to the frontier). (a) [CiC] Write down the adjacency lists of district (i) C4; (ii) D1. (b)[CiC] Complete the following search tree for shortest paths from Los Santos to Vice. A3 A5 A5 A4 B2 62 Los Santos (B5) 64 B4 C5 95 B3 15 D5 (c) Fill in the following table with first several steps of BFS to find the shortest paths from Los Santos to Vice City with respect to your search tree. Step Visiting node Set of explored nodes Frontier set 1 B5 {} {A5, B4, C5} 2 A5 {B5} {B4, C5,A4} 3 B4 4 C5 {B5,A5} {B5,A5,B4} {C5,A4,B3} {A4,B3,D5} 5 6 7 8 A* Search Just before his departure, Prof. Philips found a map in his pocket with more geographic information of this region. 5 4 3 2 1 A Los Santos A B C D Vice City A E Suppose the time cost to go through area with different geographic features are: Plain: 10 mins Bridge over river : 20 mins Mountain 40 mins Define function g(n) as the time cost from Los Santos to the current district and heuristic function h(n) as the Manhattan distance from the current district to Vice City times 10 mins and f(n) = g(n)+h(n). (d) Fill in the following table with first several steps of A* search to find the quickest path from Los Santos to Vice City with respect to your search tree. Step Visiting node Set of explored nodes (you can omit g, h, f) 1 B5 {} 2 B4 {B5} 3 4 5 6 (e) Write down the quickest path given by A* search. Frontier set with values of (g(n), h(n), f(n)) {A5:(10,60,70), B4:(10,40,50), C5:(40,40,80)} {A5:(10,60,70), C5:(40,40,80), B3:(20,30,50), A4:(20,50,70)} Q2 We have introduced in class that in the 8-puzzle, whether an initial state is solvable or unsolvable depends on the parity of the number of inversions, but we did not explain why. Here let us consider a general state of the form below: A1 A2 A3 A₂ АА A6 A7 A8 Here A1, A2,..., Ag can be an arbitrary permutation/arrangement of 1,2,...,8. Answer the following questions. (a) In the given state, the piles in row major order are apparently order(initial) = A1, A2, A3, A4, A5, A6, A7, A8. If the blank pile moves up (i.e., A₂ moves down), we can write the corresponding piles in row major order (we name the result order(up)) as order(up) = A₁, A3, A4, A2, A5, A6, A7, A8. Following this idea, give the corresponding piles in row major order when the blank pile moves down/left/right (we name the results order(down), order(left), order(right)). ...9 , A8, we cannot calculate the (b) Since we do not know the exact values of each of A1, A2, number of inversions for order(initial), order(up), order(down), order(left), order(right). However, we can compare them and try to figure out the differences between the number ons. Compare your results in (a) with order(initial). For each case (up/down/left/ right), discuss what is the key part in the sequence that determines the difference between the number of inversions compared to order(initial). of inver (c) Based on the above arguments, discuss why the parity of the number of inversions does not change in the 8-puzzle problem. Q3 (You are not required to give rigorous proofs in (b), (c),(e),(f), give some explanations of your reasoning is enough.) Consider using gradient descent to minimize the function f(x) = |x|, x ЄR. The derivative ('gradient') of f(x) is given as follows: d +1, if x* > 0, -f(x) = for all x ER. dx -1, if x* <0 x=x* The derivative of f(x) at x = 0 is undefined and you can manually use 0 as the "derivative" value if you need it. Now suppose that you use gradient descent d = x(t) — Nt⋅ -f(x) dx x=x(t) x(t+1) starting from the initial state x (0) = 2.5 to minimize f(x). (a) Suppose that the with learning rate n = 1. Write out the first 8 iterates x(1), x(2), x(3), x(4), x(5)¸ x(6), x(7), x(8) produced by gradient descent. Will x (1) converge to a specific value? Will f(x) be minimized? (b)Now suppose that the learning rate at the t-th iteration is set as 1/(t + 1), i.e., 1 d x(1) = x (0) • 0+1 dx = f(x) | x=x(0) etc. , Write out the first 8 iterates (use a calculator, round your result to 3 significant digits, e.g., 0.1234 0.123) x(1), x(2), x(3), x(4), x(5), x (6), x(7), x(8). Can you guess, will x (t) converge to a specific value? Will f(x) be minimized? Briefly explain your reasoning. Hint: ∞ Σ = lim 1 t→∞ t = 0. t=1 (c) What if we set the learning rate at the t-th iteration as 1/(t + 1)², i.e., x (1) = x(0) 1 d (0+1)2 dx etc? " x=x(0) Write out the first 8 iterates (use a calculator, round your result to 3 significant digits, e.g., 0.1234 ≈ 0.123) x(1), x(2), x(3), x(4), x(5), x(6), x(7), x(8). Can you guess, will ✗(t) converge to a specific value? Will f(x) be minimized? Briefly explain your reasoning. Hint: 2 1 Π ≈ 1.6449. t2 6 t=1 Now consider using gradient descent to minimize the function f(x) = 1—1—1 + x², x The derivative of f(x) is given as follows: d -f(x) dx x=x* x², x ЄR. = x* for all x* E R. Suppose that you use gradient descent to minimize f(x). (d)Consider gradient descent with constant learning rate n = 2. If we set the initial state x(0) = 2.5, write out the first 4 iterates x(¹), x(2), x(3), x (4); If we set the initial state x(0) = 1.73234, write out the first 4 iterates x (1), x (2), x(3), x(4). (e) Consider gradient descent with constant learning rate 0 < n < 2. Can you guess, will x(t) converge to a specific value? Will f(x) be minimized? Briefly explain your reasoning. (f) Consider gradient descent with constant learning rate ŋ > 2. Can you guess, will x(t) converge to a specific value? Will f(x) be minimized at that value? Briefly explain your reasoning.