Suppose we have a set of points in 2D (2-feature vectors), drawn i.i.d. from some unknown distribution D, each labelled 0 (-) or 1 (+). Suppose the origin of the labels are unknown (agnostic setting). Consider a hypothesis class H formed by 2-dimensional convex quadrilaterals such that for a disk h = H, we have h(x) = 1 if the point is a part of h (inside or on the circumference of the quadrilateral h) and h(x) = 0 otherwise. Here are some examples of convex quadrilaterals: (a) Describe a polynomial time algorithm that runs in polynomial-time (in m) and outputs a hypothesis from H that is most consistent with the labelled data. That is, a convex quadrilateral with minimum empirical error. Describe the asymptotic running time of your algorithm. (b) What is the VC dimension of a convex quadrilateral? Provide a proof of your answer. This proof requires two parts: a proof that at least d points can be shattered with a disk, and a proof that no more than d points can be shattered for some d. (c) Use the result from part (b) to specify an upper bound for the sample size to achieve PAC guarantees using H with given accuracy and confidence parameters and б.
Suppose we have a set of points in 2D (2-feature vectors), drawn i.i.d. from some unknown distribution D, each labelled 0 (-) or 1 (+). Suppose the origin of the labels are unknown (agnostic setting). Consider a hypothesis class H formed by 2-dimensional convex quadrilaterals such that for a disk h = H, we have h(x) = 1 if the point is a part of h (inside or on the circumference of the quadrilateral h) and h(x) = 0 otherwise. Here are some examples of convex quadrilaterals: (a) Describe a polynomial time algorithm that runs in polynomial-time (in m) and outputs a hypothesis from H that is most consistent with the labelled data. That is, a convex quadrilateral with minimum empirical error. Describe the asymptotic running time of your algorithm. (b) What is the VC dimension of a convex quadrilateral? Provide a proof of your answer. This proof requires two parts: a proof that at least d points can be shattered with a disk, and a proof that no more than d points can be shattered for some d. (c) Use the result from part (b) to specify an upper bound for the sample size to achieve PAC guarantees using H with given accuracy and confidence parameters and б.
Related questions
Question
100%
Expert Solution
This question has been solved!
Explore an expertly crafted, step-by-step solution for a thorough understanding of key concepts.
Step by step
Solved in 2 steps with 1 images