Suppose we have a set of points in 2D (2-feature vectors), drawn i.i.d. from some unknown distribution D, each labelled 0 (-) or 1 (+). Suppose the origin of the labels are unknown (agnostic setting). Consider a hypothesis class H formed by 2-dimensional convex quadrilaterals such that for a disk h = H, we have h(x) = 1 if the point is a part of h (inside or on the circumference of the quadrilateral h) and h(x) = 0 otherwise. Here are some examples of convex quadrilaterals: (a) Describe a polynomial time algorithm that runs in polynomial-time (in m) and outputs a hypothesis from H that is most consistent with the labelled data. That is, a convex quadrilateral with minimum empirical error. Describe the asymptotic running time of your algorithm. (b) What is the VC dimension of a convex quadrilateral? Provide a proof of your answer. This proof requires two parts: a proof that at least d points can be shattered with a disk, and a proof that no more than d points can be shattered for some d. (c) Use the result from part (b) to specify an upper bound for the sample size to achieve PAC guarantees using H with given accuracy and confidence parameters and б.

icon
Related questions
Question
100%
Suppose we have a set of points in 2D (2-feature vectors), drawn i.i.d. from some unknown distribution D, each labelled 0
(-) or 1 (+). Suppose the origin of the labels are unknown (agnostic setting). Consider a hypothesis class H formed by
2-dimensional convex quadrilaterals such that for a disk h = H, we have h(x) = 1 if the point is a part of h (inside or on
the circumference of the quadrilateral h) and h(x) = 0 otherwise.
Here are some examples of convex quadrilaterals:
(a) Describe a polynomial time algorithm that runs in polynomial-time (in m) and outputs a hypothesis from H that is
most consistent with the labelled data. That is, a convex quadrilateral with minimum empirical error. Describe the
asymptotic running time of your algorithm.
(b) What is the VC dimension of a convex quadrilateral? Provide a proof of your answer. This proof requires two parts:
a proof that at least d points can be shattered with a disk, and a proof that no more than d points can be shattered
for some d.
(c) Use the result from part (b) to specify an upper bound for the sample size to achieve PAC guarantees using H with
given accuracy and confidence parameters and б.
Transcribed Image Text:Suppose we have a set of points in 2D (2-feature vectors), drawn i.i.d. from some unknown distribution D, each labelled 0 (-) or 1 (+). Suppose the origin of the labels are unknown (agnostic setting). Consider a hypothesis class H formed by 2-dimensional convex quadrilaterals such that for a disk h = H, we have h(x) = 1 if the point is a part of h (inside or on the circumference of the quadrilateral h) and h(x) = 0 otherwise. Here are some examples of convex quadrilaterals: (a) Describe a polynomial time algorithm that runs in polynomial-time (in m) and outputs a hypothesis from H that is most consistent with the labelled data. That is, a convex quadrilateral with minimum empirical error. Describe the asymptotic running time of your algorithm. (b) What is the VC dimension of a convex quadrilateral? Provide a proof of your answer. This proof requires two parts: a proof that at least d points can be shattered with a disk, and a proof that no more than d points can be shattered for some d. (c) Use the result from part (b) to specify an upper bound for the sample size to achieve PAC guarantees using H with given accuracy and confidence parameters and б.
Expert Solution
steps

Step by step

Solved in 2 steps with 1 images

Blurred answer