Exercise 1. A connected component in an undirected graph is a subgraph C with these two properties: C is connected, and no edges exist between nodes in C and nodes outside C. We consider the following problem of splitting a graph into small pieces by deleting some nodes: Given a graph G = (V, E) and an integer c, delete a subset U ⊆ V of nodes (and all their incident edges) from G such that, in the remaining graph, every connected component has at most c nodes, and the size |U| is as small as possible.

The problem appears, e.g., in data analysis, where the nodes represent data items, an edge means similarity, and the data shall be partitioned into small clusters while discarding as few data items as possible.

Give a polynomial-time algorithm with approximation ratio no worse than c + 1. That is, if there exists a solution U with |U| = k, your algorithm should delete at most (c + 1)k nodes. The approximation ratio is generous, but make sure that you prove it accurately, and argue why your algorithm needs only polynomial time.

Advice: It is tempting to greedily delete nodes of highest degree, since this removes many edges. However, this approach fails: the number of deleted edges is not directly related to the sizes of the remaining connected components. (This trap is not obvious, so we mention it here to avoid frustration.) Instead, the following route is recommended: first study the special case c = 1 for a while, and then try to generalize your observations.
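Following the hint: for c = 1 the problem is exactly vertex cover, where deleting both endpoints of every edge in a maximal matching gives a 2-approximation. One natural generalization, sketched below under our own naming (the function `shatter` and the adjacency-dict representation are illustrative assumptions, not part of the exercise), repeatedly finds any connected set of c + 1 surviving nodes and deletes all of them. This is a hedged sketch of one possible approach, not necessarily the intended solution.

```python
from collections import deque

def shatter(adj, c):
    """Greedy (c+1)-approximation sketch (hypothetical, not the official
    solution).  adj maps node -> set of neighbours; returns the set U
    of deleted nodes."""
    deleted = set()
    while True:
        witness = None
        # Search for a connected set of c+1 surviving nodes via BFS.
        for start in adj:
            if start in deleted:
                continue
            seen, queue = {start}, deque([start])
            while queue and len(seen) <= c:
                v = queue.popleft()
                for w in adj[v]:
                    if w not in seen and w not in deleted:
                        seen.add(w)
                        queue.append(w)
                        if len(seen) > c:
                            break
            if len(seen) > c:      # found c+1 connected nodes
                witness = seen
                break
        if witness is None:        # every component now has <= c nodes
            return deleted
        # Any feasible solution must delete at least one node of this
        # witness, and the witnesses found are pairwise disjoint.
        deleted |= witness
```

Each round runs at most one BFS per node, so the loop is polynomial. For the ratio: the deleted witnesses are disjoint connected sets of exactly c + 1 nodes, and no connected set of c + 1 nodes can survive intact in a feasible solution, so an optimum of size k forces at most k rounds, i.e. at most (c + 1)k deletions.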
Follow-up Question
The answer you provided me was rejected, with the following comment: "You still insist on this approach, basically disregarding the feedback. I could almost literally repeat my earlier comments. 'The approximation ratio c+1 is based on the observation ...', right, but this observation is then not used at all. There is no real analysis; you only claim the desired ratio, but it does not follow from anything that is said."