Midterm Exam Solutions for ROB501H1 Computer Vision

UNIVERSITY OF TORONTO Faculty of Applied Science and Engineering ROB501H1 F – Computer Vision for Robotics Midterm Exam Instructor: Prof. Jonathan Kelly October 31, 2022 Name : Student Number : Time allowed: 1 hour 50 minutes This exam contains 12 pages (including this cover page) and 4 questions. Answer each question in the spaces provided on the exam paper; some spaces span more than one page. You may use the back of the page for scratch work or additional space, provided you indicate clearly whether it is part of your final solution. A double-sided 8.5” × 11” aid sheet is permitted. Good luck! Distribution of Marks Question Points Score Multiple Choice 10 Short Answers 20 Long Answer – Least Squares & RANSAC 25 Long Answer – Panorama Construction 20 Total: 75 1

ROB501H1 F Computer Vision for Robotics October 31, 2022 QUESTION 1: MULTIPLE CHOICE (10 points) Select one answer only for each question. Each question is worth 2 points. (a) The SAE Levels of Driving Automation guidelines define Level 4 as which of the following? ⃝ Full Automation ⃝ Conditional Automation √ High Automation (b) An affine transformation preserves which of the following? ⃝ Angle √ Parallelism ⃝ Length ⃝ Orientation (c) Bayes’ Theorem can be stated as: p ( x | y ) = p ( y | x ) p ( x ) p ( y ) . The term p ( y ) is referred to as which of the following? ⃝ Prior √ Evidence ⃝ Posterior ⃝ Likelihood (d) Orthographic projection is a suitable approximation/model for which of the following real- world lens types only ? ⃝ Fisheye ⃝ Macro √ Telephoto ⃝ Wide-Angle (e) Choose the feature detection algorithm with the lowest computational cost. ⃝ SURF √ BRISK ⃝ SIFT Page 2 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 QUESTION 2: SHORT ANSWERS (20 points) Answer each question briefly but with sufficient detail (write on the back of the page if necessary). (a) (3 points) We reviewed homogeneous coordinates (for 2D and 3D points) in the lectures (i.e., in 2D, ˜ p = [˜ x ˜ y ˜ w ] T ). Why are homogeneous coordinates so useful (in the context of projective geometry), that is, what do they allow us to do? Solution: All points along a ray are equivalent in projective space. Homogenous coor- dinates allow us to easily handle points at an infinite distance from the ‘camera’ (with ˜ w = 0 ) just like any other points. (b) (4 points) Describe the cause of chromatic aberration and its effect on camera images. Solution: The index of refraction of all real-world lenses varies with the wavelength of the incident light. In turn, each wavelength is ‘bent’ (refracted) by the lens by a slightly dif- ferent amount. The result is that different wavelengths are focused at different distances (relative to the optical centre), causing blurring or smearing of colours in the image. (c) (4 points) Explain why using the inverse warping geometric transform yields better results in terms of image quality for the transformed output (compared to forward warping). Solution: Using the inverse transform allows for principled interpolation of the reference image (e.g., bilinear interpolation), while the use of the forward transform requires a choice of ‘pixel correspondence’ that is ambiguous. The forward transform typically results in aliasing and blur. Page 3 of 12

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

ROB501H1 F Computer Vision for Robotics October 31, 2022 (d) (5 points) What is the aperture problem ? When considering image features, what character- istic(s) should we look for in an effort to avoid this problem? Solution: When viewing only a small portion of an image (through a small aperture), textureless regions and linear edges do not fully constrain the position of the region (i.e., the neighbourhood of a point). Instead, we should look for regions that have strong gradients in two directions in image space. (e) (4 points) What, concisely, is the universal approximation theorem for MLPs? Solution: Feed-forward neural networks are universal approximators. A single hidden layer is enough to approximate (almost) any function to arbitrary accuracy given enough neurons (width). Page 4 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 QUESTION 3: LONG ANSWER – LEAST SQUARES & RANSAC (25 points) Fitting an ellipse to a set of image plane points is a common task in robotic vision. In this question, you will walk through the problem of ellipse-fitting from end to end. (a) (4 points) To fit an ellipse, you much first choose a model. You decide to assume that the major and minor axes of the ellipse will always be aligned with the image plane x and y axes. However, the origin of the ellipse might be anywhere on the image plane. Write down your model equation(s). Solution: A possible model equation for an ellipse centred at ( c x , c y ) is a ( x − c x ) 2 + b ( y − c y ) 2 = 1 Expanding, grouping terms, and rearranging, we obtain ax 2 − 2 c x x + c 2 x + by 2 − 2 c y y + c 2 y = 1 , ax 2 + by 2 + dx + ey + f = 0 , which is the equation for a general conic section (almost, the xy term is missing). Alternatively, a model formulation that enforces positive coefficients (and hence the shape) could be a 2 ( x − c x ) 2 + b 2 ( y − c y ) 2 = 1 (b) (3 points) You choose to solve the fitting problem using least squares, given that you have a series of ( x i , y i ) , i = 1 , . . . , m , pairs as measurements, where the x i values are treated as being noise-free. Is this a linear or nonlinear least squares problem and why? Solution: If we choose the (general) conic section formulation, then this is a linear least squares problem because the model function is linear with respect to the parameters. If we choose the formulation where the parameters (coefficients) are squared, then this is a nonlinear least squares problem. Page 5 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 (c) (5 points) Write down the form of (i.e., entries in) the Jacobian matrix and also clearly show the size of the Jacobian matrix. Also write down the parameter vector θ or the parameter update vector δθ , depending on your answer to (b) above. Solution: The exact form of the Jacobian matrix depends upon the model. A simple linear algorithm might attempt to minimize the sum of the squares of the distances from zero (see the linear equation above). The Jacobian is n × 5 in this case and the parameter vector is θ = [ a b d e f ] ] . Alternatively, there are four parameters in the nonlinear case and so the Jacobian is n × 4 and the parameter update vector is δθ = [ a b c d ] ] . The Jacobian gets a bit messy, however. (d) (4 points) You are happily working with your new ellipse-fitting routine when a co-worker stops by and provides you with the data shown in the plot below. What problem(s), if any, might this cause for your fitting function? How would you solve it/them? x y x x x x x x x x x x x x Solution: The data are sparse and clustered in one region only. If the linear algorithm is selected, a parabola or hyperbola might be recovered—the solution is underconstrained and therefore, roughly, not unique (the result will depend entirely on the error function). If a nonlinear algorithm is applied, the result will be similar, a bad fit and a distorted ellipse. Also, the nonlinear algorithm will be sensitive to the initial guess (since there are not enough data to ‘lock things down’). Possible solutions: ask for more data or impose some additional constraints. Perhaps, for example, the maximum and minimum sizes of any possible ellipse are known—this information could be used to solve a constrained optimization problem. Page 6 of 12

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

ROB501H1 F Computer Vision for Robotics October 31, 2022 (e) (2 points) Given your choice of model, what is the minimum number of ( x i , y i ) pairs required to fit the model and why? Solution: In general, one equation (pair) is required per unknown (to solve the linear problem or for the Jacobian to be full rank in the nonlinear case). The answer depends on the model that has been chosen. (f) (2 points) Given an inlier ratio of w = 50% (yikes!), how many iterations of RANSAC would be required to ensure with z = 87.5% probability that an outlier-free set of points is selected? A closed-form solution is not needed, you may simply write down the required equation, but be sure to fill in all of the values! Solution: Making use of the equation given in the RANSAC paper (and the lecture notes), assuming that four points are required: k = log (1 − z ) log (1 − w n ) = log (1 − 0 . 875) log (1 − 0 . 5 4 ) = − 3 − 0 . 0931 ≈ 33 (g) (5 points) Frustrated by outliers, but bound by limited resources, you decide to tolerate one potential outlier when implementing a modified version of RANSAC. Derive a general expres- sion for the expected number of trials k required to select a set of points with at most one outlier (in terms of w and n ). Simplify the expression as much you can. Solution: This problem has some subtleties. Since the expectation is desired, we first need to determine the probability of the event “ n number of samples were selected, con- taining at most one outlier.” The probability of drawing n samples with at most one outlier is the sum of the probabil- ities of the ways this can happen. We might draw n outlier-free samples consecutively, or draw one outlier, then draw n − 1 outlier-free samples, etc. The overall probability is b = w n + ( n 1 ) (1 − w ) w n − 1 = w n + n (1 − w ) w n − 1 The same expression for the power series that appears in the RANSAC paper can be used again, hence we have E [ k ] = 1 b = 1 w n + n (1 − w ) w n − 1 Page 7 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 Extra space to show your work if needed. Page 8 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 QUESTION 4: LONG ANSWER – PANORAMA CONSTRUCTION (20 points) Your PanoMania ® startup is doing well in the online photography business! With the goal of increasing revenue, you decide to revisit some of the parts of your panorama-prototype, to better understand where improvements might be made. (a) (4 points) Your software currently uses the Harris corner detector, which your COO believes is offset-invariant (i.e., its response does not change when the intensity of all image pixels is adjusted up or down by a fixed amount). Is the COO correct? Why or why not (briefly)? Solution: The COO is correct! The Harris detector relies on image gradients —the gradi- ents do not change if the intensity of all image pixels is adjusted up or down by a fixed amount. (b) (3 points) The DLT algorithm used to solve for a perspective homography yields solutions that lie within the 1D nullspace of the matrix A . Explain why this is the case. Solution: There are nine entries in the homography matrix, but only eight separate con- straints (i.e., in A ) are provided, so the DLT algorithm cannot provide a unique solution. Instead, any solution that lies in the nullspace of A is acceptable, because the homography is defined up to an arbitrary scaling only. Page 9 of 12

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

ROB501H1 F Computer Vision for Robotics October 31, 2022 (c) (4 points) Given more than four point correspondences between images, ( x i , x ′ i ) , you decide to use least squares (again) to fit your model homography H . Considering the error or distance function only, your VP of Marketing suggests that a suitable cost is C = ∑ i d ( x ′ i , Hx i ) where d ( · , · ) is the Euclidean distance between pixel coordinates. What is wrong, potentially, with the VP’s suggestion? Solution: The VP has failed to consider that the ‘distance’ metric being applied is not symmetric. If x i and x ′ i are swapped (and the homography is inverted), the result will be different, potentially quite significantly. (d) (4 points) One of your employees argues that the best way to build panoramas is always to assemble the sub-images starting from the upper left corner of the complete panorama (given a known, final size), working sequentially from left to right. Another employee asserts that this is not the correct approach and that, instead, the maximum number of transformations applied consecutively should always be minimized. Is the second employee right or wrong? Why? Your argument should be numerical, but you do not need to use actual numbers, unless this is helpful. Solution: A simple numerical argument is sufficient. The problem is noise: feature corre- spondences are never exact, so any homography will also have error. To build a panorama, images are typically warped in sequence—each additional warping (application of an H matrix) adds distortion. In turn, applying long chains of transforms tends to yield terrible results. Therefore, the second employee is right, using the least number of consecutive transforms should give better outputs (panorama quality). Page 10 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 Extra space to show your work if needed. (e) (5 points) Buyers tend to dislike vignetting in mosaicked images. Provide high-level pseu- docode for a simple method to reduce vignetting. Python-style code is not required—bullets are sufficient, as long as variables are clearly defined. Hint: where is there the least vignetting? Solution: A wide variety of solutions exist. The answer should make use of the fact that pixels closer to the centre of an image have less vignetting and can be used to determine what (nonlinear) correction to apply to image brightness in regions where two or more images overlap. Page 11 of 12

ROB501H1 F Computer Vision for Robotics October 31, 2022 Extra space to show your work if needed. Page 12 of 12

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

ROB501H1_20229_621670879777rob501_fall_2022_midterm_soln

Related Documents