Fall 2024 CS543/ECE549

Assignment 4: Single-view reconstruction

Due date: Tuesday, November 19, 11:59:59PM


Part 1: Shape from shading


The goal of this part is to implement shape from shading as described in
this lecture (see also Section 2.2.4 of Forsyth & Ponce 2nd edition).
  1. Download the data and part 1 starter code. The data consists of 64 images each of four subjects from the Yale Face database. The light source directions are encoded in the file names. We have provided utilities to load the input data and display the output. Your task will be to implement the functions preprocess, photometric_stereo and get_surface in the ipython notebook, as explained below.

  2. For each subject (subdirectory in croppedyale), read in the images and light source directions. The function LoadFaceImages returns the images for the 64 light source directions and an ambient image (i.e., image taken with all the light sources turned off). The LoadFaceImages function is completed and provided to you in the starter code.

  3. Preprocess the data: subtract the ambient image from each image in the light source stack, set any negative values to zero, rescale the resulting intensities to between 0 and 1 (they are originally between 0 and 255). Complete the preprocess function.

  4. Estimate the albedo and surface normals. For this, you need to fill in code in photometric_stereo, which is a function taking as input the image stack corresponding to the different light source directions and the matrix of the light source directions, and returning an albedo image and surface normal estimates. The latter should be stored in a three-dimensional matrix. That is, if your original image dimensions are h x w, the surface normal matrix should be h x w x 3, where the third dimension corresponds to the x-, y-, and z-components of the normals. To solve for the albedo and the normals, you will need to set up a linear system. To get the least-squares solution of a linear system, use the numpy.linalg.lstsq function. Complete the photometric_stereo function.

  5. If you directly implement the formulation from the lecture, you will have to loop over every image pixel and separately solve a linear system in each iteration. There is a way to get all the solutions at once: stack the unknown g vectors for every pixel into a 3 x npix matrix and solve for all of them with a single call to the numpy solver.

    You will most likely need to reshape your data in various ways before and after solving the linear system. Useful numpy functions for this include reshape, expand_dims and stack.

  6. Compute the surface height map by integration. More precisely, instead of continuous integration of the partial derivatives over a path, you will simply be summing their discrete values. Your code implementing the integration should go in the get_surface function. As stated in the slide, to get the best results, you should compute integrals over multiple paths and average the results. Complete the get_surface function.

    You should implement the following variants of integration:
    1. Integrating first the rows, then the columns (the "row" method). That is, your path first goes along the top row of the image to the pixel's column, and then goes vertically down to the pixel. It is possible to implement this without nested loops using the cumsum function.
    2. Integrating first along the columns, then the rows (the "column" method).
    3. Average of the first two options (the "average" method).
    4. Average of multiple random paths (the "random" method). For this, it is fine to use nested loops. You should determine the number of paths experimentally.

  7. Display the results using functions display_output and plot_surface_normals included in the notebook.
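
As a rough illustration of steps 3 to 6, here is a minimal sketch, not a reference solution. It assumes the image stack has shape h x w x n, the ambient image has shape h x w, and the light directions form an n x 3 matrix. The function signatures are assumptions and may differ from the starter code. The partial derivatives are taken as fx = n_x / n_z and fy = n_y / n_z; check this sign convention against the lecture. Only the "row", "column" and "average" integration variants are shown.

```python
import numpy as np

def preprocess(ambient, imarray):
    """Subtract the ambient image, clamp negatives to 0, rescale to [0, 1].
    Assumed shapes: ambient (h, w), imarray (h, w, n), values in [0, 255]."""
    out = imarray.astype(np.float64) - ambient[:, :, np.newaxis]
    return np.clip(out, 0.0, None) / 255.0

def photometric_stereo(imarray, light_dirs):
    """Solve light_dirs @ g = I for all pixels in one least-squares call.
    Returns albedo (h, w) and unit normals (h, w, 3)."""
    h, w, n = imarray.shape
    I = imarray.reshape(h * w, n).T                      # (n, npix)
    g, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)   # (3, npix)
    albedo = np.linalg.norm(g, axis=0)                   # |g| per pixel
    normals = g / np.maximum(albedo, 1e-12)              # guard against zero albedo
    return albedo.reshape(h, w), normals.T.reshape(h, w, 3)

def get_surface(normals, method="average"):
    """Integrate the normal field by summing discrete partial derivatives.
    Assumes n_z is nonzero everywhere."""
    fx = normals[:, :, 0] / normals[:, :, 2]   # derivative along x (axis 1)
    fy = normals[:, :, 1] / normals[:, :, 2]   # derivative along y (axis 0)
    # "row": along the top row to the pixel's column, then down that column.
    row = np.cumsum(fx[0])[np.newaxis, :] + np.cumsum(fy, axis=0)
    # "column": down the first column to the pixel's row, then along that row.
    col = np.cumsum(fy[:, 0])[:, np.newaxis] + np.cumsum(fx, axis=1)
    if method == "row":
        return row
    if method == "column":
        return col
    if method == "average":
        return 0.5 * (row + col)
    raise ValueError(f"unsupported method: {method}")
```

The single lstsq call works because numpy.linalg.lstsq accepts a right-hand side with multiple columns and solves each column independently. The "random" method is not covered by this sketch.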

Part 1 Extra Credit

On this part of the assignment, there are not too many opportunities for "easy" extra credit. This said, here are some ideas for exploration:
  • Generate synthetic input data using a 3D model and a graphics renderer and run your method on this data. Do you get better results than on the face data? How close do you get to the ground truth (i.e., the true surface shape and albedo)?
  • Investigate more advanced methods for shape from shading or surface reconstruction from normal fields.
  • Try to detect and/or correct misalignment problems in the initial images and see if you can improve the solution.
  • Using your initial solution, try to detect areas of the original images that do not meet the assumptions of the method (shadows, specularities, etc.). Then try to recompute the solution without that data and see if you can improve the quality of the solution.
If you complete any work for extra credit, be sure to clearly mark that work in your report.
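
For the synthetic-data idea, a full graphics renderer is not strictly required to get started. The sketch below swaps in an analytically shaded Lambertian sphere, which gives ground-truth normals and albedo to compare against. The function name render_sphere and its parameters are hypothetical, and the model ignores cast shadows and specularities.

```python
import numpy as np

def render_sphere(size, light_dirs, albedo=0.8):
    """Render a Lambertian unit sphere under each light direction.
    Returns images (size, size, n), normals (size, size, 3), mask (size, size)."""
    ys, xs = np.mgrid[0:size, 0:size]
    r = (size - 1) / 2.0
    x = (xs - r) / r                      # image coordinates scaled to [-1, 1]
    y = (ys - r) / r
    z2 = 1.0 - x**2 - y**2
    mask = z2 > 0                         # pixels that fall on the sphere
    z = np.sqrt(np.clip(z2, 0.0, None))
    # Outward unit normals on the sphere, zero outside it.
    normals = np.dstack([x, y, z]) * mask[..., np.newaxis]
    light_dirs = np.asarray(light_dirs, dtype=np.float64)   # (n, 3)
    shading = normals @ light_dirs.T                         # (size, size, n)
    # Lambertian shading: attached shadows are clamped to zero.
    images = albedo * np.clip(shading, 0.0, None)
    return images, normals, mask
```

The returned images are already in the range 0 to 1 for unit-length light directions, so they can be passed to a photometric stereo routine without ambient subtraction.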

Part 2: Single-view metrology

You will be working with the following image of the ECE Building (save it to get the full-resolution version):


  1. Estimate the three major orthogonal vanishing points. Use at least three manually selected lines to solve for each vanishing point. The part 2 starter code provides an interface for selecting and drawing the lines, but the code for computing the vanishing point needs to be inserted. For details on estimating vanishing points, see Derek Hoiem's book chapter (section 4). You should also refer to this chapter and the single-view metrology lecture for details on the subsequent steps. In your report, you should:
    1. Plot the VPs and the lines used to estimate them on the image plane using the provided code.
    2. Specify the VP pixel coordinates.
    3. Plot the ground horizon line and specify its parameters in the form a * x + b * y + c = 0. Normalize the parameters so that: a^2 + b^2 = 1.

  2. Using the fact that the vanishing directions are orthogonal, solve for the focal length and optical center (principal point) of the camera. Show all your work.

  3. Compute the rotation matrix for the camera, setting the vertical vanishing point as the Y-direction, the right-most vanishing point as the X-direction, and the left-most vanishing point as the Z-direction.

  4. Estimate the following heights and show all the lines and measurements used to perform the calculation. As a reference measurement, assume that the person in the picture is 5ft 6in tall.
    1. The left side of the ECE building (the leftmost edge of the metal structure). You should use (371, 315) as the pixel coordinate of the top left corner, and carefully mark the corresponding bottom point yourself.
    2. The right side of the ECE building. Use (1870, 281) as the coordinate of the top right corner, and find the bottom yourself.
    3. The lamp posts. You should carefully choose the best poles to use and mark their top and bottom points. We recommend choosing multiple poles and averaging the measurements.

  5. Recompute the three height estimates from step 4 above assuming that the person is 6ft tall.
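
The geometry behind steps 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the starter code's interface: all function names are hypothetical, the camera is assumed to have square pixels and zero skew, and all three vanishing points are assumed to be finite. The vanishing point is estimated as the least-squares intersection of the lines via SVD, which is one common choice and not necessarily the method in the book chapter.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two pixel points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(lines):
    """Least-squares intersection of homogeneous lines, returned as (x, y)."""
    L = np.asarray(lines, dtype=float)
    L = L / np.linalg.norm(L[:, :2], axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(L)       # v minimizes ||L v|| subject to ||v|| = 1
    v = vt[-1]
    return v[:2] / v[2]               # fails if the VP is at infinity

def horizon_line(vp_a, vp_b):
    """Line (a, b, c) through two horizontal VPs with a^2 + b^2 = 1."""
    l = line_through(vp_a, vp_b)
    return l / np.linalg.norm(l[:2])

def calibrate(vp1, vp2, vp3):
    """Focal length f and principal point p from three orthogonal VPs."""
    v1, v2, v3 = (np.asarray(v, dtype=float) for v in (vp1, vp2, vp3))
    # Orthogonality gives (vi - p).(vj - p) + f^2 = 0 for each pair;
    # subtracting pairs eliminates f and leaves two linear equations in p.
    A = np.array([v2 - v3, v1 - v3])
    b = np.array([v1 @ (v2 - v3), v2 @ (v1 - v3)])
    p = np.linalg.solve(A, b)
    f = np.sqrt(-(v1 - p) @ (v2 - p))
    return f, p

def rotation_from_vps(f, p, vp_x, vp_y, vp_z):
    """Rotation whose columns are the normalized back-projected VPs.
    Each column is only determined up to sign."""
    K_inv = np.linalg.inv(np.array([[f, 0, p[0]], [0, f, p[1]], [0, 0, 1.0]]))
    cols = [K_inv @ np.append(v, 1.0) for v in (vp_x, vp_y, vp_z)]
    return np.stack([c / np.linalg.norm(c) for c in cols], axis=1)

def measure_height(ref_bottom, ref_top, ref_height, bottom, top, vp_vertical, horizon):
    """Height of the object (bottom, top) from a reference object via the cross-ratio."""
    hom = lambda q: np.array([q[0], q[1], 1.0])
    b0, t0, b, r = hom(ref_bottom), hom(ref_top), hom(bottom), hom(top)
    # The ground line joining the two bottoms meets the horizon at v.
    v = np.cross(np.cross(b0, b), horizon)
    # Transfer the reference top onto the target's vertical line.
    t = np.cross(np.cross(v, t0), np.cross(b, r))
    t = t[:2] / t[2]
    b, r, vz = b[:2], r[:2], np.asarray(vp_vertical, dtype=float)
    d = np.linalg.norm
    return ref_height * (d(r - b) * d(vz - t)) / (d(t - b) * d(vz - r))
```

For step 5, only ref_height changes, so every estimate scales by the ratio of the two assumed person heights. The sign ambiguity in rotation_from_vps has to be resolved separately, for example by checking that the result is a proper rotation with determinant +1.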

Part 2 Extra Credit

  • Go to the Beckman quad and physically measure the height of the low stone wall, and use it as a reference height to get more accurate height estimates for the person, ECE building, and lamp posts.
  • Perform additional measurements on the image, such as other parts of the ECE building, windows, trees, etc. Show all your work.
  • Attempt to fit lines to the image and estimate vanishing points automatically either using your own code or code found on the Web.
  • Attempt to create a simple texture-mapped 3D model of the ground plane and the ECE building.
  • Find or take other images with three prominently visible orthogonal vanishing points and demonstrate various measurements on those images.

Submission Instructions

You must upload the following files on Canvas:

  1. Your code in two separate files for part 1 and part 2. The filenames should be lastname_firstname_a4_p1.py and lastname_firstname_a4_p2.ipynb. If you submit a Python notebook, also upload an exported PDF of it.
  2. A report in a single PDF file with all your results and discussion for both parts following this template. The filename should be lastname_firstname_a4.pdf.
  3. All your output images and visualizations in a single zip file. The filename should be lastname_firstname_a4.zip. Note that this zip file is for backup documentation only, in case we cannot see the images in your PDF report clearly enough. You will not receive credit for any output images that are part of the zip file but are not shown (in some form) in the report PDF.

Please refer to course policies on late submission, academic integrity, etc.