Relationship of aspect ratio to the homography matrices between a rectangle and an arbitrary quadrilateral

I’ve been reading everything I can on the perspective mapping between a 2D rectangle and the projection onto a plane of a rectangle in 3D space.

I’ve learned that any such quadrilateral resulting from the projection can be mapped to any rectangle.

I’ve learned that the only constraint is that the quadrilateral must be convex: no three points may lie on the same line and, if the points have an ordering, no two edges may cross each other.

Unfortunately I don’t understand all the maths, but it seems that such a mapping ignores the aspect ratio (width to height) of the rectangle. This is most apparent when mapping between a portrait and a landscape rectangle of the same dimensions: perspective alone, without scaling the width relative to the height, can never produce such a result.

So far I have not been able to read anywhere how this aspect ratio is encoded in the transformation matrix.

I need to understand this so that I can:

  • Find the aspect ratio of a projected quad
  • Decide whether a quad can be a mapping of a rect of a given aspect ratio
  • When projecting a rect to a quad, take either the width or the height as given and calculate the other

What is the missing ingredient? (If my assumptions are wrong, explaining how is also an acceptable answer of course.)


The short answer is that you can’t do any of the things that you list near the end of your question without some information about the plane being mapped or the mapping itself beyond the quadrilateral image of a rectangle. You mention the barrier to doing so earlier in the question:

“I’ve learned that any such quadrilateral resulting from the projection can be mapped to any rectangle.”

Information is lost when a general projective transformation is applied to a rectangle: neither lengths of line segments, ratios of lengths nor angles are preserved by such a transformation.
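To make the loss of information concrete, here is a minimal sketch in plain Python. It is not taken from any particular library: the function names `square_to_quad`, `rect_to_quad` and `apply_h` are made up for illustration, and the closed-form unit-square-to-quad mapping is the standard one obtained by solving the four corner correspondences directly. It assumes the corners are listed in order around a convex quad. The point is that a rectangle of any width and height reaches the same quad, because the scaling $\operatorname{diag}(1/w,1/h,1)$ composed with a homography is again a homography.

```python
def square_to_quad(quad):
    """Homography taking the unit square (0,0),(1,0),(1,1),(0,1) to `quad`.

    `quad` is a list of four (x, y) corners in order around a convex quad.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = quad
    dx1, dx2, sx = x1 - x2, x3 - x2, x0 - x1 + x2 - x3
    dy1, dy2, sy = y1 - y2, y3 - y2, y0 - y1 + y2 - y3
    den = dx1 * dy2 - dx2 * dy1  # non-zero when no three corners are collinear
    g = (sx * dy2 - dx2 * sy) / den
    h = (dx1 * sy - sx * dy1) / den
    return [[x1 - x0 + g * x1, x3 - x0 + h * x3, x0],
            [y1 - y0 + g * y1, y3 - y0 + h * y3, y0],
            [g, h, 1.0]]


def rect_to_quad(width, height, quad):
    """Homography taking the rectangle [0,width] x [0,height] to `quad`."""
    H = square_to_quad(quad)
    # Right-multiplying by diag(1/width, 1/height, 1) rescales the source axes.
    return [[row[0] / width, row[1] / height, row[2]] for row in H]


def apply_h(H, point):
    """Apply a 3x3 homography to an inhomogeneous 2D point."""
    x, y = point
    u, v, w = (r[0] * x + r[1] * y + r[2] for r in H)
    return (u / w, v / w)


quad = [(0.0, 0.0), (4.0, 0.0), (3.0, 2.0), (1.0, 3.0)]
for w, h in [(2.0, 1.0), (1.0, 2.0), (5.0, 5.0)]:
    H = rect_to_quad(w, h, quad)
    corners = [(0.0, 0.0), (w, 0.0), (w, h), (0.0, h)]
    print((w, h), [apply_h(H, c) for c in corners])
```

Every rectangle in the loop lands on the same four corners, so the quad by itself cannot tell you which aspect ratio it came from.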

[Most of what follows is taken from Hartley and Zisserman’s Multiple View Geometry In Computer Vision, which is well worth reading if you’re going to be pursuing these topics.]

It’s useful here to view the homography $H$ between the rectangle’s plane and the image plane as a cascade of ever more restrictive transformations. That is, $H=H_PH_AH_S$, where $H_S$ is a similarity, $H_A$ is an affine transformation, and $H_P$ is a sort of “primitive” projection known as an elation. Undoing each stage of this cascade requires progressively more information about the source or image. Looking at $H$ in this way tells you how much you need to undo, and so also how much you need to know about the source and image.
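For reference, and following Hartley and Zisserman’s notation as best I can reproduce it here (so treat the exact parametrization as an assumption and check it against the book), the three factors have the block forms

$$H_S=\begin{bmatrix}sR&\mathbf t\\\mathbf 0^T&1\end{bmatrix},\qquad H_A=\begin{bmatrix}K&\mathbf 0\\\mathbf 0^T&1\end{bmatrix},\qquad H_P=\begin{bmatrix}I&\mathbf 0\\\mathbf v^T&v\end{bmatrix},$$

where $R$ is a $2\times2$ rotation, $s$ a scale factor, $\mathbf t$ a translation, $K$ an upper-triangular matrix with $\det K=1$, and $\mathbf v$ a 2-vector. Only $H_P$ has a last row different from $(0,0,1)$, which is why it is the only factor that moves the line at infinity $\mathbf l_\infty=(0,0,1)^T$. The aspect ratio lives in $K$: it is the part that can stretch one direction relative to another without disturbing parallelism.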

Finding the aspect ratio of a rectangle requires comparing lengths of line segments on non-parallel lines. This is not an affine property: even knowing one of the source side lengths doesn’t allow you to find the other. So, to do this you’ll need to rectify the image up to a similarity, i.e., you effectively need to undo both $H_P$ and $H_A$.

Knowing that the quad is the image of a rectangle allows you to undo $H_P$ (actually it’s enough to know that it’s the image of a parallelogram): you can intersect each pair of opposite sides of the quad to find two vanishing points of the rectangle’s plane, and the line through them is the plane’s vanishing line. $H_P$ is the only part of the transformation cascade which moves the line at infinity, so you can now undo this part of the cascade by mapping the vanishing line in the image back to the line at infinity. With this rectification, you can measure affine properties of the image, but as noted above, this isn’t enough to recover the rectangle’s aspect ratio. You’ve eliminated general convex quadrilaterals as pre-images of the quad, but it can still be the image of an arbitrary parallelogram.
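The affine rectification described above can be sketched in a few lines of plain Python. The names `vanishing_line` and `affine_rectify` are hypothetical, and the sketch assumes the corners are given in order around the quad and that the vanishing line does not pass through the image origin (otherwise its third coordinate is zero and the coordinates should be translated first). Lines and intersection points are both computed as cross products of homogeneous 3-vectors.

```python
def cross(a, b):
    """Cross product of two 3-vectors (join of points / meet of lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def vanishing_line(quad):
    """Vanishing line of the plane, from the two pairs of opposite sides."""
    p = [(x, y, 1.0) for x, y in quad]
    sides = [cross(p[i], p[(i + 1) % 4]) for i in range(4)]
    v1 = cross(sides[0], sides[2])  # vanishing point of one pair of sides
    v2 = cross(sides[1], sides[3])  # vanishing point of the other pair
    return cross(v1, v2)            # line through both vanishing points


def affine_rectify(quad):
    """Map the vanishing line back to infinity; returns the rectified corners.

    Uses the homography [[1,0,0],[0,1,0],[l1,l2,l3]] (scaled so l3 = 1).
    """
    l1, l2, l3 = vanishing_line(quad)
    if abs(l3) < 1e-12:
        raise ValueError("vanishing line passes through the origin; translate first")
    rectified = []
    for x, y in quad:
        w = (l1 * x + l2 * y + l3) / l3
        rectified.append((x / w, y / w))
    return rectified


quad = [(0.0, 0.0), (4.0, 0.0), (3.0, 2.0), (1.0, 3.0)]
print(affine_rectify(quad))
```

The output is a parallelogram: opposite sides are parallel and equal again, but the angle between adjacent sides and their relative lengths are still whatever the remaining affine part of the transformation made them.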

There are various ways to continue on to a metric rectification of the image from here. Mathematically, they all come down to identifying the images of the circular points or, equivalently, the image of the conic dual to them, $C_\infty^*$. Since you know that the quad is the image of a rectangle, a minimal approach is to find the images of two pairs of orthogonal lines. Two lines $\mathbf l$ and $\mathbf m$ are orthogonal if $\mathbf l^TC_\infty^*\mathbf m=0$, so each such pair of lines produces a linear constraint on the elements of the matrix $C_\infty^*$. In the affinely rectified image this matrix is symmetric and its last row and column are zero, so there are only three independent elements; since the matrix is only defined up to scale, two independent pairs of lines are enough to identify $C_\infty^*$. The “independent” part of that conclusion prevents us from using the quad to do this: it gives four pairs of perpendicular lines, one at each corner, but every pair consists of the same two directions, so taking a second pair of sides from the quad doesn’t add any new constraint on $C_\infty^*$. You need to find some other pair of orthogonal lines that aren’t parallel to the rectangle’s sides. Other methods of metric rectification, including ones that get you to this point directly without constructing the line at infinity first, also require information beyond that provided by the image quad.
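Written out, with my own labels $s_{11},s_{12},s_{22}$ for the unknown entries (an assumption of notation, not something from the question), the affinely rectified dual conic has the form

$$C_\infty^{*}=\begin{bmatrix}s_{11}&s_{12}&0\\s_{12}&s_{22}&0\\0&0&0\end{bmatrix},$$

and an orthogonal pair $\mathbf l=(l_1,l_2,l_3)^T$, $\mathbf m=(m_1,m_2,m_3)^T$ contributes the single linear equation

$$\begin{pmatrix}l_1m_1&l_1m_2+l_2m_1&l_2m_2\end{pmatrix}\begin{pmatrix}s_{11}\\s_{12}\\s_{22}\end{pmatrix}=0.$$

Only the first two coordinates of each line appear, and parallel lines share those up to a scale factor. That is why the four corners of the rectified parallelogram all produce the same equation up to scale: the system has rank one, where rank two is needed to fix $(s_{11},s_{12},s_{22})$ up to scale.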

That’s the theory, at any rate. In practice, because of roundoff and other errors in the projection that produces the image, you might actually be able to get something useable from the pairs of edges that meet at diagonally opposite corners of the quad. I’d be concerned about the numerical stability and accuracy of such a solution, though. It would be better to perform the metric rectification using more data points from the image.
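As an illustration of what one extra piece of information buys you, here is a minimal sketch in plain Python, under assumptions of my own: the image has already been affinely rectified, and besides the two sides of the parallelogram you somehow know one more pair of directions in the image that are orthogonal on the source plane. The names `constraint_row`, `quad_form` and `aspect_ratio` are hypothetical. For brevity the sketch works with direction vectors and the inverse $M$ of the upper-left $2\times2$ block of $C_\infty^*$ (orthogonal directions satisfy $\mathbf d^TM\mathbf e=0$), which is equivalent to the line formulation.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def constraint_row(d, e):
    """Coefficients of (m11, m12, m22) in d^T M e = 0 for symmetric 2x2 M."""
    return (d[0] * e[0], d[0] * e[1] + d[1] * e[0], d[1] * e[1])


def quad_form(m, d):
    """d^T M d, proportional to the squared source-plane length of d."""
    m11, m12, m22 = m
    return m11 * d[0] * d[0] + 2 * m12 * d[0] * d[1] + m22 * d[1] * d[1]


def aspect_ratio(side_a, side_b, extra_pair):
    """Ratio |a| / |b| of the source rectangle's sides.

    side_a, side_b: adjacent edge vectors of the affinely rectified quad.
    extra_pair: two image directions known to be orthogonal on the source plane.
    """
    r1 = constraint_row(side_a, side_b)
    r2 = constraint_row(*extra_pair)
    m = cross(r1, r2)  # null vector of the 2x3 constraint system
    qa, qb = quad_form(m, side_a), quad_form(m, side_b)
    if qa * qb <= 0:
        raise ValueError("constraints are dependent or inconsistent")
    return math.sqrt(qa / qb)


# Synthetic check: a 3 x 2 rectangle and the orthogonal directions (1,1), (1,-1),
# all pushed through the affine map [[2, 1], [0.5, 1.5]].
print(aspect_ratio((6.0, 1.5), (2.0, 3.0), ((3.0, 2.0), (1.0, -1.0))))
```

If the extra pair is parallel to the rectangle’s own sides, the two constraint rows are proportional, the null vector is zero, and the function raises: that is the “independent” requirement in executable form. With noisy measurements you would want more than the minimum number of constraints and a least-squares solve instead of a single cross product.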