Why do we believe the equation $ax+by+c=0$ represents a line?

I’m going for quite a weird question here. As we know, the equation in Cartesian coordinates for a line in 2-dimensional Euclidean geometry is of the form $ax+by+c=0$. I’m wondering why we “believe” the plotted graph is the same “line” as the one in our intuition.

It might sound crazy, but think of the time when there were still no coordinates, no axes, no analytic geometry. When Descartes started to grasp the concept that equations represented geometric figures (or more accurately, loci) he would have tried plotting easy forms first, and what else could be easier than $y=x$ or $y=2x+3$ etc. Plotting those revealed something evidently a line to his (and our) naked eyes, but it wouldn’t be appropriate for a mathematician to conclude from that alone that the figure is actually a “line”, right?

So jumping back to our own time, if we forget for once that $ax+by+c=0$ “is” a line and look at it with fresh eyes, what criteria are we using to say it is one? I’ve tried some standard characterizations of the line (especially the geometrical ones) and haven’t found a satisfactory answer yet. Here are some:

  • The line is the shortest path wrt. the Euclidean distance between two points: this sounds OK, except that the Euclidean distance is based on our intuition that the “distance” is the length of the line connecting two points (or more accurately, the arc length measured along the straight line). Of course, we could argue that it’s fine to define the distance as $\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}$ without referring to the underlying line, but this seems somewhat ad hoc.
  • Given two points, there is exactly one line passing through both: I interpret this as “Given a family of curves, if for any two given points there is exactly one curve in the family passing through both points, then the family is appropriately understood as the family of ‘lines’ and we call the individual curves ‘lines’.” At first glance this holds up to scrutiny, as many families of circles or parabolas or other “curves” don’t satisfy the property (because any family satisfying it must also have the following property as a corollary: any two curves in the family intersect in at most 1 point). But then consider the family $\{ax^3+by^3+c=0|a,b,c \in \mathbb{R}\}$ (or more generally, $\{ax^m+by^n+c=0|a,b,c \in \mathbb{R}\}$ with $m,n$ odd positive integers). This family not only satisfies the 2-point-1-line property, but also the parallel postulate: “given a ‘line’ and a point not on the ‘line’, there is exactly one ‘line’ parallel to the given ‘line’ passing through the given point (where ‘parallel’ means ‘having no intersection’).”
  • A line is a curve that is convex and concave at the same time: This too suffers from the fact that the intuition underlying the definitions of “convex” and “concave” refers to the straight line. For example, we define a curve to be convex if, for any two points on the curve, it lies below the straight line connecting the two points, hence $f(tx+(1-t)y) \leq tf(x)+(1-t)f(y), \forall t \in [0,1]$.
  • Somehow I’m not at ease with characterizations of the line as endpoints of vectors either (for example, the line connecting points $A,B$ as the collection of endpoints of $l(t)=ta+(1-t)b, \forall t \in \mathbb{R}$, where $a,b$ are the position vectors of $A,B$ respectively). It seems to rely on representing vectors in 2-D Euclidean space (or 2-D affine space) by rays, which to me are just half-lines, making the description circular.
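The claim in the second bullet about the cubic family can be checked concretely. Below is a minimal sketch, assuming exact rational arithmetic; the helper names `cubic_curve_through` and `on_curve` are made up for illustration. The idea is that the substitution $u=x^3$, $v=y^3$ is a bijection of the plane that turns each curve $ax^3+by^3+c=0$ into the ordinary equation $au+bv+c=0$, so the 2-point-1-curve property is inherited from the family of straight lines.

```python
from fractions import Fraction

def cubic_curve_through(p, q):
    """Return (a, b, c) with a*x**3 + b*y**3 + c == 0 at both p and q.

    Hypothetical helper: it solves the two-point problem in the
    substituted coordinates u = x**3, v = y**3.
    """
    (x1, y1), (x2, y2) = p, q
    u1, v1, u2, v2 = x1**3, y1**3, x2**3, y2**3
    a = v1 - v2
    b = u2 - u1
    c = u1 * v2 - u2 * v1
    return a, b, c

def on_curve(coeffs, p):
    """Check whether point p satisfies a*x**3 + b*y**3 + c == 0."""
    a, b, c = coeffs
    x, y = p
    return a * x**3 + b * y**3 + c == 0

p = (Fraction(1), Fraction(2))
q = (Fraction(-3), Fraction(5))
coeffs = cubic_curve_through(p, q)
print(coeffs, on_curve(coeffs, p), on_curve(coeffs, q))
```

Because cubing is one-to-one on $\mathbb{R}$, distinct points stay distinct after the substitution, so $a$ and $b$ are never both zero and the curve is determined up to scaling the triple $(a,b,c)$.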

One approach I’m thinking of is looking at the real-life construction of the line. In Euclidean geometry there are two basic tools, the straightedge and the compass. The “definition” (or characterization) of a circle in mechanical terms is the collection of points traced out by the pencil tip while the compass radius is held fixed. This translates to the geometrical definition of the circle as the locus of points equidistant from a given point. Following this train of thought, the line is the locus of… what?

Sorry for the long post and confusing personal criteria of deciding what amounts to a “proper explanation” of a straight line. I’m running out of ideas now, so if you have any, please tell me.

Thank you in advance.

PS. This is not my homework or school research project. I’m just doing this for fun and wanted to hear other people’s views on the topic.

Solutions collected from the web for "Why do we believe the equation $ax+by+c=0$ represents a line?"

Here is something which I’ve learned in the past few months that I think answers your question, despite being a little esoteric. I’ll try to keep it brief.

Start with reasonable axioms defining a plane in geometry (say, Hilbert’s axioms), and to make things nice, we’ll additionally require the geometry to be ordered and that it satisfy Archimedes’ axiom.

It turns out that with diligence, you can construct a field $\Bbb {F}$ such that every point in the geometry can be identified with an ordered pair in $\Bbb F\times \Bbb F$, and by virtue of the way $\Bbb F$ was constructed, it can be proven that each line we started with is exactly the set of points of $\Bbb F\times \Bbb F$ satisfying some equation $ax+by+c=0$ with $a,b,c\in \Bbb F$, where at least one of $a,b$ is nonzero. (The actual construction would be fairly space consuming, and there are at least two different constructions.)

In summary, from a plane with a satisfactory geometry, one can build a field “coordinatizing” it, and it’s built in such a way that the lines look like $ax+by+c=0$ for some $a,b,c$.

For an Archimedean geometry, this field is necessarily a subfield of $\Bbb R$, and if you add yet another “completeness” type axiom to the geometry, then the field will be all of $\Bbb R$.

Understandably, we do not herd schoolchildren through all of this stuff, but we just begin with our famous field $\Bbb R$ and work the other direction 🙂 We just say “lines look like that” and “Hey, look, this is a model of Euclidean geometry. Now go prove stuff.”

If you are really interested in the details of this, you can find complete proofs in Artin’s Geometric Algebra and in Hilbert’s Foundations of Geometry. I bet it’s in more modern texts too, but these are the ones I happen to know. Hilbert proves lines have the form $ax+by+c=0$ in Theorem 34.

Please keep in mind, though, that the general construction deals with more than just subfields of $\Bbb R$. Some of the geometries produce finite fields, and some produce noncommutative division rings (skew fields). And above all, some planar geometries are not suitable even to produce a division ring. (Geometries in which Desargues’s theorem fails are examples where the construction breaks down; if Desargues’s theorem holds but Pascal’s theorem fails, the construction yields a division ring that is not commutative.)

If one defines a line to be the locus of points satisfying an equation of the form $ax+by+c=0$, then one can verify that Euclid’s axioms hold and therefore that one has a good model of Euclidean geometry. So you can start with your intuition, use that intuition to motivate the definition, and then verify that the definition has the properties you want it to have.
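To make “verify that the axioms hold” a little more concrete, here is a small sketch of two such checks, assuming exact rational coordinates. The function names are hypothetical and chosen only for this illustration. It builds the line through two distinct points and the parallel through a given point, directly from the definition of a line as the solution set of $ax+by+c=0$.

```python
from fractions import Fraction

def line_through(p, q):
    """Coefficients (a, b, c) of a line a*x + b*y + c = 0 through distinct p, q."""
    if p == q:
        raise ValueError("two distinct points are required")
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def parallel_through(line, p):
    """Keep (a, b) and choose c so that the new line passes through p."""
    a, b, _ = line
    return (a, b, -(a * p[0] + b * p[1]))

def contains(line, p):
    a, b, c = line
    return a * p[0] + b * p[1] + c == 0

def same_line(l1, l2):
    """For valid triples (a, b not both zero): same locus iff proportional."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    return a1 * b2 == a2 * b1 and a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1

p, q = (Fraction(0), Fraction(3)), (Fraction(1), Fraction(5))
origin = (Fraction(0), Fraction(0))
line = line_through(p, q)               # a multiple of 2x - y + 3 = 0, i.e. y = 2x + 3
shifted = parallel_through(line, origin)
print(line, shifted, same_line(line, shifted))
```

Two triples with the same $(a,b)$ but different $c$ have no common solution, which is the algebraic form of “parallel”. This only spot-checks particular points, of course; the actual verification of the axioms is the general algebra behind these formulas.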

This is no different in principle from, for example, wanting to invent “negative numbers” so that all subtraction problems involving natural numbers have solutions. One has an intuition that expressions of the form $-n$, with certain addition and multiplication rules, are likely to fit the bill. Thus one tentatively takes this as a definition, then verifies that this definition does indeed serve the desired purpose, and that’s all one can reasonably ask for.

In Euclidean geometry, take two perpendicular lines through a point $O$, call one of them the “$x$ axis” and the other the “$y$ axis”. Drawing lines parallel to the axes gives coordinate functions $x(P)$ and $y(P)$ on points $P$ of the plane.

The statement that $ax(P)+by(P)$ is constant for points $P$ on a line $L$ is the same as the statement that the (oriented) area of the triangle cut off by $L$ and the coordinate axes (the triangle’s vertices are $O$, $A = (0,a)$ and $B=(b,0)$) can be decomposed as the sum of the areas of two subtriangles,

Area $OPA$ + Area $OBP$ = Area $OBA$ ,

which (multiplied by $2$) says

$ax(P) + by(P) = 2 \times$ Area $OBA$

To be very precise, these coordinate functions take values that are directed lengths of segments, the terms in the equation are oriented areas, and to get dimensionless numbers $x$ and $y$ one would have to introduce a segment $u$ as unit of measurement and then $x = \frac{x(P)}{u}$, and $y = \frac{y(P)}{u}$. The equation as a whole would be divided by $u^2$ to get an equation of real numbers (which for present purposes means “elements of the field of segment-ratios in the geometry at hand”).
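The area identity above can be spot-checked numerically. The following is only a sketch, assuming coordinates are already dimensionless rationals (that is, the unit segment $u$ has been divided out); `oriented_area` and `check_point` are hypothetical names. It uses the usual signed-area formula, positive for counterclockwise vertex order, and parametrizes $P$ on the line through $A$ and $B$ as $A + t(B-A)$.

```python
from fractions import Fraction

def oriented_area(p, q, r):
    """Signed area of triangle pqr, positive when p, q, r run counterclockwise."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return Fraction(cross) / 2

def check_point(a, b, t):
    """For P = A + t*(B - A), test the area decomposition and a*x + b*y = 2*Area(OBA)."""
    O = (Fraction(0), Fraction(0))
    A = (Fraction(0), Fraction(a))
    B = (Fraction(b), Fraction(0))
    P = (t * B[0], (1 - t) * A[1])
    x, y = P
    lhs = oriented_area(O, P, A) + oriented_area(O, B, P)
    rhs = oriented_area(O, B, A)
    return lhs == rhs and a * x + b * y == 2 * rhs

# t between 0 and 1 puts P inside segment AB; other values put it outside,
# where the orientation (sign) of the areas is what keeps the identity true.
print(check_point(3, 4, Fraction(1, 3)), check_point(3, 4, Fraction(-2)))
```

With this vertex order, Area $OPA$ works out to $\tfrac{1}{2}a\,x(P)$ and Area $OBP$ to $\tfrac{1}{2}b\,y(P)$, which is why doubling the decomposition gives $ax(P)+by(P)$.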

Here’s a video proof in one direction using some geometry and trigonometry ideas. I’m inferring that the definition of a line being used must be a straight figure that connects two points (in the video, B and P).