What dx Actually Means
Copyright (c) 1996 by Kenny Felder
When you were first learning calculus, you learned how to calculate a
derivative and how to calculate an integral. You also learned some
notation for how to represent those things: f'(x) meant the
derivative, and so did dy/dx, and the integral was represented by
the exaggerated S symbol. None of this notation was particularly
meaningful, but you sort of knew what it meant, and eventually life
was comfortable.
And then one day, your professor said something like "In this case, we
divide dx by 2." Or worse, "let's multiply both sides of the equation
by dx." It made no sense. It was like saying "we can ignore the part
that doesn't fit on the page." And your mind spun with questions. "Can
he do that? Does dx really mean something? How and when do you
manipulate it? Is this going to be on the test?"
The answer to the last question, I'm afraid, is yes. Understanding
what dx actually means is not just something you do to feel
philosophically pure. There are certain problems where you really need
to understand dx: not to take a derivative or an integral, but to
figure out what derivative or integral to take! So there is your
motivation for sifting through the rest of this paper. By the time
you're done, I hope you will have a pretty good understanding of what
dx actually means, and be somewhat ready to start applying your
understanding to solve problems.
What dx Actually Means
You remember talking about Δx in a precalculus course. It represents a
distance along the x-axis; or, to put it another way, the difference
between any two values of x. Well, dx means exactly the same thing,
with one key difference: it is a differential distance, which is a
fancy way of saying very, very, very small. In technical terms, dx is
what happens to Δx in the limit when Δx approaches zero.
Now, when you have a quantity whose value is virtually zero, there's
not much you can do with it. 2+dx is pretty much, well, 2. Or to take
another example, 2/dx blows up to infinity. Not much fun there, right?
But there are two circumstances under which terms involving dx can
yield a finite number. One is when you divide two differentials; for
instance, 2dx/dx=2, and dy/dx can be just about anything. Since the
top and the bottom are both close to zero, the quotient can be some
reasonable number. The other case is when you add up an almost
infinite number of differentials: which is kind of like an almost
infinite number of atoms, each of which has an almost zero size,
adding up to a basketball. In both of these cases, differentials can
wind up giving you a number greater than zero and less than infinity:
an actually interesting number. As you may have guessed, those two
cases describe the derivative and the integral, respectively. So let's
talk a bit more about those, one at a time.
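Before moving on, here is a small numerical sketch of those two cases. It is only an illustration: a computer cannot hold a true differential, so "dx" below is just a small ordinary number, and the function names are made up for this example.

```python
# Illustration only: dx here is a small float, not a true differential.

def ratio_of_small_quantities(dx):
    """Divide one tiny quantity (2*dx) by another (dx)."""
    return (2 * dx) / dx


def sum_of_small_pieces(n):
    """Add up n pieces, each of size 1/n."""
    piece = 1.0 / n
    total = 0.0
    for _ in range(n):
        total += piece
    return total


# As dx shrinks, 2 + dx drifts toward 2 and 2/dx grows without bound,
# but the ratio of the two small quantities stays put.
for dx in (0.1, 0.001, 0.00001):
    print("dx =", dx, " 2+dx =", 2 + dx, " 2/dx =", 2 / dx,
          " 2dx/dx =", ratio_of_small_quantities(dx))

# As n grows, each piece shrinks toward zero, but the total stays near 1.
for n in (10, 1000, 100000):
    print("n =", n, " piece =", 1.0 / n, " total =", sum_of_small_pieces(n))
```

The first loop is the derivative case in miniature (a ratio of two small things), and the second is the integral case (a huge number of small things added together).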
The Slope of a Curve
Most of us learned about derivatives in terms of the slope of a curve,
so that is where I'm going to start; but I may take a slightly
different approach than the one you remember.
To start off, remember how you define the slope of a line. You take
any two points on the line, and define the slope of the line as Δy/Δx:
the change in y divided by the change in x, or "the rise over the
run." The slope physically represents how fast the graph is going up.
The great thing about lines is, it doesn't matter where you pick your
two points, the slope will always be the same.
Now, when you want the slope of a curve, you might try to define it
the same way. The problem is, the slope varies from point to point. In
the curve below, I have labeled three points; and you can see that if
we calculated Δy/Δx from A to B we would get a negative slope, from B
to C would give us a positive slope, and from A to C might give us
zero!
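[Figure: a U-shaped curve that falls from point A down to its lowest point B, then rises to point C, which sits at about the same height as A.]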
So it's meaningless to talk about "the slope" of that curve. On the
other hand, you can certainly talk about the slope at A: it's going
down. To quantify this idea, we might pick two points relatively near
A--one just above, and one just below--and calculate Δy/Δx on those
two points. The closer those points are to A (and to each other), the
more accurately they would describe the slope at that point.
So, we're going to invoke a limit, to get "infinitely close" to A. We
will talk about Δy/Δx at points very close to A and see what happens
to that ratio when Δx approaches 0. Δy also approaches 0, of course,
and the ratio of these two tiny numbers approaches the exact slope at
that point.
Since we now have differential intervals (that is, they approach 0),
we designate them with a d instead of a Δ. So we have dy/dx.
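To watch that limit happen, here is a minimal sketch. The curve y = x^2 and the point x = 1 are my own choices for illustration, and for simplicity the second point is taken a step to the right of the first rather than one on each side.

```python
def f(x):
    """An example curve, chosen only for illustration: y = x^2."""
    return x ** 2


def difference_quotient(func, x, delta_x):
    """Rise over run between the points x and x + delta_x."""
    delta_y = func(x + delta_x) - func(x)
    return delta_y / delta_x


# Shrink the run and watch the ratio settle toward the slope at x = 1.
for delta_x in (1.0, 0.1, 0.001, 0.00001):
    print("delta_x =", delta_x,
          " delta_y/delta_x =", difference_quotient(f, 1.0, delta_x))
```

Both the rise and the run get tiny, but their ratio closes in on 2, which is the slope of y = x^2 at x = 1.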
The slope is given by the fraction dy/dx, which is how you have always
written the derivative. But now you see that this is not just an
arbitrary notation; it is actually a fraction, just as it appears.
This may also help explain the chain rule: when they say dz/dx = dz/dy
* dy/dx, they really are just multiplying fractions!
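Here is a rough numerical check of that chain rule claim. The functions y = x^2 and z = sin(y) are assumptions picked just for this sketch, and the small steps stand in for the differentials.

```python
import math


def y_of_x(x):
    """Inner function, chosen for illustration: y = x^2."""
    return x ** 2


def z_of_y(y):
    """Outer function, chosen for illustration: z = sin(y)."""
    return math.sin(y)


def chain_rule_ratios(x, delta_x):
    """Return (delta_z/delta_x, (delta_z/delta_y) * (delta_y/delta_x))."""
    y = y_of_x(x)
    delta_y = y_of_x(x + delta_x) - y
    delta_z = z_of_y(y + delta_y) - z_of_y(y)
    direct = delta_z / delta_x
    product = (delta_z / delta_y) * (delta_y / delta_x)
    return direct, product


direct, product = chain_rule_ratios(1.0, 1e-6)
exact = 2 * 1.0 * math.cos(1.0 ** 2)  # textbook derivative: 2x cos(x^2)
print("delta_z/delta_x          =", direct)
print("product of the two ratios =", product)
print("textbook derivative       =", exact)
```

With finite steps, the delta_y really does cancel like an ordinary fraction, and as the step shrinks both numbers approach the textbook derivative.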
The Area Under a Curve
I've talked about half of the notation of beginning calculus, which is
the dx that keeps popping up. The other half is that integral sign part,
which looks like an elongated "S". In fact, that's kind of what it is:
it stands for sum, as in adding things. In particular, it means you
are adding a virtually infinite number of things. Of course, if the
"things" had finite values, that would always give you an infinite
answer. But the things are our differential values--each one is
roughly zero by itself--so they can actually add up to a reasonable
finite number. Whenever you see the integral sign you should assume
that you are about to add up an infinite number of differential-sized
things.
So to start with, consider everybody's favorite integral problem, the
area under a curve. Just for fun, let's look at the same curve we had
above, and think about the area under A-to-C.
[Figure: the same curve as before, falling from A to B and rising to C.]
Now, if this were a rectangle, we could find the area easily: the area
equals the width times the height, and you're done. The problem is,
the darn height keeps changing on us, as we move from A to C.
So in order to minimize this problem--the problem of the height
changing all the time--we're going to focus on a very small region of
the graph, where the height is relatively stable. Start by picking a
point x somewhere in our graph, and another point just beyond it:
x+dx. Drawing vertical lines at these two points, we get a little
region, shaded dark in the drawing below.
If we treat this region as a rectangle, its area is trivial to
compute. The height is f(x) and the width is dx. Of course, you can
see that the region isn't a rectangle, and the height is only f(x) at
the far left. But as dx becomes smaller--as we bring the right side
toward the left--the height change becomes less significant, and the
region more closely resembles a rectangle. As dx approaches zero, this
approximation becomes perfect: the area of the shaded region is
f(x)dx.
So if the area of that region is f(x)dx, what is the total area under
the curve between A and C? Clearly, it is the sum of the areas of all
the regions between those two points. And that is what the integral
means: in this case, it means we add up all those little regions
between A and C.
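As a sketch of that adding-up, here is the same idea with a small but finite dx. The curve y = x^2 and the interval from 0 to 3 are assumptions made for the example; each strip is treated as a rectangle whose height is f(x) at its left edge, just as described above.

```python
def area_under_curve(func, a, b, n):
    """Add up n thin rectangles of width dx and height func(x)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + i * dx          # left edge of this strip
        total += func(x) * dx   # area of one strip: f(x) dx
    return total


def f(x):
    """Example curve, chosen only for illustration: y = x^2."""
    return x ** 2


# The thinner the strips, the closer the total gets to the true area.
for n in (10, 1000, 100000):
    print("n =", n, " area =", area_under_curve(f, 0.0, 3.0, n))
```

The integral of x^2 from 0 to 3 is exactly 9, and the totals creep toward that value as the strips get thinner.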
Of course, you already knew that. Without understanding what dx or
the integral sign means at all, you knew that the integral would give
you the area under the curve. So let me move on to a problem that you
can't figure out without working pretty directly with dx.
An Example Where You Really Need to Know This
What is the volume of the cone described below?
The height of the cone is h.
The base is a circle.
The height and radius of the circular base form two legs of a 45
degree right triangle. The hypotenuse of this triangle lies in the
surface of the cone, and connects the circumference of the circle with
the vertex of the cone.
You're not allowed to look it up: all you know is that the area of a
circle is pi r squared. Because the cone goes up at a 45 degree angle,
you can see that the radius of the circular base is h, the height
of the cone.
But what is the volume?
The general way to solve problems like this is to break the object up
into small differential chunks. In this case, the chunk would be a
circular disk, at a distance x from the vertex. The height of the disk
is a differential dx.
As we did in the area-under-the-curve problem, we're going to make a
key approximation here. The width of the disk is not uniform: it is
wider on top than on the bottom. But as dx approaches zero, this
difference becomes irrelevant, so we are going to treat this region as
a uniform circular disk. At that point, finding its volume is not too
tough. The radius is x (again, because of our 45 degree angle, the
radius is always the same as the distance x from the vertex). So its
area is pi x^2.
Its volume is the area times the height, which you can see is pi x^2 dx.
As you would expect, the volume is close to zero, since dx itself is
so close to zero.
The total volume is an infinite number of those zero-volume disks,
added as we go up the cone from x=0 at the vertex to x=h at the base.
So we have reached the point where we want to sum up an infinite
number of differential amounts, which is when we integrate. The
expression to be integrated from 0 to h is pi x^2 dx, which you can work
out to be 1/3 pi h^3.
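If you want to convince yourself of that answer, here is a minimal sketch that stacks up thin disks numerically and compares the total with 1/3 pi h^3. The function names, the height h = 2, and the number of disks are all assumptions made for the example.

```python
import math


def cone_volume_by_disks(h, n):
    """Stack n thin disks, each of radius x and thickness dx."""
    dx = h / n
    total = 0.0
    for i in range(n):
        x = i * dx                       # distance from the vertex
        total += math.pi * x ** 2 * dx   # volume of one disk: pi x^2 dx
    return total


def cone_volume_exact(h):
    """The result of the integral: 1/3 pi h^3."""
    return math.pi * h ** 3 / 3


h = 2.0
for n in (10, 1000, 100000):
    print("n =", n, " sum of disks =", cone_volume_by_disks(h, n))
print("1/3 pi h^3 =", cone_volume_exact(h))
```

The sum of disks is only an approximation for any finite n, but it closes in on the integral's answer as the disks get thinner.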
We're Done!
If you followed that last example, you have gotten out of this paper
exactly what I wanted you to get. A whole host of problems in math and
physics follow that same approach:
* Divide the problem into differential amounts
* Solve the problem for each differential amount
* Integrate to sum up all the differential amounts, and get your
answer
Of course, there are a lot of things I haven't explained. The biggest
one is why you sum up things by taking an antiderivative: maybe I'll
write another paper on that some day (once I understand it). But once
you do a few problems like this, you will find that a whole world of
previously insoluble problems are now within your reach.
_________________________________________________________________
Important Note Added by Those Wiser Than I
Since I first posted this paper, two different people have emailed me
to tell me that Real Mathematicians don't do this. Playing with dx in
the ways described in this paper is apparently one of those smarmy
tricks that physicists use to give headaches to mathematicians.
I didn't even realize I was preaching something nonstandard, because
most of my mathematical background comes from physics classes. So, be
warned. If you are taking physics classes, the stuff in this paper
will be very useful to you. If you are taking math classes, it may
help you to gain some intuition, but use it cautiously: you may be
expected to master more rigorous methods.
_________________________________________________________________