## What Is Science?

https://www.nasa.gov/multimedia/imagegallery/iotd.html

Before I begin posting more discussions about scientific concepts, I first need to lay out some preliminaries which serve as the bedrock for future discussions.

Anti-intellectual fashions such as postmodernism and instrumentalism, combined with media hype, may leave many people rightly confused about just what science purports to be. Numerous tomes have been written about the philosophy of science, so I can only hope to scratch the surface of some deep issues here. My intent is to provide a compact summary of the contemporary concept of “science”. This concept is largely prescriptive rather than descriptive. In other words, it outlines an ideal of science in its most effective form, rather than describing how science is actually done by real scientists. (Real science is messy, prone to biases, mistakes, social group pressures, and so on; these undesirable conditions inject errors into the system, making science less effective.)

In this discussion I assume that reality exists and that humans are capable of explaining it to some degree of accuracy. The denial of the first assumption is self-refuting. The antithesis of the second assumption is also incoherent: inexplicability does not mean we are practically unable to explain something due to lack of knowledge or excessive complexity–it means something does not have a rational explanation in principle, which admits logical contradictions. But contradictions are the very thing which tell us what is not real. What if humans just don’t have the cognitive capacity to understand some things? That is not the case. As David Deutsch points out, if you claim that humans cannot qualitatively understand a concept, that is equivalent to the above claim of inexplicability, which amounts to belief in magic. If you claim that we lack the quantitative capacity to understand some things, then this is a claim about computability, not rationality.

I also take the point of view that knowledge is fallible, since no other description of knowledge has been satisfactorily demonstrated to be coherent. “Fallible” means potentially able to be false. We cannot be perfectly, absolutely certain that any particular belief is true; however, as I will mention later, we can get extremely close to certain. Fallibilism does not imply skepticism; in other words, knowledge is not impossible. Non-fallibilist views hold a cartoonish, idealistic definition of knowledge which does not exist in real life. Certainty is like a mathematical asymptote, not an attainable value.

Science answers the “how?” and “why?” of things: it is in the business of explaining reality, although bad models of science have attempted to deny this. “Science is what scientists do” is a bad definition. Not all methodologies used now or in the past are equally capable of explaining reality. (People often say that science cannot answer “why” questions because the “why” is often over-interpreted to mean “for what purpose”, but this is not the correct meaning. “Why” means “for what reason”: it is a prompt for a good explanation–see below.)

A scientific theory is a type of knowledge structure which contains hypotheses, observational data, deductions, and predictions bound together by an explanatory framework. It has the following necessary properties: it must be coherent; it must correspond to observation; it must be explanatory. None of these properties alone is sufficient to allow knowledge. Much error in reasoning about what is true can be traced to neglecting the importance of one or more of these properties. I shall now address them in turn.

The First Criterion: Coherence

That a theory must be coherent (does not contain contradictions) is (or should be) obvious. The antithesis of coherence is self-refuting, and therefore I shall expend no further energy addressing it. This criterion effectively rules out ideologies such as postmodernism and cultural relativism. Self-consistency in the form of mathematical argument, pure deduction, is required for rigor in science; unfortunately, it is possible to be both rigorous and wrong. Hence, coherence by itself is insufficient; forgetting this is to make the error of thinking that pure deduction can result in knowledge. Consider that mathematics is a system of pure deduction. But mathematics alone cannot indicate whether the patterns it describes exist in nature. For that, we turn to the second property of a theory.

The Second Criterion: Falsifiability

Hypotheses, conjectures, and guesses need to be tested. It does not matter how parsimonious or beautiful the idea is if it is false. If the hypothesis is consistent with relevant observational tests, then it might be true, or it might not. If the hypothesis is not consistent with relevant observational tests, then it is false (assuming that the observations are accurate and understood). This is a matter of logic, not of method. I feel this point needs to be emphasized because I have seen professional scientists reject the falsifiability criterion based on a misunderstanding of its definition or manner of usage. So let us be clear: “falsifiability” does not mean that a single observation contradicting a theory necessitates the trashing of the entire theory, no questions asked. It might be that other assumptions, unrelated to the theory, are incorrect. Or it might be that the theory needs only a slight modification. There are good reasons to be hesitant about throwing out a well-established theory.

Hypotheses and theories may postulate the existence of unobserved entities. But the existence of these entities implies an observable phenomenon, which is a prediction. Scientists make an attempt to show that the prediction is false by observation, which in turn implies that the postulated entities do not exist. Any unobserved entity must be deductively linked to the observation. Without sound implication, unobservables might be mere chimeras. We should also be skeptical of unobservable entities dangling at the end of long chains of deductions. The longer and more complex the deduction, the greater the chance for error. On the other hand, if an observed prediction immediately and soundly implies (logically) the presence of an unseen entity, we should have high confidence of that entity’s existence.

But you may notice an interesting fact here. For some conjectures, we are able to test them and potentially falsify them; for other conjectures, we have no such capability. Falsifiable conjectures are scientific hypotheses; non-falsifiable conjectures are non-scientific speculation. (This is Karl Popper’s famous falsification criterion for demarcating science from non-science.) The non-falsifiable speculations can be broken down into two further categories: those which are practically non-falsifiable because we lack the technology or knowledge to test them, and those which are logically never falsifiable no matter what. Regarding the former, then, as our technology advances, the non-falsifiable speculations become scientific hypotheses. The latter type of speculation often includes very deep and fundamental questions; they will always be topics of interest amongst arm-chairs and late-night wine glasses, but unfortunately they have no place in a serious scientific research program. (Note that just because the truth-value of a claim cannot be determined by science does not imply that the claim has no truth-value.) But here we need to sound a note of caution. Trouble arises when people mistake a non-falsifiable speculation for a scientific hypothesis, which can result in rigorous fantasies and wasted time. Trouble also results from the converse: it is a shame when scientific hypotheses are hastily dismissed as unfalsifiable conjectures. It might be difficult to figure out how to test a hypothesis, and until you do determine how to test it, you would not know whether it is falsifiable or not. (Please note that not knowing whether a conjecture can be tested or not is not equivalent to knowing that the conjecture cannot be tested.)

If the hypothesis is consistent with observation, then that is good news, but it does not prove the theory to be true. There is no such thing as “proof” except in mathematics (and those proofs are not about the physical world). Passing observational tests is a necessary, not a sufficient condition for a theory.

The Third Criterion: Explanatory Power

So a coherent and correspondent theory is on the right track, but there is still a potential problem with it. Each set of observations can be described by an infinite number of coherent descriptions. This is why mere description is not sufficient for science. As Barrow puts it, “It is always possible to find a system of laws which will give rise to any set of observed outcomes.” (Note that here, a “law” is a mathematical description of a pattern in nature.) Consider an example from my previous blog post–Kepler’s Laws. Kepler’s accomplishment was to reveal a mathematical pattern in Brahe’s data. This was a description, not an explanation: not until Newton was there some reason given why the planets moved as Kepler described. Hence, Kepler’s Laws could not logically be used to perform predictions. The Keplerian description contains no reason why planetary motion should be the same before or after the window of observation time. Before Newton, there was no logical means to determine if Kepler’s Laws will be obeyed tomorrow, or if they were obeyed yesterday. Scientific induction does not exist (in terms of predicting sequences of events). Induction is a legitimate method in certain types of mathematical proof, but it is an error to apply it to physical science. Observed statistical regularities like Kepler’s Laws are valuable because they may inform our intuitions and lead us to pose hypotheses, but they are not the sole content of the explanatory portion of a theory.

How do you choose the correct description? If you take a moment, you can probably think of many absurd propositions which are falsifiable and thus technically qualify as scientific hypotheses, but which should never be included in any scientific theory. Many such hypotheses, for example those positing infinite regressions, can be dismissed without testing because they are non-explanatory. Good explanations are difficult to find in the sense that they are a very small subset of testable hypotheses.

An explanation gives an answer as to why a proposition is true. Carl Hempel claimed that a scientific explanation has two parts. One, there must be a correct deduction from true premises to a true conclusion. Two, at least one of the premises must be a law of nature whose removal would invalidate the argument. This is known as the Deductive-Nomological Model of explanation, and whilst it is on the right track, it is not perfect. Explanations should contain causal relations and be hard to vary; the DN Model is only useful if “law of nature” is carefully defined as having a causal role (which is not what Hempel intended it to mean). A deep understanding of explanation requires some understanding of causation, but I don’t have the space to cover that here, as it is a vast and subtle topic. For the reader who is skeptical or curious about the relevance of causation, I have included some links at the bottom. In the first half of the Twentieth Century, the Logical Positivists rejected causal explanations as meaningless metaphysics which should be removed from science. My contention here is that causal explanations form the very core of most science, without which we could not and would not do science. (It may be that causality is meaningless on a reductionist, microscopic level, but as of now, this issue remains unclear. At the minimum, causality is important on an emergent, macroscopic level of explanation. It could be that, at the most general level, we should not be using the language of cause and effect; that, instead, we should refer to reasons, which would include mathematical patterns as “pushy explainers” in nature.)

Broadly speaking, you can put explanations into one of two types: reductionist and emergentist. Reductionism is the technique of explaining a phenomenon in terms of its simpler constituent parts. The claim that “all legitimate explanations are reductionist” is false, and it has been falsified by counterexamples. The claim that “science is reductionist” is also false, but it is only propagated in public by the truly ignorant or profoundly confused. Science uses reductive arguments in cases where they are appropriate. But expressing causal explanations does not entail reductionism. In some cases, a theory needs to explain a phenomenon in terms of high-level interactions; merely looking at tiny components of the system reveals no useful information. Emergent phenomena or laws are not necessarily less fundamental than laws describing small components, such as subatomic entities; thinking otherwise is a confusion of scale with fundamentality (fundamental laws are deep truths which feature in many explanations). Interestingly, the progress of science itself is a result of emergence. Each theory is a layer of explanation; new theories reveal new layers of explanation as they refine the older ones. (Please note that successive refinement is not infinite regression.)

There are other models of explanation, such as the Inductive-Statistical Model, which has led to the misconception that modern science is fundamentally probabilistic and therefore cannot impart any knowledge. This is false, because not all scientific explanations rely on probabilistic arguments–and those that do might be recast in non-probabilistic terms. More importantly, first, “certain knowledge” is an oxymoron; the emotional desire for certainty has led models of knowledge into dead-ends (e.g., the JTB and JTB+X models of knowledge). Second, unqualified statements such as “probabilistic theories cannot generate certainty” fail to consider the degree of certainty of the claims that science can generate. Scientific theories, by adding corrections, can approach the truth to arbitrarily high precision. Each successively better theory, though imperfect, is what we call knowledge.

The instrumentalist might object in the following way: “Explanation is for head-in-the-clouds philosophers, not practical, down-to-earth scientists. The only job of science is to make predictions. Don’t attach your ‘interpretation’ to purely predictive theory.” My response: Descriptions describe things; that is all they do. Descriptions cannot predict. Predictions are generated from proposed explanations. Of course we want science to make predictions. What is the thing that is making the prediction? Only hypotheses and theories generate predictions. Not only is it undesirable to strip science of explanation because it would strip science of its purpose, it is actually impossible to do so. It is fantasy to imagine that you can extract an “interpretation” from a theory and keep it separate, for reasons I explain below in my response to the empiricist: everything in science is theory-laden. If you think your theory is interpretation-free, that is a sign that it contains unexamined or hidden assumptions. Is it your intention to hide your theory’s assumptions from criticism? There are practical tools called “effective theories”, which are not really theories in the sense that I mean here. They are descriptions of effects, not causes. (Note that an approximate theory is not necessarily an effective theory, and vice versa.) They may refer to black boxes, or even causes which are explicitly stated to be fictional entities. An instrumentalist might claim that every scientific theory is actually an effective theory. Effective theories may be useful in certain predictive tasks applied to an appropriate domain, but it is hard to see that they could all be incorporated into a consistent world view, since each effective theory contains fictional entities which might be inconsistent with the entities of other theories. (A collection of fictional stories, mostly inconsistent with each other, meant to describe the world through metaphor–does that sound familiar?)
Since these domain-restricted theories are predictive but not explanatory, and effective but not causal, an Instrumentalist stance is indeed appropriate within their context. Here I have argued only that it is not an appropriate stance for causal theories which purport to provide explanations.

The empiricist might object in the following way: “We need to prevent science from bloating beyond the bounds of its applicability. We do that by making sure it is firmly grounded in observation. Anything not directly observable should not be considered real.” My response: I agree that we desire to prevent science from becoming bloated, but the pure Empiricist’s cited criterion then rules out knowledge of things which are real, including observation itself. We already have a criterion for demarcating science from non-science: falsifiability (which is intimately connected with observation, you’ll be glad to note). Furthermore, there is no such thing as a pure or direct observation: the simplest observation is performed within a framework of theory, without which observation has no meaning. Human senses report observations to human consciousness only after the brain performs a great amount of unobserved data processing, including filtering. This is further complicated by the fact that consciousness only ever perceives memories of sense impressions, not the impressions themselves. Thus, with regard to human sense-based observation (and therefore to machine-based observation as well, since we observe the output of machines with our senses), pure Empiricism rules itself out, which is an incoherency. Observations are important because they can falsify explanations and spur conjecture, but observations by themselves are insufficient for any meaningful activity such as science.

The inductivist might object in the following way: “Statistical regularities can make predictions. If I observe that an event happens every day for a billion years, I can have high confidence that it will happen tomorrow. That would be my prediction based on the principle of induction.” My response: On what grounds? That is a mere assertion; there’s no logic behind it. Newton’s Laws make predictions because they have explanatory power. Kepler’s Laws predict nothing because they are mere descriptions. The imaginary regular event you describe will keep happening until it doesn’t. If each observed event is probabilistically independent, then the occurrence of an event has no effect on the probability of a subsequent event. But you won’t know if they’re independent, you won’t be able to predict if or when it will stop, and if or when it does, you won’t know why. You’ve explained nothing. Inductivism not only fails to lend support to a proposition, it also fails to provide probabilities of truth-values.

The Incompleteness Skeptic might object in the following way: “Scientific knowledge is impossible because Gödel’s Incompleteness Theorem shows that there are undecidable statements in mathematics. Since science relies on mathematics, deductions in science are unreliable.” My response: That is a misunderstanding. Gödel’s Theorem is about incompleteness in formal systems whose axioms are at least as expressive as Peano Arithmetic. The first-order arithmetic of the real numbers, and other systems used in physics, is decidable. Where is the proof that an undecidable axiomatic system is necessary to describe physical reality? There are calculations which are too difficult, so that we cannot find solutions to some specific questions, but this does not seem to be related to Gödel Incompleteness. Skepticism is a tool which should be wielded appropriately. Failure to use it leads to all manner of credulity and quackery. On the other extreme, radical uninformed skepticism about knowledge and rationality is intellectual laziness which hinders progress.

The postmodernist and cultural relativist might object in the following way: “What makes you think rationality is valid? It’s just a social construction! And doesn’t quantum mechanics reveal that the Universe is illogical? And who are you to tell me what’s true? How dare you marginalize my personal truth!” My response: That is profound incoherence. If rationality is a social construction, your theory that rationality is a social construction is itself a social construction, which means your theory is not objectively true. Therefore your claim disproves itself. And if you think any theory of science demonstrates that the Universe is illogical, you have not understood the theory, as theories require logical coherence by definition. If a theory actually purports to demonstrate something illogical, that means the theory has been falsified. And you clearly don’t know what truth is. Other people can point you in the direction of truth because they are thinking beings; truth, by definition, is objective, not personal. There is a world existing outside your head, and it is unaware of your desires or feelings.

Conclusion

Although science is an essential system for attaining knowledge, you cannot draw the unfounded implication that other areas of human endeavor should be de-valued. Art, philosophy, mathematics, and so on are enormously valuable and important human activities; but obtaining knowledge about the physical world is outside their area of application.

Is this discussion entirely a matter of philosophical opinion? No: the methods of science are themselves subject to the cycle of hypothesis testing. It’s not a matter of opinion; it’s a matter of what works. How do you know the characterization of science I have given here is the correct one? You don’t know for certain because nothing is certain. If it is correct, this form of science will cause progress by means of the expansion of knowledge, and consequently of technology. It is not reasonable to think that it is perfect now or ever will be; experience will introduce corrections, improving it further. Bad characterizations of science, which disallow criticism, will stymie progress. Definitions of science based on the historical practice of it are not helpful. We need to look to the future, to what science could become, to discover the most reliable methods for obtaining knowledge. Can you think of ways humanity’s science can be improved, in theory or in practice?

— Ander Nesser, the 30th of May, 2017. Last updated on the 10th of July, 2018.

References:

Barrow, John D. New Theories of Everything. Oxford University Press, 2007.

(Note that this book is not about specific theories, but about how physical theories are constructed generally. The haughty-sounding phrase “theory of everything” is merely the idea that it should be possible to demonstrate that all the laws of physics are consistent with each other. A “theory of everything” is a necessary but not a sufficient condition for understanding nature.)

Deutsch, David. The Beginning of Infinity: Explanations That Transform the World. Viking, 2011.

(Especially relevant to the above discussion are chapters one and twelve.)

Hetherington, Stephen. “Fallibilism”: http://www.iep.utm.edu/fallibil/

Ichikawa, Jonathan Jenkins and Steup, Matthias. “The Analysis of Knowledge”. The Stanford Encyclopedia of Philosophy (Spring 2017 Edition), Edward N. Zalta (ed.): https://plato.stanford.edu/archives/spr2017/entries/knowledge-analysis/

(This article explains, in great detail, the inviability of models of knowledge based on “justified true belief”.)

Liston, Michael. “Scientific Realism and Antirealism”: http://www.iep.utm.edu/sci-real/

(I accept the five postulates of strong scientific realism. Though informative, this article spends much verbiage on highly dubious philosophy.)

Mayes, G. Randolph. “Theories of Explanation”: http://www.iep.utm.edu/explanat/

Dowe, Phil. “Causal Processes”. The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.): https://plato.stanford.edu/archives/fall2008/entries/causation-process/

Faye, Jan. “Backward Causation”. The Stanford Encyclopedia of Philosophy (Spring 2017 Edition), Edward N. Zalta (ed.): https://plato.stanford.edu/archives/spr2017/entries/causation-backwards/

Hitchcock, Christopher. “Probabilistic Causation”. The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (ed.): https://plato.stanford.edu/archives/win2016/entries/causation-probabilistic/

Hoefer, Carl. “Causal Determinism”. The Stanford Encyclopedia of Philosophy (Spring 2016 Edition), Edward N. Zalta (ed.): https://plato.stanford.edu/archives/spr2016/entries/determinism-causal/

Klein, Peter D. and John Turri. “Infinitism in Epistemology”: http://www.iep.utm.edu/inf-epis/

Melamed, Yitzhak Y. and Lin, Martin. “Principle of Sufficient Reason”. The Stanford Encyclopedia of Philosophy (Spring 2018 Edition): https://plato.stanford.edu/entries/sufficient-reason/

Poston, Ted. “Foundationalism”: http://www.iep.utm.edu/found-ep/

Schaffer, Jonathan. “The Metaphysics of Causation”. The Stanford Encyclopedia of Philosophy (Fall 2016 Edition), Edward N. Zalta (ed.): https://plato.stanford.edu/archives/fall2016/entries/causation-metaphysics/


## Why Is Kepler’s Second Law True?

In the previous post, I mentioned sharing some things I have learned whilst preparing notes for my novels. Today’s topic relates to the orbital motion of planets.

Consider a very simple model of a point moving along the circumference of a circle. We give no reason for the point to change its speed, so let us assume it stays constant. Now, a circle seems an unlikely thing: it is a very special ellipse, one whose two foci coincide. So let us alter the model so that the point moves along a general ellipse, with two distinct foci. Is the point still moving at constant speed? Why would it not? Intuition might say that if this were a physical system in which the point is pulled toward a focus, the point would still move at a constant speed (unless some external force were altering the system). So this simple model, one might have expected, applies to the orbits of planets around a sun. But when Johannes Kepler analyzed Tycho Brahe’s large data set on planetary motion, he discovered that the planets travel faster when near the Sun and slower when far from it. This is called Kepler’s Second Law of Planetary Motion.

Isaac Newton was able to prove that this law is true, and it holds for all two-body systems bound together in gravitational orbits. But wait, how could Newton deductively prove an observational discovery which seems dependent on the contingent nature of a physical system? Here is an attempt to outline why Kepler’s Second Law is a matter of deductive reasoning, largely independent from the exceptional physical nature of gravity, or planets, or stars.

But first let us acknowledge that we shall be talking about a toy model of a planetary system; we shall not be considering relativistic effects, quantum effects (i.e., a system of objects with planet-like orbits cannot exist at subatomic scales), or more mysterious galactic-scale effects. Additionally, for simplicity we can assume that a low-mass planet is orbiting a high-mass star so that one focus of the elliptical orbit aligns with the star’s center of mass. (In reality, this is never the case. Observe the diagrams below.

The size of the white dots indicates the relative mass of the objects in orbit. The objects orbit the center of mass of the system, the barycenter, not the center of mass of the star. As the difference in mass between the two objects increases, the barycenter approaches the center of mass of the more massive object. In discussing Kepler’s law, the important point is that we place the origin of our coordinate system at this barycenter, which sits at one focus of the elliptical orbit.)

With the preliminary qualifications of the model out of the way, let us turn to the argument itself. (Please note I am keeping the tone informal and conversational; this is not a formal mathematical proof.) Every argument begins with a set of assumptions (not to be restricted to the colloquial usage of “assumption”, as argumentative assumptions may be empirically sourced facts). We shall take as given that angular momentum is conserved, a fact which can be derived in the following way. First, note that the mass scalar ($m$) multiplied by the acceleration vector ($\vec{a}$) equals the force vector ($\vec{F}$) (Newton’s Second Law of Motion):

$\vec{F}=m\vec{a}$

Second, recall Newton’s Law of Gravity, which states that the force equals the gravitational constant multiplied by the two objects’ masses, divided by the square of the distance, and directed along the unit vector $\hat{r}$ pointing from the star to the planet (the minus sign indicates attraction):

$\vec{F}=-\frac{GMm}{r^{2}}\hat{r}$

So these two expressions of force are equal to each other (substituting $\hat{r}=\vec{r}/r$, which turns the $r^{2}$ into an $r^{3}$):

$m\vec{a}=-\frac{GMm}{r^{3}}\vec{r}$

Divide both sides of the equation by the mass of the planet to simplify the formula:

$\vec{a}=-\frac{GM}{r^{3}}\vec{r}$
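To attach numbers to this formula, consider the Earth orbiting the Sun. A quick sketch (the constants below are standard approximate values, not taken from this post):

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (approximate)
M_sun = 1.989e30     # solar mass, kg (approximate)
r = 1.496e11         # mean Earth-Sun distance, m (approximate)

# Magnitude of the acceleration vector; its direction is toward the Sun.
a = G * M_sun / r**2
print(a)  # roughly 5.9e-3 m/s^2
```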

Now it is easy to see that you can collect that quotient into a single scalar factor (not a true constant, since $r$ changes with time, but a scalar all the same). So the acceleration vector equals the position vector times a scalar:

$\vec{a}=c\vec{r}$

And vectors which are multiples of each other are parallel. The cross product of parallel vectors equals a zero vector:

$\vec{a}\times\vec{r}=\vec{0}$

which is

$\ddot{\vec{r}}\times\vec{r}=\vec{0}$

Now, take a moment to consider the derivative of the cross product of the position vector and the velocity vector:

$D_{t}[\vec{r}\times\dot{\vec{r}}]$

This derivative equals two summands: velocity cross velocity, and position cross acceleration:

$\dot{\vec{r}}\times\dot{\vec{r}}+\vec{r}\times\ddot{\vec{r}}$

The cross product of identical vectors is a zero vector, so the augend is zero:

$\vec{0}+\vec{r}\times\ddot{\vec{r}}$

And we just concluded above that the addend is zero. So the sum is a zero vector:

$\vec{0}+\vec{0}=\vec{0}$

And if a function’s derivative is zero, then it is a constant function. We shall call this constant vector $\vec{L}$: because it is their cross product, $\vec{L}$ is perpendicular to both the position vector and the velocity vector. So the orbiting objects move orthogonally to $\vec{L}$. Since $\vec{L}$ never changes, the objects’ movements are restricted to a plane. The orbit is never warped into a third dimension; in other words, angular momentum is conserved.
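This conservation argument can be sanity-checked numerically: integrate a two-body orbit under an inverse-square acceleration and watch the cross product $\vec{r}\times\dot{\vec{r}}$ stay fixed. A minimal sketch (the value $GM=1$ and the starting conditions are arbitrary illustrative choices):

```python
import numpy as np

def leapfrog_orbit(r0, v0, gm=1.0, dt=1e-3, steps=20000):
    """Integrate a 2D orbit under a = -gm * r / |r|^3 (leapfrog scheme)."""
    r, v = np.array(r0, float), np.array(v0, float)
    a = -gm * r / np.linalg.norm(r) ** 3
    history = []
    for _ in range(steps):
        v += 0.5 * dt * a                      # half kick
        r += dt * v                            # drift
        a = -gm * r / np.linalg.norm(r) ** 3   # new acceleration
        v += 0.5 * dt * a                      # half kick
        history.append((r.copy(), v.copy()))
    return history

# An eccentric orbit: start at distance 1 with more-than-circular speed.
hist = leapfrog_orbit([1.0, 0.0], [0.0, 1.2])

# In 2D the cross product r x v has only a z-component:
L_vals = [rx * vy - ry * vx for (rx, ry), (vx, vy) in hist]
print(min(L_vals), max(L_vals))  # both stay at the initial value 1.2
```

(The leapfrog kicks are directed along $\vec{r}$, so they cannot change $\vec{r}\times\dot{\vec{r}}$; the conservation here holds to machine precision, mirroring the deduction above.)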

With the establishment of the momentum conservation assumption, we can enter the heart of the argument. The vector function denoting the orbiting object’s position can be expressed in polar coordinates as cosine of the angle ($\varphi$) times the unit vector $\hat{i}$  plus sine of the angle times the unit vector $\hat{j}$ all multiplied by the length of that position vector:

$\vec{r}=r(\cos\varphi\hat{i}+\sin\varphi\hat{j})$

Next, we find the velocity by differentiating: $\dot{\vec{r}}=\dot{r}(\cos\varphi\hat{i}+\sin\varphi\hat{j})+r\dot{\varphi}(-\sin\varphi\hat{i}+\cos\varphi\hat{j})$. (Note the first term, which accounts for the changing distance $r$; it is parallel to $\vec{r}$, so it vanishes from the cross product below.) Then find the cross product of the position and velocity vectors. If you do the algebra, you should see that it equals the distance squared times the time derivative of the angle times the unit vector $\hat{k}$:

$\vec{r}\times\dot{\vec{r}}=r^{2}\dot{\varphi}\hat{k}$
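If you would rather not do the algebra by hand, this identity is easy to verify with a computer algebra system. A sketch using SymPy, treating $r$ and $\varphi$ as arbitrary functions of time:

```python
import sympy as sp

t = sp.symbols('t')
r = sp.Function('r')(t)        # radial distance, an arbitrary function of time
phi = sp.Function('phi')(t)    # polar angle, likewise arbitrary

# Position in Cartesian components, with a zero z-component so we can cross.
pos = sp.Matrix([r * sp.cos(phi), r * sp.sin(phi), 0])
vel = pos.diff(t)

cross = pos.cross(vel)
# The x and y components are zero; the z-component simplifies to r**2 * phi'.
print(sp.simplify(cross))
```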

Well, earlier we had already decided that this cross product equals $\vec{L}$, so this expression also equals $\vec{L}$:

$\vec{L}=r^{2}\dot{\varphi}\hat{k}$

Because $\hat{k}$ is just a unit vector, the magnitude of $\vec{L}$ is $r$ squared times the time derivative of the angle:

$L=r^{2}\dot{\varphi}$

Now consider the variable angle $\varphi$ at two specific values, $\alpha$ and $\beta$. By the standard polar area formula, the area swept out between these angles is half the integral of $r$ squared with respect to $\varphi$, from $\alpha$ to $\beta$:

$A=\frac{1}{2}\intop_{\alpha}^{\beta}r^{2}d\varphi$

If we set our clock to zero when the angle $\varphi$ equals $\alpha$, and let the angle reach $\beta$ at time $t_{1}$, we can re-write the integral like this: half the integral of $r$ squared times the time derivative of the angle times the differential of time, from $t_{0}$ to $t_{1}$:

$A=\frac{1}{2}\intop_{t_{0}}^{t_{1}}r^{2}\frac{d\varphi}{dt}dt$

We already learned that the integrand is the length of $\vec{L}$:

$\frac{1}{2}\intop_{t_{0}}^{t_{1}}Ldt$

So this integral equals half of $L$  multiplied by the time difference:

$\frac{1}{2}L(t_{1}-t_{0})=A$

Now you can easily see that for any two time intervals of equal length, the area swept out by the line from the orbiting object to the barycenter is the same. Hence the famous refrain “equal areas in equal times.” But equal areas mean unequal arc lengths traveled along the ellipse in equal times, and hence the planets change their orbital speeds.
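The “equal areas in equal times” refrain can also be checked numerically: accumulate the swept area $\frac{1}{2}|\vec{r}\times\vec{v}|\,dt$ over two widely separated time windows of equal duration and compare. A minimal sketch (again with the arbitrary illustrative choices $GM=1$ and an eccentric starting velocity):

```python
import numpy as np

def swept_area(t_start, t_end, gm=1.0, dt=1e-4,
               r0=(1.0, 0.0), v0=(0.0, 1.2)):
    """Area swept by the radius vector between t_start and t_end,
    accumulated as (1/2)|r x v| dt along a leapfrog-integrated orbit."""
    r, v = np.array(r0, float), np.array(v0, float)
    area, t = 0.0, 0.0
    while t < t_end:
        a = -gm * r / np.linalg.norm(r) ** 3
        v += 0.5 * dt * a
        r += dt * v
        a = -gm * r / np.linalg.norm(r) ** 3
        v += 0.5 * dt * a
        if t >= t_start:
            area += 0.5 * abs(r[0] * v[1] - r[1] * v[0]) * dt
        t += dt
    return area

# Two windows of length 0.5: one starting at perihelion, one later,
# when the planet is farther out and moving more slowly.
a1 = swept_area(0.0, 0.5)
a2 = swept_area(3.0, 3.5)
print(a1, a2)  # nearly identical, despite very different orbital speeds
```

(Both areas come out to $\frac{1}{2}L\,\Delta t=\frac{1}{2}\cdot1.2\cdot0.5=0.3$, exactly as the final equation predicts.)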

So how much of this argument relies on empirical observation, and how much on armchair reasoning? You can see that the main body of the argument relies on properties of vectors. So let us go back further, to the assumptions. Again, we used vector properties to obtain conservation of angular momentum, and the main argument’s deductions are a consequence of this conservation, but we really started with Newton’s Second Law of Motion and his Law of Gravity. Note that most of the empirical details, such as the masses of objects and the value of the gravitational constant ($G$), disappear into the scalar factor we labeled $c$. It is not only that the values of the variables and constants are irrelevant; even much of the detail in the law itself is abstracted into one simple term. I did not expect that so much of Kepler’s Second Law relies not on the contingent properties of gravitation, but on the geometry of vectors, which is basically logic.

— Ander Nesser, the 29th of April, 2017

References:

https://plato.stanford.edu/entries/kepler/#CopRefThrPlaLaw

Newton’s original proof is in Book 1, Section 2 of his Principia: http://www.17centurymaths.com/contents/newtoncontents.html

The mathematics of the above argument is based on Lecture 14 of the course Understanding Multivariable Calculus by Prof. Bruce Edwards, University of Florida: http://www.thegreatcourses.com/courses/understanding-multivariable-calculus-problems-solutions-and-tips.html

The images of orbits are drawn with Celestia: https://celestiaproject.net/

The barycenter animations are provided by Wikipedia: https://en.wikipedia.org/wiki/Barycenter

The gif animating Kepler’s Second Law was made by Antonio González Fernández of the Engineering School of the University of Seville: https://en.wikipedia.org/wiki/File:Kepler-second-law.gif