Here is a more understandable description of the semi-satire that follows: math.stackexchange.com/questions/53969/what-does-formal-mean/3297537#3297537

You start with a very small list of:

- certain arbitrarily chosen initial strings, which mathematicians call "axioms"
- rules of how to obtain new strings from old strings, called "rules of inference". Every transformation rule is very simple, and can be verified by a computer.

Using those rules, you choose a target string that you want to reach, and then try to reach it. Before the target string is reached, mathematicians call it a "conjecture".

Mathematicians call the list of transformation rules used to reach a string a "proof".

Since every step of the proof is very simple and can be verified by a computer automatically, the entire proof can also be automatically verified by a computer very easily.
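To make the "strings plus rewrite rules" picture concrete, here is a minimal sketch of a proof checker for a toy system. The system used is the MIU puzzle from Hofstadter's "Gödel, Escher, Bach", chosen only for illustration; it is not a real foundation of mathematics, and the names `AXIOMS`, `successors` and `verify_proof` are made up for this example. A proof is represented here as the list of strings visited, which is equivalent to listing the rules applied.

```python
# Toy formal system (MIU puzzle), used only as an illustration.
AXIOMS = {"MI"}


def successors(s):
    """Return every string reachable from s by one rule of inference."""
    out = set()
    # Rule 1: xI -> xIU
    if s.endswith("I"):
        out.add(s + "U")
    # Rule 2: Mx -> Mxx
    if s.startswith("M"):
        out.add("M" + s[1:] * 2)
    # Rule 3: replace any occurrence of III with U
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    # Rule 4: drop any occurrence of UU
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def verify_proof(steps):
    """Check that steps starts at an axiom and each step follows by one rule."""
    if not steps or steps[0] not in AXIOMS:
        return False
    return all(b in successors(a) for a, b in zip(steps, steps[1:]))


print(verify_proof(["MI", "MII", "MIIII", "MUI"]))
print(verify_proof(["MI", "MU"]))
```

Checking a proof is a simple loop over the steps. Finding one, in contrast, means searching a tree of strings that can grow without bound.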

Finding proofs however is undoubtedly an uncomputable problem.

Most mathematicians can't code or deal with the real world in general however, so they haven't created the obviously necessary: website front-end for a mathematical formal proof system.

The fact that Mathematics happens to be the best way to describe physics and that humans can use physical intuition heuristics to reach the NP-hard proofs of mathematics is one of the great miracles of the universe.

Once we have mathematics formally modelled, one of the coolest results is Gödel's incompleteness theorems, which state that for any reasonable proof system, there are necessarily theorems that can be proven neither true nor false starting from any given set of axioms: those theorems are independent from those axioms. Therefore, there are three possible outcomes for any hypothesis: true, false or independent!

Some famous theorems have even been proven to be independent of some famous axioms. One of the most notable is that the Continuum Hypothesis is independent from Zermelo-Fraenkel set theory! Such independence proofs rely on modelling the proof system inside another proof system, and forcing is one of the main techniques used for this.

Much of this section will be dumped at Section "Website front-end for a mathematical formal proof system" instead.

If Ciro Santilli ever becomes rich, he's going to solve this with: website front-end for a mathematical formal proof system, promise.

A proof in some system for the formalization of mathematics.

One of the first formal proof systems. This is actually understandable!

This is Ciro Santilli-2020 definition of the foundation of mathematics (and the only one he had any patience to study at all).

TODO what are its limitations? Why were other systems created?

It seems to implement Zermelo-Fraenkel set theory.

A set of axioms is consistent if they don't lead to any contradictions.

When a set of axioms is not consistent, false can be proven, and then everything is true, making the set of axioms useless.
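The step "false can be proven, and then everything is true" can be written down as a one-line formal proof. The following is a small sketch in Lean 4, used here only as an example proof system; the names `hp` and `hnp` are arbitrary labels for the two contradictory hypotheses.

```lean
-- From a proof of p and a proof of ¬p, any proposition q follows.
example (p q : Prop) (hp : p) (hnp : ¬p) : q :=
  absurd hp hnp
```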

It or its negation could therefore be arbitrarily added to the set of axioms.

An easy to prove theorem that follows from a harder to prove theorem.

Intuitively: unordered container where all the values are unique, just like C++ `std::set`.

More precisely for set theory formalization of mathematics:

- everything is a set, including the elements of sets
- string manipulation wise:
  - `{}` is an empty set. The natural number `0` is defined as `{}` as well.
  - `{{}}` is a set that contains an empty set
  - `{{}, {{}}}` is a set that contains two sets: `{}` and `{{}}`
  - `{{}, {}}` is not well formed, because it contains `{}` twice
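The nested-braces strings above can be mirrored with Python's `frozenset`, which is hashable and can therefore be an element of another set. This is only an analogy, with variable names chosen for this sketch: where the formal system rejects `{{}, {}}` as not well formed, Python silently collapses the duplicate.

```python
empty = frozenset()               # {}
zero = empty                      # the natural number 0 is defined as {}
one = frozenset({empty})          # {{}}
two = frozenset({empty, one})     # {{}, {{}}}

# Trying to build {{}, {}}: the duplicate element is dropped,
# so we just get {{}} again.
duplicated = frozenset([empty, empty])

print(len(two))
print(duplicated == one)
```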

The size of a set.

For finite sizes, the definition is simple, and the intuitive name "size" matches well.

But for infinity, things are messier, e.g. the size of the real numbers is strictly larger than the size of the integers as shown by Cantor's diagonal argument, which is kind of what justifies a fancier word "cardinality" to distinguish it from the more normal word "size".

The key idea is to compare set sizes with bijections.
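As an illustration of comparing sizes with bijections, here is a sketch of the classic zigzag pairing between the natural numbers and the integers, which is why both sets have the same cardinality even though one looks "twice as big". The function names are made up for this example.

```python
def nat_to_int(n):
    """Map 0, 1, 2, 3, 4, ... to 0, 1, -1, 2, -2, ..."""
    return (n + 1) // 2 if n % 2 else -(n // 2)


def int_to_nat(z):
    """Inverse of nat_to_int."""
    return 2 * z - 1 if z > 0 else -2 * z


print([nat_to_int(n) for n in range(7)])
print(all(int_to_nat(nat_to_int(n)) == n for n in range(1000)))
```

No such pairing can exist between the natural numbers and the real numbers, which is what Cantor's diagonal argument shows.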

Set of ordered pairs. That's it! This is illustrated at: math.stackexchange.com/questions/1480651/is-fx-x-1-x-2-a-function/1481099#1481099

Mnemonic: in means into. So we are going into a codomain that is large enough so that we can have a different image for every input.

Mnemonic: sur means over. So we are going over the codomain, and covering it entirely.

Vs: image: the codomain is the set that the function might reach.

The image is the exact set that it actually reaches.

E.g. the function:

$f(x)=x^{2}$

could have:

- codomain $\mathbb{R}$
- image $\mathbb{R}_{+}$

Note that the definition of the codomain is somewhat arbitrary, e.g. $x^{2}$ could as well technically have codomain:

$\mathbb{R} \cup \mathbb{R}^{2}$

even though it will obviously never reach any value in $\mathbb{R}^{2}$.

The exact image is in general therefore harder to characterize.
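For finite sets, the ideas of "function as a set of ordered pairs", injection, surjection, codomain and image can all be checked mechanically. The following is a small sketch with hypothetical helper names, using a finite stand-in for $x^2$.

```python
def is_function(pairs, domain):
    """Every element of the domain appears as an input exactly once."""
    inputs = [x for x, _ in pairs]
    return set(inputs) == set(domain) and len(inputs) == len(set(inputs))


def image(pairs):
    """The exact set of values that the function reaches."""
    return {y for _, y in pairs}


def is_injective(pairs):
    """No two inputs share the same output."""
    outputs = [y for _, y in pairs]
    return len(outputs) == len(set(outputs))


def is_surjective(pairs, codomain):
    """The image covers the entire codomain."""
    return image(pairs) == set(codomain)


domain = range(-2, 3)
codomain = range(0, 5)  # an arbitrary choice that is large enough
square = {(x, x * x) for x in domain}

print(is_function(square, domain))
print(sorted(image(square)))
print(is_injective(square), is_surjective(square, codomain))
```

Here the codomain `{0, 1, 2, 3, 4}` was picked arbitrarily, while the image `{0, 1, 4}` had to be computed.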

This section is about functions that operate on arbitrary sets.

A function that maps two sets to a third set.

A Cartesian product that carries over some extra structure of the input groups.

E.g. the direct product of groups carries over group structure on both sides.

This section is about functions that operate on numbers such as the integers or real numbers.

We define this as the functional equation:

$f(x,y)=f(x)f(y)$

It is a bit like Cauchy's functional equation but with multiplication instead of addition.

The differential equation that is solved by the exponential function:

$y'(x)=y(x)$

with initial condition:

$y(0)=1$

TODO find better name for it, "linear homogeneous differential equation of degree one" almost fully constrains it except for the exponent constant and initial value.

The Taylor series expansion is the most direct definition of the exponential:

$e^{x}=\sum_{n=0}^{\infty}\frac{x^{n}}{n!}=1+\frac{x}{1}+\frac{x^{2}}{2}+\frac{x^{3}}{2\times 3}+\frac{x^{4}}{2\times 3\times 4}+\ldots$

It obviously satisfies the exponential function differential equation, because when we differentiate term by term:

- the first constant term dies
- each other term gets converted to the one before
- because we have infinitely many terms, we get what we started with!
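A quick numerical sanity check of the series: the sketch below sums the first few terms and compares against `math.exp`. The function name and the default number of terms are arbitrary choices for this example.

```python
import math


def exp_taylor(x, terms=20):
    """Partial sum of the Taylor series of e^x around 0."""
    total = 0.0
    term = 1.0  # x^0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turn x^n / n! into x^(n+1) / (n+1)!
    return total


print(exp_taylor(1.0), math.e)
```

Each term is obtained from the previous one with a single multiplication, which avoids computing large factorials explicitly.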

$e^{x}=\lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n}$

The basic intuition for this is to start from the origin and make small changes to the function based on its known derivative at the origin.

More precisely, we know that for any base $b$, exponentiation satisfies:

- $b^{x+y}=b^{x}b^{y}$.
- $b^{0}=1$.

And we also know that for $b=e$ in particular we satisfy the exponential function differential equation and so:

$\frac{de^{x}}{dx}(0)=1$

One interesting fact is that the only thing we use from the exponential function differential equation is the value around $x=0$, which is quite little information! This idea is basically what is behind the importance of the Lie group-Lie algebra correspondence via the exponential map. In the more general settings of groups and manifolds, restricting ourselves to be near the origin is a huge advantage.

Now suppose that we want to calculate $e^{1}$. The idea is to start from $e^{0}$ and then to use the first order of the Taylor series to extend the known value of $e^{0}$ to $e^{1}$.

E.g., if we split into 2 parts, we know that:

$e^{1}=e^{1/2}e^{1/2}$

or in three parts:

$e^{1}=e^{1/3}e^{1/3}e^{1/3}$

so we can just use arbitrarily many parts $e^{1/n}$ that are arbitrarily close to $x=0$:

$e^{1}=(e^{1/n})^{n}$

and more generally for any $x$ we have:

$e^{x}=(e^{x/n})^{n}$

Let's see what happens with the Taylor series. We have near $y=0$ in little-o notation:

$e^{y}=1+y+o(y)$

Therefore, for $y=x/n$, which is near $y=0$ for any fixed $x$ when $n$ is large:

$e^{x/n}=1+x/n+o(1/n)$

and therefore:

$e^{x}=(e^{x/n})^{n}=(1+x/n+o(1/n))^{n}$

which is basically the formula that we wanted. We just have to convince ourselves that at $\lim_{n\to\infty}$, the $o(1/n)$ disappears, i.e.:

$(1+x/n+o(1/n))^{n}=(1+x/n)^{n}$
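The convergence of the product formula can be observed numerically. This is only a sketch, with `exp_limit` a name made up for the example: it takes $n$ small first-order steps of size $x/n$ and multiplies them together.

```python
import math


def exp_limit(x, n):
    """Approximate e^x as n first-order steps of size x/n."""
    return (1 + x / n) ** n


for n in (10, 1000, 10**6):
    approx = exp_limit(1.0, n)
    print(n, approx, abs(approx - math.e))
```

The error shrinks roughly like $1/n$, much slower than the Taylor series, so this formula is more useful for intuition than for computation.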

Is the solution to a system of linear ordinary differential equations; the exponential function is just the 1-dimensional subcase.

Note that more generally, the matrix exponential can be defined on any ring.

The matrix exponential is of particular interest in the study of Lie groups, because in the case of the Lie algebra of a matrix Lie group, it provides the correct exponential map.
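As a rough illustration, the matrix exponential can be computed from the same Taylor series as the scalar exponential, with matrix products instead of scalar products. The sketch below uses plain Python lists and a fixed number of terms, which is fine for small well-behaved matrices but is not how numerical libraries do it; `mat_mul` and `mat_exp` are names made up for this example.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=30):
    """Truncated Taylor series: I + M + M^2/2! + M^3/3! + ..."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term M^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# 1x1 case: reduces to the ordinary exponential function.
print(mat_exp([[1.0]])[0][0], math.e)

# Generator of 2D rotations: exponentiating gives a rotation by pi.
print(mat_exp([[0.0, -math.pi], [math.pi, 0.0]]))
```

The second example gives, up to rounding, minus the identity matrix: a real 2x2 matrix whose exponential has negative entries on the diagonal, which no real 1x1 exponential can produce.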

en.wikipedia.org/wiki/Logarithm_of_a_matrix#Existence mentions it always exists for all invertible complex matrices. But the real condition is more complicated. Notable counter example: -1 cannot be reached by any real $e^{tk}$.

The Lie algebra exponential covering problem can be seen as a generalized version of this problem, because

- Lie algebra of $GL(n)$ is just the entire $M_{n}$
- we can immediately exclude non-invertible matrices from being the result of the exponential, because $e^{tM}$ has inverse $e^{-tM}$, so we already know that non-invertible matrices are not reachable

Most notable example: $L^{2}$.

What do you prefer, `1 \times 10^{10}` or `1E10`?

A good definition is by using Dedekind cuts.

An ordered pair of two real numbers with the complex addition and multiplication defined.

Forms both a:

- division algebra if thought of as $\mathbb{R}^{2}$ with complex multiplication as the bilinear map of the algebra
- field

Constructs the quaternions from complex numbers, octonions from quaternions, and keeps doubling like this indefinitely.
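The doubling step can be sketched in a few lines. In the code below a number is a flat list of $2^k$ integer coordinates, split into two halves $(a, b)$, and the product uses one common sign convention for the construction, $(a,b)(c,d) = (ac - d^{*}b,\ da + bc^{*})$; other equivalent conventions exist. The helper names are made up for this example.

```python
def conj(x):
    """Conjugate: negate every coordinate except the real one."""
    return [x[0]] + [-v for v in x[1:]]


def add(x, y):
    return [p + q for p, q in zip(x, y)]


def sub(x, y):
    return [p - q for p, q in zip(x, y)]


def mul(x, y):
    """Cayley-Dickson product: (a,b)(c,d) = (ac - d*b, da + bc*)."""
    if len(x) == 1:
        return [x[0] * y[0]]  # base case: real numbers
    h = len(x) // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))


def unit(dim, i):
    """Basis element number i in dimension dim."""
    return [1 if k == i else 0 for k in range(dim)]


# Quaternions (dimension 4): multiplication is not commutative.
i, j, k = unit(4, 1), unit(4, 2), unit(4, 3)
print(mul(i, j), mul(j, i))

# Octonions (dimension 8): multiplication is not even associative.
e1, e2, e4 = unit(8, 1), unit(8, 2), unit(8, 4)
print(mul(mul(e1, e2), e4), mul(e1, mul(e2, e4)))
```

The same `mul` handles dimension 2 (complex numbers), 4 (quaternions), 8 (octonions) and beyond, with each doubling losing some algebraic property.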

Kind of extends the complex numbers.

Some facts that make them stand out:

- one of the only three real associative division algebras in addition to the real numbers and complex numbers, according to the classification of associative real division algebras
- the simplest non-commutative division algebra. Contrast for example with complex numbers where multiplication is commutative

Unlike the quaternions, it is non-associative.

This is the part of the formalization of mathematics that deals only with the propositions.

In some systems, e.g. Metamath, modus ponens alone tends to be enough: everything else can be defined based on it.
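For reference, modus ponens says that from $p$ and $p \to q$ we may conclude $q$. A minimal sketch in Lean 4, used here only as an example proof system (this is not Metamath syntax), where applying the implication to the hypothesis is the whole proof:

```lean
-- Modus ponens: from p and p → q, conclude q.
example (p q : Prop) (hp : p) (hpq : p → q) : q :=
  hpq hp
```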

Builds on top of propositional logic, adding notably existential quantification.

Models existence in the context of the formalization of mathematics.
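As a tiny illustration, again in Lean 4 as an example system, an existence claim is proven by exhibiting a witness together with a proof that the witness has the required property:

```lean
-- There exists a natural number n with n + 2 = 5: the witness is 3.
example : ∃ n : Nat, n + 2 = 5 :=
  ⟨3, rfl⟩
```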

Existence and uniqueness results are fundamental in mathematics because we often define objects by their properties, and then start calling them "the object", which is fantastically convenient.

But calling something "the object" only makes sense if there exists exactly one, and only one, object that satisfies the properties.

One particular context where these come up very explicitly is in solutions to differential equations, e.g. existence and uniqueness of solutions of partial differential equations.