I recently saw a proof that the real number field is not interpretable in the complex number field. But this required the axiom of choice, namely the existence of wild automorphisms of the complex numbers. Is there a way to prove it in ZF alone?

Kyle Sutton
2022-07-07
Answered

billyfcash5n

Answered 2022-07-08

In the comments, I mentioned the following argument that R is not interpretable in C: C is stable, R is unstable, and stability is preserved under interpretations.

The word "stable" may look a bit scary here - indeed, stability theory is a rather technical subject - but the argument above actually doesn't need any complicated ideas from stability theory. It just boils down to this: the order on the real field is definable, but no infinite linear order is interpretable in the complex field.

I'm taking your question as a challenge to produce as elementary a proof of this as possible, in particular without using the word "stable" and without using any choice. For those in the know, I'm trading in the order property (a theory is stable if it does not have the order property) for the strict order property, to make the argument a bit more transparent.

Suppose for contradiction that the real field R is interpretable in the complex field C.

1. First, note that the standard order $x \le y$ on R is definable by the formula $\phi(x,y) :\equiv \exists z\,(x + z^2 = y)$.
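
As a quick sanity check (my own illustration, not part of the argument), the equivalence is easy to test numerically: over $\mathbb{R}$, $x + z^2 = y$ has a solution $z$ exactly when $y - x \ge 0$.

```python
import math

def leq_via_squares(x, y):
    # x <= y  iff  some z satisfies x + z^2 = y,  i.e.  iff  y - x >= 0
    return y - x >= 0

for x, y in [(-2.0, 3.0), (1.5, 1.5), (4.0, -1.0)]:
    assert leq_via_squares(x, y) == (x <= y)
    if x <= y:
        z = math.sqrt(y - x)  # an explicit witness for the quantifier
        assert math.isclose(x + z * z, y)
```

Over $\mathbb{C}$ the same formula is satisfied by every pair, since every complex number is a square; that is exactly why the order does not transfer to C directly.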

2. Since the complex field interprets the real field, and the real field interprets the real order, we can compose these interpretations to conclude that the complex field interprets the real order. More precisely: as part of the data of the given interpretation, we have a definable set $X \subseteq \mathbb{C}^n$ and a surjective map $\pi : X \to \mathbb{R}$. Each real number $r \in \mathbb{R}$ is represented by an equivalence class $X_r = \pi^{-1}(\{r\})$ for a definable equivalence relation on $X$. Pulling back the formula $\phi$ to C, there is a formula $\psi(x,y)$ (where now $x$ and $y$ are tuples of length $n$) such that for all $a \in X_r$ and $b \in X_s$, $\mathbb{C} \models \psi(a,b)$ if and only if $r \le s$.

3. In particular, if we write ${Y}_{b}$ for the subset of X defined by $\psi (x,b)$, then $({Y}_{b}{)}_{b\in X}$ is a family of definable sets which is linearly preordered by $\subseteq $, and such that the quotient linear order is isomorphic to the standard order on R. To get a contradiction, we would like to show that the complex field does not admit any such family of definable sets.

4. To understand the definable sets in C, we use quantifier elimination. Now the easiest proofs of quantifier elimination for the complex field use the compactness theorem, which might get you worried that we're using choice. But don't worry: quantifier elimination for C can be proven constructively.
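
For concreteness, here are the two basic quantifier-elimination facts over $\mathbb{C}$ that drive everything (standard textbook examples, not taken from the answer above):

```latex
\exists x\,(a x = b) \;\iff\; a \neq 0 \,\lor\, b = 0,
\qquad
\exists x\,(x^n + c_{n-1}x^{n-1} + \cdots + c_0 = 0) \;\iff\; \top \quad (n \ge 1).
```

The second equivalence is exactly algebraic closedness. The general procedure reduces any formula to a Boolean combination of polynomial equations and inequations by iterated resultant and gcd computations, which are effective and require no choice.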

Now there are probably many ways to see that the complex field does not admit any quantifier-free definable family of definable sets which is linearly preordered by $\subseteq $ with the order type of R. Here's the most elementary way that occurred to me - remember, I'm trying to avoid appealing to any more advanced results in algebraic geometry or model theory.

5. First, let's assume $X \subseteq \mathbb{C}^1$, i.e. $x$ is a single variable, not a tuple of variables. Let $\psi(x,y)$ be the formula defining the family of definable sets. By quantifier elimination, we may assume $\psi$ is quantifier-free. Then for any $b$, $\psi(x,b)$ is equivalent to a Boolean combination of polynomial equations $p(x)=0$ and inequations $p(x)\ne 0$, with each $p \in \mathbb{C}[x]$. When $p \ne 0$, the formula $p(x)=0$ defines a finite set of size at most $\deg(p)$, and $p(x)\ne 0$ defines a cofinite set whose complement has size at most $\deg(p)$. So letting $N$ be the sum of the degrees (in $x$) of all the polynomials involved in $\psi(x,y)$, we have that $\psi(x,b)$ defines a finite set of size at most $N$ or a cofinite set whose complement has size at most $N$. Hence a $\subseteq$-chain of definable sets defined by instances of $\psi$ can have length at most $2N+2$, and in particular every such chain has a $\subseteq$-minimal element.
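
Here is a toy illustration of that counting bound (my own sketch, with integers standing in for complex numbers): encoding each one-variable definable set as either a finite set or the complement of a finite set, a strictly increasing $\subseteq$-chain really does max out at length $2N+2$.

```python
# A one-variable quantifier-free definable set is finite or cofinite.
# Encode it as ("fin", S) meaning S, or ("cofin", S) meaning complement-of-S.
def subset(A, B):
    ka, Sa = A
    kb, Sb = B
    if ka == "fin" and kb == "fin":
        return Sa <= Sb
    if ka == "fin" and kb == "cofin":
        return not (Sa & Sb)  # S_a must avoid the finitely many holes of B
    if ka == "cofin" and kb == "fin":
        return False          # a cofinite set never fits inside a finite one
    return Sb <= Sa           # cofinite vs cofinite: fewer holes = bigger set

N = 3  # stand-in for the degree bound N in the argument
chain = (
    [("fin", frozenset(range(k))) for k in range(N + 1)]
    + [("cofin", frozenset(range(N, N + k))) for k in range(N, -1, -1)]
)

assert len(chain) == 2 * N + 2
for A, B in zip(chain, chain[1:]):
    assert subset(A, B) and not subset(B, A)  # strictly increasing
```

Any strictly longer chain would need either a finite set of size greater than $N$ or a cofinite set with more than $N$ holes.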

6. Now let's prove by induction on n, where n is the length of the tuple of variables x, that for any formula $\psi(x,y)$, there is no family $(X_b)_{b\in Y}$ of definable sets defined by $\psi$ which is linearly preordered by $\subseteq$ and has no minimal element. We've established the base case n=1. So assume the claim holds for n, and suppose for contradiction that $\psi(x,y)$, with x a tuple of length n+1, defines such a family $(X_b)_{b\in Y}$ with no $\subseteq$-minimal element. Let's write $b \le b'$ when $X_b \subseteq X_{b'}$, and note that this relation is definable (by $\forall x\,(\psi(x,b)\to\psi(x,b'))$). For any b and any $a \in \mathbb{C}^n$, we can look at the set $Z_{a,b}$ defined by $\psi(a,x_{n+1},b)$. Since $Z_{a,b} \subseteq \mathbb{C}^1$ is the fiber over a of $X_b$, we have $Z_{a,b} \subseteq Z_{a,b'}$ whenever $b \le b'$. For fixed a, since $(Z_{a,b})_{b\in Y}$ is a definable family of subsets of $\mathbb{C}^1$, it has a $\subseteq$-least element, i.e. $Z_{a,b}$ is constant for b in a downwards-closed set of b's. Let's call this downwards-closed set $Y_a$. We can use this observation to definably linearly preorder the n-tuples a: $a \le a'$ if $Y_a \subseteq Y_{a'}$. Applying the inductive hypothesis to the family of downwards-closed sets in this order, the order has a minimal element $a^*$. But now for any $b \in Y_{a^*}$, I claim that $X_b$ is minimal in the original family of sets. Indeed, if $b'$ is such that $X_{b'} \subsetneq X_b$, then there is some a such that $Z_{a,b'} \subsetneq Z_{a,b}$. But then $b \notin Y_a$, so $Y_a \subsetneq Y_{a^*}$, contradicting the minimality of $a^*$.

asked 2021-02-23

Interpreting z-scores: Complete the following statements using your knowledge about z-scores.

a. If the data is weight, the z-score for someone who is overweight would be

-positive

-negative

-zero

b. If the data is IQ test scores, an individual with a negative z-score would have a

-high IQ

-low IQ

-average IQ

c. If the data is time spent watching TV, an individual with a z-score of zero would

-watch very little TV

-watch a lot of TV

-watch the average amount of TV

d. If the data is annual salary in the U.S. and the population is all legally employed people in the U.S., the z-scores of people who make minimum wage would be

-positive

-negative

-zero
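
All four parts reduce to the sign of $z = (x - \mu)/\sigma$; here is a generic sketch with made-up means and standard deviations (purely illustrative numbers, not from the question):

```python
def z_score(x, mean, sd):
    # how many standard deviations x lies above (+) or below (-) the mean
    return (x - mean) / sd

# made-up figures, purely illustrative:
assert z_score(250, 170, 40) > 0    # (a) overweight -> above the mean -> positive z
assert z_score(85, 100, 15) < 0     # (b) negative z -> below-average IQ
assert z_score(3.0, 3.0, 1.2) == 0  # (c) z = 0 -> exactly the average amount of TV
assert z_score(15_000, 55_000, 25_000) < 0  # (d) minimum wage -> negative z
```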

asked 2022-07-12

I'm trying to solve the initial value problem $(i\partial_t + \Delta_x)u(t,x) = 0$, $u(0,x) = f(x)$, for the Schrödinger equation ($x \in \mathbb{R}^n$, f Schwartz).

I know that a fundamental solution is given by $K(t,x)=(4\pi it{)}^{-n/2}{e}^{i|x{|}^{2}/4t}$.

How do I interpret $\sqrt{i}$ here?

I'm trying to show that if I convolve the above fundamental solution K with the initial data f (convolution in the spatial variable x), then I obtain the solution to the initial value problem.

Specifically, how do I prove that $K \ast f \to f$ as $t \to 0$? More generally, what are the differences between this problem and the analogous problem for the heat equation $(\partial_t - \Delta_x)u(t,x) = 0$?

(I know that the Schrödinger equation and its fundamental solution are obtained from their heat counterparts via $t \mapsto it$.)

Why is the Schrödinger equation time reversible (i.e. why can it be solved both forwards and backwards in time), while the heat equation isn't? The total integral of the heat kernel (with respect to x) is 1; is the total integral of the "Schrödinger kernel" K also equal to 1?

asked 2021-01-02

Evaluate the expression $\frac{\sqrt{-6}}{\sqrt{-3}\sqrt{-4}}$ and write the result in the form a+bi
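
Taking principal square roots ($\sqrt{-a} = i\sqrt{a}$ for $a > 0$), the quotient can be worked out by hand and checked numerically:

```python
import cmath
import math

# sqrt(-6) = i*sqrt(6);  sqrt(-3)*sqrt(-4) = (i*sqrt(3))(2i) = -2*sqrt(3);
# so the quotient is i*sqrt(6) / (-2*sqrt(3)) = -(sqrt(2)/2) i
z = cmath.sqrt(-6) / (cmath.sqrt(-3) * cmath.sqrt(-4))
assert math.isclose(z.real, 0.0, abs_tol=1e-12)
assert math.isclose(z.imag, -math.sqrt(2) / 2)
```

That is, $a + bi$ with $a = 0$ and $b = -\sqrt{2}/2$. Note that the identity $\sqrt{u}\,\sqrt{v} = \sqrt{uv}$ fails for negative reals, which is the trap this exercise is testing.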

asked 2022-06-23

Independent and Identically distributed, conditional independent and Naive bayes

I'm reading about Naive Bayes classification concept, noting that we make the conditionally independence assumption. But isn't this the general assumption that is always made dealing with machine learning algorithms?

Suppose we have a supervised binary classification problem setup, with a dataset $\mathcal{D} = \{(x_1,t_1),\dots,(x_n,t_n)\}$ where $x_i \in \mathbb{R}^D$ and $t_i \in \{0,1\}\ \forall i = 1,\dots,n$.

I've read everywhere that we always make the assumption that the data are iid (independent and with the same probability distribution; this would mean that $p((x_i,t_i),(x_j,t_j)) = p((x_i,t_i))\,p((x_j,t_j))$, right?). At this point it is reasonable to think of a Bernoulli distribution to model the data. Let $p(\mathcal{D} \mid \theta)$ be the likelihood function: then we want to find

$\hat{\theta} = \arg\max_{\theta} p(\mathcal{D} \mid \theta)$

where

$p(\mathcal{D} \mid \theta) = p((x_1,t_1),\dots,(x_n,t_n) \mid \theta) = p((x_1,t_1) \mid \theta) \cdots p((x_n,t_n) \mid \theta)$

Here we should use a conditional independence hypothesis in order to go on. So in every situation we use the naive Bayes hypothesis? I'm having trouble trying to distinguish the two.
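
To separate the two assumptions, here is a minimal sketch (all numbers made up): the i.i.d. assumption factors the likelihood across *examples*, while the naive Bayes assumption additionally factors $p(x \mid t)$ across the *features of a single example*.

```python
import math

# toy dataset: (binary feature triple x, label t) -- made-up values
data = [((1, 0, 1), 1), ((0, 0, 1), 0), ((1, 1, 0), 1)]

prior = {0: 0.4, 1: 0.6}                           # made-up p(t)
theta = {0: (0.2, 0.3, 0.7), 1: (0.8, 0.5, 0.4)}   # made-up p(x_d = 1 | t)

def p_x_given_t(x, t):
    # NAIVE BAYES assumption: features are independent *given the label t*
    return math.prod(th if xd else 1 - th for xd, th in zip(x, theta[t]))

# I.I.D. assumption: the likelihood of the whole dataset factors across examples
log_lik = sum(math.log(prior[t] * p_x_given_t(x, t)) for x, t in data)
```

The factorization across examples is made in almost every supervised setup; the within-example factorization of $p(x \mid t)$ is the extra, much stronger, naive Bayes hypothesis.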

asked 2022-07-04

I am reading "Problem Solving with Algorithms and Data Structures using Python" and the author is currently explaining the relation between comparisons and the Approximate Number of Items Left in an Ordered List.

I am struggling to perform the conversion of $\frac{n}{2^i} = 1$ to $i = \log n$.

I put this expression into MathWay and got back $i = \log_2(1)$ and I'm a little confused on how these results are equivalent. I'm pretty rusty with logarithms, so if you could help explain this conversion to me I'd greatly appreciate it.
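
The algebra is: multiply both sides by $2^i$ to get $n = 2^i$, then take $\log_2$ of both sides to get $i = \log_2 n$ (written $\log n$ in the book, where base 2 is implied). A quick check:

```python
import math

n = 1024
i = math.log2(n)        # n / 2**i == 1  <=>  2**i == n  <=>  i == log2(n)
assert 2 ** i == n      # i is 10 here, since 2**10 == 1024
assert n / 2 ** i == 1
```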

asked 2022-06-07

I recently had an idea for an app that I would like to start developing for personal use and development. It attempts to present you with recipe ideas for lunch/dinner etc. and, by recording your responses, learns your preferences. I was thinking it would do this by recording very specific details of each recipe, such as carb count, calorie count, protein count, etc. (factors which might act as determinants of our preference). Then this program would run an OLS regression with probability of being chosen as the dependent variable (for which we will have data, since we know which recipes our user rejected, which he accepted, and how many times). We will then have various independent variables with which we will try to create an unbiased estimator. We can then score all candidate recipes under this regression and rank them in order of probability of being chosen, highest to lowest.

Would this be a viable thing to do? If no, why not and what could perhaps be better?

asked 2022-06-26

Interpreting Coin Toss Data. Biased or Not?

We have run an experiment in which some good chap has sat down and flipped a coin 100 times. At the end of the 100 flips he has tallied 40 Heads and 60 Tails. Now this seems like something is up with the coin. The question is whether or not this coin is biased.

I have already determined that under the hypothesis p=1/2 the mean number of heads is 50 and the standard deviation is 5. If we model the number of heads as approximately normal about this mean, then the experimental result we obtained is 2 sigma from the mean. My first question is: what does it mean if the results are outside one sigma?

Next I proceeded to find the probability that the underlying probability $p$ of getting a heads was instead $4/10$. I used the function $\binom{100}{40} p^{40} (1-p)^{60}$ and integrated this (in $p$) from 0.35 to 0.45. My result was 0.006. Now does this mean that the probability that $p$ is between 0.35 and 0.45 is 0.006? But I feel in this method I should be comparing this $p$ against something.

I suppose my problem really lies in interpreting the results and their meaning.
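
The integral described above can be reproduced directly, and doing so shows what is missing: $\binom{100}{40}p^{40}(1-p)^{60}$ is not a probability density in $p$ until it is normalized. Its integral over all of $[0,1]$ is exactly $1/101$, and dividing by that turns the raw $\approx 0.006$ value into an actual probability (roughly 70%) that $p$ lies in $(0.35, 0.45)$, under a flat prior. A sketch:

```python
from math import comb

def lik(p, k=40, n=100):
    # the integrand described in the question: C(100,40) p^40 (1-p)^60
    return comb(n, k) * p**k * (1 - p)**(n - k)

def integrate(f, a, b, steps=20_000):
    # simple midpoint rule, accurate enough for this smooth integrand
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

raw = integrate(lik, 0.35, 0.45)    # the raw value from the question, ~0.007
total = integrate(lik, 0.0, 1.0)    # normalizing constant; equals 1/101 exactly
posterior = raw / total             # P(0.35 < p < 0.45 | data), flat prior
```

So the 0.006-ish number is not itself a probability about $p$; compared against the total mass $1/101 \approx 0.0099$, it says the data put most of the posterior weight on $p \in (0.35, 0.45)$.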
