## Interesting article in "Nature" on Machine Learning

**Moderators:** hgm, Harvey Williamson, bob

**Forum rules**

This textbox is used to restore diagrams posted with the [d] tag before the upgrade.

### Re: Interesting article in "Nature" on Machine Learning

The article starts with an inaccuracy:

Later on the article corrects this somewhat:

It is inconceivable that this new result has any practical impact. For all we know, the physical world does not even need infinite sets to exist.

Anyway, I will be the last to say that new results are uninteresting if they have no practical impact

Gödel only showed that the continuum hypothesis did not contradict the ZFC axioms, i.e. it could not be proved false within standard set theory. It was Paul Cohen that showed that completed the proof of the independence of the continuum hypothesis from ZFC (i.e. that assuming that the continuum hypothesis is false also does not contradict the ZFC axioms).The mathematicians, who were working on a machine-learning problem, show that the question of ‘learnability’ — whether an algorithm can extract a pattern from limited data — is linked to a paradox known as the continuum hypothesis. Gödel showed that the statement cannot be proved either true or false using standard mathematical language.

Later on the article corrects this somewhat:

So this was not a 1940 result by Gödel but a 1960 result by Cohen.A 1940 result by Gödel (which was completed in the 1960s by US mathematician Paul Cohen) showed that the continuum hypothesis cannot be proved either true or false starting from the standard axioms — the statements taken to be true — of the theory of sets, which are commonly taken as the foundation for all of mathematics.

It is inconceivable that this new result has any practical impact. For all we know, the physical world does not even need infinite sets to exist.

So they seem to be looking into learning things about an infinitely large data set. For all practical purposes (in particular given that practical computers have finite memory) we can limit application of AI to finite data sets.In the latest paper, Yehudayoff and his collaborators define learnability as the ability to make predictions about a large data set by sampling a small number of data points. The link with Cantor’s problem is that there are infinitely many ways of choosing the smaller set, but the size of that infinity is unknown.

Anyway, I will be the last to say that new results are uninteresting if they have no practical impact

### Re: Interesting article in "Nature" on Machine Learning

See also here.

Anyway, whatever the exact theorem is that has been proven to be unprovable, this result only holds if one sticks to ZFC.

If one accepts the ZFC axioms and the continuum hypothesis, then apparently these guys have actually proved their theorem.

And if one accepts instead the ZFC axioms and the negation of the continuum hypothesis, then apparently these guys have disproved their theorem.

In the "real world", the continuum hypothesis does not really play a role (nor does the "C" in ZFC, i.e. the axiom of choice).

There must be a mistake here. A finite subset of the unit interval is finite, so can't be giving much problems.Ben-David and colleagues then prove that the ability to carry out a weak form of monotone compression is related to the size of certain infinite sets. The set that the authors ultimately use in their work is the unit interval, which is the set of real numbers between 0 and 1. Their results imply that the finite subsets of the unit interval have monotone-compression schemes, and therefore are learnable in EMX, if and only if the continuum hypothesis is true, which is known to be unprovable.

Anyway, whatever the exact theorem is that has been proven to be unprovable, this result only holds if one sticks to ZFC.

If one accepts the ZFC axioms and the continuum hypothesis, then apparently these guys have actually proved their theorem.

And if one accepts instead the ZFC axioms and the negation of the continuum hypothesis, then apparently these guys have disproved their theorem.

In the "real world", the continuum hypothesis does not really play a role (nor does the "C" in ZFC, i.e. the axiom of choice).

### Re: Interesting article in "Nature" on Machine Learning

The paper itself is also available: https://www.nature.com/articles/s42256-018-0002-3

Our focus is on a specific learning problem we call ‘estimating the maximum’ (EMX). The EMX problem belongs to both models discussed above. Here is a motivating example. Imagine a website that is being visited by a variety of users. Denote by X the set of all potential visitors to the website. The owner of the website wishes to post ads on it. The posted ads are to be chosen from a given pool of ads. Each ad A in the pool targets a certain population of users FA ⊆ X. For example, if A is a sports ad then FA is the collection of sports fans. The goal is to place an ad whose target population visits the site most frequently. The challenge is that it is not known in advance which visitors are to visit the site.

More formally, we assume access to a training sample of visitors drawn from an (unknown) distribution P. The collection of ads corresponds to the family of sets ℱ = {FA : A is an ad in the pool}. The ad problem above becomes an instance of the following EMX problem:

Given a family ℱ of subsets of some domain X, find a set in ℱ whose measure with respect to an unknown probability distribution P is close to maximal. This should be done based on a finite sample generated i.i.d. from P.

So it is about the "EMX learnability" of the family of all finite subsets of the unit interval.The family of sets ℱ∗ we consider is the family of all finite subsets of the interval [0, 1]. The class of probability distributions 𝒫∗we consider is the class of all distributions over [0, 1] with finite support.

Theorem

The EMX learnability of ℱ∗with respect to 𝒫∗ is independent of the ZFC axioms.