AndrewGrant wrote: ↑Sun Sep 11, 2022 7:29 am
Would like to see the information concerning...
1. The ratings of the players he played against, ( at the time and currently )
That was a common response people gave to that person's tweet, which is why I was intrigued and looked into it more deeply, with more detailed data and a more rigorous statistical analysis, which I already included in the very post you quoted.
But, to recap: the average strength of Niemann's opponents was a statistically insignificant predictor of his performance in USCF-rated classical tournaments over the ~two year time period from March 2019 through November 2020.
If you wish to drill down even further, I included links to all of the relevant cross tables. Each individual game, as well as the rating of each individual opponent, is given in those tables.
2. Whether he knew the pairings ahead of time to prep,
I very much doubt this data exists anywhere but in the mind of Hans Niemann. Maybe by contacting each tournament director you could find out what pairing procedure was used at each tournament, but I'm not going to do that. The regression I ran showed that 67% of the variation in Niemann's performance at USCF-rated classical tournaments over that time period is already explained by the broadcast status of the tournament.
Unless you believe broadcast status is VERY strongly and significantly correlated with Niemann's ability to prepare for his opponents, prep couldn't explain enough of the variation in Niemann's performance, and my results remain damning.
Assuming you've accounted for confounders, even if only 10% of the variation in performance remains explained by broadcast status, it would be incredibly concerning. There should be no correlation whatsoever, so the onus is on you to show me the confounders, and explain how the mere fact that a tournament was being broadcast over the internet could plausibly and honestly spike someone's performance rating by a couple hundred Elo.
(Also, the correlation of broadcast status and performance, and the estimate for its regression coefficient, are almost certainly lower bounds.)
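For those unfamiliar with what "X% of the variation is explained by" means when the predictor is a yes/no flag, here is a minimal sketch. With a single 0/1 predictor, the regression slope is just the difference between the two group means, and R² is the between-group share of the total sum of squares. The function name `binary_regression` and the four numbers fed to it are made up purely for illustration; they are not Niemann's results and have nothing to do with the 67% figure.

```python
def binary_regression(y0, y1):
    """Regress an outcome on a single 0/1 indicator.

    y0: outcomes where the indicator is 0
    y1: outcomes where the indicator is 1
    Returns (slope, r_squared).
    """
    n0, n1 = len(y0), len(y1)
    mean0 = sum(y0) / n0
    mean1 = sum(y1) / n1
    grand_mean = (sum(y0) + sum(y1)) / (n0 + n1)

    # With one binary predictor, the OLS slope is the gap between group means.
    slope = mean1 - mean0

    # R^2 is the share of total variation that sits between the two groups.
    ss_between = n0 * (mean0 - grand_mean) ** 2 + n1 * (mean1 - grand_mean) ** 2
    ss_total = sum((v - grand_mean) ** 2 for v in y0 + y1)
    return slope, ss_between / ss_total


# Toy numbers in arbitrary units (NOT real tournament data).
slope, r_squared = binary_regression([1.0, 3.0], [5.0, 7.0])
print(slope, r_squared)
```

An R² near zero would mean the flag tells you nothing about the outcome; the closer it gets to 1, the more of the spread is accounted for by which group an observation falls in.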
You can make data say whatever you want if you try hard enough.
I wasn't trying to make the data say anything. I came into this drama with no horse in the race. If anything, I was slightly biased against Carlsen because my previous impression of him was that he was a bit of a jerk, a prima donna, and a sore loser. (And I had no idea who Niemann was.)
My opinion was somewhat shifted in favor of Carlsen by Niemann's interviews, because:
- Niemann admitted to multiple instances of past cheating, but essentially claimed the only times he'd ever cheated were the times he'd been caught, which is obviously an absurd lie.
- Niemann seemed completely out of his depth in post-game analysis, a sentiment echoed by a fair number of strong players, even super-GMs.
- Niemann's demeanor and "impassioned" defenses of himself set off warning bells in my mind for narcissistic personality disorder.
Nonetheless, I still remained fundamentally undecided, because I don't assign much weight to evaluations based on limited data, especially when I've never personally interacted with someone.
Anyway, I was sent that tweet and was intrigued by its claims. So I verified the correctness and completeness of the data, gathered more detailed data (like number of games and average rating of opponents) that I thought were reasonable proxies for potential confounding factors (like fatigue, mathematical caps on performance rating, or playing less seriously against low-rated opponents), and ran a bog-standard linear regression. (I also carried out an informal, cursory check to see if I could identify any other players from those same tournaments who exhibited remotely similar patterns in their performances. I could not.)
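For anyone who wants to see the mechanics of a regression with that shape (performance against a broadcast flag plus the two proxy variables), here is a minimal pure-Python sketch using the normal equations. Everything in it is an assumption for illustration: the helper names `solve_linear_system` and `ols`, the coefficients in `TRUE_BETA`, and every row in `ROWS`. The outcome is generated noise-free from `TRUE_BETA` so the fit can be checked by eye; none of it is Niemann's data, and it is not a reproduction of the analysis described above.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares with an intercept.

    rows: one list of predictor values per observation (no intercept column)
    y:    one outcome per observation
    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical coefficients: intercept, broadcast flag, games, opponent offset.
TRUE_BETA = [2400.0, 150.0, -2.0, 10.0]

# Made-up rows: (broadcast 0/1, games played, avg opponent rating offset
# from an arbitrary baseline, in hundreds). NOT real tournament data.
ROWS = [
    (1, 9, 1.0), (0, 5, -0.5), (1, 7, 0.5), (0, 9, 1.5),
    (1, 5, -1.0), (0, 7, 0.0), (1, 6, 1.5), (0, 6, -1.0),
]
Y = [TRUE_BETA[0] + sum(b * v for b, v in zip(TRUE_BETA[1:], r)) for r in ROWS]

beta, r_squared = ols(ROWS, Y)
print([round(b, 6) for b in beta], round(r_squared, 6))
```

Because the toy outcome has no noise, the fit recovers `TRUE_BETA` and R² comes out at 1. With real data you would also want standard errors and p-values for each coefficient, which a statistics package reports and this sketch does not compute.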
The results of that analysis are so overwhelming and definitive that I have a hard time imagining any other interpretation of the data.
I would not be surprised to learn that you could also conclude that Hans is a significantly better player only when given the opportunity to prepare deeply against his opponent, and that he is unable to beat similar-aged players (who have very un-established ratings) decisively in Swiss events for example.
I dunno, maybe? But like I said, unless his capacity for deep preparation is very very very strongly and positively correlated with the broadcast status of a tournament (which sounds facially absurd to me, maybe you can explain how this could be the case), then it doesn't really matter. 67% of the variation in Niemann's performance is explained by broadcast status, and the regression coefficient for that status has a vanishingly small p-value: 0.00088.
I'm quite certain I don't need to tell you this, but for those who aren't familiar, the usual significance threshold is 0.05, and a lower number means a stronger result. P-value isn't everything, and "p-hacking" is a concern in some contexts, but this is not one of those contexts. All I did was run a standard analysis of an unbroken block of two years of the dude's recent games. Even if the original tweeter "cherry picked" the interval ... it's two solid years of games and a p-value almost two orders of magnitude smaller than it needs to be. Not sure what else needs to be said. The ball is absolutely not in my court.
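To make the "almost two orders of magnitude" remark concrete, the arithmetic can be checked in a couple of lines. The only inputs are the two numbers already quoted above (the 0.05 threshold and the 0.00088 p-value); the variable names are just for illustration.

```python
import math

ALPHA = 0.05       # conventional significance threshold
P_VALUE = 0.00088  # p-value quoted for the broadcast-status coefficient

# How many times smaller the p-value is than the threshold,
# and the same gap expressed in powers of ten.
ratio = ALPHA / P_VALUE
orders_of_magnitude = math.log10(ratio)

print(f"{ratio:.1f}x below the threshold, "
      f"{orders_of_magnitude:.2f} orders of magnitude")
```

That works out to a factor of roughly 57, or about 1.75 orders of magnitude.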