RandomGuy321 wrote: Tue Sep 13, 2022 1:07 am
Hey, thanks for getting back to me. I appreciate looking at the work you did since you sometimes don't get such interesting examples of statistical methods in a classroom setting (I took a class last semester). So, if he didn't cheat otb in this period, it would be historic right? I am curious how a predictive model would compare to Hans's performance this year if he were to stay on track from that specific date range you analyzed.
Though regressions can be used as predictive models, that's not really the interpretation I think makes the most sense here. Implicit in my analysis is the idea that cherry picking of the interval is irrelevant, or maybe even desirable. My contention is that it would be worrying if any player's performances were correlated with broadcast status over any large, unbroken span of games.
Obviously, if a player participates in enough tournaments, we can most likely construct a sizable dataset out of broadcast tournaments where he performed well, and non-broadcast tournaments where he performed poorly. That would be dishonest, to say the least. But if we're trying to answer a question like, "Are there any significant stretches where it appears a player might have been cheating?" then in fact you probably do want to pick a sufficiently long interval (at least a year?) and then cherry pick a player's most suspicious span of that length.
(Hence I'm unconcerned if the original tweeter actually did this. He picked a solid 20-month interval, and included all classical games from it, which seems reasonable to me. My only real concern about that data is the possibility that the original tweeter might have systematically erred or lied about broadcast status in just the right way. I didn't find any evidence of that, but I also wasn't able to definitively confirm all of his broadcast status claims.)
One argument you might make in favor of such an interval selection procedure is to contend that, most likely, not all cheaters actually cheat all of the time.
Just to wildly speculate, maybe your cheating method requires a confidant to stand within 100 meters of you with a phone that connects via Bluetooth to eardrum monitors. Maybe your buddy's willing to drive to most American tournaments, but doesn't have the time or resources to follow you when you spend a year in Europe. Or maybe your confidant graduates from college and no longer has much free time. Or maybe after you accomplish an important goal like earning a title, you think to yourself, "Wow, I can't believe I pulled that off without getting caught. I'd better cool it for a while," and then you stop cheating for a stretch.
Et cetera. I'm sure you can think of any number of scenarios. But since the most pressing question is "Did this person ever cheat?" and not "Did this person cheat most of the time?" I think it's well motivated to try to identify a player's most damning interval. And thus you're not really using regression analysis to predict performance, but rather to identify blocks of games that have unusual correlations, and to get a feel for how performance was influenced by various explanatory variables in those games.
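To make the "most damning interval" idea concrete, here's a minimal sketch of how you might scan for it. This is not the code from my analysis. It assumes each game has been reduced to a broadcast flag plus a "surplus" (actual score minus the score expected from ratings), and it uses the difference in mean surplus between broadcast and non-broadcast games as a stand-in for the regression coefficient. With a single 0/1 explanatory variable, that difference is the same number as the ordinary least squares slope. The function names and the data are made up for illustration.

```python
from statistics import mean

def broadcast_effect(games):
    """Mean surplus in broadcast games minus mean surplus in the rest.

    Each game is a (broadcast, surplus) pair, where surplus is the
    actual score minus the rating-expected score (an assumed encoding).
    """
    on = [s for b, s in games if b]
    off = [s for b, s in games if not b]
    if not on or not off:
        return None  # undefined unless the window has both kinds of games
    return mean(on) - mean(off)

def most_suspicious_window(games, window):
    """Slide a fixed-length window over chronologically ordered games and
    return (start_index, effect) for the window with the largest effect."""
    best = None
    for start in range(len(games) - window + 1):
        effect = broadcast_effect(games[start:start + window])
        if effect is not None and (best is None or effect > best[1]):
            best = (start, effect)
    return best

# Made-up data: (was the game broadcast?, surplus over expected score)
games = [
    (False, -0.1), (True, 0.0), (False, 0.1), (True, -0.1),
    (True, 0.4), (False, -0.2), (True, 0.5), (False, -0.1),
    (True, 0.0), (False, 0.0),
]
print(most_suspicious_window(games, window=4))
```

In practice the window would be something like a year or more of games rather than four, and you'd want the other explanatory variables in there too.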
In my analysis, Niemann's broadcast status regression coefficient was large and positive, with a p-value of 0.00088, if I recall correctly. That's a fairly high degree of statistical significance, but you'd still expect about 1 in 1,000 players to show a stretch at least that extreme purely by chance, and that doesn't make them all cheaters. Even after you identify a potentially suspicious player, you still want to do further analysis to see if cheating seems like the best mechanism for explaining your observations:
- For example, maybe you identify a player with a tiny p-value, but their broadcast status regression coefficient is negative. So their performance is getting worse at broadcast tournaments? It's hard to understand how the option to easily cheat would make someone play worse, so maybe that's just random chance. (Or maybe a lot of their opponents are cheating?)
- Maybe you identify a mediocre player whose broadcast status regression coefficient is positive and significant, but quite small, so they appear to perform just a tiny bit better at broadcast tournaments. Why would someone go to that much trouble to perform only slightly better? Maybe it's just random, or there's actually some confounder that can entirely explain such a modest effect. (E.g. they tend to play in local, non-broadcast tournaments throughout the busy school year, but travel to lots of broadcast tournaments over the summer, when they also dedicate lots of time to studying chess.)
- Maybe you identify a player whose broadcast status regression coefficient is positive and significant, but within that span there are multiple instances of broadcast tournaments where they could have, say, earned a norm, but failed to do so, despite performing somewhat better than expected. Maybe that's just a bad, dumb, or overly cautious cheater, but I think most people would expect such a player to be extra motivated to cheat in order to achieve a concrete goal like earning a norm. So maybe it's just random.
On the other hand, if you identify a player with a large, positive, significant broadcast status regression coefficient, and despite looking for various confounders you can't identify anything remotely plausible that could explain such a large effect, and this player is widely known to have multiple instances of online cheating, even in money tournaments, and some of his peers seem to have vague suspicions that his true ability might be quite a bit lower than his rating, and then you look at the specific characteristics of each tournament in a suspicious interval, and note that not only did this player overperform in broadcast tournaments, he REALLY overperformed in precisely those tournaments where a norm was on the table (with average centipawn loss statistics better than any super-GM's historical best), and he just so happened to only attend norm tournaments that had a live broadcast, etc.
All of these observations are like sticking your finger in the air to see which way the wind's blowing. If the wind always seems to be blowing in the same direction every time you check ... well, there might just be a storm coming.
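For anyone who wants to play with the significance side of this, here's a rough sketch of how a p-value for a broadcast effect could be estimated, and how many players you'd expect to hit a given p-value by luck alone. To be clear about assumptions: this uses a one-sided permutation test on the difference in mean surplus (actual score minus rating-expected score), not the regression I actually ran, and the function names and numbers are invented for illustration.

```python
import random
from statistics import mean

def effect(flags, surplus):
    """Mean surplus in broadcast games minus mean surplus in the rest."""
    on = [s for f, s in zip(flags, surplus) if f]
    off = [s for f, s in zip(flags, surplus) if not f]
    return mean(on) - mean(off)

def permutation_p_value(flags, surplus, n_perm=2000, seed=0):
    """One-sided p-value: how often does randomly reassigning the
    broadcast labels give an effect at least as large as the real one?"""
    rng = random.Random(seed)
    observed = effect(flags, surplus)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps the number of broadcast games fixed
        if effect(shuffled, surplus) >= observed:
            hits += 1
    # The +1 counts the observed labelling itself, so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)

def expected_chance_hits(n_players, p_value):
    """Expected number of players reaching this p-value with no real effect."""
    return n_players * p_value

# Made-up example: five broadcast games followed by five non-broadcast games.
flags = [True] * 5 + [False] * 5
surplus = [0.5, 0.4, 0.6, 0.5, 0.4, -0.1, 0.0, -0.2, 0.1, -0.1]
print(permutation_p_value(flags, surplus))
print(expected_chance_hits(1000, 0.00088))
```

The second function is just the arithmetic behind the "about 1 in 1,000" remark: a p-value of 0.00088 across 1,000 players works out to a bit under one player expected by chance.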