You are welcome to find my paper when you learn to google.

Milos wrote: Sorry kiddo, that empty boasting won't help much; your previous post clearly suggests you have no clue what a TPU is, and have not read that seminal paper.
So you actually think that double precision is used for training?

Milos wrote: (dp) floating point for training. Why, I leave you to figure it out (that is pretty basic stuff btw.).

I leave you to figure out why dp isn't used in deep learning at all (this is known by anyone who has ever done deep learning, btw.).
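For anyone following along, here is a minimal Python sketch (assuming a stock PyTorch install; the layer shape is arbitrary) showing that training parameters default to single precision, with half precision as an explicit opt-in. Double precision is typically reserved for things like numerical gradient checks, not for training itself.

    # Minimal sketch, assuming a stock PyTorch install: model parameters
    # default to single precision (fp32), not double (fp64).
    import torch

    layer = torch.nn.Linear(128, 10)
    print(next(layer.parameters()).dtype)    # torch.float32

    # Half precision (fp16) is an explicit opt-in, and it is what the
    # P100/V100/TPU throughput numbers in this thread rely on.
    x = torch.randn(4, 128).half()
    print(x.dtype)                           # torch.float16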
Milos wrote: You didn't demonstrate that anything I wrote was wrong; you just basically confirmed it and proved that you were wrong.
You claimed the TPU is 180 TOPS, while it is 180 TFLOPS per pod of four TPUs, so your estimate is four times higher than it should be. You also don't mention that the P100/P40 support half-precision FP, which provides twice the FLOPS; that puts us at about 10 P100s to match 4 TPUs. You claim the V100's Tensor Cores are shit, though I'm sure you have never used them. You claim the V100 costs $10000, while it actually costs $8000.
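To make that arithmetic concrete, here is a back-of-the-envelope check in Python. The peak figures are the commonly quoted datasheet numbers for the PCIe P100 and the first-generation four-chip Cloud TPU pod; treat them as approximations, not measured throughput.

    # Rough sanity check of the "10 P100s to match 4 TPUs" arithmetic.
    # Figures are commonly quoted datasheet peaks, not benchmarks.
    tpu_pod_tflops = 180.0                   # one pod of four TPUs, not one chip
    p100_fp32_tflops = 9.3                   # PCIe P100, single precision
    p100_fp16_tflops = 2 * p100_fp32_tflops  # half precision doubles the rate

    cards_needed = tpu_pod_tflops / p100_fp16_tflops
    print(round(cards_needed, 1))            # ~9.7, i.e. about 10 cards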
It's clear that you have no arguments and thus switch to ad hominem. You've also demonstrated that you know about deep learning a bit less than nothing.

Milos wrote: It is clear from your writing that you are some kiddo (probably got hold of his first ML course, or even a Google internship, and is now overexcited and, like most of the youth, full of himself), so I won't hold this discussion with you any more.
When you actually have some substance to write about, then you can come back.