Engin wrote: ↑Wed Nov 20, 2024 2:05 am
I just don't understand how a neural network can work with integer values converted from floating-point numbers. Yes, it runs faster, but do you get the same results on the outputs after quantizing the whole network to integers? And would the sigmoid activation function in the forward pass work with integers too?? Because with sigmoid you always get floating-point numbers between 0.0 ... 1.0
NNs tend to work with lower precision than normal computer arithmetic (e.g. brain float - link). This appears to be using this fact to speed up calculations by using integers in place of any sort of float. Probably converted back to floats at the end.
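For illustration, a minimal sketch of the bfloat16 ("brain float") idea, not code from the network under discussion: a bfloat16 is simply the top 16 bits of an IEEE-754 float32, so conversion is little more than a shift.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// bfloat16 keeps the sign, the full 8-bit exponent and only 7 mantissa bits
// of a float32, i.e. simply its upper 16 bits (truncation used here;
// production code would round to nearest).
uint16_t float_to_bf16(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // bit-cast without value conversion
    return (uint16_t)(bits >> 16);
}

float bf16_to_float(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;    // the dropped mantissa bits become 0
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

int main() {
    float x = 0.3141592f;
    // the round trip shows the reduced precision (roughly 2-3 decimal digits)
    std::printf("%.7f -> %.7f\n", x, bf16_to_float(float_to_bf16(x)));
}
```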
Note that the 'integers' used in NNs are not integers in the mathematical sense, but in the computer-architecture sense. The program interprets them as fixed-point real numbers. True real numbers can never be represented in a computer, which is a finite, discrete device. So they always have to be 'quantized', and there always is a maximum and a minimum value that can be represented. The fixed-point system uses the same quantum over the entire range, which is perfect if the range doesn't span many orders of magnitude. But if it does, then it is often better to use small quanta on small numbers, and that gives you the floating-point system.
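A minimal sketch of that fixed-point idea; the int16_t width and the quantum of 1/64 are arbitrary choices for illustration, not taken from any particular engine.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

// Fixed-point: interpret an int16_t as (value / SCALE). The same quantum
// of 1/64 applies over the whole representable range.
const int SCALE = 64;

int16_t quantize(double x) {
    long v = std::lround(x * SCALE);
    if (v > INT16_MAX) v = INT16_MAX;   // saturate at the representable maximum
    if (v < INT16_MIN) v = INT16_MIN;   // ... and minimum
    return (int16_t)v;
}

double dequantize(int16_t q) {
    return (double)q / SCALE;
}

int main() {
    // 1.37 is stored as the nearest multiple of 1/64, i.e. 1.375
    std::printf("%f\n", dequantize(quantize(1.37)));
}
```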
NNs tend to work with lower precision than normal computer arithmetic (e.g. brain float - link). This appears to be using this fact to speed up calculations by using integers in place of any sort of float. Probably converted back to floats at the end.
Yes, then a 16-bit float would be enough, but those don't exist in C/C++; there are only 32-bit floats.
Converting integer to float and back again costs too much time, so where is the real speed gain?
Why not calculate with SIMD using only floats instead of integers?
There is no need to ever convert to float. Winning probability can very easily be represented by a 16-bit signed integer. You would not need more than 64K intermediates between certain win and certain loss; the noise in the evaluation would be far greater than 0.015%.
And I don't think there is anything like 8-bit floats. The dynamic range of 8-bit integers is so small (only 2 orders of magnitude) that it makes no sense. But there exists SIMD for packed 8-bit integers. Which has twice the throughput of SIMD for packed 16-bit floats.
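For illustration, a minimal AVX2 sketch of such a packed 8-bit dot product with 32-bit accumulation; the unsigned-input/signed-weight convention is simply what the maddubs instruction expects, not necessarily what any particular engine uses.

```cpp
#include <immintrin.h>
#include <cstdint>

// Dot product of n unsigned 8-bit inputs with n signed 8-bit weights,
// accumulated in 32-bit integers. Needs AVX2 (compile with -mavx2) and
// assumes n is a multiple of 32.
int32_t dot_u8_i8(const uint8_t* input, const int8_t* weights, int n) {
    const __m256i ones = _mm256_set1_epi16(1);
    __m256i acc = _mm256_setzero_si256();
    for (int i = 0; i < n; i += 32) {
        __m256i in = _mm256_loadu_si256((const __m256i*)(input + i));
        __m256i wt = _mm256_loadu_si256((const __m256i*)(weights + i));
        // 32 byte multiplies per instruction; adjacent pairs summed to int16
        __m256i p16 = _mm256_maddubs_epi16(in, wt);
        // widen the int16 pair sums to int32 and accumulate
        acc = _mm256_add_epi32(acc, _mm256_madd_epi16(p16, ones));
    }
    // horizontal sum of the eight int32 lanes
    __m128i s = _mm_add_epi32(_mm256_castsi256_si128(acc),
                              _mm256_extracti128_si256(acc, 1));
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(1, 0, 3, 2)));
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(s);
}
```

Each _mm256_maddubs_epi16 performs 32 byte-sized multiplies, against 16 products per instruction for packed 16-bit operands, which is where the factor of two in throughput comes from.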
hgm wrote: ↑Thu Nov 21, 2024 8:47 pm
And I don't think there is anything like 8-bit floats. The dynamic range of 8-bit integers is so small (only 2 orders of magnitude) that it makes no sense. But there exists SIMD for packed 8-bit integers. Which has twice the throughput of SIMD for packed 16-bit floats.
In fact there are 8-bit float formats that are supported by some GPUs. See: