Code: Select all
0.152658045592678 0.146846164052304
0.152672921563584 0.146796660198495
0.152628787384212 0.146750211537819
0.152533272772998 0.146706348167116
Joost Buijs wrote: ↑Tue Aug 28, 2018 4:12 pm
    Henk wrote: ↑Tue Aug 28, 2018 2:58 pm
        For the loss function I still use mean square error. Maybe I should change that. For the activation function I use SELU. I read that if you use SELU you don't need batch normalization, because it is self-normalizing. But computing Exp(x) is one of the slowest operations during training.
    I don't think that for SELU the accuracy of the Exp() function plays a big role; maybe you can use an approximation with a Taylor series or something alike. It won't make a difference of an order of magnitude, but every bit of speed you can gain will help, of course.

I used this code to approximate Math.Exp(x), but it only made things slower.
Code: Select all
// Standard SELU parameters: lambda = 1.0507, alpha = 1.67326
public static double SELU(double x) { return 1.0507 * (x >= 0 ? x : 1.67326 * (Math.Exp(x) - 1)); }
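Not in the original post, but worth noting since training speed is the concern: backpropagation also needs the derivative of SELU, which calls Exp() for negative inputs as well. A minimal sketch using the same constants (the method name is just illustrative):

Code: Select all
// Sketch: derivative of SELU for backpropagation (not from the thread).
// For x >= 0 it is lambda; for x < 0 it is lambda * alpha * exp(x).
public static double SELUDerivative(double x)
{
    return 1.0507 * (x >= 0 ? 1.0 : 1.67326 * Math.Exp(x));
}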
Code: Select all
// Truncated Taylor series for Math.Exp(x): 1 + x + x^2/2! + ... + x^9/9!
// (wrapped in a method here so the fragment compiles; the name is illustrative)
static double ExpTaylor(double x)
{
    double sum = 1;
    double term = 1;
    for (int i = 1; i <= 9; i++)
    {
        term = term * x / i;   // term now holds x^i / i!
        sum += term;
    }
    return sum;
}
Code: Select all
// assumes x < 0; for -0.1 < x < 0 a fifth-order Taylor polynomial is accurate enough
if (x > -0.1)
{
    unchecked
    {
        double sqX = x * x;
        // 1 + x + x^2/2 + x^3/6 + x^4/24 + x^5/120 (0.041666666 ≈ 1/24)
        double sum = 1 + x + sqX * (0.5 + x / 6 + sqX * (0.041666666 + x / 120));
        Debug.Assert(Math.Abs(sum - Math.Exp(x)) <= 0.0001);
        return sum;
    }
}
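If the series approximation doesn't pay off, one alternative that was not mentioned in the thread is Schraudolph's bit-manipulation approximation of exp(), which trades accuracy (a relative error of a few percent) for speed. A minimal C# sketch, assuming that much error is tolerable for SELU:

Code: Select all
// Sketch of Schraudolph's exp() approximation (not from the thread):
// writes a scaled and offset value into the high 32 bits of an IEEE 754 double.
// Roughly valid for -700 < x < 700, with a relative error of a few percent.
static double FastExp(double x)
{
    // 1512775.3951951856 = 2^20 / ln(2); 1072632447 = (1023 << 20) - 60801 (bias tweak)
    long hi = (long)(1512775.3951951856 * x + 1072632447.0);
    return BitConverter.Int64BitsToDouble(hi << 32);
}

Whether a few percent of error is acceptable for SELU is something you would have to measure against a Math.Exp baseline.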
Joost Buijs wrote: ↑Mon Sep 17, 2018 1:12 pm
    Why don't you switch to ReLU as the activation function? At least it is fast, because it doesn't need exp().

ReLU is not self-normalizing. If I use ReLU I have to implement batch normalization as well. Adding batch normalization might make it slower again; I don't know yet.
Henk wrote: ↑Mon Sep 17, 2018 1:16 pm
    Joost Buijs wrote: ↑Mon Sep 17, 2018 1:12 pm
        Why don't you switch to ReLU as the activation function? At least it is fast, because it doesn't need exp().
    ReLU is not self-normalizing. If I use ReLU I have to implement batch normalization as well. Adding batch normalization might make it slower again; I don't know yet.

Like others already said, purchasing a decent GPU would help a lot; of course you have to dive into CUDA programming, or use Python with TensorFlow like all these script kiddies do.
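For comparison, this is roughly what the suggested ReLU path would look like in the same style; it avoids exp() entirely, but as noted above it is not self-normalizing, so batch normalization (or something similar) would still be needed:

Code: Select all
// Sketch (not from the thread): ReLU and its derivative, no exp() needed.
public static double ReLU(double x) { return x > 0 ? x : 0.0; }
public static double ReLUDerivative(double x) { return x > 0 ? 1.0 : 0.0; }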