The question is ill-posed in today's context: "common hardware" would have to include SOCs that cost just £0.50 (if you buy them by the thousand), draw only milliwatts of power, and have Bluetooth built in. It would certainly need to include the SOCs that run watches (right now, the SOCs in watches cost more than £0.50).
Many phone SOCs now include GPUs, so running NN-based engines on them is becoming imaginable.
However, for me, your question misses a MUCH more important question: when will superhuman engines stop needing NNs?
I've discussed this in depth before, but here's a simple piece of evidence that today's NNs are inefficient:
1. They train on datasets containing many orders of magnitude more positions than any human will ever see.
2. Despite that, at ply 1 (searching just one half-move deep), they're still well behind top human players.
Looking at a lot of evidence, I have concluded that these NNs are finding a large number of shallow (simple) patterns, whereas chess will eventually yield to a small number of deep patterns. Once we have these deep patterns, I think that small, cheap SOCs with low power consumption will be good enough.