It is random, at least while it is learning. It would have most likely tried 5 layers, or even 50.
But the point is to simplify it enough while still working the way it should. And when maximizing the efficiency, you generally get only a handful of efficient ways your problem can be solved.
It is random, at least while it is learning. It would have most likely tried 5 layers, or even 50.
But the point is to simplify it enough while still working the way it should. And when maximizing the efficiency, you generally get only a handful of efficient ways your problem can be solved.