It’s for my daughter who happens to be close to Jeff and Heidi Dean. Jeff thinks it’s workable and so he’ll be following it but, of course, from a distance.

I’m not sure there’s a good analogy/connection to the Central Limit Theorem here, though what you might be thinking of is ensembles of models. For example, if you have ten models that make predictions using different parameters, different cuts of the data, and so on, then averaging their outputs can often give better performance than any single model. This is one of the core ideas at the heart of random forests, which are remarkably effective for many classification and regression tasks.
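To make the CLT-flavored intuition concrete: if you average the predictions of several roughly independent, unbiased but noisy models, the variance of the average shrinks roughly as 1/n. This is a toy sketch, not a real trained ensemble — each "model" here is just a hypothetical unbiased predictor with Gaussian noise, and all the constants are made up for illustration:

```python
import random

random.seed(0)

TRUE_VALUE = 5.0   # the quantity our hypothetical models try to predict
NOISE = 2.0        # per-model noise standard deviation (assumed)
N_MODELS = 10      # ensemble size
N_TRIALS = 2000    # repetitions used to estimate mean squared error

def one_model_prediction():
    # Stand-in for a single trained model: unbiased, but noisy.
    return TRUE_VALUE + random.gauss(0, NOISE)

def squared_error(pred):
    return (pred - TRUE_VALUE) ** 2

# MSE of a single model: should be close to NOISE**2 = 4.
single_mse = sum(
    squared_error(one_model_prediction()) for _ in range(N_TRIALS)
) / N_TRIALS

# MSE of a 10-model average: with independent noise, close to NOISE**2 / 10.
ensemble_mse = sum(
    squared_error(sum(one_model_prediction() for _ in range(N_MODELS)) / N_MODELS)
    for _ in range(N_TRIALS)
) / N_TRIALS

print(f"single model MSE:    {single_mse:.2f}")
print(f"10-model ensemble:   {ensemble_mse:.2f}")
```

The catch, and the reason this is only an analogy, is independence: real models trained on the same data make correlated errors, which is why random forests work hard to decorrelate their trees (bootstrap samples, random feature subsets).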

More practically, for a really in-depth study of parameter choices in a convolutional neural net for sentiment analysis, check out this paper by Zhang and Wallace: http://arxiv.org/abs/1510.03820

“…. deep learning methods: they can work amazingly well, but they are very sensitive to initialization and choices about the sizes of layers, activation functions, and the influence of these choices on each other.”

I am wondering whether we can find/devise something similar to the Central Limit Theorem.
