Hacker News

Not the function itself, but the log ratio of the fit probabilities of any given pair of these. That is differentiable at most in the Lebesgue (almost-everywhere) sense, and the likelihood requires something stronger: it has to be smooth everywhere for the KL divergence to be well defined. Adding a constant doesn't help; the logarithm still breaks at zero in the log probability ratio.
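To make the failure mode concrete, here is a minimal sketch (my own toy example, discrete distributions rather than the fits discussed above): wherever one fit assigns zero probability to an outcome the other supports, the log ratio, and hence KL, blows up.

```python
import numpy as np

def kl(p, q):
    """KL(p||q) for discrete distributions: sum of p * log(p/q).
    Returns +inf when q puts zero mass where p has support."""
    with np.errstate(divide="ignore", invalid="ignore"):
        # Terms with p == 0 contribute 0 by convention (x log x -> 0).
        terms = np.where(p > 0, p * (np.log(p) - np.log(q)), 0.0)
    return terms.sum()

p = np.array([0.5, 0.5])
q = np.array([1.0, 0.0])   # q has zero mass on the second outcome

print(kl(p, q))            # inf: the log probability ratio breaks at zero
print(kl(p, p))            # 0.0: identical distributions
```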

So both BIC and AIC are ill defined for this family of functions... part of the reason the measure returns worthless garbage. The same thing happens with fits based on neural nets with tanh or clipped activations (because the sum of activations is nonsmooth, and so is the fit probability), but not with RBMs, GMMs, or exponential neurons, which produce Gaussian or Pareto fit probabilities. (Polynomials probably also fail because they're not smooth functions, but both measures could be corrected for nonsmoothness of the likelihood in that case.) A sum of sincs should work too, as you get Wishart fit probabilities.
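For reference, both criteria are just penalized maximized log-likelihoods (AIC = 2k − 2 ln L̂, BIC = k ln n − 2 ln L̂), which is why they presuppose a well-behaved likelihood in the first place. A minimal sketch with a Gaussian fit (the data and variable names are mine, for illustration only):

```python
import numpy as np

def aic_bic(log_lik, k, n):
    """Standard definitions: AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L.
    Both assume the maximized log-likelihood log_lik is well defined."""
    return 2 * k - 2 * log_lik, k * np.log(n) - 2 * log_lik

# Maximum-likelihood Gaussian fit: k = 2 parameters (mu, sigma).
rng = np.random.default_rng(0)
x = rng.normal(3.0, 1.5, size=200)
mu, sigma = x.mean(), x.std()          # MLE uses the biased std
log_lik = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                 - (x - mu)**2 / (2 * sigma**2))

aic, bic = aic_bic(log_lik, k=2, n=len(x))
# For n = 200, ln n > 2, so BIC penalizes the 2 parameters harder than AIC.
```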

As for this funny chaotic function? I have no idea how the answers are distributed, or whether the distribution is even continuous.
