
> To avoid extracting irrelevant features, the TSFRESH package has a built-in filtering procedure. This filtering procedure evaluates the explaining power and importance of each characteristic for the regression or classification tasks at hand.

> It is based on the well developed theory of hypothesis testing and uses a multiple test procedure. As a result the filtering process mathematically controls the percentage of irrelevant extracted features.

Here's the paper on this: https://arxiv.org/abs/1610.07717

It seems the relevance of the selected features is somewhat tunable through the p-value threshold you choose for the statistical tests. (Every feature selection algorithm I can think of has some tunable parameter, although the information-theoretic ones depend only on the number of features you're willing to consider.)




The individual feature significance tests do not have any parameters; they just generate the p-values.

The only parameter one can tune is the overall percentage of irrelevant extracted features, that is, the target false discovery rate (FDR) of the Benjamini-Yekutieli procedure.
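
For anyone who wants to see what that single knob does, here is a minimal sketch of the textbook Benjamini-Yekutieli step-up procedure in plain Python. It is not TSFRESH's actual code; the function name `benjamini_yekutieli` and the parameter name `fdr_level` are chosen here for illustration only. It takes the p-values from the per-feature tests and returns which features would be kept at a given FDR level.

```python
def benjamini_yekutieli(p_values, fdr_level=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Illustrative sketch only: in the feature-filtering setting, a rejected
    hypothesis means the feature is considered relevant and is kept.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Harmonic correction c(m) = 1 + 1/2 + ... + 1/m, which makes the
    # procedure valid under arbitrary dependence between the tests.
    c_m = sum(1.0 / i for i in range(1, m + 1))

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= k * q / (m * c(m)).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr_level / (m * c_m):
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for four features.
    print(benjamini_yekutieli([0.001, 0.8, 0.01, 0.5], fdr_level=0.05))
```

Raising `fdr_level` keeps more features at the cost of letting more irrelevant ones through; lowering it does the opposite. Note that the individual tests never see this parameter, which matches the point above.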



