<frame title="pipeline">
<list>
<e>Again, several modifiers are available</e>
<l2st>
<e>nonconst->remove constant features</e>
<e>shuffle</e>
<e>normalize('zscore'/'minmax')</e>
<e>cut(10)->at most 10 datasets</e>
<e>split->train/test split, all anomalies in the test set</e>
<e>crossval(5)->similar to split, but repeated multiple times (cross-validation)</e>
</l2st>
<e>Modifiers interact with each other</e>
<e>For example: normalize('minmax'), split</e>
<e>->training set always within [0, 1], but no guarantee for the test set</e>
</list>
</frame>
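<!--
Speaker note: the interaction between normalize('minmax') and split can be illustrated with a minimal sketch. It uses plain NumPy rather than the library's own API, and it assumes that the min/max statistics are computed on the training set only and then reused for the test set. The helper names minmax_fit and minmax_apply and the synthetic data are hypothetical and exist only for this illustration.

```python
import numpy as np


def minmax_fit(train):
    """Return the per-feature minimum and range, computed on the training set only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant features
    return lo, span


def minmax_apply(x, lo, span):
    """Scale x with statistics that were fitted elsewhere (here: on the training set)."""
    return (x - lo) / span


rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 3))
# Synthetic anomalies, shifted far outside the range of the normal samples.
anomalies = rng.normal(0.0, 1.0, size=(20, 3)) + 10.0

# Split: the training set holds only normal samples, all anomalies go to the test set.
train = normal[:150]
test = np.vstack([normal[150:], anomalies])

lo, span = minmax_fit(train)
train_scaled = minmax_apply(train, lo, span)
test_scaled = minmax_apply(test, lo, span)

print("train range:", train_scaled.min(), train_scaled.max())
print("test range: ", test_scaled.min(), test_scaled.max())
```

Under these assumptions the training values stay inside [0, 1] by construction, while the anomalies in the test set land above 1, which is the lack of guarantee mentioned on the slide.
-->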