My suggestion: compare against the normal (default PyOD) parameters. This gives you two lists of AUC scores, paired by index $i$:

Your params: [0.80, 0.75, 0.73, ...., 0.95]

PyOD params: [0.82, 0.71, 0.48, ...., 0.95]

Then look at two values:

- $\sum_i (\text{your}_i - \text{pyod}_i)$
  - Total improvement. If it is positive, then your parameters help ;)
  - But it is hard to see whether this is significant.
- Fraction of $i$ with $\text{your}_i > \text{pyod}_i$
  - Quantised: it only counts wins, so it does not care how much further you improve your parameters.
  - But it is easy to see whether this is significant:
    - 0.5 -> probably just random
    - 0.9 -> probably quite significant
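The two values above can be computed in a few lines. The following is a minimal sketch, not a definitive recipe: the function name `compare_auc` is hypothetical, and the exact two-sided sign test is my own addition as one possible way to put a number on "0.5 is random, 0.9 is significant". It assumes both lists hold one AUC per dataset, in the same order.

```python
from math import comb


def compare_auc(yours, pyod):
    """Return (total improvement, win fraction, sign-test p-value).

    `yours` and `pyod` are paired AUC lists: entry i of each list
    must come from the same dataset.
    """
    if len(yours) != len(pyod) or not yours:
        raise ValueError("need two non-empty lists of equal length")

    diffs = [y - p for y, p in zip(yours, pyod)]

    # Value 1: total improvement, sum_i (your_i - pyod_i).
    total = sum(diffs)

    # Value 2: fraction of datasets where your parameters win outright.
    wins = sum(d > 0 for d in diffs)
    losses = sum(d < 0 for d in diffs)
    win_fraction = wins / len(diffs)

    # Assumption (not in the original answer): exact two-sided sign test.
    # Ties are dropped; under the null, wins ~ Binomial(n, 0.5).
    n = wins + losses
    if n == 0:
        p_value = 1.0
    else:
        k = max(wins, losses)
        tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
        p_value = min(1.0, 2 * tail)

    return total, win_fraction, p_value


# Toy input: only the four scores visible above, with the elided ones left out.
total, win_fraction, p_value = compare_auc(
    [0.80, 0.75, 0.73, 0.95],
    [0.82, 0.71, 0.48, 0.95],
)
print(total, win_fraction, p_value)
```

With only a handful of datasets the p-value cannot get small, which matches the intuition that a win fraction only means something once there are enough comparisons behind it. If SciPy is installed, `scipy.stats.binomtest` provides the same kind of test without the hand-written tail sum.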