Now let's do the same with an autoencoder. Instead of calling #x# the input and #y# the desired output, #(x,y)# are both input and output. This means the function to learn is just the identity; to force the network to learn something nontrivial, we add a compression (a bottleneck) in the middle of the architecture.

After training, we see that the autoencoder has basically learned the same function as before (up to some numerics), but:
- we can no longer use the autoencoder to predict the #y# value for a given #x#
- still, the same information defining the relation between #x# and #y# is stored in the autoencoder

How do we use this information? Compare the prediction to the input; the difference is the loss:
- if the #(x,y)# pair matches the function: the loss is small
- if it does not match: the loss is big

So you can use the loss of an autoencoder to separate different classes. This was used by QCDorWhat (arXiv:1808.08979) for unsupervised top tagging.

Set a cut somewhere: everything above it is classified as signal, everything below as background. For each cut, measure the error rates:
- true positive rate: fraction of signal classifications in the signal sample
- false positive rate: fraction of signal classifications in the background sample

Measure the network quality as #Eq(auc, integrate(tpr(fpr), (fpr, 0, 1)))#

QCDorWhat (arXiv:1808.08979) tries two different approaches:
- image based
- LoLa (Lorentz layer) based

This paper is used here to provide reference points:
- the worst autoencoder
- the best image-based one
- the best LoLa-based one (which is their best autoencoder)
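The idea of using the reconstruction loss to flag #(x,y)# pairs that do not match the learned function can be sketched with a toy linear autoencoder. This is a minimal illustration, not the paper's setup: a linear autoencoder with a 1D bottleneck is equivalent to a projection onto the first principal component, so we build it directly from the SVD instead of training a network.

```python
import numpy as np

# Toy data: (x, y) pairs that follow the function y = 2x.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
data = np.stack([x, 2 * x], axis=1)

# A linear autoencoder with a 1D bottleneck projects onto the first
# principal component; we obtain that direction from the SVD.
_, _, vt = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)
direction = vt[0]           # shared encoder/decoder weights (1D latent)
mean = data.mean(axis=0)

def reconstruction_loss(point):
    """Encode to the 1D bottleneck, decode, return the squared error."""
    centered = point - mean
    code = centered @ direction        # encode: project to 1D
    recon = code * direction + mean    # decode: map back to 2D
    return float(np.sum((point - recon) ** 2))

loss_match = reconstruction_loss(np.array([0.5, 1.0]))     # pair on y = 2x
loss_nomatch = reconstruction_loss(np.array([0.5, -1.0]))  # pair off y = 2x
print(loss_match, loss_nomatch)
```

A pair that matches the learned relation reconstructs almost perfectly (tiny loss), while a mismatched pair cannot pass through the bottleneck without damage (large loss) — which is exactly why the loss works as a classification score.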
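The cut sweep and the AUC integral can be sketched as follows. The scores here are hypothetical anomaly scores (e.g. autoencoder losses), not results from the paper; the trapezoid rule stands in for the integral #integrate(tpr(fpr), (fpr, 0, 1))#.

```python
import numpy as np

# Hypothetical anomaly scores: in this toy example the signal sample
# tends to have higher loss than the background sample.
signal_scores = np.array([0.9, 0.8, 0.7, 0.4])
background_scores = np.array([0.6, 0.3, 0.2, 0.1])

# Sweep the cut over all observed scores, from high to low; everything
# above the cut is classified as signal, everything below as background.
cuts = np.sort(np.concatenate([signal_scores, background_scores]))[::-1]
tpr = [(signal_scores >= c).mean() for c in cuts]      # true positive rate
fpr = [(background_scores >= c).mean() for c in cuts]  # false positive rate

# Prepend the (fpr, tpr) = (0, 0) point and integrate tpr over fpr
# with the trapezoid rule to get the AUC.
fpr = np.concatenate([[0.0], fpr])
tpr = np.concatenate([[0.0], tpr])
auc = float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))
print(f"AUC = {auc:.3f}")
```

An AUC of 1 means some cut separates signal and background perfectly; an AUC of 0.5 means the score carries no discriminating information.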