I use the dataset provided in this paper (arXiv:1902.09914): up to 600k anti-#k_T# jets in the training set, with

  #p_T# between $550\,\textrm{GeV}$ and $650\,\textrm{GeV}$
  $R_{i}^{2} = \eta_{i}^{2} + \phi_{i}^{2} \leq {0.8}^{2}$

The 4-vectors in each event are sorted by #p_T# and are preprocessed here into:

  #flag#: a constant
  $\Delta{\eta}$: $\eta = \log{\left(\frac{p + p_{3}}{p - p_{3}} \right)} / 2$, and $\Delta{\eta} = \eta - \operatorname{mean}{\left(\eta \right)}$
  $\Delta{\phi}$: $\phi = \operatorname{arctan_{2}}{\left(p_{2},p_{1} \right)}$, and $\Delta{\phi} = \phi - \operatorname{mean}{\left(\phi \right)}$
  $lp_{T}$: $p_{T}^{2} = p_{1}^{2} + p_{2}^{2}$, $\left(p_{T}^{jet}\right)^{2} = \left(p_{1}^{jet}\right)^{2} + \left(p_{2}^{jet}\right)^{2}$, and $lp_{T} = - \log{\left(\frac{p_{T}}{p_{T}^{jet}} \right)}$

Preprocessing
  Sort by the transverse momentum

Encoder
  Learn a graph (topK: connect each node to K neighbours)
  Run graph updates
  4 nodes -> 1 node

Decoder
  1 node -> 4 nodes
  Run graph updates
  Sort again by the transverse momentum

50k jets
Learning rate of #0.0003#
Batch size of 200
Train until the loss does not improve for 30 epochs
Compression size of 7
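
The preprocessing into (flag, $\Delta{\eta}$, $\Delta{\phi}$, $lp_{T}$) can be sketched in NumPy as follows. This is a minimal sketch and not the original code. It makes several assumptions that the notes do not state: the input columns are ordered (E, p_1, p_2, p_3), the jet momentum is the vector sum of its constituents, the constant flag is 1, zero-padded constituents have already been removed, and $\phi$ is not wrapped at $\pm\pi$. The function name `preprocess_jet` is hypothetical.

```python
import numpy as np

def preprocess_jet(p4):
    """Turn one jet's constituents into (flag, d_eta, d_phi, lp_T) rows.

    p4: array of shape (n, 4), assumed column order (E, p_1, p_2, p_3).
    Constituents with p == |p_3| (exactly along the beam) would divide by zero
    and are assumed not to occur.
    """
    p4 = np.asarray(p4, dtype=float)
    p1, p2, p3 = p4[:, 1], p4[:, 2], p4[:, 3]
    p = np.sqrt(p1**2 + p2**2 + p3**2)   # magnitude of the 3-momentum
    pt = np.sqrt(p1**2 + p2**2)          # transverse momentum

    # Sort constituents by descending transverse momentum.
    order = np.argsort(-pt)
    p1, p2, p3, p, pt = (a[order] for a in (p1, p2, p3, p, pt))

    eta = 0.5 * np.log((p + p3) / (p - p3))
    phi = np.arctan2(p2, p1)

    # Assumption: the jet momentum is the sum over its constituents.
    pt_jet = np.hypot(p1.sum(), p2.sum())

    d_eta = eta - eta.mean()
    d_phi = phi - phi.mean()
    lpt = -np.log(pt / pt_jet)
    flag = np.ones_like(pt)              # assumption: the constant is 1

    return np.stack([flag, d_eta, d_phi, lpt], axis=1)
```

Calling `preprocess_jet` on an `(n, 4)` array returns an `(n, 4)` array whose rows are ordered from the hardest to the softest constituent, so the leading constituents have the smallest $lp_{T}$.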
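
The topK step in the encoder (connect each node to K neighbours) could look roughly like the sketch below. The notes do not say which similarity is used or on which features it is computed, so the squared Euclidean distance on the node feature vectors is purely an assumption here, as is the hypothetical function name `topk_adjacency`. In the actual model the features entering this step would presumably be learned.

```python
import numpy as np

def topk_adjacency(x, k):
    """Connect every node to its k nearest neighbours.

    x: (n, d) node features; k must be at most n - 1.
    Returns an (n, n) 0/1 matrix where row i marks the neighbours of node i.
    Assumption: "nearest" means smallest squared Euclidean distance.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]

    # Pairwise squared distances between all nodes.
    diff = x[:, None, :] - x[None, :, :]
    dist = (diff**2).sum(axis=-1)

    # A node is never its own neighbour.
    np.fill_diagonal(dist, np.inf)

    # Indices of the k closest nodes for each row.
    idx = np.argsort(dist, axis=1)[:, :k]

    adj = np.zeros((n, n), dtype=int)
    adj[np.arange(n)[:, None], idx] = 1
    return adj
```

The resulting matrix is directed (node i may pick node j without j picking i). Whether the original graph is symmetrised afterwards is not stated in the notes.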