<subsection Why does the Graph fall off?>
<frame>
<split>
<que>
<list>
<e>To understand why, first consider how different tests are combined</e>
<e>Since the loss is just a (quadratic) sum of the feature/particle losses, this is what we need</e>
<e>To model this, let's consider losses built from overlapping Gaussians</e>
</list>
</que>
<que>
<i f="dist1" f2="dist2"></i>
</que>
</split>
</frame>
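As a sketch of what overlapping Gaussians mean for the quality of a test: for two Gaussian score distributions the AUC has a closed form. The parameter values below are illustrative, not the ones behind the dist1/dist2 figures.

```python
import numpy as np
from scipy.stats import norm

def gaussian_auc(mu_B, sigma_B, mu_S, sigma_S):
    """AUC of a Gaussian discriminant: P(d_S > d_B) for
    background ~ N(mu_B, sigma_B) and signal ~ N(mu_S, sigma_S)."""
    return norm.cdf((mu_S - mu_B) / np.hypot(sigma_B, sigma_S))

# the more the two Gaussians overlap, the closer the AUC is to 0.5
print(gaussian_auc(0.0, 0.5, 1.0, 0.5))   # well separated
print(gaussian_auc(0.0, 1.5, 1.0, 1.5))   # strongly overlapping
```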
<frame>
<split>
<que>
<list>
<e>Now let's add them together</e>
<e>but also add a multiplicative constant #c# to one of them</e>
<e>##<h>Eq(d,d_1+c*d_2)##</e>
<e>depending on #c#, the AUC of the sum changes</e>
</list>
</que>
<que>
<i f="adda"></i>
</que>
</split>
</frame>
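A quick numerical sketch of this addition (the distribution parameters are made up for illustration; the AUC is estimated from ranks):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
# two toy discriminants: background centred at 0, signal shifted to 1;
# the second discriminant is noisier than the first
d1_B, d1_S = rng.normal(0.0, 0.5, n), rng.normal(1.0, 0.5, n)
d2_B, d2_S = rng.normal(0.0, 0.75, n), rng.normal(1.0, 0.75, n)

def auc(scores_B, scores_S):
    # rank-based AUC: probability that a signal score exceeds a background score
    order = np.argsort(np.r_[scores_B, scores_S])
    ranks = np.empty(2 * n)
    ranks[order] = np.arange(2 * n)
    return (ranks[n:].sum() - n * (n - 1) / 2) / n**2

# the AUC of the sum d_1 + c*d_2 depends on c
for c in (0.0, 0.4, 1.0, 5.0):
    print(c, auc(d1_B + c * d2_B, d1_S + c * d2_S))
```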
<frame>
<split>
<que>
<list>
<e>There is an optimal value of #c#</e>
<e>and a value of #c# that is way too large can actually hurt your AUC</e>
<e>so assume #Eq(c,1)# (unweighted addition) is a #c# that is way too big for top tagging</e>
<e>so let's calculate the optimal #c# for a given distribution</e>
</list>
</que>
<que>
<i f="abc" wmode=True>AUC as a function of #c#</i>
</que>
</split>
</frame>
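Under the Gaussian model of the following slides (using the α, s_1, s_2 values quoted later for the numeric check), the optimal #c# can be located by a simple scan of the closed-form AUC; this is a sketch, not the actual numerical fit used in the talk:

```python
import numpy as np
from scipy.stats import norm

alpha, s1, s2 = 1.0, 0.5, 0.75   # values from the talk's numeric check

def auc(c):
    # combined discriminant d = d_1 + c*d_2 under the Gaussian model:
    # mu_S = 1 + alpha*c, sigma**2 = s1**2 + (alpha*c*s2)**2 (same for S and B)
    mu = 1 + alpha * c
    sigma2 = s1**2 + (alpha * c * s2)**2
    return norm.cdf(mu / np.sqrt(2 * sigma2))

cs = np.linspace(0.01, 2.0, 2000)
c_best = cs[np.argmax(auc(cs))]
print(c_best)                        # scan result
print(1 / (alpha * (s2 / s1)**2))    # analytic optimum derived below
```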
%show animation here
<frame>
##Eq(mu_1B,0),Eq(mu_2B,0),Eq(mu_1S,1),Eq(mu_2S,c*alpha)##
##Eq(sigma_iB,sigma_iS),Eq(sigma_1,s_1),Eq(sigma_2,alpha*c*s_2)##
##Eq(mu_B,0),Eq(mu_S,1+c*alpha),Eq(sigma,sqrt(sigma_1**2+sigma_2**2))##
fix the scale by demanding #Eq(mu_S,1)#; then maximal AUC means minimal #sigma# (or, equivalently, minimal #(sigma/s_1)**2#)
##Eq((sigma/s_1)**2,(1+(s_2/s_1)**2*alpha**2*c**2)/(1+alpha*c)**2)##
</frame>
<frame>
##Eq(d/dc * (sigma/s_1)**2,0)##
##Eq((1/(1+alpha*c)**3)*2*alpha*(c*alpha*(s_2/s_1)**2-1),0)##
##Eq(c,1/(alpha*(s_2/s_1)**2))##
##Eq(alpha,1.0),Eq(s_2,0.75),Eq(s_1,0.5)##
compare to numerics:
##Eq(c,0.4444),Eq(c_n,0.4436),Eq(sigma_c_n,0.0024)##
</frame>
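The extremum condition above can be checked symbolically; a minimal sketch with sympy (the symbols mirror the slide's notation):

```python
import sympy as sp

c, alpha, s1, s2 = sp.symbols('c alpha s_1 s_2', positive=True)
# (sigma/s_1)**2 after fixing the scale mu_S = 1
expr = (1 + (s2 / s1)**2 * alpha**2 * c**2) / (1 + alpha * c)**2
c_opt = sp.solve(sp.Eq(sp.diff(expr, c), 0), c)
print(c_opt)
# plug in the slide's numbers: alpha = 1, s_1 = 1/2, s_2 = 3/4
print(c_opt[0].subs({alpha: 1, s1: sp.Rational(1, 2), s2: sp.Rational(3, 4)}))
```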
<frame title="Why is that useful?">
##Eq(c,1/(alpha*(s_2/s_1)**2))##
but you can approximate
\begin{equation} \alpha \propto loss \end{equation}
\begin{equation} s \propto loss \end{equation}
so
\begin{equation} c \propto loss^{-3} \end{equation}
</frame>
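As a sketch of how this scaling rule could be applied per feature (the loss numbers are made up for illustration; `weights` plays the role of the per-feature #c#):

```python
import numpy as np

# hypothetical per-feature losses (illustrative values only)
losses = np.array([0.02, 0.10, 0.35])

# the slide's heuristic: c proportional to loss**-3, normalised to sum to 1
weights = losses**-3.0
weights /= weights.sum()
print(weights)   # features with small loss get the largest weight
```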
<frame>
<i f="superscale"></i>
</frame>
<frame>
%some tabular comparing the benefits/problems of this bodge
%atm some test que
<split>
<que>
Benefits
<list>
<e>easy to use</e>
<e>fast to train</e>
<e>quite good results</e>
</list>
</que>
<que>
Problems
<list>
<e>Probably not the best possible compression/rejection, since there is no interaction between particles</e>
<e Does not use the Graph to its full potential>
</list>
</que>
</split>
So maybe use weights in training to let the network focus more on the important features
</frame>
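A minimal sketch of such a weighted training loss (pure numpy; the weight values are hypothetical and would come e.g. from the #c \propto loss^{-3}# rule):

```python
import numpy as np

def weighted_feature_loss(pred, target, w):
    """Quadratic loss with per-feature weights w: a sketch of letting
    the network focus more on the important features."""
    return np.mean(w * (pred - target)**2)

pred   = np.array([[0.9, 0.4], [0.2, 0.8]])
target = np.array([[1.0, 0.0], [0.0, 1.0]])
w      = np.array([2.0, 0.5])   # hypothetical per-feature weights
print(weighted_feature_loss(pred, target, w))
```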
<frame>
<split>
<que>
<list>
<e>First goal: reach the same quality for a small network (8 nodes) in split and non-split training</e>
<e>here 8 nodes, 4 of them weighted with a factor</e>
<e>AUC as a function of this factor</e>
<e>apparently there is still something I don't understand here</e>
</list>
</que>
<que>
<i f="auwei"></i>
</que>
</split>
</frame>
<frame>
<split>
<que>
<list>
<e>First goal: reach the same quality for a small network (8 nodes) in split and non-split training</e>
<e>here 8 nodes, 4 of them weighted with a factor</e>
<e>AUC as a function of this factor</e>
<e>apparently there is still something I don't understand here</e>
</list>
</que>
<que>
<i f="auwei2"></i>
</que>
</split>
</frame>
<ignore>
<split>
<que>
<list>
<e></e>
<e></e>
<e></e>
</list>
</que>
<que>
</que>
</split>
<frame>
<split>
<que>
<list>
<e></e>
<e></e>
<e></e>
</list>
</que>
<que>
<i f="none"></i>
</que>
</split>
</frame>
<frame>
<list>
<e></e>
<e></e>
<e></e>
</list>
</frame>
</ignore>