1
00:00:00,000 --> 00:00:03,810
Anyway, okay, Hello, everyone, I hope
you're all doing well. Today I will

2
00:00:03,810 --> 00:00:09,570
present the performance of hadronic tau
reconstruction and identification at ATLAS

3
00:00:09,570 --> 00:00:15,960
and CMS. So first, just a quick motivation
on why we are interested in looking at

4
00:00:15,960 --> 00:00:22,170
taus at ATLAS and CMS. So one thing that's
interesting about the tau is that it's

5
00:00:22,170 --> 00:00:29,220
the heaviest lepton, and so it has the
largest Yukawa coupling to the Higgs

6
00:00:30,330 --> 00:00:36,240
among the leptons, but even though it's
only about 6% of the branching fraction of

7
00:00:36,270 --> 00:00:41,700
the Higgs, compared to about 60% for b
quarks, you can see from this

8
00:00:41,700 --> 00:00:47,340
measurement of the coupling, that the
resolution can be made much higher.

9
00:00:48,000 --> 00:00:53,250
One other interesting thing about taus
is that they also decay inside the

10
00:00:53,250 --> 00:00:58,380
detector, and then there will be one
neutrino, and this allows you to make CP

11
00:00:58,380 --> 00:01:04,110
measurements on the Higgs boson by looking at
angular distributions, which are sensitive

12
00:01:04,110 --> 00:01:09,420
to the spin correlation of this decay, but
also for several beyond the Standard Model

13
00:01:09,420 --> 00:01:14,970
scenarios, you might have some interesting
signatures with taus in the final states.

14
00:01:15,390 --> 00:01:19,320
And a lot of these are well motivated
for taus because they might have some type

15
00:01:19,350 --> 00:01:25,860
of enhanced coupling to the third
generation. And so before we start, also a

16
00:01:25,860 --> 00:01:30,540
quick review of the tau properties
themselves, so it has a mass of about 1.8

17
00:01:30,540 --> 00:01:37,170
GeV, and this means it can decay to lighter
hadrons like the pions. And this is

18
00:01:37,170 --> 00:01:42,060
indeed the dominant type of tau decays. So
if you look at this pie chart here, in

19
00:01:42,060 --> 00:01:48,480
about two thirds of the cases, you have at
least one hadron in the final state. And in this

20
00:01:48,480 --> 00:01:55,020
talk, I will call this hadronic decay
tau_h, or just tau. And then about one third of

21
00:01:55,020 --> 00:02:02,670
the time you have either one muon or one
electron from the tau decay. It also has a

22
00:02:02,670 --> 00:02:09,450
very short lifetime but not too short. So
it can still travel on average one

23
00:02:09,450 --> 00:02:16,350
millimeter inside the detector for a typical
energy of 20 GeV. And just to compare, for

24
00:02:16,350 --> 00:02:20,250
example, with b hadrons: b hadrons
have lifetimes which are about 10

25
00:02:20,250 --> 00:02:26,610
times longer, but thanks to this traveling
inside the detector, there will be often a

26
00:02:26,610 --> 00:02:34,050
secondary vertex which we can make use of
in our identification algorithms. Besides that,

27
00:02:34,050 --> 00:02:40,470
because it is very light at
these energies, they are collimated jets

28
00:02:40,470 --> 00:02:45,120
that are very isolated compared to,
for example, quark and gluon jets that

29
00:02:45,120 --> 00:02:50,790
are typically much wider and have a lot of
activity in this wider cone. So for

30
00:02:50,790 --> 00:02:55,170
taus, you typically have a different
number of charged pions, typically one

31
00:02:55,170 --> 00:03:00,720
or three. And then you can also sometimes
have an extra pi0, which promptly decays

32
00:03:00,720 --> 00:03:06,000
to two photons. So on this slide I just
wanted to make a very rough sketch of how

33
00:03:06,000 --> 00:03:12,270
the hadronic tau decays are detected at
ATLAS and CMS. So, typically, we start from

34
00:03:12,330 --> 00:03:18,570
an AK4 jet, which is used as a seed for the tau
algorithm, where we reconstruct different

35
00:03:18,600 --> 00:03:24,660
types of decay modes. And so typically,
these are counting different number of

36
00:03:24,660 --> 00:03:30,000
charged tracks that have some energy
deposits in the HCAL. And these are assumed to

37
00:03:30,000 --> 00:03:36,090
be charged pions. But sometimes there can
also be pi0s, which quickly decay to

38
00:03:36,120 --> 00:03:42,420
photons that then leave some energy deposit
in the ECAL. And then on top of this, you also

39
00:03:42,420 --> 00:03:46,920
have identification, which often are some
type of MVA techniques that are used to

40
00:03:46,920 --> 00:03:51,810
reject mainly jets, but also electrons and
muons that can fake or can be

41
00:03:51,810 --> 00:03:56,400
misidentified as taus very efficiently
sometimes, and these typically use

42
00:03:56,730 --> 00:04:04,650
lifetime and isolation variables to
discriminate against jets. So before I go

43
00:04:04,650 --> 00:04:09,990
into the details, I just want to give
a very broad overview of the algorithms

44
00:04:09,990 --> 00:04:14,610
that are available. So at CMS on the left,
we have just one algorithm for

45
00:04:14,700 --> 00:04:18,840
reconstruction which is called HPS.
Whereas at ATLAS there are two: one is

46
00:04:18,870 --> 00:04:24,120
called the baseline algorithm which was
developed during Run 1, and then during

47
00:04:24,120 --> 00:04:29,610
Run 2 they are now also using the Tau
Particle Flow. And then on top of these,

48
00:04:30,090 --> 00:04:34,290
we have classically BDTs that are
used to discriminate against jets, but

49
00:04:34,410 --> 00:04:40,350
recently there also have been new neural
networks that improve the performance

50
00:04:40,350 --> 00:04:48,510
of identification. So now I will first go
into more details for CMS with the HPS

51
00:04:48,510 --> 00:04:50,280
algorithm and the DNN.

52
00:04:52,110 --> 00:04:58,980
So HPS stands for hadrons plus strips, and as I
mentioned a few slides back, we start from an

53
00:04:58,980 --> 00:05:06,330
AK4 PF jet as a seed. So in CMS these
jets are already made from Particle Flow

54
00:05:06,330 --> 00:05:15,420
objects, which are already hadron,
electron, or photon candidates. And

55
00:05:15,420 --> 00:05:20,370
then you look inside this cone and you
look for the track with the highest

56
00:05:20,400 --> 00:05:26,010
momentum. And around this track you make
a signal cone, which has a size of 0.2 or

57
00:05:26,010 --> 00:05:31,590
smaller and then also an isolation cone of
about 0.4 which is typically used to

58
00:05:31,590 --> 00:05:40,140
define some type of isolation. Then, at
CMS, the decay mode of the tau is assigned

59
00:05:40,140 --> 00:05:45,780
by simply counting the number of charged
tracks inside this signal cone. And then

60
00:05:45,930 --> 00:05:52,080
to count the number of pi0s,
we look at the clusters in the ECAL. And

61
00:05:52,080 --> 00:05:55,290
what's particular about this algorithm is
that you have these green boxes here which

62
00:05:55,470 --> 00:06:01,320
are called strips, where you iteratively
merge together electrons and photons.

63
00:06:01,320 --> 00:06:06,930
The main reason to do this is
because photons have a high probability of

64
00:06:06,930 --> 00:06:11,400
converting into electrons, but it also
helps reduce some of the pileup.

65
00:06:12,870 --> 00:06:14,520
Then once you have your tau

66
00:06:15,900 --> 00:06:21,840
candidates, you apply some type of
identification. So the newest thing is

67
00:06:22,110 --> 00:06:27,060
this thing called DeepTau. And this is a
convolutional deep neural network, and so

68
00:06:27,060 --> 00:06:32,250
it uses, just like the older BDTs,
high-level information, like the lifetime,

69
00:06:32,610 --> 00:06:36,240
isolation variables and variables related
to the kinematics of electrons and

70
00:06:36,240 --> 00:06:44,640
photons. But what's new is that they also
use these convolutional layers with, as an

71
00:06:44,640 --> 00:06:49,440
input, the Particle Flow
hadrons, muons, electrons, or photons that

72
00:06:49,440 --> 00:06:55,620
fall into this grid of
cells in eta-phi. And then, as an

73
00:06:55,620 --> 00:07:01,080
output, it gives you the probability
that your tau candidate is a real tau,

74
00:07:01,110 --> 00:07:06,600
a muon, an electron, or a jet. And so
basically this is a multiclass classifier. On

75
00:07:06,600 --> 00:07:11,580
the bottom here I show some examples
of the performance in terms of

76
00:07:11,580 --> 00:07:17,520
misidentification versus efficiency. So,
in blue is the new deep neural network and

77
00:07:17,520 --> 00:07:22,650
you have to compare this to the green ones,
which are the previous identification

78
00:07:22,650 --> 00:07:27,750
algorithms. And in particular, the most
important is the anti-jet

79
00:07:27,810 --> 00:07:34,860
discrimination here, where for a very
typical efficiency of about 60%, you have a

80
00:07:34,860 --> 00:07:40,620
misidentification probability of 0.6%, or
inversely, this is a rejection factor of

81
00:07:40,620 --> 00:07:49,380
about 170. Okay, so this was CMS. Now I
will talk a bit more about ATLAS, starting

82
00:07:49,410 --> 00:07:56,640
with the baseline algorithm and the RNN.
So for ATLAS, there was this baseline

83
00:07:56,640 --> 00:08:02,910
reconstruction developed during Run 1.
Since ATLAS doesn't have Particle Flow, it

84
00:08:02,910 --> 00:08:09,660
instead has AK4 jets that are
made out of clusters. And just like CMS,

85
00:08:09,990 --> 00:08:16,170
there is a signal cone defined with a fixed
size of 0.2, and then an isolation cone of 0.4.

86
00:08:16,770 --> 00:08:23,370
And then, looking inside the signal cone,
there is a set of BDTs which is

87
00:08:23,370 --> 00:08:29,010
used to classify the tracks. And then
based on the output from this, you can

88
00:08:29,010 --> 00:08:33,060
assign the decay modes to either one or
three prongs. So, prong here means just

89
00:08:33,060 --> 00:08:38,280
a charged track. And one thing that's
important to notice here is that for

90
00:08:38,280 --> 00:08:43,800
baseline reconstruction, the momentum is
simply the sum of the energy clusters

91
00:08:43,800 --> 00:08:50,400
here. And so this is inclusive in pi0s,
which are not separately identified as, for

92
00:08:50,400 --> 00:08:55,530
example, photons or electrons. Then on top
of the baseline reconstruction algorithm,

93
00:08:55,530 --> 00:09:00,210
there is the neural network, which is a
recurrent neural network that's used to

94
00:09:00,210 --> 00:09:06,630
discriminate against jets. And this uses,
just like CMS, high-level variables, like

95
00:09:06,630 --> 00:09:11,640
variables related to the tau lifetime
and isolation, but also different variables

96
00:09:11,640 --> 00:09:15,930
related to the calorimeter. And what's
special about this neural network is that it

97
00:09:15,930 --> 00:09:21,090
has several recurrent layers that use as
input this low level information about the

98
00:09:21,090 --> 00:09:26,790
different tracks and clusters. On the
bottom here I have on the left one plot

99
00:09:26,820 --> 00:09:31,050
that shows the efficiency just after
baseline reconstruction, and you can see

100
00:09:31,050 --> 00:09:39,690
that it's around 74 for this algorithm,
and they compare here the one-prong

101
00:09:39,690 --> 00:09:47,250
with the three-prong. Then on the right,
there is the performance plots of the

102
00:09:47,280 --> 00:09:53,790
neural network: the rejection of jets
versus the efficiency for taus. And so here

103
00:09:53,790 --> 00:10:02,850
again, for a very typical working point of
60% efficiency, you have About a 1%

104
00:10:03,180 --> 00:10:10,230
misidentification probability, which is on
the plot here, I think, about 80, a rejection

105
00:10:10,230 --> 00:10:17,340
of a factor of 80. Okay. So this was
the first reconstruction

106
00:10:17,340 --> 00:10:21,240
algorithm. The second one is called the
Tau Particle Flow, and this one was

107
00:10:21,240 --> 00:10:26,760
developed to improve the momentum
resolution of the tau. This time instead

108
00:10:26,760 --> 00:10:32,850
of using an AK4 jet as a seed, it uses
the baseline tau candidates. And it looks

109
00:10:32,850 --> 00:10:40,410
inside to identify and reconstruct the
individual particles. So, in case of the

110
00:10:40,410 --> 00:10:45,840
charged pions, you simply identify or you
associate the different tracks which are

111
00:10:45,840 --> 00:10:50,550
in the cartoon, the red dots here with
different energy clusters in blue. And

112
00:10:50,550 --> 00:10:54,090
then all the remaining clusters that don't
have a charged track associated with them

113
00:10:54,330 --> 00:10:59,490
are assumed to be coming from pi0s, and
there's a dedicated BDT that is used to

114
00:10:59,490 --> 00:11:03,900
identify these clusters and give a
probability that they are indeed pi0s.

115
00:11:06,270 --> 00:11:12,330
Then, once you have these different
constituents, you have an additional set

116
00:11:12,330 --> 00:11:18,030
of BDTs that assign the right decay
modes using these as an input. And this

117
00:11:18,030 --> 00:11:23,040
indeed improves the momentum
resolution, mainly below a pT of 100 GeV.

118
00:11:23,400 --> 00:11:29,640
So on the bottom left, I have one example
for the angular distribution

119
00:11:29,730 --> 00:11:35,520
in terms of phi. So here you have the
resolution for tau Particle Flow, which is

120
00:11:35,520 --> 00:11:40,950
black, and you see it's much
narrower than the red, which is the

121
00:11:40,950 --> 00:11:46,770
baseline reconstruction, but also in terms
of the transverse energy on the bottom in

122
00:11:46,770 --> 00:11:52,350
the middle, if you just focus on the
bottom few lines, which are the core of

123
00:11:52,350 --> 00:11:58,740
this resolution distribution, where you
have about 68% of the taus. If you

124
00:11:58,740 --> 00:12:02,550
compare the black again, which is the Tau
Particle Flow, to the red baseline, you can

125
00:12:02,550 --> 00:12:07,920
see that you can improve by quite a lot,
the momentum resolution. And on the right

126
00:12:07,920 --> 00:12:14,970
here you see efficiency of assigning
correctly, the different true decay modes

127
00:12:14,970 --> 00:12:18,420
to the reconstructed decay modes and so
you see that you have a very nice

128
00:12:18,420 --> 00:12:21,990
diagonal, with a strong correlation between
the different decay modes.

129
00:12:23,610 --> 00:12:25,860
You have a bit less than three minutes.

130
00:12:26,460 --> 00:12:34,440
Okay, good. Thanks. So, for the detector
description, we typically make an

131
00:12:34,440 --> 00:12:39,210
efficiency measurement using taus,
and so here you have a Z boson decaying

132
00:12:39,210 --> 00:12:44,850
to two taus: one tau goes to a muon, one
goes to hadrons. And here the muon is used

133
00:12:44,850 --> 00:12:50,070
as a well-measured tag, and then the other,
the hadronically decaying tau, is your

134
00:12:50,070 --> 00:12:54,420
probe. And then you use some type of
maximum likelihood fit to some

135
00:12:54,420 --> 00:12:59,970
discriminating observable to extract
the scale factors. So,

136
00:13:00,000 --> 00:13:08,370
in the case of ATLAS, they use the number
of tracks inside a cone of 0.6. And so

137
00:13:08,400 --> 00:13:12,090
there are no plots for the newer neural
network. So I'm showing you the ones for

138
00:13:12,090 --> 00:13:18,930
the BDT. But I can say that for the neural
network, the scale factor is close to one

139
00:13:18,960 --> 00:13:23,700
with about 3% uncertainty. In the case of CMS,
we use the invariant mass of the muon and tau,

140
00:13:23,700 --> 00:13:28,530
so this is called the visible mass. And
again, I only have the BDT plots, but the

141
00:13:28,530 --> 00:13:34,290
scale factor for the neural network is also
close to one, between 0.9 and one, with about 6%

142
00:13:34,290 --> 00:13:40,260
uncertainty which shows that you have a
very good description of the detector. So

143
00:13:41,010 --> 00:13:44,820
this was the reconstruction and
identification. One extra thing I want to

144
00:13:44,820 --> 00:13:50,520
go into is the energy calibration. So for
ATLAS there's a dedicated energy

145
00:13:50,520 --> 00:13:58,200
calibration using a boosted regression
tree, and they use interpolation between

146
00:13:58,200 --> 00:14:03,690
the calorimeter-based and Tau Particle
Flow-based pT, and then they have as target

147
00:14:03,690 --> 00:14:08,640
the true pT, and then train this
boosted regression tree for this. The

148
00:14:09,060 --> 00:14:13,410
resolution is about 6% on the
momentum, and the energy scale in

149
00:14:13,410 --> 00:14:20,460
Monte Carlo is typically 1 to 3%. For
CMS, we already are using Particle Flow

150
00:14:20,460 --> 00:14:24,750
constituents which are already well
calibrated so there's not much extra

151
00:14:24,750 --> 00:14:29,640
calibration needed. And again, the pT is
well modeled with about 10% resolution.

152
00:14:31,350 --> 00:14:36,030
And just like ATLAS, the energy scale in
Monte Carlo is very close to one, within about 3%.

153
00:14:37,140 --> 00:14:42,000
And then just to show off in the last two
slides, the performance of these different

154
00:14:42,000 --> 00:14:46,920
algorithms, I have some plots. So in this
slide, we have the tau mass. And what's really

155
00:14:46,920 --> 00:14:51,720
cool about these slides, sorry about these
plots is that you can see for the

156
00:14:51,720 --> 00:14:57,090
the different components of
Drell-Yan decaying to taus, and you see

157
00:14:57,090 --> 00:15:03,030
the decay modes of these taus. So,
for example, you have the pi0 mass

158
00:15:03,030 --> 00:15:08,610
here, and intermediate resonances like
the rho or the a1 meson. And then on the

159
00:15:08,610 --> 00:15:15,810
last slide you have the visible
mass of the Z boson. And

160
00:15:15,810 --> 00:15:22,470
just to give you an idea, you have a
purity of about 66% at the peak

161
00:15:22,470 --> 00:15:27,930
here for this plot for the DeepTau
algorithm. So, hopefully I could convince

162
00:15:27,930 --> 00:15:31,410
you that taus are interesting for
looking for new physics, but also for doing

163
00:15:31,410 --> 00:15:36,630
Higgs measurements. Each detector has
its own unique approach. So CMS mainly

164
00:15:36,630 --> 00:15:42,570
exploits the fact that they have Particle
Flow, which already provides

165
00:15:42,570 --> 00:15:47,370
well-calibrated constituents, whereas ATLAS
starts from a baseline which is more

166
00:15:47,370 --> 00:15:52,620
calorimeter-based, but has its Particle
flow developed to exploit tracking for

167
00:15:52,620 --> 00:15:57,540
improved resolution and I also showed that
we have an excellent understanding of the

168
00:15:57,540 --> 00:16:04,170
detector, a good description. And I also showed
that recent developments of neural networks have

169
00:16:04,230 --> 00:16:09,660
been able to improve the jet rejection,
with about 1% misidentification for 60%

170
00:16:09,690 --> 00:16:12,630
efficiency. So thank you for your
attention.

171
00:16:13,349 --> 00:16:18,389
Thanks a lot, Zack. Do we have any
questions from the audience?

172
00:16:25,560 --> 00:16:32,370
I don't see raised hands. Yes, there is
one. So bedtime, I should be able to talk

173
00:16:32,370 --> 00:16:32,790
now.

174
00:16:35,010 --> 00:16:36,180
Hi, can you hear me?

175
00:16:36,240 --> 00:16:42,000
Yes. Okay. Yeah. Thanks a lot for this
nice talk. I had one question about the

176
00:16:42,030 --> 00:16:48,540
energy calibration. At high pT, don't you
need anything special in your calibration

177
00:16:48,570 --> 00:16:57,030
in CMS? Or do you just rely, out of the box, on
the Particle Flow constituents? Because, I'm

178
00:16:57,030 --> 00:17:03,660
asking, in ATLAS what we get from the
Particle Flow reconstruction is very good

179
00:17:03,660 --> 00:17:10,500
at low pT but degrades at high pT because
the track momentum is less accurate.

180
00:17:12,540 --> 00:17:18,360
Yeah, so for CMS, we don't have a
dedicated energy calibration at high

181
00:17:18,360 --> 00:17:21,810
pT. So we indeed still use the Particle
Flow constituents. And indeed, you see

182
00:17:21,810 --> 00:17:26,700
that the resolution goes down, I believe,
at higher pT. But I don't have

183
00:17:26,700 --> 00:17:27,570
a plot for this.

184
00:17:28,470 --> 00:17:29,340
Okay, thank you.

185
00:17:33,450 --> 00:17:34,950
Do we have more questions?

186
00:17:37,200 --> 00:17:43,260
Yes. Can I ask a question? Everything? Go
ahead. Yes, thank you. So what I would

187
00:17:43,260 --> 00:17:45,390
like to know is how much

188
00:17:46,830 --> 00:17:51,390
the reconstruction relies on a good
calibration in CMS

189
00:17:54,450 --> 00:17:58,170
which means that you need to be
reconstructed

190
00:18:00,000 --> 00:18:06,840
calibrated objects with the refined
calibration or calibration you get from

191
00:18:06,840 --> 00:18:10,740
the data Yeah, so

192
00:18:10,770 --> 00:18:12,570
the hazards performance.

193
00:18:13,139 --> 00:18:19,049
So, you could already in
principle use the prompt data but

194
00:18:19,169 --> 00:18:24,509
typically, once we really make
measurements for physics analyses as

195
00:18:24,509 --> 00:18:31,049
the tau group, we typically do this
based on re-reconstructed data, so already

196
00:18:31,049 --> 00:18:35,009
you have better calibration in particular
for the tracker, which is very important to get

197
00:18:35,009 --> 00:18:42,539
a good resolution on, or description of,
the lifetime. So typically we use more the

198
00:18:42,539 --> 00:18:47,789
re-reconstructed data, but it already
performs pretty well on prompt, I would

199
00:18:47,789 --> 00:18:48,059
say.

200
00:18:52,590 --> 00:18:56,850
I was also curious about the e/gamma
part

201
00:18:58,260 --> 00:19:00,300
for CMS, you Yes.

202
00:19:02,639 --> 00:19:03,539
That's a good question

203
00:19:05,220 --> 00:19:09,960
that I'm not sure about. But I
believe it's similar. So I think we

204
00:19:09,960 --> 00:19:14,250
probably already have a very good
calibration in prompt. But I'm not an

205
00:19:14,250 --> 00:19:18,750
expert on that. So. Okay. Thank you.

206
00:19:19,590 --> 00:19:20,610
Thank you for the question.

207
00:19:21,419 --> 00:19:28,349
Thanks a lot. I have myself a question.
So, on slide seven, you show that the DNN

208
00:19:28,799 --> 00:19:33,719
performs much better than the BDT. Do you
have any idea why it is so?

209
00:19:34,679 --> 00:19:40,709
Yeah, so like I said, this deep
network is a convolutional deep neural

210
00:19:40,709 --> 00:19:45,539
network. So basically it looks at
really the substructure of the different

211
00:19:45,869 --> 00:19:51,569
constituents inside this cone. So
it doesn't just look at the lifetime

212
00:19:51,599 --> 00:19:57,599
isolation, but also the relation between
all these different tracks and I mean,

213
00:19:57,599 --> 00:20:01,079
hadrons and muons and electrons that
fall inside this

214
00:20:03,780 --> 00:20:04,830
cone, so

215
00:20:06,750 --> 00:20:11,220
so it kind of learns the
patterns of this image. And so this really

216
00:20:11,220 --> 00:20:15,360
helps; you get about a factor of two
increase in the performance here.

217
00:20:16,320 --> 00:20:21,780
And for the BDT, did you use a
specific library?

218
00:20:24,360 --> 00:20:34,350
Uh, I guess it was simply the TMVA
library in ROOT. Okay. Yeah. Okay.

219
00:20:36,840 --> 00:20:37,380
Thank you.

220
00:20:40,500 --> 00:20:54,120
Do we have someone else? Let me check the
attendees. No. From the panelist list?

221
00:20:58,680 --> 00:21:01,980
No, it seems not. So thanks Tor display