Thanks very much, everyone, for attending. It's my pleasure to give this talk about triggering and online calibration with machine learning techniques, covering material from the four collaborations and also from some community initiatives on these topics.

I wanted to start with a bit of context, because whenever we discuss machine learning, including in these contexts, it's important to remember that it has been with us in some form since the beginning; it has just taken different forms. I stole this picture to illustrate it. The point of the picture is that you're doing the same thing fundamentally (which is banging your computer), but you're just banging in different ways.

Before Run 1, it's not that the community didn't know about machine learning, but we were preoccupied with the question of whether we could trust our detectors in the first place. Then Run 1 happened and things worked, and I think it's fair to say the detectors worked better than we could have hoped for. We started increasingly adopting at least BDTs, as a kind of minimal machine learning approach, and then increasingly fancier things, first in offline analyses but also in real-time classification.

Then in Run 2, as we got more and more confident, there was increasing use of aligning and calibrating our detectors in real time, and real-time analysis (whether it's called that, as in LHCb, or scouting in CMS, or trigger-level analysis in ATLAS) was increasingly becoming a thing. So in addition to classification, we started really deploying ML to assist with feature building, in generators, and for calibration.

Now, going towards Run 3, we are really starting to consider ML methods and how they interleave with classical methods across the full list of our use cases, whether real-time or offline.

So that's a little bit of the context. Now I'd like to go through the use cases, starting with classification, because that's the one we deployed earliest in the trigger, or real-time, context.
So this is a slide from LHCb. On the right-hand side you can see a plot from when we optimized our main BDT-based trigger for Run 2: the different signal modes we tested with, and the efficiency of the different types of BDTs and classifiers that we tried out.

On the left is a plot that is about nine years old now, from the very first time we deployed that same BDT. The version that was reoptimized in a fairly sophisticated way for Run 2 was first deployed, in a much cruder way, already in 2011. You see in grey the response on minimum bias data and in red the response on bbbar Monte Carlo (2010 Monte Carlo). The modelling of the high-response region is actually very nice: this classifier was selecting proton-proton collisions which had bbbar pairs in them, and you can see how well the simulation follows the data. So within the LHCb trigger, the main inclusive trigger has been based on BDTs since almost the beginning, and different classifiers have been used widely throughout; there's a lot of documentation on that.

One thing we've put quite some work into over the years is finding ways to flatten the efficiency curve. Here you see the efficiency of the BDT for a three-body meson decay as a function of position in the Dalitz plane. The left plot is what you get if you don't explicitly train your classifier to be flat; the right plot is what you get if you do explicitly train it to keep the efficiency flat, which is done to minimize systematics. That gives you an idea of the kinds of things we worry about when doing this training.

This actually connects to one lesson for me from the last decade when it comes to classifiers. Before Run 1 we were asking, about ML black boxes: do we know what they're doing, will they bias us? But in fact, well-designed ML classifiers turned out to be less biasing than, quote unquote, simple cuts in many cases.
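Returning to the flattened-efficiency training just mentioned: as a concrete illustration, here is a minimal sketch of that kind of uniform-efficiency boosting, assuming the documented interface of the hep_ml package; the feature names, Dalitz-plane coordinates, and toy data are hypothetical placeholders, not the LHCb setup.

```python
import numpy as np
import pandas as pd
from hep_ml.gradientboosting import UGradientBoostingClassifier
from hep_ml.losses import BinFlatnessLossFunction

rng = np.random.default_rng(0)
n = 5000
# Toy sample: discriminating features plus the Dalitz-plane coordinates.
data = pd.DataFrame({
    "pt":       rng.exponential(2.0, n),
    "ip_chi2":  rng.exponential(5.0, n),
    "vtx_chi2": rng.exponential(3.0, n),
    "m12":      rng.uniform(0.0, 3.0, n),   # hypothetical Dalitz coordinates
    "m13":      rng.uniform(0.0, 3.0, n),
})
label = (rng.random(n) < 0.5).astype(int)

features = ["pt", "ip_chi2", "vtx_chi2"]

# Penalize non-uniformity of the signal (label 1) efficiency in bins of
# the Dalitz coordinates, on top of the usual classification loss.
loss = BinFlatnessLossFunction(uniform_features=["m12", "m13"], uniform_label=1)

clf = UGradientBoostingClassifier(loss=loss, n_estimators=100,
                                  train_features=features)
clf.fit(data, label)                     # frame must also carry m12, m13
scores = clf.predict_proba(data)[:, 1]   # flat-efficiency BDT response
```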
In a similar vein, ML has been used in LHCb for fake, or ghost, track classification, also since quite early on. Here you see the ROC curve on the left, and on the right a plot with real data, where blue is what was selected as real tracks and red is what was rejected as fakes. We actually gained quite a lot over a simple track chi-squared by training a neural network to really use the information from all the different tracking subdetectors. And this, as with all of these things, was tuned by hand: the implementation was tuned for execution speed. There are many other classifiers, for particle identification and things like that, that LHCb included in the trigger.
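For illustration, a ghost-track classifier of this sort might look like the following minimal sketch: a small feed-forward network combining quality information from the different tracking subdetectors. The feature count, architecture, and placeholder data are assumptions, not the LHCb implementation.

```python
import numpy as np
import tensorflow as tf

n_features = 6  # e.g. hits per tracking subdetector plus fit chi2/ndf values

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(track is real)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])

# Placeholder training data: in practice the labels come from simulation,
# where ghost tracks are known by construction.
X = np.random.rand(10000, n_features).astype("float32")
y = (np.random.rand(10000) > 0.5).astype("float32")
model.fit(X, y, epochs=5, batch_size=512, validation_split=0.2)
```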
Let me move on from LHCb, not to be too parochial, to classification at ATLAS. There are three examples here: on the left you see the algorithms used in the HLT to identify taus for the tau triggers, in the middle the neural-network-based electron ID, and on the right the online b-jet tagging, where you can see the performance of the different online operating points with respect to the offline b-tagging. Indeed, ATLAS uses different kinds of classifiers in parts of its online processing chain, with, I think it's fair to say, a gradual evolution from BDTs to neural networks and more complex things over time, and with large rate reductions at the same efficiency working point. And, as we also see in LHCb, it's generally found that you can reduce your trigger latency by reducing backgrounds earlier in the processing chain.

For CMS, I picked out two examples. One is the L1 muon endcap trigger, on the left of this plot, where you see the physics performance of what is called the upgraded trigger on the right: it's a bit better in the turn-on region than the previous incarnation. But on the far left, you see that the trigger rate is really substantially lower for broadly the same physics performance with this BDT-based endcap trigger. On the right you can see the performance of the neural-network-based DeepCSV b-jet tagging algorithm. If you look between the solid and the dashed lines of each colour, you see again the reduction in light-flavour jet efficiency for a given b-jet efficiency working point.

By the way, I find it personally interesting that all of us seem to have converged on using ML of one kind or another to tag beauty hadrons or beauty jets. That's quite nice.

The other thing I wanted to highlight here is some of the approaches towards the future for classification. One very exciting thing at the moment is deploying classifiers on FPGAs. I tried to convey that since the beginning we've worked on optimizing these classifiers and their implementations for speed, and a logical endpoint of that is to see whether you can compress your models to the extent that they fit onto FPGAs, which would allow you to use them in first-level triggers or in coprocessor-based architectures. This is, I think, extremely interesting for the future, and there's a lot of work going on; you see here a plot of the performance of these implementations in the HLS4ML framework.

The other thing I find interesting is unsupervised trigger classification. This is the question we often get: what if your trigger misses something you didn't predict? So there are ongoing approaches to build generalized anomaly-detection-based triggers. You can see from these plots that they're actually robust against having SM backgrounds in the training sample, so you can train on data. Of course, you don't get the same performance as fully supervised training, but I still think it's an interesting area being pursued.
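As a minimal sketch of that anomaly-detection idea, one common variant is an autoencoder trained directly on (mostly SM) data, flagging events it reconstructs poorly; the feature count, architecture, and threshold below are illustrative assumptions, not any experiment's actual trigger.

```python
import numpy as np
import tensorflow as tf

n_features = 20  # e.g. coarse event-level quantities available at trigger level

# Autoencoder trained to reproduce ordinary events; a compressed model of
# this kind is also the sort of thing one might port to an FPGA (e.g. via
# hls4ml) for use at the first trigger level.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(14, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(6, activation="relu"),    # bottleneck
    tf.keras.layers.Dense(14, activation="relu"),
    tf.keras.layers.Dense(n_features),
])
model.compile(optimizer="adam", loss="mse")

X = np.random.rand(50000, n_features).astype("float32")  # stand-in for data
model.fit(X, X, epochs=10, batch_size=1024)   # unsupervised: target = input

# Trigger decision: accept events whose reconstruction error is large.
errors = np.mean((model.predict(X) - X) ** 2, axis=1)
threshold = np.quantile(errors, 0.999)        # sets the accept rate (here 0.1%)
accept = errors > threshold
```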
Let me move on to two other things, as I'm running a bit behind time. As far as reconstruction goes, I think the main message is that, for the moment, reconstruction is still primarily classical, with machine learning assistance, and the community is trying to see how far the assistance part can be pushed. At LHCb the first steps were to add classifiers inside the classic Hough-transform-based pattern recognition, for earlier rejection of bad hit combinations. But you're not making trajectories from hits in the neural network itself; you just classify.

In the community more broadly, there are ongoing efforts. There was the TrackML challenge, where many participants competed on Kaggle to come up with ML-based tracking algorithms. Interestingly, the best approaches mixed classical tracking, and some physical model of the track path, with machine learning aspects, again to reject fake combinatorics as early as possible, along with approaches that train to find the optimal search parameters. And then there are even more ambitious things, like the Exa.TrkX project, which is trying to use neural networks for completely ML-based tracking. That, I think, is very interesting.

In a similar vein, I just want to show some results for a neural-network-based primary vertex reconstruction. You see here a picture of the primary vertex distribution in z in LHCb: the tracks obviously cluster at various z values, and you can quite intuitively use a neural network to try to find these peaks. But again, you have a classical algorithm to find the tracks first. So it's still a hybrid approach, but that's the direction of travel.

Now let's go towards calibration, with a couple of examples. In ALICE, there is a calibration of the TPC which you wish to achieve: you want to correct for distortions in the position of TPC clusters already online, but the analytical correction procedure is far too slow for this. So what's being explored is to use convolutional neural networks to correct for this, using the space charge, or currents, as the input. You see here on the plot that the resolution of the predicted distortion is about where you would expect from the intrinsic resolution, but it is much faster than analytical methods. There are a lot of details in the backup slides about this.
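As a rough sketch of that kind of approach, here is a convolutional network regressing distortions from a voxelized space-charge map; the grid shape and architecture are illustrative assumptions, not the actual ALICE network.

```python
import tensorflow as tf

# Voxelized (r, phi, z) map of the measured space-charge density; the
# grid size here is purely illustrative.
inputs = tf.keras.Input(shape=(17, 90, 17, 1))
x = tf.keras.layers.Conv3D(16, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Conv3D(32, 3, padding="same", activation="relu")(x)
# Three output channels: the predicted distortion components per voxel.
outputs = tf.keras.layers.Conv3D(3, 1, padding="same")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
# model.fit(charge_maps, distortion_maps, ...)  # targets from the slow
#                                               # analytical correction
```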
I also wanted to mention that in CMS there has been some very interesting work to calibrate jet energies with DNNs. Very quickly, because I am running out of time: on the right you can see, visually, the clear improvement in resolution between the DNN and the baseline approach.
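For illustration, a minimal sketch of DNN-based jet energy regression might look like this; the inputs, architecture, and target definition are hypothetical, not the actual CMS model.

```python
import tensorflow as tf

n_features = 10  # e.g. raw jet pT, eta, energy fractions, constituent counts

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),   # predicted response, e.g. gen pT / reco pT
])
# A Huber loss keeps the regression robust against response tails.
model.compile(optimizer="adam", loss=tf.keras.losses.Huber())
# model.fit(jet_features, target_response, ...)
```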
So let me conclude with a few personal remarks. Broadly speaking, over the last ten years the direction of travel has been an increasing deployment of ML-based methods throughout online processing and calibration. We know by now that this can outperform classical methods in physics terms, certainly in classification, with increasing hints of that in reconstruction too. But what I think is going to be crucial is to continue to accumulate our experience of deploying ML across different architectures, especially if computing keeps becoming more heterogeneous, as it is in the real world outside of HEP, and to understand the computational efficiency and where we should mix classical and ML approaches to make things faster. Both the algorithms and the structures used to pass data around are critical to this.

Thank you, Vava, for the talk and for sticking to time. I'll open the floor for questions; please raise your hand in the Zoom interface if you'd like to ask one. So far I don't see any raised hands... yes, Igor, you should be able to unmute yourself.

Can you hear me?

Yes.

Okay, this is Igor Volobuev with Texas Tech University. There appears to be a major effort, at least in the statistical community, on the interpretability of various machine learning techniques. I wonder if this kind of effort could be translated to high energy physics.

When you say interpretability, just to understand the question, what specifically are you thinking of?

Interpretability means that humans should not be viewing the machine learning techniques as a black box, right? So some understanding of what the technique is doing should be transferable and quantifiable.

Right. So, to talk in concrete terms about quantifying it: certainly in LHCb, but I think this is true across the experiments, there's a lot of work that has already gone into, and goes into, quantifying a classifier's performance as a function of all the different potential variables of interest, especially when we deploy these things in real time, because if you make a mistake there, you can't recover very easily. If you deploy a classifier in LHCb, you would typically check the efficiency turn-on curve not only in pT but also in the lifetime of the particle. I showed the example of flattening the efficiency across the Dalitz plane, which is a good example of something where we care about controlling the performance of the classifier, not only in terms of some absolute efficiency but explicitly in the shape of that efficiency, because it helps you minimize the systematics associated with simulating this efficiency later on.

And certainly, as part of that — again, I can speak directly for LHCb, but I'm sure it's true for others as well — since the beginning we've always pruned our classifiers. You basically start with some set of features, and then we fairly aggressively reduce that to some minimal set that is actually relevant. That's also part of understanding what the classifier does, right? You figure out which features it's using more, and so on. But this work is being done, if you want, in a physicist's way; I don't think it's formalized in a language that the statistical community would necessarily recognize, generally speaking. So it may be interesting to understand how much of this we are actually doing while just talking about it in a different way.

I see Steven. Steven, you should be able to unmute yourself.

Yeah, hi, this is Steven Schramm. Very nice talk, Vava, thank you very much. I was wondering about combining the two aspects, the triggering and the calibration. You spoke a lot about how we are using classification with machine learning in the trigger. I'm wondering, because I haven't seen too much of it yet: have you seen much in the form of regression, or other forms of calibration, being applied in the trigger as well?
No — certainly not in LHCb. Our focus has been on automating the calibration workflows so that they can be performed in real time, but the workflows themselves, by and large, are still very much classical. This may be something where, in the future, particularly for calorimeter-related things, there are some obvious points to tackle. For tracking, or for something like alignment, it's less obvious to me how you would do it.

Yeah, I guess I was thinking of, for example, some object pT cut or something like that, where you lose a lot of rate from the difference between the trigger and offline selections. I was wondering if you know of any studies that try to basically turn that into a step function by having something which really converges to the offline result. I haven't seen any studies, but I was just curious if you had.

No, I haven't. LHCb focuses on making sure that, if you want, the features that go in are the same: we focused on getting the reconstruction, the underlying inputs, to be identical between online and offline. But it's true that what we have seen over time is that no matter how much you make the underlying inputs the same, even things much more primitive than the machine learning, like packing and unpacking the data when you ship it around, end up affecting a downstream classifier. Because it's so multi-dimensional, tiny changes from even trivial things like that in your input features can give you non-negligible tails in the output, for no particularly good reason. So we've tried our best to suppress stuff like that. But it's an interesting point, something we should look at more systematically.

Okay, thanks a lot.

Maybe I have a related question: would online training be able to help with this kind of slow variation in the classification performance, or whatever performance, of the models?

So, I think where online training could be interesting is actually in helping you if you expect your detector performance to change significantly over time.
For example, if part of your detector ages — and until now, you know, we've often taken crude approaches to this. The calorimeter does age over time, so in Run 2 we actually had an automated procedure which follows the occupancy of the calorimeter and then directly adjusts the voltages to keep the relationship between the ADC counts and the energy as similar as possible. And you can certainly think of doing more sophisticated things along those lines. Yes.

Okay. All right, thank you, Vava, for the talk and for the answers.

Thank you for listening.

Next is Lucas, on machine learning techniques for jet classification.