Thanks very much, everyone, for attending. It's my pleasure to give this talk about triggering and online calibration with machine learning techniques, covering material from the four collaborations and also from some community initiatives on these topics.

I wanted to start with a bit of context, because whenever we discuss machine learning, including in these contexts, it's important to remember that it has been with us in some form since the beginning; it has just taken different forms. I stole this picture to illustrate it. The point of the picture is that you're doing the same thing fundamentally (which is banging your computer), but you're just banging in different ways.

Before Run 1, it's not that the community didn't know about machine learning, but we were preoccupied with the question of whether we could trust our detectors in the first place. Then Run 1 happened and things worked, and I think it's fair to say the detectors worked better than we could have hoped for. We started increasingly adopting at least BDTs, as a kind of minimal machine learning approach, and then increasingly fancier things, first in offline analyses but also in real-time classification.

Then in Run 2, as we got more and more confident, there was increasing use of aligning and calibrating our detectors in real time, and real-time analysis (whether it's called that, as in LHCb, or scouting in CMS, or trigger-level analysis in ATLAS) was increasingly becoming a thing. So in addition to classification, we started really deploying ML to assist with feature building, in generators, and for calibration.

Now, going towards Run 3, we are really starting to consider ML methods and how they interleave with classical methods across the full list of our use cases, whether real-time or offline.

So that's a little bit of the context. Now I'd like to go through the use cases, starting with classification, because that's the one we deployed earliest in the trigger, or real-time, context.
So this is a slide from LHCb. On the right-hand side you can see a plot from when we optimized our main BDT-based trigger for Run 2: the different signal modes we tested with, and the efficiency of the different types of BDTs and classifiers that we tried out.

On the left is a plot that is about nine years old now, from the very first time we deployed that same BDT. The version that was reoptimized in a fairly sophisticated way for Run 2 was first deployed, in a much cruder way, already in 2011. You see in grey the response on minimum bias data and in red the response on bbbar Monte Carlo (2010 Monte Carlo). The modelling of the high-response region is actually very nice: this classifier was selecting proton-proton collisions which had bbbar pairs in them, and you can see how well the simulation follows the data. So within the LHCb trigger, the main inclusive trigger has been based on BDTs since almost the beginning, and different classifiers have been used widely throughout; there's a lot of documentation on that.

One thing we've put quite some work into over the years is finding ways to flatten the efficiency curve. Here you see the efficiency of the BDT for a three-body meson decay as a function of position in the Dalitz plane. The left plot is what you get if you don't explicitly train your classifier to be flat; the right plot is what you get if you do explicitly train it to keep the efficiency flat, which is done to minimize systematics. That gives you an idea of the kinds of things we worry about when doing this training.

This actually connects to one lesson for me from the last decade when it comes to classifiers. Before Run 1 we were asking, about ML black boxes: do we know what they're doing, will they bias us? But in fact, well-designed ML classifiers turned out to be less biasing than, quote unquote, simple cuts in many cases.
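Returning to the flattened-efficiency training just mentioned: as a concrete illustration, here is a minimal sketch of that kind of uniform-efficiency boosting, assuming the documented interface of the hep_ml package; the feature names, Dalitz-plane coordinates, and toy data are hypothetical placeholders, not the LHCb setup.

```python
import numpy as np
import pandas as pd
from hep_ml.gradientboosting import UGradientBoostingClassifier
from hep_ml.losses import BinFlatnessLossFunction

rng = np.random.default_rng(0)
n = 5000
# Toy sample: discriminating features plus the Dalitz-plane coordinates.
data = pd.DataFrame({
    "pt":       rng.exponential(2.0, n),
    "ip_chi2":  rng.exponential(5.0, n),
    "vtx_chi2": rng.exponential(3.0, n),
    "m12":      rng.uniform(0.0, 3.0, n),   # hypothetical Dalitz coordinates
    "m13":      rng.uniform(0.0, 3.0, n),
})
label = (rng.random(n) < 0.5).astype(int)

features = ["pt", "ip_chi2", "vtx_chi2"]

# Penalize non-uniformity of the signal (label 1) efficiency in bins of
# the Dalitz coordinates, on top of the usual classification loss.
loss = BinFlatnessLossFunction(uniform_features=["m12", "m13"], uniform_label=1)

clf = UGradientBoostingClassifier(loss=loss, n_estimators=100,
                                  train_features=features)
clf.fit(data, label)                     # frame must also carry m12, m13
scores = clf.predict_proba(data)[:, 1]   # flat-efficiency BDT response
```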
In a similar vein, ML has been used in LHCb for fake, or ghost, track classification, also since quite early on. Here you see the ROC curve on the left, and on the right a plot with real data, where blue is what was selected as real tracks and red is what was rejected as fakes. We actually gained quite a lot over a simple track chi-squared by training a neural network to really use the information from all the different tracking subdetectors. And this, as with all of these things, was tuned by hand: the implementation was tuned for execution speed. There are many other classifiers, for particle identification and things like that, that LHCb included in the trigger.
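For illustration, a ghost-track classifier of this sort might look like the following minimal sketch: a small feed-forward network combining quality information from the different tracking subdetectors. The feature count, architecture, and placeholder data are assumptions, not the LHCb implementation.

```python
import numpy as np
import tensorflow as tf

n_features = 6  # e.g. hits per tracking subdetector plus fit chi2/ndf values

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(track is real)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])

# Placeholder training data: in practice the labels come from simulation,
# where ghost tracks are known by construction.
X = np.random.rand(10000, n_features).astype("float32")
y = (np.random.rand(10000) > 0.5).astype("float32")
model.fit(X, y, epochs=5, batch_size=512, validation_split=0.2)
```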
Let me move on from LHCb, not to be too parochial, to classification at ATLAS. There are three examples here: on the left you see the algorithms used in the HLT to identify taus for the tau triggers, in the middle the neural-network-based electron ID, and on the right the online b-jet tagging, where you can see the performance of the different online operating points with respect to the offline b-tagging. Indeed, ATLAS uses different kinds of classifiers in parts of its online processing chain, with, I think it's fair to say, a gradual evolution from BDTs to neural networks and more complex things over time, and with large rate reductions at the same efficiency working point. And, as we also see in LHCb, it's generally found that you can reduce your trigger latency by reducing backgrounds earlier in the processing chain.

For CMS, I picked out two examples. One is the L1 muon endcap trigger, on the left of this plot, where you see the physics performance of what is called the upgraded trigger on the right: it's a bit better in the turn-on region than the previous incarnation. But on the far left, you see that the trigger rate is really substantially lower for broadly the same physics performance with this BDT-based endcap trigger. On the right you can see the performance of the neural-network-based DeepCSV b-jet tagging algorithm. If you look between the solid and the dashed lines of each colour, you see again the reduction in light-flavour jet efficiency for a given b-jet efficiency working point.

By the way, I find it personally interesting that all of us seem to have converged on using ML of one kind or another to tag beauty hadrons or beauty jets. That's quite nice.

The other thing I wanted to highlight here is some of the approaches towards the future for classification. One very exciting thing at the moment is deploying classifiers on FPGAs. I tried to convey that since the beginning we've worked on optimizing these classifiers and their implementations for speed, and a logical endpoint of that is to see whether you can compress your models to the extent that they fit onto FPGAs, which would allow you to use them in first-level triggers or in coprocessor-based architectures. This is, I think, extremely interesting for the future, and there's a lot of work going on; you see here a plot of the performance of these implementations in the HLS4ML framework.

The other thing I find interesting is unsupervised trigger classification. This is the question we often get: what if your trigger misses something you didn't predict? So there are ongoing approaches to build generalized anomaly-detection-based triggers. You can see from these plots that they're actually robust against having SM backgrounds in the training sample, so you can train on data. Of course, you don't get the same performance as fully supervised training, but I still think it's an interesting area being pursued.
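As a minimal sketch of that anomaly-detection idea, one common variant is an autoencoder trained directly on (mostly SM) data, flagging events it reconstructs poorly; the feature count, architecture, and threshold below are illustrative assumptions, not any experiment's actual trigger.

```python
import numpy as np
import tensorflow as tf

n_features = 20  # e.g. coarse event-level quantities available at trigger level

# Autoencoder trained to reproduce ordinary events; a compressed model of
# this kind is also the sort of thing one might port to an FPGA (e.g. via
# hls4ml) for use at the first trigger level.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(14, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(6, activation="relu"),    # bottleneck
    tf.keras.layers.Dense(14, activation="relu"),
    tf.keras.layers.Dense(n_features),
])
model.compile(optimizer="adam", loss="mse")

X = np.random.rand(50000, n_features).astype("float32")  # stand-in for data
model.fit(X, X, epochs=10, batch_size=1024)   # unsupervised: target = input

# Trigger decision: accept events whose reconstruction error is large.
errors = np.mean((model.predict(X) - X) ** 2, axis=1)
threshold = np.quantile(errors, 0.999)        # sets the accept rate (here 0.1%)
accept = errors > threshold
```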
Let me move on to two other things, as I'm running a bit behind time. As far as reconstruction goes, I think the main message is that, for the moment, reconstruction is still primarily classical, with machine learning assistance, and the community is trying to see how far the assistance part can be pushed. At LHCb the first steps were to add classifiers inside the classic Hough-transform-based pattern recognition, for earlier rejection of bad hit combinations. But you're not making trajectories from hits in the neural network itself; you just classify.

In the community more broadly, there are ongoing efforts. There was the TrackML challenge, where many participants competed on Kaggle to come up with ML-based tracking algorithms. Interestingly, the best approaches mixed classical tracking, and some physical model of the track path, with machine learning aspects, again to reject fake combinatorics as early as possible, along with approaches that train to find the optimal search parameters. And then there are even more ambitious things, like the Exa.TrkX project, which is trying to use neural networks for completely ML-based tracking. That, I think, is very interesting.

In a similar vein, I just want to show some results for a neural-network-based primary vertex reconstruction. You see here a picture of the primary vertex distribution in z in LHCb: the tracks obviously cluster at various z values, and you can quite intuitively use a neural network to try to find these peaks. But again, you have a classical algorithm to find the tracks first. So it's still a hybrid approach, but that's the direction of travel.

Now let's go towards calibration, with a couple of examples. In ALICE, there is a calibration of the TPC which you wish to achieve: you want to correct for distortions in the position of TPC clusters already online, but the analytical correction procedure is far too slow for this. So what's being explored is to use convolutional neural networks to correct for this, using the space charge, or currents, as the input. You see here on the plot that the resolution of the predicted distortion is about where you would expect from the intrinsic resolution, but it is much faster than analytical methods. There are a lot of details in the backup slides about this.
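As a rough sketch of that kind of approach, here is a convolutional network regressing distortions from a voxelized space-charge map; the grid shape and architecture are illustrative assumptions, not the actual ALICE network.

```python
import tensorflow as tf

# Voxelized (r, phi, z) map of the measured space-charge density; the
# grid size here is purely illustrative.
inputs = tf.keras.Input(shape=(17, 90, 17, 1))
x = tf.keras.layers.Conv3D(16, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Conv3D(32, 3, padding="same", activation="relu")(x)
# Three output channels: the predicted distortion components per voxel.
outputs = tf.keras.layers.Conv3D(3, 1, padding="same")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
# model.fit(charge_maps, distortion_maps, ...)  # targets from the slow
#                                               # analytical correction
```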
I also wanted to mention that in CMS there has been some very interesting work to calibrate jet energies with DNNs. Very quickly, because I am running out of time: on the right you can see, visually, the clear improvement in resolution between the DNN and the baseline approach.
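For illustration, a minimal sketch of DNN-based jet energy regression might look like this; the inputs, architecture, and target definition are hypothetical, not the actual CMS model.

```python
import tensorflow as tf

n_features = 10  # e.g. raw jet pT, eta, energy fractions, constituent counts

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),   # predicted response, e.g. gen pT / reco pT
])
# A Huber loss keeps the regression robust against response tails.
model.compile(optimizer="adam", loss=tf.keras.losses.Huber())
# model.fit(jet_features, target_response, ...)
```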
So let me conclude with a few personal remarks. Broadly speaking, over the last ten years the direction of travel has been an increasing deployment of ML-based methods throughout online processing and calibration. We know by now that this can outperform classical methods in physics terms, certainly in classification, with increasing hints of that in reconstruction too. But what I think is going to be crucial is to continue to accumulate our experience of deploying ML across different architectures, especially if computing keeps becoming more heterogeneous, as it is in the real world outside of HEP, and to understand the computational efficiency and where we should mix classical and ML approaches to make things faster. Both the algorithms and the structures used to pass data around are critical to this.

Thank you, Vava, for the talk and for sticking to time. I'll open the floor for questions; please raise your hand in the Zoom interface if you'd like to ask one. So far I don't see any raised hands... yes, Igor, you should be able to unmute yourself.

Can you hear me?

Yes.

Okay, this is Igor Volobuev with Texas Tech University. There appears to be a major effort, at least in the statistical community, on the interpretability of various machine learning techniques. I wonder if this kind of effort could be translated to high energy physics.

When you say interpretability, just to understand the question, what specifically are you thinking of?

Interpretability means that humans should not be viewing the machine learning techniques as a black box, right? So some understanding of what the technique is doing should be transferable and quantifiable.

Right. So, to talk in concrete terms about quantifying it: certainly in LHCb, but I think this is true across the experiments, there's a lot of work that has already gone into, and goes into, quantifying a classifier's performance as a function of all the different potential variables of interest, especially when we deploy these things in real time, because if you make a mistake there, you can't recover very easily. If you deploy a classifier in LHCb, you would typically check the efficiency turn-on curve not only in pT but also in the lifetime of the particle. I showed the example of flattening the efficiency across the Dalitz plane, which is a good example of something where we care about controlling the performance of the classifier, not only in terms of some absolute efficiency but explicitly in the shape of that efficiency, because it helps you minimize the systematics associated with simulating this efficiency later on.

And certainly, as part of that — again, I can speak directly for LHCb, but I'm sure it's true for others as well — since the beginning we've always pruned our classifiers. You basically start with some set of features, and then we fairly aggressively reduce that to some minimal set that is actually relevant. That's also part of understanding what the classifier does, right? You figure out which features it's using more, and so on. But this work is being done, if you want, in a physicist's way; I don't think it's formalized in a language that the statistical community would necessarily recognize, generally speaking. So it may be interesting to understand how much of this we are actually doing while just talking about it in a different way.

I see Steven. Steven, you should be able to unmute yourself.

Yeah, hi, this is Steven Schramm. Very nice talk, Vava, thank you very much. I was wondering about combining the two aspects, the triggering and the calibration. You spoke a lot about how we are using classification with machine learning in the trigger. I'm wondering, because I haven't seen too much of it yet: have you seen much in the form of regression, or other forms of calibration, being applied in the trigger as well?
No — certainly not in LHCb. Our focus has been on automating the calibration workflows so that they can be performed in real time, but the workflows themselves, by and large, are still very much classical. This may be something where, in the future, particularly for calorimeter-related things, there are some obvious points to tackle. For tracking, or for something like alignment, it's less obvious to me how you would do it.

Yeah, I guess I was thinking of, for example, some object pT cut or something like that, where you lose a lot of rate from the difference between the trigger and offline selections. I was wondering if you know of any studies that try to basically turn that into a step function by having something which really converges to the offline result. I haven't seen any studies, but I was just curious if you had.

No, I haven't. LHCb focuses on making sure that, if you want, the features that go in are the same: we focused on getting the reconstruction, the underlying inputs, to be identical between online and offline. But it's true that what we have seen over time is that no matter how much you make the underlying inputs the same, even things much more primitive than the machine learning, like packing and unpacking the data when you ship it around, end up affecting a downstream classifier. Because it's so multi-dimensional, tiny changes from even trivial things like that in your input features can give you non-negligible tails in the output, for no particularly good reason. So we've tried our best to suppress stuff like that. But it's an interesting point, something we should look at more systematically.

Okay, thanks a lot.

Maybe I have a related question: would online training be able to help with this kind of slow variation in the classification performance, or whatever performance, of the models?

So, I think where online training could be interesting is actually in helping you if you expect your detector performance to change significantly over time.
For example, if part of your detector ages — and until now, you know, we've often taken crude approaches to this. The calorimeter does age over time, so in Run 2 we actually had an automated procedure which follows the occupancy of the calorimeter and then directly adjusts the voltages to keep the relationship between the ADC counts and the energy as similar as possible. And you can certainly think of doing more sophisticated things along those lines. Yes.

Okay. All right, thank you, Vava, for the talk and for the answers.

Thank you for listening.

Next is Lucas, on machine learning techniques for jet classification.