1 00:00:00,000 --> 00:00:03,810 Okay. Hello, everyone, I hope you're all doing well. Today I will 2 00:00:03,810 --> 00:00:09,570 present the performance of hadronic tau reconstruction and identification at ATLAS 3 00:00:09,570 --> 00:00:15,960 and CMS. So first, just a quick motivation on why we are interested in looking at 4 00:00:15,960 --> 00:00:22,170 taus at ATLAS and CMS. One thing that's interesting about the tau is that it's 5 00:00:22,170 --> 00:00:29,220 the heaviest lepton, and so it has the largest Yukawa coupling to the Higgs 6 00:00:30,330 --> 00:00:36,240 among the leptons. Even though it's only about 6% of the branching fraction of 7 00:00:36,270 --> 00:00:41,700 the Higgs, compared to about 60% for b quarks, you can see from this 8 00:00:41,700 --> 00:00:47,340 measurement of the coupling that the resolution can be made much higher. 9 00:00:48,000 --> 00:00:53,250 One other interesting thing about taus is that they also decay inside the 10 00:00:53,250 --> 00:00:58,380 detector, and then there will be at least one neutrino, and this allows you to make CP 11 00:00:58,380 --> 00:01:04,110 measurements on the Higgs boson by looking at angular distributions, which are sensitive 12 00:01:04,110 --> 00:01:09,420 to the spin correlation of this decay. But also for several beyond-the-Standard-Model 13 00:01:09,420 --> 00:01:14,970 scenarios, you might have some interesting signatures with taus in the final state. 14 00:01:15,390 --> 00:01:19,320 And a lot of these are well motivated for taus because they might have some type 15 00:01:19,350 --> 00:01:25,860 of enhanced coupling to the third generation. And so before we start, also a 16 00:01:25,860 --> 00:01:30,540 quick review of the tau properties themselves. So it has a mass of about 1.8 17 00:01:30,540 --> 00:01:37,170 GeV, and this means it can decay to lighter hadrons like pions. And this is 18 00:01:37,170 --> 00:01:42,060 indeed the dominant type of tau decay. 
So if you look at this pie chart here, it's 19 00:01:42,060 --> 00:01:48,480 about two thirds of the time that you have at least one hadron in the final state. And in this 20 00:01:48,480 --> 00:01:55,020 talk, I will call this hadronic decay tau_h or just tau. And then about one third of 21 00:01:55,020 --> 00:02:02,670 the time you have either one muon or one electron from the tau decay. It also has a 22 00:02:02,670 --> 00:02:09,450 very short lifetime, but not too short. So it can still travel on average one 23 00:02:09,450 --> 00:02:16,350 millimeter inside the detector for a typical energy of 20 GeV. And just to compare, for 24 00:02:16,350 --> 00:02:20,250 example, with b hadrons, they have lifetimes which are about 10 25 00:02:20,250 --> 00:02:26,610 times longer. But thanks to this traveling inside the detector, there will often be a 26 00:02:26,610 --> 00:02:34,050 secondary vertex, which we can make use of in our identification algorithms. Besides that, 27 00:02:34,050 --> 00:02:40,470 because the tau is very light at these energies, taus are collimated jets 28 00:02:40,470 --> 00:02:45,120 that are very isolated compared to, for example, quark and gluon jets, which 29 00:02:45,120 --> 00:02:50,790 are typically much wider and have a lot of activity in this wider cone. So for 30 00:02:50,790 --> 00:02:55,170 taus, you typically have different numbers of charged pions, typically one 31 00:02:55,170 --> 00:03:00,720 or three. And then you can also sometimes have an extra pi zero, which promptly decays 32 00:03:00,720 --> 00:03:06,000 to two photons. So on this slide I just wanted to make a very rough sketch of how 33 00:03:06,000 --> 00:03:12,270 the hadronic tau decays are detected at both ATLAS and CMS. So, typically, we start from 34 00:03:12,330 --> 00:03:18,570 an AK4 jet, which is used as a seed for the tau algorithm, where we reconstruct different 35 00:03:18,600 --> 00:03:24,660 types of decay modes. 
And so typically, these are counting different numbers of 36 00:03:24,660 --> 00:03:30,000 charged tracks that have some energy deposits in the HCAL. And these are assumed to 37 00:03:30,000 --> 00:03:36,090 be charged pions, but sometimes there can also be a pi zero, which quickly decays to 38 00:03:36,120 --> 00:03:42,420 photons, and then there is some energy deposit in the ECAL. And then on top of this, you also 39 00:03:42,420 --> 00:03:46,920 have identification, which often is some type of MVA technique that is used to 40 00:03:46,920 --> 00:03:51,810 reject mainly jets, but also electrons and muons that can fake or can be 41 00:03:51,810 --> 00:03:56,400 misidentified as taus very efficiently sometimes, and these typically use 42 00:03:56,730 --> 00:04:04,650 lifetime and isolation variables to discriminate against jets. So before I go 43 00:04:04,650 --> 00:04:09,990 into the details, I just want to give a very broad overview of the algorithms 44 00:04:09,990 --> 00:04:14,610 that are available. So at CMS, on the left, we have just one algorithm for 45 00:04:14,700 --> 00:04:18,840 reconstruction, which is called HPS, whereas at ATLAS there are two. One is 46 00:04:18,870 --> 00:04:24,120 called the baseline algorithm, which was developed during Run 1, and then there is 47 00:04:24,120 --> 00:04:29,610 also, now in use, the tau particle flow. And then on top of these, 48 00:04:30,090 --> 00:04:34,290 we have classically BDTs that are used to discriminate against jets, but 49 00:04:34,410 --> 00:04:40,350 recently there have also been new neural networks that improve the performance 50 00:04:40,350 --> 00:04:48,510 of identification. So now I will first go into more detail for CMS, with the HPS 51 00:04:48,510 --> 00:04:50,280 algorithm and the DNN. 
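The average flight distance of about one millimeter quoted earlier for a 20 GeV tau can be cross-checked with the relation L = βγ·cτ. The sketch below is only an illustration: the tau mass (about 1.777 GeV) and cτ (about 87 micrometers) are approximate reference values that are not given in the talk, and the function name is hypothetical.

```python
import math

# Approximate reference values (assumptions, not numbers quoted in the talk).
TAU_MASS_GEV = 1.777   # tau lepton mass in GeV
TAU_CTAU_MM = 0.087    # c times the mean lifetime, in millimeters

def mean_flight_distance_mm(energy_gev, mass_gev=TAU_MASS_GEV, ctau_mm=TAU_CTAU_MM):
    """Mean decay length L = beta*gamma * c*tau for a particle of given energy."""
    if energy_gev <= mass_gev:
        raise ValueError("energy must exceed the rest mass")
    momentum_gev = math.sqrt(energy_gev**2 - mass_gev**2)
    beta_gamma = momentum_gev / mass_gev  # beta*gamma = p / m in natural units
    return beta_gamma * ctau_mm

if __name__ == "__main__":
    # A 20 GeV tau travels roughly one millimeter on average.
    print(f"{mean_flight_distance_mm(20.0):.2f} mm")
```

With these inputs the result comes out just under one millimeter, consistent with the number quoted in the talk.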
52 00:04:52,110 --> 00:04:58,980 So HPS stands for hadrons-plus-strips and, as I mentioned a few slides back, we start from 53 00:04:58,980 --> 00:05:06,330 AK4 jets as a seed. So in CMS these jets are already made from particle-flow 54 00:05:06,330 --> 00:05:15,420 objects, which are already identified as hadron, electron, or photon candidates. And 55 00:05:15,420 --> 00:05:20,370 then you look inside this cone and you look for the track with the highest 56 00:05:20,400 --> 00:05:26,010 momentum. And around this track you make a signal cone, which has a size of 0.2 or 57 00:05:26,010 --> 00:05:31,590 smaller, and then also an isolation cone of about 0.4, which is typically used to 58 00:05:31,590 --> 00:05:40,140 define some type of isolation. Then, at CMS, the decay mode of the tau is assigned 59 00:05:40,140 --> 00:05:45,780 by simply counting the number of charged tracks inside this signal cone. And then 60 00:05:45,930 --> 00:05:52,080 to count the different numbers of pi zeros, we look at the clusters in the ECAL. And 61 00:05:52,080 --> 00:05:55,290 what's particular about this algorithm is that you have these green boxes here, which 62 00:05:55,470 --> 00:06:01,320 are called strips, where you iteratively merge together electrons and photons. 63 00:06:01,320 --> 00:06:06,930 The main reason to do this is that photons have a high probability of 64 00:06:06,930 --> 00:06:11,400 converting into electrons, but it also helps reduce some of the pileup. 65 00:06:12,870 --> 00:06:14,520 Then once you have your tau 66 00:06:15,900 --> 00:06:21,840 candidates, you apply some type of identification. So the newest thing is 67 00:06:22,110 --> 00:06:27,060 this thing called DeepTau. 
And this is a convolutional deep neural network, and so 68 00:06:27,060 --> 00:06:32,250 it uses, just like the older BDTs, high-level information, like the lifetime, 69 00:06:32,610 --> 00:06:36,240 isolation variables and variables related to the kinematics of electrons and 70 00:06:36,240 --> 00:06:44,640 photons. But what's new is that they also use these convolutional layers with, as an 71 00:06:44,640 --> 00:06:49,440 input, the particle-flow hadrons, muons, electrons or photons that 72 00:06:49,440 --> 00:06:55,620 fall into this grid of cells in eta-phi, and then as an 73 00:06:55,620 --> 00:07:01,080 output it gives you the probability that your tau candidate is a real tau, 74 00:07:01,110 --> 00:07:06,600 a muon, an electron or a jet. And so basically this is a multiclass classifier. On 75 00:07:06,600 --> 00:07:11,580 the bottom here I show some examples of the performance in terms of 76 00:07:11,580 --> 00:07:17,520 misidentification versus efficiency. So, in blue is the new deep neural network, and 77 00:07:17,520 --> 00:07:22,650 you have to compare this to the green one, which are the previous identification 78 00:07:22,650 --> 00:07:27,750 algorithms. And in particular, the most important is the anti-jet 79 00:07:27,810 --> 00:07:34,860 discrimination here, where for a very typical efficiency of about 60%, you have a 80 00:07:34,860 --> 00:07:40,620 misidentification probability of 0.6%, or inversely, this is a rejection factor of 81 00:07:40,620 --> 00:07:49,380 about 170. Okay, so this was CMS. Now I will talk a bit more about ATLAS, starting 82 00:07:49,410 --> 00:07:56,640 with the baseline algorithm and the RNN. So for ATLAS, there was this baseline 83 00:07:56,640 --> 00:08:02,910 reconstruction developed during Run 1. Since ATLAS doesn't have particle flow, it 84 00:08:02,910 --> 00:08:09,660 instead has AK4 jets that are made out of clusters. 
And just like CMS, 85 00:08:09,990 --> 00:08:16,170 there is a signal cone defined with a fixed size of 0.2 and then an isolation cone of 0.4. 86 00:08:16,770 --> 00:08:23,370 And then, looking inside the signal cone, there is a set of BDTs which is 87 00:08:23,370 --> 00:08:29,010 used to classify the tracks. And then based on the output from this, you can 88 00:08:29,010 --> 00:08:33,060 assign the decay modes to either one or three prongs. So, prong here means just 89 00:08:33,060 --> 00:08:38,280 the charged track. And one thing that's important to notice here is that for 90 00:08:38,280 --> 00:08:43,800 baseline reconstruction, the momentum is simply the sum of the energy clusters 91 00:08:43,800 --> 00:08:50,400 here. And so this is inclusive of pi zeros, which are not separately identified as, for 92 00:08:50,400 --> 00:08:55,530 example, photons or electrons. Then on top of the baseline reconstruction algorithm, 93 00:08:55,530 --> 00:09:00,210 there is the neural network, which is a recurrent neural network that's used to 94 00:09:00,210 --> 00:09:06,630 discriminate against jets. And this uses, just like CMS, high-level variables, like 95 00:09:06,630 --> 00:09:11,640 those related to the tau lifetime and isolation, but also different variables 96 00:09:11,640 --> 00:09:15,930 related to the calorimeter. And what's special about this neural network is that it 97 00:09:15,930 --> 00:09:21,090 has several recurrent layers that use as input this low-level information about the 98 00:09:21,090 --> 00:09:26,790 different tracks and clusters. On the bottom here I have on the left one plot 99 00:09:26,820 --> 00:09:31,050 that shows the efficiency just after baseline reconstruction, and you can see 100 00:09:31,050 --> 00:09:39,690 that it's around 74% for this algorithm, and they compare here the one-prong 101 00:09:39,690 --> 00:09:47,250 and three-prong. 
Then on the right, there are the performance plots of the 102 00:09:47,280 --> 00:09:53,790 neural network, the rejection of jets versus the efficiency for taus. And so here 103 00:09:53,790 --> 00:10:02,850 again, for a very typical working point of 60% efficiency, you have about a 1% 104 00:10:03,180 --> 00:10:10,230 misidentification probability, which on the plot here is, I think, a rejection 105 00:10:10,230 --> 00:10:17,340 factor of about 80. Okay. So this was the first reconstruction 106 00:10:17,340 --> 00:10:21,240 algorithm. The second one is called the tau particle flow, and this one was 107 00:10:21,240 --> 00:10:26,760 developed to improve the momentum resolution of the tau. This time, instead 108 00:10:26,760 --> 00:10:32,850 of using an AK4 jet as a seed, it uses the baseline tau candidates. And it looks 109 00:10:32,850 --> 00:10:40,410 inside to identify and reconstruct the individual particles. So, in case of the 110 00:10:40,410 --> 00:10:45,840 charged pions, you simply associate the different tracks, which are 111 00:10:45,840 --> 00:10:50,550 in the cartoon the red dots here, with different energy clusters in blue. And 112 00:10:50,550 --> 00:10:54,090 then all the remaining clusters that don't have a charged track associated with them 113 00:10:54,330 --> 00:10:59,490 are assumed to be coming from pi zeros, and there's a dedicated BDT that is used to 114 00:10:59,490 --> 00:11:03,900 identify these clusters and give a probability that they are indeed pi zeros. 115 00:11:06,270 --> 00:11:12,330 Then once you have these different constituents, you have an additional set 116 00:11:12,330 --> 00:11:18,030 of BDTs that assign the right decay modes using these as an input. And this 117 00:11:18,030 --> 00:11:23,040 indeed improves the momentum resolution, mainly below a pT of 100 GeV. 
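The rejection factors quoted for the two identification networks are simply the inverse of the jet misidentification probability. A minimal sketch of that arithmetic, using the numbers mentioned in the talk (0.6% for the CMS network, and a rejection factor of about 80 for the ATLAS one); the function names are hypothetical:

```python
def rejection_factor(misid_probability):
    """Rejection factor = 1 / misidentification probability."""
    if not 0.0 < misid_probability <= 1.0:
        raise ValueError("misidentification probability must be in (0, 1]")
    return 1.0 / misid_probability

def misid_probability(rejection):
    """Inverse relation: misidentification probability for a given rejection."""
    if rejection < 1.0:
        raise ValueError("rejection factor must be at least 1")
    return 1.0 / rejection

if __name__ == "__main__":
    # 0.6% misidentification corresponds to a rejection factor near 170.
    print(round(rejection_factor(0.006)))
    # A rejection factor of 80 corresponds to about 1% misidentification.
    print(f"{misid_probability(80.0):.4f}")
```

Both working points are quoted at roughly 60% tau efficiency, so the two rejection factors can be compared on a like-for-like basis only to the extent that the jet samples and kinematic selections behind the two plots are similar.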
118 00:11:23,400 --> 00:11:29,640 So on the bottom left, I have one example for the angular distribution 119 00:11:29,730 --> 00:11:35,520 in terms of phi. So here you have the resolution for tau particle flow, which is 120 00:11:35,520 --> 00:11:40,950 black, and you see it's much narrower than the red, which is the 121 00:11:40,950 --> 00:11:46,770 baseline reconstruction. But also in terms of the transverse energy, on the bottom in 122 00:11:46,770 --> 00:11:52,350 the middle, if you just focus on the bottom few lines, which are the core of 123 00:11:52,350 --> 00:11:58,740 this resolution distribution, where you have about 68% of the taus, if you 124 00:11:58,740 --> 00:12:02,550 compare the black again, which is the tau particle flow, to the red baseline, you can 125 00:12:02,550 --> 00:12:07,920 see that you can improve the momentum resolution by quite a lot. And on the right 126 00:12:07,920 --> 00:12:14,970 here you see the efficiency of correctly assigning the different true decay modes 127 00:12:14,970 --> 00:12:18,420 to the reconstructed decay modes, and so you see that you have a very nice 128 00:12:18,420 --> 00:12:21,990 diagonal with a strong correlation between the different decay modes. 129 00:12:23,610 --> 00:12:25,860 You have a bit less than three minutes. 130 00:12:26,460 --> 00:12:34,440 Okay, good. Thanks. So, for the detector description, we typically make 131 00:12:34,440 --> 00:12:39,210 efficiency measurements using taus, and so here you have Z bosons decaying 132 00:12:39,210 --> 00:12:44,850 to two taus. One tau goes to a muon, one goes to hadrons, and here the muon is used 133 00:12:44,850 --> 00:12:50,070 as a well-measured tag, and then the other, the hadronically decaying tau, is your 134 00:12:50,070 --> 00:12:54,420 probe, and then you use some type of maximum likelihood fit to some 135 00:12:54,420 --> 00:12:59,970 discriminating observable to extract the scale factors. 
So in terms of ATLAS, 136 00:13:00,000 --> 00:13:08,370 in the case of ATLAS, they use the number of tracks inside a cone of 0.6. And so 137 00:13:08,400 --> 00:13:12,090 there are no plots for the newer neural network, so I'm showing you the ones for 138 00:13:12,090 --> 00:13:18,930 the BDT. But I can say that for the neural network, the scale factor is close to one 139 00:13:18,960 --> 00:13:23,700 with about 3% uncertainty. In case of CMS, we use the invariant mass between the muon and tau, 140 00:13:23,700 --> 00:13:28,530 so this is called the visible mass. And again, I only have the BDT plots, but the 141 00:13:28,530 --> 00:13:34,290 scale factor for the neural network is also between 0.9 and one, with about 6% 142 00:13:34,290 --> 00:13:40,260 uncertainty, which shows that you have a very good description of the detector. So 143 00:13:41,010 --> 00:13:44,820 this was the reconstruction and identification. One extra thing I want to 144 00:13:44,820 --> 00:13:50,520 go into is the energy calibration. So for ATLAS there's a dedicated energy 145 00:13:50,520 --> 00:13:58,200 calibration using a boosted regression tree, and they use an interpolation between 146 00:13:58,200 --> 00:14:03,690 the calorimeter-based and tau-particle-flow-based pT, and then they have as target 147 00:14:03,690 --> 00:14:08,640 the true pT and train this boosted regression tree for this. The 148 00:14:09,060 --> 00:14:13,410 resolution is about 6% on the momentum, and the energy scale in 149 00:14:13,410 --> 00:14:20,460 Monte Carlo is typically within 1 to 3%. For CMS, we are already using particle-flow 150 00:14:20,460 --> 00:14:24,750 constituents, which are already well calibrated, so there's not much extra 151 00:14:24,750 --> 00:14:29,640 calibration needed. And again, the pT is well modeled, with about 10% resolution. 152 00:14:31,350 --> 00:14:36,030 And just like ATLAS, the energy scale in Monte Carlo is very close, within 1 to 3%. 
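The scale factors mentioned above are ratios of the identification efficiency in data to the efficiency in simulation. In the measurements described in the talk they are extracted from a maximum likelihood fit; the sketch below replaces the fit with a simple counting estimate and uses made-up probe counts, so it only illustrates the ratio and its uncertainty propagation. All names and numbers in it are hypothetical.

```python
import math

def efficiency(n_pass, n_total):
    """Efficiency and its binomial uncertainty from pass/total probe counts."""
    if n_total <= 0 or not 0 <= n_pass <= n_total:
        raise ValueError("need 0 <= n_pass <= n_total and n_total > 0")
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

def scale_factor(pass_data, total_data, pass_mc, total_mc):
    """Data/simulation efficiency ratio with uncorrelated error propagation."""
    eff_data, err_data = efficiency(pass_data, total_data)
    eff_mc, err_mc = efficiency(pass_mc, total_mc)
    sf = eff_data / eff_mc
    rel_err = math.sqrt((err_data / eff_data) ** 2 + (err_mc / eff_mc) ** 2)
    return sf, sf * rel_err

if __name__ == "__main__":
    # Made-up probe counts, for illustration only (not from the talk).
    sf, sf_err = scale_factor(570, 1000, 600, 1000)
    print(f"scale factor = {sf:.3f} +/- {sf_err:.3f}")
```

A counting estimate like this ignores background contamination among the probes, which is the main reason a fit to a discriminating observable (track multiplicity or visible mass) is used in practice.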
153 00:14:37,140 --> 00:14:42,000 And then just to show off, in the last two slides, the performance of these different 154 00:14:42,000 --> 00:14:46,920 algorithms, I have some plots. So in this slide, we have the tau mass. And what's really 155 00:14:46,920 --> 00:14:51,720 cool about these slides, sorry, about these plots, is that you can see 156 00:14:51,720 --> 00:14:57,090 the different components of Drell-Yan decaying to taus, and you see 157 00:14:57,090 --> 00:15:03,030 the decay modes of these taus. So, for example, you have the pi zero mass 158 00:15:03,030 --> 00:15:08,610 here and intermediate resonances like the rho or the a1 meson. And then on the 159 00:15:08,610 --> 00:15:15,810 last slide you have the visible mass of the Z boson. And 160 00:15:15,810 --> 00:15:22,470 just to give you an idea, you have a purity of about 66% at the peak 161 00:15:22,470 --> 00:15:27,930 here for this plot for the DeepTau algorithm. So, hopefully I could convince 162 00:15:27,930 --> 00:15:31,410 you that taus are interesting for looking for new physics, but also for doing 163 00:15:31,410 --> 00:15:36,630 Higgs measurements. Each detector has its own unique approach. So CMS mainly 164 00:15:36,630 --> 00:15:42,570 exploits the fact that they have particle flow, which already provides 165 00:15:42,570 --> 00:15:47,370 calibrated constituents, whereas ATLAS starts from a baseline which is more 166 00:15:47,370 --> 00:15:52,620 calorimeter-based, but has its particle flow developed to exploit tracking for 167 00:15:52,620 --> 00:15:57,540 improved resolution. And I also showed that we have an excellent understanding of the 168 00:15:57,540 --> 00:16:04,170 detector, a good description, and also showed that recent developments of neural networks have 169 00:16:04,230 --> 00:16:09,660 been able to improve the jet rejection, with about 1% misidentification for 60% 170 00:16:09,690 --> 00:16:12,630 efficiency. 
So thank you for your attention. 171 00:16:13,349 --> 00:16:18,389 Thanks a lot, Zack. Do we have any questions from the audience? 172 00:16:25,560 --> 00:16:32,370 I don't see raised hands. Yes, there is one. So [name unclear], you should be able to talk 173 00:16:32,370 --> 00:16:32,790 now. 174 00:16:35,010 --> 00:16:36,180 Hi, can you hear me? 175 00:16:36,240 --> 00:16:42,000 Yes. Okay. Yeah. Thanks a lot for this nice talk. I had one question about the 176 00:16:42,030 --> 00:16:48,540 energy calibration. At high pT, don't you need anything special in your calibration 177 00:16:48,570 --> 00:16:57,030 in CMS? Or do you just rely out of the box on the particle-flow constituents? Because, I'm 178 00:16:57,030 --> 00:17:03,660 asking, in ATLAS what we get from the particle-flow reconstruction is very good 179 00:17:03,660 --> 00:17:10,500 at low pT but degrades at high pT because the track momentum is less accurate. 180 00:17:12,540 --> 00:17:18,360 Yeah, so for CMS, we don't have a dedicated energy calibration at this high 181 00:17:18,360 --> 00:17:21,810 pT. So we indeed still use the particle-flow constituents. And indeed, you see 182 00:17:21,810 --> 00:17:26,700 that the resolution goes down, I believe, at higher pT. But I don't have 183 00:17:26,700 --> 00:17:27,570 a plot for this. 184 00:17:28,470 --> 00:17:29,340 Okay, thank you. 185 00:17:33,450 --> 00:17:34,950 Do we have more questions? 186 00:17:37,200 --> 00:17:43,260 Yes. Can I ask a question? Go ahead. Yes, thank you. So what I would 187 00:17:43,260 --> 00:17:45,390 like to know is how much 188 00:17:46,830 --> 00:17:51,390 the reconstruction relies on a good calibration in CMS, 189 00:17:54,450 --> 00:17:58,170 which means that you need to reconstruct 190 00:18:00,000 --> 00:18:06,840 calibrated objects with the refined calibration or the calibration you get from 191 00:18:06,840 --> 00:18:10,740 the data. Yeah, so 192 00:18:10,770 --> 00:18:12,570 the [unclear] performance. 
193 00:18:13,139 --> 00:18:19,049 So you could already, in principle, use the prompt data, but 194 00:18:19,169 --> 00:18:24,509 typically once we make real measurements for physics analyses as 195 00:18:24,509 --> 00:18:31,049 the tau group, we typically do this based on re-reconstructed data, so already 196 00:18:31,049 --> 00:18:35,009 you have better calibration, in particular for the tracker, which is very important to get 197 00:18:35,009 --> 00:18:42,539 a good resolution on, or description of, the lifetime. So typically we use more the 198 00:18:42,539 --> 00:18:47,789 re-reconstructed data, but it already performs pretty well on prompt, I would 199 00:18:47,789 --> 00:18:48,059 say. 200 00:18:52,590 --> 00:18:56,850 I was also curious about the e/gamma part 201 00:18:58,260 --> 00:19:00,300 for CMS. Yes. 202 00:19:02,639 --> 00:19:03,539 That's a good question 203 00:19:05,220 --> 00:19:09,960 that I'm not sure about. But I believe it's similar. So I think we 204 00:19:09,960 --> 00:19:14,250 probably already have a very good calibration in prompt. But I'm not an 205 00:19:14,250 --> 00:19:18,750 expert on that. So. Okay. Thank you. 206 00:19:19,590 --> 00:19:20,610 Thank you for the question. 207 00:19:21,419 --> 00:19:28,349 Thanks a lot. I have a question myself. So, on slide seven, you show that the DNN 208 00:19:28,799 --> 00:19:33,719 performs much better than the BDT. Do you have any idea why it is so? 209 00:19:34,679 --> 00:19:40,709 Yeah, so like I said, this deep network is a convolutional deep neural 210 00:19:40,709 --> 00:19:45,539 network. So basically it really looks at the substructure of the different 211 00:19:45,869 --> 00:19:51,569 constituents inside this cone. 
So it doesn't just look at the lifetime 212 00:19:51,599 --> 00:19:57,599 and isolation, but also the relation between all these different tracks, I mean, 213 00:19:57,599 --> 00:20:01,079 hadrons and muons and electrons that fall inside this 214 00:20:03,780 --> 00:20:04,830 cone, so 215 00:20:06,750 --> 00:20:11,220 it kind of learns the patterns of this image. And so this really 216 00:20:11,220 --> 00:20:15,360 helps to exploit that and get the factor of increase in performance here. 217 00:20:16,320 --> 00:20:21,780 And for the BDT, did you use a specific library? 218 00:20:24,360 --> 00:20:34,350 I guess it was simply the TMVA library from ROOT. Okay. Yeah. Okay. 219 00:20:36,840 --> 00:20:37,380 Thank you. 220 00:20:40,500 --> 00:20:54,120 Do we have someone else? Let me check the attendees now, from the panelist list. 221 00:20:58,680 --> 00:21:01,980 No, it seems not. So thanks [unclear]