C# - Ternary Operator ?:

C# includes a decision-making operator, ?:, which is called the conditional operator or ternary operator. It is a short form of the if-else statement.

The ternary operator starts with a boolean condition. If this condition evaluates to true, the first expression after the ? is evaluated; otherwise, the second expression after the : is evaluated.

The following example demonstrates the ternary operator.
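The original snippet did not survive extraction; a minimal sketch, assuming integer variables x and y:

```csharp
int x = 20, y = 10;

string result = x > y ? "x is greater than y" : "x is less than or equal to y";
Console.WriteLine(result);  // x is greater than y
```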

Above, the conditional expression x > y returns true, so the first expression after the ? will be evaluated.

The following executes the second statement.
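A sketch of that case, reusing the same assumed variables with their values swapped:

```csharp
int x = 10, y = 20;

string result = x > y ? "x is greater than y" : "x is less than or equal to y";
Console.WriteLine(result);  // x is less than or equal to y
```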

Thus, the ternary operator is a short form of the if-else statement. The above example can be rewritten using an if-else condition, as shown below.
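The equivalent if-else version (same assumed variables) might look like:

```csharp
int x = 20, y = 10;
string result;

if (x > y)
    result = "x is greater than y";
else
    result = "x is less than or equal to y";
```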

Nested Ternary Operator

Nested ternary operators are possible by including a conditional expression as a second statement.
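A sketch of a nested ternary expression (variable names assumed):

```csharp
int x = 5, y = 10;

string result = x > y ? "x is greater than y"
              : x < y ? "x is less than y"
              : "x is equal to y";
// here result is "x is less than y"
```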

The ternary operator is right-associative. The expression a ? b : c ? d : e is evaluated as a ? b : (c ? d : e) , not as (a ? b : c) ? d : e .



Verilog Assignments


Legal LHS values

An assignment has two parts - right-hand side (RHS) and left-hand side (LHS) with an equal symbol (=) or a less than-equal symbol (<=) in between.

The RHS can contain any expression that evaluates to a final value while the LHS indicates a net or a variable to which the value in RHS is being assigned.

Procedural Assignment

Procedural assignments occur within procedures such as always and initial blocks, tasks, and functions, and are used to place values onto variables. The variable will hold the value until the next assignment to the same variable.

The value will be placed onto the variable when the simulation executes this statement at some point during simulation time. The flow of these assignments can be controlled with statements such as if-else-if, case statements, and looping constructs.

An initial value can be placed onto a variable at the time of its declaration as shown next. The assignment does not have a duration and holds the value until the next assignment to the same variable happens. Note that variable declaration assignments to an array are not allowed.
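A sketch of a variable declaration assignment (signal name and values are illustrative):

```verilog
module tb;
  reg [7:0] data = 8'h05;  // initial value placed at declaration

  initial begin
    #20 data = 8'hee;      // data holds 8'h05 until this assignment at t=20
  end
endmodule
```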

If the variable is initialized during declaration and at time 0 in an initial block as shown below, the order of evaluation is not guaranteed, and hence can have either 8'h05 or 8'hee.
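A sketch of that race between the declaration assignment and a time-0 initial block:

```verilog
module tb;
  reg [7:0] data = 8'h05;  // initialized at declaration ...

  initial
    data = 8'hee;          // ... and again at time 0: order of evaluation
                           // is not guaranteed, so data may be 8'h05 or 8'hee
endmodule
```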

Procedural blocks and assignments will be covered in more detail in a later section.

Continuous Assignment

This is used to assign values onto scalar and vector nets and happens whenever there is a change in the RHS. It provides a way to model combinational logic without specifying an interconnection of gates and makes it easier to drive the net with logical expressions.
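For example, a continuous assignment driving a net a from nets b and c (the AND expression here is an assumption):

```verilog
wire a;
wire b, c;

assign a = b & c;  // re-evaluated whenever b or c changes
```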

Whenever b or c changes its value, then the whole expression in RHS will be evaluated and a will be updated with the new value.

This allows us to place a continuous assignment on the same statement that declares the net. Note that because a net can be declared only once, only one declaration assignment is possible for a net.
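Written as a net declaration assignment (same assumed expression):

```verilog
wire a = b & c;  // declaration and continuous assignment in one statement
```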

Procedural Continuous Assignment

  • assign ... deassign
  • force ... release

An assign statement will override all procedural assignments to a variable and is deactivated by using the same signal with deassign. The value of the variable will remain the same until the variable gets a new value through a procedural or procedural continuous assignment. The LHS of an assign statement cannot be a bit-select, part-select, or an array reference, but it can be a variable or a concatenation of variables.
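A sketch of assign/deassign in use (signal name and delay are illustrative):

```verilog
reg q;

initial begin
  assign q = 1'b0;  // overrides any procedural assignments to q
  #10 deassign q;   // q holds 1'b0 until its next assignment
end
```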

Force-release statements are similar to the assign-deassign statements but can also be applied to nets as well as variables. The LHS can be a bit-select of a net, a part-select of a net, a variable, or a net, but it cannot be a reference to an array or a bit/part-select of a variable. The force statement will override all other assignments made to the variable until it is released using the release keyword.
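A sketch of force/release (signal names and delays are illustrative):

```verilog
reg  q;
wire w;

initial begin
  force q = 1'b1;  // works on variables ...
  force w = 1'b0;  // ... and on nets
  #10 release q;   // normal assignments to q take effect again
  #10 release w;
end
```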

Using List Comprehensions with Conditional Assignment

Ever coded in Python and wished you could build lists faster and more efficiently? Buckle up, because we’re about to explore a powerful technique called Python List Comprehensions with Conditional Assignment . While the name might sound intimidating, it simply refers to using conditions to filter and modify elements as you create a new list in Python. This approach streamlines your code and makes data manipulation a breeze.

Demystifying List Comprehensions: Building Blocks

Before diving into conditional assignments, let’s revisit the basics of list comprehensions. Imagine you have a list of numbers, and you want to create a new list containing only the even ones. Traditionally, you might use a loop:
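The loop version might look like this (the list contents are illustrative):

```python
numbers = [1, 2, 3, 4, 5, 6]

evens = []
for number in numbers:
    if number % 2 == 0:
        evens.append(number)

print(evens)  # [2, 4, 6]
```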

This method gets the job done, but it requires a loop and an extra list. Here’s where list comprehensions come in:
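The same filter as a one-line list comprehension (same assumed input list):

```python
numbers = [1, 2, 3, 4, 5, 6]

evens = [number for number in numbers if number % 2 == 0]
print(evens)  # [2, 4, 6]
```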

This single line accomplishes the same task! It iterates through each number in the original list and uses an if statement to filter out only the even numbers (those divisible by 2). This core structure – [expression for item in iterable if condition] – forms the foundation of list comprehensions.

Taking Control: Filtering with Conditional Assignment

Now, things get exciting! We can leverage conditional assignments within list comprehensions to filter elements based on specific criteria. Let’s say you only want strings longer than 5 characters from a list of words. Here’s how we achieve that:
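A sketch of that filter (the word list is an assumption):

```python
words = ["python", "list", "comprehension", "code", "filter"]

long_words = [word for word in words if len(word) > 5]
print(long_words)  # ['python', 'comprehension', 'filter']
```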

In this example, the len(word) > 5 condition checks the length of each word. If a word is longer than 5 characters, it’s included in the long_words list. But what if you want to handle shorter words differently? That’s where the else clause comes in!
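With an else clause, the condition moves before the for keyword, turning the filter into a per-element transformation (same assumed word list):

```python
words = ["python", "list", "comprehension", "code", "filter"]

processed = [word if len(word) > 5 else word.upper() for word in words]
print(processed)  # ['python', 'LIST', 'comprehension', 'CODE', 'filter']
```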

Here, the else clause takes action on words that don’t meet the initial condition (length less than or equal to 5). It converts them to uppercase using word.upper() . This showcases the power of conditional assignments – you can filter elements and define what happens to those that don’t fit the criteria, all within a concise list comprehension.

Level Up: Complex Filtering and Beyond

List comprehensions with conditional assignments can handle more intricate filtering logic than simple checks. Imagine you have exam scores and want to keep only scores between 70 and 90 (inclusive), with a remainder of 2 when divided by 4. Here’s how we tackle this:
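A sketch of the chained conditions (the score list is an assumption):

```python
scores = [68, 70, 75, 82, 88, 90, 94]

filtered = [s for s in scores if 70 <= s <= 90 and s % 4 == 2]
print(filtered)  # [70, 82, 90]
```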

We use two conditions chained with and to filter for scores within the desired range and with a specific remainder. This demonstrates how you can build complex filtering logic with conditional assignments within list comprehensions.

Not Just Numbers: Conditional Magic with Other Iterables

While we’ve focused on lists, the same comprehension syntax with conditional assignment works with other iterables like dictionaries. Imagine you have a dictionary with student names as keys and their grades as values. You can use a dictionary comprehension (a close cousin of the list comprehension) to create a new dictionary with only students who scored above 80, assigning a congratulatory message as the value:
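A sketch of that dictionary comprehension (names and grades are illustrative):

```python
grades = {"Alice": 92, "Bob": 75, "Carol": 88}

congrats = {name: "Congratulations!" for name, grade in grades.items() if grade > 80}
print(congrats)  # {'Alice': 'Congratulations!', 'Carol': 'Congratulations!'}
```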

This example iterates through each key-value pair in the grades dictionary. The if statement filters for students with grades above 80, and the else clause (though not used here) could be used to assign a different message for students with lower grades. This demonstrates the versatility of conditional assignments in list comprehensions, extending their application beyond simple lists.

Advantages of List Comprehensions with Conditional Assignment

Now that you’ve grasped the core concepts, let’s explore the advantages of using list comprehensions with conditional assignment in your Python code.

1. Conciseness and Readability: Compared to traditional loops with if statements, list comprehensions offer a more compact and readable way to manipulate data. They condense multiple lines of code into a single, clear expression, improving the overall maintainability of your project.

2. Improved Performance: In some cases, list comprehensions can be more efficient than loops, especially when dealing with large datasets. This is because they leverage Python’s built-in list creation mechanisms, potentially leading to faster execution.

3. Versatility: As we’ve seen, conditional assignments within list comprehensions allow you to handle various filtering and modification tasks. You can create complex filtering logic, perform actions based on conditions, and even work with iterables beyond lists.

4. Functional Programming Style: List comprehensions align well with a functional programming approach, where you focus on transforming data without explicitly modifying existing structures. This can lead to cleaner and more declarative code.

When to Use List Comprehensions with Conditional Assignment

While these techniques offer significant benefits, they might not always be the best fit. Here are some scenarios where list comprehensions with conditional assignment shine:

  • Filtering and modifying elements based on specific criteria: When you need to create a new list from an existing one, applying conditions to include or modify elements, list comprehensions with conditional assignment provide a concise and efficient solution.
  • Data cleaning and transformation: These techniques are particularly useful for cleaning and transforming data before analysis. You can filter out irrelevant entries, convert values based on conditions, and reshape your data for further processing.
  • Creating new data structures with specific conditions: Need to generate a new list, dictionary, or other iterable based on selective criteria? List comprehensions with conditional assignment offer a powerful tool to achieve this in a single line of code.

In Conclusion

List comprehensions with conditional assignment are a powerful tool in any Python developer’s arsenal. They streamline data manipulation, enhance code readability, and offer a concise way to filter and modify elements while creating new lists or iterables. By understanding their core concepts, benefits, and appropriate use cases, you can leverage these techniques to write cleaner, more efficient, and Pythonic code.


Compiler tricks in x86 assembly: Ternary operator optimization

One relatively common compiler optimization that can be handy to quickly recognize relates to conditional assignment (where a variable is conditionally assigned either one value or an alternate value). This optimization typically happens when the ternary operator in C (“?:”) is used, although it can also be used in code like so:
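The original C snippet is missing from this copy; a representative conditional assignment of the kind described (function name and constants are illustrative) might be:

```c
/* A conditional assignment of this shape is a prime candidate for
   branch elimination; names and constants are illustrative. */
unsigned int select_value(int condition)
{
    unsigned int value;

    if (condition)
        value = 0x30;
    else
        value = 0x20;

    return value;
}
```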

The primary optimization that the compiler would try to make here is the elimination of explicit branch instructions.

Although conditional move operations were added to the x86 instruction set with the Pentium Pro, the Microsoft C compiler still does not use them by default when targeting x86 platforms (in contrast, the x64 compiler uses them extensively). However, there are still some tricks that the compiler has at its disposal to perform such conditional assignments without branch instructions.

This is possible through clever use of the “conditional set ( setcc )” family of instructions (e.g. setz ), which store either a zero or a one into a register without requiring a branch. For example, here’s an example that I came across recently:
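The original listing did not survive extraction; a representative sequence of the kind described (the operand location, register choice, and constants are all hypothetical) is:

```
xor     eax, eax
cmp     dword ptr [ebp-4], 0
setz    al                      ; eax = 1 if the variable was zero, else 0
dec     eax                     ; eax = 0 or 0xFFFFFFFF
and     eax, 10h                ; eax = 0 or 0x10
add     eax, 20h                ; eax = 0x20 or 0x30
```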

Broken up into the individual relevant steps, this code is something along the lines of the following in pseudo-code:
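In pseudo-code, the setcc/dec/and/add sequence behaves like the following (constants illustrative):

```
eax = (var == 0) ? 1 : 0;   // setz
eax = eax - 1;              // dec: 1 becomes 0, 0 becomes 0xFFFFFFFF
eax = eax & 0x10;           // and: either 0 or 0x10
eax = eax + 0x20;           // add: final value is 0x20 or 0x30
```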

The key trick here is the use of a setcc instruction, followed by a dec instruction and an and instruction. If one takes a minute to look at the code, the meaning of these three instructions in sequence becomes apparent. Specifically, a setcc followed by a dec sets a register to either 0 or 0xFFFFFFFF (-1) based on a particular condition flag. Following which the register is ANDed with a constant, which depending on whether the register is 0 or -1 will result in the register being set to the constant or being left at zero (since anything AND zero is zero, while ANDing any particular value with 0xFFFFFFFF yields the input value). After this sequence, a second constant is summed with the current value of the register, yielding the desired result of the operation.

(The initial constant is chosen such that adding the second constant to it results in one of the values of the conditional assignment, where the second constant is the other possible value in the conditional assignment.)

Cleaned up a bit, this code might look more like so:
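Written back as C, under the same illustrative names and constants, the whole sequence collapses to a single conditional assignment:

```c
/* branch-free selection, expressed at the source level */
unsigned int conditional_value(unsigned int var)
{
    return (var == 0) ? 0x20u : 0x30u;
}
```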

This sort of trick is also often used where something is conditionally set to either zero or some other value, in which case the “add” trick can be omitted and the non-zero conditional assignment value is used in the AND step.

A similar construct with the sbb instruction and the carry flag can also be constructed (as opposed to setcc , if sbb is more convenient for the particular case at hand). For example, the sbb approach tends to be preferred by the compiler when setting a value to zero or -1 as a further optimization on this theme as it avoids the need to decrement, assuming that the input value was already zero initially and the condition is specified via the carry flag.

This entry was posted on Thursday, October 18th, 2007 at 10:14 am and is filed under Reverse Engineering . You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.


Continuous Assignment and Combinational Logic in SystemVerilog

In this post, we primarily talk about the concept of continuous assignment in SystemVerilog . We also look at how we can use this in conjunction with the SystemVerilog operators to model basic combinational logic circuits .

However, continuous assignment is a feature which is entirely inherited from verilog so anyone who is already familiar with verilog can skip this post.

There are two main classes of digital circuit which we can model in SystemVerilog – combinational and sequential .

Combinational logic is the simplest of the two, consisting solely of basic logic gates, such as ANDs, ORs and NOTs. When the circuit input changes, the output changes almost immediately (there is a small delay as signals propagate through the circuit).

In contrast, sequential circuits use a clock and require storage elements such as flip flops . As a result, output changes are synchronized to the circuit clock and are not immediate.

In the rest of this post, we talk about the main techniques we can use to design combinational logic circuits in SystemVerilog.

In the next post, we will discuss the techniques we use to  model basic sequential circuits .

Continuous Assignment in SystemVerilog

In verilog based designs, we use continuous assignment to drive data on verilog net types . As a result of this, we use continuous assignment to model combinational logic circuits.

In SystemVerilog, we often use the logic data type rather than the verilog net or reg types. This is because the behavior of the logic type is generally more intuitive than the reg and wire types.

Despite this, we still make use of continuous assignment in SystemVerilog as it provides a convenient way of modelling combinational logic circuits.

We can use continuous assignment with either the logic type or with net types such as wire.

In SystemVerilog, we can actually use two different methods to implement continuous assignment.

The first of these is known as explicit continuous assignment. This is the most commonly used method for continuous assignment in SystemVerilog.

In addition, we can also use implicit continuous assignment, or net declaration assignment as it is also known. This method is less common but it can allow us to write less code.

Let's look at both of these techniques in more detail.

  • Explicit Continuous Assignment

We normally use the assign keyword when we want to use continuous assignment in SystemVerilog. This approach is known as explicit continuous assignment.

The SystemVerilog code below shows the general syntax for continuous assignment using the assign keyword.
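The general form can be sketched as follows (the angle-bracket fields are placeholders):

```systemverilog
assign <variable> = <value>;
```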

In this construct, we use the <variable> field to give the name of the signal which we are assigning data to. As we mentioned earlier, we can only use continuous assignment to assign data to net or logic type variables.

The <value> field can be a fixed value or we can create an expression using the SystemVerilog operators we discussed in a previous post.

When we use continuous assignment, the <variable> value changes whenever one of the signals in the <value> field changes state.

The code snippet below shows the most basic example of continuous assignment in SystemVerilog. In this case, whenever the b signal changes states, the value of a is updated so that it is equal to b.
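A sketch of that basic assignment (signal names assumed):

```systemverilog
logic a, b;

assign a = b;  // a is updated whenever b changes state
```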

  • Net Declaration Assignment

We can also use implicit continuous assignment in our SystemVerilog designs. This approach is also commonly known as net declaration assignment in SystemVerilog.

When we use net declaration assignment, we place a continuous assignment in the statement which declares our signal. This can allow us to reduce the amount of code we have to write.

To use net declaration assignment in SystemVerilog, we use the = symbol to assign a value to a signal when we declare it.

The code snippet below shows the general syntax we use for net declaration assignment.
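The general form can be sketched as follows (the angle-bracket fields are placeholders):

```systemverilog
wire <variable> = <value>;
```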

The variable and value fields have the same function for both explicit continuous assignment and net declaration assignment.

As an example, the SystemVerilog code below shows how we would use net declaration assignment to assign the value of b to signal a.
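A sketch of that net declaration assignment:

```systemverilog
wire a = b;  // declare the net and drive it in one statement
```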

Modelling Combinational Logic Circuits in SystemVerilog

We use continuous assignment and the SystemVerilog operators to model basic combinational logic circuits in SystemVerilog.

In order to show we would do this, let's look at the very basic example of a three input and gate as shown below.

In order to model this circuit in SystemVerilog, we must use the assign keyword to drive the data on to the and_out output.

We can then use the bit wise and operator (&) to model the behavior of the and gate.

The code snippet below shows how we would model this three input and gate in SystemVerilog.
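A sketch of that gate (the input port names are assumptions; the output name and_out is taken from the text above):

```systemverilog
module three_input_and (
  input  logic a, b, c,
  output logic and_out
);

  assign and_out = a & b & c;

endmodule
```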

This example shows how simple it is to design basic combinational logic circuits in SystemVerilog. If we need to change the functionality of the logic gate, we can simply use a different SystemVerilog bit wise operator .

If we need to build a more complex combinational logic circuit, it is also possible for us to use a mixture of different bit wise operators.

To demonstrate this, let's consider the basic circuit shown below as an example.

In order to model this circuit in SystemVerilog, we need to use a mixture of the bit wise and (&) and or (|) operators. The code snippet below shows how we would implement this circuit in SystemVerilog.
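Since the circuit diagram itself is not reproduced here, the snippet below is one plausible reading of it: two inputs ANDed together, with the result ORed with a third input (all signal names are assumptions):

```systemverilog
assign logic_out = (a & b) | c;
```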

Again, this code is relatively straight forward to understand as it makes use of the SystemVerilog bit wise operators which we discussed in the last post.

However, we need to make sure that we use brackets when modelling more complex logic circuits. Not only does this ensure that the circuit operates correctly, it also makes our code easier to read and maintain.

Modelling Multiplexors in SystemVerilog

Multiplexors are another component which are commonly used in combinational logic circuits.

In SystemVerilog, there are a number of ways we can model these components.

One of these methods uses a construct known as an always block which we will discuss in detail in the next post. Therefore, we will not discuss this approach to modelling multiplexors in this post.

However, we will look at the other methods we can use to model multiplexors in the rest of this post.

  • SystemVerilog Conditional Operator

As we talked about in a previous post, there is a conditional operator in SystemVerilog . This functions in the same way as the conditional operator in the C programming language.

To use the conditional operator, we write a logical expression before the ? operator which is then evaluated to see if it is true or false.

The output is assigned to one of two values depending on whether the expression is true or false.

The SystemVerilog code below shows the general syntax which the conditional operator uses.
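The general form can be sketched as follows (the angle-bracket fields are placeholders):

```systemverilog
assign <variable> = <condition> ? <value_if_true> : <value_if_false>;
```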

From this example, it is clear how we can create a basic two to one multiplexor using this operator.

However, let's look at the example of a simple 2 to 1 multiplexor as shown in the circuit diagram below.

The code snippet below shows how we would use the conditional operator to model this multiplexor in SystemVerilog.
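A sketch of that 2-to-1 multiplexor (the select and input names are assumptions):

```systemverilog
// sel chooses input b when high, input a when low
assign mux_out = sel ? b : a;
```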

  • Nested Conditional Operators

Although this is not common, we can also write code to build larger multiplexors by nesting conditional operators.

To show how this is done, let's consider a basic 4 to 1 multiplexor as shown in the circuit below.

In order to model this in SystemVerilog using the conditional operator, we treat the multiplexor circuit as if it were a pair of two input multiplexors.

This means one multiplexor will select between inputs A and B whilst the other selects between C and D. Both of these multiplexors use the LSB of the address signal as the address pin.

The SystemVerilog code shown below demonstrates how we would implement this.
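A sketch of the two first-stage multiplexors (signal names are assumptions; both use the LSB of the address, as described above):

```systemverilog
assign mux1 = addr[0] ? b : a;  // selects between inputs A and B
assign mux2 = addr[0] ? d : c;  // selects between inputs C and D
```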

To create the full four input multiplexor, we would then need another multiplexor.

This multiplexor then takes the output of the other two multiplexors and uses the MSB of the address signal to select between.

The code snippet below shows the simplest way to do this. This code uses the signals mux1 and mux2 which we defined in the last example.
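A sketch of that second stage, assuming mux1 and mux2 carry the first-stage outputs:

```systemverilog
// the MSB of the address selects between the first-stage outputs
assign mux_out = addr[1] ? mux2 : mux1;
```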

However, we could easily remove the mux1 and mux2 signals from this code and instead use nested conditional operators.

This reduces the amount of code that we would have to write without affecting the functionality.

The code snippet below shows how we would do this.
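A sketch of the fully nested version, with the intermediate signals folded in (signal names are assumptions):

```systemverilog
assign mux_out = addr[1] ? (addr[0] ? d : c)
                         : (addr[0] ? b : a);
```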

As we can see from this example, when we use conditional operators to model multiplexors in verilog, the code can quickly become difficult to understand. Therefore, we should only use this method to model small multiplexors.

  • Arrays as Multiplexors

It is also possible for us to use basic SystemVerilog arrays to build simple multiplexors.

In order to do this, we combine all of the multiplexor inputs into a single array type and use the address to point at an element in the array.

In order to get a better idea of how this works in practice, let's consider a basic four to one multiplexor as an example.

The first thing we must do is combine our input signals into an array. There are two ways in which we can do this.

Firstly, we can declare an array and then assign all of the individual bits, as shown in the SystemVerilog code below.
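A sketch of the element-by-element version (input names are assumptions):

```systemverilog
logic [3:0] mux_in;

assign mux_in[0] = a;
assign mux_in[1] = b;
assign mux_in[2] = c;
assign mux_in[3] = d;
```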

Alternatively we can use the SystemVerilog concatenation operator , which allows us to assign the entire array in one line of code.

In order to do this, we use a pair of curly braces - { } - and list the elements we wish to include in the array inside of them.

When we use the concatenation operator we can also declare and assign the variable in one statement.

The SystemVerilog code below shows how we can use the concatenation operator to populate an array.
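A sketch of the concatenation version, declaring and assigning in one statement (input names are assumptions; element 0 takes the rightmost item):

```systemverilog
logic [3:0] mux_in = {d, c, b, a};
```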

As SystemVerilog is a loosely typed language, we can use the two-bit addr signal as if it were an integer type. This signal then acts as a pointer that determines which of the four elements to select.

The code snippet below demonstrates this method in practice.
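A sketch of that indexing approach, with mux_in and addr as assumed above:

```systemverilog
// the 2-bit addr signal indexes the array directly
assign mux_out = mux_in[addr];
```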

What is the difference between implicit and explicit continuous assignment?

When we use implicit continuous assignment we assign the variable a value when we declare. In contrast, when we use explicit continuous assignment we use the assign keyword to assign a value.

Write the code for a 2 to 1 multiplexor using any of the methods discussed in this post.

Write the code for circuit below using both implicit and explicit continuous assignment.



Assignment operators (C# reference)


The assignment operator = assigns the value of its right-hand operand to a variable, a property , or an indexer element given by its left-hand operand. The result of an assignment expression is the value assigned to the left-hand operand. The type of the right-hand operand must be the same as the type of the left-hand operand or implicitly convertible to it.

The assignment operator = is right-associative, that is, an expression of the form

x = y = z

is evaluated as

x = (y = z)

The following example demonstrates the usage of the assignment operator with a local variable, a property, and an indexer element as its left-hand operand:
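A minimal sketch along those lines (the list and values are illustrative):

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        var numbers = new List<double>() { 1.0, 2.0, 3.0 };

        numbers.Capacity = 100;      // property as the left-hand operand
        numbers[0] = 0.5;            // indexer element as the left-hand operand
        double first = numbers[0];   // local variable as the left-hand operand

        Console.WriteLine(first);    // 0.5
    }
}
```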

The left-hand operand of an assignment receives the value of the right-hand operand. When the operands are of value types , assignment copies the contents of the right-hand operand. When the operands are of reference types , assignment copies the reference to the object.

This is called value assignment : the value is assigned.

ref assignment

Ref assignment = ref makes its left-hand operand an alias to the right-hand operand, as the following example demonstrates:
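A sketch matching the description in the next paragraph (array contents are illustrative):

```csharp
double[] arr = { 0.0, 0.0, 0.0 };

ref double arrayElement = ref arr[0];  // alias to the first array element
arrayElement = 3.0;                    // ordinary assignment updates arr[0]

arrayElement = ref arr[^1];            // ref reassign to the last element
arrayElement = 5.0;                    // updates arr[2]

Console.WriteLine(string.Join(" ", arr));  // 3 0 5
```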

In the preceding example, the local reference variable arrayElement is initialized as an alias to the first array element. Then, it's ref reassigned to refer to the last array element. As it's an alias, when you update its value with an ordinary assignment operator = , the corresponding array element is also updated.

The left-hand operand of ref assignment can be a local reference variable , a ref field , and a ref , out , or in method parameter. Both operands must be of the same type.

Compound assignment

For a binary operator op , a compound assignment expression of the form

x op= y

is equivalent to

x = x op y

except that x is only evaluated once.

Compound assignment is supported by arithmetic , Boolean logical , and bitwise logical and shift operators.
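For example, with the arithmetic += operator:

```csharp
int a = 5;
a += 9;                // equivalent to a = a + 9
Console.WriteLine(a);  // 14
```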

Null-coalescing assignment

You can use the null-coalescing assignment operator ??= to assign the value of its right-hand operand to its left-hand operand only if the left-hand operand evaluates to null . For more information, see the ?? and ??= operators article.
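A small sketch of ??= in action:

```csharp
string? name = null;
name ??= "fallback";      // assigns because name is null
name ??= "ignored";       // no effect: name is no longer null
Console.WriteLine(name);  // fallback
```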

Operator overloadability

A user-defined type can't overload the assignment operator. However, a user-defined type can define an implicit conversion to another type. That way, the value of a user-defined type can be assigned to a variable, a property, or an indexer element of another type. For more information, see User-defined conversion operators .

A user-defined type can't explicitly overload a compound assignment operator. However, if a user-defined type overloads a binary operator op , the op= operator, if it exists, is also implicitly overloaded.

C# language specification

For more information, see the Assignment operators section of the C# language specification .

  • C# operators and expressions
  • ref keyword
  • Use compound assignment (style rules IDE0054 and IDE0074)


  • Open access
  • Published: 30 April 2024

Reconstruction of unstable heavy particles using deep symmetry-preserving attention networks

Michael James Fenton, Alexander Shmakov, Hideki Okawa, Yuji Li, Ko-Yang Hsiao, Shih-Chieh Hsu, Daniel Whiteson & Pierre Baldi

Communications Physics, volume 7, Article number: 139 (2024)


  • Experimental particle physics
  • Phenomenology

Reconstructing unstable heavy particles requires sophisticated techniques to sift through the large number of possible permutations for assignment of detector objects to the underlying partons. An approach based on a generalized attention mechanism, symmetry preserving attention networks (SPA-NET), has been previously applied to top quark pair decays at the Large Hadron Collider which produce only hadronic jets. Here we extend the SPA-NET architecture to consider multiple input object types, such as leptons, as well as global event features, such as the missing transverse momentum. In addition, we provide regression and classification outputs to supplement the parton assignment. We explore the performance of the extended capability of SPA-NET in the context of semi-leptonic decays of top quark pairs as well as top quark pairs produced in association with a Higgs boson. We find significant improvements in the power of three representative studies: a search for \(t\bar{t}H\) , a measurement of the top quark mass, and a search for a heavy \({Z}^{{\prime} }\) decaying to top quark pairs. We present ablation studies to provide insight on what the network has learned in each case.


Event reconstruction is a crucial problem at the Large Hadron Collider (LHC), where heavy, unstable particles such as top quarks, Higgs bosons, and electroweak W and Z bosons decay before being directly measured by the detectors. Measuring the properties of these particles requires reconstructing their four-momenta from their immediate decay products, which we refer to as partons . Since many partons leave indistinguishable signatures in detectors, a central difficulty is assigning the observed detector objects to each parton. As the number of partons grows, the combinatorics of the problem becomes overwhelming, and the inability to efficiently select the correct assignment dilutes valuable information.

Previously, methods such as χ 2 fits 1 or kinematic likelihoods 2 have provided analytic approaches for performing this task. These approaches are limited, however, by the requirement of exhaustively building each possible permutation of the event and by the limited amount of kinematic information that can be incorporated. Particularly at high-energy hadron colliders such as the LHC, events often contain many extra objects from additional activity as well as the particles originating from the hard scattering event, which can cause the performance of permutation-based methods to degrade substantially.

In recent years, modern machine learning tools such as graph neural networks and transformers 3 have been broadly applied to many problems in high-energy physics. For example, the problem of identifying the origin of single, large-radius jets has been closely studied 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 using such techniques. Some of these have incorporated symmetry considerations 11 , 12 , 14 to aid performance. Implementations of such strategies to event-level reconstruction have been limited so far to single object permutation assignment 15 , 16 , 17 or direct regression 18 .

This work presents a complete machine learning approach to multi-object event reconstruction and kinematic regression at the LHC, named SPA-NET owing to its use of a symmetry-preserving attention mechanism, designed to incorporate all of the symmetries present in the problem. It was first introduced 15 , 16 in the context of reconstruction of the all-hadronic final state in which only one type of object is present. In this work, we extend and complete the method by generalizing to arbitrary numbers of object types, as well as adding multiple capabilities that can aid the application of SPA-NET in LHC data analysis, including signal and background discrimination, kinematic regression, and auxiliary outputs to separate different kinds of events.

To demonstrate the new capacity of the technique, we study its performance in final states containing a lepton and a neutrino. The method is compared to existing baseline approaches and demonstrated to provide significant improvements in three flagship LHC physics measurements: \(t\bar{t}H\) cross-section, top quark mass, and a search for a hypothetical \({Z}^{{\prime} }\) boson decaying to top quark pairs. These examples demonstrate various additional features, such as kinematic regression and signal versus background discrimination. The method can be applied to any final state at the LHC or other particle collider experiments, and may be applicable to other set assignment tasks in other scientific fields.

SPA-NET extensions

We present several improvements to the base SPA-NET architecture 15 , 16 to tackle the additional challenges inherent to events containing multiple reconstructed object classes and to allow for a greater variety of outputs for an array of potential auxiliary tasks. These modifications allow SPA-NET to be applied to essentially any topology and allow for the analysis of many additional aspects of events beyond the original jet-parton assignment task.

Base SPA-NET overview

For context, we first provide a brief overview of the original SPA-NET architecture 15 , 16 . These components are shown with black boxes and lines in Fig.  1 . The jets, represented by their kinematics, are first embedded into a high-dimensional latent space and subsequently processed by a central transformer encoder 3 with the goal of providing contextual information to the jets. We note that the architecture of this transformer encoder follows the original definition 3 , with one major exception: we omit the positional encoding to prevent introducing ordering over our input. As the jets are presented as a set of momentum vectors, with no obvious order, we want the network to remain permutation equivariant with respect to the input order. We replicate the architecture for the particle transformers, now applying individually trained transformers for every resonance particle in our event.

Fig. 1: The diagram flows left to right, with inputs denoted by \({{{{{{{{\mathcal{E}}}}}}}}}_{i}\) , assignment outputs denoted by P j , regression outputs η ν and \(m_{t\bar{t}}\) , and classification output \({{{{{{{\mathcal{S/B}}}}}}}}\) . Black blocks show components common to our previous works 15 , 16 , with new components shown in blue.

Finally, to extract the joint distribution over jets for each resonance particle, we apply a symmetric tensor attention layer defined in Section 3 of our previous work 16 . This layer applies a generalized form of attention, modified by a symmetry group over assignments, to produce a symmetric joint distribution over jets describing the likelihood of assigning said jets to the resonance particle. This split architecture, with individual branches for every resonance particle, allows us to avoid computing a full permutation over all possible assignments and reduces the runtime from combinatorial in the number of jets, \({{{{{{{\mathcal{O}}}}}}}}(N!)\) , to \({{{{{{{\mathcal{O}}}}}}}}({N}^{{k}_{p}})\) , where k p is the number of daughter particles produced by a resonance particle.
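As a toy illustration of this layer, the symmetrized joint distribution for a resonance with two interchangeable daughters (for example the quarks from the hadronic W decay) can be sketched in a few lines of NumPy. This is a minimal sketch under simplifying assumptions (a single bilinear score, k p  = 2), not the full symmetric tensor attention of SPA-NET:

```python
import numpy as np

def symmetric_joint_assignment(x, W):
    """Toy symmetric attention for a resonance with two interchangeable
    daughters. x is an (N, D) array of jet embeddings; W is a (D, D)
    weight matrix. Returns an (N, N) joint distribution P over jet pairs
    with P[i, j] == P[j, i], P[i, i] == 0, and sum(P) == 1."""
    W_sym = 0.5 * (W + W.T)              # weights respect the i <-> j symmetry
    scores = x @ W_sym @ x.T             # pairwise assignment scores
    scores = 0.5 * (scores + scores.T)   # enforce exact numerical symmetry
    np.fill_diagonal(scores, -np.inf)    # one jet cannot fill both slots
    exp = np.exp(scores - np.max(scores))  # numerically stable softmax
    return exp / exp.sum()
```

For k p daughters the same idea generalizes to a rank- k p symmetric tensor, which is what gives the polynomial rather than factorial scaling quoted above.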

Input observables

While the original SPA-NET 15 , 16 studies concentrated on examples where all objects have hadronic origins, we focus here on the challenges of semi-leptonic topologies. These events contain several different reconstructed objects, including the typical hadronic jets as well as leptons and missing transverse momentum ( \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) ) typically associated with neutrinos. Unlike jets or leptons, this \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) is a global observable, and its multiplicity does not vary event by event.

We accommodate these additional inputs by training individual position-independent embeddings for each class of input. This allows the network to adjust to the various distributions for each input type, and allows us to define sets of features specific to each type of object. We parameterize jets using the \(\{M,{p}_{{{{{{{{\rm{T}}}}}}}}},\eta ,\sin \phi ,\cos \phi ,b{{{{{{{\rm{-tag}}}}}}}}\}\) representation, where M is the jet mass, p T is the jet momentum transverse to the incoming proton beams, and ϕ is the azimuthal angle around the detector, represented by its trigonometric components to avoid the boundary condition at ϕ  = ±  π . η is the pseudo-rapidity 19 of the jet, the standard measure of the polar angle between the incoming proton beam and the jet, commonly used in particle physics because differences in η are invariant under Lorentz boosts along the beam axis. Leptons are similarly represented using \(\{M,{p}_{{{{{{{{\rm{T}}}}}}}}},\eta ,\sin \phi ,\cos \phi ,{{{{{{{\rm{flavor}}}}}}}}\}\) where flavor is 0 for electrons and 1 for muons. Finally, \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) is represented using two scalar values, the magnitude and azimuthal angle, and is treated as an always-present jet or lepton. The individual embedding layers map these disparate objects with different features into a unified latent space which may be processed by the central transformer.
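Assembling these per-type feature vectors is straightforward; a minimal sketch (the function names are illustrative, not taken from the SPA-NET code):

```python
import math

def jet_features(m, pt, eta, phi, btag):
    """{M, pT, eta, sin(phi), cos(phi), b-tag}: phi is encoded as
    (sin, cos) so there is no discontinuity at phi = +/- pi."""
    return [m, pt, eta, math.sin(phi), math.cos(phi), float(btag)]

def lepton_features(m, pt, eta, phi, flavor):
    """Same kinematics as jets; flavor is 0 for electrons, 1 for muons."""
    return [m, pt, eta, math.sin(phi), math.cos(phi), float(flavor)]

def met_features(met, met_phi):
    """Global missing transverse momentum: magnitude and azimuthal angle."""
    return [met, met_phi]
```

The (sin ϕ , cos ϕ ) encoding makes two jets on either side of the ϕ  = ±  π seam nearly identical in feature space, as intended.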

The global inputs, such as \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) , need to be treated differently than the jets and leptons, as they do not have associated parton assignments. Therefore, after computing the central transformer, we do not include the extra global \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) vector in the particle transformers. This allows the transformer to freely share the \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) information with the other objects during the central transformer step while preventing it from being chosen as a reconstruction object for jet-parton assignment.

Secondary outputs

Beyond jet-parton assignment, we are interested in reconstruction of further observables, such as the unknown neutrino η , or differentiation of signal events from background. These observables are defined at event level, and are independent of the jet multiplicity, so we must construct a way of summarizing the entire event in a single vector to predict these values.

To accomplish this, we add additional output heads to the central transformer, presented with blue boxes and lines on the right in Fig.  1 , which are trained end-to-end simultaneously with the base reconstruction task. We extract an event embedding from the central transformer by including a learnable event vector in the inputs to the transformer. We append this learned event vector \({{{{{{{{\mathcal{E}}}}}}}}}_{E}\in {{\mathbb{R}}}^{D}\) to the list of embedded input vectors: \({{{{{{{\mathcal{E}}}}}}}}=\{{{{{{{{{\mathcal{E}}}}}}}}}_{1},{{{{{{{{\mathcal{E}}}}}}}}}_{2},\ldots ,{{{{{{{{\mathcal{E}}}}}}}}}_{n},{{{{{{{{\mathcal{E}}}}}}}}}_{L},{{{{{{{{\mathcal{E}}}}}}}}}_{G},{{{{{{{{\mathcal{E}}}}}}}}}_{E}\}\) prior to the central transformer (Fig.  1 ). This allows the central transformer to process this event vector using all of the information available in the observables.

We extract the encoded event vector after the central transformer and treat it as a latent summary representation of the entire event z E . We can then feed these latent features into simple feed-forward neural networks to perform signal vs background classification, \({{{{{{{\mathcal{S/B}}}}}}}}({z}_{E})\) , neutrino kinematics regression, η ν ( z E ), or any other downstream tasks. These tasks may additionally be learned after the main SPA-NET training, as z E may be computed using a fixed set of SPA-NET weights and then used for other downstream tasks without altering the original SPA-NET.

These additional feed-forward networks are trained using their respective loss, either categorical log-likelihood or mean squared error (MSE). These auxiliary losses are simply added to the total SPA-NET loss, weighted by their respective hyperparameter α i . With the parton reconstruction loss, \({{{{{{{{\mathcal{L}}}}}}}}}_{{{{{{{{\rm{reconstruction}}}}}}}}}\) defined as the masked minimum permutation loss from Equation 6 of our previous work 16 , the SPA-NET loss becomes:

$${{{{{{{\mathcal{L}}}}}}}}={{{{{{{{\mathcal{L}}}}}}}}}_{{{{{{{{\rm{reconstruction}}}}}}}}}+\mathop{\sum}_{i}{\alpha }_{i}\,{{{{{{{{\mathcal{L}}}}}}}}}_{i}$$

Particle detector

In our previous work 16 , we introduced the ability to reconstruct partial events by splitting the reconstruction task based on the event topology. This is a powerful technique that is particularly useful in complex events, where it is very likely that at least one of the partons will not have a corresponding detector object.

However, the assignment outputs are trained only on examples in which the event contains all detector objects necessary for a correct parton assignment. We refer to the reconstruction target particles in these examples as reconstructable. We must train this way because only reconstructable particles have truth-labeled detector objects, which are required for training, and we ignore non-reconstructable particles via the masked loss defined in Equation 6 of our previous work 16 . As a result of this training procedure, the SPA-NET assignment probability P a only represents a conditional assignment distribution over jet indices j i for each particle p given that the particle is reconstructable:

$${P}_{a}(\,{j}_{1},{j}_{2},\ldots ,{j}_{{k}_{p}}\,| \,p\,{{{{{{{\rm{reconstructable}}}}}}}})$$

We use P ( p  reconstructable) =  P ( p ) and P ( p  not reconstructable) =  P ( ¬  p ) for conciseness. To construct an unconditional assignment distribution, we need to additionally estimate the probability that a given particle is reconstructable in the event, P d . This additional distribution may be used to produce a pseudo-marginal probability for the assignment. Since \({P}_{a}(\,{j}_{1},{j}_{2},\ldots ,{j}_{{k}_{p}}| \neg p)=0\) is not a valid distribution, this marginal probability is ill-defined; we may nevertheless use the pseudo-marginal probability

$${P}_{m}(\,{j}_{1},{j}_{2},\ldots ,{j}_{{k}_{p}})={P}_{a}(\,{j}_{1},{j}_{2},\ldots ,{j}_{{k}_{p}}\,| \,p)\,{P}_{d}(\,p)$$

as an overall measurement of the assignment confidence of the network.
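Downstream of the network, the detection and assignment outputs can be combined in this way; a small illustrative sketch (the names here are ours, not part of the SPA-NET API) that extracts the most likely jet assignment and its pseudo-marginal confidence:

```python
import numpy as np

def best_assignment(P_a, P_d):
    """Combine the conditional assignment distribution P_a (an array over
    jet indices, one axis per daughter) with the scalar detection
    probability P_d into the pseudo-marginal P_m = P_a * P_d, returning
    the most likely jet-index tuple and its confidence."""
    P_m = P_a * P_d
    idx = np.unravel_index(np.argmax(P_m), P_m.shape)
    return idx, float(P_m[idx])
```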

We aim to estimate this reconstruction probability, P d ( p ), with an additional output head of SPA-NET. We will refer to this output as the detection output, because it is trained to detect whether or not a particle is reconstructable in the event. We train this detection output in a similar manner to the classification outputs, but at the particle level instead of the event level. That is, we extract a summary particle vector from each of the particle transformer encoders using the same method as the event summary vector from the central transformer. We then feed these particle vectors into a feed-forward binary classification network to produce a Bernoulli probability for each particle. We must also take into account the potential event-level symmetries, in a similar manner to the assignment reconstruction loss from Equation 6 of our previous work 16 . We train this detection output with a cross-entropy loss over the symmetric particle masks:

The complete loss equation for the entire network can now be defined:

Baseline methods

We compare SPA-NET to two commonly used methods, the Kinematic Likelihood Fitter (KLFitter) 2 , and a Permutation Deep Neural Network (PDNN), which uses a fully connected deep neural network similar to existing literature 20 . Both methods are permutation-based, meaning they sequentially evaluate every possible permutation of particle assignments. This results in a combinatorial explosion, with for example 5!/2 = 60 possible assignments of the jets in a semi-leptonically decaying \(t\bar{t}\) + jet event (the reduction by a factor of two comes from the assignment symmetry between the hadronically decaying W boson decay products). That is, there are 60 different possible permutations that must be evaluated per event, even before considering systematic uncertainty evaluation or further additional jets. With typical analyses utilizing MC samples containing \({{{{{{{\mathcal{O}}}}}}}}(1{0}^{6}-1{0}^{8})\) events, which must be evaluated for \({{{{{{{\mathcal{O}}}}}}}}(1{0}^{2})\) systematic variations, complex events quickly become intractable or at least extremely computationally expensive, even before considering the decreasing performance of such methods as a function of object multiplicity. The performance of these algorithms is compared to SPA-NET in all presented results.
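The permutation count quoted above is easy to reproduce; a short sketch with the parton multiplicity and symmetry factor exposed as parameters:

```python
from math import factorial

def n_permutations(n_jets, n_partons=4, symmetry_factor=2):
    """Number of distinct jet-to-parton assignments a permutation-based
    method must evaluate: ordered selections of n_partons jets out of
    n_jets, divided by the symmetry factor for interchangeable daughters
    (2 for the hadronic W in semi-leptonic ttbar)."""
    return factorial(n_jets) // factorial(n_jets - n_partons) // symmetry_factor
```

With 5 jets this gives the 60 assignments quoted above; the count grows rapidly with jet multiplicity (2520 assignments at 10 jets).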

KLFitter has been extensively used in top quark analyses 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , especially for semi-leptonic \(t\bar{t}\) events. The method involves building every possible permutation of the event and constructing a likelihood score for each. The permutation with the maximum likelihood is thus taken as the best reconstruction for that event. The likelihood score, which has been updated since the original publication 2 , is defined as

$$\begin{array}{ll}{{{{{{{\mathcal{L}}}}}}}}=&B({m}_{{q}_{1}{q}_{2}{q}_{3}}\,| \,{m}_{t},{\Gamma }_{t})\cdot B({m}_{{q}_{1}{q}_{2}}\,| \,{m}_{W},{\Gamma }_{W})\cdot B({m}_{{q}_{4}\ell \nu }\,| \,{m}_{t},{\Gamma }_{t})\cdot B({m}_{\ell \nu }\,| \,{m}_{W},{\Gamma }_{W})\\ &\cdot \mathop{\prod }_{i=1}^{4}{W}_{{{{{{{{\rm{jet}}}}}}}}}({E}_{{{{{{{{{\rm{jet}}}}}}}}}_{i}}^{{{{{{{{\rm{meas}}}}}}}}}\,| \,{E}_{{{{{{{{{\rm{jet}}}}}}}}}_{i}})\cdot {W}_{\ell }({E}_{\ell }^{{{{{{{{\rm{meas}}}}}}}}}\,| \,{E}_{\ell })\end{array}$$

where B represents Breit-Wigner functions; \({m}_{{q}_{1}{q}_{2}{q}_{3}}\) , \({m}_{{q}_{1}{q}_{2}}\) , \({m}_{{q}_{4}\ell \nu }\) , and m ℓ ν are invariant masses computed from the final-state particle momenta; m t ( W ) and Γ t ( W ) are the mass and decay width of the top quark ( W boson), respectively; \({E}_{\ell ,jet}^{(meas)}\) represent the (measured) energies of the lepton or jets; and the functions W v a r ( v a r A ∣ v a r B ) are the transfer functions for the variable v a r A given v a r B .
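For concreteness, the mass terms of such a likelihood can be sketched as below. The non-relativistic (Cauchy) Breit-Wigner form and the width values are illustrative, and the transfer functions W v a r are omitted for brevity:

```python
import math

def breit_wigner(m, m0, gamma):
    """Normalized (Cauchy) Breit-Wigner density with pole mass m0 and
    width gamma; peaks at m = m0 with value 2 / (pi * gamma)."""
    return (gamma / (2 * math.pi)) / ((m - m0) ** 2 + (gamma / 2) ** 2)

def mass_log_likelihood(m_had_top, m_had_w, m_lep_top, m_lep_w,
                        m_top=173.0, gamma_top=1.4, m_w=80.4, gamma_w=2.1):
    """Mass part of a KLFitter-style log-likelihood: one Breit-Wigner per
    reconstructed top quark and W boson candidate (illustrative widths)."""
    return (math.log(breit_wigner(m_had_top, m_top, gamma_top))
            + math.log(breit_wigner(m_had_w, m_w, gamma_w))
            + math.log(breit_wigner(m_lep_top, m_top, gamma_top))
            + math.log(breit_wigner(m_lep_w, m_w, gamma_w)))
```

A permutation whose candidate masses sit near the top and W poles scores higher, which is how the maximum-likelihood permutation is selected.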

This method suffers from several limitations. Firstly, the requirement to construct and test every possible permutation leads to a run-time that grows exponentially with the number of jets or other objects in the event. This quickly becomes a limiting factor in large datasets, which at the LHC often contain millions of events that must be evaluated hundreds of times each (once per systematic uncertainty shift). While semi-leptonic \(t\bar{t}\) can largely remain tractable, it can significantly slow down analyses due to the heavy computing cost, and it is typical to limit the evaluation to only a subset of the reconstructed objects in order to reduce this burden, which restricts the number of events that can be correctly reconstructed. More complex final states, for example \(t\bar{t}H\) production, require even more objects to be reconstructed and thus take even longer to compute, severely limiting the usability of the method in such channels.

A second limitation of the method is its treatment of partial events, which the likelihood is not designed to handle, and thus performance in these events is significantly degraded. Finally, the method does not take into account any correlations between the decay products of the target particles and the rest of the event, since only the particles hypothesized as originating from the targets are included in the likelihood evaluation. An advantage of the method is the use of transfer functions to represent detector effects, but these must be carefully derived for each detector to achieve maximum performance, which can be a difficult and time-consuming endeavor.

There are two variations of the KLFitter likelihood of interest in our studies: one in which the top quark mass is given an assumed value, and one in which it is not. Specifying the assumed mass leads to improved reconstruction efficiency at the expense of biasing towards permutations at this mass, causing sculpting of backgrounds and other undesirable effects. In the analyses presented in \(t\bar{t}H\) and \({Z}^{{\prime} }\) analyses, the top quark mass is fixed to a value of 173 GeV, since this biasing is less important than overall reconstruction efficiency. In contrast, the top quark mass measurement must avoid biasing towards a specific mass value, and thus the mass is not fixed in the likelihood for this analysis.

The PDNN uses a fully connected deep neural network that takes the kinematic and tagging information of the reconstructed objects as inputs, similar to the method described in existing literature 20 . Again, each possible permutation of the event is evaluated, and the assignment with the highest network output score is taken as the best reconstruction. Training is performed as a discrimination task, in which the correct permutations are marked as signal, and all of the other permutations are marked as background.

This method also suffers from several limitations, including the same exponentially growing run-time due to the permutation-based approach, the inability to adequately handle partial events, and the lack of inputs related to additional event activity. Further, the method does not incorporate the symmetries of the reconstruction problem due to the way in which input variables must be associated with the hypothesized targets. Recently, message-passing graph neural networks were applied to the all-hadronic \(t\bar{t}\) final state 17 , but as all studies presented here are performed in the lepton+jets channel, no comparison is made to such methods.

Datasets and training

Several datasets of simulated collisions are generated to test a variety of experimental analyses and effects. All datasets are generated at a center-of-mass energy of \(\sqrt{s}=13\) TeV using MADGRAPH_AMC@NLO 30 (v3.2.0, NCSA license) for the matrix element calculation, PYTHIA8 31 (v8.2, GPL-2) for the parton showering and hadronisation, and DELPHES 32 (v3.4.2, GPL-3) using the default CMS detector card for the simulation of detector effects. For all samples, jets are reconstructed using the anti- k T jet algorithm 33 with a radius parameter of R  = 0.5, a minimum transverse momentum of p T  > 25 GeV, and an absolute pseudo-rapidity of ∣ η ∣  < 2.5. To identify jets originating from b -quarks, a b -tagging algorithm with a p T -dependent efficiency and mis-tagging rate is applied. Electrons and muons are selected with the same p T and η requirements as for jets. No requirement is placed on the missing transverse momentum \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) .

A large sample of simulated Standard Model (SM) \(t\bar{t}\) production is generated with the top quark mass m t  = 173 GeV, and used for initial studies as well as the background model in the \({Z}^{{\prime} }\) studies. It contains approximately 11M events after a basic event selection of exactly one electron or muon and at least four jets of which at least two are b -tagged. We further produce samples for the top mass analysis: ~0.2M events each at mass points of m t  = 170, 171, 172, 173, 174, 175, 176 GeV in order to build templates, as well as a training sample of ~12M total \(t\bar{t}\) events produced in steps of 0.1 GeV to achieve an approximately flat m t distribution in the 166-176 GeV range. This sample is used for all \(t\bar{t}\) reconstruction studies as well as the top mass analysis. A final sample with m t  = 171.9 GeV was produced to be used as pseudo-data for the top mass analysis. The value used was initially known by only one member of the team to avoid bias in the final mass extraction.

A sample of simulated SM \(t\bar{t}H\) production, in which the Higgs boson decays to a pair of b -quarks, is generated to model the signal process for the \(t\bar{t}H\) analysis. This sample has the same event selection as applied to the \(t\bar{t}\) samples, with an additional requirement of at least six jets due to the additional presence of the Higgs boson. Training of SPA-NET is performed using 10M \(t\bar{t}H\) events with at least two b -tagged jets, while the final measurement is performed using a distinct sample where 0.2M of 1.1M events satisfy the more stringent requirement of containing at least four b -tagged jets. Training with the two-tag requirement achieved better overall performance than on the tighter four-tag selection, which follows the most recent ATLAS analyses in this channel 34 . The background in this analysis is dominated by \(t\bar{t}+b\bar{b}\) production, which is modeled using a simulated sample in which the top and bottom pairs are explicitly included in the hard process generated by MADGRAPH_AMC@NLO; of the 1.3M events generated, 0.2M survive the event selection.

Finally, we produce Beyond the Standard Model (BSM) events containing a hypothetical \({Z}^{{\prime} }\) boson that decays to a pair of top quarks, using the vPrimeNLO model 35 in MADGRAPH_AMC@NLO. One sample of 0.2M events is produced at each of \({m}_{{Z}^{{\prime} }}=500,700,900\)  GeV to evaluate search sensitivity at a range of masses. A sample with an approximately flat \({m}_{{Z}^{{\prime} }}\) distribution is generated for network training by generating events in 1 GeV steps between 400 and 1000 GeV. We match jets to the original decay products of the top quarks and Higgs bosons using an exclusive \(\Delta R=\sqrt{{({\phi }_{j}-{\phi }_{d})}^{2}+{({\eta }_{j}-{\eta }_{d})}^{2}} < 0.4\) requirement, such that only one decay product can be matched to each jet and vice versa, always taking the closest match. This method is adopted both in ATLAS and CMS analyses and allows a crisp definition of the correct assignments as well as categorization of events based upon which particles are reconstructable, as explained in the Particle Detector subsection.
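The exclusive Δ R matching can be sketched as a greedy closest-pair loop over (jet, decay product) distances; this illustrates the scheme described above, not the exact ATLAS/CMS implementation:

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def exclusive_match(jets, partons, max_dr=0.4):
    """Greedy exclusive matching: repeatedly take the globally closest
    (jet, parton) pair with dR < max_dr; each jet and each parton is
    matched at most once. jets and partons are (eta, phi) tuples; returns
    a {parton index: jet index} dict."""
    pairs = sorted(
        (delta_r(j[0], j[1], p[0], p[1]), ji, pi)
        for ji, j in enumerate(jets)
        for pi, p in enumerate(partons)
    )
    used_jets, matches = set(), {}
    for dr, ji, pi in pairs:
        if dr < max_dr and ji not in used_jets and pi not in matches:
            used_jets.add(ji)
            matches[pi] = ji
    return matches
```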

We train all models on a single machine with an AMD EPYC 7502 CPU and four NVidia 3090 GPUs. Each model was trained for a period of 24 hours on this machine, as we have found that to be sufficient time for models to converge in training and validation loss. We use the same hyperparameters derived in our previous work 16 , as each event topology presented here may be interpreted as a variation of the topologies studied there.

The data generated for this study is available in our online repository ( ). The code used for training is available on GitHub ( ).

Results and discussion

Reconstruction and regression performance

We present the reconstruction efficiency for SPA-NET in semi-leptonic \(t\bar{t}\) and \(t\bar{t}H(H\to b\bar{b})\) events, compared to the performance of the benchmark methods KLFitter and PDNN. Efficiencies are presented relative to all events in the generated sample, as well as relative to the subset of events in which all top quark (and Higgs boson in the case of \(t\bar{t}H\) ) daughters are truth-matched to reconstructed jets, which we call Full Events. We also show efficiencies for each type of particle, with t H the hadronically decaying top quark, t L the leptonically decaying top quark, and H the Higgs boson. We present the efficiencies in three bins of jet multiplicity as well as inclusively.

In Table  1 , the efficiencies for accurate reconstruction of semi-leptonic \(t\bar{t}\) events are shown. We find that SPA-NET outperforms both benchmark methods in all categories. The performance of KLFitter is substantially lower than the other two methods everywhere, reaching only 12% for full-event efficiency in full events with ≥6 jets. The PDNN performance is close to SPA-NET in low jet multiplicity events, but the gap grows as the number of jets in the event increases. This is expected due to the encoded symmetries in SPA-NET that allow it to more efficiently learn the high multiplicity, more complex events, as well as the additional permutations that must be considered by the PDNN. SPA-NET is further suited to higher-multiplicity events due to not suffering from the large run-time scaling of the permutation based approaches. Results for \(t\bar{t}H(H\to b\bar{b})\) events, also presented in Table  1 , show similar trends.

Regression performance

In semi-leptonic \(t\bar{t}\) decays, there is a missing degree of freedom due to the undetected neutrino. The transverse component and ϕ angle of the neutrino can be well-estimated from the missing transverse momentum in the event, but the longitudinal component (or equivalently, the neutrino η ) cannot be similarly estimated at hadron colliders due to the unknown total initial momentum along the beam. A typical approach is to assume that the invariant mass of the combined lepton and neutrino four-vectors should be that of the W boson, m W  = 80.37 GeV. This assumption leads to a quadratic formula that can lead to an ambiguity if the equation has either zero or two real solutions, and assumes on-shell W bosons and perfect lepton and \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) reconstruction. When the equation has two real solutions, the one with the lower absolute value is adopted. If the solutions are complex, we take the real component.
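A minimal sketch of this baseline, treating the lepton and neutrino as massless (a standard construction; the variable names are ours):

```python
import math

def neutrino_pz(lep_px, lep_py, lep_pz, met_x, met_y, m_w=80.37):
    """Longitudinal neutrino momentum from the constraint
    m(lepton, neutrino) = m_W. Returns the real solution with the smaller
    |pz|; if the quadratic's solutions are complex, returns the real part."""
    lep_e = math.sqrt(lep_px ** 2 + lep_py ** 2 + lep_pz ** 2)
    pt2_lep = lep_px ** 2 + lep_py ** 2
    a = m_w ** 2 / 2 + lep_px * met_x + lep_py * met_y
    disc = a ** 2 - pt2_lep * (met_x ** 2 + met_y ** 2)
    if disc < 0:                       # complex solutions: keep real part
        return a * lep_pz / pt2_lep
    root = lep_e * math.sqrt(disc)
    sols = ((a * lep_pz + root) / pt2_lep, (a * lep_pz - root) / pt2_lep)
    return min(sols, key=abs)
```

When the discriminant is positive, plugging the returned p z back into the lepton-neutrino invariant mass reproduces m W exactly, which is what produces the sharp peak described for Fig. 2.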

SPA-NET has been extended to provide additional regression outputs, which can be used to directly estimate such missing components. In Fig.  2 a, b, distributions of truth versus predicted neutrino η show that the SPA-NET regression is more diagonal than the traditional W -mass-constraint method. Figure  2 c, d shows the distributions and residuals of neutrino η , making it clear that SPA-NET regression has improved resolution of this quantity. However, Fig.  2 e, f show that neither method is able to accurately reconstruct the W -mass distribution. This distribution is not regressed directly, but is calculated by combining the \({E}_{{{{{{{{\rm{T}}}}}}}}}^{{{{{{{{\rm{miss}}}}}}}}}\) and lepton information with the predicted value of η . The mass constraint method produces a large peak exactly at the W -mass as expected, with a large tail at high mass coming from events in which the quadratic solutions are complex. In contrast, the SPA-NET regression, which has no information on the expected value of the W -mass, has a similar shape above m W , and a broad shoulder at lower values. It may thus be useful to refine the regression step to incorporate physics constraints, such as the W boson mass, to help the network learn important, complex quantities such as this. Incorporating more advanced regression techniques, such as this or combining with alternative methods such as ν -Flows 36 , 37 , is left to future work.

Fig. 2: a , b show the true value on the x -axis versus predicted values from the SPA-NET regression and W -mass constraint respectively on the y -axis, with the one-dimensional distributions shown outside the axes. c compares the neutrino η from SPA-NET regression (blue dotted), W -mass constraint (red dashed), and the true distribution (black solid), with ( d ) showing the residuals between truth and SPA-NET regression (blue dotted) or W -mass constraint (red dashed). e , f show the same distributions, this time for the reconstructed leptonic W boson mass.

Particle presence outputs

The additional SPA-NET outputs, described in the Particle Detector subsection and shown in Fig.  3 , can be very useful in analysis. The KLFitter, PDNN, and SPA-NET event-level likelihoods are shown in Fig.  3 a–c. We note that the permutation methods only provide event-level scores for the entire assignment, and that the scores are highly overlapping with little separation between correctly and incorrectly reconstructed events. Figure  3 d–f shows the SPA-NET per-particle marginal (pseudo)-probabilities, which are summed to calculate the event-level likelihood. The distributions of the assignment probability, separated by whether SPA-NET has predicted the event correctly or incorrectly, are shown in Fig.  3 g–i, and Fig.  3 j–l shows the distribution of the detection probability split by whether the particle is reconstructable or not. All of the SPA-NET scores show clear separation between these categories, and this separation can be used in a variety of ways, such as to remove incomplete or incorrectly matched events via direct cuts, separate different types of events into different regions, or provide separation power as inputs to an additional multivariate analysis. The top quark mass and \({Z}^{{\prime} }\) analyses both cut on these scores in order to remove incorrect/non-reconstructable events and improve signal-to-background ratio (S/B). In the \(t\bar{t}H\) analysis, these are used as inputs to a Boosted Decision Tree (BDT) to classify signal and background, and are found to provide a large performance gain.

Fig. 3: The KLFitter likelihood is shown in ( a ), the Permutation Deep Neural Network (PDNN) log-likelihood in ( b ), and the SPA-NET event-level log-likelihood in ( c ), split by correctly reconstructed events (blue), incorrect events (orange), and non-reconstructable events (green). Further, the SPA-NET marginal probabilities for leptonic top, hadronic top, and Higgs are shown in ( d – f ), respectively, grouped in the same way. ( g – i ) show the SPA-NET assignment probabilities, grouped by correct (blue) and incorrect (orange) events. Finally, the SPA-NET detection probabilities, split by reconstructable (blue) and non-reconstructable (orange), are shown in ( j – l ).

Computational overhead

Performance tests are performed on an AMD EPYC 7502 CPU with 128 threads and an NVidia RTX 3090 GPU. Including all pre-initialization steps, we evaluate the average run time for the three methods—KLFitter, PDNN, and SPA-NET—for both \(t\bar{t}\) and \(t\bar{t}H\) events. We find that KLFitter averages 24 (2) events per second on \(t\bar{t}\) ( \(t\bar{t}H\) ). The PDNN averages 2626 (51) events per second when run on a CPU, and 3034 (101) events per second on a GPU, with the speed up from GPU hardware minimal due to the fact that permutation building dominates the computation time. In contrast, SPA-NET averages 705 (852) events per second on a CPU, and 4407 (3534) events per second on a GPU, showing reduced scaling with the more complex \(t\bar{t}H\) events as expected. We therefore conclude that inference of SPA-NET should not be a bottleneck to analyses, as is often the case for methods like KLFitter. These numbers are summarized in table form in Supplementary Table  1 .

Ablation studies

In this section, we present several studies designed to reveal what the networks have learned. We find that training is, in general, very robust, showing little dependence on details of inputs or hyperparameters. For example, training performance is unchanged within statistical uncertainties when representing particles using { M , p T , η , ϕ } or { p x , p y , p z , E } 4-vector representations. Reconstruction performance varies by less than 1% if the training sample with a single top mass value is replaced by that with a flat mass spectrum.

In addition, we find that the performance of the network in testing depends on the kinematic range of the training samples in a sensible way. For example, the performance of the network on independent testing events varies with the top quark pair invariant mass, reflecting the mass distribution of the training sample. Figure  4 shows the testing performance versus top quark pair mass for networks trained on the full range of masses, or only events with invariant mass less than 600 GeV. The performance at higher mass is degraded when high-mass samples are not included in the training, as the nature of the task depends on the mass, which impacts the momentum and collimation of the decay products. Furthermore, the network performance is independent of the process (SM \(t\bar{t}\) or BSM \({Z}^{{\prime} }\) ) used to generate the training sample. The performance is reliable in the full range in which training data is present. It is noteworthy that the SM training still achieves similar performance up to ~1 TeV as the network trained on \({Z}^{{\prime} }\) events, despite having fewer events at this value, indicating that the training distribution need not be completely flat so long as some examples are present in the full range.

Figure 4

Shown is the performance for three networks with distinct training samples: \({Z}^{{\prime} }\to t\bar{t}\) events with the full range of invariant masses (blue), \({Z}^{{\prime} }\to t\bar{t}\) events with masses <600 GeV (orange), and SM \(t\bar{t}\) with the full range of invariant masses (green).

To evaluate whether the network is learning the natural symmetries of the data, we perform two further tests. The first investigates the azimuthal symmetry of the events, which we evaluate by applying the network to events that are randomly rotated in ϕ and/or mirrored across the beam axis; neither transformation should have any impact on the nature of the reconstruction task. We find that in 41% of test events the difference in the marginal probabilities is <1%, and 84% of all events have a difference of less than 5%. This implies that the network approximately learns the inherent rotational and reflection symmetries of the task, without this being explicitly encoded into the network architecture. The full residual distributions are shown in Supplementary Fig.  1 .
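The rotation part of this test rests on the fact that rotating every 4-vector about the beam axis leaves the transverse momenta (and all pairwise angular separations) unchanged, so a symmetry-respecting network should give near-identical outputs for the rotated event. A toy sketch of the transformation (not the paper's evaluation code):

```python
import math

def rotate_phi(p, dphi):
    """Rotate a 4-vector (px, py, pz, E) by dphi about the beam (z) axis."""
    px, py, pz, e = p
    c, s = math.cos(dphi), math.sin(dphi)
    return (c * px - s * py, s * px + c * py, pz, e)

def pt(p):
    """Transverse momentum, which is invariant under azimuthal rotations."""
    return math.hypot(p[0], p[1])

# Invented jet 4-vector: rotating it changes phi but not the physics content.
jet = (40.0, 30.0, 12.0, 52.0)
rotated = rotate_phi(jet, 1.3)
```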

The impact of adding rotation invariance to the network has been evaluated by employing an explicitly invariant attention architecture that uses a matrix of relative Lorentz-covariant quantities between each pair of particles, similar to existing literature 18 , 38 . We focus specifically on the symmetry induced by rotations about the beam axis. We follow the covariant transformer architecture 18 , treating the ϕ and η angles as covariant and computing the difference between these angles for all pairs of jets in the event. The remaining features are treated as invariant and processed normally by the attention. Figure  5 a shows that employing the invariant attention mechanism improves performance for small datasets, but does not lead to higher overall performance. This observation is consistent with the findings of existing literature 18 , 38 . The explicit invariance does bring a visible improvement in training speed, as seen in Fig.  5 b. After fully training both networks on various training data sizes, we examine the training log and determine how many batches (gradient updates) were necessary to achieve maximal validation accuracy. We see that the invariant attention significantly reduces the number of updates needed to train the network. The trade-off is that the network becomes larger and more memory-intensive, as the inputs must now be represented as pairwise matrices of features instead of simple vectors. Since the overall performance in the end is the same, and since we observe that a regular network already learns to approximate this invariance, we proceed with the traditional attention architecture; the invariant network is not used for any further studies presented here.
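The pairwise covariant quantities fed to such an invariant attention can be illustrated as follows, under the simplifying assumption that only Δη and Δϕ are treated as covariant, with Δϕ wrapped into [-π, π) so the matrix is unchanged by any global azimuthal rotation:

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def pairwise_features(etas, phis):
    """Matrix of (delta_eta, delta_phi) over all jet pairs. These relative
    quantities are invariant under a global rotation about the beam axis,
    unlike the raw phi values themselves."""
    n = len(etas)
    return [[(etas[i] - etas[j], delta_phi(phis[i], phis[j]))
             for j in range(n)] for i in range(n)]

# Two toy jets: shifting both phi values by the same amount leaves the
# feature matrix unchanged up to floating-point rounding.
m_original = pairwise_features([0.1, -1.2], [0.3, 2.9])
m_rotated = pairwise_features([0.1, -1.2], [0.3 + 0.5, 2.9 + 0.5])
```

This doubling of the input representation, from per-jet vectors to pairwise matrices, is the memory cost mentioned above.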

Figure 5

Shown are ( a ) reconstruction purity and ( b ) training speed, with the regular transformer shown in dashed orange and the explicitly invariant transformer 18 in solid blue. The uncertainty bars in ( a ) show the variation in reconstruction purity across 16 separate trainings at each dataset size.

Search for \(t\bar{t}H(H\to b\bar{b})\)

While the previous sections have detailed the per-event performance of SPA-NET, in the following sections we demonstrate its expected impact on flagship LHC physics measurements and searches.

The central challenge of measuring the cross-section for \(t\bar{t}H\) production, in which the Higgs boson follows its dominant decay mode to a pair of b -quarks, is separating the \(t\bar{t}H\) signal from the overwhelming \(t\bar{t}\) + \(b\bar{b}\) background. Typically, machine learning algorithms such as deep neural networks or boosted decision trees are trained to distinguish signal and background using high-level event features 34 , 39 . Since the key kinematic difference between the signal and background is the presence of a Higgs boson, the performance of this separation is greatly dependent on the quality of the event reconstruction, where improvements by SPA-NET can make a significant impact on the final result.

Reconstruction and background rejection

Event reconstruction is performed with SPA-NET, KLFitter, and a PDNN. The reconstruction efficiency for each of these methods is shown in Table  1 , where it is already clear that SPA-NET outperforms both of the baseline methods.

The reconstructed quantities and likelihood or network scores are then used to train a classifier to distinguish between signal and background. The full input list is shown in Supplementary Table  2 , with most variable definitions taken from the latest ATLAS result 34 . A BDT is trained for each reconstruction algorithm with the same input definitions and hyperparameters using the XGBoost package 40 . Tests using a BDT trained on lower-level information, i.e., the four-vectors of the predicted lepton and jet assignments, found significantly weaker performance than these high-level BDTs. We also compare the performance of the BDTs to two different SPA-NET outputs that are trained to separate signal and background. The first, which we call SPA-NET Pretraining, is an additional output head of the primary SPA-NET network with the objective of separating signal and background events. The second, which we call SPA-NET Fine-tuning, uses the same embeddings and central transformer as the former method, but its signal-versus-background classification head is trained in a separate second step after the initial training is complete. In this way, the network first learns the optimal embedding of signal events, and this embedding is then used as input to a dedicated signal-versus-background network. We have implemented in the SPA-NET package an option to directly output the embeddings from the network so that they can be used in this or other ways by the end user.

The receiver operating characteristic (ROC) curves for the various classification networks are shown in Fig.  6 . The best separation performance comes from the fine-tuned SPA-NET model, as expected. The BDT with kinematic variables reconstructed with the SPA-NET jet-parton assignment (SPA-NET+BDT setup) is next, followed by the purely pre-trained model. All of these substantially outperform both the KLFitter+BDT and PDNN+BDT baselines.
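A curve of signal efficiency versus background rejection is built by scanning a threshold over the classifier scores; a minimal sketch with invented toy scores:

```python
def roc_points(signal_scores, background_scores, thresholds):
    """Signal efficiency and background rejection (1 / background
    efficiency) at each score threshold."""
    points = []
    for t in thresholds:
        eff_s = sum(s > t for s in signal_scores) / len(signal_scores)
        eff_b = sum(b > t for b in background_scores) / len(background_scores)
        rejection = 1.0 / eff_b if eff_b > 0 else float("inf")
        points.append((eff_s, rejection))
    return points

# Toy classifier outputs (assumed, for illustration only).
sig = [0.9, 0.8, 0.7, 0.2]
bkg = [0.05, 0.1, 0.3, 0.6]
curve = roc_points(sig, bkg, [0.5])
```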

Figure 6

Shown is signal efficiency versus background rejection for several SPA-NET based setups—SPA-NET fine-tuning (solid blue), SPA-NET+Boosted Decision Tree (BDT) (dash-dot pink), and SPA-NET pretraining (dash-dot green)—as well as BDTs based on the outputs of traditional reconstruction techniques: the Permutation Deep Neural Network (PDNN) (dotted red) and KLFitter (dot-dash yellow).

Impact on sensitivity

To estimate the impact of the significantly improved signal-background separation from SPA-NET reconstruction, we perform an Asimov fit to the network output distributions with the pyhf package 41 , 42 . The signal is normalized to the SM cross-section of 0.507 pb 43 and corrected for the branching fraction and selection efficiency of our sample. The dominant \(t\bar{t}+b\bar{b}\) background is normalized similarly, using the cross-section of 0.666 pb calculated by MADGRAPH_AMC@NLO. We further multiply the background cross-section by a factor of 1.5, in line with measurements from ATLAS 34 and CMS 39 that found this background to be larger than the SM prediction, rounded up to also account for the LO→NLO cross-section enhancement. We neglect the sub-leading backgrounds. The distributions are binned according to the AutoBin feature 44 preferred by ATLAS, to ensure that no bias is introduced between the different methods by the choice of binning. Results normalized to 140 fb −1 , the luminosity of Run 2 of the LHC, using 5 bins and assuming an overall systematic uncertainty of 10%, are presented in Table  2 . The numbers in parentheses in Table  2 are the results of an LHC Run 3 analysis normalized to 300 fb −1 of data, using 8 bins and an overall systematic uncertainty assumption of 7%. Although the Run 3 center-of-mass energy of the LHC is \(\sqrt{s}=13.6\) TeV, all results presented assume \(\sqrt{s}=13\) TeV for simplicity.
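For intuition, the median discovery significance of an Asimov dataset can be approximated per bin, in the absence of systematic uncertainties, by the standard formula Z = sqrt(2((s+b)ln(1+s/b) - s)), with bins combined in quadrature. This is a simplified stand-in for the pyhf profile-likelihood fit, not the procedure used for Table 2, and the binned yields below are invented:

```python
import math

def asimov_significance(signal, background):
    """Per-bin median discovery significance combined in quadrature:
    Z_i = sqrt(2 * ((s + b) * ln(1 + s/b) - s)), ignoring systematics."""
    z_squared = 0.0
    for s, b in zip(signal, background):
        if b > 0:
            z_squared += 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    return math.sqrt(z_squared)

# Toy binned signal and background yields (not the analysis templates).
z = asimov_significance([4.0, 8.0, 10.0], [100.0, 60.0, 25.0])
```

In the s ≪ b limit this reduces to the familiar s/sqrt(b) estimate, while systematic uncertainties (the 10% and 7% assumptions above) further dilute the significance in the full fit.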

In both scenarios, the sensitivity tracks the signal-background separation performance shown in Fig.  6 , with SPA-NET fine-tuning achieving the greatest statistical power. Neither of the benchmark methods reaches the 3 σ statistical significance threshold in the Run 2 analysis, while both SPA-NET+BDT and fine-tuning reach this mark. Similarly, in the Run 3 scenario these two methods reach the crucial 5 σ threshold normally associated with discovery, with the benchmark methods at only roughly 4 σ .

SPA-NET thus provides a significant expected improvement over the benchmark methods. While the full LHC analysis will require a more complete treatment, including significant systematic uncertainties due to the choice of event generators, previous studies have demonstrated minimal dependence on such systematic uncertainties 16 .

Top mass measurement

The top quark mass m t is a fundamental parameter of the Standard Model that can only be determined via experimental measurement. These measurements are critical inputs to global electroweak fits 45 , and m t even has implications for the stability of the Higgs vacuum potential, which has cosmological consequences 46 , 47 . Precision measurements of the top quark mass are thus one of the most important pieces of the experimental program of the LHC, with the most recent results reaching sub-GeV precision 48 , 49 , 50 . We demonstrate in this section the improvement enabled by the use of SPA-NET in a template-based top mass extraction.

We perform a two-dimensional fit to the invariant mass distributions of the hadronic top quark and W boson as reconstructed by each method, using the basic preselection described in the Datasets and Training subsection. We further truncate the mass distributions to 120 ≤ m t ≤ 230 GeV and 40 ≤ m W  ≤ 120 GeV. The fraction of events with correct or incorrect predictions for the top quark jets has a strong impact on the resolution with which the mass can be extracted. Better reconstruction should thus improve the overall sensitivity to the top quark mass.

Incorporation of the W -mass information in the 2D fit allows for a simultaneous constraint on the jet energy scale uncertainty, often a leading contribution to the total uncertainty, by also fitting a global jet scale factor (JSF) to be applied to the p T of each jet. Further, events that do not contain a fully reconstructable top quark are removed by cutting on the various scores from each method. KLFitter events are required to have a log-likelihood score >−70, PDNN events must have a network score of >0.12, and SPA-NET events must have a marginal probability of >0.23, optimized in each case to minimize the uncertainty on the extracted top mass. We additionally compare each method to an idealized perfect reconstruction method, in which all unmatched events are removed, and the truth-matched reconstruction is used for all events. The perfect-matched method provides an indication of the hypothetical limit of improvement achievable through better event reconstruction. In all cases, we neglect background from other processes, since these backgrounds tend to be on the order of a few percent 25 , and would be further suppressed by the network score cuts.

The top quark mass and JSF are extracted using a template fit based on Monte Carlo samples with top quark masses in 1 GeV intervals between 170 and 176 GeV. Templates are constructed for varying mass and JSF hypotheses for both the top and W boson mass distributions. These templates are built separately for each of the correct, incorrect, and unmatched event categories as the sum of a Gaussian and a Landau distribution, with five free parameters: the mean μ and the width σ of each, as well as the relative fraction f . We find an approximately linear dependence of the template parameters on the top quark mass and JSF, allowing for linear interpolation between the mass points. Finally, we validate the mass extracted by the template fit in ensembles of pseudo-experiments and find a small bias, for which we derive a correction.
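The linear interpolation between simulated mass points can be sketched as follows (illustrative; the grid spacing matches the 1 GeV intervals above, but the parameter values are invented):

```python
def interpolate_param(mass_points, values, m_hypothesis):
    """Linearly interpolate one template parameter (e.g. a Gaussian mean)
    between the simulated top-mass grid points."""
    pts = sorted(zip(mass_points, values))
    for (m0, v0), (m1, v1) in zip(pts, pts[1:]):
        if m0 <= m_hypothesis <= m1:
            frac = (m_hypothesis - m0) / (m1 - m0)
            return v0 + frac * (v1 - v0)
    raise ValueError("hypothesis outside the simulated mass range")

# Invented template means at three of the 1 GeV grid points.
mu = interpolate_param([170.0, 171.0, 172.0], [170.5, 171.4, 172.3], 170.5)
```

The same interpolation is applied independently to each of the five template parameters, giving a smooth template at any hypothesis mass inside the grid.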

The impact of various reconstruction techniques can be best measured by the resulting uncertainty on the top quark mass and JSF. Figure  7 shows the expected uncertainty ellipses for a dataset with luminosity of 140 fb −1 and assuming a JSF variation of ±4%. The final uncertainty on the top mass is 0.193 GeV for KLFitter, 0.176 GeV for PDNN, and 0.165 GeV for SPA-NET. This indicates a 15% improvement in top quark mass uncertainty when using SPA-NET compared to the benchmark methods. The idealized reconstruction technique achieves an uncertainty of 0.109 GeV, demonstrating how much room for improvement remains. The dominant contribution to the gap between the perfect and SPA-NET reconstruction comes from the perfect removal of all unmatched events.

Figure 7

Shown are results for the KLFitter (blue), permutation deep neural network (PDNN) (yellow), SPA-NET (green), and an idealized perfect reconstruction (red). Also shown are 1 σ (solid) and 3 σ (dashed) uncertainty ellipses.

Search for \({Z}^{{\prime} }\to t\bar{t}\)

Many BSM theories hypothesize additional heavy particles which may decay to \(t\bar{t}\) pairs, such as heavy Higgs bosons or new gauge bosons ( \({Z}^{{\prime} }\) ). We investigate a generic search for such a \({Z}^{{\prime} }\) particle, for which accurate reconstruction of the \(t\bar{t}\) mass peak over the SM background plays a crucial role. We compare the performance of the benchmark reconstruction methods to that of various SPA-NET configurations by assessing the ability to discover a \({Z}^{{\prime} }\) signal.

An important aspect is the selection of training data, due to the unknown mass of the \({Z}^{{\prime} }\) , which strongly affects the kinematics of the \(t\bar{t}\) system. To avoid introducing bias into the network, the training sample is devised to be approximately flat in \({m}_{t\bar{t}}\) . The network training was otherwise identical to that described for the SM \(t\bar{t}\) network, and performance on SM \(t\bar{t}\) events was approximately the same in the mass range covered by both samples.

The basic \(t\bar{t}\) selection described in the Dataset and Training subsection is applied, and all events are reconstructed as described earlier in order to calculate the \(t\bar{t}\) invariant mass, \({m}_{t\bar{t}}\) . The mass resolution of a hypothetical resonance can often be improved by removing poorly- or partially-reconstructed events. In the context of the algorithms under comparison, this corresponds to a requirement on the KLFitter likelihood or network output scores. The threshold is chosen to optimize the analysis with each algorithm, leading to a significant reduction of the SM \(t\bar{t}\) background when using the PDNN and SPA-NET. For SPA-NET we require a marginal probability of >0.078, and for PDNN we require a score of >0.43. For KLFitter, no cut is applied, as no improvement was found. More details on these cuts and the effect on the background distributions are shown in Supplementary Figs.  2 and 3 in Supplementary Note  1 .

We use the pyhf 41 , 42 package to extract the \({Z}^{{\prime} }\) signal and assess statistical sensitivity.

The expected results for a Run 2 analysis, normalized to 140 fb −1 with 20 GeV bins and a systematic uncertainty of 10%, are shown in Table  3 . The discovery significance is improved by SPA-NET compared to the benchmark methods for all masses considered. For example, for a \({Z}^{{\prime} }\) of mass 700 GeV, the expected significance improves from 1.6 σ using KLFitter to 3.1 σ using SPA-NET.

The expected sensitivity for a Run 3 dataset with an integrated luminosity of 300 fb −1 , computed with an optimistic systematic uncertainty of 5%, is also shown in Table  3 . For all three benchmark signals, the discovery significance exceeds 5 σ using SPA-NET, while among the baseline methods only the high-mass point for the PDNN reaches this threshold. At a \({Z}^{{\prime} }\) mass of 500 GeV, KLFitter does not reach the 3 σ evidence threshold, while SPA-NET is able to make a discovery. It is noteworthy that the neutrino regression does not lead to an improvement in the final sensitivity, despite showing improved resolution compared to the baseline mass-constraint method. This is because the background shape similarly improves in this case.

Improved reconstruction with SPA-NET can therefore greatly boost particle discovery potential. This finding should extend to other hypothetical resonances such as heavy Higgs bosons, \({W}^{{\prime} }\) bosons, or SUSY particles as well as non- \(t\bar{t}\) final states such as di-Higgs, di-boson, t b or any other in which reconstruction is crucial and challenging.

Conclusions
This paper describes significant extensions and improvements to SPA-NET, a complete package for event reconstruction and classification for high-energy physics experiments. We have demonstrated the application of our method to three flagship LHC physics measurements or searches, covering the full breadth of the LHC program: a precision measurement of a crucial SM parameter, a search for a rare SM process, and a search for a hypothetical new particle. In each case, the use of SPA-NET provides large improvements over benchmark methods. We have further presented studies exploring what the networks learn, demonstrating the ability to learn the inherent symmetries of the data and strong robustness to training conditions. SPA-NET is the most efficient, high-performing method for multi-object event reconstruction to date and holds great promise for helping unlock the power of the LHC dataset.

Data availability

Our data is available in an online repository.

Code availability

Our code is available on github ( ).

Snyder, S. S. Measurement of the top quark mass at D0. Ph.D. thesis, SUNY, Stony Brook (1995).

Erdmann, J. et al. A likelihood-based reconstruction algorithm for top-quark pairs and the KLFitter framework. Nucl. Instrum. Meth. A 748 , 18–25 (2014).


Vaswani, A. et al. Attention is all you need. In: Advances in neural information processing systems, vol. 30 (2017).

Qu, H. & Gouskos, L. Jet tagging via particle clouds. Phys. Rev. D 101 , 056019 (2020).

Moreno, E. A. et al. JEDI-net: a jet identification algorithm based on interaction networks. Eur. Phys. J. C 80 , 58 (2020).

Mikuni, V. & Canelli, F. ABCNet: an attention-based method for particle tagging. Eur. Phys. J. Plus 135 , 463 (2020).


Lu, Y., Romero, A., Fenton, M. J., Whiteson, D. & Baldi, P. Resolving extreme jet substructure. JHEP 08 , 046 (2022).

Ju, X. & Nachman, B. Supervised jet clustering with graph neural networks for lorentz boosted bosons. Phys. Rev. D 102 , 075014 (2020).

Guo, J., Li, J., Li, T. & Zhang, R. Boosted Higgs boson jet reconstruction via a graph neural network. Phys. Rev. D 103 , 116025 (2021).

Dreyer, F. A. & Qu, H. Jet tagging in the Lund plane with graph networks. JHEP 03 , 052 (2021).

Bogatskiy, A., Hoffman, T., Miller, D. W. & Offermann, J. T. PELICAN: Permutation equivariant and lorentz invariant or covariant aggregator network for particle physics (2022).

Gong, S. et al. An efficient Lorentz equivariant graph neural network for jet tagging. JHEP 07 , 030 (2022).


Qu, H., Li, C. & Qian, S. Particle transformer for jet tagging. In: Proceedings of the 39th International Conference on Machine Learning, 18281–18292 (2022).

Bogatskiy, A., Hoffman, T., Miller, D. W., Offermann, J. T. & Liu, X. Explainable equivariant neural networks for particle physics: PELICAN (2023).

Fenton, M. J. et al. Permutationless many-jet event reconstruction with symmetry preserving attention networks. Phys. Rev. D 105 , 112008 (2022).

Shmakov, A. et al. SPANet: Generalized permutationless set assignment for particle physics using symmetry preserving attention. SciPost Phys. 12 , 178 (2022).

Ehrke, L., Raine, J. A., Zoch, K., Guth, M. & Golling, T. Topological reconstruction of particle physics processes using graph neural networks. Phys. Rev. D 107 , 116019 (2023).

Qiu, S., Han, S., Ju, X., Nachman, B. & Wang, H. Holistic approach to predicting top quark kinematic properties with the covariant particle transformer. Phys. Rev. D 107 , 114029 (2023).

Workman, R. L. et al. Review of particle physics. PTEP 2022 , 083C01 (2022).


Erdmann, J., Kallage, T., Kröninger, K. & Nackenhorst, O. From the bottom to the top—reconstruction of \(t\bar{t}\) events with deep learning. JINST 14 , P11015 (2019).

ATLAS Collaboration. Measurements of normalized differential cross sections for \(t\bar{t}\) production in pp collisions at \(\sqrt{s}=7\) TeV using the ATLAS detector. Phys. Rev. D 90 , 072004 (2014).

ATLAS Collaboration. Measurement of the top-quark mass in the fully hadronic decay channel from ATLAS data at \(\sqrt{s}=7{{{{{{{\rm{\,TeV}}}}}}}}\) . Eur. Phys. J. C 75 , 158 (2015).

ATLAS Collaboration. Measurements of spin correlation in top-antitop quark events from proton-proton collisions at \(\sqrt{s}=7\) TeV using the ATLAS detector. Phys. Rev. D 90 , 112016 (2014).

ATLAS Collaboration. Search for the Standard Model Higgs boson produced in association with top quarks and decaying into \(b\bar{b}\) in pp collisions at \(\sqrt{s}=8\) TeV with the ATLAS detector. Eur. Phys. J. C 75 , 349 (2015).

ATLAS Collaboration. Measurements of top-quark pair differential and double-differential cross-sections in the ℓ +jets channel with p p collisions at \(\sqrt{s}=13\) TeV using the ATLAS detector. Eur. Phys. J. C 79 , 1028 (2019). [Erratum: Eur.Phys.J.C 80, 1092 (2020)].

ATLAS Collaboration. Measurement of the charge asymmetry in top-quark pair production in association with a photon with the ATLAS experiment. Phys. Lett. B 843 , 137848 (2023).

CMS Collaboration. Measurement of the top quark forward-backward production asymmetry and the anomalous chromoelectric and chromomagnetic moments in pp collisions at \(\sqrt{s}\) = 13 TeV. JHEP 06 , 146 (2020).

ATLAS & CMS Collaborations. Combination of the W boson polarization measurements in top quark decays using ATLAS and CMS data at \(\sqrt{s}=\) 8 TeV. JHEP 08 , 051 (2020).

ATLAS & CMS Collaborations. Combination of inclusive and differential \(t\bar{t}\) charge asymmetry measurements using ATLAS and CMS data at \(\sqrt{s}=7\) and 8 TeV. JHEP 04 , 033 (2018).

Alwall, J. et al. The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. JHEP 07 , 079 (2014).

Sjöstrand, T. et al. An introduction to PYTHIA 8.2. Comput. Phys. Commun. 191 , 159–177 (2015).

de Favereau, J. et al. DELPHES 3, A modular framework for fast simulation of a generic collider experiment. JHEP 02 , 057 (2014).

Cacciari, M., Salam, G. P. & Soyez, G. The anti- k t jet clustering algorithm. JHEP 04 , 063 (2008).

ATLAS Collaboration. Measurement of Higgs boson decay into b -quarks in associated production with a top-quark pair in p p collisions at \(\sqrt{s}=13\) TeV with the ATLAS detector. JHEP 06 , 097 (2022).

Fuks, B. & Ruiz, R. A comprehensive framework for studying \({W}^{{\prime} }\) and \({Z}^{{\prime} }\) bosons at hadron colliders with automated jet veto resummation. JHEP 05 , 032 (2017).

Leigh, M., Raine, J. A., Zoch, K. & Golling, T. ν -flows: conditional neutrino regression. SciPost Phys. 14 , 159 (2023).

Raine, J. A., Leigh, M., Zoch, K. & Golling, T. Fast and improved neutrino reconstruction in multineutrino final states with conditional normalizing flows. Phys. Rev. D 109 , 012005 (2024).

Li, C. et al. Does Lorentz-symmetric design boost network performance in jet physics? (2022).

CMS Collaboration. Measurement of the \(t\bar{t}H\) and tH production rates in the \(H\to b\bar{b}\) decay channel with 138 fb −1 of proton-proton collision data at \(\sqrt{s}=13\) TeV. Tech. Rep., CERN, Geneva (2023).

Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , KDD ’16, 785–794 (ACM, New York, NY, USA, 2016).

Heinrich, L., Feickert, M. & Stark, G. pyhf: v0.7.3.

Heinrich, L., Feickert, M., Stark, G. & Cranmer, K. pyhf: pure-python implementation of histfactory statistical models. J. Open Source Softw. 6 , 2823 (2021).

de Florian, D. et al. Handbook of LHC Higgs Cross Sections: 4. Deciphering the Nature of the Higgs Sector 2/2017 (2016).

Calvet, T. P. Search for the production of a Higgs boson in association with top quarks and decaying into a b-quark pair and b-jet identification with the ATLAS experiment at LHC. Ph.D. thesis, Aix-Marseille University (2017).

ALEPH, CDF, D0, DELPHI, L3, OPAL, SLD, LEP Electroweak Working Group, Tevatron Electroweak Working Group, SLD Electroweak and Heavy Flavour Groups. Precision Electroweak Measurements and Constraints on the Standard Model. CERN-PH-EP-2010-095 (2010).

Degrassi, G. et al. Higgs mass and vacuum stability in the Standard Model at NNLO. JHEP 08 , 098 (2012).

Andreassen, A., Frost, W. & Schwartz, M. D. Scale Invariant Instantons and the Complete Lifetime of the Standard Model. Phys. Rev. D 97 , 056006 (2018).

CMS Collaboration. Measurement of the top quark mass using a profile likelihood approach with the lepton + jets final states in proton–proton collisions at \(\sqrt{s}=13\) TeV. Eur. Phys. J. C 83 , 963 (2023).

CMS Collaboration. Measurement of the differential \(t\overline{t}\) production cross section as a function of the jet mass and extraction of the top quark mass in hadronic decays of boosted top quarks. Eur. Phys. J. C 83 , 560 (2023).

ATLAS Collaboration. Measurement of the top-quark mass using a leptonic invariant mass in pp collisions at \(\sqrt{s}\) = 13 TeV with the ATLAS detector. JHEP 06 , 019 (2023).


Acknowledgements
We would like to thank Ta-Wei Ho for assistance in generating some of the samples used in this paper. D.W. and M.F. are supported by DOE grant DE-SC0009920. The work of A.S. and P.B. was supported in part by ARO grant 76649-CS to P.B. H.O. and Y.L. are supported by NSFC under contract no. 12075060, and S.-C.H. is supported by NSF under Grant no. 2110963.

Author information

These authors contributed equally: Michael James Fenton, Alexander Shmakov.

Authors and Affiliations

Department of Physics and Astronomy, University of California, Irvine, Irvine, 92607, CA, USA

Michael James Fenton & Daniel Whiteson

Department of Computer Science, University of California, Irvine, Irvine, 92607, CA, USA

Alexander Shmakov & Pierre Baldi

Institute of High Energy Physics, Chinese Academy of Sciences, Shijingshan, 100049, Beijing, China

Hideki Okawa

Institute of Modern Physics, Fudan University, Yangpu, 200433, Shanghai, China

Department of Physics, National Tsing Hua University, Hsingchu City, 30013, Taiwan

Ko-Yang Hsiao

Department of Physics and Astronomy, University of Washington, Seattle, 98195-4550, WA, USA

Shih-Chieh Hsu


Contributions
Michael Fenton: conception, direction, supervision of all students, manuscript preparation. Alexander Shmakov: development, implementation, and training of SPA-NET and PDNN, manuscript preparation. Hideki Okawa: MC production, \({Z}^{{\prime} }\) analysis lead, manuscript preparation, supervision of Y. Li. Yuji Li: \(t\bar{t}H\) analysis lead. Ko-Yang Hsiao: top mass analysis lead. Shih-Chieh Hsu: supervision of K.-Y. Hsiao. Daniel Whiteson: manuscript preparation, supervision of A. Shmakov. Pierre Baldi: manuscript editing, machine learning developments, supervision of A. Shmakov.

Corresponding authors

Correspondence to Michael James Fenton or Alexander Shmakov .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Communications Physics thanks Daniel Murnane and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .


About this article

Cite this article.

Fenton, M.J., Shmakov, A., Okawa, H. et al. Reconstruction of unstable heavy particles using deep symmetry-preserving attention networks. Commun Phys 7 , 139 (2024).


Received : 08 November 2023

Accepted : 11 April 2024

Published : 30 April 2024





Teen facing terror-related charges as part of Wakeley church stabbing investigation was already on conditional bail for gun possession charges

Two police officers in tactical gear stand on a street with their backs to the camera

A 14-year-old boy trying to secure bail over terrorism-related charges appeared before the same court a fortnight earlier on gun possession charges.

He was the youngest of five juveniles taken into custody on Wednesday, as a joint counterterrorism team investigated what they labelled a "network" of people sharing a "similar violent extremist ideology", following a "terror act" at a western Sydney church last week.

The boy, who cannot be named for legal reasons, appeared before the Children's Court in Parramatta on Thursday charged with possessing or controlling extremist material, with the prosecution alleging his mobile phone contained videos of beheadings produced by the Islamic State.

He has spent the last two nights in custody, as he waits for the outcome of a challenge to his release on conditional bail.

The court on Thursday heard the 14-year-old was already on conditional bail for separate offences.

Court documents show he appeared before the same court on April 12 charged with possessing an unauthorised firearm and possessing an unauthorised pistol.

He also faces five more charges related to assault and aggravated robbery.

The matter is ongoing and the boy has not been convicted of any offence.

Bail approved, pending challenge

Police charged the 14-year-old with possessing or controlling extremist material, which has a maximum jail term of five years.

He appeared before the Children's Court in Parramatta to make an application for his release.

The court heard among the several video files on his phone were some which depicted people being run over by vehicles, and a cartoon advocating violence towards homosexual men.

Magistrate Paul Mulroney described the footage as depicting "the worst behaviour of humanity".

"He has material that is clearly violent, extremist material, material that is distressing, that is reprehensible," he said.

The magistrate approved the 14-year-old's application for bail with strict conditions, citing his youth, the lack of evidence suggesting the videos were distributed, and the support of his family.

"14-year-olds lack maturity, they have a considerably reduced capacity to consider the consequences of their behaviour," he said.

"A very reasonable perspective is that the young person has received it, has seen it, and has done nothing about it."

The bail conditions prohibit him from using a smartphone, contacting select individuals, and require him to live at home and see a psychologist.

However, his release has been delayed after the court heard the Acting Commonwealth Director of Public Prosecutions is appealing the decision.



  1. ?: operator

    In a conditional ref expression, the types of the consequent and alternative must be the same; conditional ref expressions aren't target-typed. Conditional operator versus an if statement: using the conditional operator instead of an if statement can result in more concise code when you need to compute a value conditionally. The following ...
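The trade-off described above can be sketched in Python, whose conditional expression plays the same role as C#'s ?: operator (the variable names and strings here are hypothetical):

```python
# Conditional expression vs. if/else statement for computing a value.
x, y = 10, 100

# Statement form: four lines to bind one value.
if x > y:
    result_if = "x is greater than y"
else:
    result_if = "x is not greater than y"

# Expression form: one line, same value.
result_expr = "x is greater than y" if x > y else "x is not greater than y"
```

Both forms bind the same string; the expression form is usually preferable when the branches only select a value rather than perform side effects.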

  2. IDE0045: Use conditional expression for assignment

    Overview: This style rule concerns the use of a ternary conditional expression versus an if-else statement for assignments that require conditional logic. Options specify the behavior that you want the rule to enforce; for information about configuring options, see Option format. The relevant option is dotnet_style_prefer_conditional_expression_over_assignment.
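A minimal .editorconfig fragment enabling this rule could look like the following sketch (the `suggestion` severity is an assumption; use whatever severity your team prefers):

```ini
# Prefer conditional expressions over if-else assignments (IDE0045)
[*.cs]
dotnet_style_prefer_conditional_expression_over_assignment = true:suggestion
```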

  3. c#

    If the assignment fails then dog is null, which prevents the contents of the for loop from running, because it is immediately broken out of. If the assignment succeeds then the for loop runs through the iteration. At the end of the iteration, the dog variable is assigned a value of null, which breaks out of the for loop.
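A close Python analogue of this assign-and-test pattern uses the walrus operator; the `get_dog` helper and its return values below are hypothetical:

```python
# Assign and test in a single expression with the walrus operator.
def get_dog(found=True):
    # Hypothetical lookup: returns a dog record, or None when nothing is found.
    return {"name": "Rex"} if found else None

if (dog := get_dog()) is not None:
    greeting = f"Found {dog['name']}"   # runs only when the assignment produced a value
else:
    greeting = "No dog found"
```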

  4. ?? and ??= operators

    In expressions with the null-conditional operators ?. and ?[], you can use the ?? operator to provide an alternative expression to evaluate in case the result of the expression with null-conditional operations is null:
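Python has no ?. or ?? operators, but the same null-coalescing idea can be sketched with a small helper (the names and values are hypothetical):

```python
def coalesce(*values):
    """Return the first argument that is not None, mimicking chained ?? operators."""
    for value in values:
        if value is not None:
            return value
    return None

settings = None
# Roughly like C#'s settings?["timeout"] ?? 30:
timeout = coalesce(settings.get("timeout") if settings is not None else None, 30)
```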

  5. Conditional operator(?:) in C#

    Nested conditional operator (?:) in C#. In some scenarios there are cascading if-else conditions for variable assignment. We can replace such cascading if-else conditions with chained conditional operators on a single line, by including a conditional expression as the second statement. Let's take the below example of cascading/nested if-else ...
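The same chaining works in Python: a nested conditional expression can replace a cascading if/elif/else. The grading thresholds below are made up for illustration:

```python
def letter_grade(score):
    # Right-associative chain: each `else` holds the next conditional expression.
    return "A" if score >= 90 else "B" if score >= 80 else "C" if score >= 70 else "F"
```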

  6. C# ?: Ternary Operator (Conditional Operator)

    Example: Ternary operator.

        int x = 10, y = 100;
        var result = x > y ? "x is greater than y" : "x is less than y";
        Console.WriteLine(result);
        // output: x is less than y

    Thus, a ternary operator is a short form of an if-else statement. The above example can be rewritten using an if-else condition, as shown below. Example: Ternary operator replaces if statement.

  7. Assignment Solution for Conditional Statements

    Note: This program is an example of using a conditional if statement without else. The conditional if is usually used to perform validation operations like below. Solution:

        #include <iostream>
        using namespace std;

        int main() {
            int a, b, result, choice;
            cout << "Enter the value of a and b : " << endl;
            // ... (remainder of the snippet truncated in the original)
        }

  8. Ternary conditional operator

    For example, to pass conditionally different values as an argument for a constructor of a field or a base class, it is impossible to use a plain if-else statement; in this case we can use a conditional assignment expression, or a function call. Bear in mind also that some types allow initialization, but do not allow assignment, or even that the ...

  9. Using inline IF statement

    EDIT: If you use VB.NET from ver 2008 onward you could also use

        If(expression, truepart, falsepart)

    and this is even better because it provides the short-circuit functionality.

        Dim R As String = stringA & " * sample text" & _
                          stringB & " * sample text2" & _
                          stringC & " * sample text3" & _
                          If(ApplyValue IsNot Nothing AndAlso ApplyValue ...

  10. Conditional Coalescing and Assignment Operators

    The alternatives include using traditional conditional statements or ternary operators, but these may result in less readable and more verbose code. The addition of conditional assignment operators would offer a concise and expressive way to handle conditional assignments, especially when dealing with minimum and maximum value assignments.

  11. VB .NET Basics: Conditionals, Logical Operators and Short-Circuiting

    First of all, in VB .NET, the = operator (which is the same one used in assigning values to variables) is used for equality comparison. This is quite different from C-like languages which use = for assignment and == for comparison. Secondly, VB .NET is sensitive to spacing. If you try to put the Then on the next line, like this:

  12. Null-conditional assignment · Issue #6045 · dotnet/csharplang

    It doesn't match the grammar for that feature. Specifically, the grammar is:

        null_conditional_assignment
            : null_conditional_member_access assignment_operator expression
            : null_conditional_element_access assignment_operator expression

    Neither of which matches the above. As such, this is normal assignment.

  13. If Operator

    An IIf function always evaluates all three of its arguments, whereas an If operator that has three arguments evaluates only two of them. The first If argument is evaluated and the result is cast as a Boolean value, True or False. If the value is True, argument2 is evaluated and its value is returned, but argument3 is not evaluated. If the value ...
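The eager-versus-short-circuit difference is easy to demonstrate in Python: a plain function (like VB's IIf) has both branch arguments evaluated before the call, while a conditional expression (like the If operator) evaluates only the chosen branch. The helper names below are hypothetical:

```python
evaluated = []

def branch(name, value):
    evaluated.append(name)          # record that this branch's argument was evaluated
    return value

def iif(condition, true_value, false_value):
    # A plain function call: both arguments were already evaluated by the caller.
    return true_value if condition else false_value

iif(True, branch("true-arm", 1), branch("false-arm", 2))
eager_order = list(evaluated)       # both arms ran

evaluated.clear()
_ = branch("true-arm", 1) if True else branch("false-arm", 2)
lazy_order = list(evaluated)        # only the chosen arm ran
```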

  14. Verilog Assignments

    The LHS can be a bit-select of a net, a part-select of a net, a variable, or a net, but cannot be a reference to an array or a bit/part-select of a variable. The force statement will override all other assignments made to the variable until it is released using the release keyword.

        reg o, a, b;
        initial begin
          force o = a & b;
          ...
          release o;
        end

  15. csharplang/proposals/ at main

    The null conditional assignment grammar is defined as follows:

        null_conditional_assignment
            : null_conditional_member_access assignment_operator expression
            : null_conditional_element_access assignment_operator expression

    See §11.7.7 and §11.7.11 for reference. When the null conditional assignment appears in an expression-statement, its semantics are as follows:

  16. Using List Comprehensions with Conditional Assignment

    Here are some scenarios where list comprehensions with conditional assignment shine: Filtering and modifying elements based on specific criteria: When you need to create a new list from an existing one, applying conditions to include or modify elements, list comprehensions with conditional assignment provide a concise and efficient solution.
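A short sketch of both uses of `if` in a comprehension: a trailing `if` filters elements, while a leading conditional expression transforms every element.

```python
numbers = [1, 2, 3, 4, 5, 6]

evens = [n for n in numbers if n % 2 == 0]                    # filter: keep some elements
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]   # map: transform every element
```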

  17. Compiler tricks in x86 assembly: Ternary operator optimization

    One relatively common compiler optimization that can be handy to quickly recognize relates to conditional assignment (where a variable is conditionally assigned either one value or an alternate value). This optimization typically happens when the ternary operator in C ("?:") is ...

  18. Using Continuous Assignment to Model Combinational Logic in Verilog

    The Verilog code below shows the general syntax for continuous assignment using the assign keyword.

        assign <variable> = <value>;

    The <variable> field in the code above is the name of the signal which we are assigning data to. We can only use continuous assignment to assign data to net type variables.

  19. Continuous Assignment and Combinational Logic in SystemVerilog

    This approach is known as explicit continuous assignment. The SystemVerilog code below shows the general syntax for continuous assignment using the assign keyword.

        assign <variable> = <value>;

    In this construct, we use the <variable> field to give the name of the signal which we are assigning data to.

  20. Assignment operators

    The assignment operator = assigns the value of its right-hand operand to a variable, a property, or an indexer element given by its left-hand operand. The result of an assignment expression is the value assigned to the left-hand operand. The type of the right-hand operand must be the same as the type of the left-hand operand or implicitly convertible to it.
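Python's ordinary `=` is a statement with no result, but the walrus operator behaves like the assignment expression described above, yielding the value it assigns:

```python
# The result of the assignment expression (x := 5) is the value assigned to x,
# so it can be used directly inside a larger expression.
y = (x := 5) + 1
```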

  21. Reconstruction of unstable heavy particles using deep symmetry ...

    As a result of this training procedure, the SPA-NET assignment probability P_a only represents a conditional assignment distribution over jet indices j_i for each particle p given that the ...

  22. How to build a conditional assignment in bash?

    If you want a way to define defaults in a shell script, use code like this:

        : ${VAR:="default"}

    Yes, the line begins with ':'. I use this in shell scripts so I can override variables in ENV, or use the default. This is related because this is my most common use case for that kind of logic. ;]

