
LHCONE – A model of network evolution in the 21st Century

CERN – European Organization for Nuclear Research: the Microcosm exhibition, Large Hadron Collider (LHC), Geneva, Switzerland

The LHCONE network overlay service is by far the largest multi-domain service provided to the global research and education (R&E) user community. It was created in response to the evolving computing needs of the LHC experiments and has grown from a small-scale technical exercise into today’s global service. LHCONE is currently deployed by 40 national, regional and intercontinental R&E network operators, connecting more than 150 sites across five continents.

This network replaced the original hybrid private/shared topology developed in the late 1990s, which rapidly proved unable to adapt to the community’s changing data-processing profile, a profile that shifted in response to both the increased data production of the LHC and the different ways in which the data was being used.

First steps – Hierarchical hybrid network

In 1998, work began on studying the requirements of managing the globally distributed computing needs of the LHC experiments. The experiments (ATLAS, CMS, LHCb and ALICE) were about to start producing data in previously unheard-of volumes, and this data needed to be processed and analysed by the science community around the world. The challenge was to ensure that data was available to the right people, when and where they needed it.

To support this, a strict hierarchy of sites was defined, with three tiers:

  • Tier 0 – CERN – data acquisition
  • Tier 1 – eight regional pre-processing and data-preparation sites (in France, the UK, Italy, Germany, the Netherlands and the USA)
  • Tier 2 – user analysis and local caching

Tier 1 sites were to be connected to CERN using dedicated private point-to-point circuits (initially, data transfers to the USA Tier 1 site were carried out by air-freighting hard drives, but this practice was abandoned early in the process). This private network was dubbed the LHC Optical Private Network (LHCOPN). The Tier 1 sites were tasked with two main functions: to complement the Tier 0 in ingesting and storing the raw data during LHC runs, and to pre-process the raw data into a format that could be used for user analysis.

Tier 2 sites accessed data from their local regional Tier 1 site using the NREN IP services in that region.

MONARC Networking Architecture

This networking structure was referred to as the Models of Networked Analysis at Regional Centres (MONARC) architecture.
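To make the hierarchy concrete, here is a minimal Python sketch of the MONARC constraint that each Tier 2 site may fetch data only from its own regional Tier 1; the site names and regions are illustrative placeholders, not actual LHC computing centres or software.

```python
# Illustrative sketch of the MONARC hierarchy; all names are hypothetical
# placeholders, not real LHC computing centres or software.

TIER0 = "CERN"  # Tier 0: data acquisition

# Under MONARC, each Tier 2 could fetch data only from its regional Tier 1.
REGIONAL_TIER1 = {
    "FR": "T1-FR", "UK": "T1-UK", "IT": "T1-IT",
    "DE": "T1-DE", "NL": "T1-NL", "US": "T1-US",
}

def data_path(tier2_region: str) -> list[str]:
    """Path of a dataset: acquired at Tier 0, pre-processed at the
    regional Tier 1, then analysed at the regional Tier 2."""
    return [TIER0, REGIONAL_TIER1[tier2_region], f"T2-{tier2_region}"]

print(data_path("UK"))  # ['CERN', 'T1-UK', 'T2-UK']
```

The rigidity is the point: in this model there is simply no path from a Tier 2 to a Tier 1 outside its region, which is exactly the constraint that later became a bottleneck.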

Step Two – Larger, faster, more complex

Over time, a number of factors made this MONARC model less sustainable. Chief among these was the massive increase in the computation required, a consequence of the LHC performing far better than originally planned. Much more data was being stored and processed, providing a much larger body of information on which to run the users’ analyses.

As a tactical solution, extra Tier 1 sites were implemented in Canada, Taiwan, Korea and Scandinavia, and Tier 2 sites were allowed to access data from any suitable Tier 1 site, without the geographic constraints previously imposed.

Additionally, a group of “Tier 3” sites was implemented with far lower on-site storage capacity requirements. These “diskless” sites relied far more on remote access to data, placing greater demands on the NREN networks.

These operational changes were combined with technological changes in network and processing equipment that altered the networking environment. The cost of network links fell dramatically and the capacity available on existing fibre infrastructure increased; however, the growth in processing capacity meant that the existing hierarchical model was beginning to limit the maximum performance sites could achieve. What had initially enabled the distribution of data was now beginning to limit it.

Step Three – The LHC Open Network Environment

The profoundly changed computing model placed far more pressure on the network, not just in terms of traffic but also in terms of the added capabilities required to make more effective use of the increased capacity, primarily on the long-distance geographic links but also on the access links to the sites.

The answer to this new set of requirements was the LHC Open Network Environment (LHCONE). Conceived at a workshop organised by Caltech and CERN in June 2010, and starting as a pilot with a small number of participating sites, LHCONE quickly grew organically to include the major LHC computing sites and the R&E network operators in all regions of the world.

Very quickly, the decision was made to base LHCONE on the Virtual Routing and Forwarding (VRF) model, which was rapidly deployed in 2012 with remarkable success. Sustained flows of 10-20 Gbps were soon observed across Internet2 and ESnet in the USA and GÉANT in Europe, and LHCONE flows of up to 10 Gbps were seen at several sites in Italy, the USA and Canada.

The VRF-based LHCONE uses the underlying NREN and GÉANT network fabric but operates separately from the normal IP data flows. In simple terms, it acts as a completely separate logical network running over the same shared physical infrastructure.
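As a rough illustration of the VRF idea (a deliberate simplification of what routers actually do; all prefixes and next-hop names below are invented), each VRF keeps its own routing table, so an LHCONE lookup never consults, and is never visible to, the general IP table:

```python
import ipaddress

# Toy model of Virtual Routing and Forwarding: one device holds several
# independent routing tables. Prefixes and next-hop names are invented.
TABLES = {
    "default": {
        ipaddress.ip_network("0.0.0.0/0"): "general-ip-upstream",
    },
    "lhcone": {
        ipaddress.ip_network("192.0.2.0/24"): "lhcone-peer-a",
        ipaddress.ip_network("198.51.100.0/24"): "lhcone-peer-b",
    },
}

def lookup(vrf: str, dst: str):
    """Longest-prefix match restricted to the chosen VRF's table."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in TABLES[vrf] if addr in net]
    if not matches:
        return None  # no route in this VRF, even if another VRF has one
    return TABLES[vrf][max(matches, key=lambda net: net.prefixlen)]

print(lookup("lhcone", "192.0.2.10"))  # lhcone-peer-a
print(lookup("lhcone", "8.8.8.8"))     # None: not an LHCONE prefix
print(lookup("default", "8.8.8.8"))    # general-ip-upstream
```

Though grossly simplified, this captures why the approach was attractive: LHCONE traffic can be engineered and debugged in isolation, even while sharing the same physical links as everything else.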

This approach creates a closed and trusted environment that relaxes the need for heavyweight access security: it obviates the need for dedicated firewalls and allows the use of Access Control List (ACL) based edge security, with commensurate improvements in performance.
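As a minimal sketch of what ACL-based edge security amounts to (the prefixes are invented for the example), admission reduces to a stateless prefix check rather than stateful firewall inspection, which is one reason it scales to very high data rates:

```python
import ipaddress

# Hypothetical allow-list of LHCONE participant prefixes (invented values).
ALLOWED = [ipaddress.ip_network(p) for p in ("192.0.2.0/24", "198.51.100.0/25")]

def permitted(src: str) -> bool:
    """Stateless ACL check: permit traffic only from registered prefixes."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in ALLOWED)

print(permitted("192.0.2.77"))   # True: inside a registered prefix
print(permitted("203.0.113.5"))  # False: not an LHCONE participant
```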

VRF-Based LHCONE Architecture

Simpler Troubleshooting

Because the LHCONE routing table is separate from the general R&E IP routing table, and is a lot smaller, troubleshooting is much quicker.

This is complemented by a network of perfSONAR probes deployed at almost every site connected to LHCONE, which allows very granular monitoring of the underlying network, including routing anomalies, jitter fluctuations and throughput limitations.
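As a rough sketch of how such measurements might be consumed programmatically (the host names are hypothetical, and while the URL path and parameters follow perfSONAR’s esmond measurement-archive REST API, they should be verified against a real deployment):

```python
import requests

# Hypothetical measurement-archive host; path and parameter names follow
# perfSONAR's esmond REST API, but verify them against your own deployment.
ARCHIVE = "https://ps-archive.example.org/esmond/perfsonar/archive/"

params = {
    "event-type": "throughput",        # e.g. also packet-loss-rate, histogram-owdelay
    "source": "ps-site-a.example.org",
    "destination": "ps-site-b.example.org",
    "time-range": 86400,               # last 24 hours, in seconds
}

resp = requests.get(ARCHIVE, params=params, timeout=30)
resp.raise_for_status()

# The archive returns a list of measurement metadata records (layout assumed).
for record in resp.json():
    print(record.get("source"), "->", record.get("destination"))
```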

As LHCONE is a logically separate network, a specific Acceptable Use Policy (AUP) has been put in place to ensure that the service is used only by eligible participants, in acceptable and authorised ways. Both end sites and network operators must adhere to the AUP.

LHCONE Today

The very first group of NRENs and regional research and education networks (RRENs) to start the implementation, around 2010, consisted of a few European NRENs and ESnet in the USA, but participation soon began to grow steadily.

As of today, LHCONE is implemented by 33 NRENs, four regional RENs and three intercontinental RENs, spanning five continents (Europe, Asia, North America, South America and Australia). Over 150 sites are currently connected to LHCONE.

Use of the network has also expanded beyond the original four LHC experiments to include Belle II, the Pierre Auger Observatory, NOvA, XENON1T, JUNO and DUNE. This widening of scope allows the LHCONE infrastructure to be shared across a broader community, simplifying operations for these other particle-physics experiments and reducing duplication and costs for sites involved in multiple experiments.

LHCONE stands as a remarkable example of the benefits of international collaboration between NRENs and RRENs to support the rapidly changing requirements of advanced research.
