Words: Dimitris Mitropoulos, Head of the Reliability Engineering Directorate and Assistant Professor at the University of Athens
In 2019, GRNET (National Infrastructures for Research and Technology), the Greek NREN, under the auspices of the Ministry of Digital Governance, became involved with the digital transformation of the Greek public sector. Specifically, it became responsible for the development and maintenance of the gov.gr portal and several governmental services including the electronic issuance of documents signed by the Greek state.
Offering services to a country's general public introduces challenges that differ from those of services aimed at the academic and research communities. Key issues include faster development and deployment cycles, scalability and resiliency, security, and public perception. To cope with these challenges, GRNET teams, primarily the SRE (Site Reliability Engineering) team, developed AppStack, a cloud-native platform that provides an environment for integrating open-source software (OSS) components. AppStack currently hosts more than 200 user flows for governmental services.
Through AppStack, technical teams collaborate on a common pipeline that offers several capabilities. The pipeline starts with developers working on software artifacts and applying Continuous Integration (CI) practices using GitLab CI. Once an application version is ready, they build a corresponding container image. All container images are then stored in Harbor, a container registry for managing and distributing such images. To deploy and operate the resulting containers, we use Kubernetes (K8s), a well-established container-orchestration system whose key properties include resource isolation and effective workload distribution. We simplify application management on Kubernetes with Helm, which packages applications into reusable units called charts. AppStack also incorporates a monitoring stack comprising Sentry, an error-tracking framework; the Prometheus monitoring and alerting toolkit; and the EFK (Elasticsearch, Fluentd, and Kibana) suite, which provides log management.
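To make the order of the pipeline stages concrete, here is a minimal Python sketch that assembles the commands a CI job might run for one release: build an image, push it to the registry, and deploy it with Helm. It is an illustration under stated assumptions, not GRNET's actual configuration. The `Release` class, the hostname, project, application, and namespace names are all hypothetical; the use of the `docker` CLI to build and push is an assumption; and the `image.repository` and `image.tag` chart values are a common convention that a given chart may not follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """One application version moving through the pipeline (all fields hypothetical)."""
    registry: str    # registry hostname, e.g. a Harbor instance
    project: str     # registry project that groups related images
    app: str         # application name, reused as the Helm release name
    version: str     # image tag produced by the CI job
    namespace: str   # Kubernetes namespace to deploy into

    def image_repository(self) -> str:
        return f"{self.registry}/{self.project}/{self.app}"

    def image_ref(self) -> str:
        return f"{self.image_repository()}:{self.version}"


def pipeline_commands(release: Release, chart_path: str) -> list[list[str]]:
    """Return the build, push, and deploy commands in pipeline order."""
    image = release.image_ref()
    return [
        # Assumption: images are built and pushed with the docker CLI.
        ["docker", "build", "-t", image, "."],
        ["docker", "push", image],
        [
            "helm", "upgrade", "--install", release.app, chart_path,
            "--namespace", release.namespace,
            # Assumption: the chart exposes image.repository and image.tag values.
            "--set", f"image.repository={release.image_repository()}",
            "--set", f"image.tag={release.version}",
        ],
    ]


# Hypothetical release, used only to show the shape of the output.
release = Release("harbor.example.org", "govservices", "doc-issuance", "1.4.2", "staging")
commands = pipeline_commands(release, "./chart")
for command in commands:
    print(" ".join(command))
```

The function only builds the command lists and never executes them, so the sketch can be run safely without Docker, Helm, or a cluster.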
As internet traffic reaches GRNET’s data center, our routers forward it to the Load Balancers (LBs) that stand in front of the Kubernetes clusters. The routers and the LBs communicate using the Border Gateway Protocol (BGP), which supports features such as policy-based routing and traffic engineering. Our LBs employ (1) BIRD to establish BGP sessions with the routers, and (2) HAProxy to distribute the network traffic across the clusters. This setup lets SREs control the request rate and change load-balancing methods on the fly.
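What "changing load balancing methods on the fly" means can be illustrated with a small Python sketch of two common strategies, round robin and least connections (HAProxy offers both, as `roundrobin` and `leastconn`). This is a toy model, not HAProxy's implementation or GRNET's configuration; the class names and the cluster names are hypothetical.

```python
from itertools import cycle


class RoundRobin:
    """Hand out backends in a fixed rotation, ignoring their current load."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def pick(self, active_connections):
        return next(self._rotation)


class LeastConnections:
    """Pick the backend that currently serves the fewest connections."""

    def __init__(self, backends):
        self._backends = list(backends)

    def pick(self, active_connections):
        # Ties are resolved in favour of the backend listed first.
        return min(self._backends, key=lambda b: active_connections.get(b, 0))


class LoadBalancer:
    """Route requests with a swappable method while tracking open connections."""

    def __init__(self, method):
        self.method = method
        self.active_connections = {}

    def route(self):
        backend = self.method.pick(self.active_connections)
        self.active_connections[backend] = self.active_connections.get(backend, 0) + 1
        return backend

    def release(self, backend):
        self.active_connections[backend] -= 1


clusters = ["cluster-a", "cluster-b", "cluster-c"]  # hypothetical backend names
lb = LoadBalancer(RoundRobin(clusters))
first = [lb.route() for _ in range(4)]      # rotation wraps around to cluster-a

lb.method = LeastConnections(clusters)      # switch method without losing state
next_pick = lb.route()                      # goes to a cluster with fewer connections
print(first, next_pick)
```

Because the connection counts live in the balancer rather than in the method object, swapping the method keeps the current view of the load, which is the property that makes a live switch safe in this model.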
On the cluster side, Ingress-NGINX controllers receive and manage traffic. To protect our services from attacks, we have also integrated the ModSecurity Web Application Firewall (WAF) as an additional layer within Ingress-NGINX. To handle traffic inside the clusters we use Calico, a networking solution for containerised workloads that uses BGP to handle large numbers of routes. All data is stored in a separate database cluster. As the backend database we use PostgreSQL, a relational database management system whose concurrency control mechanisms and indexing capabilities help optimise query performance. To further improve database performance, we use PgBouncer, a connection pooler.
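The idea behind connection pooling can be sketched in a few lines of Python: a fixed set of connections is opened once and then lent out to callers, instead of opening a new connection for every request. PgBouncer itself is a separate proxy process that sits between applications and PostgreSQL, so the in-process pool below only illustrates the principle. `FakeConnection` and `ConnectionPool` are hypothetical stand-ins, and no real database is contacted.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real PostgreSQL connection (hypothetical)."""
    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1
        self.id = FakeConnection.opened

    def execute(self, sql):
        return f"conn {self.id}: {sql}"


class ConnectionPool:
    """Open `size` connections up front and lend them out one at a time."""

    def __init__(self, size, connect=FakeConnection):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=1.0):
        # Waits while every connection is busy; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection so later callers can reuse it.
            self._idle.put(conn)


pool = ConnectionPool(size=2)
results = []
for i in range(5):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))

print(results)
print("connections opened:", FakeConnection.opened)
```

Five queries are served by only two connections here. The saving comes from skipping connection setup on each request, which is the cost a pooler is meant to remove.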
The scalable architecture of AppStack allows for multiple deployments per day, even with thousands of users connected. As a result, we can respond quickly to changing conditions and feedback, and release new features and updates. Our experience from running AppStack indicates that it has great potential, and we plan to expand it as more services are scheduled.
This article is featured in CONNECT 46, the latest issue of the GÉANT CONNECT Magazine!