EELA Project: creating a human network between Europe and Latin America
The EELA (E-infrastructure shared between Europe and Latin America) project aims to establish a bridge between the existing e-Infrastructures in Europe and those emerging in Latin America by creating an interoperable Grid infrastructure, based on the RedCLARA and GÉANT2 networks, for the development and deployment of advanced applications in Biomedicine, High Energy Physics, e-Learning and Climate. EELA is expected to help reduce the digital divide in Latin America by making a high-performance e-Infrastructure available to researchers for advanced investigations, later extendable to a larger community of users.
Objectives
The EELA project has three main objectives:
- Establish a collaboration network between European institutions where Grid expertise exists (e.g. EGEE Project), and Latin American institutions where Grid activities are emerging.
- Set up a pilot e-Infrastructure in Latin America, interoperable with the EGEE infrastructure in Europe, on which enhanced applications can run, thereby enabling the dissemination of knowledge and experience on Grid technology.
- Set up a steady framework for e-Science collaboration between Europe and Latin America.
Consortium
The EELA consortium involves 21 leading institutions around Europe and Latin America:
- In Europe:
- In Latin America:
- International Organizations:
Organization
The EELA project is organized in four Work Packages (WP):
- WP1: Project Administration and Technical Management.
- WP2: Pilot Test-bed Operation and Support. It implements all necessary services to access the LA-EU e-Infrastructure, provides a framework for the end user and makes available resources to run applications.
- WP3: Identification and Support of Grid Enhanced Applications.
- WP4: Dissemination Activities. It imparts Grid knowledge, advertises EELA activities and provides training on Grid technology.
Infrastructure
Work Package 2 is in charge of creating and managing a common interoperable Grid Pilot Test-bed, starting from existing resources in Latin America and Europe, distributed over 15 resource centres. The Pilot Test-bed is built upon the network infrastructure provided by GÉANT2 in Europe and RedCLARA in Latin America.
The Pilot Test-bed is the heart of Work Package 2: without the services provided by this infrastructure, the other Work Packages would not be able to run applications or disseminate knowledge.
The Pilot Test-bed is organised in three layers. At the highest level, the EELA Operations Centre (EOC, formerly the Grid Operations Centre) coordinates the interaction between the subordinate CORE Services Centres (CSCs) and the Resource Centres (RCs), which provide computing power and data storage. At the base level, the Additional Service Providers (ASPs) support services that are needed for proper operation of the Pilot Test-bed but are not directly tied to the middleware in use, such as Certification Authority management, Virtual Organisation Management Services (VOMS) and File Catalogues.
The responsibility for maintaining this infrastructure is divided among the tasks that make up Work Package 2:
- Task 2.1 is responsible for the coordination between the other tasks, and for the interactions with other Work Packages and external projects.
- Task 2.2 is responsible for ensuring proper operation of the ASPs.
- Task 2.3 is responsible for operation of the EOC, CSCs and the various RCs.
- Task 2.4 is responsible for provisioning of the network connections between the various partners involved in the Test-bed.
Applications
The considerable effort invested in integrating sites from diverse institutions spread around Europe and Latin America would be in vain if no applications were run on the infrastructure. For this reason, several scientific fields have been chosen for developing e-Science applications that could benefit the whole community, with special emphasis on the Latin American one.
Biomedical Applications
Biomedicine is one of the pillars of the EELA Project. The reason is not only the severe medical problems affecting some areas of Latin America, but also the fact that this scientific community was one of the first to be gridified. The selected applications fall into the three typical categories of Bioinformatics Applications, Computational Biochemical Processes and Biomedical Models, and have started to be deployed on the pilot EELA infrastructure for both production and dissemination purposes.
The EELA biomedicine applications are:
- New Applications:
- BiG (BLAST in Grid) is a Grid-enabled BLAST interface.
- BLAST (Basic Local Alignment Search Tool) is a bioinformatics procedure used to identify compatible protein and nucleotide sequences in protein and DNA databases.
- MrBayes is a tool for phylogeny studies.
- A phylogeny is a reconstruction of the evolutionary history of a group of organisms.
- EGEE-Ported Applications:
- GATE is an environment for the Monte Carlo simulation of particle physics emission in the medical field.
- It is focused on thyroid cancer and the treatment of metastasis with P32.
- WISDOM (Wide In-Silico Docking Of Malaria) is a deployment of a high-throughput virtual screening platform with a view to in-silico [...] discovery for neglected diseases.
A more detailed explanation of the applications can be found in EELA Deliverable D3.1.1.
High Energy Physics Applications
The High Energy Physics (HEP) community world-wide was one of the first to embrace Grid-based computing because of the highly parallelisable nature of its problems, as shown by the contributions to the various Computing in High Energy Physics (CHEP) conferences since the year 2000 and by its support for EGEE. Because they are computing and data intensive, and because their data challenges run for a long time, HEP applications provide a means to stress-test the Grid infrastructure set up as part of the EELA project. At the same time, the HEP community in Latin America benefits from the infrastructure set up by EELA. Today’s HEP applications are mainly batch-oriented, but applications for interactive data analysis are underway. This computational and data challenge has resulted in a programme of research, development and deployment of Grid technologies oriented to the Large Hadron Collider experiments, known as LCG (LHC Computing Grid). To benefit the international collaborations behind the experiments, it was important to incorporate the EELA resources into their Grid infrastructure. This has been done, and jobs from the international collaborations are expected to run in the following selected applications.
The EELA HEP applications are:
- Initial applications
- ALICE (A Large Ion Collider Experiment): the collaboration is building a dedicated heavy-ion detector to exploit the unique physics potential of nucleus-nucleus interactions at LHC energies.
- LHCb: this experiment aims to fully investigate CP violation in the Bd and Bs systems and possibly to reveal new physics beyond the Standard Model. It is a specialised experiment that makes use of the fact that particles containing a b-quark (called mesons) will be copiously produced at the LHC.
- Other LHC applications
- ATLAS (A Toroidal LHC ApparatuS) is a particle physics experiment that will explore the fundamental nature of matter and the basic forces that shape our universe.
- CMS (Compact Muon Solenoid) is one of two large general-purpose particle physics detectors being built at the proton-proton Large Hadron Collider (LHC) at CERN.
- New projects
- Pierre Auger Observatory (PAO) is an international cosmic ray observatory designed to detect ultra-high-energy cosmic rays. It is located in western Argentina, in the province of Mendoza.
e-Learning Applications
EELA's main objective of identifying and promoting a sustainable framework for e-Science can be reached not only by deploying mature Grid applications, but also by making available the new applications that are currently being developed by the European and Latin American scientific communities. e-Learning is crucial to support learning processes in Latin America, which are affected by geographical constraints; its benefits will reach the whole of Latin American society.
The EELA e-Learning applications are:
- CuGfL: The Learning Management System Cuba Grid for Learning is a One-Stop-Centre for quality-assured online learning content that aims to promote and support the lifelong learning agenda in Cuba.
- VoD: Video on Demand consists of a distributed interactive multimedia server (RIO). As part of this e-Learning initiative, a prototype virtual lab is being built for the study of mechanical oscillations.
- LEMDist: not only an application but a project in itself. Its goal is to provide web access to laboratory equipment, together with a further web service to help e-Science and e-Learning users.
- PILP: Parallel Inductive Logic Programming is intended to discover hidden data from relational databases.
- SATyrus: a novel approach to the specification and solving of optimization problems.
Climate Applications
Modern climate science deals with different sources of global climate simulations and geographically distributed observational data (surface, atmosphere, ocean, etc.) stored on different platforms and in different formats. Together, these sources of data can help to solve many important problems, such as the effects of climate change on different regions of interest. To this end, efficient problem-driven statistical analysis tools are required for discovering knowledge, or useful information, within the huge amount of data. Data mining and machine learning techniques have been developed over recent decades to deal with this task, and different alternatives have been studied to make the process easier in a distributed environment such as the Grid.
The Climate Application is composed of three sub-applications working together to forecast climate:
- CAM (Community Atmosphere Model): the latest in a series of global atmosphere models developed at NCAR for the weather and climate research communities.
- WRF Model (Weather Research & Forecasting Model): a limited-area model designed to simulate or predict regional atmospheric circulation. This model can work with nested domains at different resolutions and requires as input the boundary conditions from a global model (e.g., the CAM model).
- SOM (Self Organizing Maps) for climate data: Due to the high-dimensional character of the data involved in the climate simulations, it is necessary to first analyze and simplify the data in order to extract some useful knowledge. Some data mining techniques are appropriate for this context. Unsupervised clustering techniques allow partitioning the simulation databases, producing realistic weather or climate models of great variability governing the global dynamics. Self-Organizing Maps (SOM) are amongst the most popular clustering algorithms, which are especially suitable for high dimensional data visualization and modelling.
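To make the clustering idea behind Self-Organizing Maps more concrete, the following is a minimal sketch in plain Python, not the EELA implementation. It assumes a small rectangular map, a Gaussian neighbourhood, and linearly decaying learning rate and radius; the function names `train_som` and `best_matching_unit` and all parameter defaults are hypothetical choices made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the map unit whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(x, w))  # squared Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length vectors (illustrative only)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Assumption: initialise every unit with a copy of a randomly chosen sample.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays towards 0
            sigma = max(sigma0 * (1.0 - frac), 0.01)  # neighbourhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])  # pull unit towards the sample
            step += 1
    return weights
```

After training, calling `best_matching_unit(weights, sample)` assigns each high-dimensional sample (for example, one simulated atmospheric state) to a cell of the two-dimensional map, which is what makes the technique useful for visualising and partitioning large simulation databases.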
Other Applications: Volcano Sonification
Current knowledge of volcanic eruptions does not yet allow scientists to predict future eruptions. The EGEE and EELA Projects are trying to bring the scientific community one step closer to prediction by means of the sonification of volcano seismograms. To this end, the patterns of volcanic behaviour of Mount Etna (Italy) and Mount Tungurahua (Ecuador) have been translated into sound waves within the context of these Projects.
Data sonification is currently used in several fields and for different purposes: science and engineering, education and training. It acts mainly as a data analysis and interpretation tool.
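As an illustration of what sonification can mean in practice, the sketch below maps each sample of a numeric trace onto a pitch and writes the result as a sequence of short sine tones. The text does not describe the mapping actually used by the EGEE and EELA Projects, so the linear amplitude-to-frequency mapping, the frequency range, and the function names `amplitudes_to_frequencies` and `write_tones` are all assumptions made for this example.

```python
import math
import struct
import wave


def amplitudes_to_frequencies(samples, f_min=220.0, f_max=880.0):
    """Linearly map each sample onto a pitch in [f_min, f_max] Hz (assumed mapping)."""
    lo, hi = min(samples), max(samples)
    span = hi - lo
    if span == 0:
        # A flat trace carries no variation: use a constant middle pitch.
        return [(f_min + f_max) / 2.0] * len(samples)
    return [f_min + (s - lo) / span * (f_max - f_min) for s in samples]


def write_tones(freqs, path, note_seconds=0.1, rate=8000):
    """Write one short sine tone per frequency to a mono 16-bit WAV file."""
    frames = bytearray()
    n = int(note_seconds * rate)  # audio frames per tone
    for f in freqs:
        for i in range(n):
            value = int(32767 * 0.5 * math.sin(2 * math.pi * f * i / rate))
            frames += struct.pack("<h", value)  # little-endian signed 16-bit
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(bytes(frames))
```

A call such as `write_tones(amplitudes_to_frequencies(trace), "trace.wav")`, where `trace` is a list of seismogram amplitudes, would produce an audible rendering in which larger amplitudes are heard as higher pitches.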
Training and events
The project frequently organises tutorials and workshops. The tutorials aim to ensure that all users fully understand the characteristics of the Grid services on offer and have enough technical knowledge to use the EELA infrastructure properly. All training material produced and used in EELA training events has been published in an open repository at http://documents.eu-eela.org/
The main goal of the workshops is to present the EELA project to local authorities, decision makers and the scientific community, as well as to assess the interest of local institutions in collaborating with EELA.
Stay tuned:
Conferences and Workshops
Tutorials
First EELA Grid School
Held on Itacuruçá island (Brazil), EGRIS-1 was the most important training activity in 2006. For this event, a complete Grid infrastructure composed of 66 nodes and a 10 Mbit/s link, working 24/7 for two weeks, was installed on a small tropical island. EGRIS-1 aimed to create the necessary environment in Latin America for the “gridification” of new applications to be run on the EELA Infrastructure. Eight application development teams were chosen by the selection board, comprising researchers from EELA partners, technical industries and non-EELA research groups, from both Europe and Latin America. At the end of the event, all development teams went back to their labs with the knowledge needed to port their applications to the EELA test-bed.
Watch the EGRIS-1 video on YouTube.
Innovation
EELA proposes innovative technologies and strategies. Technologically, the challenge is to share large amounts of data efficiently across a wide-area network through a global file system. This integrated platform shall enhance the computing capability in Europe, and even more so in Latin America, thanks to the possibility of redistributing the global computational workload by migrating jobs across national borders in a way that is totally transparent to end users.
From the strategic point of view, the EELA project will deploy a computing and storage infrastructure through a deep integration of existing national high-end platforms, tightly coupled to a dedicated network by means of advanced Grid software. Strategies of coordinated operation have been identified and agreed. The result will be an integrated infrastructure whose capabilities shall be more efficient than the sum of its constituent parts.
The benefits of the Grid enhanced applications running on the EELA infrastructure are twofold: besides their obvious scientific importance, several of these applications will have a noticeable social impact. New inhibitors for Malaria, Influenza and other neglected diseases (responsible for the daily death of thousands of people), access to Education for isolated people and powerful climate prediction are some good examples.