Funding of 9.8 million euro over three years in support of the DataGrid project was authorized by the EC Information Society Programme (within the Fifth Framework Research Programme for technology development) at the end of December 2000, and a contract has been awarded to CERN* as leader of the project.
This project is seen by international computing experts and EU authorities as an ideal test case for the development of a new model of world-wide distributed computing and as the natural evolution of the World Wide Web, which was also developed at CERN. The DataGrid project was submitted to the EU on 8 May, and after a successful review the consortium was invited to negotiate a contract for 9.8 million euro of EU funding over three years. The negotiation was rapidly and successfully concluded at the end of October 2000. The formal signature of the corresponding contract took place on 29 December.
As the World Wide Web is exploited by more and more people, its limitations in dealing with the huge amounts of data involved become more apparent. Its successor, the Grid, owes its name to the romance of the pioneering era when the electrical power grid was a symbol of liberty, allowing people to tap into a valuable new resource, either as suppliers or users, or both. The computing Grid shares the same pioneering spirit, transferring it to a distributed Grid of computing resources in which supercomputers, processor farms, disks, major databases, information systems, collaborative tools and people are linked by a high-speed network.
The DataGrid project will develop and implement a novel distributed computing environment, which is specifically designed to analyse and move vast amounts of data. It will build on emerging Grid technologies, using 'open source' code to create a new world-wide data and computational Grid on a scale not attempted previously, a 'World Wide Grid'. The resources will be made available transparently to a widespread community through layers of new 'middleware', the really innovative part of the DataGrid project. Middleware could be described as software 'glue', which sits between the computing operating systems and the applications, enabling collaborative working in new ways. A major activity in the DataGrid project will be the dissemination of information and experience, with a strong emphasis on ensuring that the middleware created is made available to industry, potential partners and research areas.
The DataGrid project will provide scientists around the world with flexible access to unprecedented levels of computing resources and will initiate a new era of e-Science. It will enable next-generation scientific exploration using shared databases up to a Petabyte in size (equivalent to the data contents of a pile of CD-ROMs standing about a mile high), across widely distributed scientific communities. It will allow distributed data- and CPU-intensive scientific computing models, drawn from the scientific disciplines of physics, biology and earth sciences, to be demonstrated on a geographically distributed Grid.
One of the first challenges for the DataGrid will be to handle the mass of data generated by CERN's next accelerator, the Large Hadron Collider (LHC), which starts up in 2005. Bunches of protons will collide some 40 million times per second at the centre of huge detectors. The resulting tidal wave of data is equivalent to every person on the planet talking into 20 telephones at once. The computing power required to handle and process these data at CERN is estimated to be equivalent to about 100 000 of today's PCs. At least three times that power will be needed in the collaborating institutes world-wide. Clearly, CERN and the four LHC collaborations cannot sustain this effort, either financially or practically. For this reason, CERN is placing its trust in the Grid.
The DataGrid project will help to co-ordinate national Grid projects, many of which are already underway. International connectivity will be achieved using an advanced research networking infrastructure, which will be made available by the EU Geant project. The project will explore a new scale of data-intensive Grid computing, and will provide a solid base of knowledge and experience. The middleware will be developed in collaboration with some of the leading centres in Grid technology, leveraging practice and experience from previous and current Grid activities in Europe and elsewhere.
The six main partners in the project are:
- CERN - The European Organisation for Nuclear Research near Geneva
- CNRS (France) - Centre National de la Recherche Scientifique
- ESRIN - The European Space Agency's Centre in Frascati (near Rome), Italy
- INFN (Italy) - Istituto Nazionale di Fisica Nucleare
- NIKHEF (The Netherlands) - The Dutch National Institute for Nuclear Physics and High Energy Physics, in Amsterdam
- PPARC (United Kingdom) - Particle Physics and Astronomy Research Council
Alongside the 17 pan-European research organisations in the project, three companies are associated partners of the collaboration:
- CS-Systemes D'Information, Clamart, France - A company which provides computer services and products for the Internet market
- Datamat, Rome, Italy - A company which provides computer services for a large set of applications and high performance computing markets
- IBM-UK, Feltham, UK - The High Performance Computing Unit of this internationally renowned computing company
There is also an Industry Forum, which will bring together research institutions and companies from around the world with the goal of developing open Grid technologies, ensuring a seamless, non-proprietary Grid for all.
Further information is available from the DataGrid website
Or from the Project Director: Fabrizio GAGLIARDI, CERN-IT, 1211 Geneva 23, Switzerland
Phone: +41 22 767 2374