CERN announced today the successful completion of a major data challenge aimed at pushing the limits of data storage to tape. Using 45 newly installed StorageTek 9940B tape drives, each capable of writing to tape at 30 megabytes/s, Bernd Panzer and his team at the IT Division of CERN were able to achieve storage-to-tape rates of 1.1 gigabytes/s for periods of several hours, with peaks of 1.2 gigabytes/s, roughly equivalent to storing a whole movie on DVD every four seconds. The average sustained over a three-day period was 920 megabytes/s. Previous best results by other research labs were typically less than 850 megabytes/s.
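The figures quoted above can be cross-checked with some simple arithmetic. The short Python sketch below is only an illustration, not part of the data challenge itself. It assumes decimal units (1 gigabyte = 1000 megabytes) and a single-layer DVD capacity of 4.7 gigabytes, neither of which is stated in the announcement, and it ignores data compression. All function names are hypothetical.

```python
# Back-of-the-envelope check of the storage-to-tape figures.
# Assumptions (not from the announcement): decimal units and a
# 4.7 gigabyte single-layer DVD. Compression is ignored.

DRIVES = 45              # newly installed tape drives
DRIVE_RATE_MB_S = 30     # nominal write rate per drive, megabytes/s
SUSTAINED_MB_S = 1100    # rate held for several hours, megabytes/s
AVERAGE_MB_S = 920       # average over three days, megabytes/s
DVD_GB = 4.7             # assumed DVD capacity, gigabytes


def aggregate_nominal_rate(drives, rate_mb_s):
    """Combined nominal write rate of all drives, in megabytes/s."""
    return drives * rate_mb_s


def seconds_per_dvd(rate_mb_s, dvd_gb=DVD_GB):
    """Time needed to write one DVD's worth of data at the given rate."""
    return dvd_gb * 1000 / rate_mb_s


def volume_tb(rate_mb_s, days):
    """Total data written at a constant rate over a number of days, in terabytes."""
    return rate_mb_s * days * 86400 / 1e6


if __name__ == "__main__":
    nominal = aggregate_nominal_rate(DRIVES, DRIVE_RATE_MB_S)
    print(f"Nominal aggregate rate: {nominal} megabytes/s")
    print(f"Sustained rate as share of nominal: {SUSTAINED_MB_S / nominal:.0%}")
    print(f"Seconds per DVD at sustained rate: {seconds_per_dvd(SUSTAINED_MB_S):.1f}")
    print(f"Data written in three days: {volume_tb(AVERAGE_MB_S, 3):.0f} terabytes")
```

Under these assumptions, the sustained rate corresponds to a little over four seconds per DVD, consistent with the comparison above, and the three-day average corresponds to roughly 238 terabytes written to tape.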
Fig. 1: Schematic of the data setup for the storage-to-tape challenge. Data, in this case generated by 40 compute servers, is temporarily stored to disk, then copied to the 45 StorageTek tape servers. Once the Large Hadron Collider is running at CERN, the data will come directly from the experiments and a copy will be distributed onto the DataGrid that CERN and partners are currently developing.
The significance of this result, and the purpose of the data challenge, was to show that CERN's IT Division is on track to cope with the enormous data rates expected from experiments on CERN's Large Hadron Collider (LHC), the next-generation particle accelerator currently under construction. These experiments will produce data at rates in excess of 100 megabytes/s, and one experiment alone, called Alice, is expected to produce data at rates of 1.25 gigabytes/s.
In all, the LHC experiments are anticipated to spew out over 10 petabytes of data a year, which will be stored on tape as well as being distributed around the world onto disk, for subsequent analysis using advanced "Grid" technologies for distributed computing and data storage. The data will contain information about the results of protons colliding in the accelerator at unprecedented energies, recreating for a brief instant the extreme conditions that existed just after the Big Bang. Scientists will spend years sifting painstakingly through this data, in an effort to better understand the fundamental laws that govern matter in the Universe.
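To put the annual volume in perspective, the sketch below converts 10 petabytes a year into an average data rate, and estimates how long a single stream at the 1.25 gigabytes/s expected from Alice would take to produce that volume. It is a rough illustration only. It assumes decimal units (1 petabyte = 10^15 bytes) and a 365-day year, and it treats the rates as constant, which real experiments are not expected to be. The function names are hypothetical.

```python
# Rough conversion between annual data volume and data rate.
# Assumptions (not from the announcement): decimal units,
# a 365-day year, and constant data rates.

SECONDS_PER_DAY = 86400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def average_rate_mb_s(petabytes_per_year):
    """Average rate in megabytes/s needed to accumulate the given yearly volume."""
    megabytes = petabytes_per_year * 1e9  # 1 petabyte = 1e9 megabytes
    return megabytes / SECONDS_PER_YEAR


def days_to_produce(petabytes, rate_gb_s):
    """Days of continuous running at rate_gb_s needed to produce the given volume."""
    gigabytes = petabytes * 1e6  # 1 petabyte = 1e6 gigabytes
    return gigabytes / rate_gb_s / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"10 petabytes/year averages {average_rate_mb_s(10):.0f} megabytes/s")
    print(f"At 1.25 gigabytes/s, 10 petabytes takes {days_to_produce(10, 1.25):.1f} days")
```

Under these assumptions, 10 petabytes a year corresponds to an average of a little over 300 megabytes/s around the clock, while the peak rates the storage system must absorb are several times higher.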
While waiting for the LHC to be completed, Dr. Panzer's team generated an equivalent stream of artificial data, using 40 compute servers. This data was stored temporarily to 60 disk servers before being transferred to the StorageTek tape servers (see Fig. 1). A data compression factor of 1.3 was deliberately chosen during this data challenge, as this is characteristic of the compression that can be achieved with real experimental data.
As Wolfgang von Rüden, head of the CERN IT Division, put it, "Only by pushing leading-edge technology to the limit can we verify the architecture and scalability of the data storage solutions we are developing for the LHC. This latest breakthrough with StorageTek is very reassuring, and means we are in striking distance of production requirements for the LHC."
Beat Schüle, Country Manager Switzerland/Austria at StorageTek, commented, "CERN is a demanding customer, and this is just what we need to keep our technology at the forefront. By upgrading to our latest 9940B series, CERN was able to achieve unprecedented rates of storage-to-tape. This confirms that tape remains a competitive storage medium for the extreme data challenges of the LHC Computing Grid."
Besides the StorageTek equipment, a key contributing factor to the success of the data challenge was a high-performance switched network from Enterasys Networks with 10 gigabit/s Ethernet capability, which routed the data from PC to disk and from disk to tape. This switched network is part of the CERN opencluster, an advanced computer cluster also involving technology from HP and Intel. Close collaboration with a range of IT suppliers is part of a long tradition at CERN, a pioneer in information technologies famous for being the place where the World Wide Web was born.
About CERN and the CERN opencluster
CERN is the European Organization for Nuclear Research, the world's largest particle physics centre near Geneva, Switzerland. Technological development at CERN has given the world advances as varied as medical imaging and the World Wide Web. Founded in 1954, the laboratory was one of Europe's first joint ventures and has become a shining example of international collaboration. From the original 12 signatories of the CERN convention, membership has grown to the present 20 member states.
The CERN opencluster is the first common project in the CERN openlab for DataGrid applications, a partnership with industry. The CERN openlab is a response to the new level of intensive industrial collaboration needed to solve the unprecedented computing challenge of the Large Hadron Collider project, currently under construction at CERN. The current partners in the CERN openlab are Enterasys Networks, HP, IBM and Intel. The CERN opencluster currently involves 64-bit processor technology from Intel, advanced servers from HP, and a 10 gigabit switching environment from Enterasys Networks.
StorageTek (NYSE:STK), a $2 billion worldwide company with headquarters in Louisville, Colo., delivers a broad range of storage solutions for digitized data. StorageTek solutions are easy to manage and allow universal access to data across servers, media types and storage networks. StorageTek is the innovator and global leader in virtual storage solutions for tape automation, disk storage systems and storage networking and is a voting member of the SNIA. Because of StorageTek, customers can manage and leverage their digital assets as their businesses grow and can maximize IT productivity to ensure enterprise-class business continuity.
For more information call 1.800.786.7835.
Contact at CERN
CERN IT Division