SkyMapper's databases run on PostgreSQL, front-ended by a standard Web form through which astronomers can run relational searches.
“You may be interested in a galaxy at a certain position, so you can get on to the Web page, download the data and image of that galaxy. In this way we can save astronomers a lot of time,” Keller says. “They don’t have to go and survey it themselves to decide if they’re interested in it for further research.”
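The positional lookup Keller describes is, in essence, a relational query against the catalogue. The sketch below shows the idea using an in-memory SQLite database and a simple box search; the real service runs on PostgreSQL, and the table and column names here are illustrative assumptions, not SkyMapper's actual schema.

```python
import sqlite3

# Hypothetical catalogue schema -- SkyMapper's real table and column
# names are not given in the article, so these are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER, ra REAL, dec REAL, mag REAL)")
conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", [
    (1, 150.01, -35.02, 14.2),   # near the search position
    (2, 150.05, -35.00, 16.8),   # near the search position
    (3, 210.40,  12.30, 15.1),   # elsewhere on the sky
])

# A simple box search around a position of interest (RA/Dec in degrees).
# A production catalogue would use a proper spherical index instead.
ra, dec, radius = 150.0, -35.0, 0.1
rows = conn.execute(
    "SELECT id, ra, dec, mag FROM objects "
    "WHERE ra BETWEEN ? AND ? AND dec BETWEEN ? AND ?",
    (ra - radius, ra + radius, dec - radius, dec + radius),
).fetchall()
print(rows)  # the two objects near (150.0, -35.0)
```

A Web form like SkyMapper's would collect the position and radius from the user and run an equivalent parameterised query server-side.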
This is particularly important for the coming generation of massive 20-30 metre telescopes, Keller says. These behemoths will cost about a billion dollars each, so time on them is extremely valuable.
Keller says that with the increasing importance of online data as a reference for the sky, astronomy is on the verge of a paradigm shift. Much of this online data is shared through the International Virtual Observatory Alliance, a consortium of international astronomical facilities that make their data freely available to researchers and scientists.
“SkyMapper will be a key component in the Virtual Observatory by providing coverage for the southern sky, allowing astronomers to cross-match objects seen in gamma-rays through optical to radio wavelengths and open new windows of exploration,” Keller says.
High Performance Data Transfer
Ben Evans, head of the ANU Supercomputer Facility and manager at the NCI National Facility, says that data transfer between SkyMapper and the NCI National Facility site is performed using GridFTP, which is designed to provide reliable, high-performance file transfer for grid computing applications.
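A GridFTP transfer of this kind is typically driven by the Globus Toolkit's `globus-url-copy` client, which can open several parallel TCP streams to fill a long-haul link. The sketch below builds such an invocation; the hostnames, paths, stream count and buffer size are illustrative assumptions, not the actual SkyMapper transfer parameters.

```python
import shlex

def gridftp_copy_command(src_url: str, dst_url: str, streams: int = 4) -> list:
    """Build a globus-url-copy invocation for a parallel GridFTP transfer.

    The URLs and tuning values are hypothetical; the real SkyMapper
    settings were not published in this article.
    """
    return [
        "globus-url-copy",
        "-p", str(streams),       # parallel TCP streams for throughput
        "-tcp-bs", "8388608",     # large TCP buffer for a long fat link
        src_url,
        dst_url,
    ]

cmd = gridftp_copy_command(
    "gsiftp://skymapper.example.edu.au/data/image001.fits",  # hypothetical host
    "gsiftp://nci.example.edu.au/archive/image001.fits",     # hypothetical host
)
print(shlex.join(cmd))
# A real transfer would then run the command, e.g. subprocess.run(cmd, check=True)
```

Parallel streams are the main lever GridFTP offers over plain FTP: a single TCP connection rarely saturates an inter-city research link, but several streams together can.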
To handle data replication, the NCI National Facility uses a modified version of the data replication techniques in the Globus Alliance’s Globus Toolkit to verify that a full data copy has been received in Canberra before the images created at its Siding Spring Observatory are deleted, Evans says.
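The safety property Evans describes, deleting the observatory's copy only once a verified full copy exists in Canberra, can be sketched with a checksum comparison. This is a minimal local illustration of the verify-before-delete idea, not the Globus Toolkit replication code itself.

```python
import hashlib
import os
import shutil
import tempfile

def checksum(path: str) -> str:
    """MD5 of a file, read in chunks so large images don't fill memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def replicate_then_delete(src: str, dst: str) -> bool:
    """Copy src to dst and delete src only after the copy verifies."""
    shutil.copyfile(src, dst)
    if checksum(src) == checksum(dst):
        os.remove(src)          # safe: a verified full copy exists
        return True
    os.remove(dst)              # copy is bad; keep the original
    return False

# Demo with a throwaway file standing in for a telescope image.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "image.fits")
    dst = os.path.join(d, "image_copy.fits")
    with open(src, "wb") as f:
        f.write(b"fake image data" * 1000)
    ok = replicate_then_delete(src, dst)
    print(ok, os.path.exists(src))  # True False
```

The key design point is the ordering: the delete happens only after an independent verification of the destination copy, so a failed or truncated transfer can never cost the only copy of an image.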
“What’s unique in Australia is us expanding the way in which we manage data,” Evans says. “Normally, we would just manage data in the local domain, but the grid software allows us to push our management technique right out to the instrument. That’s not typically how grid software is being used in other parts of the world.”
According to Evans, the decision to use open source software to manage the SkyMapper project's large volumes of data came down to simplicity and flexibility, and a lack of commercial software choices.
“We just wanted to adapt something that was already out there rather than develop our own,” he says. “There aren’t too many commercial software apps that do this and there is so much already available in the open source domain.”
Evans says the bulk of the SkyMapper data analysis will be done on a brand new, next generation Sun supercomputer kitted out with 12,000 cores. Due to be fully online by December, the supercomputer will offer a tenfold increase in performance over the facility’s current setup of two SGI machines with just under 3500 cores between them.
Along with processing data from SkyMapper, the new Sun machine will also be used for atmospheric and weather research as well as serving other high-performance computing needs around the country, Evans says.
Data will be hosted on a storage cloud located next to the supercomputer, allowing easy access for processing. The cloud is built on a hybrid of software and hardware including: Sun’s SAN QFS software to help manage the storage domain; virtualisation software from VMware; Linux as a core operating system; Solaris; and MySQL and PostgreSQL databases.