Transmitting data from the middle of nowhere

Deduplication is key to low-bandwidth transmissions

To migrate data from one location to another, Seth Georgion has had to devise creative ways to transmit it from remote locations, including ships in the Pacific Ocean and taverns in the Australian outback, over T1 lines and satellite links. Georgion, manager of IT for marine survey firm Fugro USA, also had to reduce how much he was transmitting by culling duplicate data, which accounted for as much as half of what he was sending.
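The payoff of that culling is easy to see with back-of-envelope arithmetic: a standard T1 line carries about 1.544Mbit/sec, so cutting the payload in half roughly halves the transfer time. The sketch below is illustrative only; the 100GB dataset size is a hypothetical figure, not one from Fugro.

```python
# Illustrative only: transfer time for a hypothetical 100GB dataset
# over a standard 1.544Mbit/sec T1 line, before and after removing
# the ~50% duplicate data described in the article.

T1_BITS_PER_SEC = 1_544_000   # standard T1 line rate

def transfer_days(gigabytes: float) -> float:
    bits = gigabytes * 8 * 1e9
    return bits / T1_BITS_PER_SEC / 86_400   # 86,400 seconds per day

print(f"Raw 100GB:            {transfer_days(100):.1f} days")   # ~6.0
print(f"Deduplicated (50GB):  {transfer_days(50):.1f} days")    # ~3.0
```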

In the geographical and marine survey business, mobility is key, Georgion said. Most of the time, that means setting up a full data center near each location being surveyed to process the terabytes of data collected by sonar and laser equipment. But for Fugro, with 250 offices in about 55 countries, building and staffing a data center for every job was extremely costly, and it made the company less competitive. His solution: servers that not only capture data but also deduplicate it and then replicate it over distance.
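Conceptually, deduplicate-then-replicate means fingerprinting blocks of data and shipping only the blocks the remote site hasn't already stored. The sketch below is a minimal illustration of that idea, not Data Domain's actual algorithm; commercial appliances typically use variable-length, content-defined segments rather than the fixed 4KB chunks assumed here.

```python
import hashlib

CHUNK = 4096  # fixed-size chunks, chosen here purely for simplicity

def chunks_to_replicate(path: str, remote_index: set[str]) -> list[bytes]:
    """Return only the chunks the remote site doesn't already hold."""
    to_send = []
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in remote_index:  # unseen data: ship it
                remote_index.add(fingerprint)
                to_send.append(chunk)
            # Duplicate chunk: send nothing; the remote side rebuilds
            # the file from fingerprints plus its existing chunk store.
    return to_send
```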

For example, Fugro is now performing the largest hydrographic survey in California's history, producing massive amounts of data about the state's entire coastline to help determine whether its fisheries are healthy. One type of sonar unit aboard a research ship records not only the seabed but also, at specific intervals, the entire water column from the surface to the bottom. The sonar unit streams data to the ship so fast that gigabit-speed network-attached storage (NAS) can't keep up.

A full water column scan shows every substance in the water, displaying the kinds of gases and fish that are there as well as what is occurring in the soil underneath. The amount of data from 100 feet of water can be tremendous.

"We have to use iSCSI just to get the throughput performance to write it from the sonar head to the disk - about 75MB/sec," said Georgion, who noted that the California coastal survey project has been under way for about five months and has another eight months to go.

For Fugro, which often produces seabed imagery for oil exploration companies such as Exxon, various military organizations and government conservation agencies, throughput is critical because the faster survey data is processed, the faster it can be acted upon. Fugro's data center in San Diego relies on six NAS arrays from NetApp as storage for processing hydrographic data. That data is backed up using IBM's Tivoli software. Georgion uses a StorageTek tape library equipped with LTO-3 and LTO-4 tape drives for archiving data.

But for fast collection, deduplication and transmission of data, Georgion said only one company's appliance fit the bill: Data Domain's DD120 Remote Office Data Protection Appliance.
