High-speed databases rev corporate apps
- 18 January, 2006 10:00
Relational database management systems have become all but ubiquitous in enterprise computing since 1970, when they were first devised by E.F. Codd. But as powerful and flexible as those databases are, they've proved inadequate for a handful of ultrademanding applications that have to process hundreds or thousands of transactions per second and never go down. Now, the very-high-performance database technologies that sprang up to serve these niche markets, such as options trading and telephone call processing, are poised to move into mainstream computing.
Some of the new products simply move the action from disk to memory, where access is a million times faster. Others are more radical departures from tradition, such as "streaming" technologies that store queries and pass data through them rather than run queries against stored data. Still others have found clever ways to sidestep much of the overhead -- such as table locking -- associated with the traditional RDBMS.
While some of these products do "store" data in memory-resident databases -- either relational or object-oriented -- the tools are primarily designed to speed transaction processing and analytics, not to act as data repositories.
Thanks for the memory
Interact, a Lincoln-based communications service provider, has for more than 10 years used the in-memory database capabilities of Hewlett-Packard Co.'s NonStop servers to do real-time pricing of incoming telephone calls. But the big, expensive computers were overkill for some Interact customers, such as small mobile telephony resellers, says Tom Massey, director of business development.
So 18 months ago, Interact began to offer a call-pricing service that runs on Linux and Unix servers and uses Oracle's TimesTen In-Memory database. "NonStop is big iron and more geared to larger operators," Massey says. "Linux and Unix platforms scale down much better, and operators often prefer them because they are not knowledgeable about NonStop."
Oracle acquired the TimesTen technology last June. Oracle saw the in-memory database as a way to extend its enterprise back-end data storage capabilities to high-performance real-time applications such as Interact's. Interact uses Oracle for back-end data storage as well and does not yet interface those databases with TimesTen, but Massey says he plans to do so.
"We need sub-10-millisecond response time, and you can't get that performance out of an Oracle relational database," says Ed McKee, director of applications at Interact. "To get that kind of performance, the amount of iron you'd have to have would be very significant."
On the other hand, he notes, the in-memory TimesTen product isn't suitable for large-scale data archiving. Interact can serve 1 million telephone subscribers with just 2GB of data in memory because only customer balance information is needed online.
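The sizing figure above implies a small per-subscriber memory budget. The following back-of-the-envelope sketch works it out, assuming "2GB" means 2 GiB and that memory is divided evenly across subscribers; both are assumptions for illustration, not details from Interact.

```python
# Back-of-the-envelope sizing for the figures quoted above.
SUBSCRIBERS = 1_000_000
MEMORY_BYTES = 2 * 1024**3  # assumption: "2GB" is read as 2 GiB


def bytes_per_subscriber(memory_bytes: int, subscribers: int) -> float:
    """Average in-memory budget per subscriber, assuming an even split."""
    return memory_bytes / subscribers


budget = bytes_per_subscriber(MEMORY_BYTES, SUBSCRIBERS)
print(f"about {budget:.0f} bytes per subscriber")
```

Roughly 2KB per subscriber is enough for a balance and a few account fields, but not for call history, which is consistent with keeping only balance information online.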
Aspect Software uses TimesTen for call center services. Traditional databases don't have the sub-500-millisecond call-routing capability it requires, says Chief Technology Officer Gary Barnett. Aspect's customers' call centers typically have big databases of customer history behind them but cache key information upfront, in memory, for near-instantaneous response to customer requests, he says.
But deciding what data to replicate forward, and how often, can be tricky, Barnett warns. "There's a trade-off. The more data you have [in memory], the more intelligent we can be in routing calls. But the more data in real time, the more expensive it is."
Some users choose a memory-resident product for its features and then gain high performance as a byproduct. For example, Interstate Hotels & Resorts chose the TM1 financial analysis tool from Applix Inc. in Westboro, Mass., because it was easy to use. "We wanted to consolidate all the hotels to close the books each month," says Paul Bushman, senior vice president for IT at the Arlington, Va.-based manager of more than 300 hotels. "But now we use it on a daily basis."
TM1, which Applix calls "the world's fastest business intelligence analytical engine," moves disk-resident data from Oracle databases in Interstate's accounting system into memory in an Excel spreadsheet format. From there, users without technical expertise can run what-if financial models as well as do financial rollups by a variety of user-specified criteria.
They can also perform online consolidations of the type more typically performed in month-end batch processes, Bushman says. "In the accounting system, to produce one financial statement for one hotel for one month could take 30 minutes in a relational database," he says. "But here we can do a consolidation of all 300 hotels in a couple of seconds."
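As a rough illustration of why an in-memory rollup can be so quick, the sketch below consolidates ledger rows by whatever fields a user picks in a single pass over data already held in memory. It is not how TM1 works internally; the `ledger` rows, field names and `rollup` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical ledger rows already loaded into memory from the accounting system.
ledger = [
    {"hotel": "H001", "region": "East", "account": "Rooms", "amount": 1200.0},
    {"hotel": "H001", "region": "East", "account": "Food", "amount": 300.0},
    {"hotel": "H002", "region": "East", "account": "Rooms", "amount": 900.0},
    {"hotel": "H003", "region": "West", "account": "Rooms", "amount": 1500.0},
]


def rollup(rows, by):
    """Sum 'amount' grouped by the user-chosen fields in `by`, in one pass."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[field] for field in by)
        totals[key] += row["amount"]
    return dict(totals)


by_region = rollup(ledger, ["region"])
by_region_account = rollup(ledger, ["region", "account"])
print(by_region)
print(by_region_account)
```

Changing the grouping criteria only changes the `by` list; no disk access or precomputed batch job is involved in this sketch.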
While most of these "real time" products achieve their scorching performance by moving data into memory, one high-performance product -- the StreamBase "stream-processing engine" from StreamBase Systems -- just grabs incoming data and analyzes it as it flies by.
StreamBase applications use an "inbound" query-processing model, in which records are processed before they're indexed and stored. The records flow through the query, which can also transform the data while it's moving.
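The inbound model can be sketched in a few lines: the query is set up first and stays put, and records are pushed through it one at a time, filtered and transformed in flight, with nothing indexed or stored beforehand. This is a generic illustration, not StreamBase's API; `standing_query`, `ticks` and the record fields are hypothetical.

```python
from typing import Callable, Iterable, Iterator

Record = dict


def standing_query(
    stream: Iterable[Record],
    predicate: Callable[[Record], bool],
    transform: Callable[[Record], Record],
) -> Iterator[Record]:
    """Yield transformed matches one at a time; the stream itself is never stored."""
    for record in stream:
        if predicate(record):
            yield transform(record)


def ticks() -> Iterator[Record]:
    # Hypothetical feed; a real source would be a socket or message queue.
    yield {"symbol": "ABC", "price": 10.0}
    yield {"symbol": "XYZ", "price": 99.5}
    yield {"symbol": "ABC", "price": 10.4}


query = standing_query(
    ticks(),
    predicate=lambda r: r["symbol"] == "ABC",
    transform=lambda r: {**r, "price_cents": round(r["price"] * 100)},
)
results = list(query)
print(results)
```

A traditional database inverts this arrangement: the data sits still in tables and each new query is run against it.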
Vision Systems & Technology (VSTI) is helping several defense and intelligence agencies evaluate StreamBase prototypes. StreamBase can filter torrents of incoming data -- structured or unstructured -- and decide on the fly which should be presented to an analyst at once, which can be stored for later queries and which can be discarded, says Carol Lundquist, an IT consultant at VSTI.
The technology can generate alerts when a passing record contains, for example, a certain name or phone number. "You can put keywords in an Oracle table, and anytime a keyword is added, it gets dumped down to StreamBase immediately," Lundquist says.
"Some government systems are being flooded with data," she explains. "The Oracle systems are having trouble keeping up, and you get data falling on the floor." One government system Lundquist worked on loaded 1 billion records in a day, she says.
Filtering can be the salvation for some of those systems, says Bryan Harris, CTO of VSTI. "The idea is to load the needles, not the haystack," he says.
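The triage Lundquist and Harris describe (alert now, store for later, or discard) might look something like the sketch below, with a watch list that can be updated while records are flowing. Everything here is hypothetical, including the `WatchList` and `triage` names and the word-count rule used to decide what is worth storing; real systems would apply far richer criteria.

```python
from enum import Enum


class Action(Enum):
    ALERT = "alert"      # present to an analyst at once
    STORE = "store"      # keep for later queries
    DISCARD = "discard"  # drop on the spot


class WatchList:
    """Keywords that can be added at any time while the stream is running."""

    def __init__(self):
        self._keywords = set()

    def add(self, keyword: str) -> None:
        self._keywords.add(keyword.lower())

    def matches(self, text: str) -> bool:
        words = set(text.lower().split())
        return not self._keywords.isdisjoint(words)


def triage(record: dict, watch_list: WatchList, min_words: int = 3) -> Action:
    """Decide a record's fate as it passes by."""
    text = record.get("text", "")
    if watch_list.matches(text):
        return Action.ALERT
    # Placeholder rule (an assumption): very short records are treated as noise.
    if len(text.split()) >= min_words:
        return Action.STORE
    return Action.DISCARD


watch = WatchList()
message = {"text": "call from smith tonight"}
before = triage(message, watch)   # no keywords yet
watch.add("Smith")                # keyword arrives mid-stream
after = triage(message, watch)
noise = triage({"text": "ok"}, watch)
print(before, after, noise)
```

Only the records that come back as alerts or stores ever reach the back-end database, which is the "needles, not the haystack" idea in miniature.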
Harris says streaming technologies may complement rather than compete with traditional back-end RDBMSs. And they are not necessarily an alternative to the in-memory products, either, he says. "If you are doing queries across many different IT systems, that introduces a lot of processing across the entire network," Harris says. "In-memory data caching, if done right, can greatly reduce the amount of system resources used in total. But it doesn't really address the streaming issue." Depending on its mix of applications, a company could benefit from a combination of back-end databases, in-memory databases and streaming technology, he says.
Another variation on the in-memory database theme comes from Ants Software in California. Because Ants Data Server is a SQL-compliant relational database, the company says, it is readily compatible with the major back-end databases, such as Microsoft's SQL Server, IBM's DB2 and Informix, and products from Oracle, Sybase and MySQL AB. And because it can reside on disk, in memory or both, there's no need to build an interface between different front- and back-end databases. The combination of these characteristics makes Ants easily scalable, the company says.
But Ants' major claim to fame is that, although it uses relational technology, it avoids almost all of the table row locking that can slow a traditional RDBMS to a crawl under heavy loads. According to Ants CEO Boyd Pearce, traditional RDBMSs fail under load because they aren't very clever at detecting when a real conflict is occurring and a lock is needed. "There are very few cases where you really need locks," he says.
Bellevue, Wash.-based Wireless Services Corp., which provides hosted data services to wireless carriers, chose Ants primarily because it provides an easy upgrade path from SQL Server. Small customers may reside entirely on a single SQL Server box, says CTO Curt Miller. When that system grows, it migrates to two or more SQL Server boxes. And when it reaches a certain point, Miller adds Ants servers for high-performance front-end message processing. Without Ants, the table-locking function at volumes much above 1,000 messages per minute "kills me," he says.
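The article does not say how Ants detects real conflicts. One well-known general approach to the idea Pearce describes is optimistic concurrency control, sketched below as a stand-in rather than as Ants' actual method: transactions read and compute without holding any lock, and a write succeeds only if the row's version has not changed in the meantime. The `VersionedStore` class and its methods are hypothetical.

```python
import threading


class VersionedStore:
    """In-memory key-value store using optimistic concurrency control."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        # Guards only the brief check-and-write step, not whole transactions.
        self._commit_guard = threading.Lock()

    def read(self, key):
        """Return (version, value); no lock is held after this returns."""
        return self._data.get(key, (0, None))

    def write_if_unchanged(self, key, expected_version, value) -> bool:
        """Commit only if nobody else has written the key since it was read."""
        with self._commit_guard:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                return False  # a real conflict: the caller must retry
            self._data[key] = (current_version + 1, value)
            return True


def add_to_balance(store, key, amount, max_retries=10):
    """Read, compute and commit; retry only when a conflict actually occurs."""
    for _ in range(max_retries):
        version, value = store.read(key)
        new_value = (value or 0) + amount
        if store.write_if_unchanged(key, version, new_value):
            return new_value
    raise RuntimeError("too many conflicts")


store = VersionedStore()
add_to_balance(store, "acct-1", 50)
version, _ = store.read("acct-1")
first = store.write_if_unchanged("acct-1", version, 70)  # commits
stale = store.write_if_unchanged("acct-1", version, 60)  # stale version, rejected
print(first, stale, store.read("acct-1"))
```

Writers touching different rows never wait on one another in this scheme; the cost is paid only when two of them collide on the same row.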
Because Ants supports the Open Database Connectivity standard, the migration is a snap. "Now if I want to use Ants, I just change the driver parameters to say, 'Talk to Ants,' " Miller says.
While the traditional back-end database will continue to be the choice of most users with big data repositories, the rise of multitier systems and the growing number of applications that process torrents of data seem likely to pull these newer technologies into the mainstream.
Jim Groff, a senior vice president at Oracle and the former CEO of TimesTen Inc., says two broad trends are driving the migration of in-memory databases from niche markets, such as securities trading, into more mainstream computing. The first, on the hardware side, is the rise of inexpensive 64-bit microprocessor architectures that lift the old limit of 2GB of physical memory available to Wintel applications.
The second, in software, is "the emergence of an intelligent middle tier of the enterprise architecture," Groff says. In the middle tier, "where application servers live, where middleware lives, where business activity monitoring and Web services live, that's where there's a tremendous amount of action in enterprise IT today."
And all that action entails a lot of data queries and exchanges among application servers, database servers and storage networks. "The scaling can't keep up; the back end can't keep up," Groff says. "So intelligently caching the right information in the middle tier is emerging as a key solution." Cached information could include key customer data pulled forward into a customer call center at the beginning of a call so that common questions can be answered without delays, he explains.
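The middle-tier caching Groff describes can be sketched as a read-through cache: the first lookup for a customer goes to the back end, and repeat lookups during the call are answered from memory until the entry expires. This is a minimal illustration and not TimesTen's interface; `ReadThroughCache`, `load_customer` and the 30-second lifetime are assumptions.

```python
import time


class ReadThroughCache:
    """Serve repeat lookups from memory; go to the back end only on a miss."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader    # function that fetches from the back end
        self._ttl = ttl_seconds  # how long an entry stays fresh
        self._clock = clock
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no back-end round trip
        value = self._loader(key)  # miss or expired: reload and remember
        self._entries[key] = (now + self._ttl, value)
        return value


backend_calls = []


def load_customer(customer_id):
    # Hypothetical stand-in for a query against the back-end database.
    backend_calls.append(customer_id)
    return {"id": customer_id, "balance": 42.0}


cache = ReadThroughCache(load_customer, ttl_seconds=30.0)
first = cache.get("c-1001")   # pulled forward at the start of the call
second = cache.get("c-1001")  # answered from memory
print(len(backend_calls))
```

The expiry time is where the trade-off lives: a longer lifetime spares the back end more queries but raises the odds of serving stale data.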
Streaming technology will move into supply chain systems when radio-frequency identification tags go from the pallet level to the individual item level and a warehouse or store generates huge volumes of product-movement transactions, says Mike Stonebraker, founder and CTO of StreamBase Systems. "Long term, the huge market will be in the area of sensor networks. Everything on the planet of material significance may be tagged," he says.
"I definitely think these products deserve to be more in the mainstream," says Curt A. Monash, a Computerworld columnist and president of Monash Information Services, an IT consultancy in Acton, Mass. But, he adds, a lot depends on how vendors position them and how hard big vendors like Oracle push them.
Says Monash, "These are products you'd buy for a limited group of applications, and for those, they can be very valuable. But they are not general-purpose systems."
The following companies offer high-performance databases:
Name: Ants Software URL: www.ants.com Product: Ants Data Server Claim to fame: SQL-compliant RDBMS resides in memory or on disk, or it spans both. Avoids most table-locking.
Name: Applix URL: www.applix.com Product: TM1 Claim to fame: Financial analysis/modeling in memory in Excel or Web client formats on data from back-end databases.
Name: Db4objects URL: www.db4o.com Product: db4o Claim to fame: Open-source object database for Java and .Net environments. No database administrator needed.
Name: GemStone Systems URL: www.gemstone.com Product: GemFire Enterprise Data Fabric Claim to fame: Data virtualization, distributed caching and complex event processing.
Name: Kx Systems URL: www.kx.com Product: kdb+ Claim to fame: Integrated RDBMS spans memory and disk for real-time streaming and back-end storage.
Name: Oracle URL: www.oracle.com Product: TimesTen In-Memory Claim to fame: In-memory RDBMS for embedded applications or front-end data caching.
Name: Progress Software URL: www.progress.com Product: ObjectStore ODBMS Claim to fame: Real-time object database management and modeling for Java and C++ environments.
Name: Skyler Technology URL: www.skylertech.com Product: Prime Processing Claim to fame: Real-time data-processing engine uses prime number theory for in-memory analytics for financial services.
Name: Solid Information Technology URL: www.solidtech.com Products: EmbeddedEngine and BoostEngine Claim to fame: Integrated in-memory and on-disk RDBMSs.
Name: StreamBase Systems URL: www.streambase.com Product: StreamBase Claim to fame: High-volume, real-time, memory-resident data-stream processing engine.
Name: Vhayu Technologies URL: www.vhayu.com Product: Velocity Claim to fame: Analysis of real-time streaming and historical securities market data.