Data detectives: How to take the guesswork out of data

IT pros know all too well the tediousness of collecting and preparing data for analysis. In fact, a recent survey found that 80 percent of data scientists’ time is spent preparing data for analysis. With the amount of data created and stored growing exponentially, it’s not so easy these days to play detective to identify patterns and links between data.

Enterprise data is straining at the seams and disparate systems are making the problem worse. In addition, people across the organisation are becoming more and more in tune with the power of leveraging data insights.

Most companies lack a single 360-degree view of their data, which can bring a wealth of benefits to any business group, including valuable and potentially revenue-generating insights into business processes or customer preferences.

The quickest route to overcoming barriers and simplifying data detective work is to address the key perpetrator: data silos.

For the majority of enterprises, most data sits in multiple, unconnected data silos, often a legacy of earlier departmental initiatives or the result of mergers and acquisitions. This has led to multiple copies of the same data being spread across silos, which threatens data integrity.

But there are ways to solve these data integration challenges and extract more value.

Solving the data conundrum

Because up to 80 percent of today’s enterprise data is unstructured or semi-structured – for example PDFs, online data, audio files, and video clips – it makes sense to build a central operational data hub that can handle all these different data types. An operational data hub not only acts as a virtual filing cabinet that creates a single, unified 360-degree view of all data, it also allows companies to ask complex questions of the data.

For data-rich enterprises, it is important to be able to integrate and model complex data to reveal new relationships, patterns and trends. An operational data hub provides integrated search and semantic capabilities that make it easier to discover these inferred facts and relationships.

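It also supports full enterprise-grade ACID (atomicity, consistency, isolation, durability) compliance. When a database is ACID compliant, each transaction either completes in full or not at all, so even the largest datasets are processed consistently and reliably and no committed data is corrupted or lost.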
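To make the all-or-nothing guarantee concrete, here is a minimal sketch in Python. It uses the standard library's SQLite module purely as a stand-in for any ACID-compliant database; the accounts table, the account names and the transfer function are hypothetical and are not taken from any particular data hub product.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two hypothetical accounts as one transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: both updates above are undone together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


# Illustrative in-memory data, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()
```

A transfer that would overdraw the source account leaves both balances exactly as they were, rather than applying only half of the change.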

Here are two real-world examples of how operational data hubs can be used to solve complex data integration and analysis challenges.

The case of investment banking

Banks need to ensure their trade data is high quality, accessible and searchable in order to mitigate risk and maintain compliance with regulatory imperatives. A leading global investment bank chose to build an operational data hub that provided a single unified view of derivatives trades and gave auditors and regulators a full audit trail. It replaced 20 Sybase databases with a single database, making trade information retrievable as well as actionable in real time.

As well as enhancing compliance reporting, it has dramatically reduced maintenance costs as the system is built on a commodity scale-out architecture. The result is a lower cost per trade – a key competitive differentiator for the company. The bank can also now develop and deploy new software and therefore launch new products in response to the market much more quickly.

The case of fraud detection

According to the Insurance Fraud Bureau of Australia, insurance fraud costs more than $2 billion annually. The problem is, most fraud is only noticed long after the crime has been committed and it is too expensive to claim lost monies back.

Current rates of fraud detection that rely solely on human expertise are at best 10 percent, and often far lower. By using an operational data hub, insurers can take advantage of the power of big data, semantics and inference to detect previously unknown fraudulent behaviour.

With a 360-degree view of data, it becomes possible to evaluate the claim and its context, and compare the claim with other similar transactions and previous claims in order to identify patterns.

By analysing data, assigning a risk score to each claim, alerting the right people in real time and delaying settlement of all high-scoring suspicious claims, insurers can take quick action.
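The score-then-hold workflow described above could be sketched roughly as follows. This is only an illustration: the claim fields, the three rules, the point weights and the review threshold are all assumptions invented for the example, and a real system would more likely derive its scores from semantic inference and historical patterns than from hand-written rules.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """A hypothetical, heavily simplified insurance claim."""
    claim_id: str
    amount: float
    days_since_policy_start: int
    prior_claims: int


REVIEW_THRESHOLD = 60  # assumed cut-off on a 0-100 scale


def risk_score(claim, typical_amount):
    """Add up points for each assumed warning sign; higher means riskier."""
    score = 0
    if claim.amount > 3 * typical_amount:   # unusually large claim
        score += 40
    if claim.days_since_policy_start < 30:  # claim soon after policy start
        score += 30
    if claim.prior_claims >= 3:             # frequent claimant
        score += 30
    return score


def triage(claims, typical_amount):
    """Split claim ids into those held for review and those cleared to pay."""
    hold, pay = [], []
    for claim in claims:
        if risk_score(claim, typical_amount) >= REVIEW_THRESHOLD:
            hold.append(claim.claim_id)  # delay settlement, alert an investigator
        else:
            pay.append(claim.claim_id)
    return hold, pay
```

Calling triage with a batch of claims and a typical claim amount returns two lists of claim ids: those whose settlement should be delayed pending review, and those that can be paid as normal.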

Data strategy breakthrough

Solving today’s complex data integration challenges will allow enterprises to take advantage of the power of big data, semantics and inference to gain those crucial and subtle insights needed to remain competitive in today’s fast-paced business environment.

The ability to easily sleuth for data gold and business-critical insights starts with having full access to all your data. Since the data itself – along with the technology to analyse it – will continue to grow and change, using an operational data hub will help enterprises more readily respond to market changes. Those that are prepared with this level of flexibility and agility will have a decisive competitive edge.

Tim Macdermid is area vice president, APAC, at MarkLogic.
