8 case studies and real-world examples of how Big Data has helped companies stay on top of the competition

Fast, data-informed decision-making can drive business success. Faced with high customer expectations, marketing challenges, and global competition, many organizations look to data analytics and business intelligence for a competitive advantage.

Serving up personalized ads based on browsing history, giving every employee contextual access to KPI data, and centralizing data from across the business into one digital ecosystem so processes can be reviewed more thoroughly are all examples of business intelligence at work.

Organizations invest in data science because it promises to bring competitive advantages.

Data is becoming an actionable asset, and new machine-learning tools are built to exploit it. As a result, organizations are on the brink of mobilizing data not only to predict the future but also to increase the likelihood of certain outcomes through prescriptive analytics.

Here are case studies that show how BI is making a difference for companies around the world:

1) Starbucks:

With 90 million transactions a week in 25,000 stores worldwide, the coffee giant is in many ways on the cutting edge of using big data and artificial intelligence to direct marketing, sales, and business decisions.

Through its popular loyalty card program and mobile application, Starbucks owns individual purchase data from millions of customers. Using this information and BI tools, the company predicts purchases and sends individual offers of what customers will likely prefer via their app and email. This system draws existing customers into its stores more frequently and increases sales volumes.

The same intel that helps Starbucks suggest new products to try also helps the company send personalized offers and discounts that go far beyond a special birthday discount. Additionally, a customized email goes out to any customer who hasn’t visited a Starbucks recently with enticing offers—built from that individual’s purchase history—to re-engage them.
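
The mechanics behind such a win-back campaign can be sketched very simply: find loyalty members who have not visited recently and pair each with the item they buy most often. The sketch below is a minimal, illustrative version of that logic; the table layout, column names, and 30-day threshold are assumptions, not Starbucks' actual system.

```python
# Hedged sketch: flag lapsed loyalty members and pair each with their
# most-purchased item for a win-back offer. All data is illustrative.
import pandas as pd

purchases = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "product":     ["latte", "latte", "mocha", "cold brew", "espresso"],
    "date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-03-01", "2024-03-02", "2024-03-10"]),
})

as_of = pd.Timestamp("2024-03-15")
last_visit = purchases.groupby("customer_id")["date"].max()
lapsed = last_visit[as_of - last_visit > pd.Timedelta(days=30)].index

# For each lapsed customer, find the product they buy most often.
favorites = (purchases[purchases["customer_id"].isin(lapsed)]
             .groupby("customer_id")["product"]
             .agg(lambda s: s.mode()[0]))
for cust, item in favorites.items():
    print(f"Send customer {cust} an offer on {item}")
```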

2) Netflix:

The online entertainment company’s 148 million subscribers give it a massive BI advantage.

Netflix has digitized its interactions with its subscribers. It collects data from each user and, with the help of data analytics, understands subscriber behavior and viewing patterns. It then leverages that information to recommend movies and TV shows tailored to each subscriber's tastes and preferences.

According to Netflix, around 80% of viewer activity is triggered by personalized algorithmic recommendations. Where Netflix gains an edge over its peers is that by collecting different data points, it creates detailed profiles of its subscribers, which help it engage with them better.

Netflix's recommendation system drives more than 80% of the content streamed by its subscribers, and is estimated to save the company around $1 billion a year through customer retention. As a result, Netflix doesn't have to invest heavily in advertising and marketing its shows: it already has a good estimate of how many people will be interested in watching a given show.
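
Netflix's production recommenders are proprietary, but the core idea of collaborative filtering ("viewers similar to you also watched...") can be sketched in a few lines. Everything below, from the toy watch-history matrix to the scoring rule, is an illustrative assumption rather than Netflix's actual algorithm.

```python
# A minimal item-based collaborative-filtering sketch (illustrative only).
import numpy as np

# Rows = viewers, columns = titles; values = implicit rating (0 = unseen).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
titles = ["Show A", "Show B", "Show C", "Show D"]

# Cosine similarity between every pair of title columns.
norms = np.linalg.norm(ratings, axis=0)
sim = (ratings.T @ ratings) / np.outer(norms, norms)

def recommend(viewer, k=2):
    seen = ratings[viewer] > 0
    # Score unseen titles by their similarity to what the viewer watched.
    scores = sim[:, seen] @ ratings[viewer, seen]
    scores[seen] = -np.inf  # never re-recommend something already seen
    best = np.argsort(scores)[::-1][:k]
    return [titles[i] for i in best if scores[i] > 0]

print(recommend(0))  # titles enjoyed by viewers with similar tastes
```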

3) Coca-Cola:

Coca-Cola is the world's largest beverage company, with over 500 soft drink brands sold in more than 200 countries. Given the size of its operations, Coca-Cola generates a substantial amount of data across its value chain – including sourcing, production, distribution, sales, and customer feedback – which it can leverage to drive successful business decisions.

Coca-Cola has been investing extensively in research and development, especially in AI, to better leverage the mountain of data it collects from customers all around the world. This initiative has helped it better understand consumer trends in terms of price, flavors, packaging, and consumers' preference for healthier options in certain regions.

With 35 million Twitter followers and a whopping 105 million Facebook fans, Coca-Cola benefits from its social media data. Using AI-powered image-recognition technology, it can track when photographs of its drinks are posted online. This data, paired with the power of BI, gives the company important insights into who is drinking its beverages, where they are, and why they mention the brand online. The information helps serve consumers more targeted advertising, which is four times more likely than a regular ad to result in a click.

Coca-Cola is increasingly betting on BI, data analytics, and AI to drive its strategic business decisions. From its innovative Freestyle fountain machine to new ways of engaging with customers, Coca-Cola is well equipped to stay at the top of the competition. In an increasingly dynamic digital world with changing customer behavior, Coca-Cola is relying on Big Data to gain and maintain its competitive advantage.

4) American Express GBT:

The American Express Global Business Travel company, popularly known as Amex GBT, is an American multinational travel and meetings program management corporation that operates in over 120 countries and has over 14,000 employees.

Challenges:

Scalability – Creating a single portal for around 945 separate data files from internal and customer systems using the incumbent BI tool would have taken over six months. That tool had been built for internal use, and scaling it to such a large user base while keeping costs optimal was a major challenge.

Performance – The existing system had limitations in shifting to the cloud; the amount of time and manual effort required was immense.

Data Governance – Maintaining user data security and privacy was of utmost importance for Amex GBT.

The company was looking to protect and increase its market share by differentiating its core services, and was seeking a resource to manage and drive its online travel program capabilities forward. Amex GBT decided to make a strategic investment in smart analytics around its booking software.

The solution equips users to view their travel ROI across three categories – cost, time, and value – each with individual KPIs that are measured to evaluate the performance of a travel plan.
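
As a rough illustration of how KPIs grouped into cost, time, and value might roll up into a travel ROI view, here is a minimal sketch; the KPI names, the [0, 1] normalization, and the equal weighting are assumptions, not Amex GBT's actual metrics.

```python
# Hedged sketch: rolling illustrative KPIs up into category scores.
from dataclasses import dataclass

@dataclass
class TravelKPIs:
    cost: dict   # each KPI already normalized to [0, 1], 1 = best
    time: dict
    value: dict

def category_score(kpis: dict) -> float:
    # Category score is the simple mean of its KPIs (assumed weighting).
    return sum(kpis.values()) / len(kpis)

plan = TravelKPIs(
    cost={"airfare_vs_benchmark": 0.85, "hotel_vs_benchmark": 0.70},
    time={"booking_lead_time": 0.60, "itinerary_efficiency": 0.90},
    value={"policy_compliance": 0.95, "traveler_satisfaction": 0.80},
)
for name, kpis in [("cost", plan.cost), ("time", plan.time), ("value", plan.value)]:
    print(name, round(category_score(kpis), 2))
```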

Results:

Cost Reduction – Travel expenses were reduced by 30%.

Time to Value – Initially it took a week for new users to be onboarded onto the platform. With Premier Insights, that time has been reduced to a single day, and the process has become much simpler and more effective.

Savings on Spend – The product notifies users of any available booking offers that can help them save on their expenditure, pointing out saving opportunities such as better flight timings, booking dates, and travel dates.

Adoption – The product's ease of use, quick scale-up, real-time reports, and interactive dashboards increased global online adoption of Premier Insights at Amex GBT.

5) Airline Solutions Company: BI Accelerates Business Insights

Airline Solutions provides booking tools, revenue management, web, and mobile itinerary tools, as well as other technology, for airlines, hotels and other companies in the travel industry.

Challenge: The travel industry is remarkably dynamic and fast-paced, and the airline solution provider's clients needed advanced tools that could provide real-time data on customer behavior and actions.

Solution: The company developed an enterprise travel data warehouse (ETDW) to hold its enormous amounts of data. Executive dashboards provide near real-time insights in user-friendly environments, with a 360-degree overview of business health, reservations, operational performance, and ticketing.

Results: The scalable infrastructure, graphic user interface, data aggregation and ability to work collaboratively have led to more revenue and increased client satisfaction.

6) A specialty US Retail Provider: Leveraging prescriptive analytics

Challenge/Objective: A specialty US retail provider wanted to modernize its data platform to support real-time decision-making and prescriptive analytics. It wanted to discover the true value of the data generated by its multiple systems and understand the patterns, both known and unknown, in sales, operations, and omnichannel retail performance.

Solution: We helped build a modern data solution that consolidated their data in a data lake and data warehouse, making it easier to extract value in real time. We integrated our solution with their OMS, CRM, Google Analytics, Salesforce, and inventory management system. The data was modeled so that it could be fed into machine-learning algorithms and leveraged easily in the future.

Results: The customer had visibility into their data from day one, something they had wanted for a long time. They were also able to build more reports, dashboards, and charts to understand and interpret the data. In some cases, they gained real-time visibility and analysis of in-store purchases by geography.

7) Logistics startup with an objective to become the “Uber of the Trucking Sector” with the help of data analytics

Challenge: A startup specializing in analyzing vehicle and driver performance, using data collected from in-vehicle sensors (vehicle telemetry) and order patterns, set out to become the “Uber of the Trucking Sector.”

Solution: We developed a customized backend for the client's trucking platform so that they could monetize transporters' empty return trips by creating a marketplace for them (a simplified matching sketch follows the results below). The approach used a combination of an AWS data lake, AWS microservices, machine learning, and analytics.

Results:

  • Reduced fuel costs
  • Optimized reloads
  • More accurate driver and truck schedule planning
  • Smarter routing
  • Fewer empty return trips
  • Deeper analysis of driver patterns, breaks, routes, etc.
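
As a rough illustration of the marketplace idea referenced above, the sketch below greedily matches trucks on empty return legs to nearby pending loads; the coordinates, names, and nearest-first heuristic are illustrative assumptions rather than the production AWS/ML system.

```python
# Hedged sketch: greedy matching of empty return legs to pending loads.
from math import dist

# (x, y) grid coordinates stand in for real geo-coordinates.
empty_returns = {"truck_1": (0, 0), "truck_2": (5, 5)}
pending_loads = {"load_a": (1, 1), "load_b": (6, 4), "load_c": (9, 9)}

assignments = {}
available = dict(pending_loads)
for truck, pos in empty_returns.items():
    if not available:
        break
    # Assign the nearest unclaimed load to each returning truck.
    best = min(available, key=lambda load: dist(pos, available[load]))
    assignments[truck] = best
    del available[best]

print(assignments)  # e.g. {'truck_1': 'load_a', 'truck_2': 'load_b'}
```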

8) Niche segment customer: Becoming a “Niche Segment Leader”

Challenge/Objective: A niche segment customer competing against market behemoths was looking to become a “Niche Segment Leader.”

Solution: We developed a customized analytics platform that ingests CRM, OMS, e-commerce, and inventory data and produces both real-time and batch analytics on an AI-enabled platform. The approach used a combination of AWS microservices, machine learning, and analytics.

Results:

  • Reduced customer churn
  • Optimized order fulfillment
  • More accurate demand planning
  • Improved product recommendations
  • Improved last-mile delivery

How can we help you harness the power of data?

At Systems Plus, our BI and analytics specialists help you leverage data to understand trends and derive insights by streamlining the searching, merging, and querying of data. From improving your CX and employee performance to predicting new revenue streams, our BI and analytics expertise helps you make data-driven decisions that save costs and take your growth to the next level.


How companies are using big data and analytics

Few dispute that organizations have more data than ever at their disposal. But actually deriving meaningful insights from that data—and converting knowledge into action—is easier said than done. We spoke with six senior leaders from major organizations and asked them about the challenges and opportunities involved in adopting advanced analytics: Murli Buluswar, chief science officer at AIG; Vince Campisi, chief information officer at GE Software; Ash Gupta, chief risk officer at American Express; Zoher Karu, vice president of global customer optimization and data at eBay; Victor Nilson, senior vice president of big data at AT&T; and Ruben Sigala, chief analytics officer at Caesars Entertainment. An edited transcript of their comments follows.

Interview transcript

Challenges organizations face in adopting analytics

Murli Buluswar, chief science officer, AIG: The biggest challenge of making the evolution from a knowing culture to a learning culture—from a culture that largely depends on heuristics in decision making to a culture that is much more objective and data driven and embraces the power of data and technology—is really not the cost. Initially, it largely ends up being imagination and inertia.

What I have learned in my last few years is that the power of fear is quite tremendous in evolving oneself to think and act differently today, and to ask questions today that we weren’t asking about our roles before. And it’s that mind-set change—from an expert-based mind-set to one that is much more dynamic and much more learning oriented, as opposed to a fixed mind-set—that I think is fundamental to the sustainable health of any company, large, small, or medium.

Ruben Sigala, chief analytics officer, Caesars Entertainment: What we found challenging, and what I find in my discussions with a lot of my counterparts that is still a challenge, is finding the set of tools that enable organizations to efficiently generate value through the process. I hear about individual wins in certain applications, but having a more sort of cohesive ecosystem in which this is fully integrated is something that I think we are all struggling with, in part because it’s still very early days. Although we’ve been talking about it seemingly quite a bit over the past few years, the technology is still changing; the sources are still evolving.

Zoher Karu, vice president, global customer optimization and data, eBay: One of the biggest challenges is around data privacy and what is shared versus what is not shared. And my perspective on that is consumers are willing to share if there’s value returned. One-way sharing is not going to fly anymore. So how do we protect and how do we harness that information and become a partner with our consumers rather than kind of just a vendor for them?

Capturing impact from analytics

Ruben Sigala: You have to start with the charter of the organization. You have to be very specific about the aim of the function within the organization and how it’s intended to interact with the broader business. There are some organizations that start with a fairly focused view around support on traditional functions like marketing, pricing, and other specific areas. And then there are other organizations that take a much broader view of the business. I think you have to define that element first.

That helps best inform the appropriate structure, the forums, and then ultimately it sets the more granular levels of operation such as training, recruitment, and so forth. But alignment around how you’re going to drive the business and the way you’re going to interact with the broader organization is absolutely critical. From there, everything else should fall in line. That’s how we started with our path.

Vince Campisi, chief information officer, GE Software: One of the things we’ve learned is when we start and focus on an outcome, it’s a great way to deliver value quickly and get people excited about the opportunity. And it’s taken us to places we haven’t expected to go before. So we may go after a particular outcome and try and organize a data set to accomplish that outcome. Once you do that, people start to bring other sources of data and other things that they want to connect. And it really takes you in a place where you go after a next outcome that you didn’t anticipate going after before. You have to be willing to be a little agile and fluid in how you think about things. But if you start with one outcome and deliver it, you’ll be surprised as to where it takes you next.


The need to lead in data and analytics

Ash Gupta, chief risk officer, American Express: The first change we had to make was just to make our data of higher quality. We have a lot of data, and sometimes we just weren’t using that data and we weren’t paying as much attention to its quality as we now need to. That was, one, to make sure that the data has the right lineage, that the data has the right permissible purpose to serve the customers. This, in my mind, is a journey. We made good progress and we expect to continue to make this progress across our system.

The second area is working with our people and making certain that we are centralizing some aspects of our business. We are centralizing our capabilities and we are democratizing its use. I think the other aspect is that we recognize as a team and as a company that we ourselves do not have sufficient skills, and we require collaboration across all sorts of entities outside of American Express. This collaboration comes from technology innovators, it comes from data providers, it comes from analytical companies. We need to put a full package together for our business colleagues and partners so that it’s a convincing argument that we are developing things together, that we are colearning, and that we are building on top of each other.

Examples of impact

Victor Nilson, senior vice president, big data, AT&T: We always start with the customer experience. That’s what matters most. In our customer care centers now, we have a large number of very complex products. Even the simple products sometimes have very complex potential problems or solutions, so the workflow is very complex. So how do we simplify the process for both the customer-care agent and the customer at the same time, whenever there’s an interaction?

We’ve used big data techniques to analyze all the different permutations to augment that experience to more quickly resolve or enhance a particular situation. We take the complexity out and turn it into something simple and actionable. Simultaneously, we can then analyze that data and then go back and say, “Are we optimizing the network proactively in this particular case?” So, we take the optimization not only for the customer care but also for the network, and then tie that together as well.

Vince Campisi: I’ll give you one internal perspective and one external perspective. One is we are doing a lot in what we call enabling a digital thread—how you can connect innovation through engineering, manufacturing, and all the way out to servicing a product. [For more on the company’s “digital thread” approach, see “ GE’s Jeff Immelt on digitizing in the industrial space .”] And, within that, we’ve got a focus around brilliant factory. So, take driving supply-chain optimization as an example. We’ve been able to take over 60 different silos of information related to direct-material purchasing, leverage analytics to look at new relationships, and use machine learning to identify tremendous amounts of efficiency in how we procure direct materials that go into our product.

An external example is how we leverage analytics to really make assets perform better. We call it asset performance management. And we’re starting to enable digital industries, like a digital wind farm, where you can leverage analytics to help the machines optimize themselves. So you can help a power-generating provider who uses the same wind that’s come through and, by having the turbines pitch themselves properly and understand how they can optimize that level of wind, we’ve demonstrated the ability to produce up to 10 percent more production of energy off the same amount of wind. It’s an example of using analytics to help a customer generate more yield and more productivity out of their existing capital investment.

Winning the talent war

Ruben Sigala: Competition for analytical talent is extreme. And preserving and maintaining a base of talent within an organization is difficult, particularly if you view this as a core competency. What we’ve focused on mostly is developing a platform that speaks to what we think is a value proposition that is important to the individuals who are looking to begin a career or to sustain a career within this field.

When we talk about the value proposition, we use terms like having an opportunity to truly affect the outcomes of the business, to have a wide range of analytical exercises that you’ll be challenged with on a regular basis. But, by and large, to be part of an organization that views this as a critical part of how it competes in the marketplace—and then to execute against that regularly. In part, and to do that well, you have to have good training programs, you have to have very specific forms of interaction with the senior team. And you also have to be a part of the organization that actually drives the strategy for the company.

Murli Buluswar: I have found that focusing on the fundamentals of why science was created, what our aspirations are, and how being part of this team will shape the professional evolution of the team members has been pretty profound in attracting the caliber of talent that we care about. And then, of course, comes the even harder part of living that promise on a day-in, day-out basis.

Yes, money is important. My philosophy on money is I want to be in the 75th percentile range; I don’t want to be in the 99th percentile. Because no matter where you are, most people—especially people in the data-science function—have the ability to get a 20 to 30 percent increase in their compensation, should they choose to make a move. My intent is not to try and reduce that gap. My intent is to create an environment and a culture where they see that they’re learning; they see that they’re working on problems that have a broader impact on the company, on the industry, and, through that, on society; and they’re part of a vibrant team that is inspired by why it exists and how it defines success. Focusing on that, to me, is an absolutely critical enabler to attracting the caliber of talent that I need and, for that matter, anyone else would need.

Developing the right expertise

Victor Nilson: Talent is everything, right? You have to have the data, and, clearly, AT&T has a rich wealth of data. But without talent, it’s meaningless. Talent is the differentiator. The right talent will go find the right technologies; the right talent will go solve the problems out there.

We’ve helped contribute in part to the development of many of the new technologies that are emerging in the open-source community. We have the legacy advanced techniques from the labs, we have the emerging Silicon Valley. But we also have mainstream talent across the country, where we have very advanced engineers, we have managers of all levels, and we want to develop their talent even further.

So we’ve delivered over 50,000 big data related training courses just this year alone. And we’re continuing to move forward on that. It’s a whole continuum. It might be just a one-week boot camp, or it might be advanced, PhD-level data science. But we want to continue to develop that talent for those who have the aptitude and interest in it. We want to make sure that they can develop their skills and then tie that together with the tools to maximize their productivity.

Zoher Karu: Talent is critical along any data and analytics journey. And analytics talent by itself is no longer sufficient, in my opinion. We cannot have people with singular skills. And the way I build out my organization is I look for people with a major and a minor. You can major in analytics, but you can minor in marketing strategy. Because if you don’t have a minor, how are you going to communicate with other parts of the organization? Otherwise, the pure data scientist will not be able to talk to the database administrator, who will not be able to talk to the market-research person, who will not be able to talk to the email-channel owner, for example. You need to make sound business decisions, based on analytics, that can scale.

Murli Buluswar is chief science officer at AIG, Vince Campisi is chief information officer at GE Software, Ash Gupta is chief risk officer at American Express, Zoher Karu is vice president of global customer optimization and data at eBay, Victor Nilson is senior vice president of big data at AT&T, and Ruben Sigala is chief analytics officer at Caesars Entertainment.


A growing number of enterprises are pooling terabytes and petabytes of data, but many of them are grappling with ways to apply their big data as it grows. 

How can companies determine what big data solutions will work best for their industry, business model, and specific data science goals? 

Check out these big data enterprise case studies from some of the top big data companies and their clients to learn about the types of solutions that exist for big data management.

Enterprise case studies

  • Netflix on AWS
  • AccuWeather on Microsoft Azure
  • China Eastern Airlines on Oracle Cloud
  • Etsy on Google Cloud
  • mLogica on SAP HANA Cloud


Netflix is one of the largest media and technology enterprises in the world, with thousands of shows that it hosts for streaming as well as its growing media production division. Netflix stores billions of data sets in its systems related to audiovisual data, consumer metrics, and recommendation engines. The company required a solution that would allow it to store, manage, and optimize viewers' data. As its studio has grown, Netflix also needed a platform that would enable quicker and more efficient collaboration on projects.

“Amazon Kinesis Streams processes multiple terabytes of log data each day. Yet, events show up in our analytics in seconds,” says John Bennett, senior software engineer at Netflix. 

“We can discover and respond to issues in real-time, ensuring high availability and a great customer experience.”
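
For context, writing log events into an Amazon Kinesis data stream from Python looks roughly like the sketch below; the stream name, region, and event payload are illustrative assumptions, and Netflix's actual pipeline is far larger and more elaborate.

```python
# Hedged sketch: pushing a log event into a Kinesis data stream with boto3.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

event = {"device_id": "tv-123", "action": "play", "title_id": 42}
kinesis.put_record(
    StreamName="playback-events",          # assumed stream name
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["device_id"],       # keeps a device's events ordered
)
```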

Industries: Entertainment, media streaming

Use cases: Computing power, storage scaling, database and analytics management, recommendation engines powered through AI/ML, video transcoding, cloud collaboration space for production, traffic flow processing, scaled email and communication capabilities

  • Now using over 100,000 server instances on AWS for different operational functions
  • Used AWS to build a studio in the cloud for content production that improves collaborative capabilities
  • Produced entire seasons of shows via the cloud during COVID-19 lockdowns
  • Scaled and optimized mass email capabilities with Amazon Simple Email Service (Amazon SES)
  • Netflix’s Amazon Kinesis Streams-based solution now processes billions of traffic flows daily

Read the full Netflix on AWS case study here.

AccuWeather is one of the oldest and most trusted providers of weather forecast data. The company provides an API that other businesses can use to embed its weather content into their own systems. AccuWeather wanted to move its data processes to the cloud. However, the traditional GRIB 2 data format for weather data is not supported by most data management platforms. With Microsoft Azure, Azure Data Lake Storage, and Azure Databricks (AI), AccuWeather found a solution that converts the GRIB 2 data, analyzes it in more depth than before, and stores it in a scalable way.
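
A minimal sketch of the format-conversion step might look like the following, using the open-source cfgrib engine for xarray; the file name, the t2m (2-meter temperature) variable, and Parquet as the target format are assumptions, not AccuWeather's actual pipeline.

```python
# Hedged sketch: converting GRIB2 weather data into a columnar format
# that downstream analytics engines can consume.
import xarray as xr

ds = xr.open_dataset("forecast.grib2", engine="cfgrib")  # requires cfgrib
df = ds["t2m"].to_dataframe().reset_index()  # 2 m temperature, if present
df.to_parquet("forecast.parquet")            # Parquet suits data-lake storage
```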

“With some types of severe weather forecasts, it can be a life-or-death scenario,” says Christopher Patti, CTO at AccuWeather. 

“With Azure, we’re agile enough to process and deliver severe weather warnings rapidly and offer customers more time to respond, which is important when seconds count and lives are on the line.”

Industries: Media, weather forecasting, professional services

Use cases: Making legacy and traditional data formats usable for AI-powered analysis, API migration to Azure, data lakes for storage, more precise reporting and scaling

  • GRIB 2 weather data made operational for AI-powered next-generation forecasting engine, via Azure Databricks
  • Delta lake storage layer helps to create data pipelines and more accessibility
  • Improved speed, accuracy, and localization of forecasts via machine learning
  • Real-time measurement of API key usage and performance
  • Ability to extract weather-related data from smart-city systems and self-driving vehicles

Read the full AccuWeather on Microsoft Azure case study here.

China Eastern Airlines, one of the largest airlines in the world, is working to improve safety, efficiency, and overall customer experience through big data analytics. With Oracle's cloud setup and a large portfolio of analytics tools, it now has access to more in-flight, aircraft, and customer metrics.

“By processing and analyzing over 100 TB of complex daily flight data with Oracle Big Data Appliance, we gained the ability to easily identify and predict potential faults and enhanced flight safety,” says Wang Xuewu, head of China Eastern Airlines’ data lab.  

“The solution also helped to cut fuel consumption and increase customer experience.”

Industries: Airline, travel, transportation

Use cases: Increased flight safety and fuel efficiency, reduced operational costs, big data analytics

  • Optimized big data analysis to analyze flight angle, take-off speed, and landing speed, maximizing predictive analytics for engine and flight safety
  • Multi-dimensional analysis on over 60 attributes provides advanced metrics and recommendations to improve aircraft fuel use
  • Advanced spatial analytics on the travelers’ experience, with metrics covering in-flight cabin service, baggage, ground service, marketing, flight operation, website, and call center
  • Using Oracle Big Data Appliance to integrate Hadoop data from aircraft sensors, unifying and simplifying the process for evaluating device health across an aircraft
  • Central interface for daily management of real-time flight data

Read the full China Eastern Airlines on Oracle Cloud case study here.

Etsy is an e-commerce site for independent artisan sellers. With its goal to create a buying and selling space that puts the individual first, Etsy wanted to advance its platform to the cloud to keep up with needed innovations. But it didn’t want to lose the personal touches or values that drew customers in the first place. Etsy chose Google for cloud migration and big data management for several primary reasons: Google’s advanced features that back scalability, its commitment to sustainability, and the collaborative spirit of the Google team.

Mike Fisher, CTO at Etsy, explains how Google’s problem-solving approach won them over. 

“We found that Google would come into meetings, pull their chairs up, meet us halfway, and say, ‘We don’t do that, but let’s figure out a way that we can do that for you.'”

Industries: Retail, E-commerce

Use cases: Data center migration to the cloud, accessing collaboration tools, leveraging machine learning (ML) and artificial intelligence (AI), sustainability efforts

  • 5.5 petabytes of data migrated from existing data center to Google Cloud
  • >50% savings in compute energy, minimizing total carbon footprint and energy usage
  • 42% reduced compute costs and improved cost predictability through virtual machine (VM), solid state drive (SSD), and storage optimizations
  • Democratization of cost data for Etsy engineers
  • 15% of Etsy engineers moved from system infrastructure management to customer experience, search, and recommendation optimization

Read the full Etsy on Google Cloud case study here.

mLogica is a technology and product consulting firm that wanted to move to the cloud in order to better support its customers' big data storage and analytics needs. While retaining its existing data analytics platform, CAP*M, mLogica relied on SAP HANA Cloud to move from on-premises infrastructure to a more scalable cloud structure.

“More and more of our clients are moving to the cloud, and our solutions need to keep pace with this trend,” says Michael Kane, VP of strategic alliances and marketing at mLogica.

“With CAP*M on SAP HANA Cloud, we can future-proof clients’ data setups.”

Industry: Professional services

Use cases: Manage growing pools of data from multiple client accounts, improve slow upload speeds for customers, move to the cloud to avoid maintenance of on-premises infrastructure, integrate the company’s existing big data analytics platform into the cloud

  • SAP HANA Cloud launched as the cloud platform for CAP*M, mLogica’s big data analytics tool, to improve scalability
  • Data analysis now enabled on a petabyte scale
  • Simplified database administration and eliminated additional hardware and maintenance needs
  • Increased control over total cost of ownership
  • Migrated existing customer data setups through SAP IQ into SAP HANA, without having to adjust those setups for a successful migration

Read the full mLogica on SAP HANA Cloud case study here.



A new initiative at UPS will use real-time data, advanced analytics and artificial intelligence to help employees make better decisions.

As chief information and engineering officer for logistics giant UPS, Juan Perez is placing analytics and insight at the heart of business operations.


"Big data at UPS takes many forms because of all the types of information we collect," he says. "We're excited about the opportunity of using big data to solve practical business problems. We've already had some good experience of using data and analytics and we're very keen to do more."

Perez says UPS is using technology to improve its flexibility, capability, and efficiency, and that the right insight at the right time helps line-of-business managers to improve performance.

The aim for UPS, says Perez, is to use the data it collects to optimise processes, to enable automation and autonomy, and to continue to learn how to improve its global delivery network.

Leading data-fed projects that change the business for the better

Perez says one of his firm's key initiatives, known as Network Planning Tools, will help UPS to optimise its logistics network through the effective use of data. The system will use real-time data, advanced analytics and artificial intelligence to help employees make better decisions. The company expects to begin rolling out the initiative from the first quarter of 2018.

"That will help all our business units to make smart use of our assets and it's just one key project that's being supported in the organisation as part of the smart logistics network," says Perez, who also points to related and continuing developments in Orion (On-road Integrated Optimization and Navigation), which is the firm's fleet management system.

Orion uses telematics and advanced algorithms to create optimal routes for delivery drivers. The IT team is currently working on the third version of the technology, and Perez says this latest update to Orion will provide two key benefits to UPS.

First, the technology will include higher levels of route optimisation which will be sent as navigation advice to delivery drivers. "That will help to boost efficiency," says Perez.

Second, Orion will use big data to optimise delivery routes dynamically.

"Today, Orion creates delivery routes before drivers leave the facility and they stay with that static route throughout the day," he says. "In the future, our system will continually look at the work that's been completed, and that still needs to be completed, and will then dynamically optimise the route as drivers complete their deliveries. That approach will ensure we meet our service commitments and reduce overall delivery miles."

Once Orion is fully operational for more than 55,000 drivers this year, it will lead to a reduction of about 100 million delivery miles -- and 100,000 metric tons of carbon emissions. Perez says these reductions represent a key measure of business efficiency and effectiveness, particularly in terms of sustainability.

Projects such as Orion and Network Planning Tools form part of a collective of initiatives that UPS is using to improve decision making across the package delivery network. The firm, for example, recently launched the third iteration of its chatbot that uses artificial intelligence to help customers find rates and tracking information across a series of platforms, including Facebook and Amazon Echo.

"That project will continue to evolve, as will all our innovations across the smart logistics network," says Perez. "Everything runs well today but we also recognise there are opportunities for continuous improvement."

Overcoming business challenges to make the most of big data

"Big data is all about the business case -- how effective are we as an IT team in defining a good business case, which includes how to improve our service to our customers, what is the return on investment and how will the use of data improve other aspects of the business," says Perez.

These alternative use cases are not always at the forefront of executive thinking. Consultancy McKinsey says too many organisations drill down on a single data set in isolation and fail to consider what different data sets mean for other parts of the business.

However, Perez says the re-use of information can have a significant impact at UPS. Perez talks, for example, about using delivery data to help understand what types of distribution solutions work better in different geographical locations.

"Should we have more access points? Should we introduce lockers? Should we allow drivers to release shipments without signatures? Data, technology, and analytics will improve our ability to answer those questions in individual locations -- and those benefits can come from using the information we collect from our customers in a different way," says Perez.

Perez says this fresh, open approach creates new opportunities for other data-savvy CIOs. "The conversation in the past used to be about buying technology, creating a data repository and discovering information," he says. "Now the conversation is changing and it's exciting. Every time we talk about a new project, the start of the conversation includes data."

By way of an example, Perez says senior individuals across the organisation now talk as a matter of course about the potential use of data in their line-of-business and how that application of insight might be related to other models across the organisation.

These senior executives, he says, also ask about the availability of information and whether the existence of data in other parts of the business will allow the firm to avoid a duplication of effort.

"The conversation about data is now much more active," says Perez. "That higher level of collaboration provides benefits for everyone because the awareness across the organisation means we'll have better repositories, less duplication and much more effective data models for new business cases in the future."


26 Big Data Use Cases and Examples for Business

Below, the article covers the importance of Big Data, the benefits of Big Data use cases, and use cases in business (customer analytics and marketing, fraud detection and prevention, supply chain optimization, predictive maintenance, operational efficiency, risk management and mitigation), healthcare (electronic health records, clinical decision-making, disease surveillance and prevention, personalized medicine, drug discovery and development), finance (credit risk analysis and management, trading and portfolio optimization, regulatory compliance), government (law enforcement and public safety, environmental monitoring and management, disaster response and recovery, social program management and optimization, transportation and traffic management), and science and research (astronomy and cosmology, genomics and bioinformatics, climate science, particle physics, social science and humanities), before closing with the challenges, limitations, and best practices of implementing Big Data use cases.

Big Data is a term that refers to the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis. The use of Big Data has become a vital component of business strategy as organizations seek to harness the enormous amount of data generated by various sources. Big Data has become essential for businesses to gain a competitive advantage, make better-informed decisions, and create new products and services.

Big Data has become more essential in today's world than ever due to the explosion of data from various sources, including social media, sensors, and devices. The ability to collect and analyze large volumes of data has become a critical factor in decision-making, innovation, and growth. Big Data helps businesses to:

  • Understand customer needs and preferences
  • Improve operational efficiency
  • Reduce costs
  • Identify new market opportunities
  • Manage risks
  • Create new revenue streams

Big Data use cases provide several benefits for organizations. These include:

  • Improved decision-making: Big Data provides businesses with insights to help them make better decisions.
  • Enhanced customer experiences: Big Data helps organizations to understand customer needs and preferences, enabling them to offer more personalized experiences.
  • Improved operational efficiency: Big Data helps organizations to streamline their operations, optimize supply chains, and reduce costs.
  • Increased revenue: Big Data enables organizations to identify new market opportunities, create new products and services, and drive revenue growth.

Businesses are among the most prominent users of Big Data. There are several ways in which organizations use Big Data to achieve their business objectives. Some of the most common Big Data use cases in business include:

Big Data helps organizations to analyze customer behavior, preferences, and purchasing patterns. This enables businesses to offer personalized marketing and sales campaigns that increase customer engagement and loyalty; a minimal segmentation sketch follows the list below. With Big Data, companies can:

  • Identify high-value customers
  • Improve customer retention
  • Increase sales
  • Enhance customer experiences
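
One common way to surface high-value customers from purchase data is an RFM (recency, frequency, monetary) segmentation; the minimal sketch below illustrates the idea, with an invented transaction table and an ad hoc scoring rule.

```python
# Hedged sketch: RFM scoring to rank customers by value. Data is invented.
import pandas as pd

tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "amount":      [20, 35, 120, 15, 25, 30],
    "date": pd.to_datetime(["2024-03-01", "2024-03-20", "2024-02-10",
                            "2024-03-18", "2024-03-19", "2024-03-21"]),
})
as_of = pd.Timestamp("2024-03-22")

rfm = tx.groupby("customer_id").agg(
    recency=("date", lambda s: (as_of - s.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)
# Percentile-rank each dimension (higher = better; recency is inverted).
rfm["score"] = (rfm[["frequency", "monetary"]].rank(pct=True).sum(axis=1)
                + (1 - rfm["recency"].rank(pct=True)))
print(rfm.sort_values("score", ascending=False))
```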

Big Data is an essential tool for detecting and preventing fraud. By analyzing large volumes of data, businesses can detect patterns and anomalies that indicate fraudulent activities; a minimal anomaly-detection sketch follows the list below. Big Data helps organizations to:

  • Identify fraudulent transactions
  • Monitor suspicious activities
  • Prevent financial losses
  • Protect brand reputation
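
A standard technique for this kind of pattern-and-anomaly analysis is unsupervised anomaly detection. The sketch below applies scikit-learn's IsolationForest to invented transaction features as an illustration, not a production fraud system.

```python
# Hedged sketch: unsupervised anomaly detection over transaction features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Features: [amount, hour_of_day]; mostly normal activity plus outliers.
normal = np.column_stack([rng.normal(60, 15, 500), rng.normal(14, 3, 500)])
fraud = np.array([[950, 3], [1200, 4]])    # large, late-night transactions
X = np.vstack([normal, fraud])

model = IsolationForest(contamination=0.01, random_state=0).fit(X)
flags = model.predict(X)                   # -1 marks suspected anomalies
print(X[flags == -1])                      # review these transactions
```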

Big Data helps organizations to optimize their supply chain operations. By analyzing data from various sources, including sensors and logistics systems, businesses can streamline their supply chain processes and reduce costs. With Big Data, businesses can:

  • Optimize inventory management
  • Improve delivery times
  • Reduce transportation costs
  • Improve supplier performance

Big Data enables organizations to predict equipment failures before they occur, reducing downtime and maintenance costs. By analyzing sensor data and other sources, businesses can identify patterns that indicate potential equipment failures; a minimal drift-detection sketch follows the list below. With Big Data, companies can:

  • Reduce maintenance costs
  • Improve equipment uptime
  • Increase operational efficiency
  • Enhance safety and compliance
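
As a minimal illustration of failure prediction from sensor data, the sketch below flags a vibration signal whose rolling average drifts beyond a control limit derived from a healthy baseline; the data, window, and threshold are all assumptions.

```python
# Hedged sketch: flag a drifting sensor reading before failure.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
vibration = pd.Series(rng.normal(1.0, 0.05, 200))
vibration.iloc[150:] += np.linspace(0, 0.5, 50)   # simulated bearing wear

baseline_mean = vibration.iloc[:100].mean()       # healthy-period statistics
baseline_std = vibration.iloc[:100].std()
rolling = vibration.rolling(window=10).mean()

# Alert when the rolling mean exceeds a 3-sigma control limit.
alerts = rolling[rolling > baseline_mean + 3 * baseline_std]
if not alerts.empty:
    print(f"Maintenance alert at sample {alerts.index[0]}")
```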

Big Data helps organizations to improve their operational efficiency by providing insights into their business processes. By analyzing data from various sources, businesses can identify areas of inefficiency and optimize their operations. With Big Data, businesses can:

  • Reduce waste
  • Increase productivity
  • Improve quality

Big Data helps organizations to manage and mitigate risks by providing insights into potential risks and their impact on the business. By analyzing data from various sources, including financial and operational data, businesses can identify potential risks and take proactive measures to mitigate them. With Big Data, businesses can:

  • Reduce financial losses
  • Enhance business continuity
  • Improve regulatory compliance


The healthcare industry has also embraced Big Data to improve patient outcomes, enhance clinical decision-making, and drive operational efficiency. Some of the most common Big Data use cases in healthcare include:

Big Data helps healthcare providers to manage and analyze electronic health records, providing insights that can improve patient care and outcomes. With Big Data, healthcare providers can:

  • Monitor patient health status
  • Identify potential health risks
  • Improve disease management
  • Enhance clinical decision-making

Big Data helps healthcare providers to make more informed clinical decisions by providing insights into patient health status, treatment effectiveness, and outcomes. With Big Data, healthcare providers can:

  • Improve diagnosis accuracy
  • Personalize treatment plans
  • Enhance patient outcomes
  • Reduce healthcare costs

Big Data helps public health officials to monitor and track disease outbreaks, enabling them to take proactive measures to prevent the spread of diseases. With Big Data, public health officials can:

  • Monitor disease incidence and prevalence
  • Predict disease outbreaks
  • Plan and implement disease prevention and control strategies
  • Improve public health outcomes

Big Data helps healthcare providers to develop personalized treatment plans based on patient characteristics, including genetic makeup, lifestyle, and health status. With Big Data, healthcare providers can:

  • Improve treatment effectiveness
  • Reduce side effects
  • Personalize healthcare delivery

Big Data helps pharmaceutical companies to identify new drug targets, develop more effective drugs, and reduce drug development costs. With Big Data, pharmaceutical companies can:

  • Identify disease biomarkers
  • Improve drug safety and efficacy
  • Optimize clinical trial design
  • Reduce drug development time and costs

The finance industry has adopted Big Data to improve risk management, reduce fraud, and enhance customer experiences. Some of the most common Big Data use cases in finance include:

Big Data helps financial institutions to detect and prevent fraud by analyzing transaction data and identifying patterns that indicate fraudulent activities. With Big Data, financial institutions can:

  • Monitor transaction activities
  • Identify suspicious transactions

Big Data helps financial institutions to analyze customer credit risk and make informed lending decisions. With Big Data, financial institutions can:

  • Evaluate creditworthiness
  • Identify potential credit risks
  • Optimize credit risk management
  • Reduce loan defaults

Big Data helps financial institutions to analyze market trends and identify profitable investment opportunities. With Big Data, financial institutions can:

  • Analyze market trends
  • Identify profitable investment opportunities
  • Optimize trading and portfolio management
  • Improve investment returns

Big Data helps financial institutions to analyze customer behavior and preferences, enabling them to offer personalized marketing and sales campaigns that increase customer engagement and loyalty.

Big Data helps financial institutions to comply with regulatory requirements by providing insights into compliance risks and potential violations. With Big Data, financial institutions can:

  • Monitor regulatory compliance
  • Identify potential compliance risks
  • Improve regulatory reporting
  • Reduce regulatory penalties

Government agencies have also adopted Big Data to improve public services, enhance public safety, and drive economic growth. Some of the most common Big Data use cases in government include:

Big Data helps law enforcement agencies to prevent and solve crimes by analyzing crime data and identifying patterns that indicate criminal activities. With Big Data, law enforcement agencies can:

  • Monitor criminal activities
  • Identify potential criminal activities
  • Predict criminal behavior
  • Improve public safety outcomes

Big Data helps government agencies to monitor and manage the environment, enabling them to identify potential environmental risks and take proactive measures to mitigate them. With Big Data, government agencies can:

  • Monitor air and water quality
  • Identify potential environmental risks
  • Plan and implement environmental mitigation strategies
  • Enhance public health and environmental outcomes

Big Data helps government agencies to respond to and recover from disasters by providing real-time data and analytics that enable them to make informed decisions. With Big Data, government agencies can:

  • Monitor disaster events
  • Assess damage and impact
  • Plan and implement disaster response and recovery strategies
  • Enhance public safety and disaster recovery outcomes

Big Data helps government agencies to manage and optimize social programs, enabling them to improve service delivery and outcomes for citizens. With Big Data, government agencies can:

  • Monitor program effectiveness
  • Identify potential program improvements
  • Optimize program management
  • Enhance program outcomes and citizen satisfaction

Big Data helps government agencies manage transportation and traffic systems, improving safety, reducing congestion, and optimizing transportation infrastructure. With Big Data, government agencies can:

  • Monitor traffic flows
  • Optimize transportation infrastructure
  • Improve safety and reduce congestion
  • Enhance transportation outcomes and citizen satisfaction


Scientists and researchers have also adopted Big Data to advance knowledge and discovery in various fields. Some of the most common Big Data use cases in science and research include:

Big Data helps astronomers and cosmologists to analyze large volumes of data from telescopes and other sources, enabling them to gain insights into the universe's origin and evolution. With Big Data, astronomers and cosmologists can:

  • Identify new celestial objects
  • Understand the structure and evolution of the universe
  • Test cosmological theories
  • Discover new phenomena

Big Data helps geneticists and bioinformaticians analyze large volumes of genetic data, enabling them to identify genetic variations contributing to diseases and develop personalized treatment plans. With Big Data, geneticists and bioinformaticians can:

  • Analyze genetic data
  • Identify genetic variations that contribute to diseases
  • Develop personalized treatment plans
  • Improve patient outcomes

Big Data helps climate scientists to monitor and analyze large volumes of climate data, enabling them to understand climate patterns, predict weather events, and develop mitigation strategies. With Big Data, climate scientists can:

  • Monitor climate data
  • Analyze climate patterns
  • Predict weather events
  • Develop climate mitigation strategies

Big Data helps particle physicists to analyze large volumes of data from particle accelerators and other sources, enabling them to gain insights into the fundamental nature of matter and the universe. With Big Data, particle physicists can:

  • Analyze particle accelerator data
  • Test fundamental physics theories
  • Discover new particles and phenomena
  • Understand the fundamental nature of matter and the universe

Big Data helps social scientists and humanists analyze large volumes of data from various sources, including social media and archives, to gain insights into human behavior, culture, and history. With Big Data, social scientists and humanists can:

  • Analyze social media data
  • Study cultural trends and patterns
  • Understand historical events and processes
  • Develop new theories and insights

Despite the numerous benefits of Big Data use cases, several challenges and limitations exist. These include:

  • Data Quality and Reliability: Big Data is only useful if the data is accurate, complete, and reliable. Poor data quality can lead to erroneous insights and decisions.
  • Privacy and Security: Big Data contains sensitive information, and its analysis and storage raise privacy and security concerns. Organizations must implement robust security measures to protect against data breaches and unauthorized access.
  • Technical Infrastructure and Expertise: Big Data analysis requires specialized technical infrastructure, tools, and expertise. Organizations must invest in the proper hardware, software, and personnel to ensure successful Big Data analysis.
  • Data Integration and Interoperability: Big Data comes from various sources and formats, making data integration and interoperability a significant challenge. Organizations must develop robust data integration and management strategies to ensure data consistency and usability.
  • Ethical and Social Issues: Big Data analysis raises ethical and social issues, including privacy, fairness, bias, and transparency. Organizations must ensure that their Big Data analysis practices align with moral and social standards.

Organizations must follow best practices to realize the full benefits of Big Data use cases. These include:

  • Define Clear Objectives and Goals: Organizations must define clear objectives and goals for their Big Data use cases, ensuring they align with their overall business strategy.
  • Develop a Comprehensive Data Strategy: Organizations must develop a comprehensive data strategy that outlines data collection, storage, analysis, and governance processes.
  • Choose the Right Tools and Technologies: Organizations must choose the right tools and technologies for their Big Data use cases, ensuring they meet their data processing and analysis needs.
  • Ensure Data Quality and Governance: Organizations must ensure data quality and governance, ensuring that their data is accurate, complete, and reliable and that their data practices align with regulatory and ethical standards.
  • Foster a Culture of Data-Driven Decision-Making: Organizations must foster a culture of data-driven decision-making, ensuring that decision-makers have the necessary data and analysis to make informed decisions.

Big Data is a powerful tool that provides organizations with insights and opportunities to improve their operations, drive innovation, and enhance customer experiences. Big Data use cases are numerous and span various industries, including business, healthcare, finance, government, science, and research.

Despite the countless benefits of Big Data and practical Big Data use cases, several challenges and limitations exist, including data quality and reliability, privacy and security, technical infrastructure and expertise, data integration and interoperability, and ethical and social issues. Organizations must follow best practices to ensure successful Big Data use cases, including defining clear objectives and goals, developing a comprehensive data strategy, choosing the right tools and technologies, ensuring data quality and governance, and fostering a culture of data-driven decision-making.

As Big Data continues to grow and evolve, organizations embracing and using it effectively will gain a competitive advantage and drive growth and innovation.



GE’s Big Bet on Data and Analytics

Seeking opportunities in the Internet of Things, GE expands into industrial analytics. February 18, 2016. By Laura Winig.

If software experts truly knew what Jeff Immelt and GE Digital were doing, there’s no other software company on the planet where they would rather be. –Bill Ruh, CEO of GE Digital and CDO for GE

In September 2015, multinational conglomerate General Electric (GE) launched an ad campaign featuring a recent college graduate, Owen, excitedly breaking the news to his parents and friends that he has just landed a computer programming job — with GE. Owen tries to tell them that he will be writing code to help machines communicate, but they’re puzzled; after all, GE isn’t exactly known for its software. In one ad, his friends feign excitement, while in another, his father implies Owen may not be macho enough to work at the storied industrial manufacturing company.

Owen's Hammer – GE's ad campaign aimed at Millennials emphasizes its new digital direction.

The campaign was designed to recruit Millennials to join GE as Industrial Internet developers and remind them — using GE’s new watchwords, “The digital company. That’s also an industrial company.” — of GE’s massive digital transformation effort. GE has bet big on the Industrial Internet — the convergence of industrial machines, data, and the Internet (also referred to as the Internet of Things) — committing $1 billion to put sensors on gas turbines, jet engines, and other machines; connect them to the cloud; and analyze the resulting flow of data to identify ways to improve machine productivity and reliability. “GE has made significant investment in the Industrial Internet,” says Matthias Heilmann, Chief Digital Officer of GE Oil & Gas Digital Solutions. “It signals this is real, this is our future.”

While many software companies like SAP, Oracle, and Microsoft have traditionally been focused on providing technology for the back office, GE is leading the development of a new breed of operational technology (OT) that literally sits on top of industrial machinery.



BusinessTechWeekly.com

Big Data Use Case: How Amazon uses Big Data to drive eCommerce revenue


Amazon is no stranger to big data. In this big data use case, we’ll look at how Amazon is leveraging data analytic technologies to improve products and services and drive overall revenue.

Big data has changed how we interact with the world and continues to strengthen its hold on businesses worldwide. New data sets can be mined, managed, and analyzed using a combination of technologies.

These applications augment the fallacy-prone human brain with computing power. If you can think of applications for machine learning – predicting outcomes, optimizing systems and processes, or automatically sequencing tasks – they are relevant to big data.

Amazon’s algorithm is another secret to its success. The online shop has not only made it possible to order products with just one mouse click, but it also uses personalization data combined with big data to achieve excellent conversion rates.


The fascinating world of Big Data can help you gain a competitive edge over your competitors. The data collected by networks of sensors, smart meters, and other means can provide insights into customer spending behavior and help retailers better target their services and products.


Machine Learning (a type of artificial intelligence) processes data through a learning algorithm to spot trends and patterns, continually refining its models as new data arrives.
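
To make that learning loop concrete, here is a minimal, illustrative sketch using scikit-learn; the features, data points, and labels are invented for demonstration only:

```python
# Illustrative sketch: a model that "spots patterns" in purchase data.
# The feature names and data are hypothetical, for demonstration only.
from sklearn.linear_model import LogisticRegression

# Each row: [visits_per_week, avg_basket_value]; label: bought_again (1/0)
X = [[1, 12.0], [5, 55.0], [2, 20.0], [7, 80.0], [1, 8.0], [6, 60.0]]
y = [0, 1, 0, 1, 0, 1]

model = LogisticRegression()
model.fit(X, y)                      # "learning" step: fit model parameters
print(model.predict([[4, 45.0]]))    # predict for a new customer -> likely [1]
```

Refitting the model as new rows arrive is what "continually refining" means in practice.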

Amazon is one of the world’s largest businesses, estimated to have over 310 million active customers worldwide, and it recently recorded transactions worth $90 billion. This shows the popularity of online shopping across continents. Amazon provides services like payments, shipping, and new product ideas for its customers.

Amazon is a giant – it even has its own cloud. Amazon Web Services (AWS) offers cloud computing platforms to individuals, companies, and governments. Amazon entered cloud computing when AWS launched in 2003.

Amazon Web Services has expanded its business lines since then. Amazon hired some brilliant minds in analytics and predictive modeling to mine the massive volume of data it has accumulated. Amazon innovates by introducing new products and strategies based on customer experience and feedback.

Big Data has assisted Amazon in ascending to the top of the e-commerce heap.

Amazon uses an anticipatory delivery model that predicts the products most likely to be purchased by its customers based on vast amounts of data.

In practice, Amazon assesses your purchase patterns and positions items you are likely to want at the warehouse closest to you.

Amazon stores and processes as much customer and product information as possible – collecting specific information on every customer who visits its website. It also monitors the products a customer views, their shipping address, and whether or not they post reviews.

Amazon optimizes the prices on its websites by considering other factors, such as user activity, order history, rival prices, and product availability, providing discounts on popular items and earning a higher margin on less popular ones. This is how Amazon utilizes big data in its business operations.
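
As an illustration only – not Amazon's actual algorithm – here is a hypothetical rule-of-thumb sketch in Python of a pricing function that weighs demand, rival prices, and availability in the spirit of the strategy described above:

```python
# Hypothetical dynamic-pricing sketch; thresholds and factors are invented.
def suggest_price(base_price, rival_price, views_last_week, units_in_stock):
    """Nudge price using demand, competition, and availability signals."""
    price = base_price
    if views_last_week > 1000:          # popular item: discount to win the sale
        price = min(price, rival_price * 0.97)
    elif views_last_week < 50:          # slow mover: protect the margin
        price = max(price, base_price * 1.05)
    if units_in_stock < 10:             # scarce stock supports a higher price
        price *= 1.02
    return round(price, 2)

print(suggest_price(base_price=25.00, rival_price=24.50,
                    views_last_week=1500, units_in_stock=80))  # -> 23.77
```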

Data science has established its preeminent place in industries and contributed to industries’ growth and improvement.


Ever wonder how Amazon knows what you want before you even order it? The answer is mathematics, but you know that.

You may not know that the company has been running a data-gathering program for almost 15 years now that reaches back to the site’s earliest days.

In the quest to make every single interaction between buyers and sellers as efficient as possible, getting down to the most minute levels of detail has been essential, with data collection coming from a variety of sources – from sellers themselves and customers with apps on their phones – giving Amazon insights into every step along the way.

Voice recording by Alexa

Alexa is a speech interaction service developed by Amazon.com. It uses a cloud-based service to create voice-controlled smart devices. Through voice commands, Alexa can respond to queries, play music, read the news, and manage smart home devices such as lights and appliances.

Users may subscribe to an Alexa Voice Service (AVS) or use AWS Lambda to embed the system into other hardware and software.

You can spend all day with your microphone, smartphone, or barcode scanner recording every interaction, receipt, and voice note. But you don’t have to with tools like Amazon Echo.

With its always-on Alexa Voice Service, say what you need to add to your shopping list when you need it. It’s fast and straightforward.

Single click order

Competition among companies using big data is fierce. Amazon realized that customers might switch to alternative vendors if their orders are delayed, so it created single-click ordering.

With this method, you set your address and payment method once. After placing an order, you have 30 minutes to change your mind; after that, the order is confirmed automatically.

Persuade Customers

Persuasive technology is a new area at Amazon. It’s an intersection of AI, UX, and the business goal of getting customers to take action at any point in the shopping journey.

One of the most significant ways Amazon utilizes data is through its recommendation engine. When a client searches for a specific item, Amazon can better anticipate other items the buyer may be interested in.

Consequently, Amazon can expedite the process of convincing a buyer to purchase the product. It is estimated that its personalized recommendation system accounts for 35 percent of the company’s annual sales.
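
To illustrate the general idea behind such a recommendation engine (not Amazon's actual system), here is a minimal item-to-item collaborative filtering sketch in Python, using cosine similarity over a tiny, made-up user–item matrix:

```python
# Minimal item-to-item recommendation sketch (cosine similarity over a toy
# user-item matrix). Real systems work at far larger scale with more signals.
import numpy as np

# Rows = users, columns = items (1 = purchased/viewed)
ratings = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
])

norms = np.linalg.norm(ratings, axis=0)
sim = (ratings.T @ ratings) / np.outer(norms, norms)   # item-item cosine similarity
np.fill_diagonal(sim, 0)                               # ignore self-similarity

item = 0                                               # product the shopper just viewed
print("recommend item:", int(np.argmax(sim[item])))    # -> item 1
```

A production system would also filter out items the shopper already owns and blend in purchase, browsing, and cart signals.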

The Amazon Assistant helps you discover new and exciting products, browse best sellers, and shop by department – there’s no place on the web with a better selection. Plus, it automatically notifies you of price drops on items you’ve been watching, so customers get the best deal possible.

Price dropping

Amazon constantly adjusts product prices using big data trends. On many competitor sites, prices remain static, but Amazon attracts customers by continually updating prices to deliver the best deals.

Customers now check the site frequently, knowing the price of a product they want may drop at any time, letting them buy it easily.

Shipping optimization

Shipping optimization by Amazon allows you to choose your preferred carrier, service options, and expected delivery time for millions of items on Amazon.com. With Shipping optimization by Amazon, you can end surprises like unexpected carrier selection, unnecessary service fees, or delays that can happen with even standard shipping.

Today, Amazon offers customers the choice to pick up their packages at over 400 U.S. locations. Whether you need one-day delivery or same-day pickup in select metro areas, Prime members can choose how fast they want to get their goods in an easy-to-use mobile app.


Using shipping partners makes this selection possible, allowing Amazon to offer the most comprehensive selection in the industry and provide customers with multiple options for picking up their orders.

To better serve the customer, Amazon has adopted a technology that allows them to receive information from shoppers’ web browsing habits and use it to improve existing products and introduce new ones.

Amazon is only one example of a corporation that uses big data. Airbnb is another industry leader that employs big data in its operations; you can also review their case study. Below are four ways big data plays a significant role in every organization.

1. Helps you understand the market condition: Big Data assists you in comprehending market circumstances, trends, and wants, as well as your competitors, through data analysis.

It helps you to research customer interests and behaviors so that you may adjust your products and services to their requirements.

2. It helps you increase customer satisfaction: Using big data analytics, you may determine the demographics of your target audience, the products and services they want, and much more.

This information enables you to design business plans and strategies with the needs and demands of customers in mind. Customer satisfaction will grow immediately if your business strategy is based on consumer requirements.

3. Increase sales: Once you thoroughly understand the market environment and client needs, you can develop products, services, and marketing tactics accordingly. This helps you dramatically enhance your sales.

4. Optimize costs: By analyzing the data acquired from client databases, services, and internet resources, you may determine what prices benefit customers, how cost increases or decreases will impact your business, etc.

You can determine the optimal price for your items and services, which will benefit your customers and your company.

Businesses need to adapt to the ever-changing needs of their customers. Within this dynamic online marketplace, competitive advantage is often gained by those players who can adapt to market changes faster than others. Big data analytics provides that advantage.


However, the sheer volume of data generated at all levels — from individual consumer click streams to the aggregate public opinions of millions of individuals — provides a considerable barrier to companies that would like to customize their offerings or efficiently interact with customers.




Effective Big Data Analytics Use Cases in 20+ Industries

Arya Bharti

  • January 06, 2022

If we have to talk about the modern technologies and industry disruptions that can benefit every industry and every business organization, then Big Data Analytics fits the bill perfectly. 

The big data analytics market is slated to hit 103 bn USD by 2023, and 70% of large enterprises already use big data.

Organizations continue to generate heaps of data every year, and the global amount of data created, stored, and consumed by 2025 is slated to surpass 180 zettabytes.

However, many organizations are unable to put this huge amount of data to good use because they do not know how to put their big data to work.

Here, we are discussing the top big data analytics use cases for a wide range of industries. So, take a thorough read and get started with your big data journey.  

Let us begin with understanding the term Big Data Analytics.

What is Big Data Analytics?

Big data analytics is the process of applying advanced analytical techniques to extremely large and diverse data sets containing structured, semi-structured, and unstructured data. It is a complex process in which data is processed and parsed to discover hidden patterns, market trends, and correlations, and to draw actionable insights.
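
As a toy illustration of what "advanced analytical techniques against large, diverse data sets" can look like in practice, here is a minimal PySpark sketch; the storage path and column names are hypothetical:

```python
# Minimal PySpark sketch: read semi-structured JSON at scale and surface a
# simple pattern. The S3 path and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("big-data-analytics-sketch").getOrCreate()

events = spark.read.json("s3://my-bucket/clickstream/*.json")  # semi-structured input
(events
    .groupBy("product_category")                               # aggregate by a key
    .agg(F.count("*").alias("views"),
         F.avg("session_seconds").alias("avg_session"))
    .orderBy(F.desc("views"))
    .show(10))                                                 # top patterns
```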


Big data analytics enables business organizations to make sense of the data they are accumulating and leverage the insights drawn from it for various business activities. 


Before we move on to discuss the use cases of big data analytics, it is important to address one more thing – What makes big data analytics so versatile?

Core Strengths of Big Data Analytics

Big data analytics is a combination of multiple advanced technologies that work together to help business organizations get the best value out of their data.

Some of these technologies are machine learning, data mining, data management, Hadoop, etc.

Below, we discuss the core strengths of big data.

1. Cost Reduction

Big data analytics offers data-driven insights that help business stakeholders make better strategic decisions, streamline and optimize operational processes, and understand their customers better. All of this cuts costs and adds efficiency to the business model.

Big data analytics also streamlines supply chains to reduce time, effort, and resource consumption.

Studies also reveal that big data analytics solutions can help companies reduce the cost of failure by 35% via:

  • Real-time monitoring
  • Real-time visualization
  • In-memory Analytics 
  • Product Monitoring
  • Effective Fleet Management

2. Reliable and Continuous Data

As big data analytics allows business enterprises to make use of organizational data, they don’t have to rely upon third-party market research or tools for the same. Further, as the organizational data expands continually, having a reliable and robust big data analytics platform ensures reliable and continuous data streams. 

3. New Products and Services

With the diverse and advanced technologies that big data analytics brings together, you can make better decisions about developing new products and services.

Also, you always have the best market and customer or end-user insights to steer the development processes in the right direction.

Hence, big data analytics also facilitates faster decision-making stemming from data-driven actionable insights.

4. Improved Efficiency

Big data analytics improves accuracy, efficiency, and overall decision-making in business organizations. You can analyze customer behavior via shopping data and use predictive analytics to estimate figures such as checkout wait times. Stats reveal that 38% of companies use big data for organizational efficiency.
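
As a toy illustration of the checkout wait-time example, here is a simple linear fit in Python; the data points are invented:

```python
# Toy sketch: predict checkout wait time from queue length with a linear fit.
import numpy as np

queue_length = np.array([1, 3, 5, 8, 12])        # shoppers in line
wait_minutes = np.array([1.0, 2.8, 5.1, 8.2, 12.5])

slope, intercept = np.polyfit(queue_length, wait_minutes, 1)  # fit y = ax + b
print(f"predicted wait for 7 shoppers: {slope * 7 + intercept:.1f} min")
```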


5. Better Monitoring and Tracking

Big data analytics also empowers organizations with real-time monitoring and tracking functionalities and amplifies the results by suggesting appropriate actions or strategic nudges derived from predictive analytics.

These tracking and monitoring capabilities are of extreme importance in:

  • Security posture management
  • Mitigating cybersecurity attacks and minimizing the damage
  • Database backup 
  • IT infrastructure management

6. Better Remote Resource Management 

Be it hiring or remote team management and monitoring, big data analytics offers a wide range of capabilities to enterprises. Big data analytics can empower business owners with core insights to make better decisions regarding employee tracking, employee hiring, performance management, etc. 

This remote resource management capability works well for IT infrastructure management as well. 

7. Taking the Right Organizational Decisions

Big data analytics can help companies take better, data-driven organizational decisions across functions.

Now, we discuss the top big data analytics use cases in various industries.

Big Data Analytics Use Cases in Various Industries

1. Banking and Finance (Fraud Detection, Risk & Insurance, and Asset Management)

Futuristic banks and financial institutions are capitalizing on big data in various ways, ranging from capturing new markets and market opportunities to fraud reduction and investment risk management. These organizations are able to leverage big data analytics as a powerful solution to gain a competitive advantage as well. 


Recent studies suggest that big data analytics in the sector will register a CAGR of 22.97% between 2021 and 2026, as growing data volumes and expanding government regulations fuel demand.

2. Accounting 

Data is at the heart of accounting, and using big data analytics will certainly deliver more value to accounting businesses. The sector spans various activities, such as audits, ledger maintenance, transaction management, taxation, and financial planning.

Auditors deal with numerous sorts of data, structured and unstructured, and big data analytics can help them:

  • Identify outliers
  • Exclude exceptions
  • Focus on the data blocks in the greatest risk areas
  • Visualize data
  • Connect financial and non-financial data
  • Compare predicted and actual outcomes to improve forecasting

Using big data analytics will also improve regulatory efficiency, and minimize the redundancy in accounting.

3. Aviation 

Studies reveal that the aviation analytics market will hit 3 bn USD by 2025 and will register a CAGR of 11.5% over the forecast period.

The major growth drivers of the aviation market are:

  • Increasing demand for optimized business operations
  • COVID-19 outbreak affecting the normal aviation operations
  • Mergers, acquisitions, and joint ventures

Recent trends and changes in the Original Equipment Manufacturer (OEM) and user segments are also reshaping the industry. One of the most bankable big data analytics opportunities in aviation is cloud-based real-time data collection and analytics, which requires diverse data models.

Likewise, big data analytics has huge potential for airlines as well, from basic operations – maintenance, resource distribution, flight safety, flight services – to business goals such as loyalty programs and route optimization.

Flights alone generate data at many points along their journey, each of which is a valid use case for big data analytics.

4. Agriculture

UN estimates reveal that the world population will hit the 9.8 billion mark by 2050, and to fulfill the food demands of such a large population, agriculture needs to change. However, climate change has not only rendered much farmland unfit for farming but has also altered rainfall patterns and dried up a number of water sources.

This means that apart from increasing crop production, farmers have to improve the other farming-related activities. 

Big data analytics can help agriculture and agribusiness stakeholders in the following ways:

  • Precision farming techniques stemming from advanced technologies, such as big data, IoT, analytics, etc.
  • Offer advance warnings and climate change predictions
  • Ethical and wise use of pesticides
  • Farm equipment optimization
  • Supply chain optimization and streamlining

One notable case study in this regard is:

  • IBM Food Trust

5. Automotive

Be it research and development or marketing planning, big data analytics has huge scope in the automotive industry, which is itself a combination of several individual industries. As a core infrastructure segment powering many crucial public and private ecosystems, the automobile sector generates huge loads of data every single day!

Hence, it is one of the most critical use cases for big data analytics.

Some common applications are:

  • Improve the design and manufacturing process via a definitive cost analysis of various designs and concepts.
  • Vehicle use and maintenance constraints 
  • Tracking and monitoring the manufacturing processes to ensure Zero fault in production
  • Predicting market trends for sales, manufacturing, and technologies used by the automotive companies
  • Supply chain and logistics analysis
  • Streamlining the manufacturing to stay ahead of market competition
  • Excellent quality analytics to create extremely user-friendly and high-performing vehicles


6. Biomedical Research and Healthcare (Cancer, Genomic medicine, COVID-19 Management)

Recent stats reveal that the big data analytics market in healthcare will be around 67.82 bn USD by 2025. Healthcare is a huge industry generating mountains of data that is extremely crucial for the patients, medical institutions, insurance companies, government, and research as well. 

With proper analysis of huge data blocks, big data analytics can not only help medical researchers to devise more targeted and successful treatment plans but also procure medical supplies from all over the world. 

Organ donation, betterment of treatment facilities, development of better medicines, and prediction of pandemic or epidemic outbreaks to contain their ferocity – there are multiple ways big data analytics can benefit the healthcare industry.


Also, big data analytics is playing a huge role in COVID-19 management by predicting outbreaks and red zones and by providing crucial data to frontline workers.

Finally, when we talk about Biomedical research, big data analytics emerges as a powerful tool for:

  • Data sourcing, processing, and reporting
  • Predicting trends, and offering hidden patterns from historic data blocks
  • Genome research and individual genetic data processing for personalized medicine development

The biomedical research and healthcare industry is a huge use case for big data analytics and the applications can themselves form a topic of lengthy discussion. 


7. Business and Management

95% of businesses cite unstructured data management as a major problem and 97.2% of business organizations are investing in AI and big data to streamline operations, implement digitization and introduce automation, among other business objectives. 

However, the business organizations suffer from multiple data pain points, such as:

  • Unstructured data
  • Fragmented data
  • Database incompatibility
  • Unstructured data storage and management
  • Data loss due to cyber crimes

Big data analytics can thus be a knight in shining armor for business process streamlining and management with its massive capability set. 

Business owners can take more targeted, data-driven, and smart decisions based on the data insights provided by big data analytics, and do much more.

8. Cloud Computing 

45% of businesses across the globe are running at least one big data workload on the cloud, and public cloud services will drive 90% of innovation in analytics and data. 

Cloud computing has many challenges, and security is one of them. In fact, security is becoming a major concern for business organizations across the world as well.

Also, big data analytics has rigorous network, data, and server requirements that persuade business organizations across the globe to outsource the hassle and operational overloads to third parties. It is spurring a number of new opportunities that support big data analytics and help organizations overcome architectural hurdles.

9. Cybersecurity

In cybersecurity, big data security analytics is an emerging trend that helps business organizations improve security by:

  • Identifying outliers and anomalies in security data to detect malicious or suspicious activities (a toy sketch follows this list)
  • Automating workflows for responding to threats, such as disrupting obvious malware attacks
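
Here is a toy Python sketch of the outlier-detection idea above; the hosts, counts, and z-score threshold are invented to suit this tiny sample:

```python
# Toy sketch: flag hosts whose failed-login counts deviate sharply from the
# fleet average. Data and threshold are made up for illustration.
import statistics

failed_logins = {"web-1": 4, "web-2": 6, "web-3": 5, "db-1": 97, "cache-1": 3}

counts = list(failed_logins.values())
mean = statistics.mean(counts)
stdev = statistics.stdev(counts)

for host, n in failed_logins.items():
    if stdev and (n - mean) / stdev > 1.5:   # arbitrary threshold for demo data
        print(f"ALERT: {host} looks anomalous ({n} failed logins)")  # -> db-1
```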

53% of the companies already using big data security analytics report significant benefits from it.

10. Government and Law Enforcement

Government and public infrastructure produce a large amount of data in various forms, such as body cameras, CCTV footage, satellites, public schemes, registrations, certifications, social media, etc.

Big data analytics can empower the government and public services sector in many ways, some of which are mentioned below:

  • Open data initiatives to manage, monitor, and track the private company data
  • Encouraging public participation and transparency in open data initiatives by the government
  • Predicting consumer frauds, political shifts, and tracking the border security
  • Defense and consumer protection
  • Public safety via a rapid and efficient address of public grievances
  • Transportation and city infrastructure management
  • Public health management
  • Efficient and data-driven management of energy, environment, and public utilities

Big data analytics is also of extreme importance in law enforcement. Tracking crimes, round-the-clock policing of sensitive areas, real-time monitoring of criminals and smugglers, and tracing money launderers – there are many ways big data analytics can help law enforcement stakeholders.


11. Oil, Gas & Renewable Energy

From offering new ways to innovate for various sectors to using data sensors for tracking and monitoring new preserves, big data analytics offers many use cases in the energy industry. 

Some common application areas include:

  • Tracking and monitoring of oil well and equipment performance
  • Monitor well activity
  • Predictive equipment maintenance in remote and deep-water locations 
  • Oil exploration and optimizing drilling sites
  • Optimization of oil production via unstructured sensor and historical data

Some other potential areas where data analytics is of extreme importance are the safety of oil sites, supply pipes, and saving time via automation. 

Improvement of fuel transportation, supply chain, and logistics are some other areas where big data analytics can be of help. 

Further, in the renewable energy sector, the technology can offer actionable insights such as geographical insights for siting renewable energy plants, deforestation maps, and efficiency and cost-benefit analysis of various methods of energy production.

12. Manufacturing & Supply Chain Management

With the world on the verge of the fourth industrial revolution, the manufacturing sector and supply chains are undergoing an intense transformation. Manufacturers are looking for ways to harness the massive data they generate in order to streamline business processes, dig hidden patterns and market trends out of huge data blocks, drive profits, and boost their business equity.

There are three core segments in the manufacturing industry that form crucial application areas of big data analytics:

  • Predictive Maintenance – Predict equipment failure, discover potential issues in the manufacturing units as well as products, etc.
  • Operational Efficiency – Analysis and assessment of production processes, proactive customer feedback, future demand forecasts, etc.
  • Production Optimization – Optimizing the production lines to decrease the costs and increase business revenue, and identify the processes or activities causing delays in production.

Big data analytics can also help businesses revolutionize their supply chains in various ways.

13. Retail 

The modern retail landscape is alight with fierce competition and is becoming increasingly volatile with industry disruptions and the break-neck pace of technological advancements. Businesses are focusing on many granular aspects of customers and business offerings, irrespective of them being product-based vendors or service-based vendors. 

Some of the big data analysis use cases in retail are:

  • Product Development – Predictive business models, market research for developing products that are high in demand, and get deep insights from huge consumer and market data from multiple platforms.
  • Customer Experience and Service – Providing personalized and hyper-personalized services and customer experiences throughout the customer journeys and addressing crucial events, such as customer complaints, customer churn, etc. 

  • Customer Lifetime Value – Rich actionable insights on customer behavior, purchase patterns, and motivation to offer a highly personalized lifetime plan to all customers.

14. Stock Market 

Another crucial industry that moves in parallel with retail and drives the economy is the stock market. Big data analytics can be a game-changer here as well.

Experts say that big data analytics has changed finance and stock market trading by:

  • Offering smart automated investment and trading modules
  • Smart modules for funds planning and management of stocks based on real-time market insights
  • Using predictive insights for gaining more by trading well ahead of time 
  • Estimation of outcomes and returns for investments of all sizes and all types.

15. Telecom 

The telecom industry is in for a huge wave of digital transformation driven by advanced technologies and data analytics. As the number of smartphone users increases and technologies like 5G penetrate developing countries, big data analytics emerges as a credible tool for tackling multiple issues.


Some use cases for big data in the telecom industry are:

  • Optimizing Network Capacity – Analysis of network usage for rerouting bandwidth, managing network limitations, and deciding infrastructure investments with data-driven insights from multiple areas.
  • Telecom Customer Churn – With multiple options available in the market, operators are always at risk of losing customers to competitors. With insights collected from data about customer satisfaction, market research, and service quality, brands can address the issue with much more clarity.
  • New Product Offerings – With predictive analytics and thorough market research, telecom companies can come up with new product offerings that are unique, address customer pain points, and cater to usability concerns, instead of generic brand offerings.

16. Media and Entertainment

In the Media and Entertainment industry, big data analytics can offer insights about the various content preferences, reception, and cost/subscription ideas to the brands. 

Further, analysis of customer behavior and content consumption can be used to offer more personalized content recommendations and get insights for creating new shows. Market potential, market segmentation, and insights about customer sentiments can also help drive core business decisions to increase revenue and decrease the odds of creating flop or lopsided content.

17. Education

Market forecasts suggest that the big data analytics market in education will stand at 57.14 bn USD by 2030. Despite the technology being extremely useful across segments of the industry, its market valuation differs greatly from the industries mentioned above.

There are many reasons for this, such as regional education policies, lack of digitization, and slower technological advancement in the sector.


18. Pharmacy 

In the Pharmacy sector, big data analytics is of extreme importance in the following areas:

  • Standardization of image, numerical, and data processing methods
  • Gaining insights from hoards of analytical and medical data that are still siloed in research files
  • Clinical monitoring
  • Personalized drug development and digitized data analysis
  • Operations management in institutes and manufacturing units
  • Addressing the failure of traditional data processing methods
  • Taking model-based decisions

19. Psychology


Big data analytics has a big role in psychology and its multiple branches – for example, organizational psychology, to better understand employee motivation and satisfaction, and safety psychology, to improve counseling and medical consultation.

Further, when it comes to therapeutic counseling, big data analytics can help practitioners by offering behavioral models of a patient and their tendencies, helping develop personalized therapy programs or diagnose severe psychological disorders in criminal cases.

20. Project Management

Global business dissatisfaction with project management techniques is increasing despite innovation in workplace technology. Only 78% of projects meet their original goals, and only 64% are completed on time.

Project management is a huge use case for big data analytics, and some application areas are:

  • Deriving project feasibility stats from initial work plans and SRS documents
  • Predicting the success and failure of the development process 
  • Checking the market relevance, budgeting, etc 


21. Marketing and Sales (Advertising)

Market research is a complex industry with various independent surveys and studies going on simultaneously. Apart from generating a huge amount of data, these studies also generate a huge number of redundancies because of the unstructured nature of data. 

Big data analytics can not only make study results better but also help organizations to leverage them better by allowing them to define specific test cases and custom parameters. 

Also, when it comes to sales and sales processes, big data analytics is of paramount importance as it surpasses the “dry” nature of data.

It can go beyond the statistics to discover the underlying trends, such as behavioral analytics, sentiment analysis, predictive analysis of customer comments in informal or regional language to decode customer satisfaction levels, etc.


Thus, the brands can market more, better, and with proper customer targets in mind. 

22. Social Media Management

Another crucial segment of marketing and sales is social media management and monitoring as more and more people are now using social media platforms for shopping, reviewing, and interacting with brands. 

However, when it comes to drawing sensible, business-relevant insights from the huge amounts of social media data, the majority of brands struggle with feeble data analytics software.

Big data analytics can uncover excellent data insights from the social media channels and platforms to make marketing, customer service, and advertising better and more aligned to business goals.

23. Hospitality, Restaurants, and Tourism

Ranging from increased online revenue to fewer guest complaints and higher customer satisfaction via highly personalized services during the stay – there are multiple use cases for big data analytics in the hospitality and restaurant industries.

Apart from the customer-relevant insights, big data analytics can also offer business insights to the business owners such as:

  • Location suggestions
  • Itinerary suggestions 
  • Deals, discounts, and promotional campaigns
  • Smart advertising
  • Pricing and family/corporate-specific services 
  • Travelers’ needs 

The tourism industry is also an interesting use case, as people now travel for many purposes beyond business, leisure, and work – such as medical tourism.


24. Miscellaneous Use Cases

Construction

  • Resolving structural issues
  • Improved collaboration
  • Reduced construction time, wastage, and carbon emissions
  • Wearables’ data processing to improve worker safety

Image Processing

  • Better image data visualization
  • Satellite image processing 
  • Improved security for confidential images
  • Interactive digital media
  • Military imagery protection and image data processing
  • Image-based modeling and algorithms
  • Knowledge-based recognition
  • Virtual and augmented reality

Railways

  • Track maintenance and planning
  • Service, customer, and travel data
  • Real-time predictive analysis for minimizing delays owing to weather and sudden incidents
  • Infrastructure management
  • Coach maintenance, facility maintenance, and safety of travelers

Big Data Analytics: Laying the Road for Future-Ready Businesses

The future of the business landscape is full of uncertainties and intense competition, and nothing is more reliable and credible than data!

Big data analytics offers powerful data mining, management, and processing capabilities that can help businesses make the most of historical data and continuously generated organizational data.

With abilities to drive business decisions for the present and future, big data analytics is one of the most bankable technologies for businesses of all types and all scales. 

While that is easy to say, adopting and implementing big data analytics is a challenging task with serious requirements in terms of resources and capital. Hence, the best way to take the first step towards embracing the revolution is to engage a reputed big data consulting company, such as DataToBiz, that can help you identify, understand, and cater to your big data analytics needs.





Award winner: Big Data Strategy of Procter & Gamble


This case won the Knowledge, Information and Communication Systems Management category at The Case Centre Awards and Competitions 2020. #CaseAwards2020

Who – the protagonist

Linda W. Clement-Holmes, Procter & Gamble (P&G) Chief Information Officer (CIO).

P&G is a leading consumer packaged goods company, regarded as a pioneer in extensively adopting big data and digitization to understand consumer behaviour.

Big data

Former Chairman and CEO, Bob McDonald, and CIO, Filippio Passerini, were responsible for the push on big data, which had resulted in P&G becoming more nimble and efficient.

However, some experts were sceptical about P&G’s obsession with digitization, and how it could slow the speed of decision making.

In June 2015, Linda replaced Filippio.

P&G is headquartered in Cincinnati, Ohio, but its brands are sold worldwide.

“Change movement is one of the biggest challenges of big data implementation. Analytics need to be integrated with processes. We had to educate and train our field force over and over again in order to make analytics a part of their daily routine.” – A head of analytics at a leading logistics company

Linda had the big responsibility of continuing and leveraging the big data initiatives started by Filippio.

In order to achieve this, a culture of data-driven decision making within the organisation needed to be implemented by the leadership team.

Linda’s job was to convince them of her vision.

Author perspective

Vinod said: “I am extremely honoured to receive such a prestigious award from The Case Centre, popularly dubbed the Case Method Oscars!

“I am earnestly grateful for the recognition I have received for my effort which would not have been possible without the guidance and support of my Dean, Debapratim Purkayastha, who gave me an opportunity to associate with him in writing this case.”

Predicting the future

Debapratim commented: “Big data analytics has always been a key strategy for businesses to have a competitive edge and achieve their goals. Now, predictive analysis through big data can help predict what may occur in the future.


“The topic is very contemporary to current business trends and the case helps the students to be updated with the organisational readiness to welcome latest changes in technology for better performance. The case discusses in detail how Procter & Gamble adapted the big data through different tools like Decision Cockpit and Business Sphere.”

Vinod commented: “The case helps in understanding many strategic as well as technical aspects of big data and business analytics, and how they are implemented in a fast-moving consumer goods (FMCG) company like Procter & Gamble.

“Not only does it help understand the opportunities and challenges in implementing a big data strategy, but also the significance of accessibility to information in an organisation and how its functioning can be transformed through the availability of real-time data.

“The case enables a discussion on ways in which big data could be productively employed in an organisation in some of the key business functions.” 

Debapratim added: "Educators may like using our other case, Consumer Research at Procter & Gamble: From Field Research to Agile Research, as a follow-up, as it shows how the pioneer of marketing research is now leveraging big data for agile research."

Identifying the right information

Debapratim explained: “Understanding of the concepts that are going to be taught through the case study is a prerequisite of writing a case. Finding the relevant information, and presenting the case in an understandable manner to students is also equally important.

"Most importantly, people new to case writing should work with more experienced case writers to hone their skills in case writing.”

The authors

Debapratim Purkayastha and Vinod Babu Koti

Celebrating the win

Unfortunately, due to the Coronavirus pandemic, we were unable to present the authors in person with their trophies for winning the Knowledge, Information and Communication Systems Management category in 2020.

We are delighted to celebrate Debapratim and Vinod's win by sharing these pictures of them with their awards - congratulations!





Challenges of Big Data: Basic Concepts, Case Study, and More

Table of Contents

  • What is Big Data?
  • The Five ‘V’s of Big Data
  • What Does Facebook Do with Its Big Data?
  • Big Data Case Study
  • Challenges of Big Data
  • Challenges of Big Data Visualisation
  • Security Management Challenges
  • Cloud Security Governance Challenges

Challenges of Big Data

Evolving constantly, the data management and architecture field is in an unprecedented state of sophistication. Globally, more than 2.5 quintillion bytes of data are created every day, and 90 percent of all the data in the world got generated in the last couple of years ( Forbes ). Data is the fuel for machine learning and meaningful insights across industries, so organizations are getting serious about how they collect, curate, and manage information.

This article will help you learn more about the vast world of Big Data and the challenges it brings. And in case you think Big Data as a concept is not a big deal, here are some facts that will help you reconsider:

  • About 300 billion emails get exchanged every day (Campaign Monitor)
  • 400 hours of video are uploaded to YouTube every minute (Brandwatch)
  • Worldwide retail eCommerce accounts for more than $4 trillion in revenue (Shopify)
  • Google receives more than 63,000 search inquiries every minute (SEO Tribunal)
  • By 2025, real-time data will account for more than a quarter of all data (IDC)

To get a handle on challenges of big data, you need to know what the word "Big Data" means. When we hear "Big Data," we might wonder how it differs from the more common "data." The term "data" refers to any unprocessed character or symbol that can be recorded on media or transmitted via electronic signals by a computer. Raw data, however, is useless until it is processed somehow.

Before we jump into the challenges of Big Data, let’s start with the five ‘V’s of Big Data.

Big Data is simply a catchall term used to describe data too large and complex to store in traditional databases. The “five ‘V’s” of Big Data are:

  • Volume – The amount of data generated
  • Velocity - The speed at which data is generated, collected and analyzed
  • Variety - The different types of structured, semi-structured and unstructured data
  • Value - The ability to turn data into useful insights
  • Veracity - Trustworthiness in terms of quality and accuracy 

Facebook collects vast volumes of user data (in the range of petabytes, or 1 million gigabytes) in the form of comments, likes, interests, friends, and demographics. Facebook uses this information in a variety of ways:

  • To create personalized and relevant news feeds and sponsored ads
  • For photo tag suggestions
  • Flashbacks of photos and posts with the most engagement
  • Safety check-ins during crises or disasters

Next up, let us look at a Big Data case study, understand its nuances, and then examine some of the challenges of Big Data.

As the number of Internet users grew throughout the last decade, Google was challenged with how to store so much user data on its traditional servers. With thousands of search queries raised every second, the retrieval process was consuming hundreds of megabytes and billions of CPU cycles. Google needed an extensive, distributed, highly fault-tolerant file system to store and process the queries. In response, Google developed the Google File System (GFS).

GFS architecture consists of one master and multiple chunk servers or slave machines. The master machine contains metadata, and the chunk servers/slave machines store data in a distributed fashion. Whenever a client on an API wants to read the data, the client contacts the master, which then responds with the metadata information. The client uses this metadata information to send a read/write request to the slave machines to generate a response.

The files are divided into fixed-size chunks and distributed across the chunk servers or slave machines. Features of the chunk servers include:

  • Each chunk holds 64 MB of data (128 MB from Hadoop version 2 onwards)
  • By default, each chunk is replicated three times across different chunk servers
  • If any chunk server crashes, the data remains available on the other chunk servers (see the toy sketch below)
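
Here is a toy Python mock of that read path – the client computes the chunk index, asks the master for metadata, then reads from a replica. Real GFS is far more involved; the file names and servers are invented:

```python
# Toy mock of the GFS read path: metadata from the master, data from a replica.
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks (128 MB in Hadoop 2+)

master_metadata = {            # file -> list of (chunk_id, replica servers)
    "/logs/day1": [("c1", ["srv-a", "srv-b", "srv-c"]),
                   ("c2", ["srv-b", "srv-c", "srv-d"])],
}

def read(path, offset):
    chunk_index = offset // CHUNK_SIZE                       # 1) locate the chunk
    chunk_id, replicas = master_metadata[path][chunk_index]  # 2) master returns metadata
    server = replicas[0]                                     # 3) client picks a replica
    return f"reading {chunk_id} at {offset % CHUNK_SIZE} from {server}"

print(read("/logs/day1", 70 * 1024 * 1024))  # offset in second chunk -> c2
```

The key design point is the split between the control path (master, metadata only) and the data path (chunk servers), which keeps the master from becoming a bandwidth bottleneck.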

Next up, let us take a look at the challenges of Big Data, and their probable solutions too!

Storage

With vast amounts of data generated daily, the greatest challenge is storage (especially when the data is in different formats) within legacy systems. Unstructured data cannot be stored in traditional databases.

Processing

Processing big data refers to the reading, transforming, extraction, and formatting of useful information from raw information. The input and output of information in unified formats continue to present difficulties.

Security

Security is a big concern for organizations. Non-encrypted information is at risk of theft or damage by cyber-criminals. Therefore, data security professionals must balance access to data against maintaining strict security protocols.

Finding and Fixing Data Quality Issues

Many of you are probably dealing with challenges related to poor data quality, but solutions are available. Common approaches to fixing data problems include the following (a small pandas sketch follows the list):

  • Correct information in the original database.
  • Repair the original data source to resolve any data inaccuracies.
  • Use highly accurate identity-resolution methods to determine who someone is.
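
Here is a small, hypothetical pandas sketch of basic data-quality repair – normalizing identifiers, dropping duplicates, and flagging missing values; the columns are invented:

```python
# Sketch of basic data-quality repair with pandas (hypothetical columns).
import pandas as pd

df = pd.DataFrame({
    "customer_id": [" 001", "002", "002", "003"],
    "email": ["a@x.com", "b@x.com", "b@x.com", None],
})

df["customer_id"] = df["customer_id"].str.strip()   # fix inconsistent formatting
df = df.drop_duplicates()                            # remove duplicate records
df["email"] = df["email"].fillna("unknown")          # flag missing values explicitly
print(df)
```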

Scaling Big Data Systems

Database sharding, memory caching, moving to the cloud and separating read-only and write-active databases are all effective scaling methods. While each one of those approaches is fantastic on its own, combining them will lead you to the next level.
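
As a minimal sketch of combining two of those methods – hash-based sharding plus an in-memory cache in front of reads – consider the following; the shard names are invented, and a production system would use a stable hash rather than Python's built-in one:

```python
# Sketch: route reads through a cache, then to a hash-picked shard.
shards = ["db-0", "db-1", "db-2", "db-3"]
cache: dict[str, str] = {}

def shard_for(key: str) -> str:
    # Same key maps to the same shard within a process; use a stable
    # hash (e.g. md5) in production so routing survives restarts.
    return shards[hash(key) % len(shards)]

def get(key: str) -> str:
    if key in cache:                                    # 1) serve hot keys from memory
        return cache[key]
    value = f"row for {key} from {shard_for(key)}"      # 2) else hit the right shard
    cache[key] = value
    return value

print(get("user:42"))
print(get("user:42"))  # second call is a cache hit
```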

Evaluating and Selecting Big Data Technologies

Companies are spending millions on new big data technologies, and the market for such tools is expanding rapidly. In recent years, the IT industry has caught on to the potential of big data and analytics. The trending technologies include the following:

  • Hadoop Ecosystem
  • Apache Spark
  • NoSQL Databases
  • Predictive Analytics
  • Prescriptive Analytics

Big Data Environments

In an extensive data set, data is constantly being ingested from various sources, making it more dynamic than a data warehouse. The people in charge of the big data environment can quickly lose track of where each data collection came from and what it contains.

Real-Time Insights

The term "real-time analytics" describes the practice of performing analyses on data as a system is collecting it. Decisions may be made more efficiently and with more accurate information thanks to real-time analytics tools, which use logic and mathematics to deliver insights on this data quickly.

Data Validation

Before using data in a business process, its integrity, accuracy, and structure must be validated. The output of a data validation procedure can be used for further analysis, BI, or even to train a machine learning model.
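
Here is a small, schema-hypothetical Python sketch of such a validation step, checking structure, types, and a value range before a record enters the pipeline:

```python
# Sketch of pre-processing validation; the schema is hypothetical.
REQUIRED = {"order_id": str, "amount": float, "country": str}

def validate(record: dict) -> list[str]:
    errors = []
    for field, ftype in REQUIRED.items():
        if field not in record:
            errors.append(f"missing field: {field}")       # structure check
        elif not isinstance(record[field], ftype):
            errors.append(f"bad type for {field}")          # type check
    if isinstance(record.get("amount"), float) and record["amount"] < 0:
        errors.append("amount must be non-negative")        # range check
    return errors

print(validate({"order_id": "A1", "amount": -5.0, "country": "DE"}))
```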

Healthcare Challenges

Electronic health records (EHRs), genomic sequencing, medical research, wearables, and medical imaging are just a few examples of the many sources of health-related big data.

Barriers to Effective Use Of Big Data in Healthcare

  • The price of implementation
  • Compiling and polishing data
  • Disconnect in communication

Other issues with big data visualisation include:

  • Distracting visuals, with too many elements packed so close together that the user cannot separate them on screen.
  • Reducing the publicly available data can be helpful; however, it also results in data loss.
  • Rapidly shifting visuals make it impossible for viewers to keep up with the action on screen.

The term "big data security" is used to describe the use of all available safeguards about data and analytics procedures. Both online and physical threats, including data theft, denial-of-service assaults, ransomware, and other malicious activities, can bring down an extensive data system.

Cloud Security Governance Challenges

Cloud security governance consists of a collection of regulations that must be followed, with specific guidelines or rules applied to the utilisation of IT resources. The model focuses on making remote applications and data as secure as possible.

Some of the security challenges are listed below:

  • Methods for Evaluating and Improving Performance
  • Governance/Control
  • Managing Expenses

And now that we know the challenges of Big Data, let’s take a look at the solutions too!

Hadoop as a Solution

Hadoop, an open-source framework for storing data and running applications on clusters of commodity hardware, comprises two main components:

Hadoop HDFS

Hadoop Distributed File System (HDFS) is the storage unit of Hadoop: a fault-tolerant, reliable, scalable storage layer for the Hadoop cluster. Designed to run on commodity, low-cost hardware, HDFS distributes data across the machines of a cluster and makes it accessible to applications running on any of its servers. HDFS has a default block size of 128 MB from Hadoop version 2 onwards, which can be increased based on requirements.
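For intuition about block sizing, this small sketch computes how a hypothetical 1 GB file maps onto default 128 MB blocks.

```python
# Sketch of HDFS block arithmetic: a file is stored as fixed-size blocks
# (128 MB by default since Hadoop 2), and the last block may be partial.
import math

BLOCK_SIZE = 128 * 1024 * 1024      # 128 MB default block size
file_size = 1 * 1024 * 1024 * 1024  # hypothetical 1 GB file

num_blocks = math.ceil(file_size / BLOCK_SIZE)
last_block = file_size - (num_blocks - 1) * BLOCK_SIZE
print(f"{num_blocks} blocks; last block holds {last_block / 2**20:.0f} MB")
# -> 8 blocks; last block holds 128 MB
```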

Hadoop MapReduce

MapReduce is the processing unit of Hadoop: it expresses computations as map and reduce tasks that run in parallel across the cluster, as detailed in the "MapReduce Algorithm" section below.


Hadoop also features Big Data security, providing end-to-end encryption to protect data at rest within the Hadoop cluster and in motion across networks. Each processing layer has multiple processes running on different machines within a cluster. The components of the Hadoop ecosystem, while evolving every day, include the following (a short Spark SQL sketch follows the list):

  • Sqoop: for ingestion of structured data from a relational database management system (RDBMS) into HDFS (and export back).
  • Flume: for ingestion of streaming or unstructured data directly into HDFS or a data warehouse system (such as Hive).
  • Hive: a data warehouse system on top of HDFS in which users can write SQL queries to process data.
  • HCatalog: enables the user to store data in any format and structure.
  • Oozie: a workflow manager used to schedule jobs on the Hadoop cluster.
  • Apache ZooKeeper: a centralized service of the Hadoop ecosystem, responsible for coordinating large clusters of machines.
  • Pig: a language allowing concise scripting to analyze and query datasets stored in HDFS.
  • Apache Drill: supports data-intensive distributed applications for interactive analysis of large-scale datasets.
  • Mahout: for machine learning.
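Since Spark has become the most common processing engine layered on these tools, here is a minimal sketch of querying data with Spark SQL, much as a Hive user would; the HDFS path, view name, and columns are hypothetical.

```python
# Minimal PySpark sketch: read a file from HDFS and query it with SQL.
# The path and column names are hypothetical, for illustration only.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hdfs-demo").getOrCreate()

# Read a CSV stored in HDFS into a DataFrame.
orders = spark.read.csv("hdfs:///data/orders.csv", header=True, inferSchema=True)

# Register it as a temporary view and query it with SQL, much as Hive does.
orders.createOrReplaceTempView("orders")
top = spark.sql("""
    SELECT product, SUM(amount) AS revenue
    FROM orders
    GROUP BY product
    ORDER BY revenue DESC
    LIMIT 10
""")
top.show()
```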

MapReduce Algorithm

Hadoop MapReduce is among the oldest and most mature processing frameworks. Google introduced the MapReduce programming model in 2004 to process and generate very large datasets in parallel across many servers. Developers use MapReduce to manage data in two phases (a small simulation follows the list):

  • Map phase: a user-defined function is applied to every input record, emitting intermediate key-value pairs. The framework then sorts and shuffles these pairs so that all values for a given key reach the same reducer.
  • Reduce phase: the values sharing a key are aggregated, bad records are filtered out, and the necessary information is retained in the final output.
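To make the two phases concrete, below is a minimal single-machine simulation of the classic word-count job; a real Hadoop job would distribute these tasks across the cluster.

```python
# Single-machine sketch of the MapReduce idea using word count, the canonical
# example. Hadoop would run map and reduce tasks in parallel across machines;
# here we simulate the phases in memory.
from collections import defaultdict

documents = ["big data needs big tools", "data tools for big data"]

# Map phase: emit (word, 1) for every word in every document.
intermediate = []
for doc in documents:
    for word in doc.split():
        intermediate.append((word, 1))

# Shuffle/sort: group all counts by key (the framework does this in Hadoop).
groups = defaultdict(list)
for word, count in intermediate:
    groups[word].append(count)

# Reduce phase: aggregate the values for each key.
counts = {word: sum(values) for word, values in groups.items()}
print(counts)  # {'big': 3, 'data': 3, 'tools': 2, ...}
```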

Now that you have understood the five ‘V’s of Big Data, Big Data case study, challenges of Big Data, and some of the solutions too, it’s time you scale up your knowledge and become industry ready. Most organizations are making use of big data to draw insights and support strategic business decisions. Simplilearn's Caltech Post Graduate Program in Data Science will help you get ahead in your career!

If you have any questions, feel free to post them in the comments below. Our team will get back to you at the earliest.



Big data in digital healthcare: lessons learnt and recommendations for general practice

Raag Agrawal 1,2 & Sudhakaran Prabakaran 1,3,4 (ORCID: orcid.org/0000-0002-6527-1085)

Heredity volume 124, pages 525–534 (2020)

Big Data will be an integral part of the next generation of technological developments, allowing us to gain new insights from the vast quantities of data being produced by modern life. There is significant potential for the application of Big Data to healthcare, but there are still some impediments to overcome, such as fragmentation, high costs, and questions around data ownership. Envisioning a future role for Big Data within the digital healthcare context means balancing the benefits of improving patient outcomes against the potential pitfall of increasing physician burnout if poor implementation adds complexity. Oncology, the field where Big Data collection and utilization got a head start with programs like TCGA and the Cancer Moonshot, provides an instructive example as we see the different perspectives of the United States (US), the United Kingdom (UK) and other nations on the implementation of Big Data in patient care, particularly in their centralization and regulatory approach to data. By drawing upon global approaches, we propose recommendations for guidelines and regulations of data use in healthcare, centering on the creation of a unique global patient ID that can integrate data from a variety of healthcare providers. In addition, we expand upon the topic by discussing potential pitfalls of Big Data, such as the lack of diversity in Big Data research, and the security and transparency risks posed by machine learning algorithms.

Introduction

The advent of Next Generation Sequencing promises to revolutionize medicine, as it has become possible to cheaply and reliably sequence entire genomes, transcriptomes, proteomes, metabolomes, etc. (Shendure and Ji 2008; Topol 2019a). "Genomical" data alone is predicted to be in the range of 2–40 exabytes by 2025, eclipsing the amount of data acquired by all other technological platforms (Stephens et al. 2015). By 2018, the price of research-grade sequencing of the human genome had dropped to under $1000 (Wetterstrand 2019). Other "omics" techniques such as proteomics have also become accessible and cheap, and have added depth to our knowledge of biology (Hasin et al. 2017; Madhavan et al. 2018). Consumer device development has also led to significant advances in clinical data collection, as it becomes possible to continuously collect patient vitals and analyze them in real time. In addition to the reductions in the cost of sequencing, computational power and storage have become extremely cheap. While all these developments have brought enormous advances in disease diagnosis and treatment, they have also introduced new challenges as large-scale information becomes increasingly difficult to store, analyze, and interpret (Adibuzzaman et al. 2018). This problem has given way to a new era of "Big Data," in which scientists across a variety of fields are exploring new ways to understand the large amounts of unstructured and unlinked data generated by modern technologies, and leveraging it to discover new knowledge (Krumholz 2014; Fessele 2018). Successful scientific applications of Big Data have already been demonstrated in biology, as initiatives such as the Genotype-Tissue Expression project are producing enormous quantities of data to better understand genetic regulation (Aguet et al. 2017). Yet, despite these advances, we see few examples of Big Data being leveraged in healthcare despite the opportunities it presents for creating personalized and effective treatments.

Effective use of Big Data in Healthcare is enabled by the development and deployment of machine learning (ML) approaches. ML approaches are often interchangeably used with artificial intelligence (AI) approaches. ML and AI only now make it possible to unravel the patterns, associations, correlations and causations in complex, unstructured, nonnormalized, and unscaled datasets that the Big Data era brings (Camacho et al. 2018 ). This allows it to provide actionable analysis on datasets as varied as sequences of images (applicable in Radiology) or narratives (patient records) using Natural Language Processing (Deng et al. 2018 ; Esteva et al. 2019 ) and bringing all these datasets together to generate prediction models, such as response of a patient to a treatment regimen. Application of ML tools is also supplemented by the now widespread adoption of Electronic Health Records (EHRs) after the passage of the Affordable Care Act (2010) and Health Information Technology for Economic and Clinical Health Act (2009) in the US, and recent limited adoption in the National Health Service (NHS) (Garber et al. 2014 ). EHRs allow patient data to become more accessible to both patients and a variety of physicians, but also researchers by allowing for remote electronic access and easy data manipulation. Oncology care specifically is instructive as to how Big Data can make a direct impact on patient care. Integrating EHRs and diagnostic tests such as MRIs, genomic sequencing, and other technologies is the big opportunity for Big Data as it will allow physicians to better understand the genetic causes behind cancers, and therefore design more effective treatment regimens while also improving prevention and screening measures (Raghupathi and Raghupathi 2014 ; Norgeot et al. 2019 ). Here, we survey the current challenges in Big Data in healthcare and use oncology as an instructive vignette, highlighting issues of data ownership, sharing, and privacy. Our review builds on findings from the US, UK, and other global healthcare systems to propose a fundamental reorganization of EHRs around unique patient identifiers and ML.

Current successes of Big Data in healthcare

The UK and the US are both global leaders in healthcare that will play important roles in the adoption of Big Data. We see this global leadership already in oncology (The Cancer Genome Atlas (TCGA), Pan-Cancer Analysis of Whole Genomes (PCAWG)) and neuropsychiatric diseases (PsychENCODE) (Tomczak et al. 2015 ; Akbarian et al. 2015 ; Campbell et al. 2020 ). These Big Data generation and open-access models have resulted in hundreds of applications and scientific publications. The success of these initiatives in convincing the scientific and healthcare communities of the advantages of sharing clinical and molecular data have led to major Big Data generation initiatives in a variety of fields across the world such as the “All of Us” project in the US (Denny et al. 2019 ). The UK has now established a clear national strategy that has resulted in the likes of the UK Biobank and 100,000 Genomes projects (Topol 2019b ). These projects dovetail with a national strategy for the implementation of genomic medicine with the opening of multiple genome-sequencing sites, and the introduction of genome sequencing as a standard part of care for the NHS (Marx 2015 ). The US has no such national strategy, and while it has started its own large genomic study—“All of Us”—it does not have any plans for implementation in its own healthcare system (Topol 2019b ). In this review, we have focussed our discussion on developments in Big Data in Oncology as a method to understand this complex and fast moving field, and to develop general guidelines for healthcare at large.

Big Data initiatives in the United Kingdom

The UK Biobank is a prospective cohort initiative that is composed of individuals between the ages of 40 and 69 before disease onset (Allen et al. 2012 ; Elliott et al. 2018 ). The project has collected rich data on 500,000 individuals, collating together biological samples, physical measures of patient health, and sociological information such as lifestyle and demographics (Allen et al. 2012 ). In addition to its size, the UK Biobank offers an unparalleled link to outcomes through integration with the NHS. This unified healthcare system allows researchers to link initial baseline measures with disease outcomes, and with multiple sources of medical information from hospital admission to clinical visits. This allows researchers to be better positioned to minimize error in disease classification and diagnosis. The UK Biobank will also be conducting routine follow-up trials to continue to provide information regarding activity and further expanded biological testing to improve disease and risk factor association.

Beyond the UK Biobank, Public Health England launched the 100,000 Genomes project with the intent to understand the genetic origins behind common cancers (Turnbull et al. 2018). The massive effort consists of NHS patients consenting to have their genome sequenced and linked to their health records. Without the significant phenotypic information collected in the UK Biobank, the project holds limited use as a prospective epidemiological study, but it is a great tool for researchers interested in identifying disease-causing single-nucleotide polymorphisms (SNPs). The size of the dataset itself is its main advance, as it provides the statistical power to discover the associated SNPs even for rare diseases. Furthermore, the 100,000 Genomes Project's ancillary aim is to stimulate private sector growth in the genomics industry within England.

Big Data initiatives in the United States and abroad

In the United States, the “All of Us” project is expanding upon the UK Biobank model by creating a direct link between patient genome data and their phenotypes by integrating EHRs, behavioral, and family data into a unique patient profile (Denny et al. 2019 ). By creating a standardized and linked database for all patients—“All of Us” will allow researchers greater scope than the UK BioBank to understand cancers and discover the associated genetic causes. In addition, “All of Us” succeeds in focusing on minority populations and health, an area of focus that sets it apart and gives it greater clinical significance. The UK should learn from this effort by expanding the UK Biobank project to further include minority populations and integrate it with ancillary patient data such as from wearables—the current UK Biobank has ~500,000 patients that identify as white versus ~12,000 (i.e., just <2.5%) that identified as non-white (Cohn et al. 2017 ). Meanwhile, individuals of Asian ethnicities made up over 7.5% of the UK population as per the 2011 UK Census, with the proportion of minorities projected to rise in the coming years (O’Brien and Potter-Collins 2015 ; Cohn et al. 2017 ).

Sweden too provides an informative example of the power of investment in rich electronic research registries (Webster 2014). The Swedish government has committed over $70 million in funding per annum to expand a variety of cancer registries that allow researchers insight into risk factors for oncogenesis. In addition, its data sources are particularly valuable for scientists, as each patient's entries are linked to a unique identity number that can be cross-referenced with over 90 other registries to give a more complete understanding of a patient's health and social circumstances. These registries are not limited to disease states and treatments, but also encompass extensive public administrative records that can give researchers considerable insight into social indicators of health such as income, occupation, and marital status (Connelly et al. 2016). These data sources become even more valuable to Swedish researchers as they have been in place for decades with commendable consistency, increasing the power of long-term analysis (Connelly et al. 2016). Other nations can learn from the Swedish example by paying particular attention to the use of unique patient identifiers that can map onto a number of datasets collected by government and academia, an idea that was first mentioned in the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) but has not yet been implemented (Davis 2019).

China has recently become a leader in the implementation and development of new digital technologies, and it has begun to approach healthcare with an emphasis on data standardization and volume. The central government has already initiated several funding initiatives aimed at pushing Big Data into healthcare use cases, with a particular eye on linking together administrative data, regional claims data from the national health insurance program, and electronic medical records (Zhang et al. 2018). China hopes to do this by leveraging its existing personal identification system that covers all Chinese nationals, similar to the Swedish model of maintaining a variety of regional and national registries linked by personal identification numbers. This is particularly relevant to cancer research, as China has established a new cancer registry (National Central Cancer Registry of China) that will take advantage of the nation's population size to give unique insight into otherwise rare oncogenesis. Major concerns regarding this initiative are data quality and time. China has only relatively recently adopted the International Classification of Diseases (ICD) revision ten coding system, a standardized method for recording disease states alongside prescribed treatments, and it is still implementing standardized record-keeping terminologies at the regional level. This creates considerable heterogeneity in data quality, as well as a lack of interoperability between regions, a major obstacle in any national registry effort (Zhang et al. 2018). The recency of these efforts also means that some time is required until researchers can take advantage of longitudinal analysis, which is vital for oncology research that aims to spot recurrences or track patient survival. In the future we can expect significant findings to come out of China's efforts to make hundreds of millions of patient files available to researchers, but significant obstacles in standards of care and interoperability must first be overcome.

The large variety of “Big Data” research projects being undertaken around the world are proposing different approaches to the future of patient records. The UK is broadly leveraging the centralization of the NHS to link genomic data with clinical care records, and opening up the disease endpoints to researchers through a patient ID. Sweden and China are also adopting this model—leveraging unique identity numbers issued to citizens to link otherwise disconnected datasets from administrative and healthcare records (Connelly et al. 2016 ; Cnudde et al. 2016 ; Zhang et al. 2018 ). In this way, tests, technologies and methods will be integrated in a way that is specific to the patient but not necessarily to the hospital or clinic. This allows for significant flexibility in the seamless transfer of information between sites and for physicians to take full advantage of all the data generated. The US’ “All of Us” program is similar in integrating a variety of patient records into a single-patient file that is stored in the cloud (Denny et al. 2019 ). However, it does not significantly link to public administrative data sources, and thus is limited in its usefulness for long-term analysis of the effects of social contributors to cancer progression and risk. This foretells greater problems with the current ecosystem of clinical data—where lack of integration, misguided design, and ambiguous data ownership make research and clinical care more difficult rather than easier.

Survey of problems in clinical data use

Fragmentation

Fragmentation is the primary problem that needs to be addressed if EHRs have any hope of being used in any serious clinical capacity. Fragmentation arises when EHRs are unable to communicate effectively between each other—effectively locking patient information into a proprietary system. While there are major players in the US EHR space such as Epic and General Electric, there are also dozens of minor and niche companies that also produce their own products—many of which are not able to communicate effectively or easily with one another (DeMartino and Larsen 2013 ). The Clinical Oncology Requirements for the EHR and the National Community Cancer Centers Program have both spoken out about the need for interoperability requirements for EHRs and even published guidelines (Miller 2011 ). In addition, the Certification Commission for Health Information Technology was created to issue guidelines and standards for interoperability of EHRs (Miller 2011 ). Fast Healthcare Interoperability Resources (FHIR) is the current new standard for data exchange for healthcare published by Health Level 7 (HL7). It builds upon past standards from both HL7 and a variety of other standards such as the Reference Information Model. FHIR offers new principles on which data sharing can take place through RESTful APIs—and projects such as Argonaut are working to expand adoption to EHRs (Chambers et al. 2019 ). Even with the introduction of the HL7 Ambulatory Oncology EHR Functional Profile, EHRs have not improved and have actually become pain points for clinicians as they struggle to integrate the diagnostics from separate labs or hospitals, and can even leave physicians in the dark about clinical history if the patient has moved providers (Reisman 2017 ; Blobel 2018 ). Even in integrated care providers such as Kaiser Permanente there are interoperability issues that make EHRs unpopular among clinicians as they struggle to receive outside test results or the narratives of patients who have recently moved (Leonard and Tozzi 2012 ).
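As a hedged illustration of what FHIR's RESTful exchange looks like in practice, the sketch below fetches one Patient resource; the server URL and patient ID are hypothetical placeholders, not a real endpoint.

```python
# Minimal FHIR client sketch: fetch one Patient resource as JSON over REST.
# The base URL and patient ID are hypothetical placeholders.
import requests

BASE = "https://fhir.example-hospital.org"  # hypothetical FHIR server

resp = requests.get(
    f"{BASE}/Patient/12345",
    headers={"Accept": "application/fhir+json"},
    timeout=10,
)
resp.raise_for_status()

patient = resp.json()
# FHIR Patient resources carry structured, standardized fields.
print(patient["resourceType"])   # "Patient"
print(patient.get("birthDate"))  # e.g. "1970-01-01"
```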

The UK provides an informative contrast in its NHS, a single government-run enterprise that provides free healthcare at the point of service. Currently, the NHS is able to successfully integrate a variety of health records—a step ahead of the US—but relies on outdated technology with security vulnerabilities such as fax machines (Macaulay 2016 ). The NHS has recently also begun the process of digitizing its health service, with separate NHS Trusts adopting American EHR solutions, such as the Cambridgeshire NHS trust’s recent agreement with Epic (Honeyman et al. 2016 ). However, the NHS still lags behind the US in broad use and uptake across all of its services (Wallace 2016 ). Furthermore, it will need to force the variety of EHRs being adopted to conform to centralized standards and interoperability requirements that allow services as far afield as genome sequencing to be added to a patient record.

Misguided EHR design

Another issue often identified with the modern incarnation of EHRs is that they are often not helpful for doctors in diagnosis—and have been identified by leading clinicians as a hindrance to patient care (Lenzer 2017 ; Gawande 2018 ). A common denominator among the current generation of EHRs is their focus on billing codes, a set of numbers assigned to every task, service, and drug dispensed by a healthcare professional that is used to determine the level of reimbursement the provider will receive. This focus on billing codes is a necessity of the insurance system in the US, which reimburses providers on a service-rendered basis (Essin 2012 ; Lenzer 2017 ). Due to the need for every part of the care process to be billed to insurers (of which there are many) and sometimes to multiple insurers simultaneously, EHRs in the US are designed foremost with insurance needs in mind. As a result, EHRs are hampered by government regulations around billing codes, the requirements of insurance companies, and only then are able to consider the needs of providers or researchers (Bang and Baik 2019 ). And because purchasing decisions for EHRs are not made by physicians, the priority given to patient care outcomes falls behind other needs. The American Medical Association has cited the difficulty of EHRs as a contributing factor in physician burnout and as a waste of valuable time (Lenzer 2017 ; Gardner et al. 2019 ). The NHS, due to its reliance on American manufacturers of EHRs, must suffer through the same problems despite its fundamentally different structure.

Related to the problem of EHRs being optimized for billing, not patient care, is their lack of development beyond repositories of patient information into diagnostic aids. A study of modern day EHR use in the clinic notes many pain points for physicians and healthcare teams (Assis-Hassid et al. 2019 ). Foremost was the variance in EHR use within the clinic—in part because these programs are often not designed with provider workflows in mind (Assis-Hassid et al. 2019 ). In addition, EHRs were found to distract from interpersonal communication and did not integrate the many different types of data being created by nurses, physician assistants, laboratories, and other providers into usable information for physicians (Assis-Hassid et al. 2019 ).

Data ownership

One of the major challenges of current implementations of Big Data is the lack of regulations, incentives, and systems to manage ownership of and responsibility for data. In the clinical space in the US, this takes the form of compliance with HIPAA, a law now more than two decades old that aimed to set rules for patient privacy and control of data (Adibuzzaman et al. 2018). As more types of data are generated for patients and uploaded to electronic platforms, HIPAA becomes a major roadblock to data sharing, as it creates significant privacy concerns that hamper research. Today, a researcher searching for even simple demographic and disease states can rapidly identify an otherwise de-identified patient (Adibuzzaman et al. 2018). Concerns around breaking HIPAA prevent complete and open data sharing agreements, blocking the path to the specificity needed for the next generation of research, and also throw a wrench into the clinical application of these technologies, as data sharing becomes bogged down by ambiguity surrounding old regulations on patient privacy. Furthermore, compliance with the General Data Protection Regulation (GDPR) in the EU has hampered international collaborations, as compliance with both HIPAA and GDPR is not yet standardized (Rabesandratana 2019).

Data sharing is further complicated by the need to develop new technologies to integrate across a variety of providers. As the example of the Informatics for Integrating Biology and the Bedside (i2b2) program, funded by the NIH with Partners Healthcare, shows, it is difficult and enormously expensive to overlay programs on top of existing EHRs (Adibuzzaman et al. 2018). Rather, a new approach needs to be developed to address the problem of data sharing. Blockchain provides an innovative approach, and has recently been explored in the literature as a solution that centers patient control of their data and promotes safe and secure data sharing through data transfer transactions secured by encryption (Gordon and Catalini 2018). Companies exploring this mechanism for data sharing include Nebula Genomics, a firm founded by George Church, which aims to secure genomic data in blockchain in a way that scales commercially and can be used for research purposes only with permission from the data owners, the patients themselves. Other firms, such as Doc.Ai, are exploring the use of a variety of data types stored in blockchain to create predictive models of disease, but all are centrally based on the idea of a blockchain to secure patient data and ensure private, accurate transfer between sites (Agbo et al. 2019). The advantages of blockchain for healthcare data transfer and storage lie in its security and privacy, but the approach has yet to gain widespread use.
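As a toy illustration of the underlying idea (and not any particular company's protocol), the sketch below chains data-sharing events together with hashes so that tampering is detectable.

```python
# Toy blockchain sketch: each data-sharing event is chained to the previous
# one by a hash, so later tampering is detectable. Illustrative only; real
# systems add consensus, signatures, and encryption on top.
import hashlib
import json

def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "event": "genesis", "prev": "0" * 64}]

def add_event(event: str) -> None:
    prev = chain[-1]
    chain.append({"index": prev["index"] + 1, "event": event,
                  "prev": block_hash(prev)})

add_event("patient_7 granted read access to lab_A")
add_event("lab_A transferred genome file to researcher_B")

# Verification: recompute every link; a single edited record breaks the chain.
ok = all(chain[i]["prev"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))
print("chain valid:", ok)  # True
```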

Recommendations for clinical application

Design a new generation of EHRs

It is conceivable that physicians in the near future will be faced with terabytes of data—patients coming to their clinics with years of continuous data monitoring their heart rate, blood sugar, and a variety of other factors (Topol 2019a ). Gaining clinical insight from such a large quantity of data is an impossible expectation to place upon physicians. In order to solve this problem of the exploding numbers of tests, assays, and results, EHRs will need to be extended from simply being records of patient–physician interactions and digital folders, to being diagnostic aids (Fig. 1 ). Companies such as Roche–Flatiron are already moving towards this model by building predictive and analytical tools into their EHRs when they provide them to providers. However, broader adoption across a variety of providers—and the transparency and portability of the models generated will also be vital. AI-based clinical decision-making support will need to be auditable in order to avoid racial bias, and other potential pitfalls (Char et al. 2018 ). Patients will soon request to have permanent access to the models and predictions being generated by ML models to gain greater clarity into how clinical decisions were made, and to guard against malpractice.

Figure 1: In this example we demonstrate how many possible factors may come together to better target patients for early screening measures, which can lower aggregate costs for the healthcare system.

Designing this next generation of EHRs will require collaboration between physicians, patients, providers, and insurers in order to ensure ease of use and efficacy. In terms of specific recommendations for the NHS, the Veterans Administration provides a fruitful approach as it was able to develop its own EHR that compares extremely favorably with the privately produced Epic EHR (Garber et al. 2014 ). Its solution was open access, public-domain, and won the loyalty of physicians in improving patient care (Garber et al. 2014 ). However, the VA’s solution was not actively adopted due to lack of support for continuous maintenance and limited support for billing (Garber et al. 2014 ). While the NHS does not need to consider the insurance industry’s input, it does need to take note that private EHRs were able to gain market prominence in part because they provided a hand to hold for providers, and were far more responsive to personalized concerns raised (Garber et al. 2014 ). Evidence from Denmark suggests that EHR implementation in the UK would benefit from private competitors implementing solutions at the regional rather than national level in order to balance the need for competition and standardization (Kierkegaard 2013 ).

Develop new EHR workflows

Already, researchers and enterprise are developing predictive models that can better diagnose cancers based on imaging data (Bibault et al. 2016 ). While these products and tools are not yet market ready and are far off from clinical approval—they portend things to come. We envision a future where the job of an Oncologist becomes increasingly interpretive rather than diagnostic. But to get to that future, we will need to train our algorithms much like we train our future doctors—with millions of examples. In order to build this corpus of data, we will need to create a digital infrastructure around Big Data that can both handle the demands of researchers and enterprise as they continuously improve their models—with those of patients and physicians who must continue their important work using existing tools and knowledge. In Fig. 2 , we demonstrate a hypothetical workflow based on models provided by other researchers in the field (Bibault et al. 2016 ; Topol 2019a ). This simplified workflow posits EHRs as an integrative tool that can facilitate the capture of a large variety of data sources and can transform them into a standardized format to be stored in a secure cloud storage facility (Osong et al. 2019 ). Current limitations in HIPAA in the US have prevented innovation in this field, so reform will need to both guarantee the protection of private patient data and the open access to patient histories for the next generation of diagnostic tools. The introduction of accurate predictive models for patient treatment will mean that cancer diagnosis will fundamentally change. We will see the job of oncologists transforming itself as they balance recommendations provided by digital tools that can instantly integrate literature and electronic records from past patients, and their own best clinical judgment.

Figure 2: Here, various heterogeneous data types are fed into a centralized EHR system that will be uploaded to a secure digital cloud where it can be de-identified and used by research and enterprise, but primarily by physicians and patients.

Use a global patient ID

While we are already seeing the fruits of decades of research into ML methods, there is a whole new set of techniques that will soon be leaving research labs and being applied to the clinic. This set of “omics”—often used to refer to proteomics, genomics, metabolomics, and others—will reveal even more specificity about a patient’s cancer at lower cost (Cho 2015 ). However, they like other technologies, will create petabytes of data that will need to be stored and integrated to help physicians.

As the number of tests and healthcare providers diversify—EHRs will need to address the question of extensibility and flexibility. Providers as disparate as counseling offices and MRI imaging centers cannot be expected to use the same software—or even similar software. As specific solutions for diverse providers are created—they will need to interface in a standard format with existing EHRs. The UK Biobank creates a model for these types of interactions in its use of a singular patient ID to link a variety of data types—allowing for extensibility as future iterations and improvements add data sources for the project. Also, Sweden and China are informative examples in their usage of national citizen identification numbers as a method of linking clinical and administrative datasets together (Cnudde et al. 2016 ; Zhang et al. 2018 ). Singular patient identification numbers do not yet exist in the US despite their inclusion in HIPAA due to subsequent Congressional action preventing their creation (Davis 2019 ). Instead private providers have stepped in to bridge the gap, but have also called on the US government to create an official patient ID system (Davis 2019 ). Not only would a singular patient ID allow for researchers to link US administrative data together with clinical outcomes, but also provide a solution to the questions of data ownership and fragmentation that plague the current system.
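A minimal sketch of what a single patient identifier enables, merging otherwise disconnected provider datasets into one profile; all records and the ID scheme are hypothetical.

```python
# Sketch of record linkage via a single patient identifier: three providers'
# datasets, each keyed by the same (hypothetical) national patient ID.
clinic   = {"P-001": {"diagnosis": "T2 diabetes"}}
imaging  = {"P-001": {"mri": "2024-01-12, no lesions"}}
genomics = {"P-001": {"variant": "BRCA2 c.5946delT"}}

def patient_profile(pid: str) -> dict:
    """Merge whatever each system knows about one patient into one profile."""
    profile = {"patient_id": pid}
    for source in (clinic, imaging, genomics):
        profile.update(source.get(pid, {}))
    return profile

print(patient_profile("P-001"))
# {'patient_id': 'P-001', 'diagnosis': ..., 'mri': ..., 'variant': ...}
```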

Healthcare's future will build on the Big Data projects currently being pioneered around the world. The models of data integration being pioneered by the "All of Us" trial and the analytics championed by P4 medicine will come to define the patient experience (Flores et al. 2013). However, in this piece we have demonstrated a series of hurdles that the field must overcome to avoid imposing additional burdens on physicians and to deliver significant value. We recommend a set of proposals built upon an examination of the NHS and other publicly administered healthcare models and the US multi-payer system to bridge the gap between the market competition needed to develop these new technologies and effective patient care.

Access to patient data must be a paramount guiding principle as regulators begin to approach the problem of wrangling the many streams of data that are already being generated. Data must both be accessible to physicians and patients, but must also be secured and de-identified for the benefit of research. A pathway taken by the UK Biobank to guarantee data integration and universal access has been through the creation of a single database and protocol for accessing its contents (Allen et al. 2012 ). It is then feasible to suggest a similar system for the NHS which is already centralized with a single funding source. However, this system will necessarily also be a security concern due to its centralized nature, even if patient data is encrypted (Fig. 3 ). Another approach is to follow in the footsteps of the US’ HIPAA, which suggested the creation of unique patient IDs over 20 years ago. With a single patient identifier, EHRs would then be allowed to communicate with heterogeneous systems especially designed for labs or imaging centers or counseling services and more (Fig. 4 ). However, this design presupposes a standardized format and protocol for communication across a variety of databases—similar to the HL7 standards that already exist (Bender and Sartipi 2013 ). In place of a centralized authority building out a digital infrastructure to house and communicate patient data, mandating protocols and security standards will allow for the development of specialized EHR solutions for an ever diversifying set of healthcare providers and encourage the market needed for continual development and support of these systems. Avoiding data fragmentation as seen already in the US then becomes an exercise in mandating data sharing in law.
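One common building block for the de-identification step is keyed pseudonymization, sketched below with hypothetical data: patient IDs are replaced with an HMAC so records stay linkable for research but cannot be reversed without the key.

```python
# Minimal pseudonymization sketch: replace direct identifiers with a keyed
# hash (HMAC). Records remain linkable across datasets, but recovering the
# original ID requires the secret key. All data are hypothetical.
import hashlib
import hmac

SECRET_KEY = b"store-me-in-a-vault"  # hypothetical; never hard-code in practice

def pseudonymize(patient_id: str) -> str:
    return hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256).hexdigest()

record = {"patient_id": "P-001", "diagnosis": "T2 diabetes"}
safe_record = {"pid": pseudonymize(record["patient_id"]),
               "diagnosis": record["diagnosis"]}
print(safe_record)  # the same patient always maps to the same opaque token
```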

Figure 3: Future implementations of Big Data will need to not only integrate data, but also encrypt and de-identify it for secure storage.

Figure 4: Hypothetical healthcare system design based on unique patient identifiers that function across a variety of systems and providers, linking together disparate datasets into a complete patient profile.

The next problem then becomes the inevitable application of AI to healthcare. Any such tool created will have to stand up to the scrutiny not just of being asked to outclass human diagnoses, but to also reveal its methods. Because of the opacity of ML models, the “black box” effect means that diagnoses cannot be scrutinized or understood by outside observers (Fig. 5 ). This makes clinical use extremely limited, unless further techniques are developed to deconvolute the decision-making process of these models. Until then, we expect that AI models will only provide support for diagnoses.

Figure 5: Without transparency in many of the models being implemented as to why and how decisions are being made, there exists room for algorithmic bias and no room for improvement or criticism by physicians. The "black box" of machine learning obscures why decisions are made and what actually affects predictions.

Furthermore, many times AI models simply replicate biases in existing datasets. Cohn et al. 2017 demonstrated clear areas of deficiency in the minority representation of patients in the UK Biobank. Any research conducted on these datasets will necessarily only be able to create models that generalize to the population in them (a largely homogenous white-British group) (Fig. 6 ). In order to protect against algorithmic bias and the black box of current models hiding their decision-making, regulators must enforce rules that expose the decision-making of future predictive healthcare models to public and physician scrutiny. Similar to the existing FDA regulatory framework for medical devices, algorithms too must be put up to regulatory scrutiny to prevent discrimination, while also ensuring transparency of care.
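Auditing for algorithmic bias can start with something as simple as reporting performance per subgroup; below is a minimal sketch with hypothetical labels and predictions.

```python
# Minimal algorithmic-audit sketch: compare a model's accuracy across
# demographic subgroups to surface potential bias. Data are hypothetical.
from collections import defaultdict

# (subgroup, true label, model prediction) for a held-out test set
results = [("A", 1, 1), ("A", 0, 0), ("A", 1, 1),
           ("B", 1, 0), ("B", 0, 0), ("B", 1, 0)]

hits, totals = defaultdict(int), defaultdict(int)
for group, truth, pred in results:
    totals[group] += 1
    hits[group] += int(truth == pred)

for group in sorted(totals):
    print(f"group {group}: accuracy {hits[group] / totals[group]:.2f}")
# A large gap between groups (here 1.00 vs 0.33) is a flag for review.
```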

Figure 6: The "All of Us" study will meet this need by specifically aiming to recruit a diverse pool of participants to develop disease models that generalize to every citizen, not just the majority (Denny et al. 2019). Future global Big Data generation projects should learn from this example in order to guarantee equality of care for all patients.

The future of healthcare will increasingly live on server racks and be built in glass office buildings by teams of programmers. The US must take seriously the benefits of centralized regulations and protocols that have allowed the NHS to be enormously successful in preventing the problem of data fragmentation—while the NHS must approach the possibility of freer markets for healthcare devices and technologies as a necessary condition for entering the next generation of healthcare delivery which will require constant reinvention and improvement to deliver accurate care.

Overall, we are entering a transition in how we think about caring for patients and the role of a physician. Rather than creating a reactive healthcare system that finds cancers once they have advanced to a serious stage—Big Data offers us the opportunity to fine tune screening and prevention protocols to significantly reduce the burden of diseases such as advanced stage cancers and metastasis. This development allows physicians to think more about a patient individually in their treatment plan as they leverage information beyond rough demographic indicators such as genomic sequencing of their tumor. Healthcare is not yet prepared for this shift, so it is the job of governments around the world to pay attention to how each other have implemented Big Data in healthcare to write the regulatory structure of the future. Ensuring competition, data security, and algorithmic transparency will be the hallmarks of how we think about guaranteeing better patient care.

Adibuzzaman M, DeLaurentis P, Hill J, Benneyworth BD (2018) Big data in healthcare—the promises, challenges and opportunities from a research perspective: a case study with a model database. AMIA Annu Symp Proc 2017:384–392

Agbo CC, Mahmoud QH, Eklund JM (2019) Blockchain technology in healthcare: a systematic review. Healthcare 7:56

Aguet F, Brown AA, Castel SE, Davis JR, He Y, Jo B et al. (2017) Genetic effects on gene expression across human tissues. Nature 550:204–213

Akbarian S, Liu C, Knowles JA, Vaccarino FM, Farnham PJ, Crawford GE et al. (2015) The PsychENCODE project. Nat Neurosci 18:1707–1712

Allen N, Sudlow C, Downey P, Peakman T, Danesh J, Elliott P et al. (2012) UK Biobank: current status and what it means for epidemiology. Health Policy Technol 1:123–126

Assis-Hassid S, Grosz BJ, Zimlichman E, Rozenblum R, Bates DW (2019) Assessing EHR use during hospital morning rounds: a multi-faceted study. PLoS ONE 14:e0212816

Bang CS, Baik GH (2019) Using big data to see the forest and the trees: endoscopic submucosal dissection of early gastric cancer in Korea. Korean J Intern Med 34:772–774

Bender D, Sartipi K (2013) HL7 FHIR: an agile and RESTful approach to healthcare information exchange. In Proceedings of the 26th IEEE International Symposium on Computer-Based Medical Systems, IEEE. pp 326–331

Bibault J-E, Giraud P, Burgun A (2016) Big Data and machine learning in radiation oncology: state of the art and future prospects. Cancer Lett 382:110–117

Blobel B (2018) Interoperable EHR systems—challenges, standards and solutions. Eur J Biomed Inf 14:10–19

Camacho DM, Collins KM, Powers RK, Costello JC, Collins JJ (2018) Next-generation machine learning for biological networks. Cell 173:1581–1592

Campbell PJ, Getz G, Stuart JM, Korbel JO, Stein LD (2020) Pan-cancer analysis of whole genomes. Nature https://www.nature.com/articles/s41586-020-1969-6

Chambers DA, Amir E, Saleh RR, Rodin D, Keating NL, Osterman TJ, Chen JL (2019) The impact of Big Data research on practice, policy, and cancer care. Am Soc Clin Oncol Educ Book Am Soc Clin Oncol Annu Meet 39:e167–e175

Char DS, Shah NH, Magnus D (2018) Implementing machine learning in health care—addressing ethical challenges. N Engl J Med 378:981–983

Cho WC (2015) Big Data for cancer research. Clin Med Insights Oncol 9:135–136

Cnudde P, Rolfson O, Nemes S, Kärrholm J, Rehnberg C, Rogmark C, Timperley J, Garellick G (2016) Linking Swedish health data registers to establish a research database and a shared decision-making tool in hip replacement. BMC Musculoskelet Disord 17:414

Cohn EG, Hamilton N, Larson EL, Williams JK (2017) Self-reported race and ethnicity of US biobank participants compared to the US Census. J Community Genet 8:229–238

Connelly R, Playford CJ, Gayle V, Dibben C (2016) The role of administrative data in the big data revolution in social science research. Soc Sci Res 59:1–12

Davis J (2019) National patient identifier HIPAA provision removed in proposed bill. HealthITSecurity https://healthitsecurity.com/news/national-patient-identifier-hipaa-provision-removed-in-proposed-bill

DeMartino JK, Larsen JK (2013) Data needs in oncology: “Making Sense of The Big Data Soup”. J Natl Compr Canc Netw 11:S1–S12

Deng J, El Naqa I, Xing L (2018) Editorial: machine learning with radiation oncology big data. Front Oncol 8:416

Denny JC, Rutter JL, Goldstein DB, Philippakis Anthony, Smoller JW, Jenkins G et al. (2019) The “All of Us” research program. N Engl J Med 381:668–676

Elliott LT, Sharp K, Alfaro-Almagro F, Shi S, Miller KL, Douaud G et al. (2018) Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562:210–216

Essin D (2012) Improve EHR systems by rethinking medical billing. Physicians Pract. https://www.physicianspractice.com/ehr/improve-ehr-systems-rethinking-medical-billing

Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K et al. (2019) A guide to deep learning in healthcare. Nat Med 25:24–29

Fessele KL (2018) The rise of Big Data in oncology. Semin Oncol Nurs 34:168–176

Flores M, Glusman G, Brogaard K, Price ND, Hood L (2013) P4 medicine: how systems medicine will transform the healthcare sector and society. Pers Med 10:565–576

Garber S, Gates SM, Keeler EB, Vaiana ME, Mulcahy AW, Lau C et al. (2014) Redirecting innovation in U.S. Health Care: options to decrease spending and increase value: Case Studies 133

Gardner RL, Cooper E, Haskell J, Harris DA, Poplau S, Kroth PJ et al. (2019) Physician stress and burnout: the impact of health information technology. J Am Med Inf Assoc 26:106–114

Gawande A (2018) Why doctors hate their computers. The New Yorker , 12 https://www.newyorker.com/magazine/2018/11/12/why-doctors-hate-their-computers

Gordon WJ, Catalini C (2018) Blockchain technology for healthcare: facilitating the transition to patient-driven interoperability. Comput Struct Biotechnol J 16:224–230

Hasin Y, Seldin M, Lusis A (2017) Multi-omics approaches to disease. Genome Biol 18:83

Honeyman M, Dunn P, McKenna H (2016) A Digital NHS. An introduction to the digital agenda and plans for implementation https://www.kingsfund.org.uk/sites/default/files/field/field_publication_file/A_digital_NHS_Kings_Fund_Sep_2016.pdf

Kierkegaard P (2013) eHealth in Denmark: A Case Study. J Med Syst 37

Krumholz HM (2014) Big Data and new knowledge in medicine: the thinking, training, and tools needed for a learning health system. Health Aff 33:1163–1170

Lenzer J (2017) Commentary: the real problem is that electronic health records focus too much on billing. BMJ 356:j326

Leonard D, Tozzi J (2012) Why don’t more hospitals use electronic health records. Bloom Bus Week

Macaulay T (2016) Progress towards a paperless NHS. BMJ 355:i4448

Madhavan S, Subramaniam S, Brown TD, Chen JL (2018) Art and challenges of precision medicine: interpreting and integrating genomic data into clinical practice. Am Soc Clin Oncol Educ Book Am Soc Clin Oncol Annu Meet 38:546–553

Marx V (2015) The DNA of a nation. Nature 524:503–505

Miller RS (2011) Electronic health record certification in oncology: role of the certification commission for health information technology. J Oncol Pr 7:209–213

Norgeot B, Glicksberg BS, Butte AJ (2019) A call for deep-learning healthcare. Nat Med 25:14–15

O’Brien R, Potter-Collins A (2015) 2011 Census analysis: ethnicity and religion of the non-UK born population in England and Wales: 2011. Office for National Statistics. https://www.ons.gov.uk/peoplepopulationandcommunity/culturalidentity/ethnicity/articles/2011censusanalysisethnicityandreligionofthenonukbornpopulationinenglandandwales/2015-06-18

Osong AB, Dekker A, van Soest J (2019) Big data for better cancer care. Br J Hosp Med Lond Engl 2005 80:304–305

Rabesandratana T (2019) European data law is impeding studies on diabetes and Alzheimer’s, researchers warn. Sci AAAS. https://doi.org/10.1126/science.aba2926

Raghupathi W, Raghupathi V (2014) Big data analytics in healthcare: promise and potential. Health Inf Sci Syst 2:3

Reisman M (2017) EHRs: the challenge of making electronic data usable and interoperable. Pharm Ther 42:572–575

Shendure J, Ji H (2008) Next-generation DNA sequencing. Nature Biotechnology 26:1135–1145

Stephens ZD, Lee SY, Faghri F, Campbell RH, Zhai C, Efron MJ et al. (2015) Big Data: astronomical or genomical? PLOS Biol 13:e1002195

Tomczak K, Czerwińska P, Wiznerowicz M (2015) The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol 19:A68–A77

Topol E (2019a) High-performance medicine: the convergence of human and artificial intelligence. Nat Med 25:44

Topol E (2019b) The topol review: preparing the healthcare workforce to deliver the digital future. Health Education England https://topol.hee.nhs.uk/

Turnbull C, Scott RH, Thomas E, Jones L, Murugaesu N, Pretty FB, Halai D, Baple E, Craig C, Hamblin A, et al. (2018) The 100 000 Genomes Project: bringing whole genome sequencing to the NHS. BMJ 361

Wallace WA (2016) Why the US has overtaken the NHS with its EMR. National Health Executive Magazine, pp 32–34 http://www.nationalhealthexecutive.com/Comment/why-the-us-has-overtaken-the-nhs-with-its-emr

Webster PC (2014) Sweden’s health data goldmine. CMAJ Can Med Assoc J 186:E310

Wetterstrand KA (2019) DNA sequencing costs: data from the NHGRI Genome Sequencing Program (GSP). Natl Hum Genome Res Inst. www.genome.gov/sequencingcostsdata , Accessed 2019

Zhang L, Wang H, Li Q, Zhao M-H, Zhan Q-M (2018) Big data and medical research in China. BMJ 360:j5910

Author information

Authors and Affiliations

Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK

Raag Agrawal & Sudhakaran Prabakaran

Department of Biology, Columbia University, 116th and Broadway, New York, NY, 10027, USA

Raag Agrawal

Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India

Sudhakaran Prabakaran

St Edmund’s College, University of Cambridge, Cambridge, CB3 0BN, UK

Corresponding author

Correspondence to Sudhakaran Prabakaran .

Ethics declarations

Conflict of interest

SP is co-founder of Nonexomics.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Associate editor: Frank Hailer

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Agrawal, R., Prabakaran, S. Big data in digital healthcare: lessons learnt and recommendations for general practice. Heredity 124 , 525–534 (2020). https://doi.org/10.1038/s41437-020-0303-2

Received: 28 June 2019
Revised: 25 February 2020
Accepted: 25 February 2020
Published: 05 March 2020
Issue Date: April 2020
DOI: https://doi.org/10.1038/s41437-020-0303-2

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

This article is cited by

Lightweight federated learning for stis/hiv prediction.

  • Thi Phuoc Van Nguyen
  • Wencheng Yang

Scientific Reports (2024)

An open source knowledge graph ecosystem for the life sciences

  • Tiffany J. Callahan
  • Ignacio J. Tripodi
  • Lawrence E. Hunter

Scientific Data (2024)

Using machine learning approach for screening metastatic biomarkers in colorectal cancer and predictive modeling with experimental validation

  • Amirhossein Ahmadieh-Yazdi
  • Ali Mahdavinezhad
  • Saeid Afshar

Scientific Reports (2023)

Introducing AI to the molecular tumor board: one direction toward the establishment of precision medicine using large-scale cancer clinical and biological information

  • Ryuji Hamamoto
  • Takafumi Koyama
  • Noboru Yamamoto

Experimental Hematology & Oncology (2022)
