Data Science
Data science is the study of data to extract
meaningful insights for business. It is a multidisciplinary approach that
combines principles and practices from the fields of mathematics, statistics,
artificial intelligence, and computer engineering to analyze large amounts of
data. This analysis helps data scientists to ask and answer questions like what
happened, why it happened, what will happen, and what can be done with the
results.
Why is data science important?
Data science is important because it combines tools, methods, and technology to generate meaning from data. Modern organizations are inundated with data; there is a proliferation of devices that can automatically collect and store information. Online systems and payment portals capture ever more data in e-commerce, medicine, finance, and every other aspect of human life. We have text, audio, video, and image data available in vast quantities.
History of data science
While the term data
science is not new, the meanings and connotations have changed over time. The
term first appeared in the ’60s as an alternative name for statistics. In the
late ’90s, computer science professionals formalized the term. A proposed
definition for data science saw it as a separate
field with three aspects: data design, collection, and analysis. It still took
another decade for the term to be used outside of academia.
Uses of Data Science:
1. Descriptive Analysis: Descriptive
analysis examines data to gain insights into what happened or what is
happening in the data environment. It is characterized by data
visualizations such as pie charts, bar charts, line graphs, tables, or
generated narratives.
· For example, a flight booking service may record data like the number of tickets booked each day. Descriptive analysis will reveal booking spikes, booking slumps, and high-performing months for this service.
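A descriptive summary like the one above can be sketched in a few lines of Python. The booking records here are made-up illustrative data, not figures from the text:

```python
from collections import Counter

# Hypothetical daily booking records: (month, tickets_booked)
daily_bookings = [
    ("Jan", 120), ("Jan", 95), ("Feb", 180),
    ("Feb", 210), ("Mar", 90), ("Mar", 85),
]

# Aggregate tickets per month -- a simple descriptive summary
monthly_totals = Counter()
for month, tickets in daily_bookings:
    monthly_totals[month] += tickets

# Identify the high-performing month (the booking spike)
best_month, best_total = monthly_totals.most_common(1)[0]
print(best_month, best_total)  # Feb 390
```

In practice the same aggregation would feed a bar chart or line graph rather than a print statement.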
2. Diagnostic Analysis: Diagnostic analysis is a deep-dive or detailed
data examination to understand why something happened. It is
characterized by techniques such as drill-down, data discovery, data mining,
and correlations. Multiple data operations and transformations may be performed
on a given data set to discover unique patterns in each of these techniques.
· For example, the flight service might drill down on a particularly high-performing month to better understand the booking spike. This may lead to the discovery that many customers visit a particular city to attend a monthly sporting event.
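The drill-down described above can be sketched as a simple group-by on the spike month's records. The city names and ticket counts are hypothetical:

```python
# Hypothetical booking records for the spike month: (destination, tickets)
feb_bookings = [
    ("Rio", 300), ("Oslo", 40), ("Rio", 280), ("Lima", 55),
]

# Drill down: group the spike month's bookings by destination
by_city = {}
for city, tickets in feb_bookings:
    by_city[city] = by_city.get(city, 0) + tickets

# The dominant destination hints at an underlying cause,
# e.g. a recurring event in that city
top_city = max(by_city, key=by_city.get)
print(top_city, by_city[top_city])  # Rio 580
```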
3. Predictive Analysis: Predictive analysis uses historical data to make accurate forecasts about data patterns that may occur in the future. It is characterized by techniques such as machine learning, forecasting, pattern matching, and predictive modeling. In each of these techniques, computers are trained to uncover relationships in historical data and use them to project future outcomes.
· For example, the flight service team might use data science to predict flight booking patterns for the coming year at the start of each year. The computer program or algorithm may look at past data and predict booking spikes for certain destinations in May. Having anticipated their customers' future travel requirements, the company could start targeted advertising for those destinations.
4. Prescriptive Analysis: Prescriptive analysis takes predictive data to the next level. It not only predicts what is likely to happen but also suggests an optimum response to that outcome. It can analyze the potential implications of different choices and recommend the best course of action. It uses graph analysis, simulation, complex event processing, neural networks, and recommendation engines from machine learning.
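The step from prediction to prescription can be illustrated with a rule-based sketch: given a forecast, recommend an action instead of just reporting a number. The function name, thresholds, and actions below are all illustrative assumptions, far simpler than the optimization and simulation techniques listed above:

```python
# A minimal rule-based sketch: map a predicted demand level for a
# route to a recommended action (thresholds are illustrative)
def recommend_action(predicted_bookings, seat_capacity):
    """Suggest a response to a forecast rather than just reporting it."""
    load = predicted_bookings / seat_capacity
    if load > 1.0:
        return "add extra flights"
    if load > 0.8:
        return "raise fares on peak dates"
    if load < 0.4:
        return "run a promotional discount"
    return "no change"

print(recommend_action(550, 500))   # demand exceeds capacity
print(recommend_action(150, 500))   # weak demand
```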
Data science process:
O – Obtain data: Data can be pre-existing, newly acquired, or a data repository downloadable from the internet. Data scientists can extract data from internal or external databases, company CRM software, web server logs, or social media, or purchase it from trusted third-party sources.
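A minimal "obtain" sketch: loading records from a CSV export. In practice the source could be a database, CRM export, or server log; the inline file content here is purely illustrative:

```python
import csv
import io

# Inline CSV stands in for a downloaded or exported data file
raw_csv = """date,destination,tickets
2024-02-01,Rio,300
2024-02-02,Oslo,40
"""

# Parse each row into a dict keyed by the header names
rows = list(csv.DictReader(io.StringIO(raw_csv)))
print(len(rows), rows[0]["destination"])  # 2 Rio
```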
S – Scrub data: Data scrubbing, or data cleaning, is the process of standardizing the data according to a predetermined format. It includes handling missing data, fixing data errors, and removing any data outliers. Some examples of data scrubbing are:
§ Changing all date values to a common standard format.
§ Fixing spelling mistakes or additional spaces.
§ Fixing mathematical inaccuracies or removing commas from large numbers.
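Each of the scrubbing examples above can be sketched with the standard library. The function names and accepted date formats are illustrative assumptions:

```python
import re
from datetime import datetime

def to_iso_date(raw):
    """Normalize several date spellings to a common YYYY-MM-DD format."""
    for fmt in ("%d/%m/%Y", "%B %d, %Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None  # flag unparseable dates for manual review

def clean_text(raw):
    """Collapse additional spaces (spelling fixes would use a lookup table)."""
    return re.sub(r"\s+", " ", raw).strip()

def parse_number(raw):
    """Remove thousands separators so '1,234,567' becomes an integer."""
    return int(raw.replace(",", ""))

print(to_iso_date("March 5, 2024"))   # 2024-03-05
print(clean_text("  New   York "))    # New York
print(parse_number("1,234,567"))      # 1234567
```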
E – Explore data: Data exploration is preliminary data analysis that is used for planning further data modeling strategies. Data scientists gain an initial understanding of the data using descriptive statistics and data visualization tools. Then they explore the data to identify interesting patterns that warrant further study or action.
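The descriptive statistics mentioned above can flag where to look next, for example by scanning for unusual values. The daily ticket counts here are hypothetical:

```python
from statistics import mean, stdev

# Hypothetical daily ticket counts for initial exploration
tickets = [110, 95, 120, 400, 105, 98, 115]

avg = mean(tickets)
spread = stdev(tickets)

# A quick scan for unusual days worth a closer look:
# values more than 2 standard deviations from the mean
outliers = [t for t in tickets if abs(t - avg) > 2 * spread]
print(round(avg, 1), outliers)  # 149.0 [400]
```

A day like the 400-ticket spike is exactly the kind of pattern the exploration step surfaces for deeper analysis.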
M – Model data: Software and machine learning algorithms are used to gain deeper insights, predict outcomes, and prescribe the best course of action. Machine learning techniques like association, classification, and clustering are applied to the training data set. The model might be tested against predetermined test data to assess result accuracy. The data model can be fine-tuned many times to improve result outcomes.
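The fit-then-test cycle described above can be sketched without external libraries by fitting a least-squares line to training data and checking it against held-out test points. All numbers are illustrative, and a real project would use a library such as scikit-learn:

```python
# Training data: e.g. month index vs. bookings (illustrative, linear)
train_x = [1, 2, 3, 4]
train_y = [100, 120, 140, 160]

# Ordinary least squares for a single feature
n = len(train_x)
mx = sum(train_x) / n
my = sum(train_y) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
         / sum((x - mx) ** 2 for x in train_x))
intercept = my - slope * mx

def predict(x):
    return slope * x + intercept

# Assess accuracy on predetermined held-out test data, as the step describes
test_x, test_y = [5, 6], [180, 200]
error = max(abs(predict(x) - y) for x, y in zip(test_x, test_y))
print(slope, intercept, error)
```

Fine-tuning, in this toy setting, would mean adjusting the model form or data until the test error is acceptable.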
N – Interpret results: Data scientists work together with analysts and businesses to convert data insights into action. They make diagrams, graphs, and charts to represent trends and predictions. Data summarization helps stakeholders understand and implement results effectively.