What data analysis methods will you use?

Introduction: This article introduces the core concepts and methods of data analysis. Using a clear "what, why, and how" structure, the author walks through one method after another, giving data analysis beginners a path from entry-level to more advanced knowledge.

A student asked: "Mr. Chen, in every interview I am asked, 'What data analysis methods have you used?' and I can never answer. What methods are there for data analysis? Why do I feel like I have no method at all when I do data analysis?" Here is a systematic answer.

First of all, not every method with the word "analysis" in its name is a data analysis method. Many "XX analysis" techniques are specialized tools from statistics, operations research, and mathematics, and they do not lead directly to answers to business problems. When people ask "What analysis methods are there?", they expect to hear about methods that produce a conclusion.

So to answer this question well, we need to go back to a more basic one: what business problems does data analysis solve?

From the perspective of business use, data analysis can solve five major types of problems:

  1. How much is it (describing the situation with data)
  2. What is (establishing standards for judging the data)
  3. Why (exploring the cause of a problem)
  4. What will happen (forecasting business trends)
  5. So what (judging the situation as a whole)

Each problem type has its own combination of methods, described in the sections below.

1. Solutions to the “How Much Is It” Problem

To describe a situation with data, you need a complete system of data indicators, and to build one you need to sort out the logic that links the indicators. There are two basic kinds of logic between indicators: serial and parallel. They give rise to two basic analysis methods: funnel analysis (serial) and indicator decomposition (parallel).
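As a minimal sketch of these two methods, the Python below computes step-by-step and overall conversion for a funnel (serial logic) and each channel's share of a total (parallel logic). The step names, channel names, and all numbers are made up for illustration and do not come from the article.

```python
# Hypothetical step counts for a purchase funnel (serial logic).
funnel = [
    ("visit", 10000),
    ("view_product", 4000),
    ("add_to_cart", 1000),
    ("pay", 250),
]

def funnel_conversion(steps):
    """Return (step name, step-to-step rate, overall rate) for every step after the first."""
    first_count = steps[0][1]
    rows = []
    for (_, prev_count), (name, count) in zip(steps, steps[1:]):
        rows.append((name, count / prev_count, count / first_count))
    return rows

for name, step_rate, overall_rate in funnel_conversion(funnel):
    print(f"{name}: step {step_rate:.1%}, overall {overall_rate:.1%}")

# Indicator decomposition (parallel logic): total revenue is the sum of its channels.
# Channel names and amounts are assumptions for illustration.
revenue_by_channel = {"app": 60000.0, "web": 30000.0, "store": 10000.0}
total_revenue = sum(revenue_by_channel.values())
share = {channel: rev / total_revenue for channel, rev in revenue_by_channel.items()}
print(share)
```

The funnel shows where users drop off between consecutive steps, while the decomposition shows which part contributes how much to the whole.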

As more businesses are broken down this way, people find that certain data indicators work well in fixed combinations, such as:

  • User operation scenarios: AARRR indicator, RFM indicator
  • Retail store scenario: people, goods, and venue indicators
  • Commodity management scenarios: purchase, sales, and inventory indicators

These are also commonly referred to as analytical models. But please note that they only display data. An analytical conclusion requires data plus judgment criteria, and working out the judgment criteria is the "what is" problem.

2. Solutions to the “What is” Problem

The judgment criteria can be very simple, such as the leader's instructions, KPI requirements, or past data. These are collectively referred to as simple criteria. In many cases, however, there is no clear KPI against which to decide whether an indicator's trend is normal. Or the KPI is met but the trend looks strange, and leaders still think there is a problem. In these cases other reference points are needed, and a series of analysis methods has grown out of that need.

For example:

  • Comparison with the business's own patterns to judge whether it is good or bad: life cycle method, natural cycle method
  • Comparison with similar businesses that started at the same time: cohort analysis
  • Comparison with other business entities: stratified analysis

With comparisons like these, a good-or-bad judgment is possible even with a single data indicator. If a business departs from its past pattern and is clearly worse than its peers, it can be judged as bad.
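One way cohort analysis could look in practice is to group users by the month they signed up and compare how many remain active in later months. This is a rough sketch only: the event layout (user, signup month, active month) and the sample data are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_month, active_month),
# with months given as integer indexes.
events = [
    ("u1", 1, 1), ("u2", 1, 1), ("u3", 1, 1), ("u4", 1, 1),
    ("u1", 1, 2), ("u2", 1, 2),
    ("u1", 1, 3),
    ("u5", 2, 2), ("u6", 2, 2),
    ("u5", 2, 3),
]

def cohort_retention(log):
    """Map (signup_month, months_since_signup) to the share of that cohort still active."""
    cohort_users = defaultdict(set)
    active_users = defaultdict(set)
    for user, signup, month in log:
        cohort_users[signup].add(user)
        active_users[(signup, month - signup)].add(user)
    return {
        (cohort, offset): len(users) / len(cohort_users[cohort])
        for (cohort, offset), users in active_users.items()
    }

retention = cohort_retention(events)
for (cohort, offset), rate in sorted(retention.items()):
    print(f"cohort {cohort}, month +{offset}: {rate:.0%}")
```

Reading the cohorts at the same offset (for example month +1) compares them at the same stage of development rather than at the same calendar date.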

Of course, you can also use two indicators, as in the classic matrix analysis method. It crosses two indicators, splits each one at its average, and so divides businesses into four categories that support a good-or-bad judgment.
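A minimal sketch of matrix analysis, assuming the two indicators are sales and gross margin rate for a handful of stores. The store names, values, and quadrant labels are invented for illustration.

```python
from statistics import mean

# Hypothetical stores: name -> (sales, gross margin rate).
stores = {
    "A": (120.0, 0.30),
    "B": (80.0, 0.35),
    "C": (150.0, 0.15),
    "D": (50.0, 0.10),
}

def matrix_classify(items):
    """Split items into four quadrants using the average of each indicator as the cut line."""
    avg_sales = mean(sales for sales, _ in items.values())
    avg_margin = mean(margin for _, margin in items.values())
    labels = {}
    for name, (sales, margin) in items.items():
        sales_part = "high sales" if sales >= avg_sales else "low sales"
        margin_part = "high margin" if margin >= avg_margin else "low margin"
        labels[name] = f"{sales_part}, {margin_part}"
    return labels

quadrants = matrix_classify(stores)
for name, label in quadrants.items():
    print(name, "->", label)
```

Using the average as the cut line follows the text; a median or a business target could be substituted if the averages are distorted by outliers.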

It is also possible to use more indicators, for example with the commonly used K-means clustering. You can first cluster business individuals on multiple indicators, and then look at the performance of each cluster.
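To make the clustering idea concrete, here is a small pure-Python sketch of the standard K-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its members). Starting the centroids at the first k points is a simplification chosen to keep the example deterministic, and the customer data is hypothetical. In practice the indicators would usually be standardized first and a library implementation would be used.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain K-means; centroids start at the first k points (a simplifying assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return assignment, centroids

# Hypothetical customers: (orders per month, average order value).
customers = [(1.0, 20.0), (9.0, 80.0), (2.0, 25.0), (10.0, 90.0), (1.5, 22.0), (8.0, 85.0)]
assignment, centroids = kmeans(customers, k=2)
print(assignment)
print(centroids)
```

After clustering, the "look at the performance of each type" step is a matter of comparing the centroids or summarizing other indicators per cluster.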

All of the above methods can distinguish good and bad businesses, thereby assisting in judgment to a certain extent.

3. Solutions to the “Why” Question

"Analyze the reasons for this problem..." is a common request. This is the "why" question. There are two basic ideas to solve the why question:

1. Result inference

Common methods include:

  • Structural analysis method: find the problem point through structural analysis
  • Tag analysis method: by tagging and comparing individuals, find the cause of the problem
  • Correlation analysis method: by calculating the correlation between indicators, find relevant indicators and then form hypotheses
  • MECE method: discuss multiple business assumptions, combine them into analysis logic according to the MECE principle, and verify them one by one

Result inference means that after a problem occurs, you use various data to look for differences and build hypotheses. It turns a business claim like "I think XX is the reason" into a hypothesis that data can verify, so it applies very widely. However, result inference only reasons backward from outcomes, so it may be biased and still needs experimental verification.
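As an illustration of the correlation analysis step, the sketch below computes the Pearson correlation between a problem indicator and several candidate indicators, then ranks the candidates by strength of association. The indicator names and numbers are hypothetical, and a strong correlation here only suggests a hypothesis to verify, not a cause.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly data: the indicator under investigation and candidate drivers.
sales = [100, 120, 90, 150, 130]
candidates = {
    "ad_spend": [10, 12, 9, 15, 13],
    "temperature": [20, 22, 21, 20, 22],
    "complaints": [5, 4, 6, 2, 3],
}

# Rank candidates by absolute correlation, strongest first.
ranked = sorted(
    ((name, pearson(sales, values)) for name, values in candidates.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```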

2. Experimental inference

These methods are closer to traditional statistical experiments, and most of them require you to:

  1. Conduct data experiments to verify hypotheses
  2. Set up a reference group and an experimental group with similar characteristics
  3. Distinguish controlled variables from environmental variables, and focus on measuring the impact of the controlled variables
  4. State a hypothesis first, and then verify it through experiments or group comparisons

Common methods include A/B testing, DID (difference-in-differences), PSM (propensity score matching), RDD (regression discontinuity design), and uplift modeling.
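As one example of experimental inference, the sketch below evaluates an A/B test on conversion rates with a two-proportion z-test, a common textbook choice rather than anything the article prescribes. The group sizes and conversion counts are hypothetical.

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value

# Hypothetical experiment: reference group A vs experimental group B.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

The test only makes sense if the two groups were formed randomly and are otherwise similar, which is exactly the requirement listed above.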

Experimental inference rests on a statistical basis and involves more complex calculations, so it looks more quantitative. However, it demands a lot from the experimental conditions. It is hard to use, for example, in large all-in promotions that cover the whole business at once, in scenarios such as products and stores where the environment cannot be controlled, and in areas such as salesperson behavior and content dissemination where data is hard to collect.

The ideal is to combine the two in a continuous cycle of facts, hypotheses, and verification that moves closer and closer to the truth. In reality there are many conditions and constraints, so we can often approach from only one angle and get closer to the truth slowly.

4. Solutions to the “What Will Happen” Problem

Prediction is a topic everyone cares about, and it is also where statistics and algorithms are most likely to be useful. The choice of method is limited mainly by two things: how much data is available and whether business personnel take part.

If business personnel insist on participating in the forecasting process, the practical options are the business hypothesis method or the rolling forecast method. These methods list all the parameters that affect the result, which makes it easier for business personnel to apply their own judgment and helps them see clearly how much they need to do.
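A minimal sketch of the business hypothesis idea, assuming sales can be written as traffic times conversion rate times average order value. The formula, the parameter names, and the numbers are assumptions for illustration; a real business would list its own parameters.

```python
def forecast_sales(traffic, conversion_rate, avg_order_value):
    """Forecast sales from explicitly listed business parameters (assumed formula)."""
    return traffic * conversion_rate * avg_order_value

def required_traffic(target_sales, conversion_rate, avg_order_value):
    """Invert the same formula: how much traffic is needed to hit a sales target."""
    return target_sales / (conversion_rate * avg_order_value)

# Hypothetical assumptions agreed with the business side.
base_forecast = forecast_sales(traffic=50000, conversion_rate=0.02, avg_order_value=80.0)
needed_traffic = required_traffic(target_sales=100000, conversion_rate=0.02, avg_order_value=80.0)

print(f"forecast sales: {base_forecast:.0f}")
print(f"traffic needed for the target: {needed_traffic:.0f}")
```

Because every parameter is visible, business personnel can change one assumption at a time and see directly what it means for the work they need to do.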

If business personnel are not involved, the choice depends on the amount of data. With little data, use time series forecasting. With a lot of data, including data on the factors that drive the result, algorithms such as regression models can be used for prediction.
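The two cases can be sketched as follows. A moving average stands in for time series forecasting when only the sales series exists, and a one-variable least-squares line stands in for a regression model when a driver such as ad spend is also recorded. Both are deliberately simple stand-ins, and all data is hypothetical.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Case 1: only a sales series is available (hypothetical monthly sales).
monthly_sales = [100, 110, 120, 130, 140, 150]
next_month = moving_average_forecast(monthly_sales, window=3)

# Case 2: a driver of sales is also recorded (hypothetical ad spend vs sales).
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]
slope, intercept = fit_line(ad_spend, sales)
predicted_sales = slope * 6 + intercept  # prediction for a planned ad spend of 6

print(next_month, slope, intercept, predicted_sales)
```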

5. Solutions to the “So What” Problem

Comprehensive evaluation and allocation issues are collectively referred to as "so what" issues. This is the final step in decision-making, which determines whether to take action on the business and how big the action should be. Some evaluations are simple, such as salespeople signing a do-or-die performance agreement and being fired if they miss their targets.

But in most cases the evaluation is complicated and has to take many aspects into account. The biggest dividing line is whether the leaders' subjective opinions count. If they do, go straight to the subjective scoring method: satisfying the leaders' wish to do the scoring comes first. If they do not, consider supervised machine learning algorithms, or objective methods such as factor analysis and DEA (data envelopment analysis, which measures relative efficiency).
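A minimal sketch of subjective scoring, assuming the leaders supply both the weights and the 0 to 10 scores for each criterion. The criteria, weights, team names, and scores are all invented for illustration.

```python
# Hypothetical weights set by the leaders; they should sum to 1.
weights = {"revenue": 0.5, "growth": 0.3, "satisfaction": 0.2}

# Hypothetical scores (0 to 10) given to each team on each criterion.
scores = {
    "Team A": {"revenue": 8, "growth": 6, "satisfaction": 7},
    "Team B": {"revenue": 6, "growth": 9, "satisfaction": 8},
    "Team C": {"revenue": 9, "growth": 4, "satisfaction": 5},
}

def weighted_score(team_scores, criterion_weights):
    """Weighted sum of one team's scores across all criteria."""
    return sum(criterion_weights[c] * team_scores[c] for c in criterion_weights)

totals = {team: weighted_score(s, weights) for team, s in scores.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for team in ranking:
    print(f"{team}: {totals[team]:.1f}")
```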

As for how much to do and who will do it, it is a more complicated question. If you want to make a good allocation, you must first complete the previous steps of analysis and have a full understanding of the basic capabilities of each business line before making a judgment. Here, the linear programming method can be used as support.
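To show the kind of allocation problem linear programming addresses, the sketch below handles one very simple case: maximize total return subject to a single budget limit and a spending cap per business line. For this special case, filling the highest-return lines first gives the optimal answer, so no solver is needed. Problems with more constraints would need a real linear programming solver. The business lines, return rates, and caps are hypothetical, and returns are assumed to be positive.

```python
def allocate_budget(lines, budget):
    """Allocate budget to maximize total return.

    lines maps a name to (return per unit of budget, maximum spend).
    With one budget constraint and per-line caps, funding lines in
    descending order of unit return is optimal for this linear program.
    """
    allocation = {name: 0.0 for name in lines}
    remaining = budget
    by_return = sorted(lines.items(), key=lambda item: item[1][0], reverse=True)
    for name, (unit_return, cap) in by_return:
        spend = min(cap, remaining)
        allocation[name] = spend
        remaining -= spend
        if remaining <= 0:
            break
    return allocation

# Hypothetical business lines: (return per unit of budget, maximum spend).
business_lines = {
    "online": (1.8, 40.0),
    "stores": (1.5, 50.0),
    "partners": (1.2, 60.0),
}
plan = allocate_budget(business_lines, budget=100.0)
total_return = sum(business_lines[name][0] * spend for name, spend in plan.items())

print(plan)
print(total_return)
```

The unit returns and caps are exactly the "basic capabilities of each business line" that the earlier analysis steps are meant to establish.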

6. Why does it feel like no method is being used?

As can be seen from the above, there are many methods for data analysis. But why do many students feel that they have not used any method? Because each method is closely related to the business scenario, leadership style, and data quality.

For example, causal inference algorithms are mostly based on group testing, but in actual business many causal analyses are done after the fact, with no chance to run a follow-up experiment.

For example, the allocation plans of many companies are simply decided by the leaders, who have the final say and do not give analysts any opportunity to use algorithms.

For example, when it comes to forecasting, many companies simply have not accumulated enough data and have nothing but a single sales series, so at best they can use a time series method.

This gap between ideal and reality leaves many students very distressed. On the one hand, they don't know how to use these methods, and on the other hand, they don't understand how to respond to business needs, which makes both interviews and daily work very difficult.

Author: Down-to-earth Teacher Chen

Source: WeChat public account "Down-to-earth Teacher Chen"
