Don’t just compare year-on-year and month-on-month. These five steps will make your data analysis more in-depth.

How do you make a data analysis project go deeper, and from which angles? This article divides data analysis work into five levels of depth and uses one running example to show how the work is carried out at each level, from clarifying requirements to building the logic, so that readers come away with a clear way of thinking about analysis. It is intended for anyone interested in data analysis, and I hope it helps.

Many students feel that their data analysis lacks depth and that their slides contain nothing but year-on-year and month-on-month comparisons. What should they do? Today I will walk through a specific example to show how to run an in-depth data analysis project.

1. Depth level: Level 0

One day, you receive a request: "Look at the deduplicated number of people who have used the newly added function A in our company's app at least once in the past 5 days." This problem is very simple: run the query and hand over the number, "10,000 distinct users in the past 5 days," and you are done.
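As a minimal sketch of what "running the number" could look like, the snippet below counts deduplicated users in a 5-day window. The log layout, the sample rows, and the function name `distinct_users` are assumptions made for illustration; they are not taken from the original example.

```python
from datetime import date, timedelta

# Hypothetical usage log: one (user_id, usage_date) row per use of function A.
usage_log = [
    ("u1", date(2024, 5, 1)),
    ("u1", date(2024, 5, 1)),   # same user twice on one day, counted once
    ("u2", date(2024, 5, 3)),
    ("u3", date(2024, 4, 20)),  # outside the 5-day window
]


def distinct_users(log, end_day, window_days=5):
    """Count deduplicated users with at least one use in the window ending on end_day."""
    start_day = end_day - timedelta(days=window_days - 1)
    return len({user for user, day in log if start_day <= day <= end_day})


level0 = distinct_users(usage_log, end_day=date(2024, 5, 5))
print(level0)
```

The set comprehension does the deduplication, so repeated uses by the same user do not inflate the count.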

However, this has no depth at all and can hardly be called analysis. It is just a number. When a request already specifies the exact metric and the time window, the work is simply data retrieval. This is level 0 of analytical depth.

2. Depth level: Level 1

One day, you receive another request: "Check out the newly added function A in our company's app and see how many people have been using it in the past 5 days."

It sounds similar to the previous request, but note that "how many people" is not a precise metric. It is a loose expression that could mean any of the following:

  1. Number of people who used it at least once within the 5 days (deduplicated)
  2. Total number of uses within the 5 days (not deduplicated)
  3. Number of people who used it on each of the 5 days
  4. Number of people who used it on 5, 4, 3, 2, and 1 distinct days within the 5 days
  5. Number of people at each usage frequency within the 5 days (1, 2, 3 ... 10, 10+ times)
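The five readings above can be computed in one pass over a usage log. The sketch below is illustrative only: the log format, the sample rows, and the name `level1_metrics` are assumptions, and the log is assumed to be filtered to the 5-day window already. Uses above 10 are grouped into a single "10+" bucket.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical usage log, already restricted to the 5-day window:
# one (user_id, usage_date) row per use of function A.
usage_log = [
    ("u1", date(2024, 5, 1)), ("u1", date(2024, 5, 1)), ("u1", date(2024, 5, 2)),
    ("u2", date(2024, 5, 2)),
    ("u3", date(2024, 5, 4)), ("u3", date(2024, 5, 5)), ("u3", date(2024, 5, 5)),
]


def level1_metrics(log):
    """Compute the five 'how many people' readings from a list of usage rows."""
    days_by_user = defaultdict(set)   # user -> distinct days with any use
    uses_by_user = Counter()          # user -> total number of uses
    users_by_day = defaultdict(set)   # day  -> distinct users on that day
    for user, day in log:
        days_by_user[user].add(day)
        uses_by_user[user] += 1
        users_by_day[day].add(user)
    return {
        # 1. deduplicated users
        "distinct_users": len(days_by_user),
        # 2. total uses, not deduplicated
        "total_uses": len(log),
        # 3. users per day (days with no usage do not appear)
        "daily_users": {d: len(u) for d, u in sorted(users_by_day.items())},
        # 4. number of users by count of distinct usage days
        "users_by_active_days": dict(Counter(len(d) for d in days_by_user.values())),
        # 5. number of users by usage frequency, with a "10+" bucket
        "users_by_use_count": dict(
            Counter(n if n <= 10 else "10+" for n in uses_by_user.values())
        ),
    }


metrics = level1_metrics(usage_log)
for name, value in metrics.items():
    print(name, value)
```

Note that the level 0 number reappears here as `distinct_users`.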

It takes several metrics together to explain clearly how many people there are. Some students may think this is too much trouble: why not just assume the requester wants the deduplicated head count? In fact, much of the repeated data pulling, overtime, and being chased by business teams for numbers comes from not confirming the requirement and assuming a metric the business did not want. And when you ask the business which definition they want to see, the answer is often: all of them. So it is best to think of the likely variants in advance and avoid repeated rework.

This kind of proactive thinking is the starting point for in-depth analysis, because each of these metrics is useful to the business:

  1. The deduplicated user count shows how far the function has penetrated the total user base.
  2. The number of users per day shows the trend over time.
  3. The distribution of cumulative usage days shows how many heavy users there are.

Notice also that the result of level 0 becomes part of the output of level 1. The same holds for the levels that follow: the deeper you go, the more metrics and dimensions you need to design, and the more complex the problem becomes.

3. Depth level: Level 2

One day, you receive another request: "Take a look at the new function A in our app. Do people who have used it in the past 5 days show better payment behavior than others?"

Note that there are no clear data indicators here, so we need to break down the problem first:

  1. The subject: users who have used function A in the past 5 days. So how many people are using it? The level 1 depth data needs to be filled in here.
  2. Payment behavior: this is a general term. Does it mean payment amount or payment frequency? Since that is not specified, look at both.
  3. Better than others: who are the others? All users, or users who have not used the function? Given the scenario, the comparison group should be users who were active at least once in the past 5 days but did not use the function, so that the two groups are comparable.

With these three steps of decomposition, the unclear request turns into a concrete data retrieval requirement:

  1. Basic information on users who used function A in the past 5 days (number of users, distribution of usage days, distribution of usage frequency)
  2. Payment behavior of users who used function A in the past 5 days (share of paying users, cumulative payment amount and payment count over the past 5 days, average payment amount per person, average number of payments per person)
  3. For active users who did not use function A in the past 5 days: number of active days, share of paying users, payment amount, payment count, average payment amount per person, and average number of payments per person
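A rough sketch of the two-group comparison follows. The data, the name `payment_summary`, and the choice to compute the per-person averages over paying users only are all assumptions made for illustration; the text does not say which denominator is meant, so confirm it with the requester.

```python
def payment_summary(user_ids, payments):
    """Summarize payment behavior for one user group.

    payments is a list of (user_id, amount) rows for the 5-day window.
    Per-person averages are taken over paying users (an assumption).
    """
    users = set(user_ids)
    amounts = [amount for uid, amount in payments if uid in users]
    payers = {uid for uid, _ in payments if uid in users}
    n_users, n_payers = len(users), len(payers)
    return {
        "users": n_users,
        "paying_ratio": n_payers / n_users if n_users else 0.0,
        "total_amount": sum(amounts),
        "payment_count": len(amounts),
        "amount_per_payer": sum(amounts) / n_payers if n_payers else 0.0,
        "payments_per_payer": len(amounts) / n_payers if n_payers else 0.0,
    }


# Hypothetical sample data.
active_users = {"u1", "u2", "u3", "u4", "u5"}  # active at least once in 5 days
used_a = {"u1", "u2"}                          # used function A in 5 days
payments = [("u1", 30.0), ("u1", 10.0), ("u3", 20.0)]

group_a = payment_summary(used_a, payments)
group_other = payment_summary(active_users - used_a, payments)

for key in group_a:
    print(f"{key}: A={group_a[key]}  non-A={group_other[key]}")
```

The comparison group is built as active users minus users of A, which matches the decomposition in step 3 above.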

Once the two groups are compared, a conclusion can be drawn. However, that conclusion quickly leads to the next question: "Why do people who use A pay more (or less) than the other group?"

4. Depth level: Level 3

One day, you receive another request: "Analyze why people who use function A pay better." Note that establishing whether something is true before asking why is a basic requirement for answering any question. So when breaking down the problem, you must first finish the level 2 homework. Only after confirming that users of function A really do pay better should you analyze the reasons.

When analyzing reasons, hypotheses matter. Since the request focuses on function A, the key question is whether function A is actually effective. Disproving is easier than proving, so it is tempting to start by eliminating seemingly obvious cases. For example, if users of function A all turn out to be high-paying users already, it seems we could directly rule out "function A helps paid conversion."

But that elimination does not hold up logically, because any of the following could still be true:

  1. Their consumption was already high, but it became even higher after they used function A.
  2. Their consumption was already high; it is merely higher than that of people who do not use A.
  3. People with low consumption can also raise their consumption after using A.
  4. For people with low consumption, not using A only leaves it lower still.

Even if the data shows that group A's consumption was higher than the non-A group's to begin with, there are still at least four possibilities to check. So we need to lay out the hypotheses as a logic tree and test the possibilities one by one. This is what is meant by the rule that verifying a claim requires looking for supporting and opposing examples at the same time.
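One way to check the branches one by one is to split users by prior consumption level and by whether they used function A, then look at the change in spending in each cell. The sketch below is only illustrative: the records, the 50.0 threshold, and the name `change_by_segment` are assumptions, and a descriptive table like this does not by itself establish causation.

```python
from collections import defaultdict

# Hypothetical per-user records:
# (user_id, used_a, spend_before, spend_after)
users = [
    ("u1", True, 100.0, 130.0),
    ("u2", True, 10.0, 20.0),
    ("u3", False, 100.0, 100.0),
    ("u4", False, 10.0, 5.0),
    ("u5", True, 120.0, 140.0),
]


def change_by_segment(rows, high_threshold=50.0):
    """Average spend change per (prior consumption tier, used A) cell.

    high_threshold is an arbitrary cut-off chosen for this illustration.
    """
    changes = defaultdict(list)
    for _uid, used_a, before, after in rows:
        tier = "high" if before >= high_threshold else "low"
        changes[(tier, used_a)].append(after - before)
    return {segment: sum(v) / len(v) for segment, v in changes.items()}


segments = change_by_segment(users)
for (tier, used_a), avg_change in sorted(segments.items()):
    label = "used A" if used_a else "did not use A"
    print(f"{tier} consumption, {label}: average change {avg_change:+.1f}")
```

Each cell corresponds to one branch of the hypothesis tree, so the four possibilities can be compared side by side rather than judged from the group average alone.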

Note that even then, counterarguments remain. Because the analysis rests entirely on past data, someone can reasonably object that "function A only appeals to this small group of users and cannot be expanded," or that "users of A are just trying something new, and the effect will fade after a while." Both objections concern future data, so a period of observation is needed before a conclusion can be drawn.

If we cannot wait that long, we can run a test. For example, to test the "cannot be expanded" objection, we can actively promote function A to other user groups and observe the incremental effect and the retention effect. If the increment is small, or there is an increment but retention is poor, we can infer that function A indeed does not expand well. In-depth analysis cannot do without testing and long-term observation. Good conclusions need time to settle.
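The two readings from such a test can be sketched as follows. Everything here is an assumption for illustration: the counts are made up, the function name `promotion_effect` is hypothetical, and the use of a non-promoted control group as the baseline for the increment is a design choice the original text does not specify.

```python
def rate(numerator, denominator):
    """Safe ratio that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def promotion_effect(promoted, control):
    """Compare a promoted group with a control group.

    Each argument is a dict of counts:
      users    - people in the group
      adopters - people who started using function A during the test
      retained - adopters still using function A after the test period
    """
    incremental = (rate(promoted["adopters"], promoted["users"])
                   - rate(control["adopters"], control["users"]))
    retention = rate(promoted["retained"], promoted["adopters"])
    return {"incremental_adoption": incremental,
            "retention_of_adopters": retention}


# Hypothetical counts.
result = promotion_effect(
    promoted={"users": 1000, "adopters": 150, "retained": 45},
    control={"users": 1000, "adopters": 50, "retained": 20},
)
print(result)
```

A small `incremental_adoption`, or a reasonable increment paired with a low `retention_of_adopters`, would both point toward the "cannot be expanded" reading described above.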

5. Depth level: Level 4

One day, you receive another request: "Analyze the impact of function A on users." The question is simpler to state but more complicated to answer. From level 0 to level 3 we discussed only the impact on payment, but there may be other impacts, such as activity, retention, and referrals. Each direction has to go through the same long process of decomposition and analysis before a comprehensive result emerges.

By now our analysis has gone quite deep. Interestingly, the problem we worked on is actually a simple one. A problem is easy to solve if it:

  1. Has clear metrics
  2. Has clear criteria for judging whether a metric is good or bad
  3. Has an obvious logic of influence between metrics
  4. Sits in a closed business scenario that is easy to test

A problem with all four properties can be solved without much difficulty.

But real problems are often:

  1. Stated colloquially
  2. Made up of multiple aspects
  3. Without clear criteria for judgment
  4. Subject to many influencing factors
  5. Without the time or room for us to test slowly

In that case, start from the beginning and sort things out bit by bit. Read this article in reverse order, from the vague level 4 question down to the concrete level 0 metric, and you have the process of sorting out a business problem from scratch.

Of course, not all analyses need to be done from beginning to end.

  • The person asking may have no idea at all. In that case, first give them level 1 data to help build their understanding, then give them level 2 data to draw their attention to the differences.
  • The person asking may speak vaguely but have a clear goal in mind. In that case, talk it through in depth to clarify the need.
  • The person asking may not need a rigorous argument and may be ready to reach a conclusion from partial evidence. In that case, it is enough to argue the point they are most unsure about.

The one thing you must not do is skip communication and produce a few numbers at random, or find so-called "models" on the Internet and apply them mechanically. If you work behind closed doors, rework, overtime, and criticism will be routine.

If we have run many verifications in a given business scenario and established the key metrics, the judgment criteria, and the causal relationships for a business problem, we can apply them directly from then on. This is what we call a business analysis model. Until such a model has been established, however, we still need to do thorough validation, especially of the causal relationships. If that work is not done carefully, the conclusions will be proven wrong very quickly.

Author: Down-to-earth Teacher Chen

Source: WeChat public account "Down-to-earth Teacher Chen"
