Don't just compare year-on-year and month-on-month. These five steps will make your data analysis more in-depth.

How do you take a data analysis project deeper? From which angles can the work be pushed further? This article divides data analysis work into 5 levels by depth and uses examples to show how to carry out the work at each level, from requirements to logic, giving readers a clear line of thinking. It is intended for anyone interested in data analysis, and I hope you find it helpful.

Many students feel that their data analysis lacks depth and that their PPT contains nothing but year-on-year and month-on-month comparisons. What should they do? Today I will use a specific example to show how to do an in-depth data analysis project.

1. Depth level: Level 0

One day, you receive a request: "Look at the number of people who have used the newly added function A in our company's APP at least once in the past 5 days (deduplicated)". This is too simple: run the query, hand over the number, "10,000 users in total over the past 5 days", and you're done.

However, this kind of work has no depth at all and can hardly be called "analysis"; it is just a number. Indeed, when the request already specifies the exact indicator and the statistical period, the job is only data retrieval. This is Level 0 of analytical depth.
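As a rough illustration of how small a Level 0 request is, here is a minimal Python sketch. The log layout `(user_id, day)`, the sample rows, and the function name `distinct_users` are all assumptions made for this example, not something taken from a real system.

```python
from datetime import date, timedelta

# Hypothetical usage log: one row per use of function A, as (user_id, day).
usage_log = [
    ("u1", date(2024, 5, 1)),
    ("u1", date(2024, 5, 1)),   # same user again: must not be counted twice
    ("u2", date(2024, 5, 3)),
    ("u3", date(2024, 4, 20)),  # outside the 5-day window
]

def distinct_users(log, end_day, window_days=5):
    """Count distinct users with at least one use in the window ending on end_day."""
    start_day = end_day - timedelta(days=window_days - 1)
    return len({user for user, day in log if start_day <= day <= end_day})

print(distinct_users(usage_log, end_day=date(2024, 5, 5)))
```

The whole request collapses into one filter and one deduplication, which is why it counts as retrieval rather than analysis.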

2. Depth level: Level 1

One day, you receive another request: "Check out the newly added function A in our company's APP and see how many people have used it in the past 5 days."

It sounds similar to the previous request, but note that "how many people" is not a clear indicator; it is a general statement. Spelled out, it could mean:

Number of people who used it at least once in the 5 days (deduplicated)
Total number of uses in the 5 days (not deduplicated)
Number of people who used it on each of the 5 days
Number of people who used it on 5, 4, 3, 2, and 1 days in total within the 5 days
Number of people at each usage frequency within the 5 days (1, 2, 3...10, 10+ times)

…

It takes several indicators together to explain clearly how many people there are. Some students may think this is too much trouble: "I'll just assume he wants the deduplicated head count." In fact, much of the repeated data pulling, overtime, and being chased by the business for numbers is caused by not confirming the requirement clearly and assuming an indicator the business does not want. This is especially true when you ask the business, "Which definition do you want to see?" and the answer is, "All of them." It is best to think of more indicators in advance to avoid repeated rework.

This kind of proactive thinking is the starting point for in-depth analysis, because these indicators are useful to the business:

The deduplicated number of users shows how far the function has penetrated the total user base.
The number of users per day shows the development trend.
The number of users at each cumulative count of days used shows how many heavy users there are.
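The Level 1 indicators listed earlier can all be derived from the same usage log in one pass. The sketch below is only an illustration: the log layout, the sample rows, and the name `level1_indicators` are assumptions, and the frequency bucket `11` is used here as a stand-in for "10+ times".

```python
from collections import Counter
from datetime import date

# Hypothetical 5-day usage log: one row per use of function A, as (user_id, day).
usage_log = [
    ("u1", date(2024, 5, 1)), ("u1", date(2024, 5, 1)), ("u1", date(2024, 5, 2)),
    ("u2", date(2024, 5, 2)),
    ("u3", date(2024, 5, 4)), ("u3", date(2024, 5, 5)),
]

def level1_indicators(log):
    """Return several complementary answers to 'how many people use it'."""
    uses_per_user = Counter(user for user, _ in log)
    user_days = {(user, day) for user, day in log}          # one entry per user per day
    days_per_user = Counter(user for user, _ in user_days)  # active days per user
    daily_users = Counter(day for _, day in user_days)      # distinct users per day
    return {
        "distinct_users": len(uses_per_user),
        "total_uses": len(log),
        "daily_users": dict(sorted(daily_users.items())),
        # key: number of active days, value: number of users
        "users_by_active_days": dict(sorted(Counter(days_per_user.values()).items())),
        # key: number of uses (11 stands for "10+"), value: number of users
        "users_by_frequency": dict(
            sorted(Counter(min(n, 11) for n in uses_per_user.values()).items())
        ),
    }

for name, value in level1_indicators(usage_log).items():
    print(name, value)
```

Note that `distinct_users` is exactly the Level 0 number, now sitting alongside the other indicators.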

Moreover, notice that the result of Level 0 becomes part of the output of Level 1. The same holds for the later levels: the deeper you go, the more indicators and dimensions you have to design, and the more complicated the problem becomes.

3. Depth level: Level 2

One day, you receive another request: "Take a look at the new function A added to our APP. Do the people who have used it in the past 5 days show better payment behavior than others?"

Note that there are no clear data indicators here, so we need to break down the problem first:

Subject: users who have used function A in the past 5 days. So how many people are using it? The Level 1 data needs to be added here.
Payment behavior: this is a general term. Does it mean payment amount or payment frequency? Since it is not clear, look at both.
Better than others: who are the others? All users, or users who have not used the function? Given the scenario, the comparison group should be users who were active at least once in the past 5 days but did not use the function.

With these three steps of decomposition, this unclear requirement can be turned into a data retrieval requirement:

Basic information on users who have used function A in the past 5 days (number of users, distribution of days used, distribution of usage frequency)
Payment behavior of users who have used function A in the past 5 days (paying ratio, cumulative payment amount and payment count of paying users over the 5 days, average payment amount per person, average number of payments per person)
For active users who have not used function A in the past 5 days: number of active days, paying ratio, payment amount, payment count, average payment amount per person, and average number of payments per person
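The payment part of these retrieval requirements could be sketched roughly as follows. Everything here is hypothetical: the user sets, the payment rows, and the function name `payment_profile` exist only for illustration, and the per-person averages are computed per paying user, which is one possible reading of the requirement.

```python
# Hypothetical 5-day data; all names and numbers are illustrative.
active_users = {"u1", "u2", "u3", "u4", "u5"}   # active at least once in the window
used_a = {"u1", "u2"}                           # used function A in the window
payments = [("u1", 30.0), ("u1", 20.0), ("u3", 10.0), ("u5", 40.0)]  # (user_id, amount)

def payment_profile(group, payments):
    """Summarize payment behavior for one group of users."""
    rows = [(user, amount) for user, amount in payments if user in group]
    payers = {user for user, _ in rows}
    total = sum(amount for _, amount in rows)
    return {
        "users": len(group),
        "paying_ratio": len(payers) / len(group) if group else 0.0,
        "total_amount": total,
        "payment_count": len(rows),
        "amount_per_payer": total / len(payers) if payers else 0.0,
        "payments_per_payer": len(rows) / len(payers) if payers else 0.0,
    }

group_a = payment_profile(used_a, payments)
# Comparison group: active users who did NOT use function A.
group_other = payment_profile(active_users - used_a, payments)

for key in group_a:
    print(f"{key:>20}: used A = {group_a[key]:.2f}, others = {group_other[key]:.2f}")
```

Defining the comparison group as `active_users - used_a` keeps inactive users out of the baseline, which matches the decomposition of "others" above.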

Once the two groups are compared, a conclusion can be drawn. However, it will quickly lead to the next question: "Why do people who use A pay more/less than other groups?"

4. Depth level: Level 3

One day, you receive another request: "Analyze why people who use function A pay better." Note that asking "is it true?" before asking "why?" is a basic requirement for answering questions. So when breaking down the problem, first complete the Level 2 homework. Only after confirming that users of function A do pay better should you analyze the reasons.

When analyzing reasons, hypotheses matter. Since the request focuses on function A, the key question is whether function A is useful. It is easier to disprove than to prove, so it is tempting to start by eliminating answers that look obviously wrong. For example, if "users of function A were all high-paying users to begin with", it seems we can directly rule out "function A is useful for paid conversion".

But this is not logically tenable, because several possibilities remain:

Spending was already high, and it became even higher after using function A
Spending was already high, and it is higher than that of users who do not use A
Users with low spending can also raise their spending
For users with low spending, not using A only makes it lower

…

Even if the data show that group A's spending was higher than that of non-users to begin with, at least four possibilities still need to be checked. So we need to lay out the hypothesis logic tree clearly and check the possibilities one by one. This is what we mean by: to verify a viewpoint, look for both supporting and opposing examples at the same time.
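One way to check the branches one by one is to split users by prior spending level and by whether they used function A, then look at how spending changed in each cell. The sketch below assumes per-user "before" and "after" spending figures and an arbitrary cut-off of 50 between high and low spenders; the records, the threshold, and the name `segment_changes` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-user records: spending before and after function A launched.
users = [
    {"id": "u1", "used_a": True,  "before": 100.0, "after": 130.0},
    {"id": "u2", "used_a": False, "before": 100.0, "after": 105.0},
    {"id": "u3", "used_a": True,  "before": 10.0,  "after": 16.0},
    {"id": "u4", "used_a": False, "before": 10.0,  "after": 8.0},
]
HIGH_SPEND_THRESHOLD = 50.0  # assumed cut-off between high and low spenders

def segment_changes(users, threshold=HIGH_SPEND_THRESHOLD):
    """Average spending change for each (spending tier, used A) segment."""
    changes = defaultdict(list)
    for u in users:
        tier = "high" if u["before"] >= threshold else "low"
        changes[(tier, u["used_a"])].append(u["after"] - u["before"])
    return {segment: sum(vals) / len(vals) for segment, vals in changes.items()}

result = segment_changes(users)
for (tier, used_a), avg_change in sorted(result.items()):
    print(f"{tier:>4} spenders, used A = {used_a}: average change {avg_change:+.1f}")
```

A table like this only shows which branches of the logic tree are consistent with the data. It does not by itself establish that function A caused the change.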

Note that even so, counterarguments remain. Because all of our analysis is based on past data, a likely counterargument is: "Function A can only attract this small group of users and cannot be expanded" or "Users of A are just trying something new, and the effect will disappear after a while." Both viewpoints concern future data, so a period of observation is needed before a conclusion can be drawn.

If we can't wait that long, we can run tests. For example, to test the "cannot be expanded" claim, we can actively promote function A to other groups and observe its incremental and retention effects. If the increment is small, or there is an increment but retention is poor, we can infer that it indeed cannot be expanded. In-depth analysis cannot do without testing and long-term observation. Good conclusions need time to settle.
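A promotion test of this kind might be read as in the sketch below. It assumes a promoted group and a holdout group of similar users that received no promotion; the holdout, the field names, and all figures are assumptions added for illustration and do not come from a real test.

```python
# Hypothetical test results; group names and figures are illustrative.
promoted = {"size": 1000, "adopted": 120, "still_using_later": 30}
holdout = {"size": 1000, "adopted": 40, "still_using_later": 20}  # assumed control group

def adoption_rate(group):
    """Share of the group that started using function A."""
    return group["adopted"] / group["size"]

def retention_rate(group):
    """Share of adopters who were still using function A at the later check."""
    return group["still_using_later"] / group["adopted"] if group["adopted"] else 0.0

# Incremental effect: extra adoption attributable to the promotion.
incremental = adoption_rate(promoted) - adoption_rate(holdout)

print(f"incremental adoption: {incremental:.1%}")
print(f"retention, promoted:  {retention_rate(promoted):.1%}")
print(f"retention, holdout:   {retention_rate(holdout):.1%}")
```

In this made-up example the promotion adds adopters, but they are retained less well than organic adopters, which is the "increment but poor retention" pattern described above.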

5. Depth level: Level 4

One day, you receive another request: "Analyze the impact of function A on users." The question looks simpler to state, but it is more complicated to solve. From Level 0 to Level 3 we only discussed the impact on payment, but there may be other impacts, such as activity, retention, and referrals. Each direction must go through the same long process of decomposition and analysis before a comprehensive result emerges.

So far, our analysis has gone very deep. Interestingly, the problem we worked on is actually quite simple. If a problem:

Has clear metrics
Has clear criteria for judging whether an indicator is good or bad
Has an obvious logic of influence between indicators
Is based on a closed business scenario and is easy to test

then it can be solved easily.

But real problems are often:

Phrased colloquially
Made up of multiple aspects
Without clear criteria for judgment
Subject to many influencing factors
Without the time or room for us to test slowly

At this point, you can start from the beginning and sort things out bit by bit. Reading this article in reverse order gives you the process of sorting out a business problem from scratch.

Of course, not all analyses need to be done from beginning to end.

The person asking may have no idea at all. In this case, first give them Level 1 data to help them build an understanding, then give them Level 2 data to draw their attention to the differences.
The person asking may speak vaguely but have a clear goal in mind. In this case, communicate in depth to clarify the requirement.
The person asking may not need a rigorous argument and may be ready to reach a conclusion from some evidence. In this case, just argue the point they are most unsure about.

The one thing you should not do is skip communication, throw together some numbers at random, or find so-called "models" on the Internet and apply them mechanically. If you work behind closed doors, rework, overtime, and criticism will be routine.

If we have done many verifications in a business scenario and established the key indicators, judgment criteria, and causal relationships of the business problem, we can apply them directly. This is what we call a business analysis model. Before the model is settled, however, more demonstration is still needed, especially of the causal relationships. If that is not done carefully, we will be proven wrong very quickly.

Author: Down-to-earth Teacher Chen

Source: WeChat public account "Down-to-earth Teacher Chen"
