A brief discussion on “retention” of user behavior analysis

In user behavior analysis, retention is an important indicator that deserves close attention. How is retention defined? How is it calculated, and in which scenarios is it used? This article looks at the question from these three angles. I recommend it to anyone interested in user analysis, and I hope it is helpful.

Brief Discussion 1: How to Define

User retention, as the name suggests, means that users stay. Retention is defined as the proportion of users who return N days after their first use.

In user behavior analysis products, retention can be defined more precisely: it is the proportion of users who had a starting event in the first time period and also had a return event in the second time period.

So the calculation method for retention is:

(The number of users who had a starting event in the first time period and a return event in the second time period / the number of users who had a starting event in the first time period) × 100%.

Different products can define different starting events, return events and time period lengths based on business conditions.

Some products will directly define both the start event and return event as "open the app/browse the webpage", such as some gaming products/social products (because the purpose of users opening these products is very clear, basically to play games/chat, so the start & return events can be directly defined as "open the app/browse the webpage").

For example, suppose a game product defines both the starting event and the return event as "opening the app" and sets the observation period to one week. After a user opens the game (triggering the starting event), if they open the game again during the following week, they are counted as a retained user of the game for that week.

Of course, many products define the starting event and the return event differently, depending on their business needs. For example, it is not appropriate for a fitness app to use "open the app" as both the starting event and the return event: a user who opens the app may only take a look without actually following the app's instructions to exercise. Such "onlookers" are exactly the users a fitness app loses most easily.

Counting all users who open the app again as retained users may not provide a real basis for product operation and development. In this case, if we set the starting event as "opening the app" and the return visit event as "completing a fitness session", supplemented by an appropriate observation period, we can better understand the user retention in the product.
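To make the effect of the event definitions concrete, here is a minimal Python sketch. The event log, the event names (`open_app`, `complete_workout`) and the function names are all made up for illustration; a real analysis product will store and query events differently. The same cohort gets a different retention figure depending on which return event is chosen.

```python
from datetime import date

# Hypothetical event log for a fitness app: (user_id, event_name, event_date).
EVENTS = [
    ("u1", "open_app", date(2020, 8, 31)),
    ("u1", "open_app", date(2020, 9, 8)),
    ("u2", "open_app", date(2020, 9, 1)),
    ("u2", "open_app", date(2020, 9, 9)),
    ("u2", "complete_workout", date(2020, 9, 9)),
    ("u3", "open_app", date(2020, 9, 2)),
    ("u4", "open_app", date(2020, 9, 3)),
    ("u4", "open_app", date(2020, 9, 10)),
    ("u4", "complete_workout", date(2020, 9, 10)),
]


def users_with_event(events, name, period):
    """Users who triggered `name` at least once within the inclusive period."""
    start, end = period
    return {user for user, event, day in events
            if event == name and start <= day <= end}


def retention_rate(events, start_event, return_event, first_period, second_period):
    """Share (in %) of the starting cohort that has a return event later."""
    cohort = users_with_event(events, start_event, first_period)
    if not cohort:
        return 0.0
    # Only users who belong to the starting cohort can count as retained.
    returned = cohort & users_with_event(events, return_event, second_period)
    return 100 * len(returned) / len(cohort)


WEEK_0 = (date(2020, 8, 31), date(2020, 9, 6))   # first time period
WEEK_1 = (date(2020, 9, 7), date(2020, 9, 13))   # second time period

open_to_open = retention_rate(EVENTS, "open_app", "open_app", WEEK_0, WEEK_1)
open_to_workout = retention_rate(EVENTS, "open_app", "complete_workout", WEEK_0, WEEK_1)
print(open_to_open, open_to_workout)  # 75.0 50.0
```

In this made-up data, three of the four users open the app again in the following week, but only two of them actually complete a workout, so the stricter return event gives the lower and more meaningful figure.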

Brief Discussion 2: How to Calculate

Retention analysis focuses on how many users who triggered the initial event then return to the site.

Here is how a user behavior analysis product calculates it:

Retention = (the number of users who had a starting event in the first time period and then had a return event in the second time period / the number of users who had a starting event in the first time period) × 100%

Based on this formula, let us analyze various retention algorithms in detail.

1. Calculation of daily retention

(1) How do we calculate the next-day, 3-day, 7-day, or n-day retention for a certain day?

When calculating this type of retention, we actually use "days" as the unit of observation period. What we need to care about is:

  1. The number of users who triggered the "starting event" on a certain day;
  2. Track whether this group of users triggers a "return visit event" on the nth day;
  3. After getting the number of users in the first and second steps:

The n-day retention on a certain day = the number of users who triggered the "return event" on the nth day / the number of users who triggered the "starting event" on the 0th day × 100%.

According to the above steps, we can easily conclude that:

Next-day retention for a certain day = (number of users with return events on the 1st day / number of users with starting events on the 0th day) × 100%

3-day retention for a certain day = (number of users with return events on the 3rd day / number of users with initial events on the 0th day) × 100%

7-day retention for a certain day = (number of users with return events on the 7th day / number of users with initial events on the 0th day) × 100%

(Note: Some data products define the day on which the starting event is triggered as Day 1 rather than Day 0. Under that convention every day index shifts by one, so "7-day retention" refers to the day six days after the starting event. We should check which convention is in use when analyzing.)
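The three steps above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the input lists of `(user_id, date)` pairs, the function name, and the `start_day_is_day_1` switch are hypothetical and are not taken from any particular product.

```python
from datetime import date, timedelta

# Hypothetical data: (user_id, date) pairs for starting and return events.
START_EVENTS = [
    ("a", date(2020, 8, 31)), ("b", date(2020, 8, 31)),
    ("c", date(2020, 8, 31)), ("d", date(2020, 8, 31)),
]
RETURN_EVENTS = [
    ("a", date(2020, 9, 1)), ("b", date(2020, 9, 1)),
    ("a", date(2020, 9, 3)), ("c", date(2020, 9, 7)),
]


def n_day_retention(start_events, return_events, start_day, n,
                    start_day_is_day_1=False):
    """n-day retention (in %) for the cohort that started on `start_day`."""
    # If the product calls the starting day "Day 1", day n is n - 1 days later.
    offset = n - 1 if start_day_is_day_1 else n
    target_day = start_day + timedelta(days=offset)
    # Step 1: users who triggered the starting event on the starting day.
    cohort = {user for user, day in start_events if day == start_day}
    if not cohort:
        return 0.0
    # Step 2: members of that cohort with a return event on the nth day.
    returned = cohort & {user for user, day in return_events if day == target_day}
    # Step 3: the ratio of the two counts.
    return 100 * len(returned) / len(cohort)


day0 = date(2020, 8, 31)
print(n_day_retention(START_EVENTS, RETURN_EVENTS, day0, 1))  # 50.0
print(n_day_retention(START_EVENTS, RETURN_EVENTS, day0, 3))  # 25.0
print(n_day_retention(START_EVENTS, RETURN_EVENTS, day0, 7))  # 25.0
```

With `start_day_is_day_1=True`, calling the function with `n=2` looks at the same calendar day as `n=1` under the Day 0 convention, which is why the convention matters when comparing numbers across products.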

(2) How do we calculate the next-day/3-day/7-day/n-day retention within a certain time range?

  1. Within the selected time range, pick out each day for which n-day retention can be calculated, and record the sum of the number of users who completed the "starting event" on each of those days;
  2. For each of those days, count the users who triggered a "return visit event" on the nth day, and record the sum;
  3. After getting the number of users in the first and second steps:

n-day retention within a certain period of time = number of users in step 2 / number of users in step 1 × 100%

In the figure, we have selected the time range of the last 8 days (20200831-20200907). Now we want to calculate the 5-day retention data within these 8 days. How do we calculate it?

5-day retention rate in the last 8 days = (the sum of the number of 5-day retained users for each day from 20200831 to 20200907 for which 5-day retention can be calculated / the sum of the number of users who had a starting event on each of those days) × 100%.
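As a rough sketch of this range calculation, the Python below sums the counts over the eligible days and divides once at the end. The data, the function name, and the `data_through` parameter (the last day for which event data is assumed to be complete) are assumptions made for the example.

```python
from datetime import date, timedelta

# Hypothetical data: (user_id, date) pairs.
START_EVENTS = [
    ("a", date(2020, 8, 31)), ("b", date(2020, 8, 31)),
    ("c", date(2020, 9, 1)), ("d", date(2020, 9, 1)), ("e", date(2020, 9, 1)),
    ("f", date(2020, 9, 2)),
    ("g", date(2020, 9, 5)), ("h", date(2020, 9, 5)),
]
RETURN_EVENTS = [
    ("a", date(2020, 9, 5)),
    ("c", date(2020, 9, 6)), ("d", date(2020, 9, 6)),
    ("f", date(2020, 9, 8)), ("g", date(2020, 9, 7)),
]


def range_n_day_retention(start_events, return_events,
                          range_start, range_end, n, data_through):
    """Pooled n-day retention (in %) over every eligible day in the range."""
    total_cohort = 0
    total_returned = 0
    day = range_start
    while day <= range_end:
        target_day = day + timedelta(days=n)
        # A day is eligible only if its nth day has already been observed.
        if target_day <= data_through:
            cohort = {u for u, d in start_events if d == day}
            returned = cohort & {u for u, d in return_events if d == target_day}
            total_cohort += len(cohort)        # step 1: sum of starting users
            total_returned += len(returned)    # step 2: sum of retained users
        day += timedelta(days=1)
    return 100 * total_returned / total_cohort if total_cohort else 0.0


result = range_n_day_retention(
    START_EVENTS, RETURN_EVENTS,
    range_start=date(2020, 8, 31), range_end=date(2020, 9, 7),
    n=5, data_through=date(2020, 9, 7),
)
print(result)  # 50.0
```

In this example only 20200831, 20200901 and 20200902 are eligible, because their 5th days fall on or before 20200907. Note that summing the counts first is not the same as averaging the daily rates: here the pooled result is 50%, while the plain average of the three daily rates (50%, about 66.7%, 0%) would be about 38.9%.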

2. Calculation of weekly/monthly retention

(1) How do you calculate the n-week retention for a particular week? How do you calculate the n-month retention for a particular month?

  1. Record the number of users who triggered the "starting event" in the week/month (week/month 0);
  2. Track whether this group of users has triggered a "return visit event" within the nth week/month, and record the number of users who have triggered a "return visit event";
  3. After getting the number of users in the first and second steps:

n-week retention for a certain week = the number of users who triggered the "return event" in the nth week / the number of users who triggered the "starting event" in the 0th week × 100%

n-month retention for a certain month = the number of users who triggered the "return event" in the nth month / the number of users who triggered the "starting event" in the 0th month × 100%
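The weekly version differs from the daily one only in how events are grouped into periods. The sketch below assumes Monday-to-Sunday weeks and uses made-up data and function names; monthly retention would follow the same pattern with events grouped by calendar month instead.

```python
from datetime import date, timedelta

# Hypothetical data: (user_id, date) pairs.
START_EVENTS = [
    ("a", date(2020, 8, 31)), ("b", date(2020, 9, 2)),
    ("c", date(2020, 9, 6)), ("e", date(2020, 9, 1)),
    ("d", date(2020, 9, 7)),   # starts in the following week, not in week 0
]
RETURN_EVENTS = [
    ("a", date(2020, 9, 15)), ("b", date(2020, 9, 9)),
    ("c", date(2020, 9, 20)), ("d", date(2020, 9, 16)),
]


def week_start(day):
    """Monday of the week containing `day` (weeks run Monday to Sunday)."""
    return day - timedelta(days=day.weekday())


def n_week_retention(start_events, return_events, week0_start, n):
    """n-week retention (in %) for the cohort that started in week 0."""
    target_week = week0_start + timedelta(weeks=n)
    # Users with a starting event at any point during week 0.
    cohort = {u for u, d in start_events if week_start(d) == week0_start}
    if not cohort:
        return 0.0
    # Cohort members with a return event at any point during week n.
    returned = cohort & {u for u, d in return_events if week_start(d) == target_week}
    return 100 * len(returned) / len(cohort)


week0 = date(2020, 8, 31)  # a Monday
print(n_week_retention(START_EVENTS, RETURN_EVENTS, week0, 1))  # 25.0
print(n_week_retention(START_EVENTS, RETURN_EVENTS, week0, 2))  # 50.0
```

User `d` returns in week 2 but is not counted, because `d` did not trigger the starting event in week 0.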

(2) How do we calculate the n-week/n-month retention within a certain time range?

  1. Within the selected time range, pick out each week/month for which n-week/n-month retention can be calculated, and record the sum of the number of users who completed the "starting event" in each of those weeks/months;
  2. For each week/month for which n-week/n-month retention can be calculated, calculate the sum of the number of users who triggered a "return visit event" in the nth week/month;
  3. After getting the number of users in the first and second steps:

n-week/n-month retention within a certain time range = number of users in step 2 / number of users in step 1 × 100%

For example, how do we calculate "2-week retention in the last 6 weeks"?

In the figure, we have selected the time range of the last 6 weeks.

On February 26, 2012, we selected the time range as "Last 6 Weeks", and the default one-week period is "Monday to Sunday". The selections for the last six weeks are as follows:

Let’s get back to the topic. Now we want to calculate the 2-week retention data for the last 6 weeks. How is it calculated?

2-week retention in the last 6 weeks = (the sum of the number of 2-week retained users for each week for which 2-week retention can be calculated / the sum of the number of users with the starting event in each of those weeks) × 100%
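Once the per-week counts are available, the range formula is simple arithmetic. The table below is invented purely to show the mechanics; it assumes that the two most recent weeks cannot yet be evaluated and marks them with `None`.

```python
# Hypothetical per-week counts for a "last 6 weeks" selection:
# (week label, users with a starting event, users retained in week 2).
# None means 2-week retention cannot be calculated for that week yet.
WEEKLY_COUNTS = [
    ("week 1", 100, 20),
    ("week 2", 120, 36),
    ("week 3", 80, 24),
    ("week 4", 100, 20),
    ("week 5", 90, None),
    ("week 6", 110, None),
]


def pooled_retention(rows):
    """Sum both counts over eligible periods, then divide once (in %)."""
    eligible = [(started, retained) for _, started, retained in rows
                if retained is not None]
    total_started = sum(started for started, _ in eligible)
    total_retained = sum(retained for _, retained in eligible)
    return 100 * total_retained / total_started if total_started else 0.0


two_week_retention = pooled_retention(WEEKLY_COUNTS)
print(two_week_retention)  # 25.0
```

Only the first four weeks enter the calculation: 100 retained users out of 400 starting users. Weeks 5 and 6 are left out of both the numerator and the denominator, so their starting users do not drag the rate down.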

Brief Discussion 3: Application Scenarios for Retention Analysis

In daily work, retention analysis is often used in the following scenarios:

1. Understand the quality of a channel

You can use "daily retention" to measure how users from each channel perform, and use it as one of the criteria for judging channel quality. For example, compare the next-day retention and 7-day retention of users from different channels (different industries may choose different periods). Generally speaking, better retention indicates a higher-quality channel.

2. Determine whether an operational measure or a functional change is effective

When we expect an operational measure or a feature to improve retention, we can calculate the retention rate of "new users covered" and "new users not covered" by that measure or feature, and compare the two groups to verify whether it is effective.

Taking Tieba as an example, the forum wanted to test whether the "read posts" feature improved the retention of new users, so it compared the retention of new users from channel A, some of whom had used the "read posts" feature and some of whom had not.

The comparison found that the three-day retention rate of new users who had used the "read posts" feature was more than 10% higher than that of new users who had not. This suggests that the "read posts" feature has a positive effect on the retention of new users.
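A comparison like this comes down to computing the same retention rate separately for each group. The sketch below uses invented user records and field names (`used_feature`, `retained_day_3`) and does not reproduce the Tieba data. Grouping by a channel field instead would give the channel comparison described in the first scenario.

```python
# Hypothetical new-user records: whether each user used the feature
# and whether they had a return event on day 3.
NEW_USERS = [
    {"user": "u1", "used_feature": True, "retained_day_3": True},
    {"user": "u2", "used_feature": True, "retained_day_3": True},
    {"user": "u3", "used_feature": True, "retained_day_3": True},
    {"user": "u4", "used_feature": True, "retained_day_3": False},
    {"user": "u5", "used_feature": True, "retained_day_3": False},
    {"user": "u6", "used_feature": False, "retained_day_3": True},
    {"user": "u7", "used_feature": False, "retained_day_3": True},
    {"user": "u8", "used_feature": False, "retained_day_3": False},
    {"user": "u9", "used_feature": False, "retained_day_3": False},
    {"user": "u10", "used_feature": False, "retained_day_3": False},
]


def retention_by_group(users, group_key, retained_key):
    """Retention rate (in %) for each distinct value of `group_key`."""
    totals = {}
    retained = {}
    for record in users:
        group = record[group_key]
        totals[group] = totals.get(group, 0) + 1
        retained[group] = retained.get(group, 0) + (1 if record[retained_key] else 0)
    return {group: 100 * retained[group] / totals[group] for group in totals}


rates = retention_by_group(NEW_USERS, "used_feature", "retained_day_3")
difference = rates[True] - rates[False]
print(rates[True], rates[False], difference)  # 60.0 40.0 20.0
```

The difference printed here is in percentage points. When reporting a result such as "10% higher", it is worth stating whether that means percentage points or a relative increase.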

3. Measuring whether a product is healthy

You can use indicators such as "weekly retention" and "monthly retention" to observe user stickiness on the platform and measure the health of the product.

Of course, in addition to this, different products may have more analysis methods, which we will not list here one by one.

In summary, today we took a closer look at "retention" by walking through its definition, calculation methods, and usage scenarios. Retention will remain an important indicator that we need to pay attention to in our work. If you want to learn more about retention analysis, you need to practice it in your actual work.

Author: Zhao Zhuangshi, member of the "Data Creator Alliance".
