How to solve the problem of inconsistent data indicators?

Inconsistent data and indicators are unavoidable obstacles in data-driven operations, and a data product interview question on the topic quickly shows whether the candidate has real practical experience. How should the problem be solved? The author shares an approach below.

In data product interviews, questions about the indicator system and indicator caliber (the exact statistical definition of an indicator) come up very often. Data indicators are the core application scenario of data-driven analysis, and mismatched data and inconsistent indicators are hurdles that no data-driven operation can avoid. The question therefore quickly shows whether a candidate really has practical experience.

1. First, recognize the objective existence of inconsistent indicators

In data analysis and application, data mismatches often arise from naming conventions, data processing logic, business definitions, statistical methods, and similar causes. Typical cases include:

  • Same name, different meanings. One indicator name is used with inconsistent statistical calibers, because no naming standard constrains it and each business line looks only at its own department rather than at the whole company. For example, revenue in the financial caliber must be calculated with rigorous logic that accounts for every penny actually received and paid, while the product and operations side cares more about the conversion effect. Yet in both teams' KPI monitoring reports the indicator is simply named revenue.
  • Same meaning, different names. The indicator and its logic are identical, but different products label it differently. Business parties or product managers at different stages each chose their own name, so the same indicator appears under different names on different data product pages.
  • Unclear definition. The definition merely restates the name with a synonym, such as defining the number of active users as the number of visiting users.
  • Hard-to-understand naming. The name is vague and ambiguous, or so specialized that only the person who created the indicator understands it. For example, conversion rate comes in two kinds, order creation conversion rate and order conversion rate, so a bare "conversion rate" does not tell the reader which one is meant.
  • Inaccurate logic. The description of the indicator caliber is wrong. For example, UV is described as "deduplicated by device ID", but in fact the deduplication logic differs by platform: WeChat mini-programs deduplicate by UnionID, apps by DeviceID, and PC and H5 by loginkey.
  • Data that is hard to trace. The data sources behind data product indicators lack intuitive lineage tracking. Troubleshooting an abnormal indicator means reading through code to find the data source, which is long and time-consuming. When the business reports an indicator problem in the morning, it can take the whole morning to reach a conclusion.
  • Poor data quality, combined with the indicator management problems above, significantly reduces the business's trust in data indicators. When the data fluctuates, the first reaction is to ask the data department whether the data is wrong, rather than to look for a change in the business.
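
The UV example above can be made concrete with a small sketch. It assumes a hypothetical event record with the fields `platform`, `union_id`, `device_id`, and `login_key`, and it assumes that UV is counted per platform and then summed; neither the field names nor the cross-platform rule comes from the article, so treat this as an illustration of platform-specific deduplication rather than a reference implementation.

```python
from typing import Iterable, Mapping, Optional

# Hypothetical mapping from platform to the field used as its dedup key,
# following the example in the text. All field names are assumptions.
DEDUP_KEY_BY_PLATFORM = {
    "wechat_mini_program": "union_id",
    "app": "device_id",
    "pc": "login_key",
    "h5": "login_key",
}


def count_uv(events: Iterable[Mapping[str, Optional[str]]]) -> int:
    """Count unique visitors, deduplicating each platform by its own key."""
    seen = set()
    for event in events:
        platform = event["platform"]
        key_field = DEDUP_KEY_BY_PLATFORM[platform]
        key_value = event.get(key_field)
        if key_value is None:
            continue  # no usable identifier for this platform, skip the event
        # Assumption: visitors are not merged across platforms.
        seen.add((platform, key_value))
    return len(seen)


events = [
    {"platform": "app", "device_id": "d1"},
    {"platform": "app", "device_id": "d1"},  # same device, counted once
    {"platform": "wechat_mini_program", "union_id": "u1", "device_id": "d1"},
    {"platform": "h5", "login_key": "k1"},
    {"platform": "pc", "login_key": "k1"},
]
uv = count_uv(events)
print(uv)
```

Writing the caliber as an explicit per-platform mapping keeps the documented description and the actual logic in one place, which is the gap the "deduplicated by device ID" description leaves open.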

2. Analyze the causes of the problem

The problem of inconsistent data indicators is mainly due to the following reasons:

  • Organizational structure and division of functions: Different organizations or departments have different functions and tasks, and therefore different data needs and focuses. For example, the product department focuses on app downloads, activations, and conversions; the operations department focuses on user activity and transaction volume; the marketing department focuses on tracking advertising delivery links. As a result, they may use different indicators and definitions to measure performance.
  • Lack of unified standards: Each department has its own data analysis needs. Without a unified data collection department, each department acts independently and no common standard emerges. Indicators with the same name but different meanings, or with ambiguous meanings, appear frequently and lead users to use them incorrectly.
  • Human errors: Human errors during data processing and analysis can also produce inconsistent indicators. For example, mistakes may occur during data cleaning and conversion, and the choice of statistical method may introduce deviations. Indicators built by different data developers, and logic changes made at different stages, can also lead to data mismatches.

3. Ideas and methods for solving problems

Indicator system construction and management: Based on the overall strategic goals and business plans, gradually establish an indicator system that fully reflects the health of the business, including the core indicators and their statistical logic, so that all business lines follow the same indicator definitions and calibers, and establish an SOP for indicator production.

Data standard construction: Clarify which indicators the business recognizes, and formulate data standards that describe the meaning of attribute-level data and the business rules the enterprise must comply with, so that everyone understands the same data in the same way and follows the same rules.

Confirm the data source and processing method: Before processing and analyzing the data, it is necessary to confirm whether the data source and processing method are consistent. If not, corresponding adjustments and corrections need to be made.

Check data caliber: When processing and analyzing data, it is necessary to check whether the data caliber used by different business lines is consistent to ensure the uniformity of indicator caliber.
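
A caliber check of this kind can be partly automated. The following is a minimal sketch, assuming each business line's indicator definitions have been collected as records with hypothetical `owner`, `name`, and `caliber` fields. The caliber expressions are invented for illustration, and comparing them as normalized strings is a deliberately crude stand-in for real semantic comparison.

```python
from collections import defaultdict

# Hypothetical indicator definitions collected from different business lines.
definitions = [
    {"owner": "finance", "name": "revenue",
     "caliber": "sum(paid_amount) - sum(refund_amount)"},
    {"owner": "operations", "name": "revenue",
     "caliber": "sum(order_amount)"},
    {"owner": "product", "name": "active_users",
     "caliber": "count(distinct user_id)"},
    {"owner": "marketing", "name": "visiting_users",
     "caliber": "COUNT(DISTINCT  user_id)"},
]


def find_conflicts(definitions):
    """Return (same name / different calibers, same caliber / different names)."""
    calibers_by_name = defaultdict(set)
    names_by_caliber = defaultdict(set)
    for d in definitions:
        # Normalize case and whitespace so trivial differences do not count.
        caliber = " ".join(d["caliber"].lower().split())
        calibers_by_name[d["name"]].add(caliber)
        names_by_caliber[caliber].add(d["name"])
    same_name = {n: sorted(c) for n, c in calibers_by_name.items() if len(c) > 1}
    same_meaning = {c: sorted(n) for c, n in names_by_caliber.items() if len(n) > 1}
    return same_name, same_meaning


same_name, same_meaning = find_conflicts(definitions)
print("Same name, different calibers:", same_name)
print("Same caliber, different names:", same_meaning)
```

A report like this only flags candidates; deciding which caliber becomes the standard still has to be settled with the business owners.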

Systematization of indicator management: The idea of indicator management has been around for many years, and many Internet companies are building their own management platforms. Articles on building such systems describe largely similar work: they target the pain points of indicator management and adopt Alibaba's OneData theory as the methodology. The principle is that the same work should be done only once; the rest is a productized solution that makes building and reusing indicators more standardized and efficient. It mainly includes:

  • Establish an indicator production coordination mechanism. Every new indicator must go through requirement application, review, data development, and release. The creation process should be a closed loop, to avoid the "pollution" caused by building indicators at random.
  • Formulate specifications for indicator naming and caliber description, build the rules into the platform in the form of atomic indicator + business limitation + statistical dimension, and control indicator output through system rules.
  • Put the indicator dictionary online, to solve the problems of managing indicators in offline documents (Excel): difficult sharing, late updates, and no permission control.
  • Bind indicators to their data logic. Besides the business metadata of an indicator, maintain its technical metadata as well: which model and which field the data comes from, and what calculation logic produces it.
  • Indicator output. The greatest value of indicator management is providing data output for data products: synchronize the Hive-layer models to query engines with better query performance and second-level response, such as MySQL, Greenplum, Kylin, and ClickHouse (CK), and let data products fetch the data through an interface that queries those engines over JDBC.
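
The naming rule and the metadata binding in the list above can be sketched as a small data structure. This is an illustration under assumptions, not the design of any particular platform: the class names, the naming pattern (limitations, then atomic indicator, then `by` plus dimensions), and the example model `dws_trade_order_1d` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Indicator:
    atomic: str                    # atomic indicator, e.g. "order_count"
    limitations: Tuple[str, ...]   # business limitations, e.g. ("app", "paid")
    dimensions: Tuple[str, ...]    # statistical dimensions, e.g. ("day",)
    source_model: str              # technical metadata: source model/table
    source_field: str              # technical metadata: source field
    logic: str                     # technical metadata: calculation logic

    @property
    def name(self) -> str:
        """Derive the name from the parts so it cannot be chosen freely."""
        parts = [*self.limitations, self.atomic]
        if self.dimensions:
            parts += ["by", *self.dimensions]
        return "_".join(parts)


class IndicatorDictionary:
    """In-memory stand-in for an online indicator dictionary."""

    def __init__(self):
        self._by_name = {}

    def register(self, indicator: Indicator) -> str:
        existing = self._by_name.get(indicator.name)
        if existing is not None and existing != indicator:
            # Same name with different source or logic: reject it.
            raise ValueError(f"conflicting definition for {indicator.name!r}")
        self._by_name[indicator.name] = indicator
        return indicator.name

    def lineage(self, name: str) -> str:
        """Answer where an indicator comes from without reading code."""
        ind = self._by_name[name]
        return f"{ind.source_model}.{ind.source_field}: {ind.logic}"


dictionary = IndicatorDictionary()
paid_orders = Indicator(
    atomic="order_count",
    limitations=("app", "paid"),
    dimensions=("day",),
    source_model="dws_trade_order_1d",
    source_field="order_id",
    logic="count(distinct order_id)",
)
registered_name = dictionary.register(paid_orders)
print(registered_name)
print(dictionary.lineage(registered_name))
```

Because the name is derived from the structured parts, registering a second indicator with the same parts but a different calculation logic raises an error instead of silently creating a same-name, different-meaning indicator.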

Training and communication: Strengthen communication and training across business lines so that everyone shares the same understanding of the data indicators, reducing misunderstanding and ambiguity.
