Value 1: Dispelling doubts about data

In day-to-day work, most data teams run into the same scenario. Business staff or senior leaders look at a figure or indicator in a report and doubt it: "This deviation is so large, is something wrong?" "Why does this figure not match the offline data? Is there a problem with your calculation logic?" "Why does this month's sales revenue show one number in system A and another in system B? Which definition (caliber) are you using?" Faced with a stream of such questions, the data department spends its time investigating and reassuring users about the report data. When users question the reliability and authenticity of reported data, the following issues may be what distorted it:
Facing the above data problems, traditional troubleshooting methods are very lengthy and inefficient:
Once a data anomaly is confirmed, users' concerns about the authenticity and reliability of the data are validated, and they gradually lose trust in it. This does nothing to make data use more efficient, and it forces data management staff to recheck every questioned figure again and again. Data passes through many processing stages between production and business use, so when a business report or data application service misbehaves, the problem has to be located, diagnosed, and repaired as quickly as possible. Relying on people to read through the code layer by layer is very slow. Data development manpower is consumed by troubleshooting, and the longer it takes to locate the problem, the greater the business impact and loss. Data lineage analysis can greatly improve troubleshooting efficiency. Lineage visualization in particular lets users check the source and processing chain of the data themselves and see directly how the data was produced and whether any stage is abnormal. This can quickly dispel end users' doubts about the reliability of reported data.

Value 2: Rapid assessment of the impact of data changes

During data development, data lineage provides two kinds of value: faster problem solving and efficient assessment of data impact. From a purely data perspective, the dimensions of data lineage are database, table, field, system, and application: which database and table the data is stored in, which fields it corresponds to and what their attributes are, which system the data belongs to, and which applications use it. From a business perspective, the main dimension of data lineage is the business line to which the data belongs.
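As a rough illustration of these dimensions and of the two uses named above (tracing a questioned figure back to its sources, and assessing what a change would affect), here is a minimal Python sketch. It is not taken from any particular lineage tool: `LineageNode`, `LineageGraph`, and the example systems and tables are all hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageNode:
    """One field, described by the lineage dimensions (hypothetical model)."""
    system: str
    database: str
    table: str
    field: str
    application: str = ""    # consuming application, if any
    business_line: str = ""  # business-perspective dimension


class LineageGraph:
    """Directed graph: an edge source -> target means target is derived from source."""

    def __init__(self):
        self._downstream = {}  # node -> set of nodes derived from it
        self._upstream = {}    # node -> set of nodes it is derived from

    def add_edge(self, source, target):
        self._downstream.setdefault(source, set()).add(target)
        self._upstream.setdefault(target, set()).add(source)

    @staticmethod
    def _walk(start, edges):
        # Breadth-first traversal; `seen` prevents revisiting nodes.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def trace_sources(self, node):
        """Troubleshooting: every upstream node that feeds `node`."""
        return self._walk(node, self._upstream)

    def impacted_by(self, node):
        """Impact assessment: every downstream node that depends on `node`."""
        return self._walk(node, self._downstream)


# Hypothetical example chain: source system -> warehouse -> report field.
orders = LineageNode("erp", "ods", "orders", "amount")
daily = LineageNode("warehouse", "dw", "daily_sales", "revenue")
report = LineageNode("bi", "ads", "monthly_report", "sales_revenue",
                     application="sales dashboard", business_line="retail")

graph = LineageGraph()
graph.add_edge(orders, daily)
graph.add_edge(daily, report)

questioned_sources = graph.trace_sources(report)  # where did this number come from?
change_impact = graph.impacted_by(orders)         # what breaks if `orders.amount` changes?
```

The same traversal serves both purposes; only the edge direction differs. A real system would populate the edges by parsing SQL or job metadata rather than by hand.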
When it comes to business, it is necessary to sort out the logic of data generation, the logic of data usage, and the relationships between business lines.

Data lineage is critical to data governance, including compliance, data quality, data privacy, and security. It is also important for data analytics and data science. The ability to map and verify how data is accessed and changed is critical for data transparency. It helps generate a detailed record of where specific data came from and shows how data has been changed, impacted, and used. Data lineage also makes it easier to respond to compliance audits and reporting queries, and it helps improve security posture by enabling organizations to track and identify potential risks in data flows.

Data lineage helps organizations take a proactive approach to identifying and fixing gaps in the data required by business applications. This is particularly useful for data analytics and customer experience initiatives. Collecting sensitive data exposes organizations to regulatory scrutiny and business misuse. Data lineage shows how sensitive and other business-critical data flows throughout the organization, so you can ensure your policies align with existing controls.

For IT operations, data lineage helps visualize the impact of data changes on downstream analytics and applications. It also helps in understanding the risk of business process changes and enables a more proactive approach to change management. It improves operational efficiency by reducing time-consuming manual processes, and it reduces costs by eliminating duplicate data and data silos. Additionally, data lineage supports successful cloud data migrations and modernization initiatives that drive transformation. Data lineage can help visualize how different data objects and data flows are related and connected within the data graph.
This deeper understanding makes it easier for data architects to predict how moving or changing data will affect the data itself, to predict the impact on the downstream processes and applications that rely on it, and to validate changes.

Value 3: A measurement tool for assessing data asset value

In the digital age, data is generally regarded as an important corporate asset. Data assets are usually defined as data resources, recorded in physical or electronic form, that are owned or controlled by an individual or company and can bring the company future economic benefits. The key characteristics of data assets are:
In short, data with more users (the demand side), heavier usage, and more frequent updates tends to be more valuable. For example, the CRIC Research Center, a professional R&D department of CRIC Information Group under E-House China, has spent ten years on in-depth research into real estate industry and corporate topics, and many companies pay for its research data. The value of such data is obvious, so it can be called a corporate asset. Guiyang Big Data Trading Platform can package its own data into services and APIs for customers to purchase and use. Aggregation platforms, Qichacha, and Tianyancha provide corporate information queries. These are all data transactions with obvious, realizable value. Such data truly becomes data shared among companies, that is, data assets. On this view, whether data counts as a valuable asset may depend on whether it has transaction value now or potentially in the future.

Given the above, data lineage can serve as a measurement tool for assessing the value of data assets. Specifically:

Data lineage can clearly record the purchase and production costs of data. Even through subsequent processing, the cost can be recorded across the whole data life cycle, which resolves the uncertainty in the initial recognition of data assets. For example, we can record the value of data purchased from data suppliers. For assets such as indicators that we process manually in-house, we can keep tracking cost along the lineage and finally produce a summary.

Because data lineage reflects the multi-source nature of data, we can further identify the data assets formed at each processing step. For example, the aggregation and processing costs behind a given indicator can be shared.
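One possible reading of this cost tracking and sharing is sketched below in Python. It assumes an acyclic lineage, and it assumes that each dataset's accumulated cost is split equally among the datasets built directly from it; the source does not specify an allocation rule, so the equal split is purely illustrative. The dataset names and cost figures are hypothetical.

```python
from functools import lru_cache

# Hypothetical lineage: each dataset maps to the datasets it is built from.
UPSTREAM = {
    "vendor_feed": [],
    "cleaned_feed": ["vendor_feed"],
    "indicator_a": ["cleaned_feed"],
    "indicator_b": ["cleaned_feed"],
}

# Hypothetical direct costs: purchase price for the vendor feed,
# processing cost for everything else.
DIRECT_COST = {
    "vendor_feed": 1000.0,
    "cleaned_feed": 200.0,
    "indicator_a": 50.0,
    "indicator_b": 150.0,
}


def consumer_count(dataset):
    """Number of datasets built directly from `dataset`."""
    return sum(dataset in parents for parents in UPSTREAM.values())


@lru_cache(maxsize=None)
def allocated_cost(dataset):
    """Direct cost plus an equal share of each parent's accumulated cost.

    Assumes the lineage has no cycles; the equal split is an assumption.
    """
    total = DIRECT_COST[dataset]
    for parent in UPSTREAM[dataset]:
        total += allocated_cost(parent) / consumer_count(parent)
    return total


cost_a = allocated_cost("indicator_a")
cost_b = allocated_cost("indicator_b")
```

With this rule, the costs assigned to the final indicators add up to the total of all direct costs, so nothing is counted twice or lost along the lineage. Other allocation rules, for example weighting by usage volume, would fit the same structure.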
Data lineage reflects the life cycle of data, the entire process from generation to retirement. When data is archived or destroyed, that marks the end of the recorded data asset's life, which helps further measure the asset's value. In particular, as the business grows, the increasing number of tasks and data tables keeps pushing up big data resource costs. By building comprehensive and accurate end-to-end data lineage, we can identify downstream data users, make communication and information synchronization easier, and promptly take offline services that have not been called for a long time, which saves data costs.

Data assets also raise the question of whether the data circulates (what we call sharing). Most of our data projects serve internal management scenarios. We also need to consider whether some reference data circulates in the market, such as financial statements, operating data, and technical indicators published on the official website, so as to form circulating data assets (productization). Whether data is used internally or shared externally, we need to measure its value, which requires technologies like data lineage to register data assets online. On the one hand, measuring data value as an asset makes pricing easier in data-sharing transactions. On the other hand, and very importantly, it allows a data security protection level to be set based on the quantifiable value of data assets. Traditional assessments of data security protection level often rely entirely on regulatory requirements and business experience, lack a basis in specific application scenarios, and are divorced from how the data is actually used and from its true business value.
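To make the usage-based idea concrete, the following Python sketch scores an asset from the three signals mentioned earlier (number of users, usage volume, update frequency), maps the score to a protection level, and flags assets that have not been called for a long time as candidates for taking offline. The weights, thresholds, the 180-day idle window, and all names are hypothetical assumptions for illustration, not values from the source.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AssetUsage:
    """Usage facts that lineage and access logs could supply (hypothetical)."""
    name: str
    consumers: int          # distinct downstream users (demand side)
    monthly_reads: int      # usage volume
    updates_per_month: int  # update frequency
    last_read: date         # most recent downstream call


def value_score(asset):
    # Illustrative weights only; a real scheme would need calibration.
    return (3 * asset.consumers
            + asset.monthly_reads / 100
            + 2 * asset.updates_per_month)


def protection_level(asset):
    # Illustrative thresholds: a higher score means a higher protection level.
    score = value_score(asset)
    if score >= 100:
        return "high"
    if score >= 30:
        return "medium"
    return "low"


def offline_candidates(assets, today, idle_days=180):
    """Names of assets with no downstream call for more than `idle_days`."""
    return [a.name for a in assets if (today - a.last_read).days > idle_days]


# Hypothetical assets and dates.
revenue = AssetUsage("monthly_revenue", consumers=20, monthly_reads=5000,
                     updates_per_month=30, last_read=date(2024, 6, 1))
legacy = AssetUsage("legacy_export", consumers=1, monthly_reads=0,
                    updates_per_month=0, last_read=date(2023, 1, 1))

levels = {a.name: protection_level(a) for a in (revenue, legacy)}
stale = offline_candidates([revenue, legacy], today=date(2024, 6, 30))
```

The point of the sketch is only that both decisions can be driven by the same usage facts once lineage makes downstream consumers visible.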
Data lineage provides an evaluation method based on how data is actually used: the more users (the demand side) a dataset has, the heavier its usage, and the more frequently it is updated, the greater its value and the higher its data security protection level should be. In short, to turn data into assets, we need to design a set of systems and technical means around the "data value chain" so that value can be quantified and measured. Data lineage is the key technology for visualizing the path from raw data and data resources to data products and data assets.

Value 4: Add a "moral" lock to data abuse

In recent years, big data has made life more and more convenient, but the abuses that came with it, such as big-data-driven price discrimination against existing customers, misuse of facial recognition technology, and excessive permission requests, have harmed the legitimate interests of the public. People often suffer from these abuses but can do little about them. One of the main reasons for data abuse is that large amounts of data are held by super platforms, and ownership of the data is unclear as it is produced, collected, circulated, and used. In response, we have gradually put in place a number of security measures: access control and isolation, multi-tenant access isolation, data security classification and grading, tag-based mandatory access control, an ACL-based data access authorization model, and access control for data views. We also provide data desensitization and encryption, unified key management and access authentication services, data access audit logs, and more. Importantly, data lineage analysis is a key means of addressing data abuse. By tracking data lineage, we can confirm the source, owner, and flow of the data.
In this way, we can provide specific information for each stage of the data life cycle, such as collection, storage, use, transmission, sharing, publication, and destruction, and take targeted management measures. In particular, clarifying the rights relationship among data generators, users, and miners helps prevent abuse once data ownership is confirmed. Data lineage indirectly provides a compliance mechanism for auditing, improving risk management, and ensuring that data is stored and processed in accordance with data governance policies and regulations. For example, the GDPR was adopted in 2016 (and has applied since May 2018) to protect the personal data of people in the EU and EEA, giving individuals greater control over their own data. In the United States, individual states such as California have enacted laws such as the California Consumer Privacy Act (CCPA), which requires businesses to inform consumers about the data they collect. This type of legislation makes the storage and security of this data a top priority, and without data lineage analysis or related tools, organizations will find that identifying non-compliance issues is time-consuming and expensive.

Data lineage is a powerful tool in the era of fine-grained data management and control. Enterprises that make good use of it will be well placed to succeed in the field of data assets.