Chapter 2: Where are the boundaries of data governance?

In the wave of digital transformation, data governance has become a core issue that enterprises cannot ignore. Yet it is such a large topic that many enterprises and data managers do not know where to start. This chapter looks at the boundaries of data governance: the roles and responsibilities of the three participants (data producers, data processors, and data consumers) and the two possible scopes of governance.

I don't know how other people feel about data governance, but to me it feels like facing a huge monster with no obvious place to begin. There seems to be a great deal to do, but which tasks actually need doing? What comes first and what comes next? Is there a key node that affects the whole system? How can we deliver results in phases? How can we roll it out smoothly? There seem to be more questions than answers.

If one thing has to be agreed on first, I personally believe it is the boundary of what is to be governed. Determining that boundary is the first step in data governance.

1. Three Data Participants

According to the data flow, there are three participants: data producers, data processors, and data consumers.

Data producers are the business systems that generate data. Data processors are the data departments that clean, model, and process data, usually the data middle office. Data consumers are the departments that ultimately use the cleaned and processed data, which can be business departments or analysis departments.

2. Two Boundaries of Data Governance

Determining the boundary of data governance here means deciding whether governance covers only the scope handled by data processors, or whether it also covers the scope handled by data producers.

In other words, do we govern only the data that business systems have generated and that has already been imported into the data center, that is, governance after entering the lake? Or do we also govern the data in the business systems before it is imported, that is, governance before entering the lake?

Taken literally, enterprise-level data governance should of course include governance both before and after entering the lake. In practice, however, the two differ greatly in difficulty, process, and the range of people who must be involved. Data is often compared to flowing water. Governance after entering the lake works on the lower reaches of the river; governance before entering the lake requires changes in the upper reaches, and it is easy to underestimate how much harder that is.

The first type, governance after entering the lake, is led mainly by the data middle office (assuming the data middle office takes the lead), with business departments assisting. The second type adds governance before entering the lake, which requires the whole company to adapt and modify its systems. Huawei is said to have reached the point where a business system must meet defined data governance standards and quality requirements, covering both sides of the lake, before it is released. A system that does not meet them is not allowed to go online.
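A release gate of the kind described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of Huawei's actual process: the `ReleaseCandidate` fields, the `release_blockers` function, and the 0.95 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReleaseCandidate:
    """Hypothetical governance facts about a business system awaiting release."""
    system: str
    has_data_dictionary: bool
    has_data_owner: bool
    quality_pass_rate: float  # share of records passing quality rules, 0.0 to 1.0


def release_blockers(candidate: ReleaseCandidate, min_pass_rate: float = 0.95) -> List[str]:
    """Return the reasons the system may not go online (empty list means it may)."""
    blockers = []
    if not candidate.has_data_dictionary:
        blockers.append("missing data dictionary")
    if not candidate.has_data_owner:
        blockers.append("no data owner assigned")
    if candidate.quality_pass_rate < min_pass_rate:
        blockers.append("quality pass rate below threshold")
    return blockers


candidate = ReleaseCandidate("order_system", True, False, 0.91)
blockers = release_blockers(candidate)
can_go_online = not blockers
print(can_go_online, blockers)
```

Returning the list of blockers rather than a bare yes or no gives the business team something concrete to fix, which matters when the gate is enforced by a different department.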

If you announce at the start of a data governance project that you will govern everything, before and after entering the lake, and that you need the whole company to cooperate, the project is very unlikely to succeed. At this point you will often hear that "data governance is a top-level project", but in my view it is rarely clear how the person at the top should support it or what exactly should be supported. If support is given unconditionally and the business systems are disrupted, who takes responsibility? After all, data governance is at present still a nice-to-have rather than a necessity.

And, for now, there does not seem to be a clear, proven path to success.

Of course, this does not mean that leadership support is unnecessary. In data governance, leadership support is the condition of good timing (we will talk about location and people later). The point is that leadership support can only be won once the path is clear. Only then can leaders commit people, money, and time, and even then expectations must be managed.

Support is usually conditional: leaders need to see a realistic chance of success before they give it.

3. Whether to carry out data governance before entering the lake

Does that mean data governance before entering the lake is unnecessary? Not really. It means we start with governance after entering the lake only, then gradually extend into governance before entering the lake through problem-driven and scenario-driven approaches, influencing the business step by step until governance is global.

The problem-driven approach is easy to understand. Suppose you find that some data is inconsistent, that metric definitions cannot be unified, or that values do not reconcile, and the root cause is that data from a particular system often contains anomalies. That problem can then be used to push the business to govern its data before it is imported into the data middle office, that is, to govern the business system itself. The erroneous data becomes the lever that forces the source system to improve its data quality.

Of course, this process needs tools for monitoring and support. With a tool, the business system's owners can flexibly configure the relevant monitoring rules. If the checks are done by hand, there is no way to measure the effect afterwards. So the tooling has to be prepared first. On top of the tools there must be policies and specifications that the tools enforce, and there must be people in the organization assigned to respond to issues and take responsibility for them.
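As a rough illustration of what configurable monitoring rules with measurable results might look like, here is a minimal sketch. The `Rule` class, the `run_rules` function, the rule names, and the sample order records are all assumptions made for this example, not part of any specific governance tool.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A hypothetical monitoring rule configured for one source system."""
    name: str
    source_system: str
    check: Callable[[Record], bool]  # returns True when the record is valid


def run_rules(records: List[Record], rules: List[Rule]) -> Dict[str, Dict[str, Any]]:
    """Apply every rule to every record and collect per-rule statistics."""
    report: Dict[str, Dict[str, Any]] = {}
    total = len(records)
    for rule in rules:
        failed = sum(1 for record in records if not rule.check(record))
        report[rule.name] = {
            "source_system": rule.source_system,
            "total": total,
            "failed": failed,
            "pass_rate": (total - failed) / total if total else 1.0,
        }
    return report


# Sample data standing in for records exported by a business system.
orders = [
    {"order_id": "A1", "amount": 120.0, "customer_id": "C01"},
    {"order_id": "A2", "amount": -5.0, "customer_id": "C02"},
    {"order_id": "A3", "amount": 40.0, "customer_id": None},
    {"order_id": "A4", "amount": 75.5, "customer_id": "C03"},
]

rules = [
    Rule("amount_non_negative", "order_system", lambda r: r["amount"] >= 0),
    Rule("customer_id_present", "order_system", lambda r: r["customer_id"] is not None),
]

report = run_rules(orders, rules)
for name, stats in report.items():
    print(name, stats["failed"], stats["pass_rate"])
```

Because each run produces a pass rate per rule and per source system, the numbers can be tracked over time, which is the kind of effect statistics that manual checking cannot provide.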

The other approach is scenario-driven. In some relatively important scenario, definitions have long been inconsistent within the company: for the same indicator, one team reports one value and another team reports a different one. Or key master data, such as personnel or addresses, cannot be linked across systems. In such cases, a concrete scenario is used to bring people from different organizations together and unify the definitions with tools under a single specification. For example, the address data held in different systems is standardized, and the personnel records collected by different systems are linked and made consistent.
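The inconsistent-indicator problem can be shown with a small example. In this sketch, two hypothetical teams compute "revenue" differently from the same orders, and a shared registry then fixes one agreed definition. The team functions, the `METRIC_SPEC` registry, and the choice of "paid orders only" as the agreed definition are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

orders: List[Row] = [
    {"id": 1, "amount": 100.0, "status": "paid"},
    {"id": 2, "amount": 50.0, "status": "refunded"},
    {"id": 3, "amount": 30.0, "status": "paid"},
]


def revenue_team_a(rows: List[Row]) -> float:
    """Team A's private definition: every order counts, refunds included."""
    return sum(r["amount"] for r in rows)


def revenue_team_b(rows: List[Row]) -> float:
    """Team B's private definition: only paid orders count."""
    return sum(r["amount"] for r in rows if r["status"] == "paid")


# One agreed specification per indicator, shared by every team.
METRIC_SPEC: Dict[str, Dict[str, Any]] = {
    "revenue": {
        "description": "Sum of amount over orders with status 'paid'",
        "compute": revenue_team_b,
    },
}


def compute_metric(name: str, rows: List[Row]) -> float:
    """Compute an indicator through the shared specification only."""
    compute: Callable[[List[Row]], float] = METRIC_SPEC[name]["compute"]
    return compute(rows)


print(revenue_team_a(orders), revenue_team_b(orders))  # the two teams disagree
print(compute_metric("revenue", orders))               # one value for everyone
```

The technical part is small. The hard part, as the text says, is getting people from different organizations to agree on which definition goes into the shared specification.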

Each scenario-driven governance effort of this kind is a fairly large systems project. ECIF, for example, is an independent system for connecting user master data.
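To give a feel for what connecting master data across systems involves, here is a deliberately tiny sketch that links personnel records from two systems by a normalized identifier. It is not how an ECIF works internally; the system names, field names, and the `normalize_id` rule are hypothetical, and real master data matching is far more involved.

```python
from collections import defaultdict
from typing import Dict, List


def normalize_id(raw: str) -> str:
    """Reduce an identifier to uppercase letters and digits so variants match."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


# Sample personnel records as two hypothetical systems might store them.
hr_system = [
    {"emp_no": "e-1001", "name": "Li Wei"},
    {"emp_no": "E-1002", "name": "Zhang Min"},
]
crm_system = [
    {"staff_id": "E1001", "owner": "Li Wei"},
    {"staff_id": "E1003", "owner": "Wang Fang"},
]


def build_master_index(
    sources: Dict[str, List[dict]], id_fields: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each normalized identifier to the systems in which it appears."""
    index: Dict[str, List[str]] = defaultdict(list)
    for system, rows in sources.items():
        for row in rows:
            index[normalize_id(row[id_fields[system]])].append(system)
    return dict(index)


index = build_master_index(
    {"hr": hr_system, "crm": crm_system},
    {"hr": "emp_no", "crm": "staff_id"},
)

# Identifiers found in only one system are candidates for follow-up.
unlinked = sorted(key for key, systems in index.items() if len(systems) < 2)
print(index)
print(unlinked)
```

Even this toy version shows why such work becomes a project of its own: every source system needs an agreed identifier field and an agreed normalization rule before records can be linked.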

This kind of gradual improvement is how we influence the business and eventually achieve governance of the business systems' data before it enters the lake.

There is another reason to start with governance after entering the lake. It builds communication and trust between the data middle office and the business. The business can see what is being done and is influenced through concrete action, rather than feeling that governance is optional or even resisting it.

4. Conclusion

This chapter covered the first question to settle in data governance: its boundary. Start by governing data after it enters the lake, then extend governance to before the lake step by step through problem-driven and scenario-driven methods, until governance is global.

With the boundary settled, and before going further into the content of data governance, let us first look at the difference between data management and data governance. Clarifying what each of the two terms means will give us a better understanding of data governance itself.
