How data analysts should work with algorithm engineers is a long-standing problem. On the one hand, the business side expects more and more from models. On the other hand, many companies suffer from poor data collection, too few data staff, unclear work goals, and so on. How can analysis and algorithms work together to improve efficiency? Here is my take.

I. Two typical wrong practices

The "Goubuli" style: Some company leaders look down on their own data analysts as incompetent, always thinking that "only people who can build a model are any good". So the analysts protect themselves by drawing a clear line around any work with the word "model" in it and leaving it all to the algorithm engineers.

Of course, doing this crushes the algorithm engineers. For one thing, in many cases the "model" the leader has in mind is just something vague like "SWOT". For another, many modeling goals are simply unrealistic, such as "predict what I can do to be successful". If no one supports the basic feature screening work, the algorithm engineers will be exhausted. The project moves slowly, and in the end they still get criticized: "Why can't your model predict with 100% accuracy?!"

Such problems are common in traditional enterprises, especially during digital transformation, when leaders have read a lot of fancy PPTs and think they know a great deal.

Being treated like a dog: Some Internet companies have a fairly clear positioning for how algorithms are applied, and the algorithm group has relatively high status. As a result, they go to the other extreme: the analysts assigned to the algorithm group are ordered around like dogs. "You don't need to know what we are doing, just pull the data as instructed." The analysts' work is buried under endless data collection sheets. Doing this hurts everyone.
Because even the data analysts don't understand the algorithm logic, let alone the operations department. Lacking that knowledge, the operations department can only guess at the algorithm's effect by monitoring simple data indicators. At the slightest sign of trouble, they start to question: "The algorithm doesn't work!", "What did you change secretly?", "You are just messing around!" These accusations become the fuse for departments passing the buck and arguing, setting off endless internal friction.

II. Basic ideas for breaking the impasse

Essentially, analysis and algorithms are both applications of data. So here comes the soul-searching question: once you have data, does money pour out of the computer? Obviously not! Data by itself is no cure-all. For data to play a role, it has to be tightly integrated with the actual business, and you have to find the points where data can help. However, real business situations are very complex, and data and business behaviors are often intertwined.
In such situations, the business department can always pass the buck: "Our data team is too incompetent. It would be great if we had ByteDance's algorithms." On the data side, whether algorithm or analysis, everyone ends up taking the blame. So the real solution is for the data people to unite, find good scenarios, deliver results, and take less blame, rather than stepping on each other. That sounds abstract on its own, so let's look at a specific problem scenario.

III. Typical cooperation scenario 1: Project establishment

Problem scenario: A large manufacturing company hopes to establish a "multi-dimensional three-dimensional analysis model" to improve recruitment efficiency. Question: how should the data team take on this request?

This is a typical scenario where the requirements are unclear.
None of the above points is clear. Therefore, whoever is responsible, algorithm or analysis, must first ask the above questions. Of course, when the problem definition is unclear, it is more appropriate for the data analyst to step forward and communicate. Data analysts are closer to the business, so it is easier for them to understand the business language and guide the business side's thinking. The business side then responded further.
So, is it time to start building a "multi-dimensional" and "three-dimensional" model? No! Not even close!

IV. Typical Cooperation Scenario 2: Task decomposition

Three major problems restrict the progress of the project:

1. The definition of "suitable" for management positions is unclear. Assessing managers is much more complicated than assessing assembly line workers. For assembly line workers, only a few simple dimensions such as age, ID card, and education are required, and operating skills can be checked with a standardized assessment. For managers, there are also highly personal and unquantifiable assessment points such as "whether the leader likes him or not". Therefore, we cannot simply stop here. Further definition is needed.

2. Labor force data for each province and city is missing. Note: screening suitable resumes out of those HR has already received, and identifying where in the vast sea of people more labor is available, are two completely different issues. The resumes already received can be counted, but there is no data at all on the vast sea of people. Starting blindly is very likely to cause misjudgment.

3. Overall department employment cost and recruitment efficiency are two different issues. The department's total employment cost includes not only new hires, but also wages and benefits for current employees, compensation for departing employees, and so on. If the goal is to control the department's overall cost, then you need to analyze in advance, one by one, which part has the highest total, which part accounts for the largest proportion, which part is redundant, and which part is growing the fastest, and only then decide how to address them.

At this point, at least five tasks can be broken down.
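As a rough illustration of the cost breakdown described in the third problem, the sketch below ranks cost components by total, share, and growth. The component names and all figures are invented for illustration and do not come from the company in this scenario.

```python
# Hypothetical cost figures (any currency unit) for two periods; purely illustrative.
costs = {
    "new_hire_recruitment": {"last_year": 120.0, "this_year": 180.0},
    "in_service_wages": {"last_year": 900.0, "this_year": 945.0},
    "benefits": {"last_year": 200.0, "this_year": 210.0},
    "severance": {"last_year": 50.0, "this_year": 40.0},
}


def summarize_costs(costs):
    """Return one row per component with its total, share of total, and growth rate."""
    grand_total = sum(c["this_year"] for c in costs.values())
    rows = []
    for name, c in costs.items():
        growth = (c["this_year"] - c["last_year"]) / c["last_year"]
        rows.append({
            "component": name,
            "total": c["this_year"],
            "share": c["this_year"] / grand_total,
            "growth": growth,
        })
    return rows


rows = summarize_costs(costs)
largest = max(rows, key=lambda r: r["total"])
fastest = max(rows, key=lambda r: r["growth"])

# Print components from largest to smallest current total.
for r in sorted(rows, key=lambda r: r["total"], reverse=True):
    print(f'{r["component"]:<22} total={r["total"]:>7.1f} '
          f'share={r["share"]:.1%} growth={r["growth"]:+.1%}')
print("largest component:", largest["component"])
print("fastest growing:  ", fastest["component"])
```

In this made-up data, in-service wages dominate the total while recruitment spending grows fastest, which is the kind of distinction worth settling before deciding what any model should optimize.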
These five tasks are mainly data analysis. Data analysis clarifies the current situation and collects the data, so that the subsequent algorithm work can be targeted. For example:

1. When "suitable/unsuitable" labels for management positions already exist, build a classification model (logistic regression/decision tree) for interviewees based on resume information, information provided by headhunters, and recruitment channel information to predict the probability of "suitability".

2. When data on the overall labor cost structure, reasons for growth, and development trends already exists, build a predictive model (time series/multivariate regression) to judge whether labor costs will exceed expectations, and use it to inform decisions (for example, do not recruit heavily because of a short-term staff shortage, and compare the cost of overtime pay against the cost of adding new employees).

Of course, there is a third point of cooperation: when challenges arise at work, we tackle them together.

V. Typical Cooperation Scenario 3: Problem solving

When facing the ultimate question of "Why is the model inaccurate?", everyone must work together. The first step is to rule out the influence of external factors, unexpected fluctuations, and proactive business actions. Don't blame the model the moment a problem appears. For example: A sudden change in senior management completely changes the requirements for management recruitment. An epidemic breaks out in the region recruits come from, and people cannot leave. Industry leaders suddenly raise salaries, pushing up costs across the whole industry. The original recruitment plan is postponed for various reasons, and expectations are not met. New channels or new methods need to be added.

All these factors can invalidate the originally designed model or reduce its effectiveness. In response to these changes, data analysis should be at the forefront.
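One simple way for analysts to stay at the forefront is a daily check that flags when a monitored metric moves sharply away from its recent level. The sketch below is a minimal illustration, not a method described in this article: the metric (daily count of qualified resumes), the 7-day window, and the 3-standard-deviation threshold are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(values, window=7, threshold=3.0):
    """Return indices of days whose value deviates from the trailing window's
    mean by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily counts of qualified resumes; the drop on the last day
# stands in for an external shock such as a recruitment source being cut off.
daily_resumes = [52, 48, 50, 51, 49, 53, 50, 51, 49, 20]

alerts = flag_anomalies(daily_resumes)
print("days to investigate:", alerts)
```

A flagged day is a prompt to ask the business what changed, not proof that the model is wrong. Note that once a shock enters the trailing window it inflates the standard deviation, so a check this simple may stay quiet on the days right after the first alert.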
By monitoring the data daily, analysts can discover problems early, flag business risks, and remind everyone to pay attention to changes, instead of waiting for the business side to come knocking and then arguing.

VI. Summary

The work of algorithms and analysis differs in nature, so when they cooperate they naturally have different focuses. The ideal way of cooperating is: analysis clears business obstacles, and algorithms focus on improving efficiency. Everyone works together to deliver results. In fact, if you work long enough and have enough contact with the business, you will find that most "model building" requests that come straight from the business are unreliable, either because data is missing or because goals are unclear. This is especially true of prediction problems (classification problems are somewhat better). Requests that have first been translated by data analysts are much more reliable.

Author: Down-to-earth Teacher Chen
WeChat public account: "Down-to-earth Teacher Chen"