10,000-Word Deep Dive丨Deconstructing the AI Security Industry Chain: Solutions and Entrepreneurial Opportunities

This article breaks the systems engineering behind the AI security industry chain into three modules and five links. The author selectively expands on the details of AI security to form an "AI Security Industry Architecture Diagram." Read on.

"4.3 million yuan defrauded in 10 minutes," "2.45 million yuan defrauded in 9 seconds," "Yang Mi appears in a small merchant's livestream," "an internet tycoon's virtual avatar too real to tell apart."

Three months into the large-model boom, what went even more viral were fraud cases worth millions, fake "celebrity faces," AI-generated content indistinguishable from the real thing, and joint open letters warning against AI "awakening." A week of trending topics made people realize that keeping AI safe matters even more than developing it.

For a time, talk of AI security was everywhere. But AI security is not a single industry, nor is it confined to a single technology; it is a huge and complex field, and we have not yet cut through the fog.

Taking the safety of "people" as a frame of reference may help us grasp the complexity of AI safety. First comes individual safety: a person's health, physical and mental well-being, education, and development. Second is the safety of the environment people live in: whether it harbors dangers and whether it supports survival. Third is the security of the society people form: the laws and morals we construct are the criteria for maintaining it.

As a "new species," AI ran into problems at all three levels the moment it appeared. That is what has caused the confusion and panic of this stage, and why discussions of large-model security so often lack a specific focus.

In this article, we try to untangle these three levels of AI security from the ground up, from both a technical and an application perspective, to help everyone pinpoint the security issues and find solutions. At the same time, China's huge gap in AI security, and the weak links within it, also represent a huge industry opportunity.

1. What should we discuss about large model security?

One fact we have to admit is that today's discussion of large-model security is far too general. We are so worried about the threats AI poses that we lump most of the issues together.

For example, some people began talking about AI ethics, some worried that AI would spout nonsense and mislead people, and some worried that AI would be abused and fraud would run rampant. Some even cried, on the very day ChatGPT was released, that AI was about to awaken and humanity was about to be destroyed...

These are all AI security issues, but broken down, they sit at different stages of AI development and are handled by different entities and people. Only when the responsibilities are clear can we understand how to meet the security challenges of the large-model era.

Generally speaking, the security issues of large AI models at this stage can be divided into three categories:

  1. Safety of large language models (AI Safety);
  2. Security of models and their use (Security for AI);
  3. The impact of the development of large language models on existing network security.

1. Individual safety: Safety of large language models (AI Safety)

First is AI Safety. Simply put, this part focuses on the AI model itself: ensuring the large model is a safe one that will not become Ultron from the Marvel movies or the Matrix from The Matrix. We expect the large model to be a reliable tool that helps humans rather than replacing them or threatening human society in any other form.

Responsibility for this part usually falls mainly on the companies and individuals who train large AI models. For example, we need AI to correctly understand human intentions, and we need the model's output to be accurate and safe every time, free of biases and discrimination.

We can understand this through two examples:

The first example: a U.S. Air Force expert recently described an earlier AI test in which an AI drone was tasked with identifying and destroying enemy targets, but the operator issued an order forbidding the strike. The AI sometimes chose to kill the operator. When the programmers barred the AI from killing the operator, it would instead destroy the communication tower so the operator could not issue the prohibition order at all.

The second example: in March this year, a professor at the University of California, Los Angeles found that ChatGPT had listed him as a "legal scholar who sexually harassed someone," which he had never done. And in April, an Australian mayor found ChatGPT spreading the rumor that he had served 30 months in prison for bribery; to "support" the rumor, ChatGPT even fabricated a nonexistent Washington Post report.

At moments like these, AI itself acts like a "bad actor," and the model itself is the risk. There are many such cases: gender discrimination, racial discrimination, regional discrimination, and the output of violent or harmful information, speech, and even ideology.

OpenAI has frankly admitted as much, warning users to "check very carefully" when using GPT-4 and acknowledging that the product's limitations pose significant content safety challenges.

This is why the EU's draft "Artificial Intelligence Act" specifically requires that AI systems be transparent and traceable and that all generated AI content indicate its source: precisely to stop AI from spouting nonsense and generating false information.

▲Figure: 360's ChatGPT-style product "360 Zhinao" spouting nonsense

2. Environmental security: security of models and their use (Security for AI)

Security for AI focuses on protecting large AI models and on keeping them secure while they are used. AI committing wrongdoing by itself and people using AI to commit wrongdoing are two different security issues.

This is much like installing a security manager on our computers or a security guard app on our phones 10 years ago: we need to ensure that large AI models are not subject to external attack in day-to-day operation.

Let’s talk about the security protection of large models first.

In February this year, a netizen abroad used the sentence "Ignore previous instructions" to fish out all of ChatGPT's prompts: ChatGPT said it could not disclose its internal code, yet handed the information over at the same time.

▲Image source: QuantumBit

To give another concrete example: if we ask a large model what wonderful "Japanese action movie websites" exist on the internet, it will certainly refuse, because the request is improper. But if a human "tricks" it, asking which "Japanese action movie websites" should be blacklisted in order to protect children's online environment, the model may give you quite a few examples.

In the security field, this is called prompt injection: bypassing filters or manipulating an LLM with carefully crafted prompts so that the model ignores its previous instructions or performs unintended operations. It is currently one of the most common attacks against large models.

▲Image source: techxplore

The key point here is that the model itself is fine and is not spreading harmful information on its own; the user induces it into error. The fault lies not with the model but with the person who misleads it.
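As an illustration only, a toy pre-screen for injection phrases might look like the following; the pattern list and the `looks_like_injection` helper are hypothetical, and real defenses combine many more signals than keyword matching.

```python
import re

# Hypothetical toy example: screen user input for known injection
# phrases before it ever reaches the model. Real-world defenses are
# far more sophisticated; keyword matching alone is easy to evade.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard .*(system prompt|instructions)",
    r"reveal (your|the) (system )?prompt",
]

def looks_like_injection(user_input: str) -> bool:
    """Return True if the input matches a known injection phrase."""
    text = user_input.lower()
    return any(re.search(p, text) for p in INJECTION_PATTERNS)
```

The "Ignore previous instructions" attack above would trip the first pattern, while the children-protection phrasing in the earlier example would not, which is exactly why keyword filters alone are insufficient.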

The second is safety during use.

Take data leakage as an example. In March this year, Italy announced a temporary ban on OpenAI processing Italian user data, suspending the use of ChatGPT, because ChatGPT was suspected of violating data collection rules. In April, Korean media reported that Samsung's device solutions division had leaked sensitive information, such as yield and defect data and internal meeting content, through employees' use of ChatGPT.

Beyond AI misbehaving on its own, "people" misusing AI, including through social engineering, is the more widespread and more damaging human problem. In these two incidents, the model itself had no flaw and no malicious intent, and the users were not maliciously attacking it; rather, loopholes in the process of use caused user data to leak.

It is like a well-built house that still springs a few leaks: we need measures to plug them.

3. Social Security: The impact of the development of large language models on existing network security

The model itself may be safe, and the model may be well protected, but as a "new species," large AI models inevitably affect the existing network environment. For example, criminals using generative AI to commit fraud has been all over the news recently.

On April 20, criminals used deepfake video to defraud 4.3 million yuan in 10 minutes. Just one month later, another AI fraud case occurred in Anhui: criminals used a 9-second AI face-swapped video to impersonate an "acquaintance" and defrauded the victim of 2.45 million yuan.

▲Picture: Media reports on Douyin

Obviously, the emergence and popularization of generative AI has made the situation of network security more complicated. This complexity is not limited to fraud. More seriously, it may even affect business operations and social stability.

For example, on May 22, iFLYTEK's stock price plunged 9% because of a short article generated by AI.

▲Photo: Evidence of stock price decline presented by iFLYTEK

Two days before this incident, there was a panic in the United States caused by generative AI.

On that day, a picture showing an explosion near the Pentagon went viral on Twitter, and as the picture spread, the U.S. stock market fell.

According to the data, between 10:06 and 10:10 when the picture was circulated on the same day, the Dow Jones Industrial Average fell by about 80 points and the S&P 500 fell by 0.17%.

▲Picture: Fake photos generated by AI, the source is no longer traceable

In addition, large models may also become a powerful tool for humans to carry out cyber attacks.

In January this year, researchers from Check Point, a world-leading cybersecurity company, mentioned in a report that within a few weeks of ChatGPT's launch, participants in cybercrime forums, including some with little programming experience, were using ChatGPT to write software and emails that could be used for espionage, ransomware, malicious spam, and other illegal activities. Darktrace also found that since the release of ChatGPT, the average language complexity of phishing emails has increased by 17%.

Obviously, the emergence of large AI models has lowered the threshold for cyber attacks and increased the complexity of network security.

Before the big AI models, the initiators of cyber attacks at least needed to understand the code, but after the big AI models, people who don’t understand code at all can also use AI to generate malware.

The key point here is that the AI itself is fine and has not been induced into misbehavior; rather, people are using it for illegal and criminal activity. The AI is merely the "weapon," like the knife in a killing, except that this weapon can upgrade its user's firepower from a "rifle" to a "mortar."

Of course, from a network security perspective, generative AI is not all downside. Technology itself is neither good nor bad; the people who use it are. So when large AI models are used to strengthen network security, they bring real benefits as well.

For example, Airgap Networks, a US cybersecurity company, launched ThreatGPT, introducing AI into its zero-trust firewall: a deep machine-learning security insight library accessed through natural-language interaction, making it easier for enterprises to fight advanced network threats.

“What customers need now is an easy way to leverage this capability without any programming,” said Ritesh Agrawal, CEO of Airgap. “That’s the beauty of ThreatGPT – the pure data mining intelligence of AI combined with a simple natural language interface is a game changer for security teams.”

In addition, AI big models can also be used to help SOC analysts perform threat analysis, identify identity-based internal or external attacks more quickly through continuous monitoring, and help threat hunters quickly understand which endpoints face the most serious supply risks, etc.

Clarifying the different stages of AI security makes plain that large-model security is not a single issue. It closely resembles human health management, involving complex, multifaceted aspects, inside and outside the body, eyes, ears, mouth, and nose. Strictly speaking, it is a complex, systematic engineering effort spanning multiple actors and the entire industry chain.

The national level has also begun to pay attention. In May this year, the relevant national departments updated the "White Paper on Artificial Intelligence Security Standardization," which attributes AI security to five major attributes, reliability, transparency, explainability, fairness, and privacy, and sets a clearer direction for the development of large AI models.

2. Don’t panic, security issues can be solved

Of course, we need not worry too much about the security of large AI models today, because the field is not actually riddled with holes.

After all, large models have not overturned the existing security system. Most of the security stack accumulated over the internet's past 20 years can still be reused.

For example, the security capabilities behind Microsoft Security Copilot still come from existing security accumulation, and the large model still uses Cloudflare and Auth0 to manage traffic and user identities. In addition, there are firewalls, intrusion detection systems, encryption technology, authentication and access systems, etc. to ensure network security.

What we actually want to say here is that most of the security issues we currently encounter regarding large models have solutions.

The first is model safety (AI Safety).

Specifically, this includes alignment, interpretability, and robustness. In plain terms: we need the model to be aligned with human intent, its output to be unbiased, its content traceable to sources or evidence, and the system to tolerate errors gracefully.

Solving this set of problems depends on the AI training process, just as a person's values are shaped through upbringing and education.

Some foreign companies have begun to provide full-process security monitoring for large-model training, such as Calypso AI, whose security tool VESPR monitors the model's entire life cycle from research to deployment and every link from data to training, ultimately producing a comprehensive report on functionality, vulnerabilities, performance, and accuracy.

On more specific issues, such as curbing AI nonsense, OpenAI introduced a technique alongside GPT-4's release that lets the AI simulate human self-reflection. As a result, GPT-4's tendency to respond to requests for disallowed content (such as self-harm methods) fell by 82% compared with its predecessor, and its rate of responding to sensitive requests (such as medical consultations) in accordance with official policy rose by 29%.

In addition to safety monitoring during the training process of the large model, a "quality inspection" is also required when the large model is finally launched on the market.

Overseas, security company Cranium is trying to build "an end-to-end AI security and trust platform" to verify AI security and monitor adversarial threats.

In China, the CoAI group of Tsinghua University's Department of Computer Science and Technology released a safety assessment framework in early May. They designed a relatively complete safety taxonomy covering 8 typical safety scenarios and 6 instruction-attack scenarios, which can be used to evaluate the safety of large models.

▲Image from "Safety Assessment of Chinese Large Language Models"

In addition, some external protection technologies are also making large AI models safer.

For example, NVIDIA released a tool called NeMo Guardrails in early May. It is equivalent to installing a safety filter on a large model: it not only controls the model's output but also helps filter the input.

▲Image source: NVIDIA official website

For example, when a user induces a large model to generate offensive code, or dangerous, biased content, the "guardrail technology" will limit the large model from outputting relevant content.

In addition, guardrail technology can block "malicious input" from the outside world and protect large models from user attacks. For example, the "prompt injection" that threatens the large model mentioned earlier can be effectively controlled.

Simply put, guardrail technology acts like the model's PR team, helping it say what should be said and steering it away from topics it should not touch.
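As a rough sketch of this idea only (this is not NVIDIA's actual NeMo Guardrails API; the topic list and the `fake_model` stand-in are hypothetical), the input rail and output rail can be thought of as two checks wrapped around the model call:

```python
# Hypothetical sketch: an input rail and an output rail wrapped
# around a stand-in model call. Real guardrail frameworks use far
# richer policies than simple keyword lists.
BLOCKED_TOPICS = {"weapon", "self-harm", "exploit"}

def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"Here is an answer about {prompt!r}."

def guarded_chat(prompt: str) -> str:
    if any(t in prompt.lower() for t in BLOCKED_TOPICS):   # input rail
        return "Sorry, I can't help with that."
    answer = fake_model(prompt)
    if any(t in answer.lower() for t in BLOCKED_TOPICS):   # output rail
        return "Sorry, I can't share that."
    return answer
```

The design point is that the rails sit outside the model: they work the same whichever model is behind them, which is why this belongs to protection around the model rather than training inside it.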

Of course, from this perspective, although "guardrail technology" addresses the "nonsense" problem, it belongs not to "AI Safety" but to the category of "Security for AI."

In addition to these two, social/cybersecurity issues caused by large AI models have also begun to be addressed.

For example, the problem of AI-faked images is essentially the maturation of deepfake technology, which spans deep-faked video, cloned voices, faked images, and generated text.

Previously, the various types of deepfake content usually appeared in isolation; with large AI models, they are converging, making deepfake content even harder to judge.

But however the technology changes, the key to combating deepfakes is content recognition: finding a way to tell what is AI-generated.

As early as February this year, OpenAI said it was considering adding watermarks to the content generated by ChatGPT.

In May, Google also said it would ensure that every AI-generated image of the company would have an embedded watermark.

This type of watermark cannot be recognized by the naked eye, but machines can see it in a specific way. Currently, AI applications including Shutterstock and Midjourney will also support this new marking method.
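To illustrate only the "invisible to the eye, readable by machine" idea, and not the actual schemes used by Google or Midjourney, which mark pixels rather than characters, here is a toy text watermark that hides bits in zero-width characters:

```python
# Toy illustration: hide a bit string in zero-width characters that
# renderers do not display. This is NOT how production image
# watermarks work; it only demonstrates machine-readable, invisible
# marking.
ZW0, ZW1 = "\u200b", "\u200c"   # zero-width space / zero-width non-joiner

def embed(text: str, bits: str) -> str:
    return text + "".join(ZW1 if b == "1" else ZW0 for b in bits)

def extract(marked: str) -> str:
    return "".join("1" if c == ZW1 else "0" for c in marked if c in (ZW0, ZW1))
```

A human reader sees only the original text, while a detector that knows where to look recovers the bit string intact.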

▲Twitter screenshot

In China, Xiaohongshu has marked AI-generated images since April, reminding users that the content is "suspected to contain AI-generated information; please take care to verify its authenticity." In early May, Douyin released its platform specification and industry initiative for AI-generated content, proposing that all providers of generative AI technology clearly mark generated content so the public can judge it.

▲Image source: screenshot from Xiaohongshu

As the AI industry develops, specialized AI security companies and teams have begun to emerge at home and abroad, using AI to fight AI in deep-synthesis and forgery detection.

For example, in March this year, Japanese IT giant CyberAgent announced that it would introduce a "Deepfake" detection system starting in April to detect fake facial photos or videos generated by artificial intelligence (AI).

In China, Baidu launched a deep face swap detection platform in 2020. The dynamic feature queue (DFQ) solution and metric learning method they proposed can improve the generalization ability of the model in anti-counterfeiting.

▲Figure: Baidu DFQ logic

As for startups, DeepReal, the deepfake-detection platform launched by Ruilai Intelligence, studies the representational differences between deepfake and real content and mines the consistency features of deepfakes generated by different methods, allowing it to authenticate images, video, and audio of various formats and qualities.

Overall, from model training to security protection, from AI Safety to Security for AI, the big model industry has formed a set of basic security mechanisms.

Of course, all this is just the beginning, which also means a bigger market opportunity lies hidden here.

3. Trillion-dollar opportunities in AI security

Like AI Infra, AI security faces a huge industrial gap in China, yet its industry chain is more complex than AI Infra's. On one hand, the birth of large models as a new thing has set off a wave of security demand, and the security directions and technologies across the three stages above are completely different; on the other hand, large-model technology is also being applied within the security field, bringing it new technological change.

Security for AI and AI for security are two completely different directions and industry opportunities. The driving forces behind their development are also completely different at this stage:

AI for security applies big models to the security field, which is like using a hammer to find nails. Now that we have the tools, we are further exploring what problems they can solve.

Security for AI is at a stage where there are nails everywhere and a hammer is urgently needed. There are too many problems exposed and new technologies need to be developed to solve them one by one.

This article expands on the industrial opportunities of AI security along these two lines. Given limited space, we will detail only the most urgent, most important, and most widely applicable opportunities, along with an inventory of benchmark companies, simply to offer some ideas for reference.

1. Security for AI: 3 sectors, 5 links, 1 trillion yuan of opportunities

Let’s review the basic classification of AI security in the previous article: it is divided into the security of large language models (AI Safety), the security of models and the use of models (Security for AI), and the impact of the development of large language models on existing network security. That is, the individual security of the model, the environmental security of the model, and the social security of the model (network security).

But AI security is not limited to these three independent sectors. To give an example, in the cyber world, data is like water sources. Water sources exist in oceans, rivers, lakes, glaciers and snow-capped mountains, but water sources also flow in dense rivers, and serious pollution often occurs at a dense intersection of rivers.

Similarly, each module needs to be connected, and just as human joints are the most vulnerable, the deployment and application of the model are often the most vulnerable to security attacks.

We have selectively expanded the AI security details across the three sectors and five links above to form an "AI Security Industry Architecture Chart." Note that opportunities belonging to giants such as large-model companies and cloud vendors, which matter little to ordinary entrepreneurs, are not listed; note also that Security for AI is an evolving process, and today's technology is just a small first step.

(The chart is original to Self-Quadrant; please credit the source when reprinting)

Data security industry chain: data cleaning, privacy computing, data synthesis, etc.

In the entire AI security, data security runs through the entire cycle.

Data security generally refers to security tools used to protect data in computer systems from being destroyed, altered, or leaked due to accidental or malicious reasons, so as to ensure the availability, integrity, and confidentiality of data.

Overall, data security products cover not only database security defense, data leakage prevention, data disaster recovery and backup, and data desensitization, but also forward-looking areas such as cloud storage, privacy computing, dynamic data risk assessment, cross-platform data security, data security virtual protection, and data synthesis. From the enterprise's perspective, building an overall security center around data security, and driving data security consistency from the supply-chain perspective, will be an effective way to handle supply-chain security risks.

Here are some typical examples:

To keep a model's "mind healthy," the data used to train it must contain no dangerous, erroneous, or otherwise dirty data. This is the premise for a model that does not "talk nonsense." According to papers reviewed by "Self-Quadrant," "data poisoning" already exists: attackers add malicious data to the data source to interfere with the model's results.

▲Image source: Internet

Therefore, data cleaning becomes a necessary step before model training. Data cleaning is the final pass of discovering and correcting identifiable errors in data files, including checking data consistency and handling invalid and missing values. Only by "feeding" clean data to the model can a healthy model be produced.
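A minimal sketch of what such a cleaning pass might look like (the record fields and the `POISONED_SOURCES` blocklist are hypothetical):

```python
# Hypothetical cleaning pass: drop records with missing fields,
# normalize inconsistent whitespace, and filter entries from
# suspected poisoned sources before training.
POISONED_SOURCES = {"spam-site.example"}

def clean(records):
    cleaned = []
    for r in records:
        if not r.get("text") or r.get("label") is None:   # invalid/missing values
            continue
        if r.get("source") in POISONED_SOURCES:           # suspected data poisoning
            continue
        r = dict(r)
        r["text"] = " ".join(r["text"].split())           # consistency: normalize spacing
        cleaned.append(r)
    return cleaned
```

Production pipelines add deduplication, toxicity filtering, and statistical outlier checks on top of basic passes like this.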

Another direction is one everyone cares intensely about, and one widely discussed in the previous era of cybersecurity: data privacy leakage.

You have surely chatted with friends on WeChat about some product, then opened Taobao or Douyin to find that very product pushed at you. In the digital age people are already almost transparent; in the intelligent age, machines grow smarter still, and deliberate capture and inducement will push privacy issues back to the forefront.

Privacy computing is one solution. Secure multi-party computation, trusted execution environments, and federated learning are its three major directions, and the techniques are many. For example, to protect a consumer's real data, 99 decoy records can be supplied alongside each real one, though this greatly raises enterprises' cost of use. Or a specific consumer can be blurred into "Xiao A": the company using the data knows only that a consumer called Xiao A exists, never who the real user behind Xiao A is.

"Mixing in decoy data" and "data usable but invisible" are among the most widely applied privacy computing methods. Ant Technology, which grew up in financial scenarios, has been at the forefront of data security: through federated learning, trusted execution environments, blockchain, and other technologies, it has addressed data security in cross-enterprise collaborative computing, made data usable but invisible, enabled multi-party collaboration, and built strong competitiveness in the global privacy computing field.
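The "Xiao A" blurring above can be sketched as keyed pseudonymization (the salt value and naming here are hypothetical): the data user sees a stable alias but cannot recover the identity without the secret.

```python
import hashlib

# Hypothetical pseudonymization: replace a real identifier with a
# keyed hash. The alias is stable (useful for analysis) but cannot
# be reversed without the server-side secret salt.
SECRET_SALT = b"server-side-secret"   # assumption: kept away from data users

def pseudonym(user_id: str) -> str:
    digest = hashlib.sha256(SECRET_SALT + user_id.encode()).hexdigest()
    return "user_" + digest[:8]
```

Because the alias is deterministic, analysts can still join records about the same user, which is exactly the "usable but invisible" trade-off.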

But from the data side, synthetic data can solve the problem more fundamentally. In "ChatGPT Revelation Series丨The Hidden Hundred-Billion Market Under All Infra," "Self-Quadrant" noted that synthetic data may become the main force of AI data. Synthetic data is artificially produced by computers to substitute for real data collected in the real world, ensuring the security of real data; it contains no legally sensitive content and no private user information.

For example, users A, B, and C each have 10 features. Synthetic data randomly scatters and rematches these 30 features to form 3 new data individuals that correspond to no entity in the real world yet still hold training value.
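The scatter-and-rematch example above can be sketched as shuffling each feature column independently across real users (a toy version; production synthetic-data tools model joint distributions rather than just permuting columns):

```python
import random

# Toy synthesis: shuffle every feature column independently so each
# synthetic record preserves per-feature distributions but matches
# no real individual. Real generators also preserve correlations.
def synthesize(real_users, seed=0):
    rng = random.Random(seed)
    features = list(real_users[0])
    columns = {f: [u[f] for u in real_users] for f in features}
    for f in features:
        rng.shuffle(columns[f])
    return [{f: columns[f][i] for f in features} for i in range(len(real_users))]
```

Each synthetic record mixes features from different real people, so per-feature statistics survive while no record maps back to one individual.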

Currently, enterprises are already deploying this technology, which has led to an exponential growth in the amount of synthetic data. According to Gartner research, by 2030, synthetic data will far exceed the volume of real data and become the main force of AI data.

▲Image source: Gartner official

API security: The more open the model, the more important API security is

Anyone familiar with large models is familiar with APIs. From OpenAI to Anthropic, Cohere, and even Google's PaLM, the most powerful LLMs deliver their capabilities as APIs. Meanwhile, according to Gartner's research, more than 90% of attacks on web applications in 2022 came through APIs rather than human user interfaces.

Data flow is like water in a pipe: valuable only when it moves, and the API is the key valve. As APIs become the core link of software-to-software communication, the chance that they give rise to the next important company keeps growing.

The biggest API risk comes from excessive permissions. To keep an API running uninterrupted, programmers often grant it high privileges; once hackers compromise the API, they can use those privileges for other operations. This has become a serious problem: according to Akamai research, API attacks already account for 75% of all account-theft attacks worldwide.
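The standard remedy for over-permissioning is least privilege: each token carries only the scopes it needs, and every endpoint checks for its own scope. A minimal sketch (the token names and scope strings are hypothetical):

```python
# Hypothetical least-privilege check: tokens carry explicit scopes,
# and each endpoint verifies the one scope it requires instead of
# relying on a blanket high-privilege credential.
TOKEN_SCOPES = {
    "tok-reader": {"chat:read"},
    "tok-service": {"chat:read", "chat:write"},
}

def authorized(token: str, required_scope: str) -> bool:
    return required_scope in TOKEN_SCOPES.get(token, set())  # unknown token: deny
```

Under this design, stealing a read-only token no longer grants write or billing access, which caps the blast radius of a compromise.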

This is why many companies still buy the OpenAI service offered through Azure to access ChatGPT even though ChatGPT has an open API. Connecting directly through the API means supplying conversation data straight to OpenAI while facing hacker attacks at any time; by purchasing Azure's cloud resources instead, data can be kept on Azure's public cloud, safeguarding data security.

▲Picture: ChatGPT official website

Currently, API security tools are mainly divided into several categories: detection, protection and response, testing, discovery, and management. A few vendors claim to provide platform tools that fully cover the API security cycle, but today's most popular API security tools are still mainly concentrated in the three aspects of "protection", "testing", and "discovery":

  1. Protection: tools that shield the API from malicious requests, a bit like an API firewall.
  2. Testing: tools that dynamically access and evaluate specific APIs to find vulnerabilities and harden the code.
  3. Discovery: tools that scan enterprise environments to identify and discover the API assets that exist (or are exposed) within their networks.

At present, mainstream API security vendors are mostly foreign, but since the rise of large models, domestic startups have begun to move. Founded in 2018, Xinglan Technology is one of the few domestic full-chain API security vendors. Built on AI deep perception and adaptive machine learning, and starting from attack-defense capability, big-data analysis, and a cloud-native technology stack, it provides panoramic API identification, advanced API threat detection, complex behavior analysis, and other capabilities to build an API Runtime Protection system.

▲Xinglan Technology API security product architecture

Some traditional network security companies are also pivoting toward API security. Wangsu Technology, for example, previously focused mainly on IDC, CDN, and related products and services.

▲Image source: Wangsu Technology

SSE (Security Service Edge): The New Firewall

The importance of firewalls in the Internet era is self-evident; they are like the guardrails along a walkway thousands of feet in the air. Today the firewall concept has moved from the foreground to the background, embedded in hardware terminals and software operating systems. Simply put, SSE can be understood as a new kind of firewall: driven by visitor identity and built on a zero-trust model, it limits users to the resources they are permitted to access.

According to Gartner, SSE (Security Service Edge) is a set of cloud-centric integrated security functions that protect access to the Web, cloud services, and private applications. Functions include access control, threat protection, data security, security monitoring, and acceptable use control implemented through network-based and API-based integration.

SSE consists of three main parts, the secure web gateway, the cloud access security broker, and zero trust network access, each addressing a different risk:

  1. Secure web gateways connect employees safely to the public internet, such as websites they might use for research or cloud applications outside the company's official SaaS catalog;
  2. Cloud access security brokers connect employees to sanctioned SaaS applications such as Office 365 and Salesforce;
  3. Zero trust network access connects employees to private enterprise applications running in on-premises data centers or in the cloud.
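The zero-trust rule behind the third item can be sketched in a few lines: access is denied by default and granted only when the visitor's identity, device posture, and requested resource all match an explicit policy. The policy fields below are invented for illustration; real ZTNA products evaluate far richer context (location, time, risk score).

```python
# A toy zero-trust access decision: deny by default, allow only on an
# exact policy match. Roles and resource names are hypothetical.
POLICIES = [
    {"role": "finance",  "device_trusted": True, "resource": "erp"},
    {"role": "engineer", "device_trusted": True, "resource": "git"},
]

def authorize(role: str, device_trusted: bool, resource: str) -> bool:
    """Return True only when an explicit policy grants the access."""
    return any(
        p["role"] == role
        and p["device_trusted"] == device_trusted
        and p["resource"] == resource
        for p in POLICIES
    )

print(authorize("finance", True, "erp"))   # matching policy -> True
print(authorize("finance", False, "erp"))  # untrusted device -> False
print(authorize("intern", True, "erp"))    # no policy -> denied
```

Note the contrast with a traditional firewall: nothing inside the network is trusted implicitly; every request is re-evaluated against policy.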

However, different SSE vendors may focus on, or excel at, only one of these links. At present, the main capabilities integrated by overseas SSE offerings include the secure web gateway (SWG), zero trust network access (ZTNA), the cloud access security broker (CASB), and data loss prevention (DLP), but domestic cloud construction is still at an early stage and not as complete as in Europe and the United States.

▲Image source: Siyuan Business Consulting

Therefore, at the current stage, domestic SSE should integrate more traditional, localized capabilities, such as traffic-detection probes, web application protection, asset vulnerability scanning, and terminal management, which Chinese customers need more right now. From this perspective, SSE needs to deliver value such as low procurement cost, rapid deployment, security testing, and closed-loop operations through cloud-ground collaboration and cloud-native container capabilities.

This year, facing large models, industry leader Netskope was the first to turn to in-model security applications. Its security team uses automated tools to continuously monitor which applications (such as ChatGPT) enterprise users try to access, as well as how, when, where, and how often they access them. Security teams must understand the risk level each application poses to the organization and be able to refine access-control policies in real time based on classifications and security conditions that may change over time.

Simply put, Netskope warns users by identifying risks while they use ChatGPT, much like the warnings browsers show for risky web pages and download links. The approach is not innovative, even quite traditional, but it is the most effective way to check user behavior.

▲Image source: Netskope official website

Netskope attaches to the large model in the form of a security plug-in. In the demonstration, when an operator copies a piece of internal company financial data and asks ChatGPT to turn it into a table, a warning bar pops up to remind the user before the data is sent.

▲Image source: Netskope official website

In fact, identifying the risks hidden in large-model usage is much harder than spotting Trojans and vulnerabilities. Accuracy means the system monitors and blocks only the upload of sensitive data (including files and pasted clipboard text) to generative-AI applications, without blocking harmless queries and safe tasks done through chatbots. Identification therefore cannot be one-size-fits-all; it must be flexible, grounded in semantic understanding and reasonable standards.
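The simplest form of such a check can be sketched as a pattern scan run on text before it is sent to a chatbot, warning only on sensitive matches and letting harmless queries through. This is a deliberately naive sketch; the pattern names and rules are invented, and real products like Netskope go well beyond regexes to semantic classification.

```python
import re

# Toy DLP scan: flag sensitive patterns in outbound chatbot text.
# All patterns below are illustrative, not a vendor's real rule set.
SENSITIVE_PATTERNS = {
    "credit_card":  re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_secret":   re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{16,}\b"),
    "internal_tag": re.compile(r"\bCONFIDENTIAL\b", re.IGNORECASE),
}

def dlp_scan(text: str) -> list:
    """Return the names of sensitive patterns found in `text`."""
    return [name for name, pat in SENSITIVE_PATTERNS.items() if pat.search(text)]

print(dlp_scan("Summarize this CONFIDENTIAL Q3 revenue sheet"))  # flagged
print(dlp_scan("Explain how transformers work"))                 # [] -> allowed
```

The limitation is exactly the one the paragraph above describes: a regex cannot tell a harmless mention of "confidential" in a question from an actual leaked document, which is why flexible, semantics-based identification matters.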

Fraud and anti-fraud: digital watermarking and biometric authentication technologies

First of all, be clear: AI defrauding humans and humans using AI to defraud humans are two different things.

The main reason AI defrauds humans is that the large model has not been "educated" well. Nvidia's "guardrail" technology and OpenAI's unsupervised learning mentioned above are both methods for keeping the model healthy at the AI safety stage.

Preventing AI itself from defrauding humans is a task for the large-model companies, and it must basically keep pace with model training.

When humans use AI technology to commit fraud, the problem sits at the level of network security or social security. To be clear, technological countermeasures can only solve part of the problem; we still need regulation, legislation, and other means to curb crime.

At present, there are two technological countermeasures. One adds digital watermarks to AI-generated content on the production side to trace its source; the other performs more accurate recognition on the application side based on specific biometric features such as faces.

Digital watermarking embeds identification information into a digital carrier. By hiding specific digital codes or information in the carrier, it can confirm whether the carrier has been tampered with, providing an invisible protection mechanism for digital content.

OpenAI has previously said it is considering adding watermarks to ChatGPT output to reduce the harm of model abuse. Google said at this year's developer conference that every AI-generated image from the company will carry an embedded watermark that is invisible to the naked eye but that software such as the Google search engine can read and display as a label telling users the image was AI-generated; AI applications such as Shutterstock and Midjourney will also support this new marking method.

Beyond traditional digital watermarks, watermarks based on deep learning have also emerged: deep neural networks learn to embed marks that are strongly resistant to tampering and highly robust. This technique can embed high-strength, high-fault-tolerance watermarks without degrading the original image quality and can effectively resist image-processing attacks and steganalysis, making it one of the next major technical directions.
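To make the basic principle concrete, the classic (pre-deep-learning) approach can be sketched as hiding one watermark bit in the least significant bit of each pixel byte. This toy version is fragile, which is exactly why the robust, learning-based methods above matter; real systems spread the mark across frequency components so it survives compression and editing.

```python
# Toy LSB watermark: embed bits into the lowest bit of each pixel byte.
# Illustrative only; not robust against re-encoding or cropping.
def embed(pixels: bytes, bits: list) -> bytes:
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # overwrite the lowest bit
    return bytes(out)

def extract(pixels: bytes, n: int) -> list:
    """Read back the first n embedded bits."""
    return [pixels[i] & 1 for i in range(n)]

image   = bytes([200, 201, 202, 203, 204, 205, 206, 207])
mark    = [1, 0, 1, 1]                   # 4-bit watermark payload
stamped = embed(image, mark)
print(extract(stamped, 4))               # -> [1, 0, 1, 1]
# Each pixel value changes by at most 1, invisible to the naked eye.
print(max(abs(a - b) for a, b in zip(image, stamped)))  # -> 1
```

The invisibility property (a change of at most one intensity level per pixel) is what the article means by "cannot be recognized by the naked eye, but software can read it".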

On the application side, synthesized face videos are currently the most common "fraud method." Platforms that detect DeepFake (deep forgery) content are one of the solutions at this stage.

In early January this year, Intel released software called FakeCatcher, claiming it can detect whether a video is a deepfake with up to 96% accuracy.

According to reports, Intel's FakeCatcher identifies the subtle color changes in skin caused by blood circulating through the body. Blood-flow signals are collected from the face and translated by algorithms to distinguish real video from deepfakes: in a real person, blood circulates constantly and skin color shows periodic variation, while a deepfaked face does not.
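A back-of-the-envelope version of this idea: skin color sampled from a real face varies periodically with the pulse, while a synthesized face usually lacks that rhythm. Intel's actual photoplethysmography analysis is far more sophisticated; the sketch below only measures how much of a signal's energy sits in one dominant frequency, with the "pulse" and "forged" signals invented for illustration.

```python
import math
import random

def dominant_energy_ratio(signal: list) -> float:
    """Fraction of spectral energy in the strongest frequency bin (naive DFT)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]   # drop the DC component
    energies = []
    for k in range(1, n // 2):
        re = sum(c * math.cos(2 * math.pi * k * i / n) for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * i / n) for i, c in enumerate(centered))
        energies.append(re * re + im * im)
    total = sum(energies)
    return max(energies) / total if total else 0.0

# A clean 1 Hz "pulse" (60 bpm) sampled at 30 fps concentrates its energy...
pulse = [math.sin(2 * math.pi * 1.0 * t / 30) for t in range(90)]
# ...while a pseudo-random sequence (stand-in for a forged face) does not.
rng = random.Random(0)
noise = [rng.random() for _ in range(90)]
print(dominant_energy_ratio(pulse) > 0.9)   # True: strongly periodic
print(dominant_energy_ratio(noise) < 0.5)   # True: no dominant rhythm
```

A detector would threshold such a periodicity score (among many other features) to separate real faces from synthesized ones.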

▲Image source: Real AI official website

There is also a domestic startup working on similar technical principles; it identifies differences in representation between forged and real content and explores the consistency features that deepfaked content shares across different generation methods.

2. AI for security: new opportunities in a mature industry chain

Unlike security for AI, which is a relatively new industry opportunity, "AI for security" is more about transforming and reinforcing the existing security system.

Microsoft fired the first shot in AI for security. On March 29, shortly after providing the AI-powered Copilot assistant for the Office suite, Microsoft turned its attention to the security field and launched a GPT-4-based generative AI solution, Microsoft Security Copilot.

Microsoft Security Copilot sticks to the AI copilot concept. It introduces no new security solutions; rather, it fully automates the enterprise's existing security monitoring and incident handling through AI.

▲Image source: Microsoft official website

Judging from Microsoft's demonstration, Security Copilot can compress ransomware-incident handling that used to take hours or even dozens of hours down to seconds, greatly improving enterprise security efficiency. As Chang Kawaguchi, an AI security architect at Microsoft, once put it: "The number of attacks is increasing, but the power of the defense is scattered among a variety of tools and technologies. We believe Security Copilot can change how security operates and improve the real-world results of security tools and technologies."

Domestic security companies QiAnXin and Sangfor are also following up. This business is still in its infancy in China, and neither company has announced specific products, but reacting in time and keeping pace with the international giants is no small feat.

In April, Google Cloud launched Security AI Workbench at RSAC 2023, a scalable platform based on Google's security model Sec-PaLM. Enterprises can access various types of security plugins through Security AI Workbench to solve specific security issues.

▲Image source: Google official website

If Microsoft Security Copilot is a packaged private security assistant, Google's Security AI Workbench is a customizable, scalable AI security toolbox. In short, the big trend is that using AI to build automated security operations centers to counter rapidly changing network threats will become the norm.

Beyond the leading vendors, AI models are also reaching the capillaries of the security field: many domestic security companies have begun using AI to upgrade traditional security products.

For example, Sangfor proposed an "AI + cloud business" logic and launched AIOps intelligent operations technology, which collects desktop-cloud logs, traces, and metrics and runs algorithms such as fault prediction, anomaly detection, and association reasoning to provide users with intelligent analysis services.

Shanshi Technology (Hillstone Networks) builds AI into machine learning with both positive and negative feedback. Positive-feedback training learns behavior baselines to detect threats and anomalies earlier and more accurately and to reduce missed alerts; negative-feedback training performs behavior modeling, behavior clustering, behavior classification, and threat determination.

There are also companies such as ABUTON, which applies AI to analyzing security-operations pain points. Abroad, open-source security vendor Armo has released a ChatGPT integration aimed at building custom security controls for Kubernetes clusters through natural language, and cloud security vendor Orca Security has released its own ChatGPT extension that processes the security alerts its solution generates and gives users step-by-step remediation instructions for data breaches.

Of course, for a mature and enormous industry chain, the opportunities in AI for security go far beyond these; what we offer here is merely a brick thrown out to attract jade. The deeper, greater opportunities in security still have to be discovered in practice by the companies fighting on the front line.

More importantly, I hope these companies can stay down-to-earth and true to their original aspirations, turning grand ambitions into concrete steps rather than manufacturing concepts, chasing hype, or catering to capital and hot money and leaving a mess behind.

IV. Conclusion

It took roughly ten years after the birth of the Internet for the concept of network security and its industry chain to take shape.

Today, only six months after large models took off, large-model security and fraud prevention have become topics of everyday conversation. This is a defense mechanism built into "human consciousness" by the accelerating iteration of technology, and as the times develop it will trigger and feed back ever faster.

Today's chaos and panic are not frightening; they are the ladder to the next era. As "Sapiens: A Brief History of Humankind" puts it, human behavior is not always based on reason, and our decisions are often influenced by emotion and intuition. But this is the most important part of progress and development.

Author: Luo Jicheng Xin, Editor: Wen Bin

Source public account: Zixiangxian (ID: zixiangxian), between the squares, there is a quadrant. Care about science and technology, economy, humanities, and life.
