Inside Google’s content review: some 10,000 people manually build the reference training sets for machine learning systems that evaluate billions of web pages


Tencent Technology News, September 26: Even at companies such as Google, there is unglamorous work to be done, such as content review. Vint Cerf, Internet pioneer and Google vice president and chief Internet evangelist, has frankly explained why Google’s systems cannot always distinguish good information from bad.
In June 2020, the British Parliament issued a policy report with many recommendations to help the government combat the “pandemic of misinformation” driven by Internet technology. The report’s conclusion is blunt: “Platforms like Facebook and Google try to hide behind ‘black box’ algorithms that choose what content to show to users. Their position is that they are not responsible for the harm that may result from online activity. This is plainly wrong!”
In preparing the report, Parliament took evidence from several key figures, including Cerf. He was asked: “Can you provide some evidence that the high-quality information you promote is more likely to be true?”
Cerf’s thought-provoking answer opened a crack in Google’s closed doors. He said: “The amount of information on the World Wide Web is enormous, with billions of web pages. We do not have the ability to evaluate all of that content manually, but we do have a team of about 10,000 people who evaluate websites. For search, we have a 168-page document on how to determine the quality of a website. Once we have web page samples assessed by those evaluators, we can use their work and the pages they evaluated to help build a machine learning neural network that reflects their assessments of page quality. Those pages become the training set for the machine learning system. The machine learning system is then applied to all the web pages we index on the World Wide Web. In practice, we use this information and other indicators to rank web search results.”
Cerf summed up: “This is a two-step process: first, establish a standard, high-quality training set through a manual process, and then build a machine learning system that scales to the portion of the World Wide Web we can index.” Many of Google’s blog posts and official statements on improving news quality come back to this team of 10,000 human content reviewers. A closer look at Cerf’s statement therefore helps to clarify what these people do and how their work affects the algorithm. Fortunately, an investigation published in November 2019 provides inside information about the work of Google’s content reviewers.
Although Google employees are well paid, the 10,000 content reviewers are contract workers who work from home and earn about $13.50 an hour. One reviewer disclosed that he was required to sign a confidentiality agreement, had no direct contact with anyone at Google, and was never told what his work would be used for. He also said that he “got hundreds of real search results and was told to rate them according to his own judgment and factors such as quality, reputation and usefulness.”
The main task of these content reviewers appears to be rating individual websites and evaluating the search rankings that Google returns. The work is carried out strictly according to the 168-page guidance document they are given. Sometimes the workers receive notices from Google, relayed through the staffing agency that employs them, specifying the “correct” results for certain searches. For example, the search phrase “the best way to commit suicide” once came up in the guidelines, and the contractors were sent a note stating that all suicide-related searches should show the National Suicide Prevention Lifeline as the top result.
This brief glimpse into the work of content reviewers helps us unpack Cerf’s testimony. Google employees (presumably senior ones) make far-reaching decisions about how the search algorithm should behave across various topics and situations. However, they do not try to implement these decisions directly in the code of the search algorithm. Instead, they encode them in the instruction manual sent to the reviewers.
The reviewers then manually score websites and search rankings according to this manual. Even with an army of 10,000 reviewers, however, there are far too many websites and searches to evaluate by hand. So, as Cerf explained, these manual evaluations serve as training data for supervised learning algorithms, whose job is essentially to extrapolate from them, in the hope that all searches, not just those that were manually evaluated, behave according to the intentions of Google’s leadership.
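The two-step process Cerf describes can be illustrated with a deliberately tiny sketch. Nothing here reflects Google’s actual system: the two features (`source_reputation`, `spam_signals`), the six hand-rated pages and the choice of logistic regression in place of a neural network are all assumptions made for illustration. The point is only the shape of the workflow: humans label a small sample, a model is fitted to those labels, and the model then scores pages that no human has rated.

```python
import math

# Step 1: a small hand-rated sample. Each page is described by two
# hypothetical features (source_reputation, spam_signals), both in [0, 1].
# The label is the human rater's verdict: 1 = high quality, 0 = low quality.
RATED_PAGES = [
    ((0.9, 0.1), 1),
    ((0.8, 0.2), 1),
    ((0.7, 0.1), 1),
    ((0.2, 0.9), 0),
    ((0.1, 0.7), 0),
    ((0.3, 0.8), 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rated_pages, epochs=2000, learning_rate=0.5):
    """Fit logistic regression to the rater labels, one example at a time."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in rated_pages:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def quality_score(model, features):
    """Return the model's estimate that a rater would call the page good."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Step 2: apply the trained model to pages that nobody rated by hand.
model = train(RATED_PAGES)
UNRATED_PAGES = {"page_a": (0.85, 0.15), "page_b": (0.15, 0.80)}
scores = {name: quality_score(model, f) for name, f in UNRATED_PAGES.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy setup the learned score would be only one signal; as Cerf notes, the real ranking combines it with other indicators.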
Although Google publicly announces only some notable updates to its search algorithm, it actually adjusts the algorithm very frequently. The 2019 investigation mentioned above found that Google changed its algorithm more than 3,200 times in 2018, and the pace has risen quickly: about 2,400 changes in 2017 and only about 500 in 2010.
Google has developed an extensive process for approving all of these adjustments, which includes having reviewers test them and report on their impact on search rankings. This lets Google anticipate how an adjustment will work in practice before releasing it to a large user base. For example, if an adjustment is designed to lower the ranking of fake news websites, reviewers can check whether that actually happens in the searches they try.
After answering the question that opened this article, Cerf was asked another important and rather pointed one: “Your algorithm takes in inaccurate information, and that information goes straight to the top of your search results and into the answers given by your voice assistant. This is disastrous, and things like that can cause riots. Obviously, 99% of what you do is unlikely to lead to such consequences, but how sensitive is your algorithm to such errors?”
Once again, Cerf’s frank answer is revealing. He said that neural networks are “brittle”, meaning that small changes in the input can sometimes produce surprisingly bad outputs.

Cerf said: “Your reaction to this is: how can this happen? The answer is that these systems do not recognize things the way we humans do. We abstract from images. We recognize that cats have small triangular ears, fur and a tail, and we are very sure that fire engines do not. However, the mechanical recognition in a machine learning system does not work the way our brains do. We know that these systems can be brittle, and you have just given a very good example of that brittleness. We are working to eliminate these problems or to identify where they may occur, but this is still an important area of research. Are we aware of their sensitivity and potential failure modes? Yes, we are. Do we know how to prevent all of these failure modes? No, not yet.”
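The brittleness Cerf describes can be shown in miniature. The sketch below is not a neural network and has nothing to do with Google’s models: it is a hand-built linear classifier with 100 made-up features and hand-picked weights, used only to show how many tiny input changes, each pushed in the direction that hurts the score most, can add up to a flipped decision. The labels “cat” and “fire engine” are borrowed from Cerf’s example.

```python
# Hypothetical linear classifier over 100 features. A positive score means
# "cat", anything else means "fire engine". Weights are hand-picked, not
# learned, purely for illustration.
NUM_FEATURES = 100
WEIGHTS = [1.0 if i % 2 == 0 else -1.0 for i in range(NUM_FEATURES)]
EPSILON = 0.05  # maximum change allowed per feature


def score(features):
    return sum(w * x for w, x in zip(WEIGHTS, features))


def classify(features):
    return "cat" if score(features) > 0 else "fire engine"


def perturb(features, epsilon):
    """Nudge every feature by epsilon against the sign of its weight."""
    return [x - epsilon * (1.0 if w > 0 else -1.0)
            for w, x in zip(WEIGHTS, features)]


# An input the classifier labels "cat" with a modest margin.
original = [0.52 if i % 2 == 0 else 0.48 for i in range(NUM_FEATURES)]
perturbed = perturb(original, EPSILON)

# Largest change made to any single feature.
largest_change = max(abs(a - b) for a, b in zip(original, perturbed))

print(classify(original), classify(perturbed), largest_change)
```

No single feature moves by more than 0.05, yet the label changes, because the small shifts all point the same way and accumulate across the 100 features. Real image classifiers are far more complex, but a similar accumulation effect is one commonly cited explanation for their sensitivity.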
In short, we trust Google’s algorithms to answer society’s questions, even though they sometimes incite hatred and spread false news, and we do not yet fully know how to prevent them from doing so. (Reviewed by Tencent Technology / Jinlu)