The Purported Google Search Algorithm Leak Makes This One Point Abundantly Clear (And It's Not What You Think)

The recent purported leak of Google's Search algorithm has provided a rare glimpse into the intricate operations behind the world's most widely used search engine. This leak, which allegedly includes over 2,500 pages of API documentation, suggests that Google's public guidance for creating "people-first content" may not align with its actual ranking practices.

Google has long emphasized the importance of content that demonstrates experience, expertise, authoritativeness, and trustworthiness (E-E-A-T), encouraging website owners to focus on creating content for users, not search engines. However, the leaked documents suggest that Google's ranking system relies on a variety of metrics that the company has publicly downplayed or denied, such as "siteAuthority," Chrome data, and click metrics.

The leaked documents also indicate that factors like freshness, links, branding, and change history play significant roles in Google's ranking system. Furthermore, content can be demoted for reasons such as links not matching the target site or the presence of inappropriate content.

This leak underscores the discrepancy between Google's public statements and its actual practices, raising questions about the transparency of its operations. It also highlights the importance of understanding Google's ranking system for website owners and content creators, given the high stakes in terms of traffic and revenue.

However, the complexity of the leaked documents means that their interpretation may lead to incomplete or incorrect conclusions. As the SEO and content industry continues to analyze these documents, we may gain further insights into the inner workings of Google Search.

In the meantime, this leak serves as a reminder that Google's Search algorithm remains a closely guarded secret, and the company's public statements should be viewed with a degree of skepticism.

Google’s Search Algorithm Purportedly Leaked

According to reports, SparkToro has obtained over 2,500 pages of API documentation from Google's internal "Content API Warehouse." These documents were allegedly leaked on GitHub in March 2024 but were later removed. However, versions v0.4.0 and v0.5.0 of google_api_content_warehouse can still be found on Hexdocs. It is important to note that we at Android Authority cannot verify the authenticity of these leaked documents, so reader discretion is advised.

The leaked documentation appears to offer a rare glimpse into the inner workings of Google Search's algorithm, without directly revealing the weight assigned to different characteristics of websites or their content. Instead, it provides insights into the data Google collects from websites and web pages.

What Happens Behind the Scenes When People Run a Google Search?

A Google Search query may seem like an innocent and inconsequential action to a consumer like you, but it oils the wheels of a multi-billion dollar industry. So, to understand the gravity of the leak, it helps to understand what happens when you run a Google Search.

The basics: Search engines, web crawling, web indexing, and ranking search results

When users have questions they want answered on the internet, they turn to a website called a “search engine.” They enter a query, and the search engine returns results that hopefully answer their question. Simple, right?

Behind the scenes, the search engine does a lot of work, but it can be broken down into three main tasks:

  1. Crawling: Search engines, like Google, must constantly explore the vast expanse of the internet to discover and catalog websites and their content. This is known as crawling.

  2. Indexing: Once the search engine has visited a webpage, it analyzes the data and content, storing it in a format that allows for easy retrieval. This process is known as indexing.

  3. Ranking: With potentially hundreds, if not thousands, of websites vying to answer a single query, a system is needed to determine which websites are presented to the user first. This is the ranking system, which decides the order in which websites appear on the search engine results page (SERP).

This complex process is the engine that powers Google Search, making it the world's most widely used search engine.
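To make the three tasks concrete, here is a deliberately tiny sketch in Python. It is not how Google works: the in-memory `TOY_WEB`, the `*.example` URLs, and the word-count scoring are all assumptions made up for illustration, and a real ranking system weighs far more signals than term frequency.

```python
from collections import Counter, defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("google search leak explained", ["b.example", "c.example"]),
    "b.example": ("search ranking and search clicks", ["a.example"]),
    "c.example": ("cooking recipes for pasta", []),
    "d.example": ("orphan page about search", []),  # nothing links here
}

def crawl(seed, web):
    """Task 1: follow links breadth-first from a seed URL to discover pages."""
    seen, queue = {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        for link in web[url][1]:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

def build_index(urls, web):
    """Task 2: build an inverted index mapping each word to per-page counts."""
    index = defaultdict(Counter)
    for url in urls:
        for word in web[url][0].lower().split():
            index[word][url] += 1
    return index

def rank(query, index):
    """Task 3: order pages by how often they contain the query words."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(index.get(word, {}))
    # Highest score first; ties broken alphabetically by URL.
    return [url for url, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

discovered = crawl("a.example", TOY_WEB)
index = build_index(discovered, TOY_WEB)
results = rank("search ranking", index)
print(results)
```

In this toy setup, the page that no other page links to is never discovered by the crawler, so it can never be indexed or ranked, no matter how relevant its text is.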

Why Does Search Ranking Matter in the First Place?

Google Search, commonly known simply as Google, is the world's largest search engine and handles a significant portion of the internet's search traffic. The sheer volume of daily search queries from users around the globe illustrates the immense power search engines hold in directing internet traffic. Often likened to the traffic signals of the internet, they can significantly boost a business's online presence when used well.

Securing the top spot on a popular Search Engine Results Page (SERP) can lead to a substantial increase in business revenue. Users tend to click on the first result, with traffic significantly decreasing for lower-ranked positions.

Consider the last time you clicked on the second or subsequent results on a Google Search. Likely, the first result didn't meet your needs, prompting you to refine your search query. This behavior is common, with many users not exploring beyond the first page of results.

Google has even removed pagination in favor of continuous scrolling for Search, further emphasizing the importance of the first page. Users rarely venture beyond the initial set of answers, either finding what they need or refining their search.

Google’s Secret Sauce Recipe Revealed?

The pressure to optimize for Google Search is immense, given the potential for significant traffic and revenue. Understanding Google's ranking system, or the Google Search algorithm, could enable websites to consistently rank highly, driving substantial views and revenue.

However, this knowledge could also lead to widespread manipulation of search results, negatively impacting the end-user experience. Despite this, Google has traditionally been the go-to tool for finding new online information.

To guide content creation, Google publishes its public "recipe" in the form of content guidelines. The core advice is to create "people-first content," focusing on user experience rather than search engine optimization. This content should demonstrate experience, expertise, authoritativeness, and trustworthiness, known as E-E-A-T.

Google encourages content creators to focus on end-users, leaving the ranking process to the Search algorithm. By following these guidelines, content is more likely to be recognized as high-quality and ranked accordingly. While not the direct secret sauce, these guidelines offer the best strategy for optimizing content for Google Search.

The Problem: What Google Says Publicly Doesn’t Match What the Search Giant Does Privately

In recent years, many website owners have lamented the decline in their traffic despite adhering to Google's best practices for creating user-centric content, as outlined in the E-E-A-T guidelines. Google officials have publicly commented on the appropriate strategies and practices for website owners.

However, the purported leak of Google's Search algorithm has raised questions about the alignment between Google's public guidelines and its actual ranking practices. The leaked information appears to contradict the advice given by Google officials and the content guidelines provided by the company. This discrepancy has left many website owners and SEO professionals questioning the accuracy and relevance of Google's public guidance.

Still, there is no shortage of reasons to believe this purported leak is not only genuine but current. For instance, Google has long maintained that it does not use “overall domain authority” for ranking SERPs, but the documents cite a characteristic called “siteAuthority.” The same holds true for Chrome browser data. Google says it does not use this information for ranking, yet the leaked docs include a few Chrome-related measurement attributes.

Then, there are clicks. Google's official search spokespeople have repeatedly denied using clicks directly in SERP rankings. However, the document dump contains evidence to the contrary (not to mention plenty of outside evidence). If this data is indeed accurate, then clicks most definitely count.

Finally, there are the new-website sandbox and authors. Google denies the existence of a search engine sandbox and says author bylines are strictly for readers, not SERP rankings. But the documents include an attribute called “hostAge” that is used specifically to “sandbox fresh spam in serving time.” Additionally, the documents indicate that Google collects author data on pages, though it may not be a ranking metric.

Last but not least, there are other concerns, including how much freshness matters, link weight, the power of branding, change history, and content demotion.

The Purported Google Search Algorithm Leak Makes This One Point Abundantly Clear (And It's Not What You Think)

Now, it’s well-known that Google uses various technologies, including NLP (Natural Language Processing), NLU (Natural Language Understanding), LSI (Latent Semantic Indexing), and plenty more. So, there is a lot that goes on to crawl, index, and rank content across the web. But, if you take a 30,000-foot view, it’s all an attempt to quantify what people value most.

Because Google uses such technological tools, the most notable experts in the SEO community rely on tools of their own to perform better on the SERP, whether by leveraging keyword gaps, long-tail key phrases, or other techniques that manipulate or exploit search engine signals. In other words, SEOs have built, used, and continue to use and update all sorts of complex systems to help their clients rank higher. Now, does that sound like they’re creating content for people or to game the search engines?

And if anyone thinks that Google isn't intimately familiar with these case-specific apps, they are sorely mistaken. Regardless, the company continues to repeat the same guidance - write for people, not search engines.

So, these SEO industry experts and their purpose-built tools are not creating content solely for search engines, but they are not necessarily creating it for people, either. Or, to put it more accurately, they are creating content that puts search engines first and people second.

Since Google has issued warnings about the leaked documents, and many respected SEOs have cautioned about their use cases (some saying these are not actual ranking signals but merely API compatible), it’s wise to take a step back and approach the matter judiciously.

To this end, it’s best not to miss the point Google has made all along: web properties should create and publish high-quality content that puts people first and search engines second. After all, consumers are the ones who need and seek out information and have buying power, while search engines do not. The lesson here is that you should be writing with people in mind, not creating optimized content that caters to search engines. Google can always roll out or change its ranking metrics, but people will always be looking for reliable and unique content.



Owen E. Richason IV

Owen has written for several publications and websites in the US, Canada, and Australia including the Houston Chronicle, San Francisco Gate, AOL, BAM Magazine, and regional outlets. He is also a fiction author and a musician.

https://www.oer4.com