Website Representation Vectors

Google's Website Representation Vectors patent application, filed in August 2018, describes a system that uses neural networks to classify websites based on their content. This classification could help Google refine search results by identifying authoritative, expert-driven sites within particular knowledge domains such as health, finance, and artificial intelligence.
Key Takeaways from the Patent
1. Classification of Websites
Google classifies websites based on their content and assigns them to specific knowledge domains. The classification includes:
- Expert Websites: Written by professionals (e.g., doctors in the medical domain).
- Apprentice Websites: Created by those in training (e.g., medical students).
- Layperson Websites: Authored by non-experts but containing relevant information.
This structure mirrors Google's E-A-T (Expertise, Authoritativeness, and Trustworthiness) principles.
2. Quality Scores and Ranking
The patent indicates that websites are ranked based on quality scores, but does not define these scores explicitly. However, Google has previous patents covering quality assessments for ranking purposes.
Search result ranking factors:
- Information Retrieval (IR) scores
- Authority scores
Example: A medical query like "What are the symptoms of mononucleosis?" is best answered by a medical domain website rather than a general blog.
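As an illustration of how an IR score and an authority score could jointly order results, here is a minimal sketch. The patent does not say how these scores are combined, so the weighted sum, the `authority_weight` parameter, the score ranges, and the example URLs are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    ir_score: float         # hypothetical relevance score in [0, 1]
    authority_score: float  # hypothetical authority score in [0, 1]


def rank(candidates, authority_weight=0.5):
    """Sort candidates by a weighted sum of IR and authority scores.

    The weighted sum is an assumption for illustration only; it is not
    taken from the patent.
    """
    def combined(c):
        return (1 - authority_weight) * c.ir_score + authority_weight * c.authority_score

    return sorted(candidates, key=combined, reverse=True)


results = rank([
    Candidate("general-blog.example", ir_score=0.80, authority_score=0.30),
    Candidate("medical-site.example", ir_score=0.75, authority_score=0.90),
])
print([c.url for c in results])
```

In this toy data the general blog matches the query text slightly better, but the medical site ranks first once authority is weighed in, which is the behavior the mononucleosis example describes.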
3. Website Representation Vectors & Neural Networks
Google uses neural networks to generate composite representations (vectors) of websites. These vectors help classify websites into categories such as:
- Expert-Level Websites
- Apprentice-Level Websites
- Layperson Websites
This process allows Google to refine search results based on relevance and authority while reducing computational resource usage.
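To make the idea of a composite representation concrete, the sketch below pools several page-level vectors into one site-level vector. Simple averaging stands in for the neural network here; that substitution, the function name `composite_vector`, and the three-dimensional toy vectors are assumptions for illustration, not details from the patent.

```python
def composite_vector(page_vectors):
    """Average page-level vectors into one site-level vector (mean pooling).

    Mean pooling is a stand-in for the neural network described in the
    patent, used only to show what a composite vector looks like.
    """
    if not page_vectors:
        raise ValueError("need at least one page vector")
    dim = len(page_vectors[0])
    if any(len(v) != dim for v in page_vectors):
        raise ValueError("all page vectors must have the same length")
    return [sum(v[i] for v in page_vectors) / len(page_vectors) for i in range(dim)]


# Hypothetical 3-dimensional page embeddings for one website.
pages = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
site_vector = composite_vector(pages)
print(site_vector)  # [2.0, 1.0, 1.0]
```

Whatever model produces it, the useful property is that a whole site collapses into one fixed-length vector that can be compared or classified cheaply.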
4. How the Search Engine Uses These Classifications
- Queries from users trigger search processes within specific knowledge domains.
- Google selects only relevant and authoritative sites for search results.
- Limiting searches to classified sites reduces the processing power required.
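The steps above can be sketched as a filter that runs before any scoring: only sites already classified into the query's knowledge domain, at an allowed expertise level, are searched at all. The `Site` record, the `candidate_sites` function, the level names, and the toy index are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    url: str
    domain: str  # knowledge domain, e.g. "health"
    level: str   # "expert", "apprentice", or "layperson"


def candidate_sites(index, query_domain, allowed_levels=("expert",)):
    """Keep only sites classified in the query's domain at an allowed level."""
    return [
        site for site in index
        if site.domain == query_domain and site.level in allowed_levels
    ]


# Hypothetical pre-classified index.
INDEX = [
    Site("clinic.example", "health", "expert"),
    Site("med-student-notes.example", "health", "apprentice"),
    Site("bank-advice.example", "finance", "expert"),
    Site("hobby-blog.example", "health", "layperson"),
]

candidates = candidate_sites(INDEX, "health", allowed_levels=("expert", "apprentice"))
print([site.url for site in candidates])
```

Because the filter shrinks the candidate set first, the more expensive relevance scoring only has to run on two of the four sites in this toy index.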
5. Data Considered for Classification
The classification system uses multiple factors, including:
- Text Content: Words and phrases used on the website.
- Images: Visual data contributing to classification.
- Links: External and internal linking structure.
- Labels: Identifying site type (e.g., nonprofit, commercial, medical, etc.).
- Similarity Measures: Comparing new sites with known classified sites.
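The similarity-measure factor can be illustrated with cosine similarity between a new site's vector and the vectors of sites whose classification is already known. The choice of cosine similarity, the `threshold` value, and the sample vectors are assumptions for this sketch; the patent text summarized here does not specify them.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(new_vector, known_sites, threshold=0.8):
    """Return the label of the most similar known site, or None if no
    known site reaches the (assumed) similarity threshold."""
    best_label, best_score = None, threshold
    for label, vector in known_sites:
        score = cosine_similarity(new_vector, vector)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical vectors for sites with known classifications.
known = [
    ("expert", [0.9, 0.1, 0.0]),
    ("layperson", [0.1, 0.9, 0.2]),
]
print(classify([0.8, 0.2, 0.0], known))  # expert
print(classify([0.0, 0.0, 1.0], known))  # None
```

Returning `None` below the threshold leaves a dissimilar site unclassified instead of forcing it into the nearest category.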
Benefits of Website Representation Vectors
- More Relevant Search Results: Prioritizes high-quality, authoritative content.
- Reduced Computational Resources: Google avoids scanning irrelevant websites.
- Improved Classification System: Enhances the accuracy of the results returned for search queries.
- Better SEO Strategy: Sites aligning with E-A-T principles may gain an advantage.
Conclusion
The Website Representation Vectors patent enhances Google's ability to classify, rank, and retrieve websites based on expertise and authority. This approach aligns with the principles outlined in Google's Quality Rater Guidelines, ensuring that authoritative content ranks higher in search results.
For SEO experts, understanding this classification system is crucial in optimizing websites for higher visibility and better rankings.