How to leverage cosine similarity for ecommerce SEO

2024-11-08 00:00:06

Traditional SEO tactics alone aren’t enough to keep ecommerce sites competitive in today’s AI-driven search landscape. 

To improve search visibility and connect with relevant queries, ecommerce brands can leverage cosine similarity – a mathematical concept that helps search engines understand content relationships. 

By using cosine similarity, you can enhance your site’s content relevance, making it easier for Google to recognize and rank your pages accurately. 

This article will explain cosine similarity, how it works in modern search algorithms and practical ways to apply it to boost your ecommerce SEO strategy.

First, let’s dive into two key concepts: embeddings and cosine similarity.

What are embeddings? 

Embeddings are critical for large language models (LLMs) and modern search. When a search engine or an LLM reads your content, it needs a scalable way to analyze it. 

So what do they do?

They use embeddings to vectorize the content, translating it into a list of numbers (a vector) that can be compared mathematically.

This is what Google’s BERT model does. Given the content from your site, it creates an embedding, which is a numerical representation of that content. 

These embeddings are then stored in a vector database. Since they’re stored as numerical representations, they can be “plotted out” as points within the database.

This concept is essential to understanding cosine similarity.

What is cosine similarity?

After these concepts are translated into numerical values and stored, models can perform calculations to determine the “distance” or similarity between any two points. 

Cosine similarity is one method used to measure how closely related these points are.

Simply put, concepts that have high cosine similarity are understood to be more related to each other. Concepts with lower similarity are less related. 

So “SEO” and “PPC” would exhibit higher cosine similarity than “shark” and “PPC.” 
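To make the calculation concrete, here is a minimal Python sketch of cosine similarity: the dot product of two vectors divided by the product of their lengths. The three-number vectors below are invented for illustration only. Real embeddings come from a model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" (made-up numbers, not model output).
seo = [0.9, 0.8, 0.1]
ppc = [0.8, 0.9, 0.2]
shark = [0.1, 0.0, 0.9]

seo_vs_ppc = cosine_similarity(seo, ppc)
shark_vs_ppc = cosine_similarity(shark, ppc)
print(f"SEO vs PPC:   {seo_vs_ppc:.3f}")
print(f"shark vs PPC: {shark_vs_ppc:.3f}")
```

With these toy values, the “SEO” and “PPC” vectors point in nearly the same direction, so their score is close to 1, while “shark” points elsewhere and scores much lower.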

This is how Google can numerically identify whether two concepts are related or if a page is optimized for the target. 

There’s a laundry list of evidence that Google uses this concept in its own algorithm. Google’s Pandu Nayak wrote the following in a Stanford course on information retrieval: 

  • “As a consequence, we can use the cosine similarity between the query vector and a document vector as a measure of the score of the document for that query.” 

In layman’s terms, they can use cosine similarity to understand how relevant a piece of content is to a given query. 
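The idea in that quote can be sketched in a few lines of Python: score each document by the cosine similarity between its vector and the query vector, then sort. This is only an illustration of the scoring idea, not Google’s implementation. The URLs and vectors are hypothetical, made-up values standing in for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

# Hypothetical query vector, e.g. for "men's hiking boots" (invented numbers).
query_vector = [0.9, 0.1, 0.3]

# Hypothetical document vectors keyed by made-up URLs.
document_vectors = {
    "/mens-hiking-boots": [0.8, 0.2, 0.3],
    "/womens-sandals": [0.3, 0.9, 0.2],
    "/camping-stoves": [0.1, 0.2, 0.9],
}

# Score every document against the query, then rank from highest to lowest.
scores = {
    url: cosine_similarity(query_vector, vector)
    for url, vector in document_vectors.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)

for url in ranked:
    print(f"{scores[url]:.3f}  {url}")
```

In this toy setup, the page whose vector points in the same direction as the query rises to the top of the list.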

The Google Search API leaks contain numerous references to embeddings, with over 100 mentions of the concept throughout the documents.

Analyzing cosine similarity on sites

Understanding cosine similarity conceptually is useful, but how can you apply it to your own site? 

The good news is that Google’s BERT model is open-source, allowing you to use it to analyze your site’s content.

This means you can use Google’s own tools to test and measure how relevant your content is to target queries.

This blog post from Go Fish Digital (disclosure: I serve as the agency’s VP of marketing) shares Python code you can use to access BERT and test the relevance of your content.

We’ve also built an extension that creates embeddings for an entire page. 

The extension extracts your content, runs it through Vertex AI and BERT, and gives you actual scoring of your content for all the sections of a page.

The extension also gives you an overall Page Similarity score, which averages all of the embeddings on a given page into a single score from 0 to 10. (The extension is currently in beta, but you can request access.)

Even without these tools, you can still incorporate the concept of cosine similarity into your ecommerce optimization. 

Some general concepts that help improve cosine similarity evaluations include: 

  • Using target terminology on the page.
  • Placing content with strong similarity higher on the page. 
  • Using related terminology of the core topic.
  • Reducing and removing content that isn’t about the topic of the page. 
  • Ensuring core headings are optimized for similarity.
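The effect of these concepts can be illustrated with a rough Python sketch. It assumes a simple word-count vector as a stand-in for a real embedding, which is a big simplification: an embedding model also captures meaning, not just matching words. The sample copy and the “dry dog food” query are hypothetical.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Count lowercase word occurrences (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)  # Counter returns 0 for missing terms
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

query = term_vector("dry dog food")

# Copy that stays on topic and repeats the target terminology.
focused = term_vector(
    "Shop dry dog food. Our dry dog food is made for adult dogs."
)

# Copy that mentions the topic once, then drifts to unrelated content.
diluted = term_vector(
    "Shop dry dog food. Sign up for our newsletter and read our store "
    "history, careers page and holiday shipping policy."
)

focused_score = cosine_similarity(query, focused)
diluted_score = cosine_similarity(query, diluted)
print(f"focused: {focused_score:.2f}, diluted: {diluted_score:.2f}")
```

The focused copy scores higher because more of its words align with the query, while the off-topic sentences pull the diluted copy’s vector away from it.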

Applying cosine similarity to ecommerce sites 

With this knowledge, we can better understand the factors that drive high-performing ecommerce sites. 

Sites that optimize for cosine similarity at scale are more likely to perform better in search.

But how do these high-performing sites naturally incorporate cosine similarity?

Let’s explore some examples using our similarity score extension.

1. Product naming convention optimization 

Optimizing product description pages (PDPs) optimizes your product listing pages (PLPs).

This means that optimizing your product pages for a specific query also enhances the relevance of your category pages. 

The products listed on your category pages naturally adopt the same queries and terminology as the parent category. 

For example, REI’s use of “Men’s Hiking Boots” in their product naming conventions also helps optimize the parent category page.

Understanding the concept of cosine similarity, we can now see why this helps improve SEO at scale. 

When we run our similarity score extension on this page, we can see that REI’s product grids have strong matches against the parent category.

Dig deeper: Product page SEO: A complete guide

2. SEO text on category pages

A best practice for ecommerce sites is to include search-optimized text at the bottom of category pages. 

Typically, this consists of 3-5 paragraphs of content placed beneath the product listings, providing additional information about the category as a whole.

I recently conducted a LinkedIn poll asking if this type of text at the bottom of category pages is beneficial, with 82% of respondents confirming that it is.

When we view this initiative through the lens of cosine similarity, it becomes clearer why it’s effective. It helps ecommerce sites significantly improve the content relevance of their category pages. 

When done correctly, you can see how well this content aligns with target keywords. For example, on Chewy’s Dry Dog Food page, most of their sections score 7.0 or higher in similarity.

Dig deeper: How to make your ecommerce content more helpful

3. Related categories

Another effective ecommerce SEO strategy I’ve long advocated is using internal links at the bottom of category pages to cross-link to other relevant categories. These are often labeled as “Related categories” or “Related searches.”

For example, REI includes “Related searches” at the bottom of its category pages, which helps to strengthen the relevance and connectivity of its content.

Wayfair includes both options on its site.

Let’s see how these features impact the content’s cosine similarity against the core query. For REI, we can see that each item strongly impacts content optimization. 

Not only are they helping the core category page, but they’re also establishing a strong internal linking system to other semantically related pages. 

Dig deeper: Retailers: Google is becoming your new category page

4. Product reviews

It may be surprising, but I believe product reviews are an underutilized asset for SEO on ecommerce sites. 

Often, I see sites allow only 5-10 reviews to be indexed, then use JavaScript to prevent further indexation.

However, when reviews are relevant, they can be a powerful tool for leveraging cosine similarity at scale. 

If reviewers consistently mention the product’s name or category, it helps bring your product closer to the root query, improving its relevance.

In this way, relevant reviews can enhance a page’s similarity score and overall SEO performance.

Now that you understand the role of cosine similarity and its impact on search, you can apply these principles to optimize your ecommerce site and content structure. 

The most significant improvements will come from scaling your efforts to enhance similarity across your site.

Dig deeper: Ecommerce content: How to demonstrate beneficial purpose and expertise