How to stop Google from indexing unnecessary WordPress URLs.

Development / 5 min read

Sometimes, you don’t want Google to advertise one of your pages for the world to see. Learn how to stop Google’s crawler from indexing your WordPress site’s pages using Yoast SEO.

When Google indexes unnecessary URLs, it can hurt your site’s overall SEO. It’s also bad for business, since the indexed pages can look unprofessional depending on their contents.

All search engines organize search results by “indexing” them. This is good in most cases; it gives your WordPress site exposure. But in others, they may index URLs you want to hide from search results. One of the most common reasons this happens is out-of-the-box WordPress installations that haven’t been customized for SEO.

Default WordPress installations include a significant number of empty templates. Even though they are empty, Google can crawl and index them by default. Having a ton of empty indexed pages is not great for business or your overall SEO.

In this article, we’ll explore what Google can crawl and index behind the scenes and learn how to prevent Google from indexing unwanted pages using Yoast SEO. This guide is valid for out-of-the-box installations and already-established websites alike.

How Google crawls and indexes your WordPress pages

“Crawling” and “indexing” are often used interchangeably. In reality, they are different but related terms.

Crawling is following the links within a page, and then following the links in those linked pages until there are no more links to follow. A “spider” is a software program designed to do this. Google’s spider is called Googlebot.
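
To make the idea of “following links until there are no more links to follow” concrete, here is a minimal Python sketch of a breadth-first crawl. It is not how Googlebot is actually implemented; the `SITE` dictionary is a made-up, in-memory three-level site used only for illustration, and the names `LinkExtractor` and `crawl` are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, start):
    """Breadth-first crawl over `pages`, a dict mapping URL path -> HTML."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        parser = LinkExtractor()
        parser.feed(pages.get(url, ""))
        for link in parser.links:
            # Only follow links to known pages we have not visited yet.
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical site used only for illustration.
SITE = {
    "/": '<a href="/blog/">Blog</a> <a href="/about/">About</a>',
    "/blog/": '<a href="/author/admin/">Admin</a> <a href="/">Home</a>',
    "/about/": "<p>No links here.</p>",
    "/author/admin/": "<p>Empty author archive.</p>",
}

print(crawl(SITE, "/"))
# ['/', '/blog/', '/about/', '/author/admin/']
```

Notice that the empty author archive is reached simply because another page links to it, which is how unwanted template pages end up being discovered.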

Indexing, on the other hand, is storing and organizing the information found on pages, whether they’ve been crawled or not. The indexed pages appear on Google’s search result pages.

You can prevent a page from being crawled by various means, but that doesn’t necessarily stop Google from capturing the link and adding it to search results.

How to know what Google is indexing from your WordPress site

It’s not always immediately clear which specific pages from your WordPress site Googlebot is indexing. You may need to manually check which pages are being indexed.

One of the simplest ways to check all the URLs Googlebot is indexing from your site is by going to Google.com and typing this in the search bar:

site:your-domain

E.g.: site:wcanvas.com

The “site:” operator restricts search results to pages from that domain. Don’t leave any blank spaces between “site:” and the domain name.
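
If you check several domains regularly, you could build the search link with a few lines of Python. This is only a convenience sketch; the function name `site_query_url` is hypothetical, and it simply strips stray spaces and any path before composing the query.

```python
from urllib.parse import quote_plus, urlparse


def site_query_url(domain_or_url):
    """Return a Google search URL for `site:<domain>` with no stray spaces."""
    text = domain_or_url.strip()
    # Accept either a bare domain or a full URL.
    host = urlparse(text).netloc if "://" in text else text.split("/")[0]
    return "https://www.google.com/search?q=" + quote_plus("site:" + host)


print(site_query_url(" https://wcanvas.com/blog/ "))
# https://www.google.com/search?q=site%3Awcanvas.com
```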

If you run this test and the results contain links that you don’t want Googlebot to index, you need to block Googlebot from indexing those pages to improve SEO for your entire site.

Another option is to visit your site’s sitemap at “your-domain.com/sitemap.xml” and follow the links manually.
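
For larger sites, following sitemap links by hand gets tedious. As a rough sketch, the Python snippet below lists every URL in a sitemap document and flags the ones that look like author or tag archives. The `SAMPLE` sitemap and the `/author/` and `/tag/` path checks are assumptions for illustration; your permalink structure may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]


# Hypothetical sitemap content used only for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/author/admin/</loc></url>
  <url><loc>https://example.com/tag/uncategorized/</loc></url>
</urlset>"""

unwanted = [u for u in sitemap_urls(SAMPLE) if "/author/" in u or "/tag/" in u]
print(unwanted)
# ['https://example.com/author/admin/', 'https://example.com/tag/uncategorized/']
```

To run this against a live site, you would download the sitemap first (for example with `urllib.request.urlopen`) and pass the response text to `sitemap_urls`.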

Sometimes you’ll find pages for authors, tags, and other content lacking proper web design. You don’t want these pages appearing on search results.

Now let’s dive into how to prevent them from appearing.

How to stop Google from indexing your WordPress pages using Yoast SEO

Now that you’ve identified the pages you don’t want Googlebot to index, let’s use the Yoast SEO plugin to hide these pages from search engines. Follow these steps:

  • Log in to your WordPress website to access your dashboard.

  • Access the sidebar and navigate to the post or page you want to exclude from Google results.

  • Once on the post or page, expand the “Advanced” section of the Yoast SEO settings. Look for the “Allow search engines to show this Post in search results?” option and change it to “No.” This tells Googlebot and other web crawlers not to index the page.

  • Publish or update the post to confirm the change.

To verify these changes are in effect, all you need to do is check your sitemap. The page shouldn’t be there anymore.
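
Besides the sitemap, you can also inspect the page’s HTML. The sketch below assumes that the “No” setting shows up as a robots meta tag containing “noindex” in the page’s head; check your own page source to confirm what your Yoast SEO version outputs. `SAMPLE_HEAD` is a made-up snippet, and `RobotsMetaParser` and `is_noindex` are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Assumed example of what a noindexed page's head might contain.
SAMPLE_HEAD = '<head><meta name="robots" content="noindex, follow"></head>'
print(is_noindex(SAMPLE_HEAD))  # True
```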

Be aware that pages may take time to drop out of Google’s index, so if you still see the page in search results, it doesn’t necessarily mean the changes didn’t work.

Alternatively, if you want to exclude multiple pages in bulk, you could consider using a filter. Explore the Yoast SEO documentation to learn more about bulk filtering.

Why would you want to stop Google from indexing some of your WordPress pages?

As we’ve explored in this post, you may want to unindex some of your WordPress site’s pages from Google, even if temporarily. There are many reasons for this. These are some of the most common:

Your website is unfinished

When you’re still testing the website, you don’t want anyone but your team to have access to it. Using WordPress staging environments will keep the progress private.

A second, maybe better option is to ask search engines to stay away from the whole site. Log in to your WordPress admin area, then go to Settings -> Reading. Scroll down until you locate the “Search engine visibility” option.

After that, simply check the “Discourage search engines from indexing this site” option and save your changes. Depending on your WordPress version, this either adds a disallow rule to the site’s robots.txt output or, since WordPress 5.3, adds a noindex robots meta tag to your pages. Either way, it is up to search engines to honor the request.
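
To see what a site-wide disallow rule in robots.txt means for a crawler, here is a minimal Python sketch using the standard library’s `urllib.robotparser`. The two rule sets and the example.com URLs are illustrative assumptions, not copied from a real WordPress install, and Python’s parser only approximates how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rule sets used only for illustration.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_ADMIN_ONLY = "User-agent: *\nDisallow: /wp-admin/\n"

print(can_crawl(BLOCK_ALL, "https://example.com/blog/"))             # False
print(can_crawl(BLOCK_ADMIN_ONLY, "https://example.com/blog/"))      # True
print(can_crawl(BLOCK_ADMIN_ONLY, "https://example.com/wp-admin/"))  # False
```

Keep in mind that a disallow rule only controls crawling. As noted earlier, a blocked URL can still show up in results if Google finds links to it.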

It’s a restricted page

Invite-only pages, gated download pages for e-books that require contact information, and other restricted pages aimed at specific audiences should not appear in search results.

Duplicate test sites

Copies of the production site used for trials and testing should stay out of search results.

Content duplicates

If you offer the same content to visitors in different forms, make sure not all of them are indexed, since duplicate content can dilute your overall SEO ranking.

Content you’ll update later

If one of your posts is outdated, but you plan to update it in the future, it may be better to unindex it until you get around to it.

Final thoughts

If you have a WordPress site and want to improve its SEO, you should take a few hours to dive deep into the pages Google is crawling and indexing to filter out the ones that hurt your overall SEO.

This action should be part of an overarching strategy to boost your WordPress site’s SEO, not the only one.

We know it’s not great for business to have broken, gated, or empty pages advertised for the world to see, and you should take the time to unindex them. But on its own, it won’t make you a superstar in Google’s eyes unless it’s accompanied by other strategies.

Take this as one of the first steps of a long-term plan to power up your WordPress site’s SEO.