How to stop Google from indexing unnecessary WordPress URLs

Sometimes, you don’t want Google to advertise one of your pages for the world to see. Learn how to stop Google’s crawler from indexing your WordPress site’s pages using Yoast SEO.

When Google indexes unnecessary URLs, it can hurt your site’s overall SEO performance. It’s also bad for business, since thin or unfinished pages showing up in search results can look unprofessional. As a result, you need a way to stop Google from indexing unnecessary WordPress URLs.

Search engines build their results from an “index” of the pages they discover. This is good in most cases, since it gives your WordPress site exposure. But sometimes they index URLs you’d rather keep out of search results. One of the most common causes is an out-of-the-box WordPress installation that hasn’t been customized for SEO.

Default WordPress installations generate a significant number of nearly empty template pages. Even so, these pages are crawlable and indexable by default. Having a ton of empty pages in Google’s index is not great for business or for your overall SEO.

In this article, we’ll explore what Google can crawl and index behind the scenes and learn how to prevent Google from indexing unwanted pages using Yoast SEO. This guide is valid for out-of-the-box installations and already-established websites alike.

How Google crawls and indexes your WordPress pages

“Crawling” and “indexing” are often used interchangeably. In reality, they are different but related terms.

Crawling is following the links within a page, and then following the links in those linked pages until there are no more links to follow. A “spider” is a software program designed to do this. Google’s spider is called Googlebot.

Indexing, on the other hand, is storing and organizing the information found on pages, whether they’ve been crawled or not. The indexed pages appear on Google’s search result pages.

You can prevent a page from being crawled by various means, but that alone doesn’t stop Google from discovering the URL through links on other pages and adding it to search results. To keep a page out of the results, you need to tell Google not to index it.

How to know what Google is indexing from your WordPress site

It’s not always immediately clear which specific pages from your WordPress site Google is indexing. You may need to check manually.

One of the simplest ways to see the URLs Google has indexed from your site is to go to Google.com and type this in the search bar:

site:your-domain

E.g.: site:wcanvas.com

The site: operator restricts search results to pages from that domain. Make sure there are no blank spaces between site: and the domain name.

If you run this test and the results contain links that you don’t want Googlebot to index, you need to block Googlebot from indexing those pages to improve SEO for your entire site.

Another option is to open your site’s sitemap at your-domain.com/sitemap.xml and manually follow the links. With Yoast SEO active, the sitemap index is typically served at your-domain.com/sitemap_index.xml.
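
If the sitemap is long, following links by hand gets tedious. As a rough sketch, the Python snippet below pulls every URL out of a sitemap’s XML and flags the ones that look like author or tag archives. It assumes the sitemap follows the standard sitemaps.org format. The sample XML, the function names, and the /author/ and /tag/ markers are illustrative assumptions, not something WordPress or Yoast SEO provides.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample; in practice you would paste or download your own sitemap XML.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/author/admin/</loc></url>
  <url><loc>https://example.com/tag/misc/</loc></url>
</urlset>"""


def extract_sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def flag_archive_urls(urls: list[str], markers: tuple[str, ...] = ("/author/", "/tag/")) -> list[str]:
    """Return the URLs whose path contains one of the (assumed) archive markers."""
    return [url for url in urls if any(marker in url for marker in markers)]


if __name__ == "__main__":
    for url in flag_archive_urls(extract_sitemap_urls(SAMPLE_SITEMAP)):
        print("Review:", url)
```

The flagged list is only a starting point for a manual review. Adjust the markers to match your own permalink structure.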

Sometimes you’ll find pages for authors, tags, and other content lacking proper web design. You don’t want these pages appearing on search results.

Now let’s dive into how to prevent them from appearing.

How to stop Google from indexing your WordPress pages using Yoast SEO

Now that you’ve identified the pages you don’t want Googlebot to index, let’s use the Yoast SEO plugin to hide these pages from search engines. Follow these steps:

Log in to your WordPress website to access your dashboard.

Access the sidebar and navigate to the post or page you want to exclude from Google results.

Once on the post or page, expand the “Advanced” section in the Yoast SEO settings. Look for the “Allow search engines to show this Post in search results?” option and change it to “No.” This adds a noindex directive to the page, which asks Googlebot and other web crawlers not to include it in their search results.

Publish or update the post to confirm the change.

To verify these changes are in effect, check your sitemap. Yoast SEO leaves noindexed pages out of its XML sitemap, so the page shouldn’t be listed there anymore.

Be aware that Google may take time to drop a page from its index, so if you still see it in search results (for example, with a site: query), it doesn’t necessarily mean the change didn’t work.

Alternatively, if you want to exclude multiple pages in bulk, you could consider using a filter. See the Yoast SEO documentation to learn more about bulk filtering.
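
For the per-page setting described above, another way to confirm the change is to look for a robots meta tag containing noindex in the page’s HTML source. The Python snippet below is a minimal sketch of that check using only the standard library. The class and function names and the sample HTML are hypothetical, and it assumes the directive is delivered through a `<meta name="robots">` tag rather than an HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag asks crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source for illustration only.
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'

if __name__ == "__main__":
    print(has_noindex(SAMPLE_PAGE))
```

In practice you would pass in the HTML of the live page, which you can copy from your browser’s “View page source” option.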

How to stop Google from indexing your entire site

There are two main methods for keeping Google away from your entire WordPress site: using the built-in option under Settings > Reading, or editing the robots.txt file manually. Let’s explore both.

Method #1: Configure the Reading Settings

The most straightforward way to prevent Google from indexing your site is to go to Settings > Reading. Once there, check the box that reads Discourage search engines from indexing this site.

The Settings > Reading interface in WordPress. The arrow points to a setting for preventing your site from appearing on search engine results

After checking the box, remember to click the Save Changes button.

Method #2: Edit the robots.txt File

robots.txt is a text file stored in your website’s root folder. You use it to give directives that tell search engine crawlers (such as Google’s Googlebot) which of your site’s resources they can access.

The robots.txt file allows you to specify which directories, subdirectories, URLs, or files you don’t want search engines to crawl. You can also use it to block crawlers from your entire site. Keep in mind that robots.txt controls crawling, not indexing: a blocked URL can still show up in search results if other pages link to it.

There are multiple ways to edit robots.txt, but we believe the simplest one is using Yoast SEO. Go to Yoast SEO > Tools and click on File Editor.

The Tools section in Yoast SEO. The arrow points to the File Editor feature

Once in the File Editor feature, you should see a textbox with the contents of the robots.txt file.

The File Editor feature in Yoast SEO, showing the contents of the robots.txt file

If you want all search engines to stay away from your site, include the following directives:

User-agent: *
Disallow: /

These directives tell all search engine crawlers (including Google’s) not to crawl any page on your site. After you edit the file, it should look something like this:

The File Editor feature in Yoast SEO, showing the contents of the robots.txt file
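
Before saving, you may want to double-check what a set of rules actually blocks. As a hedged sketch, Python’s standard `urllib.robotparser` module can evaluate robots.txt rules against a URL. The helper name, the example.com URLs, and the second rule set are assumptions for illustration. Note that the result reflects crawl permission only, not whether a page is indexed.

```python
from urllib.robotparser import RobotFileParser

# The rules from this section: block every crawler from the whole site.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# Hypothetical alternative that blocks a single directory only.
BLOCK_ADMIN_ONLY = """User-agent: *
Disallow: /wp-admin/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_crawl_allowed(BLOCK_ALL, "Googlebot", "https://example.com/about/"))
    print(is_crawl_allowed(BLOCK_ADMIN_ONLY, "Googlebot", "https://example.com/about/"))
```

Python’s parser is a simplified model of how crawlers read robots.txt, so treat the result as a sanity check rather than a guarantee of Googlebot’s behavior.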

What if You Don’t Have Yoast SEO?

If you don’t have Yoast SEO, the alternative is to use cPanel or FTP.

You can connect to your web server either with your FTP credentials (your hosting account should provide them) or through cPanel. You can log into cPanel from your hosting account’s dashboard or by going to your-domain-name.com/cpanel.

Regardless of the tool you use, navigate to your server’s public_html folder (sometimes named simply public).

The FileZilla interface. There are folders and individual files on both the local machine (left) and the remote server (right)

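Once in the public_html folder, look for the robots.txt file. Right-click on it and select View/Edit to edit the file. If there is no robots.txt file in the folder (WordPress serves a virtual one until a physical file exists), create a new file with that name.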

The FileZilla interface. The user right-clicked the robots.txt file and is selecting the View/Edit option from the resulting dropdown menu

Add the following directives to the robots.txt file to keep all crawlers, including Googlebot, off your WordPress site.

User-agent: *
Disallow: /

Why would you want to stop Google from indexing some of your WordPress pages?

As we’ve explored in this post, you may want to unindex some of your WordPress site’s pages from Google, even if temporarily. There are many reasons for this. These are some of the most common:

Your website is unfinished

When you’re still testing the website, you don’t want anyone but your team to have access to it. Using WordPress staging environments will keep the progress private.

It’s a restricted page

Restricted pages like invite-only or gated download pages for ebooks aimed at specific audiences should not appear on search results.

Duplicate test sites

Duplicate sites used for trials and testing of the production site should stay off search results.

Content duplicates

If you offer the same content to visitors in different forms, make sure not all of them are indexed. Duplicate content can split ranking signals between the copies and lead Google to show a version you didn’t intend, which hurts your overall SEO.

Content you’ll update later

If one of your posts is outdated but you plan to update it in the future, it may be better to unindex it until you get around to it.

Final thoughts

If you have a WordPress site and want to improve its SEO, you should take a few hours to dive deep into the pages Google is crawling and indexing to filter out the ones that hurt your overall SEO.

This action should be one part of an overarching strategy to boost your WordPress site’s SEO, not the only step.

We know it’s not great for business to have broken, gated, or empty pages advertised for the world to see, and you should take the time to unindex them. But on its own, it won’t make you a superstar in Google’s eyes unless it’s accompanied by other strategies.

Take this as one of the first steps of a long-term plan to power up your WordPress site’s SEO.

If you found this post useful, read our blog and resources for more insights and guides!