The process of optimizing your website so search engines can find, understand, and index your pages can feel difficult to grasp. But technical SEO doesn't need to be all that technical.
If you're a beginner, you're in the right place. This blog focuses on the basics so you can perform regular maintenance on your site and ensure that your pages can be discovered and indexed by search engines.
Let's get started with why technical SEO is important at its core.
Put simply, if search engines can't properly access, read, understand, or index your pages, then you won't rank, or even be found for that matter. So to avoid innocent mistakes like removing yourself from Google's index or diluting a page's backlinks, you need to understand the following five things:
1. Noindex meta tag
By adding this piece of code to your page, you're telling search engines not to add it to their index, and you probably don't want to do that. However, this actually happens more often than you might think. For example, let's say you hire Design Inc to create or redesign a website for you. During the development phase, they may build it on a subdomain of their own site, so it actually makes sense for them to noindex the site they're working on.
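The tag itself is a single line of HTML placed in the page's head. A standard noindex meta tag looks like this:

```html
<!-- Tells all search engine crawlers not to add this page to their index -->
<meta name="robots" content="noindex">
```

You can also target a single crawler by swapping "robots" for its user-agent name, such as "googlebot".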
Usually, they'll migrate it over to your domain after you approve the design, but they often forget to remove the noindex meta tag. The result: your pages are removed from Google's search index, or never make it in at all.
There are times when it actually makes sense to noindex certain pages. For instance, many blogs noindex their author archive pages because, from an SEO perspective, these pages provide very little value to search engines.
Nonetheless, from a user experience standpoint, it can be argued that those pages belong there. Some readers have favorite authors on a blog and want to read only their content.
Generally, for small sites, you won't need to worry about noindexing specific pages. Just keep an eye out for noindex tags on your pages, especially after a redesign.
2. Robots.txt
Robots.txt is a file that usually lives on your root domain, and you should be able to access it at yourdomain.com/robots.txt.
The file itself includes a set of rules for search engine crawlers and tells them where they can and cannot go on your site. It's important to note that a website can have multiple robots.txt files if you're using subdomains.
For example, if you have a blog on domain.com, then you'd have a robots.txt file for just the root domain. But you might also have an eCommerce store that lives on store.domain.com, so you could have a separate robots.txt file for your online store.
This means crawlers could be given two different sets of rules depending on the domain they're trying to crawl.
The rules are created using something called "directives." While you probably don't need to know what all of them are or what they do, there are two that you should know about from an indexing standpoint.
a. User-agent: This defines the crawler that the rule applies to, and its value is the name of that crawler.
For example, Google's user agent is named Googlebot.
b. Disallow: This is a page or directory on your domain that you don't want the user-agent to crawl.
For example, if you set the user agent to Googlebot and the disallow value to a slash, you're telling Google not to crawl any pages on your site.
Not good, right?
If you were to set the user-agent to an asterisk (*), the rule would apply to all crawlers. Therefore, a robots.txt file with a wildcard user-agent and a disallow value of a slash is telling all crawlers, 'please don't crawl any pages on my site.'
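Such a file needs only two directives, a wildcard user-agent and a disallow value of a slash:

```
User-agent: *
Disallow: /
```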
This might sound like something you would never use, but there are times when it makes sense to block certain parts of your site or to block certain crawlers. For instance, if you have a WordPress website and you don't want your wp-admin folder to be crawled, you can simply set the user-agent to an asterisk and the disallow value to /wp-admin/.
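If you ever want to double-check how a rule like this behaves before publishing it, Python's built-in robots.txt parser can evaluate a set of rules for you. A minimal sketch, using the wp-admin example (domain.com is just a placeholder):

```python
from urllib.robotparser import RobotFileParser

# The WordPress example: block every crawler from the wp-admin folder
rules = """
User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# /wp-admin/ is blocked for any crawler, including Googlebot
print(parser.can_fetch("Googlebot", "https://domain.com/wp-admin/options.php"))  # False

# Everything else remains crawlable
print(parser.can_fetch("Googlebot", "https://domain.com/blog/"))  # True
```

This is handy as a sanity check: a stray slash in the disallow value is the difference between blocking one folder and blocking the whole site.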
3. Sitemaps
Sitemaps are usually XML files, and they list the important URLs on your website.
These can be pages, images, videos, and other files, and sitemaps help search engines like Google crawl your site more intelligently. Creating an XML sitemap can be complicated if you don't know how to code, and it's almost impossible to maintain manually. However, if you're using a CMS like WordPress, plugins like Yoast and Rank Math will automatically generate sitemaps for you.
To help search engines find your sitemap, you can use the Sitemap directive in your robots.txt file and also submit it in Google Search Console.
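For reference, here's what a minimal hand-written sitemap listing a single page would look like (the domain is just a placeholder):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
  </url>
</urlset>
```

To point crawlers at it, you'd add a line like `Sitemap: https://yourdomain.com/sitemap.xml` to your robots.txt file.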
4. Redirects
A redirect takes visitors and bots from one URL to another. Its purpose is to consolidate signals.
Let's say you have two pages on your website about the best golf balls: an old one at domain.com/best-golf-balls-2018, and another at domain.com/best-golf-balls. Seeing as these are highly relevant to one another, it would make sense to redirect the 2018 version to the current version. By consolidating these pages, you're telling search engines to pass the signals from the redirected URL to the destination URL.
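How you set up the redirect depends on your server or CMS. As one sketch, assuming your site runs on an Apache server, you could add a 301 (permanent) redirect to your .htaccess file:

```
# .htaccess (Apache): permanently redirect the old URL to the current one
Redirect 301 /best-golf-balls-2018 /best-golf-balls
```

On WordPress, redirect plugins accomplish the same thing without touching server config.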
5. Canonical tags
A canonical tag is a snippet of HTML code. Its purpose is to tell search engines what the preferred URL is for a page. This helps solve duplicate content issues.
For example, say your website is accessible at both http://yourdomain.com and https://yourdomain.com, and for some reason you weren't able to use a redirect. These would be exact duplicates. By setting a canonical URL, you're telling search engines which version of the page is preferred.
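The tag goes in the head of the page. For the HTTP/HTTPS example, both versions would carry the same canonical tag pointing at the secure URL:

```html
<!-- Tells search engines the HTTPS version is the preferred URL -->
<link rel="canonical" href="https://yourdomain.com/">
```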
As a result, they'll pass signals, such as links, to the canonical URL so they're not diluted across two different pages. However, keep in mind that Google may choose to ignore your canonical tag.
Looking back at the previous example, if we set the canonical tag to the insecure HTTP page, Google would probably choose the secure HTTPS version instead.
If you're running a simple WordPress site, you shouldn't have to worry about this too much. CMS platforms are competent enough to handle a lot of these basic technical issues for you.
So these are some of the foundational things that are good to know when it comes to indexing, which is arguably the most important part of SEO.
Because again, if your pages aren't getting indexed, nothing else really matters.