Technical SEO is, well, technical. Okay, sure, that’s a bit obvious, right? What isn’t always obvious, especially to laypersons, is how to address technical SEO concerns and how to fix problems when they occur.
If your site is struggling, but you aren’t exactly a virtuoso in the industry, hiring an expert to help you correct the problem is always best. But you don’t necessarily need to be an expert to correct every problem. In this post, I’ll outline some of the most common technical troubles and reveal why they pose a problem for websites in the first place. Then, I’ll tell you how to correct them so you can get back on track.
Verify if You’re Indexed
Maybe you changed your URL, or a developer accidentally slapped a noindex tag on your homepage instead of an internal page. For whatever reason, accidental de-indexing does happen, and when it does, webmasters usually see a sharp and sudden decline in traffic. Sometimes the issue really is this simple!
To identify whether this is your issue, use Google’s verification tool within Search Console (start here). Then, track down whatever’s preventing you from being indexed and correct it.
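If you’d like a quick sanity check outside of Search Console, the sketch below fetches your homepage and reports two common indexing blockers: a non-200 status code and an X-Robots-Tag header carrying a noindex directive. It’s a minimal example, not a substitute for Google’s tools; the URL is a placeholder and it assumes the widely used requests library is installed.

# Quick indexability sanity check -- not a substitute for Search Console.
# Assumes the requests library is installed; replace the URL with your own.
import requests

url = "https://www.example.com/"
response = requests.get(url, allow_redirects=True, timeout=10)

print(f"Final URL: {response.url}")
print(f"Status code: {response.status_code}")  # anything other than 200 deserves a closer look

# A noindex in the X-Robots-Tag header blocks indexing at the server level
x_robots = response.headers.get("X-Robots-Tag", "not set")
print(f"X-Robots-Tag: {x_robots}")
if "noindex" in x_robots.lower():
    print("Warning: this page is served with a noindex directive.")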
Check Your Robots.txt
Nearly all websites and web platforms have a Robots.txt file that sits in the top-level directory of your web server. This tiny text file tells crawlers what to do when they encounter specific directories or files. The standard contents look a little something like this:
User-agent: *
Disallow:
When formatted as above, this file tells crawlers they are free to crawl every page and resource in your site directory.
Unfortunately, this isn’t always ideal; crawlers only spend so much time on your site, and letting them wade through code-heavy structural elements and pages that are never shown to the public can cause them to miss other, more important content.
To fix this problem, developers alter the Robots.txt file to specify which elements should be ignored, making it look a little something like this:
User-agent: *
Disallow: /alternate/
Disallow: /mail/
The “disallowed” elements here are simply server directories that shouldn’t be crawled because they don’t contain any useful content.
The problem occurs when a hasty developer absent-mindedly leaves a Disallow rule with nothing after the forward slash. Listing only “Disallow: /” tells crawlers to stay out of your entire web directory, meaning you are effectively hiding your whole site from search engines.
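For comparison, the accidental site-wide block looks deceptively close to a healthy file:

User-agent: *
Disallow: /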
The fix for this is easy, of course: remove that pesky bare “Disallow: /” line, or finish it off so it lists only the directories you actually want blocked.
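To confirm how crawlers will actually interpret your file, Python’s built-in urllib.robotparser can test specific URLs against your live robots.txt. This is a minimal sketch with placeholder URLs:

# Test whether your live robots.txt allows key URLs to be crawled.
# A minimal sketch using only the standard library; URLs are placeholders.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")
parser.read()

# Check the pages you care about most against the rules Googlebot will see
for path in ["https://www.example.com/",
             "https://www.example.com/blog/",
             "https://www.example.com/mail/"]:
    allowed = parser.can_fetch("Googlebot", path)
    print(f"{path} -> {'allowed' if allowed else 'blocked'}")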
Review All Meta Tags
It isn’t enough to check your Robots.txt anymore; you should also check the source code (including the meta tags) of each individual page. Sometimes, web platforms (including certain WordPress plugins) will quietly insert a robots meta tag, such as “noindex” or “nofollow,” on individual pages, where a review of your Robots.txt alone will never catch it. These are often overlooked precisely because no one placed them there intentionally in the first place.
If an internal page carries a noindex tag, crawlers will drop it from their index; if it carries a nofollow tag, they won’t follow its links, so the internal pages it links to may never be discovered. If the tag placement happens to be only one or two pages deep, crawlers may miss up to 80 percent of your site when they crawl. Another symptom of this issue is a high bounce rate on a seemingly random internal page.
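To spot these stray tags at scale, you can scan your pages’ HTML for a robots meta tag. The sketch below is a rough example using Python’s standard HTML parser plus the requests library; the page list is a placeholder for your own sitemap export.

# Flag pages whose HTML contains a robots meta tag such as
# <meta name="robots" content="noindex, nofollow">.
# A rough sketch; assumes the requests library is installed and uses
# placeholder URLs in place of your real page list.
from html.parser import HTMLParser
import requests

class RobotsMetaFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.directives.append(attrs.get("content") or "")

pages = ["https://www.example.com/", "https://www.example.com/blog/"]

for url in pages:
    finder = RobotsMetaFinder()
    finder.feed(requests.get(url, timeout=10).text)
    for directive in finder.directives:
        if "noindex" in directive.lower() or "nofollow" in directive.lower():
            print(f"Check {url}: robots meta tag says '{directive}'")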
Track Rel=Canonical Tags
Rel=canonical exists to make life easier for web developers and webmasters. It tells crawlers that two pages, either on or off site, are identical, and identifies which of the two pages crawlers should consider when indexing your site. This prevents search engines like Google and Bing from labeling you as a plagiarizer, which can potentially get you de-indexed and sandboxed permanently.
Failing to use Rel=Canonical tags at all is just as much of an issue in technical SEO troubleshooting as failing to use them correctly. They need to be in place, on the right pages, pointing to the right URLs, in order to positively impact SEO.
Using analytics tools, search your site for instances of duplicate content. Then, repeat the search for off-site versions. Don’t forget to factor in syndication, multiple website versions (e.g., www.example.com versus example.com, or https:// vs. http://), and multiple web platforms (mobile vs. desktop versions). Then, use Rel=Canonical tags to point crawlers to the preferred version of each page.
One final note: be cautious about over-relying on Rel=Canonical for pages that differ slightly but aren’t quite the same. Moz talks about this concept here, but essentially, it’s debatable whether the practice is wise or harmful in the long run. If pages differ by anything more than one or two small details, they may be better off as standalone pages.
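A simple audit is to pull the rel=canonical tag from each page and flag anything missing or pointing somewhere unexpected. Here’s a minimal sketch with placeholder URLs, assuming the requests library is installed:

# Report each page's rel=canonical target so you can spot missing tags or
# canonicals pointing at the wrong version (http vs. https, www vs. non-www).
# A minimal sketch with placeholder URLs; not a full duplicate-content audit.
from html.parser import HTMLParser
import requests

class CanonicalFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Looking for <link rel="canonical" href="...">
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

for url in ["https://www.example.com/", "https://www.example.com/blog/"]:
    finder = CanonicalFinder()
    finder.feed(requests.get(url, timeout=10).text)
    if finder.canonical is None:
        print(f"{url}: no rel=canonical tag found")
    elif finder.canonical.rstrip("/") != url.rstrip("/"):
        print(f"{url}: canonical points elsewhere -> {finder.canonical}")
    else:
        print(f"{url}: self-referencing canonical (OK)")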
Repair Broken Backlinks
We’ve seen it a thousand times before: a site owner sits down and re-categorizes their entire WordPress site, shortening their URL structure. Or, a developer moves the site to a new host, which necessitates changing the site’s structure slightly. Suddenly, backlinks everywhere cry out as they break en masse.
Even minor changes to the structure of your site, including changing tags or categories, can have this effect. Often, developers don’t notice the problem until they see a sudden dive in traffic or SEO results a month or two later.
Unfortunately, the fix isn’t nearly as easy as in the previous sections. Fixing broken backlinks coming from sites you don’t own can be exhausting. First you have to identify them; then, it’s time for an outreach campaign to provide each linking site with the correct, updated URL. It is, however, a fantastic time to cultivate new relationships with people who are already paying attention!
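Once you’ve exported the target URLs from your backlink report (whatever tool you use), a quick status-code sweep shows which of them are actually broken and which already redirect. This is a minimal sketch with placeholder URLs, assuming the requests library is installed:

# Check which URLs from your backlink report still resolve.
# A minimal sketch; replace the list with URLs exported from your backlink tool.
import requests

backlink_targets = [
    "https://www.example.com/old-category/some-post/",
    "https://www.example.com/blog/renamed-post/",
]

for url in backlink_targets:
    try:
        response = requests.get(url, allow_redirects=True, timeout=10)
    except requests.RequestException as exc:
        print(f"{url}: request failed ({exc})")
        continue
    if response.status_code >= 400:
        print(f"{url}: broken ({response.status_code})")
    elif response.history:
        print(f"{url}: redirects to {response.url}")
    else:
        print(f"{url}: OK ({response.status_code})")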
Reformat PDFs and Text Images
Plagiarism is a real problem in the online sphere; that’s why website owners sometimes turn to image-based PDFs for text they fear will be stolen. Or, they post images of text articles and blogs instead. This is rarely wise; the benefits gained just plain do not outweigh the SEO value lost in the process.
The problem with image-based text is twofold: one, it frustrates readers, who may want to highlight or copy text to help them follow along, and two, search engines still aren’t great at understanding the text trapped inside an image. You lose SEO power, frustrate the people who find you, and water down any benefits gained from the content.
Instead, change how you think about your content. If it’s really so important that theft matters, put it behind an email signup page or reformat it for web consumption. It’s fine to continue using PDFs; just be sure they contain actual copyable text and not images so search engines can parse them. As for images of blogs and articles, skip them altogether.
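If you’re unsure whether an existing PDF is text-based or just a scanned image, a quick extraction test settles it. This is a minimal sketch assuming the pypdf library is installed; the file name is a hypothetical placeholder.

# Verify that a PDF contains real, extractable text rather than scanned images.
# A minimal sketch; assumes the pypdf library is installed and "whitepaper.pdf"
# stands in for your own file.
from pypdf import PdfReader

reader = PdfReader("whitepaper.pdf")
text = "".join(page.extract_text() or "" for page in reader.pages)

if text.strip():
    print(f"Extracted {len(text)} characters -- search engines can read this.")
else:
    print("No extractable text found; the PDF is probably image-based.")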
Improve Security with HTTPS
Still operating on plain HTTP? It’s time to step into the future. Google has announced a preference for secure sites with current SSL certificates, meaning HTTPS carries more SEO weight than HTTP alone. If you use secure site technology to keep your visitors safe, Google will reward you with a slight ranking boost.
One caveat: in certain rare situations, HTTPS can cause your site to become slow. That slowness is evidence of an underlying site problem in and of itself, and it shouldn’t stop you from switching to HTTPS. Instead, treat it as a sign that you need to review developer best practices, such as minifying resources for mobile devices.
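After the switch, it’s worth confirming that the old http:// address actually forwards visitors (and crawlers) to the secure version. Here’s a minimal sketch with a placeholder domain, again assuming the requests library:

# Confirm that the plain-HTTP version of your site redirects to HTTPS.
# A minimal sketch; replace the domain with your own.
import requests

response = requests.get("http://www.example.com/", allow_redirects=True, timeout=10)

print(f"Final URL: {response.url}")
if response.url.startswith("https://"):
    print("HTTP requests are redirected to HTTPS -- good.")
else:
    print("Warning: the site is still being served over plain HTTP.")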
Check for Negative SEO
If you’ve gone through all of these suggestions, yet you still seem to be having trouble, there’s a small and rare risk that you may be suffering from a negative SEO campaign. This is when a competitor (or someone who just plain doesn’t like you) engages in malicious SEO practices designed to make search engines think you’re manipulating results.
What does negative SEO look like? Common practices include listing you on link farms, backlinking to your content from “adult” websites, or plagiarizing your content on multiple other sites without your permission to make you appear to be engaging in content theft. You can’t control such activities, but you can disavow links here and report plagiarized content to Google.