Tuesday, May 31, 2022

4 technical SEO issues auditing tools won’t show you

Throughout the history of SEO, people have debated the pros and cons of relying on technical SEO tools. Relying on the hints from auditing tools isn’t the same thing as a true SEO strategy, but we’d be nowhere without them. It’s just not feasible to manually check dozens of issues page by page.

To the benefit of the SEO industry, many new auditing tools have been created in the past decade, and a few of them stand strong as industry leaders. These few technical auditing tools have done us a great service by continuing to improve their capabilities, which has helped us better serve our clients, bosses and other stakeholders.

However, even the best auditing tools cannot find four important technical SEO issues that could potentially damage your SEO efforts:

  1. Canonical to redirect loop
  2. Hacked pages
  3. Identifying JS Links
  4. Content hidden by JS

Why tools won’t show these

Some of these issues could be detected by tools, but they’re not common enough to be on the tool makers’ radar. Other issues would be impossible for tools to detect.

As with many cases in SEO, some issues may affect sites differently, and it all depends on the context. That’s why most tools won’t highlight these in summary reports.

Required tools to uncover these issues

Before we dive into the specific issues, there are two things you need in order to find them.

Your web crawling tool of choice

Even though most tools won’t uncover these issues by default, in most cases, we can make some modifications to help us detect them at scale.

Some tools that you could use include:

  • Screaming Frog
  • Sitebulb
  • OnCrawl
  • DeepCrawl

The most important things we need from these tools are the ability to:

  • Crawl the entire website, sitemaps and URL lists
  • Run custom search/extraction features

Google Search Console

This should be a given, but if you don’t have access, make sure you acquire Google Search Console access for your technical SEO audits. You will need to tap into a few historic reports to uncover potential issues.

Issue 1: Canonical to redirect loop

A canonical to redirect loop is when a webpage has a canonical tag pointing to a different URL that then redirects to the first URL. 

This can be a rare issue, but it’s one that I’ve seen cause serious damage to a large brand’s traffic. 

Why this matters

Canonicals provide the preferred URL for Google to index and rank. When Google discovers a canonical URL different from the current page, it may start to crawl the current page less frequently.

This means that Google will start to crawl the canonical URL more frequently. But that URL 301 redirects back to the original page, which sends a type of loop signal to Googlebot.

While Google allows you to make a redirected page the canonical, having it loop back to the previous page is a confusing signal.

I’ve seen this happen to some large brands. One recently came to me asking me to investigate why one of their key pages wasn’t driving the traffic they were hoping for. They had invested a lot of money into SEO and had a well-optimized page. But this one issue stuck out like a sore thumb.

How to detect canonical redirect loops

Even though this issue will not appear in any default summary reports in standard auditing tools, it’s quite easy to find. 

  • Run a standard crawl with your preferred technical SEO auditing tool. Make sure to crawl sitemaps as well as a standard spider crawl.
  • Go to your canonical report and export all of the canonicalized URLs. Not the URLs the tool crawled, but the URLs found in the canonical tags.
  • Run a new crawl with that URL list and look at the response codes report. Every canonical URL should return a 200 status code. Any that returns a 3xx redirect needs a closer look.
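
If you would rather script the comparison than eyeball two exports, the steps above can be sketched in Python. This is a minimal sketch, not a feature of any crawler: the function name and the two dictionaries (`canonicals` and `redirects`) are assumptions standing in for whatever your tool exports.

```python
from urllib.parse import urldefrag


def normalize(url):
    """Drop any #fragment so otherwise identical URLs compare equal."""
    return urldefrag(url).url


def find_canonical_redirect_loops(canonicals, redirects, max_hops=10):
    """Return (page, canonical) pairs where the canonical redirects back to the page.

    canonicals: {crawled page URL: URL found in its canonical tag}
    redirects:  {URL: redirect target} for every URL that answered with a 3xx
    Both inputs are hypothetical shapes for data taken from crawler exports.
    """
    redirect_map = {normalize(src): normalize(dst) for src, dst in redirects.items()}
    loops = []
    for page, canonical in canonicals.items():
        start, current = normalize(page), normalize(canonical)
        if current == start:
            continue  # self-referencing canonical, nothing to check
        # Follow the redirect chain that starts at the canonical URL.
        for _ in range(max_hops):
            if current not in redirect_map:
                break  # chain ended on a URL that does not redirect
            current = redirect_map[current]
            if current == start:
                loops.append((page, canonical))
                break
    return loops
```

Here `canonicals` would come from the canonical report of the first crawl and `redirects` from the response codes report of the second crawl. The `max_hops` cap is only there so an unrelated redirect chain can’t keep the check running forever.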

Issue 2: Hacked pages

Hacking websites for profit is not a new topic. Most seasoned SEOs have come across websites that have been hacked somehow, with the hackers conducting malicious activities to either cause harm or generate profit for another website.

Some common types of website hacking that affect SEO include:

  • Site search manipulation: This occurs when a website’s search pages are indexable. A malicious person then sends a ton of backlinks to the site’s search results pages with irrelevant searches. This is common with gambling and pharma search terms.
  • 301 redirect manipulation: This happens when someone gains access to the site, creates pages relevant to their business and gets those indexed. Then they 301 redirect them to their own websites.
  • Site takedowns: This is the most straightforward attack: a hacker manipulates your code to make your website unusable or at least non-indexable.

There are dozens of types of site hacking that can affect SEO, but what’s important is that you maintain proper site security and conduct daily backups of your website.

Why this matters

The most important reason that hacking is bad for your website is that if Google detects that your website might have malware or is conducting social engineering, you could receive a manual action. 

How to detect hacked pages

Luckily, there are many tools out there that not only mitigate hacking threats and attempts but also detect if your website gets hacked.

However, most of those tools only look for malware. Many hackers are good at covering their tracks, but there are ways to see if a website has been hacked in the past for financial gain.

Use Google Search Console

  • Check the manual actions report. This will tell you if there are any current penalties against the site.
  • Check the performance report. Look for any big spikes in performance. This can indicate when a change may have happened. Most importantly, check the URL list in the performance report. Hacked URLs can stick out! Many of them have irrelevant topics or may even be written in a different language.
  • Check the coverage report. Look for any big changes in each sub-report here.
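
On a large site, the URL list in the performance report can be too long to scan by eye. Below is a rough Python sketch for pre-filtering an exported list of URLs. The spam terms are purely illustrative assumptions, and the non-ASCII check will also flag legitimate pages on sites that publish in non-Latin scripts, so treat the output as a shortlist for manual review, not a verdict.

```python
from urllib.parse import unquote, urlsplit

# Illustrative terms only; tune this list for the site being audited.
SPAM_TERMS = ("casino", "viagra", "payday", "poker")


def flag_suspicious_urls(urls, spam_terms=SPAM_TERMS):
    """Return (url, reason) pairs for URLs worth a manual look."""
    flagged = []
    for url in urls:
        parts = urlsplit(url)
        # Decode percent-encoding so encoded non-Latin text becomes visible.
        text = unquote(f"{parts.path}?{parts.query}").lower()
        hits = [term for term in spam_terms if term in text]
        if hits:
            flagged.append((url, "spam term: " + ", ".join(hits)))
        elif any(ord(ch) > 127 for ch in text):
            flagged.append((url, "non-ASCII characters in path or query"))
    return flagged
```

Feed it the page URLs exported from the performance report, then check each flagged URL by hand.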

Check website login accounts

  • Take a look at all users to find any unusual accounts.
  • If your website has an activity log, check for recent activity.
  • Make sure all accounts have 2FA enabled. 

Use online scanning tools

Several tools will scan your website for malware, but that may not tell you if your website has been hacked in the past. A more thorough option would be to look at https://haveibeenpwned.com/ and scan all website admin email addresses. 

This website will tell you if those emails have been exposed in data breaches. Too many people use the same passwords for everything. It’s common for large organizations to use weak passwords, which can leave your website vulnerable.

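Issue 3: Identifying JS links

Google has been clear that it only reliably follows links that are <a> elements with an href attribute. Internal links that exist only as JavaScript click events are not followed or crawled.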

By now, you’d think our SEO auditing tools would be better at detecting internal links generated by JavaScript. Historically, we’ve had to rely on manually discovering JS links by clicking through websites or looking at link depths on reports.

Why this matters

Googlebot does not crawl JavaScript links on web pages, so any page that can only be reached through one may never be discovered.

How to detect JS links

While most SEO auditing tools can’t detect JavaScript links by default, we can make some slight configurations to help us out. Most common technical SEO auditing tools can provide us with custom search tools.

Unfortunately, browsers don’t really display the original code in the DOM, so we can’t just search for “onclick” or anything simple like that. But there are a few common types of code that we can search for. Just make sure to manually verify that these actually are JS links.

  • <button>: Most developers use the button tag to trigger JS events. Don’t assume all buttons are JS links, but identifying these could help narrow down the issue.
  • data-source: This pulls in a file to use the code to execute an action. It’s commonly used within the JS link and can help narrow down the issues.
  • .js: Much like the data-source attribute, some HTML tags will pull in an external JavaScript file to find directions to execute an action.
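
If your crawler lets you export page HTML, the three patterns above can also be checked with a short script. The sketch below uses only Python’s standard library. The class and function names are my own, and the last rule (anchors with no href) is an extra assumption that isn’t in the list above. As with the custom search, every hit is only a candidate and still needs manual verification.

```python
from html.parser import HTMLParser


class JsLinkFinder(HTMLParser):
    """Collect elements that match the patterns worth checking for JS links."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <button disabled>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if tag == "button":
            self.candidates.append((tag, "button element"))
        elif "data-source" in attr:
            self.candidates.append((tag, "data-source attribute"))
        elif tag != "script" and any(v.lower().endswith(".js") for v in attr.values()):
            self.candidates.append((tag, "references a .js file"))
        elif tag == "a" and not attr.get("href", "").strip():
            # Added assumption, not from the list above: an anchor
            # with no href cannot be followed as a normal link.
            self.candidates.append((tag, "anchor without href"))


def find_js_link_candidates(html):
    """Return a list of (tag, reason) pairs found in one page's HTML."""
    finder = JsLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.candidates
```

Regular script tags are skipped on purpose, since nearly every page loads .js files that way and they would drown out the useful hits.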

Issue 4: Content hidden by JavaScript

This is one of the most unfortunate issues websites fall victim to. They have so much fantastic content to share, but they want to consolidate it to display only when a user interacts with it. 

In general, it’s best practice to marry good content with good UX, but not if SEO suffers. There’s usually a workaround for issues like this. 

Why this matters

Google doesn’t actually click on anything on webpages. So if the content is hidden behind a user action and not present in the DOM, then Google won’t discover it. 

How to find content hidden by JavaScript

This can be a bit trickier and requires a lot more manual review. As with any technical audit generated from a tool, you need to manually verify every issue found, including anything the tips below turn up.

To verify, all you need to do is check the DOM on the webpage and see if you can find any of the hidden content.

To find hidden content at scale:

  • Run a new crawl with custom search: Use the techniques I discussed in finding JS links. 
  • Check word counts at scale: Look through all pages with low word counts. See if each count checks out or if the webpage looks like it should have a larger word count.
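
Most crawlers report word counts already, but if you only have saved HTML to work with, the word count check can be approximated in Python. This is a rough sketch under a few assumptions: `pages` is a hypothetical dictionary of URL to HTML, the 300-word threshold is arbitrary, and the count simply ignores script and style blocks instead of rendering the page.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Count words in text nodes, ignoring script, style and similar blocks."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.words = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def count_words(html):
    counter = VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.words


def low_word_count_pages(pages, threshold=300):
    """pages: {url: html}. Return [(url, words)] below threshold, lowest first."""
    counts = ((url, count_words(html)) for url, html in pages.items())
    return sorted((item for item in counts if item[1] < threshold), key=lambda item: item[1])
```

A page that shows up here but looks content-rich in the browser is a good candidate for content that only loads after a user action.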

Growing beyond the tools

With experience, we learn to use tools for what they are: tools.

Tools are not meant to drive our strategy but instead to help us find issues at scale. 

As you discover more uncommon issues like these, add them to your audit list and look for them in your future audits.

The post 4 technical SEO issues auditing tools won’t show you appeared first on Search Engine Land.
