Find Dead Links on Your Jekyll Blog with HTML Proofer

| programming | quality | ruby |

Introduction

HTML Proofer is a handy Ruby tool that checks your statically generated HTML for inconsistencies. If you have a large static site, it is worth setting up, because auditing your pages by hand gets harder as the site grows. I have used HTML Proofer in the past, but for whatever reason I had "disable_external" set to true, which skipped all outgoing links. Even with that setting it is still useful for finding things like images with missing alt attributes and generally invalid HTML, but external link checking is what makes it a must for all blogs.
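To give a feel for what a link check involves, here is a minimal sketch that uses only the Ruby standard library. It is not how HTML Proofer works internally: the helper name broken_internal_links is made up for illustration, it only looks at root-relative links such as /about/, and it uses a regular expression where a real checker would parse the HTML properly.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper, not part of HTML Proofer: returns [page, href] pairs
# for root-relative links that do not resolve to a file in the built site.
def broken_internal_links(site_dir)
  broken = []
  Dir.glob(File.join(site_dir, "**", "*.html")).sort.each do |page|
    File.read(page).scan(/href="(\/[^"#?]*)/).flatten.each do |href|
      next if href.start_with?("//") # protocol-relative, so external

      target = File.join(site_dir, href)
      # Jekyll-style pretty URLs: /about/ is served from /about/index.html
      target = File.join(target, "index.html") if File.directory?(target)
      broken << [page, href] unless File.file?(target)
    end
  end
  broken
end

# Build a tiny throwaway "site" with one good link and one dead link.
Dir.mktmpdir do |site|
  FileUtils.mkdir_p(File.join(site, "about"))
  File.write(File.join(site, "about", "index.html"), "<p>About</p>")
  File.write(File.join(site, "index.html"),
             '<a href="/about/">ok</a> <a href="/missing/">dead</a>')

  broken_internal_links(site).each do |page, href|
    puts "#{File.basename(page)}: #{href} not found"
  end
end
```

Running this reports only /missing/ as not found. Doing the same job across hundreds of pages, plus external links and HTML validity, is exactly the tedium the tool takes off your hands.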

Configuration

Rather than clicking on every link on every page, let HTML Proofer do the heavy lifting for you with the following simple steps:

Install HTML Proofer

Add the following to your Gemfile
gem "html-proofer"
gem "rake"
Install all of your Gems
bundle install

Configure a Rake Task

Add the following to your Rakefile.
require 'html-proofer'

task :test do
  sh "bundle exec jekyll build"
  HTMLProofer.check_directory("./_site", { :allow_hash_href => true }).run
end

Run the Task

bundle exec rake test
This will show you any failures and allow you to act upon them. Some sample output looks like this:
*  External link https://levlaz.org/tag/lxc/ failed: 404 No error
- ./_site/projects/index.html
*  External link https://ezbadge.levlaz.org/ failed: 301 Peer certificate cannot be authenticated with given CA certificates
- ./_site/salting-your-lxc-container-fleet/index.html
*  image /images/minions.jpg does not have an alt attribute (line 150)
- ./_site/setting-up-antlr4-on-windows/index.html
*  image /images/antlr.png does not have an alt attribute (line 156)
*  image /images/grun.png does not have an alt attribute (line 163)
- ./_site/share-this-on-facebook/index.html
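
The alt attribute failures in that output are the easiest kind of check to picture. As a rough sketch only (the helper images_missing_alt is hypothetical and regex-based, whereas a real checker would use a proper HTML parser), the idea looks something like this:

```ruby
# Hypothetical helper, not HTML Proofer's implementation: returns the src of
# every <img> tag in an HTML string that has no alt attribute.
def images_missing_alt(html)
  html.scan(/<img\b[^>]*>/i)                     # every <img ...> tag
      .reject { |tag| tag =~ /\balt\s*=/i }      # keep tags without alt=
      .map { |tag| tag[/\bsrc\s*=\s*["']([^"']*)["']/i, 1] }
end

html = <<~HTML
  <img src="/images/minions.jpg">
  <img src="/images/logo.png" alt="Site logo">
HTML

images_missing_alt(html).each do |src|
  puts "image #{src} does not have an alt attribute"
end
```

Only the first image is reported, since the second one carries an alt attribute. The /images/logo.png entry is made up for the example.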

Configure CI

If you are using CircleCI, you can add the following to your circle.yml to run the proofer automatically.
test:
  override:
    - bundle exec rake test
The proofer returns an exit code of 1 upon failure, so this is a great way to enforce quality before deploying your site to production.
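
If the exit code part feels abstract, this small sketch shows the mechanism a CI service relies on, assuming the usual convention that status 0 means success and anything else fails the step. The helper ci_step_passed? is a made-up name, and the two child Ruby processes simply stand in for a passing and a failing rake test run.

```ruby
require "rbconfig"

# Hypothetical helper: runs a command and reports whether a CI service would
# treat the step as passed. Kernel#system returns true only on exit status 0.
def ci_step_passed?(*command)
  system(*command) == true
end

ruby = RbConfig.ruby # path to the Ruby interpreter running this script

puts ci_step_passed?(ruby, "-e", "exit 0") # stands in for a clean proofer run
puts ci_step_passed?(ruby, "-e", "exit 1") # stands in for a run with failures
```

This prints true and then false. The second case is the one that stops a deploy.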

Conclusion

There is nothing worse than clicking on a link and seeing a 404. When the post is from 2013, perhaps you can excuse it, but it is still a terrible experience for the user and as a "web master" you owe it to your users to prevent link rot.
