Terminal Reader Mode with Pandoc and Less

The other day Aosheng sent me an article to read from The Verge. When I tried to read it, the page took about 5 minutes to load because of the 15 or so JavaScript things that were running, in addition to ads loading in the background. Firefox was unhappy, and even when I turned on “Reader View” (which strips out all of the junk), it took another minute to load.

I’ve been on a UNIX binge lately, so I figured there had to be a clever hack to make my own reader view in a terminal.

This is where pandoc comes to the rescue. I’ve written about it in the past when discussing how to easily convert Markdown to PDF. It turns out that pandoc also accepts URLs as arguments, which means that you can convert HTML pages on the fly without having to download them first.

This means that we can take an arbitrary URL, pass it into pandoc, and spit out plain text. Furthermore, we can pipe this into less to get a nice pager for longer documents. The full string is shown below:

pandoc -f html -t plain $URL | less

In the example above, -f specifies the input format (here, HTML) and -t specifies the output format (here, plain text). Pandoc supports a ton of different formats; you can read the man page for more info.
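Since -t only names the output format, swapping it changes what comes out the other end. Here is a minimal sketch, assuming a hypothetical helper called render_as (my name for it, not something pandoc ships) that takes the output format as its first argument and the URL as its second:

```shell
# render_as is a hypothetical helper name, not part of pandoc.
# Usage: render_as FORMAT URL
render_as() {
    # $1 is the pandoc output format (e.g. plain or markdown),
    # $2 is the page to fetch; quoting avoids word splitting and globbing.
    pandoc -f html -t "$1" "$2"
}
```

Calling render_as markdown "$URL" | less would page through a Markdown rendering of the page instead of plain text.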

The image below shows the output in the terminal.

Terminal Reader Mode

The next logical step is to wrap this in a script, like my WordPress mutt poster, to make it even easier. You could make a simple program called reader, put it in /usr/local/bin/reader, and make it executable. The contents of this script are:

#!/bin/sh
# Terminal Reader Mode using Pandoc and Less

url="$1"

pandoc -f html -t plain "$url" | less

You can then use this by typing reader $URL.
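If you would rather not hand pandoc an empty argument by accident, the same idea can be written as a shell function with a small usage check. This is only a sketch: putting it in a function (say, in your shell startup file) and the wording of the usage message are my assumptions, not part of the original script.

```shell
# Sketch of a slightly more defensive reader, written as a shell function.
reader() {
    # Require exactly one argument: the URL to read.
    if [ "$#" -ne 1 ]; then
        echo "usage: reader URL" >&2
        return 1
    fi
    # Quote the URL so it reaches pandoc as a single, unmodified argument.
    pandoc -f html -t plain "$1" | less
}
```

Remember to quote the URL when you call it, e.g. reader "$URL", since characters like & and ? mean something to the shell.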

If you made it this far, you should probably follow me on twitter. 🙂
