For Day 3 I continued working on my old_posts Python script. My favorite part of 100 Days of Code is that I am taking the time to actually think through some of these problems, read documentation, and try to learn something.
I learned a ton about Python lists thanks to this wonderful Google developer guide. Specifically (after writing Python for about 5 years) I learned about list.extend() for the very first time. It came in handy in this particular use case because I was running some very inefficient for loops to append to a list one item at a time, when it was more efficient to extend since it requires fewer operations.
The key difference is that append adds a single item to the end of a list, whereas extend adds every item of another list to the end, merging the two lists. This is particularly handy when you want to grab JSON from several requests and merge the results into a single JSON array for further processing, which is what I am doing in this script.
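To make the difference concrete, here is a minimal sketch. The pages list stands in for the parsed JSON of several responses; the post titles and both function names are made up for illustration and are not taken from the actual script.

```python
def merge_with_append(pages):
    """Merge pages item by item, the way my old nested for loops did."""
    merged = []
    for page in pages:
        for post in page:
            merged.append(post)  # one call per post
    return merged


def merge_with_extend(pages):
    """Merge pages with one extend call per page."""
    merged = []
    for page in pages:
        merged.extend(page)  # adds every post in the page at once
    return merged


# Hypothetical data: each inner list is one page of posts.
pages = [
    [{"title": "First post"}, {"title": "Second post"}],
    [{"title": "Third post"}],
]

# Appending a whole page adds the list itself as a single nested item.
nested = []
nested.append(pages[0])

print(len(merge_with_extend(pages)))  # number of posts across all pages
print(len(nested))                    # number of items after one append
```

Both functions produce the same flat list; the extend version simply needs one method call per page instead of one per post.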
Using Requests HEAD
I also explored more of the requests library and made an optimization that looks really silly in hindsight.
In the script I was making a single request just to grab the headers and see the total number of pages. Instead of using requests.head, which returns only a tiny payload of headers, I was using requests.get, which returns the headers along with the entire JSON payload. That payload was immediately thrown away since I did not use the response body in later parts of the function.
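A rough sketch of the HEAD version, assuming the page count comes from the X-WP-TotalPages header that the WordPress REST API sends on collection responses. The function name, the per_page default, and the optional session argument are my own choices for illustration, not necessarily what the script uses.

```python
import requests


def get_total_pages(posts_url, per_page=100, session=None):
    """Return the number of result pages using a HEAD request.

    HEAD returns the same headers as GET but no response body, so the
    JSON payload is never downloaded just to be thrown away.
    """
    http = session or requests  # a requests.Session also works here
    response = http.head(posts_url, params={"per_page": per_page}, timeout=10)
    response.raise_for_status()
    # Assumption: the server reports the page count in X-WP-TotalPages.
    return int(response.headers["X-WP-TotalPages"])
```

Calling it would look something like get_total_pages("https://example.com/wp-json/wp/v2/posts"), where the URL is a placeholder for the real site.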
Exploring the WordPress API Filters
I also explored more of the WordPress API and started to use some API-level filters to reduce the payload I was receiving, in an effort to cut the overall time the script takes to run. Specifically I am now using context=embed, which removes the text body (since I only need the title and the link), and before=(today - 1 year + 1 day), since I only care about posts that were written more than a year ago today.
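Here is one way the two filters could be built as query parameters. This is a sketch under my own assumptions: the function name is hypothetical, the before value is formatted as an ISO 8601 timestamp at midnight, and February 29 falls back to February 28 when stepping back a year.

```python
from datetime import date, datetime, time, timedelta


def build_old_posts_params(today=None):
    """Build query parameters for posts written more than a year ago."""
    today = today or date.today()
    try:
        one_year_ago = today.replace(year=today.year - 1)
    except ValueError:
        # today is Feb 29 and last year had no such day; use Feb 28.
        one_year_ago = today.replace(year=today.year - 1, day=28)
    # before = today - 1 year + 1 day
    cutoff = one_year_ago + timedelta(days=1)
    return {
        "context": "embed",  # trimmed response without the full text body
        "before": datetime.combine(cutoff, time.min).isoformat(),
    }


print(build_old_posts_params(date(2024, 3, 15)))
# {'context': 'embed', 'before': '2023-03-16T00:00:00'}
```

The resulting dictionary can be passed as the params argument of a requests call so the filtering happens on the server instead of in the script.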
JSON is Not SQL
I’ve been thinking about my very first foray into any sort of programming years ago. I primarily worked with Microsoft SQL Server and learned how to write efficient queries. I was thinking of how easy this problem would have been to solve if I had direct access to the database. The lesson here, which is still taking me a while to fully wrap my head around, is that JSON is not a SQL database. You have to think about it differently. If an API offers the ability to do some filtering, you should take advantage of it when you can.