56 Books, My Literary Journey in 2018

According to Goodreads, I read 56 books in 2018. This was 4 over my goal of 52 books! I spent a lot of the year reading books from my never-ending list of travel books, but I also took some time to appreciate the classics, award winners, and a random selection of history and business books from Prime Reading.

I think my absolute favorite book of the year was The Dark Forest (#2 in the Remembrance of Earth’s Past series). I read the entire series this year and gave each book 5 stars. I’m a huge sci-fi fan, and this is hands down the best series I’ve ever read.

I’m pretty conservative with my Goodreads ratings.

1 – why did I ever read this

2 – “ok”, if you’re a fan of this topic

3 – good book

4 – great book

5 – read this before you die

A couple other books that I really enjoyed last year were:

  • Giants of Enterprise – a book about business tycoons that was recommended to me by Nathan.
  • Kitchen Confidential – a book by the late Anthony Bourdain which gives us a peek into the real world of a career as a chef.
  • Less – a Pulitzer Prize-winning book about life, love, and loss.
  • East of Eden – a masterpiece by one of America’s most legendary authors.

I hate being negative, but my least favorite book by far last year was “That Dark and Bloody River.” I wrote a scathing review of this one on my other blog. It was also the longest book I read last year. I have a terrible problem of not being able to stop reading a book once I am halfway through. I read this tome with anger over a series of several long-haul flights.

I want to read 52 more books in 2019. Join me in the challenge. Happy reading!

Posted in books

Converting CSV to a SQLite Database

As a part of my data science course on edX, we have been working with a lot of CSV files. I spoke SQL long before I spoke Pandas, and I find that it is much easier to do initial exploration of the data using raw SQL queries than with the Pandas DSL.

Kaggle is a great repository full of useful data sets that are ripe for exploration. While a lot of these data sets come in both CSV and SQL flavors, some of them are CSV only. Using SQLite, we can easily import these CSV files into a database and then run queries for further data exploration.

I’m going to use the Kickstarter data set for this tutorial. Feel free to download the CSV files from Kaggle so that you can follow along.

Prerequisites

Make sure that you have SQLite installed before getting started with this tutorial. 

Steps to Convert CSV to SQLite

First, download the data set from Kaggle. It will come in the form of a zip file. Unzip it and open up a terminal in the directory that contains the new unzipped kickstarter-projects folder.

In your terminal, start a new sqlite3 session, passing the name of the file that you want to save your new database to.

sqlite3 ks.db

Inside of the sqlite shell, change the mode to csv. 

.mode csv

Import the csv file, and add the name of the table that you want the data to be imported into. 

.import kickstarter-projects/ks-projects-201801.csv ks
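If you would rather script this step than type it into the sqlite shell, here is a minimal Python sketch of the same idea using only the standard library. The function name import_csv is my own, and I am assuming the file is UTF-8 encoded and that every row has the same number of fields as the header. Like the shell import, it takes the column names from the first row and stores everything as TEXT.

```python
import csv
import sqlite3


def import_csv(csv_path, db_path, table, encoding="utf-8"):
    """Load a CSV file into a new SQLite table, storing every column as TEXT.

    Hypothetical helper for illustration; the encoding default is an assumption.
    """
    with open(csv_path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader)  # the first row supplies the column names
        # Quote identifiers so names with spaces (e.g. "usd pledged") work.
        columns = ", ".join(
            '"{}" TEXT'.format(name.replace('"', '""')) for name in header
        )
        placeholders = ", ".join("?" for _ in header)
        conn = sqlite3.connect(db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
                conn.executemany(
                    'INSERT INTO "{}" VALUES ({})'.format(table, placeholders),
                    reader,
                )
        finally:
            conn.close()


# Example call, using the paths from this tutorial:
# import_csv("kickstarter-projects/ks-projects-201801.csv", "ks.db", "ks")
```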

Verify that everything was imported correctly. Take a look at the schema, and first couple of rows. Your output should look something like this: 

sqlite> .schema ks

CREATE TABLE ks(
"ID" TEXT,
"name" TEXT,
"category" TEXT,
"main_category" TEXT,
"currency" TEXT,
"deadline" TEXT,
"goal" TEXT,
"launched" TEXT,
"pledged" TEXT,
"state" TEXT,
"backers" TEXT,
"country" TEXT,
"usd pledged" TEXT,
"usd_pledged_real" TEXT,
"usd_goal_real" TEXT
);


sqlite> select * from ks limit 5;

1000002330|The Songs of Adelaide & Abullah|Poetry|Publishing|GBP|2015-10-09|1000.00|2015-08-11 12:12:28|0.00|failed|0|GB|0.00|0.00|1533.95
1000003930|Greeting From Earth: ZGAC Arts Capsule For ET|Narrative Film|Film & Video|USD|2017-11-01|30000.00|2017-09-02 04:43:57|2421.00|failed|15|US|100.00|2421.00|30000.00
1000004038|Where is Hank?|Narrative Film|Film & Video|USD|2013-02-26|45000.00|2013-01-12 00:20:50|220.00|failed|3|US|220.00|220.00|45000.00
1000007540|ToshiCapital Rekordz Needs Help to Complete Album|Music|Music|USD|2012-04-16|5000.00|2012-03-17 03:24:11|1.00|failed|1|US|1.00|1.00|5000.00
1000011046|Community Film Project: The Art of Neighborhood Filmmaking|Film & Video|Film & Video|USD|2015-08-29|19500.00|2015-07-04 08:35:03|1283.00|canceled|14|US|1283.00|1283.00|19500.00

Excellent! Now you can query this entire data set as you normally would. Happy data exploration!
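One thing to watch for: the schema above shows that every column was imported as TEXT, so numeric columns need a CAST before you aggregate them. Here is a rough sketch of a first exploratory query from Python. The function name is made up for illustration, and it assumes the table is called ks with the main_category and usd_pledged_real columns shown in the schema.

```python
import sqlite3


def pledged_by_category(conn):
    """Return (main_category, project count, average USD pledged) rows.

    Hypothetical helper; categories with the most projects come first.
    """
    query = """
        SELECT main_category,
               COUNT(*) AS projects,
               AVG(CAST(usd_pledged_real AS REAL)) AS avg_pledged
        FROM ks
        GROUP BY main_category
        ORDER BY projects DESC, main_category
    """
    return conn.execute(query).fetchall()


# Example call against the database created in this tutorial:
# conn = sqlite3.connect("ks.db")
# for row in pledged_by_category(conn):
#     print(row)
```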

Posted in data science

Emerging trends in technology provided by ThoughtWorks.

A classmate shared this link with us as an example of a powerful visualization. I had not seen this report before, but it provides a lot of insight into the pulse of enterprise software development.

The Technology Radar is our thoughts on emerging technology trends in the industry. Read the latest here.

Source: Technology Radar | Emerging Tech Trends for 2018 | ThoughtWorks

It was especially interesting to see “1% canary” and “Incremental delivery with COTS” moving up the list of techniques, since these are the things that I work on with enterprise companies at LaunchDarkly on a daily basis.

I’ve been really amazed at the speed at which large enterprises are adopting the progressive delivery model. This report provides further evidence for me that this is not an anomaly but rather a trend in software engineering.

Posted in software

Make A Symbolic Link to Your iCloud Drive

If you use iCloud Drive to store documents and also use the terminal quite a bit, it might be handy to add a symbolic link to iCloud into your home directory. This will allow you to easily make your way around your iCloud files from a terminal.

You can do this with these steps:

  1. Open up a terminal
  2. Run the following command
ln -s "/Users/$USER/Library/Mobile Documents/com~apple~CloudDocs" iCloud

Note: the quotes above are important since there is a space in the directory path.

If you run ls, you will now see an entry called iCloud, which is a symbolic link to your iCloud Drive.
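If you set up new machines often, the same link can be created from a script. This is just a sketch with the standard library, not a replacement for the one-liner above. The function name link_icloud is my own. By default it assumes the same iCloud path as the ln command and puts the link at ~/iCloud. It leaves anything that already exists at the link location alone.

```python
from pathlib import Path


def link_icloud(link=None, target=None):
    """Create a symbolic link to the iCloud Drive folder and return its path.

    Hypothetical helper; the defaults mirror the ln command in this post.
    """
    home = Path.home()
    if target is None:
        target = home / "Library" / "Mobile Documents" / "com~apple~CloudDocs"
    if link is None:
        link = home / "iCloud"
    target, link = Path(target), Path(link)

    if link.is_symlink() or link.exists():
        return link  # something is already there, so do not touch it
    if not target.is_dir():
        raise FileNotFoundError("iCloud folder not found: {}".format(target))

    link.symlink_to(target, target_is_directory=True)
    return link


# Example call with the defaults:
# link_icloud()
```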

Next time that you need to scp a file from your iCloud Drive to some server, rather than googling the path to your iCloud folder, you can simply cd into it and go to town.

Posted in software

Hans Rosling’s 200 Countries, 200 Years, 4 Minutes

As a part of the visualization section of the Python for Data Science course on edX, we watched this awesome video showing the correlation between health and wealth in 200 countries over the last 200 years.

This is a fascinating look into the power of visualization, statistics, and data science. It is also a very interesting story that Rosling was able to convey in just 4 minutes.

Posted in data science

German Mini Van

I booked a mini van

Get hyped

I got this message from my friend tzeejay after he invited us to visit Yosemite with him over the Thanksgiving break. I was looking forward to the trip all week since I have not seen him in a while and I’ve never been to Yosemite.

Wednesday evening comes around, and he pulls up to our apartment in a huge white Cadillac Escalade. At first I assumed that mini van in German was slang for “huge SUV”, but then he said “They were out of mini vans.”

We made our way toward the Central Valley on 580, stopping in Dublin so that I could pick up a pair of boots and grabbing dinner at In-N-Out Burger. We arrived in the evening at the Courtyard in Merced and spent the night there before leaving early Thursday morning toward Yosemite.

Road to Yosemite

The biggest challenge with traveling during a massive national holiday is that there is a lack of places that are open to get meals. We went through Mariposa, CA on the way to the mountain and every single restaurant and diner in the entire city was closed. Luckily there was a gas station deli that was open so we stopped there to grab some food. They had some delicious fatty and cheesy double sausage breakfast sandwiches and mediocre gas station coffee.

We finally got to Yosemite National Park. I was surprised at how many people were there during a holiday. I can’t imagine what it must be like there during a weekend in the summer.

I’ve seen a lot of beauty during my travels, but I think that Yosemite Valley is the most beautiful place on earth.

Yosemite Valley

El Capitan

Posted in travel

Sofia Heisler No More Sad Pandas 

In this excellent talk, Sofia Heisler explains how to optimize Pandas code. There is a striking difference between a naive implementation and an optimized one, a nearly 500x improvement.

She also described two very useful benchmarking tools: timeit and line_profiler.

I wonder how much of this would apply to “normal” Python development? If nothing else, the two benchmarking tools will certainly come in handy in future optimization work.
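To poke at that question, here is a small sketch that uses the standard library timeit module on plain Python, with no Pandas involved. The two functions and the compare helper are my own toy examples, not anything from the talk. They time a hand-written loop against the built-in sum on the same list.

```python
import timeit


def total_loop(values):
    """Naive version: add the numbers up one at a time."""
    total = 0
    for v in values:
        total += v
    return total


def total_builtin(values):
    """Same result, delegating the loop to the built-in sum."""
    return sum(values)


def compare(n=100_000, repeats=20):
    """Return (loop seconds, built-in seconds) for `repeats` runs over n integers."""
    values = list(range(n))
    loop_s = timeit.timeit(lambda: total_loop(values), number=repeats)
    builtin_s = timeit.timeit(lambda: total_builtin(values), number=repeats)
    return loop_s, builtin_s


# Example call:
# loop_s, builtin_s = compare()
# print("loop: {:.4f}s  built-in: {:.4f}s".format(loop_s, builtin_s))
```

The actual numbers will depend on your machine and Python version, so treat this as a way to measure rather than a claim about any particular speedup.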

Posted in data science

Six Foot Long Scorpions

There was a super interesting article in the SF Chronicle today about scorpions. It profiled Lauren Esposito, an arachnologist at the California Academy of Sciences, and her work studying these creatures. The venom found in some species of scorpions has provided insights into fighting disease and developing opioid-free pain relief. She discussed how the ancestors of scorpions were over six feet long. Can you imagine?

It all began, she said, about 450 million years ago, when the ancestors of scorpions became the first arthropods to leave the sea. At the time, they were monsters, up to 6½ feet long, with stingers to match.

Source: Bay Area scientist stung by public’s perception of scorpions as scary scourge – SFChronicle.com

 

Posted in science

Installing Python Packages for DSE200x with pip

Python for Data Science is an introductory course that provides an overview of the Python tooling that is useful for data science.

This includes things like:

  • Jupyter Notebooks
  • The numpy library
  • The scipy library
  • The pandas library
  • The matplotlib library

The course provides an excellent overview of Python and suggests using Anaconda, which is a distribution of Python geared toward data science. Although this is a great way to get started with Python if you have never used it before, installing multiple versions of Python (which this approach would do) can be quite a pain to manage long term. This is especially true if you use Python for other purposes such as web development with Flask or Django.

If you try to work through some of the Jupyter notebooks that are presented in the course without installing Anaconda, you will often see error messages like this:

---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
in
1 get_ipython().run_line_magic('matplotlib', 'inline')
2 import numpy as np
----> 3 from scipy import misc
4 import matplotlib.pyplot as plt

ModuleNotFoundError: No module named 'scipy'

The solution is of course to install the package that is missing. In the example above we can install the missing package with pip3 install scipy.

You can use the one-liner below to install all of the required packages for the course in one go. Note that this assumes you already have python3 and pip3 installed on your computer.

pip3 install jupyter pandas numpy scipy matplotlib imageio folium
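
If you want to check what is missing before (or after) running the install, a few lines of Python can do it. This is only a sketch. The function name missing_packages is mine, and it assumes that each package's import name matches the pip name in the one-liner above.

```python
import importlib.util

# Names taken from the pip3 one-liner in this post; assumed to match import names.
COURSE_PACKAGES = (
    "jupyter", "pandas", "numpy", "scipy", "matplotlib", "imageio", "folium",
)


def missing_packages(names=COURSE_PACKAGES):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example call:
# print(missing_packages())
```

Anything it reports is what would trigger a ModuleNotFoundError like the one shown earlier.
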
Posted in data science

Data Science Helps Fight Wildfires

I’m taking a data science course on edX as a part of the UCSD MicroMasters program. I’ve always been curious about data science from a cursory point of view, but now I want to use it to analyze and understand business problems.

So far we have learned that data science is used in nearly every industry to provide actionable insights into complex problems. One of the most interesting things that I’ve seen so far is how data science is being used to help fight forest fires in real time.

Coming off a terrible week for wildfires where California faced the deadliest fire in its history, this application of data science seems more relevant and urgent than ever.

Posted in data science