Converting CSV to a SQLite Database
As part of my data science course on edX, we have been working with a lot of CSV files. I spoke SQL long before I spoke Pandas, and I find it much easier to do initial exploration of the data with raw SQL queries than with the Pandas DSL.
Kaggle is a great repository full of useful data sets that are ripe for exploration. While a lot of these data sets come in both CSV and SQL flavors, some of them are CSV only. Using SQLite, we can easily import these CSV files into a database and then run queries for further data exploration.
I'm going to use the Kickstarter data set for this tutorial. Feel free to download the CSV files from Kaggle so that you can follow along.
Prerequisites
Make sure that you have SQLite installed before getting started with this tutorial.
Steps to Convert CSV to SQLite
First, download the data set from Kaggle. It comes as a zip file. Unzip it and open a terminal in the directory that contains the new unzipped kickstarter-projects folder.
In your terminal, start a new SQLite session by running sqlite3 followed by the name of the file that you want to save your new database to.
sqlite3 ks.db
Inside the SQLite shell, change the mode to csv so that the import treats the file as comma-separated values.
.mode csv
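Import the CSV file by giving .import the path to the file and the name of the table that you want the data imported into. If the table does not exist yet, SQLite creates it and uses the first row of the file as the column names.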
.import kickstarter-projects/ks-projects-201801.csv ks
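If you would rather script the import than type it into the shell, the same idea can be sketched with Python's standard library. This is only a rough equivalent, not what the sqlite3 shell does internally: the helper name import_csv is made up for this example, and it assumes the file is UTF-8 encoded and that every row has the same number of fields as the header.

```python
import csv
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so names with spaces (e.g. "usd pledged") are safe."""
    return '"' + name.replace('"', '""') + '"'


def import_csv(conn, csv_path, table):
    """Hypothetical helper: load a CSV file into a new table of TEXT columns."""
    with open(csv_path, newline="", encoding="utf-8") as f:  # UTF-8 is an assumption
        reader = csv.reader(f)
        header = next(reader)  # the first row supplies the column names
        columns = ", ".join(quote_ident(col) + " TEXT" for col in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f"CREATE TABLE {quote_ident(table)} ({columns})")
        # Assumes every remaining row has exactly len(header) fields.
        conn.executemany(
            f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})", reader
        )
    conn.commit()


# Example call, using the file and table names from this tutorial:
# import_csv(sqlite3.connect("ks.db"), "kickstarter-projects/ks-projects-201801.csv", "ks")
```

Declaring every column as TEXT keeps the sketch simple and matches the schema you will see in the next step.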
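Verify that everything was imported correctly. Take a look at the schema and the first couple of rows. Your output should look something like this (the rows are shown in SQLite's default list mode; if your session is still in csv mode, the fields will be separated by commas instead of pipes):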
sqlite> .schema ks
CREATE TABLE ks(
"ID" TEXT,
"name" TEXT,
"category" TEXT,
"main_category" TEXT,
"currency" TEXT,
"deadline" TEXT,
"goal" TEXT,
"launched" TEXT,
"pledged" TEXT,
"state" TEXT,
"backers" TEXT,
"country" TEXT,
"usd pledged" TEXT,
"usd_pledged_real" TEXT,
"usd_goal_real" TEXT
);
sqlite> select * from ks limit 5;
1000002330|The Songs of Adelaide & Abullah|Poetry|Publishing|GBP|2015-10-09|1000.00|2015-08-11 12:12:28|0.00|failed|0|GB|0.00|0.00|1533.95
1000003930|Greeting From Earth: ZGAC Arts Capsule For ET|Narrative Film|Film & Video|USD|2017-11-01|30000.00|2017-09-02 04:43:57|2421.00|failed|15|US|100.00|2421.00|30000.00
1000004038|Where is Hank?|Narrative Film|Film & Video|USD|2013-02-26|45000.00|2013-01-12 00:20:50|220.00|failed|3|US|220.00|220.00|45000.00
1000007540|ToshiCapital Rekordz Needs Help to Complete Album|Music|Music|USD|2012-04-16|5000.00|2012-03-17 03:24:11|1.00|failed|1|US|1.00|1.00|5000.00
1000011046|Community Film Project: The Art of Neighborhood Filmmaking|Film & Video|Film & Video|USD|2015-08-29|19500.00|2015-07-04 08:35:03|1283.00|canceled|14|US|1283.00|1283.00|19500.00
Excellent! Now you can query this entire data set as you normally would. Happy data exploration!
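One thing to watch for while exploring: the schema above shows every column as TEXT, so sorting or comparing a numeric column compares strings unless you CAST it. Here is a small sketch of the difference. It uses an in-memory table filled with three columns from the five sample rows above rather than the real ks.db, so it runs on its own.

```python
import sqlite3

# In-memory stand-in for ks.db; swap in sqlite3.connect("ks.db") for the real data.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE ks ("name" TEXT, "state" TEXT, "usd_goal_real" TEXT)')
conn.executemany(
    "INSERT INTO ks VALUES (?, ?, ?)",
    [
        ("The Songs of Adelaide & Abullah", "failed", "1533.95"),
        ("Greeting From Earth: ZGAC Arts Capsule For ET", "failed", "30000.00"),
        ("Where is Hank?", "failed", "45000.00"),
        ("ToshiCapital Rekordz Needs Help to Complete Album", "failed", "5000.00"),
        ("Community Film Project: The Art of Neighborhood Filmmaking", "canceled", "19500.00"),
    ],
)

# Sorting the raw TEXT column compares strings, so "5000.00" lands ahead of "45000.00".
text_order = [
    row[0]
    for row in conn.execute("SELECT usd_goal_real FROM ks ORDER BY usd_goal_real DESC")
]

# Casting to REAL makes SQLite compare the values as numbers.
numeric_order = [
    row[0]
    for row in conn.execute(
        "SELECT usd_goal_real FROM ks ORDER BY CAST(usd_goal_real AS REAL) DESC"
    )
]

print(text_order)
print(numeric_order)
```

The same CAST works inside WHERE clauses and aggregates such as SUM or AVG.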
Thank you for reading! Share your thoughts with me on Bluesky, Mastodon, or via email.