Preface: This blog post is part of a series about work I did during my time in the 12-week Metis Data Science Bootcamp in Winter 2021. This post focuses on gathering track metadata for import into pandas. It will be updated for clarity as feedback is gathered and as my project expands in scope. In the future, I will link articles related to the construction of an audio content-based Spotify song recommender.
Hardware Used: MacBook Pro (Late 2016)
Software Used: Google Chrome (Version 89.0), JupyterLab (Version 3.0.7), Spotify Desktop App (Version 1.1.54), Spotify Web App, Google Drive, Google Colab
1) Install Spotipy
pip install spotipy
or upgrade it:
pip install spotipy --upgrade
Any Python IDE should work for smaller projects, so choose the one you prefer. I used JupyterLab on my local machine for the exploratory part of my project. Later, when I was building the song recommender, I needed more RAM, so I switched to the cloud computing services on Google Colab.
2) Create an App
- Head to https://developer.spotify.com/dashboard/ and log in to Spotify using whatever Spotify account you prefer. Note: I used a premium account (read: paid) and cannot comment on accessibility and usability with a free account.
- Select “Create an App”
- Record and store your “Client ID” and “Client Secret” in a secure location. These will be required for accessing the API. Do not share your “Client ID” and “Client Secret”, so that nobody else can access the API under your account. You can reset the “Client Secret” if you need to, but it is not advised unless necessary (see screenshot below).
3) Add Redirect URL(s)
- Click on “Edit Settings” and add at least one Redirect URL. Depending on your IDE, this can be a local address (e.g., “http://localhost:8990/callback/”) or a website (e.g., “http://<yourpersonalwebsite.com>/”). Notice the “/” at the end of both addresses: some users reported issues when that “/” was missing. For JupyterLab I was able to use a local host, but for Google Colab I was only successful when using a public website. I didn’t try it myself, but others have reported success using websites like “http://.google.com”.
- Add it and click outside the box window to save it.
- Optional: Add a website that links to your project such as a GitHub profile or a personal website.
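Since the redirect URL used in code has to match the one saved in the dashboard, a quick sanity check before authorizing can save some head-scratching. The following is only a minimal sketch: `redirect_uri_warnings` is a hypothetical helper (it is not part of Spotipy), and the trailing-slash rule simply encodes the user reports mentioned above rather than a documented requirement.

```python
from urllib.parse import urlparse


def redirect_uri_warnings(uri):
    """Return a list of human-readable warnings for a redirect URL.

    Hypothetical helper for illustration; it is not part of Spotipy.
    """
    warnings = []
    parsed = urlparse(uri)
    if parsed.scheme not in ("http", "https"):
        warnings.append("URL should start with http:// or https://")
    if not parsed.netloc:
        warnings.append("URL has no host name")
    if not uri.endswith("/"):
        warnings.append("URL does not end with '/'")
    return warnings


# An empty list means none of the checks above complained.
print(redirect_uri_warnings("http://localhost:8990/callback/"))
print(redirect_uri_warnings("localhost:8990/callback"))
```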
4) Get authorization
Certain actions require authorization from Spotify, and authorizing is also recommended for long-running applications. Queries without authorization are read-only. The following is the code snippet I used for authorization:
import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Replace the placeholders with your app's credentials and redirect URL.
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id="<YOUROWNCLIENTID>",
    client_secret="<YOUROWNCLIENTSECRET>",
    redirect_uri="<YOUROWNREDIRECTURL>",
    scope="user-library-read"))
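One way to keep the “Client ID” and “Client Secret” out of a notebook is to read them from environment variables. Spotipy documents the names `SPOTIPY_CLIENT_ID`, `SPOTIPY_CLIENT_SECRET`, and `SPOTIPY_REDIRECT_URI` for this purpose. The sketch below assumes those variables have been set beforehand; `load_spotify_credentials` is a hypothetical helper, not a Spotipy function, that fails loudly if one is missing.

```python
import os

# Variable names documented by Spotipy for supplying credentials.
REQUIRED_VARS = (
    "SPOTIPY_CLIENT_ID",
    "SPOTIPY_CLIENT_SECRET",
    "SPOTIPY_REDIRECT_URI",
)


def load_spotify_credentials(env=os.environ):
    """Collect the Spotify credentials from environment variables.

    Hypothetical helper: raises a clear error if a variable is missing or empty.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["SPOTIPY_CLIENT_ID"],
        "client_secret": env["SPOTIPY_CLIENT_SECRET"],
        "redirect_uri": env["SPOTIPY_REDIRECT_URI"],
    }
```

The returned dictionary can then be unpacked into the authorization call, for example `SpotifyOAuth(scope="user-library-read", **load_spotify_credentials())`.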
5) Query Results
Query response data are returned in JSON format. An example of a general search query is:
results = sp.search(q='<THINGYOUWANT>', limit=X)
where “X” is the specified result limit. Below is an example of the query results for “congratulations”. Note that the “available_markets” field will be long, so you may want to restrict your results to just a handful at first. I like to use 5 as my starting point.
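To make the shape of the response a bit more concrete, here is a minimal sketch that assumes a track search, where the matches sit under `results['tracks']['items']`. The `sample_results` dictionary is a hand-written stand-in with placeholder values, not real API output, and `strip_markets` is a hypothetical helper that drops the long “available_markets” lists before printing.

```python
# Hand-written stand-in for what sp.search(...) returns for tracks.
# All values are placeholders; a real response has many more fields.
sample_results = {
    "tracks": {
        "items": [
            {
                "name": "Example Track",
                "id": "placeholder-id-1",
                "artists": [{"name": "Example Artist"}],
                "album": {"name": "Example Album", "available_markets": ["US", "CA"]},
                "available_markets": ["US", "CA", "GB"],
            }
        ]
    }
}


def strip_markets(obj):
    """Return a copy of nested JSON-like data without 'available_markets' keys."""
    if isinstance(obj, dict):
        return {k: strip_markets(v) for k, v in obj.items() if k != "available_markets"}
    if isinstance(obj, list):
        return [strip_markets(v) for v in obj]
    return obj


# Print one compact line per track in the cleaned results.
for track in strip_markets(sample_results)["tracks"]["items"]:
    artists = ", ".join(a["name"] for a in track["artists"])
    print(f"{track['name']} by {artists} (id: {track['id']})")
```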
6) Parse metadata JSON
The metadata JSON can be a bit cumbersome and intimidating at first glance or, as I initially put it, look like “nonsense soup”. So, I wrote a function to parse the JSON data into a format ready for a pandas DataFrame. The function “tr_md()” takes a track and returns 22 different pieces of metadata.
Code can be found here:
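For readers who just want the general idea, the sketch below flattens one track object into a flat dictionary, which is the kind of row pandas can consume. It is not “tr_md()”: `track_to_row` is a hypothetical, much smaller stand-in that keeps only a handful of fields, and `sample_track` uses made-up placeholder values. The key names follow Spotify's track object.

```python
def track_to_row(track):
    """Flatten one Spotify track object into a flat dict (one future DataFrame row).

    Hypothetical stand-in for tr_md(); it keeps only a handful of fields.
    """
    album = track.get("album", {})
    return {
        "track_id": track.get("id"),
        "track_name": track.get("name"),
        # A track can have several artists, so join their names into one string.
        "artist_names": ", ".join(a["name"] for a in track.get("artists", [])),
        "album_name": album.get("name"),
        "release_date": album.get("release_date"),
        "duration_ms": track.get("duration_ms"),
        "popularity": track.get("popularity"),
        "explicit": track.get("explicit"),
    }


# Placeholder track with made-up values, shaped like an item in results["tracks"]["items"].
sample_track = {
    "id": "placeholder-id-1",
    "name": "Example Track",
    "artists": [{"name": "Artist A"}, {"name": "Artist B"}],
    "album": {"name": "Example Album", "release_date": "2000-01-01"},
    "duration_ms": 180000,
    "popularity": 50,
    "explicit": False,
}

rows = [track_to_row(sample_track)]
print(rows[0])
```

With real search results, `rows = [track_to_row(t) for t in results["tracks"]["items"]]` followed by `pandas.DataFrame(rows)` should give one row per track.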
7) Response Data Types
There are many different pieces of information that can be obtained from the Spotify API. The official documentation lists all the possibilities at https://developer.spotify.com/documentation/web-api/reference/#objects-index. I found the TrackObject keys the most helpful as a single group, but your use case will determine which keys are most appropriate. Use the metadata parser function “tr_md()” as a starting point when dealing with the JSON data. Playing around with the bracketing structure [‘example’] seen in “tr_md()” is how I navigated the JSON when building the function.
Hopefully this short introduction allowed you to connect to the Spotify API and start exploring. There are innumerable directions a Spotify-based project can take, and this is just a tiny example of the information that can be obtained with a fairly small amount of code.
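When experimenting with the brackets, it can help to list every path through the JSON first. The following is a rough sketch of that idea: `key_paths` is a hypothetical helper that walks nested dictionaries and lists and yields bracket-style paths, and `sample_track` is placeholder data shaped like a small slice of a track object.

```python
def key_paths(obj, prefix=""):
    """Yield bracket-style paths to the values in nested dicts and lists.

    Only the first element of each list is followed, to keep the output short.
    """
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from key_paths(value, f"{prefix}['{key}']")
    elif isinstance(obj, list) and obj:
        yield from key_paths(obj[0], f"{prefix}[0]")
    else:
        # Plain values (and empty lists) end the path.
        yield prefix


# Placeholder data with made-up values, shaped like a small slice of a track object.
sample_track = {
    "name": "Example Track",
    "album": {"name": "Example Album", "images": [{"url": "placeholder", "height": 640}]},
    "artists": [{"name": "Example Artist"}],
}

for path in key_paths(sample_track, "track"):
    print(path)
```

Each printed path can be pasted after a variable holding the track to pull out that value.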
Resources and Upcoming Posts
- Pagination for returning lots of results in a single query.
- A function to return audio feature data and parse the JSON.
- Functions for getting track data from playlists, artists, albums, and Spotify users.
- Constructing large libraries of songs efficiently and quickly.
- A discussion of errors and problems I encountered along the way.
- A link to, and discussion of, my GitHub project page for my Spotify content-based song recommender.
Lastly, please post your comments, questions, and tips! This is the first in a series of posts on how to build an audio content-based Spotify recommender. It is by no means exhaustive, and I’m sure I’ve overlooked some key issues I had when I first began working with the API. Have fun exploring the song data!
 https://newsroom.spotify.com/company-info/, accessed on 03/25/2021
 https://github.com/plamere/spotipy, accessed on 03/25/2021
 https://spotipy.readthedocs.io/en/2.17.1/, accessed on 03/25/2021
 https://developer.spotify.com/documentation/web-api/reference/#objects-index, accessed on 03/25/2021