levi leach projects

The Dataset

Source and Explanation

The dataset used in this project was obtained from Kaggle (https://www.kaggle.com/yamaerenay/spotify-dataset-19212020-160k-tracks). It contains the audio features of about 175,000 songs from the Spotify database, released between 1921 and 2021. The first visualization (a scatterplot) uses a random sample of 1,000 tracks from the dataset, while the other visualizations use the entire dataset.
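As a rough illustration of how a 1,000-track random sample like the one behind the scatterplot could be drawn, here is a minimal sketch using only the Python standard library. The function name `sample_tracks`, the fixed seed, and the stand-in `tracks` list are assumptions made for this example; the sampling method actually used in the project is not described.

```python
import random

def sample_tracks(tracks, k=1000, seed=42):
    """Return k tracks drawn without replacement (all of them if fewer than k exist)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the sample is repeatable
    k = min(k, len(tracks))
    return rng.sample(tracks, k)

# Hypothetical stand-in for the roughly 175,000 rows of the real dataset.
tracks = [{"id": i, "tempo": 60 + i % 120} for i in range(175_000)]

subset = sample_tracks(tracks)
print(len(subset))
```

Sampling without replacement keeps any track from appearing twice in the scatterplot.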

Motivation

As an avid music listener, I saw this project as an opportunity to dive deeper into one of my interests from the perspective of a data scientist. Are there any common relationships or correlations between the various attributes of tracks? Are there any significant trends in the types of music published? How do the qualities of my favorite songs compare to the "average" song? All of these questions are explored through the visualizations created.

Insights

In the first visualization, I was surprised to see weaker correlations than I expected across several relationships. For example, I expected a strong positive correlation between danceability and tempo, on the assumption that faster songs are better to dance to. Instead, the correlation was very weak. Across attributes, correlation coefficients tend to fall between -0.5 and +0.5, representing weak to moderate relationships.
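To make the correlation figures concrete, the sketch below computes a correlation coefficient between two attributes. It assumes the Pearson coefficient, which the write-up does not name explicitly, and the tempo and danceability values are made up for illustration rather than taken from the dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up values for five hypothetical tracks.
tempo = [90, 120, 150, 100, 130]
danceability = [0.70, 0.60, 0.65, 0.50, 0.72]

r = pearson_r(tempo, danceability)
print(round(r, 3))
```

A result near +1 or -1 indicates a strong linear relationship, while a result near 0 indicates little or none.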

In the second visualization, there appears to be a significant shift in trends beginning in the 1950s. Most significantly, acousticness begins to plummet and energy rises rapidly around this time. Moreover, year-to-year averages appear much more chaotic prior to 1950. Another trend of note is that in the 1990s, the percentage of tracks that are explicit rises dramatically. This can likely be attributed to the rise of hip-hop and rap music, which is frequently explicit.

Because the third visualization is user-directed, insights will be unique to each person using it. After searching for some of the songs I frequently listen to, I found that they score higher in danceability and energy than the median, and lower in acousticness.

Attribute Explanations

The audio features of the tracks, as determined by Spotify

Attribute	Description
Acousticness	A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
Danceability	How suitable a track is for dancing (from 0.0 to 1.0) based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity.
Energy	A perceptual measure of intensity and activity, from 0.0 to 1.0. Typically, energetic tracks feel fast, loud, and noisy.
Explicit	Whether or not the track has explicit lyrics. 0 if false, 1 if true.
Liveness	Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.
Loudness	The overall loudness of a track in decibels (dB), averaged across the entire track. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude).
Popularity	A value between 0 and 100, with 100 being the most popular. The popularity is calculated by an algorithm and is based, for the most part, on the total number of plays the track has had and how recent those plays are.
Speechiness	Detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value.
Tempo	The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece.
Valence	A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).



Exploring Spotify Track Data


Explore Relationships

Do any relationships exist between the attributes of tracks?
Select from the dropdown menus to find out. Hover on a point to see that song's data.

How Have Songs Changed Over the Years?

See the average score of all songs across time.
Highlight an area to zoom.
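The year-by-year averages shown in this chart can be sketched as a simple group-and-average step. The field names `year`, `acousticness`, and `explicit`, and the sample rows, are assumptions for illustration; the real dataset's column names may differ.

```python
from collections import defaultdict

def yearly_means(tracks, attribute):
    """Return {year: mean of attribute over that year's tracks}, ordered by year."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for track in tracks:
        totals[track["year"]] += track[attribute]
        counts[track["year"]] += 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}

# Made-up rows standing in for the real dataset.
tracks = [
    {"year": 1950, "acousticness": 0.9, "explicit": 0},
    {"year": 1950, "acousticness": 0.7, "explicit": 0},
    {"year": 1990, "acousticness": 0.3, "explicit": 1},
    {"year": 1990, "acousticness": 0.1, "explicit": 0},
]

print(yearly_means(tracks, "acousticness"))
print(yearly_means(tracks, "explicit"))
```

Because `explicit` is stored as 0 or 1, its yearly mean is the fraction of that year's tracks that are explicit.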

How Does Your Song Compare?

Search for a song and compare its attributes to the medians of all songs.
Note: Not all songs in Spotify's database are present in the dataset (about 175,000 tracks are searchable).
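The comparison described here can be sketched as a name lookup followed by a difference from each attribute's median. The helper names `find_track` and `compare_to_median`, the `name` field, and the sample rows are hypothetical; this is one possible approach, not necessarily how the page implements its search.

```python
from statistics import median

def find_track(tracks, name):
    """Return the first track whose name matches case-insensitively, or None."""
    wanted = name.strip().lower()
    for track in tracks:
        if track["name"].lower() == wanted:
            return track
    return None

def compare_to_median(track, tracks, attributes):
    """Return {attribute: track value minus the median over all tracks}."""
    return {
        attr: track[attr] - median(t[attr] for t in tracks)
        for attr in attributes
    }

# Made-up rows standing in for the searchable dataset.
tracks = [
    {"name": "Song A", "energy": 0.2, "danceability": 0.4},
    {"name": "Song B", "energy": 0.5, "danceability": 0.6},
    {"name": "Song C", "energy": 0.9, "danceability": 0.8},
]

song = find_track(tracks, "song c")
if song is not None:
    print(compare_to_median(song, tracks, ["energy", "danceability"]))
```

A positive difference means the song scores above the median for that attribute, and a negative one means it scores below. Returning `None` from the lookup covers songs that are not in the dataset.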
