Calculating Pearson Correlation Between Bitcoin and Ethereum Prices Using Python

In my previous post, I highlighted correlation methodologies such as Pearson correlation. Understanding how the prices of different cryptocurrencies relate to one another can provide valuable insights for investors and analysts. One statistical measure that quantifies the linear relationship between two variables is the Pearson correlation coefficient. In this post, we will calculate the Pearson correlation between Bitcoin (BTC) and Ethereum (ETH) using Python, working from the daily returns derived from their closing prices.

What is Pearson Correlation?

The Pearson correlation coefficient measures the strength and direction of the linear relationship between two variables. Its value ranges from -1 to 1:

  • 1: Perfect positive linear relationship.
  • -1: Perfect negative linear relationship.
  • 0: No linear relationship.
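
To make the three cases concrete, here is a minimal sketch that computes the coefficient straight from its definition (the sum of paired deviations from the mean, divided by the product of the deviation norms) and compares it with scipy.stats.pearsonr. The sample numbers and the helper name pearson_manual are made up for illustration.

```python
import numpy as np
from scipy import stats


def pearson_manual(x, y):
    """Pearson r from its definition: co-deviations over the product of deviation norms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Made-up sample values, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # rises exactly in step with x
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]  # falls exactly as x rises
y_mixed = [2.0, 1.0, 4.0, 3.0, 5.0]  # related to x, but not perfectly

r_up = pearson_manual(x, y_up)
r_down = pearson_manual(x, y_down)
r_mixed = pearson_manual(x, y_mixed)
r_scipy = stats.pearsonr(x, y_mixed)[0]

print(r_up, r_down, r_mixed, r_scipy)
```

The first two series give the perfect positive and perfect negative cases, while the third lands in between, and the hand-rolled value should agree with SciPy's.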

Steps to Calculate Pearson Correlation

  1. Import Necessary Libraries – We will use pandas for data manipulation and scipy.stats for statistical calculations.
  2. Fetch Historical Price Data – Obtain historical daily closing prices for BTC and ETH.
  3. Calculate Daily Returns – Compute the daily percentage change in prices.
  4. Compute the Pearson Correlation Coefficient – Use the daily returns to calculate the correlation.
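
Steps 3 and 4 work with daily returns rather than raw prices. One likely reason, sketched here on synthetic data, is that two series that merely trend in the same direction can show a high correlation in price levels even when their day-to-day moves are unrelated. The series, seed, and variable names below are invented for illustration and are not BTC or ETH data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
t = np.arange(500)

# Two synthetic "price" series: each has its own upward trend plus independent noise
price_a = pd.Series(100 + 0.5 * t + rng.normal(0, 1, size=t.size))
price_b = pd.Series(50 + 0.3 * t + rng.normal(0, 1, size=t.size))

# Correlation of price levels is dominated by the shared upward trend
level_corr = price_a.corr(price_b)

# Correlation of daily returns reflects whether day-to-day moves line up
return_corr = price_a.pct_change().corr(price_b.pct_change())

print(f"Price-level correlation: {level_corr:.3f}")
print(f"Daily-return correlation: {return_corr:.3f}")
```

In this setup the price-level figure should come out close to 1 while the return figure stays near 0, which is why the rest of the walkthrough correlates returns.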

Implementation

Here is a step-by-step guide to performing the analysis:

1. Import Necessary Libraries

python

import pandas as pd

import scipy.stats as stats

2. Fetch Historical Price Data

For this example, we will use historical data from the Gemini exchange. You can download the datasets directly:

  • Bitcoin (BTC) Data
  • Ethereum (ETH) Data

Load the data into pandas DataFrames:

python
# Load BTC data (skiprows=1 skips the first line of the file, so the header is read from the second line)
btc_df = pd.read_csv('Gemini_BTCUSD_d.csv', skiprows=1)
# Load ETH data
eth_df = pd.read_csv('Gemini_ETHUSD_d.csv', skiprows=1)
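
If you want to check what skiprows=1 does before pointing it at the real files, the following sketch reads a small in-memory sample instead. It assumes the download starts with a single banner line above the column names and includes date and close columns; the banner text and the rows are invented, so compare them with your own file.

```python
import io

import pandas as pd

# Invented sample mimicking the assumed layout: one banner line, then header, then rows
sample_csv = """Data provided by an example source
date,symbol,open,high,low,close
2024-01-03,BTC/USD,44950.0,45500.0,42000.0,42850.0
2024-01-02,BTC/USD,44200.0,45900.0,44100.0,44950.0
2024-01-01,BTC/USD,42280.0,44250.0,42200.0,44200.0
"""

sample_df = pd.read_csv(io.StringIO(sample_csv), skiprows=1)

# Confirm the columns the later steps rely on are present
required = {"date", "close"}
missing = required - set(sample_df.columns)

print(sample_df.columns.tolist())
print(f"Rows loaded: {len(sample_df)}, missing columns: {sorted(missing)}")
```

Running the same column check on btc_df and eth_df is a cheap way to catch a file whose layout differs from this assumption.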

3. Preprocess the Data

Ensure the data is sorted by date and calculate the daily percentage change:

python
# Convert 'date' to datetime and sort oldest-first
btc_df['date'] = pd.to_datetime(btc_df['date'])
btc_df.sort_values('date', inplace=True)
eth_df['date'] = pd.to_datetime(eth_df['date'])
eth_df.sort_values('date', inplace=True)
# Calculate daily percentage change
btc_df['pct_change'] = btc_df['close'].pct_change()
eth_df['pct_change'] = eth_df['close'].pct_change()
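
To see why the sort matters, here is a small sketch on invented closing prices that start out newest-first, as some exports are. pct_change() compares each row with the row before it, so the rows need to run oldest-first for the result to be a day-over-day return. The toy_df name and its values are hypothetical.

```python
import pandas as pd

# Invented closing prices, listed newest-first
toy_df = pd.DataFrame({
    "date": ["2024-01-03", "2024-01-02", "2024-01-01"],
    "close": [121.0, 110.0, 100.0],
})
toy_df["date"] = pd.to_datetime(toy_df["date"])

# Sort oldest-first so each return compares a day with the day before it
toy_df = toy_df.sort_values("date").reset_index(drop=True)
toy_df["pct_change"] = toy_df["close"].pct_change()

print(toy_df)
```

The first row has no previous day, so its return is NaN; that is the missing value dropped before the correlation is computed.
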

4. Merge the DataFrames

Combine the BTC and ETH DataFrames on the ‘date’ column:

python
merged_df = pd.merge(btc_df[['date', 'pct_change']], eth_df[['date', 'pct_change']],
                     on='date', suffixes=('_btc', '_eth'))
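
pd.merge performs an inner join by default, so only dates present in both files make it into merged_df. The sketch below shows that behaviour, along with the column names the suffixes produce, on two invented frames that deliberately cover different dates.

```python
import pandas as pd

# Invented returns; the two frames deliberately cover different dates
left = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "pct_change": [0.01, -0.02, 0.03],
})
right = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
    "pct_change": [-0.01, 0.02, 0.05],
})

# Default inner join: only dates found in both frames survive
toy_merged = pd.merge(left, right, on="date", suffixes=("_btc", "_eth"))

print(toy_merged)
```

Only the two overlapping dates remain, which keeps each BTC return paired with the ETH return from the same day.
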

5. Calculate the Pearson Correlation Coefficient

python
# Drop missing values (the first row of each series has no previous day)
merged_df.dropna(inplace=True)
# Calculate Pearson correlation
correlation, p_value = stats.pearsonr(merged_df['pct_change_btc'], merged_df['pct_change_eth'])
print(f'Pearson Correlation Coefficient: {correlation}')
print(f'P-value: {p_value}')
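
As a sanity check, the coefficient can also be computed with pandas alone, since Series.corr uses Pearson by default. It does not return a p-value, though. The sketch below compares the two on invented returns standing in for the merged BTC and ETH columns.

```python
import pandas as pd
from scipy import stats

# Invented daily returns standing in for the merged BTC/ETH columns
toy_returns = pd.DataFrame({
    "pct_change_btc": [0.010, -0.020, 0.015, 0.030, -0.005, 0.000],
    "pct_change_eth": [0.012, -0.025, 0.010, 0.041, -0.002, 0.004],
})

# SciPy returns both the coefficient and a p-value
r_scipy, p_value = stats.pearsonr(toy_returns["pct_change_btc"], toy_returns["pct_change_eth"])

# pandas returns only the coefficient (Pearson is its default method)
r_pandas = toy_returns["pct_change_btc"].corr(toy_returns["pct_change_eth"])

print(f"scipy:  r = {r_scipy:.4f}, p = {p_value:.4f}")
print(f"pandas: r = {r_pandas:.4f}")
```

The two coefficients should match to within floating-point error; if they do not on your data, leftover missing values are the first thing to check.
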

Interpretation

  • Pearson Correlation Coefficient – Indicates the strength and direction of the linear relationship between BTC and ETH daily returns.
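  • P-value – The probability of seeing a correlation at least this strong if the true correlation were zero. A small value (for example, below 0.05) suggests the correlation is statistically significant.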

A coefficient close to 1 suggests a strong positive correlation, meaning that BTC and ETH daily returns tend to move in the same direction. A coefficient close to -1 would mean they tend to move in opposite directions, and a coefficient close to 0 indicates little to no linear relationship.

Takeaway

By following the steps above, you can calculate the Pearson correlation between Bitcoin and Ethereum daily returns using Python. The same analysis can be extended to other cryptocurrencies or financial assets to better understand their relationships.

— Caio Marchesani
