Python: Finding Profitable Cryptocurrencies [4]: Plotting the Best 10 / Worst 10 of 250 Yahoo Finance Cryptocurrencies by Cumulative Return [Yahoo Finance, Ranking, Cumulative Log Return]


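[1] Finding Profitable Cryptocurrencies: Calculating Cryptocurrency Returns and Cumulative Returns [Log Return, Cumulative Log Return]
[2] Finding Profitable Cryptocurrencies: Plotting and Comparing the Log Returns and Cumulative Log Returns of Multiple Cryptocurrencies [Matplotlib, Log Return, Cumulative Log Return]
[3] Finding Profitable Cryptocurrencies: Plotting the Cumulative Returns of Multiple Cryptocurrencies with Hover Support [Plotly Express, Hovering, Cumulative Log Return]
[4] Finding Profitable Cryptocurrencies: Plotting the Best 10 / Worst 10 of 250 Yahoo Finance Cryptocurrencies by Cumulative Return [Yahoo Finance, Ranking, Cumulative Log Return]
[5] Finding Profitable Cryptocurrencies: Displaying Cryptocurrency Data as Candlestick Charts (with Moving Averages and Volume) [Matplotlib, mplfinance, Plotly Express]
[6] Finding Profitable Cryptocurrencies: Finding the Coin with a 337749% Cumulative Return among 424 Binance Coins [Binance, Cumulative Return, Ranking]
[7] Finding Profitable Cryptocurrencies: Resampling Cryptocurrency Data to Arbitrary Intervals (15 min, 30 min, 60 min, ...) and Comparing Cumulative Returns [Pandas DataFrame, ReSample, Round]
[8] Finding Profitable Cryptocurrencies: Finding the Best 10 / Worst 10 by Cumulative Return on GMO Coin [GMO Coin, Ranking, Cumulative Return]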

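This eight-part series explains how to find profitable cryptocurrencies. Part 4 downloads data for 250 cryptocurrencies from Yahoo! Finance (US) and displays the best 10 and worst 10 by cumulative return, both as text and as charts. Cumulative return is explained in detail in Part 1, "Article (Article129)".

To download the symbols of all cryptocurrencies (coins) from Yahoo! Finance (US), use the Python yahooquery library. The ".get_screeners()" method of its Screener class returns up to 250 coin symbols.

Coin data stored in a Pandas DataFrame can be ranked in two ways: with the DataFrame "nlargest()/nsmallest()" methods, or with the "sort_values()" method. This article uses both to display the rankings.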
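The two ranking approaches can be compared on a small example. This is a minimal sketch, not the article's listing: the symbols and return values are made up, and the frame is assumed to contain no ties or invalid values.

```python
import pandas as pd

# Toy data: symbols and returns are made up for illustration only
df = pd.DataFrame({
    "symbol": ["AAA-USD", "BBB-USD", "CCC-USD", "DDD-USD", "EEE-USD"],
    "cum_log_return": [0.50, -0.80, 2.10, -0.30, 1.20],
})

# Approach 1: nlargest()/nsmallest() return the n rows directly
top2_a = df.nlargest(2, "cum_log_return")
bottom2_a = df.nsmallest(2, "cum_log_return")

# Approach 2: sort the whole frame, then take the first n rows
top2_b = df.sort_values("cum_log_return", ascending=False).head(2)
bottom2_b = df.sort_values("cum_log_return", ascending=True).head(2)

print(top2_a["symbol"].tolist())     # best first
print(bottom2_a["symbol"].tolist())  # worst first
```

Both approaches select the same rows here. "nlargest()/nsmallest()" is the shorter form when only the extremes are needed, while "sort_values()" keeps the full ordering available for later steps.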

The coins sorted in the DataFrame are then plotted with Matplotlib and Plotly Express.

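Investment decisions on cryptocurrencies are generally based on metrics (indicators and figures that measure the performance or effectiveness of an investment) such as the following.

Daily Return
The rate of return per day on a cryptocurrency investment.
Log Return
The logarithmic rate of return on a cryptocurrency investment.
Cumulative Log Return
The cumulative logarithmic rate of return on a cryptocurrency investment.
Trading Volume
The amount of the cryptocurrency traded.
Market Cap
The total market value of the cryptocurrency.
Price Volatility
The degree to which the cryptocurrency's price fluctuates.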
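The first three metrics can be sketched on a toy price series. The closing prices below are made up; the formulas follow the same definitions used in the listing later in this article (log return from consecutive closes, cumulative return as the exponential of the summed log returns minus 1).

```python
import numpy as np
import pandas as pd

# Made-up closing prices for illustration only
close = pd.Series([100.0, 110.0, 99.0, 120.0], name="close")

daily_return = close.pct_change()                 # (P_t / P_{t-1}) - 1
log_return = np.log(close / close.shift(1))       # ln(P_t / P_{t-1})
cum_log_return = np.exp(log_return.cumsum()) - 1  # total return since the first price

print(f"{cum_log_return.iloc[-1]:.2%}")  # 20.00%
```

Because log returns add up over time, the final cumulative value equals the simple return from the first price to the last (100 to 120, or 20%).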

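When actually trading, several technical indicators such as the following are combined to judge the timing of buying and selling.

Simple Moving Average (SMA)
Buy when the SMA is trending upward.
Exponential Moving Average (EMA)
Buy when the EMA is trending upward.
Bollinger Bands (BB)
When the bands narrow, volatility has been low and a larger price move often follows, so watch the buy/sell timing.
Relative Strength Index (RSI)
Sell when the RSI is 70 or above; buy when it is 30 or below.
Stochastic Oscillator (STO)
Sell when the STO is 80 or above; buy when it is 20 or below.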
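Several of these indicators can be computed with plain Pandas. The following is only a sketch under stated assumptions: the prices are made up, the 3-period window is chosen to keep the example short, and the RSI uses a simple rolling average rather than Wilder's smoothing.

```python
import pandas as pd

# Made-up closing prices for illustration only
close = pd.Series([10, 11, 12, 11, 13, 14, 13, 15, 16, 15], dtype="float64")
window = 3  # assumed look-back period; longer windows are common in practice

sma = close.rolling(window).mean()                 # Simple Moving Average
ema = close.ewm(span=window, adjust=False).mean()  # Exponential Moving Average

# Bollinger Bands: SMA plus/minus two rolling standard deviations
std = close.rolling(window).std()
upper_band = sma + 2 * std
lower_band = sma - 2 * std

# RSI (simple-average variant): compares average gains with average losses
delta = close.diff()
avg_gain = delta.clip(lower=0).rolling(window).mean()
avg_loss = (-delta.clip(upper=0)).rolling(window).mean()
rsi = 100 - 100 / (1 + avg_gain / avg_loss)

print(pd.DataFrame({"close": close, "sma": sma, "ema": ema, "rsi": rsi}).round(2))
```

The thresholds listed above (70/30 for RSI, for example) would then be applied to the resulting columns.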


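Displaying the Best 10 / Worst 10 of 250 Yahoo! Finance Coins by Cumulative Return as Text and Charts

First, start Visual Studio Code and create the program file

Start Visual Studio Code (VS Code), create a new file (*.py), and copy and paste lines 1-567. Here the Python program is run cell by cell, as in Jupyter Notebook. In VS Code, a cell is the code between one "#%%" and the next "#%%". Select a cell and press [Ctrl + Enter] to run its code. IPython starts and the "Interactive" window opens. You can also type Python code into the "Interactive" window and run it; for example, enter "df.info()" and press [Shift + Enter].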

* Article.py:

# Daily returns vs Log returns article (Part4)
# %%

### Import pandas, matplotlib, plotly libraries 
import os
import math
import numpy as np

import pandas as pd

import matplotlib.pyplot as plt 
import matplotlib.dates as mdates

import plotly.offline as offline
import plotly.express as px
import plotly.graph_objs as go

import datetime as dt
from datetime import timedelta
from time import sleep

import yfinance as yf           
from yahooquery import Screener # pip install yahooquery 

import warnings
warnings.simplefilter('ignore')
plt.style.use('fivethirtyeight')
pd.set_option('display.max_rows', 10)


# %%

######################################################################################################################################
def load_data(symbol: str, start_date: dt.datetime , end_date: dt.datetime, period='1d', interval='1d', prepost=True) -> pd.DataFrame:
    # valid periods: 1d,5d,1mo,3mo,6mo,1y,2y,5y,10y,ytd,max
    # fetch data by interval (including intraday if period < 60 days)
    # valid intervals: 1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo    
    try:
        end_date = end_date + timedelta(days=1)
        start_date_str = dt.datetime.strftime(start_date, "%Y-%m-%d")
        end_date_str = dt.datetime.strftime(end_date, "%Y-%m-%d")
        print(f"Loading data for {symbol}: start_date={start_date_str}, end_date={end_date_str}, {period=}, {interval=}")
        df = yf.download(symbol, start=start_date_str, end=end_date_str, period=period, interval=interval, prepost=prepost)
        # Date     Open          High           Low         Close     Adj Close       Volume   Symbol : interval=1d,5d,1wk,1mo,3mo
        # Datetime Open          High           Low         Close     Adj Close       Volume   Symbol : interval=1m,2m,5m,15m,30m,60m,90m,1h

        # Add symbol
        df['symbol'] = symbol 

        # Reset index
        df.reset_index(inplace=True) 

        # Intraday intervals use a 'Datetime' column; rename it to 'Date'
        # (daily and longer intervals already use 'Date')
        if interval in ('1m', '2m', '5m', '15m', '30m', '60m', '90m', '1h'):
            df.rename(columns={'Datetime': 'Date'}, inplace=True)

        # Convert column names to lower case    
        df.columns = df.columns.str.lower()
        return df
    except Exception as e:
        print(f"load_data({symbol}) exception error: {str(e)}")
        return pd.DataFrame()

############################################
def get_data(csv_file: str) -> pd.DataFrame:
    print(f"Loading data: {csv_file} ")
    df = pd.read_csv(csv_file)       
    # date,open,high,low,close,adj close,volume,symbol
    df['date'] = pd.to_datetime(df['date'])            
    df.set_index(['date'], inplace=True)
    return df   

###############################################################
def calculate_cum_log_return(df: pd.DataFrame) -> pd.DataFrame:
    # Calculate log return
    df['log_return'] = np.log(df['close'] / df['close'].shift(1))  
 
    # Calculate cumulative log return
    df['cum_log_return'] = np.exp(df['log_return'].cumsum()) - 1
    df['cum_log_return_pct'] = df['cum_log_return'] * 100

    # Print the final cumulative log return (':.2%' already multiplies by 100, so use the fractional column)
    print(f"Cumulative Log Return for {df.iloc[-1]['symbol']} = {df.iloc[-1]['cum_log_return']:.2%}")     
    return df

##############################
# Main
##############################

### Load crypto symbols from yahoo finance
s = Screener()
# s.available_screeners

data = s.get_screeners('all_cryptocurrencies_us', count=250)    # max=250 

# data is in the quotes key
dicts = data['all_cryptocurrencies_us']['quotes']
symbols = [d['symbol'] for d in dicts]
print(f"{len(symbols)=}") # 250 coins
# symbols


# %%

### Load the crypto data from yahoo finance
interval = '1d' # 1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo

symbol_list = []
cum_log_return_list = []
cum_log_return_pct_list = []

for symbol in list(symbols):  # iterate over a copy because symbols may be modified inside the loop
    csv_file = f"data/csv/all_cryptocurrencies({symbol})_{interval}.csv"  # data/csv/all_cryptocurrencies(BTC_USD)_1d.csv
    isFile = os.path.isfile(csv_file)
    if not isFile:    
        if interval in ('1m', '2m', '5m', '15m', '30m', '60m', '90m', '1h'):
            end = dt.datetime.now()          
            start = end - timedelta(days=7)      
        else: # interval=1d,5d,1wk,1mo,3mo  
            start = dt.datetime(2020,1,1)   # 2014,1,1 or 2020,1,1 or 2023,1,1
            end = dt.datetime.now()         
        df = load_data(symbol, start, end, period='1d', interval=interval)
        if not df.empty:    
            os.makedirs(os.path.dirname(csv_file), exist_ok=True)  # create data/csv if it does not exist yet
            df.to_csv(csv_file, index=False)
        else:
            symbols.remove(symbol)    
    # end of if not isFile:

    isFile = os.path.isfile(csv_file)    
    if isFile:
        df = get_data(csv_file)
        print(f"{csv_file=}, {df.shape}")            
        if not df.empty:
            df = calculate_cum_log_return(df)
            # NaN/inf values are left in place here; they are detected and removed from raw_df in the cells below
            cum_log_return = df.iloc[-1]['cum_log_return']
            cum_log_return_pct = df.iloc[-1]['cum_log_return_pct']
            symbol_list.append(symbol)
            cum_log_return_list.append(cum_log_return)
            cum_log_return_pct_list.append(cum_log_return_pct)
# end of for symbol in symbols:

# print(symbol_list)
# print(cum_log_return_list)
# print(cum_log_return_pct_list)


# %%

### Create DataFrame from dict
data = {
    'symbol': symbol_list,
    'cum_log_return': cum_log_return_list,
    'cum_log_return_pct': cum_log_return_pct_list
}

raw_df = pd.DataFrame(data)
if raw_df.empty:
    print(f"Quit the program due to raw_df is empty: {raw_df.empty=}")
    quit()


# %%

df = raw_df.copy()
df = df.drop(columns=['symbol'])

# Printing the count of not a number values 
c = int(np.isnan(df).values.sum())
print("It contains " + str(c) + " not a number values")

# Printing the count of infinity values 
c = int(np.isinf(df).values.sum())
print("It contains " + str(c) + " infinite values")


# %%

# Printing column name where not a number is present 
print('-'*60)
print("Printing column name where not a number is present")
col_name = df.columns.to_series()[np.isnan(df).any()]
print(col_name)

# Printing column name where infinity is present 
print('-'*60)
print("Printing column name where infinity is present")
col_name = df.columns.to_series()[np.isinf(df).any()]
print(col_name)


# %%

# Printing row index with not a number  
print("Printing row index with not a number ")

r = df.index[np.isnan(df).any(axis=1)]  # axis must be passed by keyword in pandas 2.x
print(r)    # Int64Index([115, 168, 193, 220], dtype='int64')    
# print('-'*50)
# print(raw_df.loc[115])
# print('-'*50)
# print(raw_df.loc[168])
# print('-'*50)
# print(raw_df.loc[193])
# print('-'*50)
# print(raw_df.loc[220])


# %%

# Printing row index with infinity  
print("Printing row index with infinity ")

r = df.index[np.isinf(df).any(axis=1)]  # axis must be passed by keyword in pandas 2.x
print(r)    # Int64Index([13, 133], dtype='int64')
# print('-'*50)
# print(raw_df.loc[13])
# print('-'*50)
# print(raw_df.loc[133])


# %%

### Replace np.inf or -np.inf (positive or negative infinity) with np.nan(Not A Number)
df = df.replace([np.inf, -np.inf], np.nan)
df.isnull().sum() 


# %%

### Drop rows if np.nan (Not A Number)
df.dropna(axis=0, inplace=True)
df.isnull().sum() 


# %%

df = raw_df.copy()
df = df.replace([np.inf, -np.inf], np.nan)
df.dropna(axis=0, inplace=True)
raw2_df = df.copy()

### Print Top or Bottom 10 Cryptocurrencies by Cumulative Log Return : Reset index and add 1 to each index
best_df = df.nlargest(10, 'cum_log_return')
worst_df = df.nsmallest(10, 'cum_log_return')
best_df.reset_index(drop=True, inplace=True) 
worst_df.reset_index(drop=True, inplace=True) 

# Add 1 to each index
best_df.index = best_df.index + 1
worst_df.index = worst_df.index + 1

print('Top 10 Cryptocurrencies by Cumulative Log Return')
print('-'*60)
print(best_df)
print()
print('Bottom 10 Cryptocurrencies by Cumulative Log Return')
print('-'*60)
print(worst_df)


# %%

### Print Top or Bottom 10 Cryptocurrencies by Cumulative Log Return : df.iterrows() 
best_df = df.nlargest(10, 'cum_log_return_pct')
worst_df = df.nsmallest(10, 'cum_log_return_pct')
best_df.reset_index(inplace=True) 
worst_df.reset_index(inplace=True) 

# Add 1 to each index
best_df.index = best_df.index + 1
worst_df.index = worst_df.index + 1

print('Top 10 Cryptocurrencies by Cumulative Log Return (%)')
print('-'*60)
for i, row in best_df.iterrows():    
    print(f"{i}: {row['symbol']} \t cumulative log return: {row['cum_log_return']:.2%}")

print()
print('Bottom 10 Cryptocurrencies by Cumulative Log Return (%)')
print('-'*60)
for i, row in worst_df.iterrows():    
    print(f"{i}: {row['symbol']} \t cumulative log return: {row['cum_log_return']:.2%}")


# %%

### Find the specific coin's rank
best_df = df.nlargest(df.shape[0], 'cum_log_return')
worst_df = df.nsmallest(df.shape[0], 'cum_log_return')

best_df.reset_index(inplace=True) 
worst_df.reset_index(inplace=True) 

coin = 'BTC-USD' # BTC-USD, ETH-USD, LTC-USD, MATIC-USD
find_coin = best_df['symbol'] == coin
found_df = best_df[find_coin]
if found_df.shape[0]:
    print(f"The {coin} coin is ranked #{found_df.index.values[0] + 1}.")


# %%

### Plot 10 best crypto : Matplotlib version

# Sort cumulative log return in ascending order and keep the last 10 rows (the best 10)
best_df = df.sort_values('cum_log_return_pct', ascending=True)   
best10_df = best_df.tail(10)
best10_df.reset_index(inplace=True) 

# Draw bars using dataframe.iterrows()
plt.figure(figsize=(10,6))
plt.title('Top 10 Cryptocurrencies by Cumulative Log Return (%)') 

for _, row in best10_df.iterrows():
    bars = plt.barh(row['symbol'], row['cum_log_return_pct'])
    plt.bar_label(bars)

plt.grid(False)   
plt.xlabel('Cumulative Log Returns (%)')
plt.ylabel('Symbol')    
plt.tight_layout()
plt.show()


# %%


# Sort cumulative log return in ascending order and keep the last 10 rows (the best 10)
best_df = df.sort_values('cum_log_return_pct', ascending=True)   
best10_df = best_df.tail(10)
best10_df.reset_index(inplace=True) 

# Draw bars with colors
plt.figure(figsize=(10,6))
plt.title('Top 10 Cryptocurrencies by Cumulative Log Return (%)') 

colors = ['tab:gray', 'tab:orange', 'tab:brown', 'tab:green', 'tab:blue', 
          'tab:purple', 'tab:pink', 'tab:olive', 'tab:cyan', 'tab:red']

bars = plt.barh(best10_df['symbol'], best10_df['cum_log_return_pct'], color=colors)
plt.bar_label(bars, fmt='%.2f%%')

plt.grid(False)   
plt.xlabel('Cumulative Log Returns (%)')
plt.ylabel('Symbol')    
plt.tight_layout()
plt.show()


# %%


### Plot worst 10 crypto : Matplotlib version

# Sort cumulative log return in descending order
worst_df = df.sort_values('cum_log_return_pct', ascending=False)   
worst10_df = worst_df.tail(10)
worst10_df.reset_index(inplace=True) 
worst10_df['abs_cum_log_return_pct'] = worst10_df['cum_log_return_pct'].apply(lambda x: abs(x))
worst10_df = worst10_df.sort_values('abs_cum_log_return_pct', ascending=False)   
worst10_df.reset_index(inplace=True) 

# Draw bars using dataframe.iterrows()
plt.figure(figsize=(16,8)) 
plt.title('Bottom 10 Cryptocurrencies by Cumulative Log Return (%)') 

for _, row in worst10_df.iterrows():
    bars = plt.barh(row['symbol'], -row['abs_cum_log_return_pct'])
    plt.bar_label(bars)
    plt.grid(False)   

plt.xlabel('Cumulative Log Returns (%)')
plt.ylabel('Symbol')    
plt.tight_layout()
plt.show()


# %%

# Sort cumulative log return in descending order
worst_df = df.sort_values('cum_log_return_pct', ascending=False)   
worst10_df = worst_df.tail(10)
worst10_df.reset_index(inplace=True) 
worst10_df['abs_cum_log_return_pct'] = worst10_df['cum_log_return_pct'].apply(lambda x: abs(x))
worst10_df = worst10_df.sort_values('abs_cum_log_return_pct', ascending=False)   
worst10_df.reset_index(inplace=True) 

# Draw the bars as negative values (add minus '-')
plt.figure(figsize=(16,8))
plt.title('Bottom 10 Cryptocurrencies by Cumulative Log Return (%)') 

bars = plt.barh(worst10_df['symbol'], -worst10_df['abs_cum_log_return_pct'])
plt.bar_label(bars)

plt.grid(False)   
plt.xlabel('Cumulative Log Returns (%)')
plt.ylabel('Symbol')    
plt.tight_layout()
plt.show()


# %%

worst_df = df.sort_values('cum_log_return_pct', ascending=False)   
worst10_df = worst_df.tail(10)
worst10_df.reset_index(inplace=True) 
worst10_df['abs_cum_log_return_pct'] = worst10_df['cum_log_return_pct'].apply(lambda x: abs(x))
worst10_df = worst10_df.sort_values('abs_cum_log_return_pct', ascending=False)   
worst10_df.reset_index(inplace=True) 

# Draw bars with colors
plt.figure(figsize=(16,8))
plt.title('Bottom 10 Cryptocurrencies by Cumulative Log Return (%)') 

colors = ['tab:red', 'tab:orange', 'tab:brown', 'tab:green', 'tab:blue', 
          'tab:purple', 'tab:pink', 'tab:olive', 'tab:cyan', 'tab:gray']

bars = plt.barh(worst10_df['symbol'], -worst10_df['abs_cum_log_return_pct'], color=colors)
labels = [f"{x:.2f}%" if x >= 0 else f"-{-x:.2f}%" for x in worst10_df['cum_log_return_pct']]
plt.bar_label(bars, labels=labels)

plt.grid(False)   
plt.xlabel('Cumulative Log Returns (%)')
plt.ylabel('Symbol')    
plt.tight_layout()
plt.show()


# %%

### Plot best 10 crypto : Plotly Express version

# Sort cumulative log return in ascending order and keep the last 10 rows (the best 10)
best_df = df.sort_values('cum_log_return', ascending=True)
best10_df = best_df.tail(10)
best10_df.reset_index(inplace=True)

fig = px.bar(
    best10_df, 
    x='cum_log_return', y='symbol', 
    orientation='h'
)

fig.update_layout(
    title='Top 10 Cryptocurrencies by Cumulative Log Return (%)',
    xaxis_title='Cumulative Log Returns (%)',
    yaxis_title='Symbol',
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False) # width=1000  
)

fig.update_traces(
    textposition='outside', 
    texttemplate='%{x:.2%}'
)

fig.show()


# %%

### Plot best 10 crypto : Plotly Express with color version

# Sort cumulative log return in ascending order and keep the last 10 rows (the best 10)
best_df = df.sort_values('cum_log_return', ascending=True)
best10_df = best_df.tail(10) # best_df.iloc[-10:-3] -3 exclusive
best10_df.reset_index(inplace=True)

fig = px.bar(
    best10_df, 
    x='cum_log_return', y='symbol', 
    orientation='h', 
    color='cum_log_return', color_continuous_scale='viridis'
)

fig.update_layout(
    title='Top 10 Cryptocurrencies by Cumulative Log Return (%)',
    xaxis_title='Cumulative Log Returns (%)',
    yaxis_title='Symbol',
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False),
    width=1000
)

fig.update_traces(
    textposition='outside', 
    texttemplate='%{x:.2%}',
    marker=dict(line=dict(width=1, color='black'))
)

fig.show()


# %%

# Sort by value of cum_log_return
worst_df = df.sort_values('cum_log_return', ascending=False)
worst10_df = worst_df.tail(10) 
worst10_df.reset_index(drop=True, inplace=True)
# Sort by absolute value of cum_log_return
worst10_df['abs_cum_log_return'] = worst10_df['cum_log_return'].apply(lambda x: abs(x))
# worst10_df = worst10_df.assign(abs_cum_log_return=worst10_df['cum_log_return'].abs())
worst10_df = worst10_df.sort_values('abs_cum_log_return', ascending=False)
worst10_df.reset_index(drop=True, inplace=True)

fig = px.bar(
    worst10_df, 
    x='cum_log_return', y='symbol', 
    orientation='h', 
    color='cum_log_return', color_continuous_scale='viridis'
)

fig.update_layout(
    title='Bottom 10 Cryptocurrencies by Cumulative Log Return (%)',
    xaxis_title='Cumulative Log Returns (%)',
    yaxis_title='Symbol',
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False)    
)

fig.update_traces(
    textposition='outside', 
    texttemplate='%{x:.2%}',
    marker=dict(line=dict(width=1, color='black'))    
)

fig.show()

# %%

# Sort by value of cum_log_return
worst_df = df.sort_values('cum_log_return', ascending=False)
worst10_df = worst_df.tail(10) 
worst10_df.reset_index(drop=True, inplace=True)
# Sort by absolute value of cum_log_return
worst10_df['abs_cum_log_return'] = worst10_df['cum_log_return'].apply(lambda x: abs(x))
# worst10_df = worst10_df.assign(abs_cum_log_return=worst10_df['cum_log_return'].abs())
worst10_df = worst10_df.sort_values('abs_cum_log_return', ascending=False)
worst10_df.reset_index(drop=True, inplace=True)

fig = px.bar(
    worst10_df, 
    x='cum_log_return', y='symbol', 
    orientation='h', 
    color='cum_log_return', color_continuous_scale='viridis'
)

fig.update_layout(
    title='Bottom 10 Cryptocurrencies by Cumulative Log Return (%)',
    xaxis_title='Cumulative Log Returns (%)',
    yaxis_title='Symbol',
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False),
    width=1000
)

fig.update_traces(
    textposition='outside', 
    texttemplate='%{x:.2%}',
    marker=dict(line=dict(width=1, color='black')),
    textfont_size=10
)

fig.show()

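Figure 1 shows the VS Code window. In the next steps, select a cell and run the Python code cell by cell.

Importing the Python libraries
In VS Code, click the cell at lines 5-28 and press [Ctrl + Enter] to run it. IPython starts and the results appear in the "Interactive" window. Python 3.10.9 and IPython 8.9.0 are used here.

Figure 2

Figure 2 imports the various Python libraries. Line 26 suppresses Python warning messages. Line 27 sets the default Matplotlib style. Line 28 limits the display of a Pandas DataFrame to a maximum of 10 rows.

Downloading the symbols of 250 cryptocurrencies (coins) from Yahoo! Finance (US)
Select the cell at lines 34-101 and press [Ctrl + Enter] to run it. This calls the "get_screeners()" method of the Screener class to fetch the symbols of 250 coins. The symbols are stored in the list variable "symbols".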

Figure 3

Figure 3 shows the length and contents of the variable "symbols". The length is "250", so 250 coin symbols are stored. Because "all_cryptocurrencies_us" is passed to the get_screeners() method, prices are in US dollars. The "USD" in "BTC-USD, ETH-USD, ..." stands for "United States Dollar". The symbol "BTC-USD" denotes the price of Bitcoin in US dollars, and "ETH-USD" denotes the price of Ethereum in US dollars.

Downloading daily data for all cryptocurrencies for "2020/1/1 to 2023/2/20" and calculating cumulative returns
Select the cell at lines 108-142 and press [Ctrl + Enter] to run it. This downloads price data for the 250 cryptocurrencies (coins) from Yahoo! Finance (US) and stores it in Pandas DataFrames. It then calculates each coin's log return and cumulative log return and saves them in the DataFrame. Finally, the symbol and the cumulative log return (cum_log_return, cum_log_return_pct) are taken from the last row of the DataFrame and appended to list variables.

Figure 4

Figure 4 shows the CSV file name and the row count for each downloaded coin. Each coin has "1143" rows. The data covers the range "2020/1/1 to 2023/2/20".

Creating a new DataFrame (symbol, cum_log_return, cum_log_return_pct)
Select the cell at lines 153-162 and press [Ctrl + Enter] to run it. This creates a new DataFrame (raw_df) with the Pandas "DataFrame()" constructor. The DataFrame consists of the columns "symbol, cum_log_return, cum_log_return_pct", which hold each coin's symbol, cumulative log return, and cumulative log return (%).

Figure 5

Figure 5 shows the structure and contents of this DataFrame.

Checking the DataFrame columns for invalid values (isnan, isinf)
Select the cell at lines 167-176 and press [Ctrl + Enter] to run it. This checks the DataFrame for invalid values (NaN, inf). "np.isnan()" returns True for each element of the DataFrame that is NaN (Not a Number) and False otherwise. "np.isinf()" returns True for each element that is positive infinity or negative infinity and False otherwise.

Figure 6-1

Figure 6-1 shows the results of "np.isnan()" and "np.isinf()". This DataFrame contains 8 "Not A Number" values and 4 "Infinity" values.
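The counting step can be reproduced on a small frame. This is a sketch with made-up values; the zero price is an assumption used only to show one way an infinite value can arise (dividing by a zero close), and the article does not state the actual cause in its data.

```python
import numpy as np
import pandas as pd

# Made-up prices; the 0.0 stands in for a missing or bad quote (an assumption)
close = pd.Series([0.0, 0.5, 1.0])
log_return = np.log(close / close.shift(1))   # 0.5 / 0.0 -> inf, log(inf) -> inf
cum_log_return = np.exp(log_return.cumsum()) - 1

df = pd.DataFrame({
    "cum_log_return": [0.25, np.nan, cum_log_return.iloc[-1], -0.40],
})
df["cum_log_return_pct"] = df["cum_log_return"] * 100

# Grand totals across all columns, as in the article's cell
nan_count = int(np.isnan(df).values.sum())
inf_count = int(np.isinf(df).values.sum())
print(nan_count, inf_count)

# Per-column counts are often easier to read than one grand total
print(df.isna().sum())
print(np.isinf(df).sum())
```

Each invalid row is counted once per column, which is why one NaN row and one inf row give totals of 2 and 2 here.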

Figure 6-2

Figure 6-2 shows which columns contain the invalid values. Both "Not A Number" and "Infinity" values occur in the columns "cum_log_return" and "cum_log_return_pct".

Figure 6-3

Figure 6-3 shows the index labels of the rows that contain "Not A Number": "115, 168, 193, 220". Passing an index label to the DataFrame "loc[]" indexer displays the contents of that row. Here the coin symbol "FLOKI-USD" and the invalid value (NaN: Not A Number) are shown.

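Figure 6-4

Figure 6-4 shows the index labels of the rows that contain "Infinity": "13, 133". Passing an index label to the DataFrame "loc[]" indexer displays the contents of that row. Here the coin symbol "SHIB-USD" and the invalid value (inf: Infinity) are shown.

Figure 6-5

Figure 6-5 shows the result of replacing "np.inf, -np.inf" with "np.nan" using the DataFrame "replace()" method. The columns "cum_log_return" and "cum_log_return_pct" now each contain 6 invalid values (NaN).

Note: the DataFrame "dropna()" method does not remove "np.inf, -np.inf", so they are replaced with "np.nan" here. After the replacement, "dropna()" removes them as well.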

Figure 6-6

Figure 6-6 shows the result of deleting the rows that contain invalid values (NaN) with the DataFrame "dropna()" method. "df.isnull().sum()" now reports 0, which confirms that the rows have been deleted.
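The effect of replacing infinities before calling "dropna()" can be seen on a four-row example. The symbols and values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Symbols and values are made up for illustration only
raw_df = pd.DataFrame({
    "symbol": ["AAA-USD", "BBB-USD", "CCC-USD", "DDD-USD"],
    "cum_log_return": [0.25, np.inf, np.nan, -0.40],
})

# dropna() alone removes the NaN row but keeps the inf row
only_dropna = raw_df.dropna(axis=0)

# Replacing +/-inf with NaN first lets dropna() remove both kinds of invalid rows
clean_df = raw_df.replace([np.inf, -np.inf], np.nan).dropna(axis=0)

print(only_dropna["symbol"].tolist())
print(clean_df["symbol"].tolist())
```

Only the cleaned frame is safe to rank, because an infinite value would otherwise always sort to the top or bottom.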
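
Displaying the best 10 / worst 10 cryptocurrencies with the DataFrame "nlargest()/nsmallest()" methods [df]
Select the cell at lines 230-261 and press [Ctrl + Enter] to run it. This orders the coin data in the DataFrame by "cum_log_return" in descending and ascending order and displays the top 10 and bottom 10. To get rows in descending order, use the DataFrame "nlargest()" method: the first argument is the number of rows and the second is the column name, here "cum_log_return" (cumulative log return). To get rows in ascending order, use the "nsmallest()" method with the same arguments.

Figure 7-1

Figure 7-1 shows the contents of the DataFrames. The upper part shows the top 10 coins (cumulative log return, descending). The No. 1 coin is "COCOS", with a cumulative log return of no less than "425033.57%". COCOS is a token used for purposes such as trading in-game virtual items and currencies, rewarding developers, and paying network fees. The No. 2 coin is "OP", the token of Optimism, with a cumulative log return of "313957.99%". The lower part shows the bottom 10 coins (cumulative log return, ascending).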

Figure 7-2

In Figure 7-2, the DataFrame column "cum_log_return" is converted from float to str and displayed formatted as a percentage. The details are covered in Part 6 of this series.
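One possible way to produce such a percent-formatted column is shown below. This is an assumption about the approach, not the code from Part 6; the symbols and values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "symbol": ["AAA-USD", "BBB-USD"],       # made-up symbols
    "cum_log_return": [12.3456, -0.9871],   # fractions, not percentages
})

# Work on a copy so the numeric column stays available for sorting and plotting
display_df = df.copy()
display_df["cum_log_return"] = display_df["cum_log_return"].map("{:.2%}".format)

print(display_df)
```

After the conversion the column holds strings, so any sorting or ranking should be done on the original numeric frame first.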

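Figure 7-3

In Figure 7-3, "background_gradient()" is applied to the DataFrame's style, here to the column "cum_log_return_pct". The background color of each cell is then shaded according to the cumulative log return. The details are covered in Part 6 of this series.

Displaying the best 10 / worst 10 cryptocurrencies in a custom format with the DataFrame "nlargest()/nsmallest()" methods [df.iterrows()]
Select the cell at lines 267-285 and press [Ctrl + Enter] to run it. This uses the DataFrame "iterrows()" method to fetch records (data) from the DataFrame row by row and print them. Fetching the data row by row lets you customize freely how each coin's rank is displayed.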

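Figure 8

In Figure 8, the data is fetched from the DataFrame row by row and the ranks are printed with formatting. Note that writing "{row['cum_log_return']:.2%}" multiplies the cumulative log return by 100 and appends "%", so there is no need to add a "cum_log_return_pct" column to the DataFrame.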
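The row-by-row formatting, and the scaling behavior of the ".2%" format, can be sketched as follows. The symbols and values are made up, and the last two lines illustrate a pitfall rather than anything shown in the figure.

```python
import pandas as pd

# Symbols and values are made up for illustration only
df = pd.DataFrame({
    "symbol": ["AAA-USD", "BBB-USD", "CCC-USD"],
    "cum_log_return": [0.5, 4.2503, -0.25],   # fractions, not percentages
})

best_df = df.nlargest(3, "cum_log_return").reset_index(drop=True)
best_df.index = best_df.index + 1   # ranks start at 1

lines = [
    f"{rank}: {row['symbol']}\tcumulative log return: {row['cum_log_return']:.2%}"
    for rank, row in best_df.iterrows()
]
print("\n".join(lines))

# Pitfall: applying ".2%" to a value that is already a percentage scales it twice
pct = 4.2503 * 100
print(f"{pct:.2f}%")   # 425.03%
print(f"{pct:.2%}")    # 42503.00%
```

In short, use ".2%" with the fractional column and ".2f" plus a literal "%" with the percentage column.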
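
Looking up the rank of a specific cryptocurrency
Select the cell at lines 291-301 and press [Ctrl + Enter] to run it. This displays the rank of a coin that did not make the best 10 or worst 10.

Figure 9

Figure 9 shows the rank of "BTC-USD", a coin outside the top and bottom 10. Bitcoin (BTC-USD) is ranked "97". The DataFrame "shape[0]" holds the number of rows (records) in the DataFrame; it is "1" when the coin is found. The DataFrame "index" holds index numbers starting at "0", so "+1" is added.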
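As an alternative to sorting the whole frame and reading the position from the index, the rank can be computed as a column. This sketch swaps in the Pandas "rank()" method, which the article's listing does not use; the returns are made-up values.

```python
import pandas as pd

# Made-up returns for illustration only
df = pd.DataFrame({
    "symbol": ["AAA-USD", "BTC-USD", "CCC-USD", "DDD-USD"],
    "cum_log_return": [3.0, 1.5, 2.0, -0.5],
})

# Rank 1 = highest cumulative return; method="min" gives tied coins the same (best) rank
df["rank"] = df["cum_log_return"].rank(ascending=False, method="min").astype(int)

coin = "BTC-USD"
match = df.loc[df["symbol"] == coin, "rank"]
if not match.empty:
    print(f"{coin} is ranked {match.iloc[0]} of {len(df)}")
```

Keeping the rank as a column makes it easy to look up several coins without re-sorting.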

Plotting the best 10 cryptocurrencies [Matplotlib]
Select the cell at lines 309-325 and press [Ctrl + Enter] to run it. This uses Matplotlib to plot the best 10 ranking as a horizontal bar chart. The coins are ordered with the DataFrame "sort_values()" method: the DataFrame is sorted by cumulative log return (%) in ascending order and the last 10 rows are taken, so the DataFrame (best10_df) holds the best 10 coins.

Figure 10-1

Figure 10-1 shows the top 10 coins as a horizontal bar chart. Calling "plt.barh()" once per coin gives each bar a different color. Use the "plt.bar_label()" method to show the cumulative log return on each bar. Because the "cum_log_return_pct" column is used here, the values are percentages (%).

Figure 10-2

In Figure 10-2, Matplotlib's "plt.barh()" method is called only once with the DataFrame columns. In that case all bars have the same color. To give each bar a different color, add the "color" argument to "plt.barh()".

Here the cumulative log return values are additionally formatted with "fmt='%.2f%%'". To use the column "cum_log_return" instead of "cum_log_return_pct", rewrite "plt.bar_label(bars, fmt='%.2f%%')" as "plt.bar_label(bars, labels=[f'{x:.2%}' for x in best10_df['cum_log_return']])".
```
# Using the percentage column:
bars = plt.barh(best10_df['symbol'], best10_df['cum_log_return_pct'], color=colors)
plt.bar_label(bars, fmt='%.2f%%')

# Equivalent, using the fractional column:
bars = plt.barh(best10_df['symbol'], best10_df['cum_log_return'], color=colors)
plt.bar_label(bars, labels=[f'{x:.2%}' for x in best10_df['cum_log_return']])
```
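A self-contained version of the second variant is sketched below. The data is made up, and the "Agg" backend and in-memory PNG are assumptions added so that the sketch runs without a display; in VS Code's interactive window "plt.show()" would be used instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

best_df = pd.DataFrame({
    "symbol": ["AAA-USD", "BBB-USD", "CCC-USD"],   # made-up symbols and values
    "cum_log_return": [0.50, 1.25, 3.10],          # already sorted ascending
})

fig, ax = plt.subplots(figsize=(8, 3))
bars = ax.barh(best_df["symbol"], best_df["cum_log_return"],
               color=["tab:gray", "tab:orange", "tab:red"])

# labels= takes one preformatted string per bar
labels = [f"{x:.2%}" for x in best_df["cum_log_return"]]
texts = ax.bar_label(bars, labels=labels)

ax.set_xlabel("Cumulative Log Return")
fig.tight_layout()

buffer = io.BytesIO()
fig.savefig(buffer, format="png")  # render to memory instead of opening a window
plt.close(fig)
```

Because the last row is drawn at the top of a horizontal bar chart, sorting ascending places the best coin in the top bar.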
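
Plotting the worst 10 cryptocurrencies [Matplotlib]
Select the cell at lines 359-378 and press [Ctrl + Enter] to run it. This uses Matplotlib to plot the worst 10 ranking as a horizontal bar chart. Because the cumulative log returns are negative here, they are converted to absolute values with Python's "abs()" function before sorting.

The coins are ordered with the DataFrame "sort_values()" method. The DataFrame is sorted by cumulative log return (%) in descending order and the last 10 rows are taken, so the DataFrame (worst10_df) holds the worst 10 coins. Next, so that the chart shows the coins from the lowest rank upward, the cumulative log return (%) values are converted to absolute values with "abs()" and the rows are sorted by this value, "abs_cum_log_return_pct", in descending order. The chart then shows the coins from the lowest rank to the highest.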
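The ordering step can be checked on a small example. The sketch below uses made-up data and compares the absolute-value approach described above with a plain ascending selection; the two agree only under the assumption that every selected value is negative.

```python
import pandas as pd

# Made-up data for illustration only
df = pd.DataFrame({
    "symbol": ["AAA-USD", "BBB-USD", "CCC-USD", "DDD-USD"],
    "cum_log_return_pct": [-99.5, -42.0, 15.0, -87.3],
})

# Approach described above: take the 3 worst, then order them by absolute value
worst_df = df.sort_values("cum_log_return_pct", ascending=False).tail(3)
by_abs = (
    worst_df.assign(abs_pct=worst_df["cum_log_return_pct"].abs())
    .sort_values("abs_pct", ascending=False)
)

# When every selected value is negative, a plain ascending selection gives the same order
by_value = df.nsmallest(3, "cum_log_return_pct")

print(by_abs["symbol"].tolist())
print(by_value["symbol"].tolist())
```

If the worst 10 ever included a positive return, sorting by absolute value would misplace it, so the ascending selection is the safer choice in that case.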

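Figure 11-1

Figure 11-1 shows the bottom 10 coins as a horizontal bar chart. For the worst 10, the bars are drawn from bottom to top, so the bottom bar is the lowest-ranked coin. Calling "plt.barh()" once per coin gives each bar a different color. Use the "plt.bar_label()" method to show the cumulative log return on each bar. Because the "cum_log_return_pct" column is used here, the values are percentages (%).

Figure 11-2

In Figure 11-2, Matplotlib's "barh()" method is called only once with the DataFrame columns. In that case all bars have the same color.

Figure 11-3

In Figure 11-3, each bar has a different color. In addition, the cumulative log return on each bar is formatted with "{x:.2f}%" and "-{-x:.2f}%".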
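
Plotting the best 10 cryptocurrencies [Plotly Express]
Select lines 437-460 and press [Ctrl + Enter] to run them. This uses Plotly Express to plot the best 10 coins as a horizontal bar chart.

Figure 12-1

Figure 12-1 shows the best 10 coins as a horizontal bar chart. The column "cum_log_return" is used here, but because texttemplate formats it as "'%{x:.2%}'", the values are displayed as percentages (%).

Figure 12-2

Figure 12-2 shows the best 10 coins as a horizontal bar chart with a color gradient: the color of each bar is shaded according to its cumulative log return.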
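
Plotting the worst 10 cryptocurrencies [Plotly Express]
Select the cell at lines 500-530 and press [Ctrl + Enter] to run it. This plots the worst 10 coins as a horizontal bar chart with Plotly Express.

Figure 13-1

Figure 13-1 shows the worst 10 coins as a horizontal bar chart.

Figure 13-2

In Figure 13-2, the "width" argument is added to the figure's "update_layout()" method to adjust the width so that all cumulative log return values are displayed. In addition, the "textfont_size" argument is added to the figure's "update_traces()" method to adjust the font size.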