Portfolio Analysis Using Python

Have you ever wondered whether an investment 🧐 in a stock is actually a good investment? Or thought of building an optimal portfolio based on an analysis of historical data?

Well, making money from the stock market is no walk in the park, and to gain an edge, people are now adopting the Data-Driven-Investing approach to make rational investment decisions. Whether you're just beginning with your investment journey or you already invest, it's the perfect time to start backing your investment decisions with data.

In this blog post, we will learn how to back your investment decisions with Data. We will see the implementation in Python.

Disclaimer: The material in this article is purely educational and should not be taken as professional investment advice. The premise of this article is not to show how to "GET RICH QUICKLY." The idea of this article is to get you started and to showcase the possibilities with Python.

What will be covered in this Blog

  • Fetching Data with nsepy library

  • Portfolio Analysis: Assessing Mean Daily Simple Returns & Standard Deviation of the same (Risk & Return)

  • Portfolio Performance: Cumulative Returns, Expected annual returns, Annual Volatility, Sharpe Ratio

Let's get started!

Time to Code!

1. Installing the required libraries

Open the terminal and activate the conda environment to install the following packages.

pip install matplotlib

pip install seaborn

pip install nsepy

2. Importing the libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
from datetime import date
from nsepy import get_history as gh
plt.style.use('fivethirtyeight') #setting matplotlib style

3. Defining Parameters

stocksymbols = ['TATAMOTORS','DABUR', 'ICICIBANK','WIPRO','BPCL','IRCTC','INFY','RELIANCE']
startdate = date(2019,10,14)
end_date = date.today()
print(end_date)
print(f"You have {len(stocksymbols)} assets in your portfolio")

Here, we've created a list of stocks for which we want to fetch data and analyze that data. We've defined the starting date, i.e., the date from which we want to fetch the data, and the end date as well, i.e., today.

4. Fetching Data

Now we'll iterate over the list of stocks, fetch the data for each one, and join the results into a single data frame.

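df = pd.DataFrame()
for i, symbol in enumerate(stocksymbols):
    # fetch only the Symbol and Close columns for this stock
    data = gh(symbol=symbol, start=startdate, end=end_date)[['Symbol', 'Close']]
    # rename Close to the ticker (use .iloc for positional access on a date index)
    data = data.rename(columns={'Close': data['Symbol'].iloc[0]})
    # the Symbol column is now redundant
    data = data.drop(columns=['Symbol'])
    # the first stock seeds the frame; the rest are joined on the date index
    df = data if i == 0 else df.join(data)
df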

We've fetched the data for two columns only, the Symbol and Close Price. While fetching the data, we renamed the Close Price Column with the Symbol/Ticker and then dropped the Symbol Column.
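
To see the rename-and-join idea in isolation, here is a minimal sketch that runs without any network access. The tickers AAA and BBB and their prices are made up, and pd.concat is used here as an alternative to the loop-and-join above, not as a description of what nsepy does.

```python
import pandas as pd

# Synthetic frames standing in for what a data source might return
# (hypothetical tickers and prices, for illustration only).
dates = pd.date_range("2021-01-04", periods=3, freq="D")
raw = {
    "AAA": pd.DataFrame({"Symbol": "AAA", "Close": [100.0, 102.0, 101.0]}, index=dates),
    "BBB": pd.DataFrame({"Symbol": "BBB", "Close": [50.0, 49.0, 51.0]}, index=dates),
}

# Keep only Close from each frame, label it with the ticker,
# and align all of them on the shared date index.
closes = pd.concat(
    {symbol: frame["Close"] for symbol, frame in raw.items()},
    axis=1,
)
print(closes)
```

The result has one row per date and one column per ticker, which is the same shape the analysis below expects.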

Output:

Screenshot 2021-07-05 at 11.46.18 AM.png

Now, with this dataset, we'll do a great deal of Portfolio Analysis.

5. Analysis

Plotting the Close Price history.

fig, ax = plt.subplots(figsize=(15,8))
for i in df.columns.values:
    ax.plot(df[i], label=i)
ax.set_title("Portfolio Close Price History")
ax.set_xlabel('Date', fontsize=18)
ax.set_ylabel('Close Price INR (₹)', fontsize=18)
ax.legend(df.columns.values, loc='upper left')
plt.show()  # plt.show() displays all open figures; it does not take a figure argument

Output:

Unknown.png

Correlation Matrix

A correlation coefficient is a statistical measure of the strength of the linear relationship between two variables. It ranges from -1 to 1: 1 indicates a perfect positive correlation, -1 a perfect negative correlation, and a value close to 0 indicates little or no linear association. A correlation matrix is a table in which each cell shows the correlation coefficient between a pair of variables.

The correlation matrix will tell us the strength of the relationship between the stocks in our portfolio, which essentially can be used for effective diversification.
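
As a small illustration of what the coefficient measures, the sketch below builds three made-up price series (the names UP, UP2 and DOWN are hypothetical) and compares pandas' result with the Pearson formula computed by hand.

```python
import pandas as pd

# Hypothetical prices for three made-up tickers.
prices = pd.DataFrame({
    "UP":   [10.0, 11.0, 12.0, 13.0, 14.0],
    "UP2":  [20.0, 22.0, 24.0, 26.0, 28.0],  # exactly 2 x UP
    "DOWN": [50.0, 48.0, 46.0, 44.0, 42.0],  # falls as UP rises
})

corr = prices.corr(method="pearson")

# Pearson correlation by hand for one pair: cov(x, y) / (std(x) * std(y)).
x, y = prices["UP"], prices["DOWN"]
manual = x.cov(y) / (x.std() * y.std())

print(corr.round(2))
print(round(manual, 2))
```

UP and UP2 move in lockstep, so their coefficient is 1, while UP and DOWN move in exactly opposite directions, giving -1. Note that trending price levels can look correlated simply because they trend, so many practitioners compute the matrix on returns instead; that is a design choice, not something this article prescribes.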

Code to determine correlation matrix:

correlation_matrix = df.corr(method='pearson')
correlation_matrix

Output:

Screenshot 2021-07-05 at 12.16.17 PM.png

Plotting the Correlation Matrix:

fig1 = plt.figure()
# seaborn's documented parameter for cell borders is `linewidths`
sb.heatmap(correlation_matrix, xticklabels=correlation_matrix.columns,
           yticklabels=correlation_matrix.columns,
           cmap='YlGnBu', annot=True, linewidths=0.5)
print('Correlation between Stocks in your portfolio')
plt.show()

Output:

Unknown.png

With this matrix, we can see that Wipro and Infosys are heavily correlated, which is logical as both companies belong to the same industry. It can also be seen that BPCL and IRCTC are negatively correlated. Hence, it is wise to have both in our portfolio for efficient diversification: if, for some reason, BPCL moves in one particular direction, let's say down, there is less chance of IRCTC also moving in the same direction.

Risk & Return

Daily Simple Returns:

To ascertain daily simple return, we'll write this code:

daily_simple_return = df.pct_change(1)
daily_simple_return.dropna(inplace=True)
daily_simple_return

Output:

Screenshot 2021-07-05 at 12.33.50 PM.png

The daily simple return is the percentage change in the closing price from one trading day to the next.
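
As a quick sanity check, the sketch below uses made-up prices to show that pct_change(1) is the same as dividing each price by the previous one and subtracting 1. The numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for three consecutive days.
prices = pd.Series([100.0, 110.0, 99.0], name="close")

# Simple return by hand: r_t = P_t / P_(t-1) - 1
manual = prices / prices.shift(1) - 1

# The same thing via pandas.
builtin = prices.pct_change(1)

# The first value is NaN in both because day one has no previous price.
print(builtin.dropna().round(4).tolist())
print(np.allclose(manual.dropna(), builtin.dropna()))
```

A move from 100 to 110 is a 10% gain, and the move from 110 down to 99 is a 10% loss.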

Visualizing Daily Simple Returns:

print('Daily simple returns')
fig, ax = plt.subplots(figsize=(15,8))

for i in daily_simple_return.columns.values:
    ax.plot(daily_simple_return[i], lw=2, label=i)

ax.legend(loc='upper right', fontsize=10)
ax.set_title('Volatility in Daily simple returns')
ax.set_xlabel('Date')
ax.set_ylabel('Daily simple returns')
plt.show()

Unknown.png

Based on the above graph, TATAMOTORS is, on a day-to-day basis, the most volatile of the individual stocks. DABUR looks to be the least volatile, with swings much smaller than those of any other stock.

Average Daily returns:

print('Average Daily returns(%) of stocks in your portfolio')
Avg_daily = daily_simple_return.mean()
print(Avg_daily*100)

Output:

Screenshot 2021-07-05 at 12.47.50 PM.png

Risk

Plotting Risk using Daily Returns:

daily_simple_return.plot(kind = "box",figsize = (20,10), title = "Risk Box Plot")

Output:

Unknown.png

The largest spread in the above box plot is for TATAMOTORS, which also had the highest average daily return, in line with the usual trade-off between risk and return. The smallest spread is for DABUR. It should be noted that although IRCTC does not have the largest spread, it does have more positive outliers, which translates into a higher average daily return.

Before moving forward, we need to calculate each stock's standard deviation using the ".std()" function, along with the annualized standard deviation:

print('Annualized Standard Deviation (Volatility(%), 252 trading days) of individual stocks in your portfolio on the basis of daily simple returns.')
print(daily_simple_return.std() * np.sqrt(252) * 100)

Output:

Screenshot 2021-07-05 at 1.04.07 PM.png
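
The square-root-of-252 scaling can be checked on a toy series. The sketch below assumes 252 trading days per year and that daily returns are uncorrelated with a constant variance, so that variance grows linearly with time and standard deviation with its square root. The return values are made up.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed number of trading days in a year

# Made-up daily simple returns.
daily_returns = pd.Series([0.010, -0.005, 0.020, 0.000, -0.010])

daily_vol = daily_returns.std()                  # sample standard deviation (ddof=1)
annual_vol = daily_vol * np.sqrt(TRADING_DAYS)   # annualized volatility

# Equivalent view: annual variance is 252 times the daily variance.
annual_var = daily_returns.var() * TRADING_DAYS

print(f"daily volatility:  {daily_vol:.4%}")
print(f"annual volatility: {annual_vol:.4%}")
```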

Return Per Unit Of Risk:

Avg_daily / (daily_simple_return.std() * np.sqrt(252)) *100

Output:

Screenshot 2021-07-05 at 1.09.00 PM.png

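The higher this ratio, the better. IRCTC has the best return-to-risk ratio and BPCL the lowest. Note that the code above divides a daily mean return by an annualized standard deviation, so the values are best read as a ranking rather than as annualized ratios. After subtracting a risk-free rate from the return, this ratio is called the Sharpe ratio, a measure of risk-adjusted return: it describes how much excess return you receive for the extra volatility of holding a riskier asset.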
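
For reference, one common way to compute an annualized Sharpe ratio is sketched below. The helper annualized_sharpe, the 5% risk-free rate and the sample returns are all hypothetical, chosen only to illustrate the formula; both the return and the volatility are annualized so that the ratio uses one consistent time horizon.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252        # assumed trading days per year
RISK_FREE_ANNUAL = 0.05   # hypothetical 5% annual risk-free rate, illustration only


def annualized_sharpe(daily_returns: pd.Series,
                      risk_free_annual: float = RISK_FREE_ANNUAL) -> float:
    """(annualized mean return - risk-free rate) / annualized volatility."""
    annual_return = daily_returns.mean() * TRADING_DAYS
    annual_vol = daily_returns.std() * np.sqrt(TRADING_DAYS)
    return (annual_return - risk_free_annual) / annual_vol


# Made-up daily simple returns.
sample = pd.Series([0.010, -0.005, 0.020, 0.000, -0.010])
sharpe = annualized_sharpe(sample)
print(round(sharpe, 2))
```

With a risk-free rate of 0 this reduces to the plain return-per-unit-of-risk ratio, scaled to annual terms.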

Cumulative Returns:

daily_cummulative_simple_return =(daily_simple_return+1).cumprod()
daily_cummulative_simple_return

Output:

Screenshot 2021-07-05 at 1.16.36 PM.png
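
To make the compounding step concrete, the sketch below uses made-up prices to show that the cumulative product of (1 + daily return) equals each price divided by the first price, i.e. the growth of one unit of currency invested at the first close.

```python
import numpy as np
import pandas as pd

# Made-up closing prices.
prices = pd.Series([100.0, 110.0, 99.0, 120.0])
daily = prices.pct_change().dropna()

# Growth of 1 unit invested at the first close.
growth = (1 + daily).cumprod()

# Compounding the daily returns recovers price / first price.
expected = (prices / prices.iloc[0]).iloc[1:]

print(growth.round(4).tolist())
print(np.allclose(growth.values, expected.values))
```

A final value above 1 means the investment grew over the period; a value below 1 means it finished in the red.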

# visualize the daily cumulative simple return
print('Cumulative Returns')
fig, ax = plt.subplots(figsize=(18,8))

for i in daily_cummulative_simple_return.columns.values:
    ax.plot(daily_cummulative_simple_return[i], lw=2, label=i)

ax.legend(loc='upper left', fontsize=10)
ax.set_title('Daily Cumulative Simple returns/growth of investment')
ax.set_xlabel('Date')
ax.set_ylabel('Growth of ₹ 1 investment')
plt.show()

Output:

Unknown.png

Based on the above graph, during the roughly 2-year stretch from 2019 to 2021, IRCTC performed the best and delivered the highest cumulative return; it started to separate itself in early 2020. It is followed by TATAMOTORS and, in third place, WIPRO. BPCL had the worst cumulative return over the period; in fact, it is the only one to finish in the red.
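
The outline at the top also lists expected annual return and annual volatility at the portfolio level. One common way to estimate them is sketched below. It assumes equal weights and 252 trading days, and it uses made-up returns for three hypothetical tickers, so treat it as an illustration rather than a result for the stocks above.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed trading days per year

# Made-up daily simple returns for three hypothetical tickers.
returns = pd.DataFrame({
    "AAA": [0.010, -0.005, 0.020, 0.000, -0.010],
    "BBB": [0.002, 0.004, -0.003, 0.001, 0.005],
    "CCC": [-0.015, 0.010, 0.005, -0.002, 0.012],
})

# Equal weights across all assets (an assumption, not an optimized allocation).
n_assets = returns.shape[1]
weights = np.full(n_assets, 1 / n_assets)

# Expected annual return: weighted average of mean daily returns, scaled to a year.
expected_annual_return = float(returns.mean().values @ weights) * TRADING_DAYS

# Annual volatility: sqrt(w' * Cov * w) using the annualized covariance matrix.
annual_cov = returns.cov().values * TRADING_DAYS
annual_volatility = float(np.sqrt(weights @ annual_cov @ weights))

print(f"expected annual return: {expected_annual_return:.2%}")
print(f"annual volatility:      {annual_volatility:.2%}")
```

Because the covariance matrix captures how the stocks move together, the portfolio's volatility is typically lower than the weighted average of the individual volatilities, which is the diversification effect the correlation matrix hinted at.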

6. Wrapping it up

And with that, it's a wrap!

A lot goes into constructing an optimal portfolio, and the topic is vast because it involves a great deal of theory. In the upcoming article, I will cover Portfolio Optimization with Python, so stay tuned :)

I hope you enjoyed this article!

You can also access the GitHub link here to view the entire code in one single file directly.

Thank you for reading. If you have made it this far, please like the article; it will encourage me to write more. Do share your suggestions; I would appreciate your honest feedback! 🙂

Please feel free to leave a comment and connect if you have any questions regarding this or require any further information. Consider subscribing to my mailing list for automatic updates on future articles. 📬

💡
Please note we haven't made any new posts since Nov 2021 on this blog, you are free to subscribe to the mailing list, however, you will be auto-added to the new blog's (thealtinvestor.in) mailing list as well.

I would love to connect with you over Mail, or you can also find me on Linkedin

If you liked this article, consider buying me a book 📖 by clicking here or the button below.
