
Have you ever wondered whether an investment 🧐 in a stock is actually a good one? Or thought of building an **Optimal Portfolio** by analyzing historical data?

Well, making money from the stock market is no walk in the park, and to gain an edge, people are now adopting a data-driven investing approach to make rational investment decisions. Whether you're just beginning your investment journey or you already invest, it's the perfect time to start backing your investment decisions with data.

**In this blog post, we will learn how to back your investment decisions with Data. We will see the implementation in Python.**

Disclaimer: The material in this article is purely educational and should not be taken as professional investment advice. The premise of this article is not to show how to "GET RICH QUICKLY." The idea of this article is to get you started and to showcase the possibilities with Python.

## What will be covered in this Blog

Fetching Data with nsepy library

Portfolio Analysis: Assessing Mean Daily Simple Returns & Standard Deviation of the same (Risk & Return)

Portfolio Performance: Cumulative Returns, Expected annual returns, Annual Volatility, Sharpe Ratio

*Let's get started!*

## Time to Code!

### 1. Installing the required libraries

Open the terminal and activate the conda environment to install the following packages.

`pip install matplotlib`

`pip install seaborn`

`pip install nsepy`

### 2. Importing the libraries

```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
from datetime import date
from nsepy import get_history as gh
plt.style.use('fivethirtyeight') #setting matplotlib style
```

### 3. Defining Parameters

```
stocksymbols = ['TATAMOTORS','DABUR', 'ICICIBANK','WIPRO','BPCL','IRCTC','INFY','RELIANCE']
startdate = date(2019,10,14)
end_date = date.today()
print(end_date)
print(f"You have {len(stocksymbols)} assets in your portfolio")
```

Here, we've created a list of stocks for which we want to fetch data and analyze that data. We've defined the starting date, i.e., the date from which we want to fetch the data, and the end date as well, i.e., today.

### 4. Fetching Data

Now, we'll be iterating over the list of stocks to fetch data one by one for every single stock and combine it towards the end to have it in one data frame.

```
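df = pd.DataFrame()
for i in range(len(stocksymbols)):
    # Fetch the history for one symbol, keeping only Symbol and Close
    data = gh(symbol=stocksymbols[i], start=startdate, end=end_date)[['Symbol', 'Close']]
    # Rename the Close column to the ticker, then drop the Symbol column
    data.rename(columns={'Close': data['Symbol'].iloc[0]}, inplace=True)
    data.drop(['Symbol'], axis=1, inplace=True)
    if i == 0:
        df = data  # the first stock seeds the frame
    else:
        df = df.join(data)  # later stocks are joined on the date index
df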
```

We've fetched the data for two columns only, the Symbol and Close Price. While fetching the data, we renamed the Close Price Column with the Symbol/Ticker and then dropped the Symbol Column.

**Output:**

Now, with this dataset, we'll do a great deal of Portfolio Analysis.
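As a side note, the loop-and-join pattern used above can also be written with `pd.concat`. The sketch below is only an illustration: it uses small made-up price series (the tickers `AAA` and `BBB` and all numbers are hypothetical) in place of the `get_history` calls, so it runs without a network connection.

```python
import pandas as pd

# Hypothetical close prices standing in for the per-symbol downloads.
dates = pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"])
closes = {
    "AAA": pd.Series([100.0, 102.0, 101.0], index=dates),
    "BBB": pd.Series([50.0, 49.5, 51.0], index=dates),
}

# Concatenating along axis=1 aligns the series on their date index
# and uses the dict keys as column names.
prices = pd.concat(closes, axis=1)
print(prices)
```

The result has the same shape as `df` above: one row per date and one column per ticker.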

### 5. Analysis

#### Plotting the Close Price history.

```
fig, ax = plt.subplots(figsize=(15,8))
# Draw one line per stock
for i in df.columns.values:
    ax.plot(df[i], label=i)
ax.set_title("Portfolio Close Price History")
ax.set_xlabel('Date', fontsize=18)
ax.set_ylabel('Close Price INR (₹)', fontsize=18)
ax.legend(df.columns.values, loc='upper left')
plt.show()
```

**Output:**

#### Correlation Matrix

A correlation coefficient is a statistical measure of the relationship between two variables. It varies from -1 to 1, with 1 or -1 indicating perfect positive or negative correlation. A value close to 0 indicates no linear **association** between the variables. A correlation matrix is a table of correlation coefficients in which each cell shows the correlation between two variables.

The correlation matrix will tell us the strength of the relationship between the stocks in our portfolio, which essentially can be used for effective diversification.
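To make the definition concrete, here is a minimal sketch that computes the Pearson coefficient by hand (covariance divided by the product of the standard deviations) and compares it with NumPy's built-in. The two series are made-up values chosen purely for illustration.

```python
import numpy as np

# Two made-up series (illustrative values only).
x = np.array([0.01, -0.02, 0.015, 0.03, -0.01])
y = np.array([0.008, -0.015, 0.02, 0.025, -0.005])

# Pearson correlation: covariance divided by the product of standard deviations.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
r_manual = cov_xy / (x.std() * y.std())

# NumPy's built-in, for comparison.
r_numpy = np.corrcoef(x, y)[0, 1]
print(round(r_manual, 4), round(r_numpy, 4))
```

Both values should agree up to floating-point error, and the result always lies between -1 and 1.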

Code to determine correlation matrix:

```
correlation_matrix = df.corr(method='pearson')
correlation_matrix
```

**Output:**

Plotting the Correlation Matrix:

```
fig1 = plt.figure()
sb.heatmap(correlation_matrix, xticklabels=correlation_matrix.columns, yticklabels=correlation_matrix.columns,
           cmap='YlGnBu', annot=True, linewidths=0.5)
print('Correlation between Stocks in your portfolio')
plt.show()
```

**Output:**

With this matrix, we can see that **Wipro** and **Infosys** are heavily correlated, which is logical as both companies belong to the same industry. It can also be seen that **BPCL** and **IRCTC** are negatively correlated. Hence, it is wise to hold both in our portfolio for efficient diversification: if, for some reason, BPCL moves in one particular direction, let's say down, there is less chance of IRCTC moving in the same direction.
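The diversification effect can be sketched with the standard two-asset portfolio variance formula. The function name, the 50/50 weights, and the 30% volatilities below are assumptions chosen for illustration; they are not taken from the data in this post.

```python
import numpy as np

def two_asset_volatility(w1, s1, s2, rho):
    """Volatility of a portfolio with weight w1 in asset 1 and 1 - w1 in asset 2."""
    w2 = 1.0 - w1
    variance = (w1 * s1) ** 2 + (w2 * s2) ** 2 + 2 * w1 * w2 * rho * s1 * s2
    return np.sqrt(variance)

# Assumed inputs: both assets have 30% annual volatility and are held 50/50.
for rho in (1.0, 0.5, 0.0, -0.5):
    print(rho, round(two_asset_volatility(0.5, 0.30, 0.30, rho), 4))
```

With everything else held fixed, the lower the correlation between the two assets, the lower the volatility of the combined portfolio.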

#### Risk & Return

##### Daily Simple Returns:

To ascertain daily simple return, we'll write this code:

```
daily_simple_return = df.pct_change(1)
daily_simple_return.dropna(inplace=True)
daily_simple_return
```

**Output:**

Daily simple return is the percentage change in price from one trading day to the next.
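As a quick check of what `pct_change(1)` does, the sketch below compares it with the textbook formula, today's price divided by yesterday's price minus one, on a tiny made-up price series.

```python
import pandas as pd

# Made-up closing prices for three consecutive days.
prices = pd.Series([100.0, 102.0, 99.96], name="close")

# Simple return: today's price relative to yesterday's, minus one.
manual = prices / prices.shift(1) - 1
builtin = prices.pct_change(1)

print(pd.DataFrame({"manual": manual, "pct_change": builtin}))
```

The first row is `NaN` in both columns because there is no previous day to compare against, which is why the post drops it with `dropna`.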

##### Visualizing Daily Simple Returns:

```
print('Daily simple returns')
fig, ax = plt.subplots(figsize=(15,8))
# Draw one line per stock
for i in daily_simple_return.columns.values:
    ax.plot(daily_simple_return[i], lw=2, label=i)
ax.legend(loc='upper right', fontsize=10)
ax.set_title('Volatility in Daily simple returns')
ax.set_xlabel('Date')
ax.set_ylabel('Daily simple returns')
plt.show()
```

Based on the above graph, on a day-to-day basis, TATAMOTORS is by and large the most volatile of the stocks. DABUR looks to be the least volatile, with swings much smaller than those of any other stock.

Average Daily returns:

```
print('Average Daily returns(%) of stocks in your portfolio')
Avg_daily = daily_simple_return.mean()
print(Avg_daily*100)
```

**Output:**

#### Risk

Plotting Risk using Daily Returns:

```
daily_simple_return.plot(kind = "box",figsize = (20,10), title = "Risk Box Plot")
```

**Output:**

The largest spread in the above box plot is for **TATAMOTORS**, which makes sense as **TATAMOTORS** had the highest Average Daily Returns. The smallest spread in the above box plot is for **DABUR**. It should be noted that although IRCTC does not have the largest spread, it does have more positive outliers, which translates into a higher average daily return.

Before moving forward, we need to calculate each stock's standard deviation using the `.std()` function, along with its annualized standard deviation:

```
print('Annualized Standard Deviation (Volatility(%), 252 trading days) of individual stocks in your portfolio on the basis of daily simple returns.')
print(daily_simple_return.std() * np.sqrt(252) * 100)
```

**Output:**
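The `np.sqrt(252)` factor rests on the assumption that daily returns are independent with a constant variance, so that variance adds up across trading days. The sketch below illustrates this on simulated returns; the 1% daily volatility, the 40 simulated years, and the random seed are arbitrary choices for illustration.

```python
import numpy as np

TRADING_DAYS = 252
N_YEARS = 40

rng = np.random.default_rng(0)
# Simulated daily returns with a daily volatility of 1% (assumed for illustration).
daily_returns = rng.normal(loc=0.0, scale=0.01, size=TRADING_DAYS * N_YEARS)

# Scale the daily volatility up to a yearly figure.
daily_vol = daily_returns.std(ddof=1)
annualized_vol = daily_vol * np.sqrt(TRADING_DAYS)

# Rough cross-check: volatility of daily returns summed over each simulated year.
yearly_returns = daily_returns.reshape(N_YEARS, TRADING_DAYS).sum(axis=1)
print(annualized_vol, yearly_returns.std(ddof=1))
```

The two printed numbers should be in the same ballpark. The second one is noisier because it is estimated from only 40 yearly observations, and summing simple returns is itself an approximation to compounding.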

Return Per Unit Of Risk:

```
# Average daily return divided by annualized volatility, scaled by 100 for readability
Avg_daily / (daily_simple_return.std() * np.sqrt(252)) * 100
```

**Output:**

The higher this ratio, the better. Hence, IRCTC has the best return-to-risk ratio and BPCL the lowest. After adjusting for a risk-free rate, this ratio is also called the **Sharpe Ratio**, a measure of risk-adjusted return. It describes how much excess return you receive for the extra volatility of holding a riskier asset.
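A minimal sketch of an annualized Sharpe ratio is shown below. It annualizes both the mean return and the volatility before dividing. The function name `annualized_sharpe`, the 5% risk-free rate, and the simulated returns for the hypothetical tickers `AAA` and `BBB` are all assumptions made for illustration; in practice you would substitute a rate that suits your market.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252
RISK_FREE_RATE = 0.05  # assumed annual risk-free rate, purely illustrative

def annualized_sharpe(daily_returns: pd.DataFrame,
                      risk_free_rate: float = RISK_FREE_RATE) -> pd.Series:
    """Annualized excess return per unit of annualized volatility, per column."""
    annual_return = daily_returns.mean() * TRADING_DAYS
    annual_vol = daily_returns.std() * np.sqrt(TRADING_DAYS)
    return (annual_return - risk_free_rate) / annual_vol

# Simulated daily returns for two hypothetical tickers.
rng = np.random.default_rng(42)
sample = pd.DataFrame({
    "AAA": rng.normal(0.0010, 0.02, 500),
    "BBB": rng.normal(0.0005, 0.01, 500),
})
print(annualized_sharpe(sample))
```

Passing the `daily_simple_return` frame from this post instead of `sample` should give one ratio per stock.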

#### Cumulative Returns:

```
daily_cummulative_simple_return =(daily_simple_return+1).cumprod()
daily_cummulative_simple_return
```

**Output:**
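To see why `(daily_simple_return + 1).cumprod()` measures the growth of an investment, the sketch below compounds the daily returns of a small made-up price series and compares the result with each price divided by the starting price.

```python
import pandas as pd

# Made-up closing prices for four consecutive days.
prices = pd.Series([100.0, 110.0, 99.0, 108.9])

daily_returns = prices.pct_change().dropna()
growth = (1 + daily_returns).cumprod()

# Compounding the daily returns recovers each price relative to the first one.
relative_price = prices.iloc[1:] / prices.iloc[0]
print(pd.DataFrame({"growth": growth, "relative_price": relative_price}))
```

The two columns should match up to floating-point error, which is why the cumulative series can be read as the value of one unit of currency invested on the first day.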

```
# Visualize the daily cumulative simple return
print('Cumulative Returns')
fig, ax = plt.subplots(figsize=(18,8))
# Draw one line per stock
for i in daily_cummulative_simple_return.columns.values:
    ax.plot(daily_cummulative_simple_return[i], lw=2, label=i)
ax.legend(loc='upper left', fontsize=10)
ax.set_title('Daily Cumulative Simple returns/growth of investment')
ax.set_xlabel('Date')
ax.set_ylabel('Growth of ₹ 1 investment')
plt.show()
```

**Output:**

Based on the above graph, during this 2-year stretch from 2019 to 2021, **IRCTC** performed the best and delivered the highest cumulative returns; it started to separate itself in early 2020. It is followed by **TATAMOTORS** and, in third, **WIPRO**. **BPCL** had the worst cumulative returns over the 2-year period; in fact, it is the only one to finish in the red.

### 6. Wrapping it up

And with that, it's a wrap!

A lot goes into constructing an optimal portfolio, and the topic is vast as it entails a great deal of theory. In the upcoming article, I shall be covering **Portfolio Optimization with Python**, so stay tuned :)

I hope you enjoyed this article!

You can also access the GitHub link here to view the entire code in one single file directly.

Thank you for reading! If you have made it this far, please like the article; it will encourage me to write more. Do share your valuable suggestions; I would appreciate your honest feedback!🙂

Please feel free to leave a comment and connect if you have any questions regarding this or require any further information. Consider subscribing to my mailing list for automatic updates on future articles. 📬

You can join the **thealtinvestor.in** mailing list as well.

I would love to connect with you over mail, or you can also find me on LinkedIn.
