A Quant’s ToolBox: Beautiful Soup, Python, Excel and EasyLanguage

Many Times It Takes Multiple Tools to Get the Job Done

Just like a mechanic, a Quant needs tools to accomplish many programming tasks.  In this post, I use a toolbox to construct an EasyLanguage function that will test a date and determine if it is considered a Holiday in the eyes of the NYSE.

Why a Holiday Function?

TradeStation will pump holiday data into a chart and then later go back and take it out of the database.  Many times the data will only be removed from the daily database, but still persist in the intraday database.  Many mechanical day traders don’t want to trade on a shortened holiday session or use the data for indicator/signal calculations.  Here is an example of a gold chart reflecting President’s Day data in the intra-day data and not in the daily.

Holiday Data Throws A Monkey Wrench Into the Works

This affects many stock index day traders.  Especially if automation is turned on.  At the end of this post I provide a link to my youTube channel for a complete tutorial on the use of these tools to accomplish this task.  It goes along with this post.

First Get The Data

I searched the web for a list of historical holiday dates and came across this:

Historic List of Holidays and Their Dates

You might be able to find this in a easier to use format, but this was perfect for this post.

Extract Data with Beautiful Soup

Here is where Python and the plethora of its libraries come in handy.  I used pip to install the requests and the bs4 libraries.  If this sounds like Latin to you drop me an email and I will shoot you some instructions on how to install these libraries.  If you have Python, then you have the download/install tool known as pip.

Here is the Python code.  Don’t worry it is quite short.

# Created:     24/02/2020
# Copyright: (c) George 2020
# Licence: <your licence>
#-------------------------------------------------------------------------------

import requests
from bs4 import BeautifulSoup

url = 'http://www.market-holidays.com/'
page = requests.get(url)
soup = BeautifulSoup(page.text,'html.parser')
print(soup.title.text)
all_tables = soup.findAll('table')
#print (all_tables)
print (len(all_tables))
#print (all_tables[0])
print("***")
a = list()
b = list()
c = list()
#print(all_tables[0].find_all('tr')[0].text)
for numTables in range(len(all_tables)-1):
for rows in all_tables[numTables].find_all('tr'):
a.append(rows.find_all('td')[0].text)
b.append(rows.find_all('td')[1].text)

for j in range(len(a)-1):
print(a[j],"-",b[j])
Using Beautiful Soup to Extract Table Data

As you can see this is very simple code.  First I set the variable url to the website where the holidays are located.  I Googled on how to do this – another cool thing about Python – tons of users.  I pulled the data from the website and stuffed it into the page object.  The page object has several attributes (properties) and one of them  is a text representation of the entire page.  I pass this text to the BeautifulSoup library and inform it to parse it with the html.parser.  In other words, prepare to extract certain values based on html tags.  All_tables contains all of the tables that were parsed from the text file using Soup.  Don’t worry how this works, as its not important, just use it as a tool.  In my younger days as a programmer I would have delved into how this works, but it wouldn’t be worth the time because I just need the data to carry out my objective; this is one of the reasons classically trained programmers never pick up the object concept.  Now that I have all the tables in a list I can loop through each row in each table.  It looked liker there were 9 rows and 2 columns in the different sections of the website, but I didn’t know for sure so I just let the library figure this out for me.  So I played around with the code and found out that the first two columns of the table contained the name of the holiday and the date of the holiday.  So, I simply stuffed the text values of these columns in two lists:  a and b.  Finally I print out the contents of the two lists, separated by a hyphen, into the Interpreter window.  At this point I could simply carry on with Python and create the EasyLanguage statements and fill in the data I need.  But I wanted to play around with Excel in case readers didn’t want to go the Python route.  I could have used a powerful editor such as NotePad++ to extract the data from the website in place of Python.  GREP could have done this.  GREP is an editor tool to find and replace expressions in a text file.

Use Excel to Create Actual EasyLanguage – Really!

I created a new spreadsheet.  I used Excel, but you could use any spreadsheet software.   I first created a prototype of the code I would need to encapsulate the data into array structures.  Here is what I want the code to look like:

Arrays: holidayName[300](""),holidayDate[300](0);

holidayName[1]="New Year's Day "; holidayDate[1]=19900101;
Code Prototype

This is just the first few lines of the function prototype.  But you can notice a repetitive pattern.  The array names stay the same – the only values that change are the array elements and the array indices.  Computers love repetitiveness.  I can use this information a build a spreadsheet – take a look.

Type EasyLanguage Into the Columns and Fill Down!

I haven’t copied the data that I got out of Python just yet.  That will be step 2.  Column A has the first array name holidayName (notice I put the left square [ bracket in the column as well).  Column B will contain the array index and this is a formula.  Column C contains ]=”.  Column D will contain the actual holiday name and Column E contains theThese columns will build the holidayName array.  Columns G throuh K will build the holidayDates array.    Notice column  H  equals column B.  So whatever we do to column B (Index) will be reflected in Column H (Index).  So we have basically put all the parts of the EasyLanguage into  Columns A thru K. 

Excel provides tools for manipulating strings and text.  I will use the Concat function to build my EasyLanguageBut before I can use Concat all the stuff I want to string together must be in a string or text format.  The only column in the first five that is not a string is Column B.  So the first thing I have to do is convert it to text.  First copy the column and paste special as values.  Then go to your Data Tab and select Text To Columns. 

Text To Columns

It will ask if fixed width or delimited – I don’t think it matters which you pick.  On step 3 select text.

Text To Columns – A Powerful Tool

The Text To Columns button will solve 90% of your formatting issues in Excel.    Once you do this you will notice the numbers will be left justified – this signifies a text format.  Now lets select another sheet in the workbook and past the holiday data.

Copy Holiday Data Into Another Spreadsheet

New Year's Day - January 1, 2021
Martin Luther King, Jr. Day - January 18, 2021
Washington's Birthday (Presidents' Day) - February 15, 2021
Good Friday - April 2, 2021
Memorial Day - May 31, 2021
Independence Day - July 5, 2021
Labor Day - September 6, 2021
Thanksgiving - November 25, 2021
Christmas - December 24, 2021
New Year's Day - January 1, 2020
Martin Luther King, Jr. Day - January 20, 2020
Washington's Birthday (Presidents' Day) - February 17, 2020
Good Friday - April 10, 2020
Memorial Day - May 25, 2020
Holiday Output

 

Data Is In Column A

Text To Columns to the rescue.  Here I will separate the data with the “-” as delimiter and tell Excel to import the second column in Date format as MDY.  

Text To Columns with “-” as the delimiter and MDY as Column B Format

Now once the data is split accordingly into two columns with the correct format – we need to convert the date column into a string.

Convert Date to a String

Now the last couple of steps are really easy.  Once you have converted the date to a string, copy Column A and past into Column D from the first spreadsheet.  Since this is text, you can simply copy and then paste.  Now go back to Sheet 2 and copy Column C and paste special [values] in Column J on Sheet 1.  All we need to do now is concatenate the strings in Columns A thru E for the EasyLanguage for the holidayName array.  Columns G thru K will be concatenated for the holidayDate array.  Take a look.

Concatenate all the strings to create the EasyLanguage

Now create a function in the EasyLanguage editor and name it IsHoliday and have it return a boolean value.  Then all you need to do is copy/paste Columns F and L and the data from the website will now be available for you use.   Here is a portion of the function code.  Notice I declare the holidayNameStr as a stringRef?  I did this so I could change the variable in the function and pass it back to the calling routine.

Inputs : testDate(numericSeries),holidayNameStr(stringRef);

Arrays: holidayName[300](""),holidayDate[300](0);

holidayNameStr = "";

holidayName[1]="New Year's Day "; holidayDate[1]=19900101;
holidayName[2]="Martin Luther King, Jr. Day "; holidayDate[2]=19900115;
holidayName[3]="Washington's Birthday (Presidents' Day) "; holidayDate[3]=19900219;
holidayName[4]="Good Friday "; holidayDate[4]=19900413;
holidayName[5]="Memorial Day "; holidayDate[5]=19900528;
holidayName[6]="Independence Day "; holidayDate[6]=19900704;
holidayName[7]="Labor Day "; holidayDate[7]=19900903;
holidayName[8]="Thanksgiving "; holidayDate[8]=19901122;
holidayName[9]="New Year's Day "; holidayDate[9]=19910101;
holidayName[10]="Martin Luther King, Jr. Day "; holidayDate[10]=19910121;
holidayName[11]="Washington's Birthday (Presidents' Day) "; holidayDate[11]=19910218;

// There are 287 holiays in the database.
// Here is the looping mechanism to compare the data that is passed
// to the database

vars: j(0);
IsHoliday = False;
For j=1 to 287
Begin
If testDate = holidayDate[j] - 19000000 then
Begin
holidayNameStr = holidayName[j] + " " + numToStr(holidayDate[j],0);
IsHoliday = True;
end;
end;
A Snippet Of The Function - Including Header and Looping Mechanism

This was a pretty long tutorial and might be difficult to follow along.  If you want to watch my video, then go to this link.

I created this post to demonstrate the need to have several tools at your disposal if you really want to become a Quant programmer.  How you use those tools is up to you.  Also you will be able to take bits and pieces out of this post and use in other ways to get the data you really need.  I could have skipped the entire Excel portion of the post and just did everything in Python.  But I know a lot of Quants that just love spreadsheets.  You have to continually hone your craft in this business.   And you can’t let one software application limit your creativity.  If you have a problem always be on the lookout for alternative platforms and/or languages to help you solve it.