Embracing AI: The Journey from Skepticism to Synergy with ChatGPT.
Using TradeStation XML output, Python and ChatGPT to create a commercial level Portfolio Optimizer.
As a young and curious child, I was given RadioShack Science Fair Kits, chemistry sets, a microscope and rockets by my parents. I learned enough chemistry to make rotten egg gas. I grew protozoa to delve into the microscopic world. I scared my mom and wowed my cousins with the Estes Der Big Red rocket. But the real turning point came one Christmas morning when I opened the Digital Computer Kit. Of course, you had to put it together before you could even use it, just like the model rockets. Hey, you got an education assembling this stuff. Here is a picture of a glorified circuit board.
This really wasn’t a computer so much as an education in circuit design, but you could get it to solve simple problems: “Crossing the River”, decimal to binary, calculating the cube root and 97 other small projects. These problems were solved by following the various wiring diagrams. I loved how the small panels would light up with the right answers, but I grew frustrated because I couldn’t get beyond the preprogrammed wiring schema. I had all these problems I wanted to solve but could not figure out the wiring. Of course, there were real computers out there such as the HAL 9000. Just kidding. I would go to the local Radio Shack and stare at all the computers, hoping one day I would have one sitting on my desk. My Dad was an aircraft electrician (avionics) in the U.S. Navy with a specialty in Inertial Navigation Systems. He would always want to talk about switches, gyros, some dude named Georg Ohm and oscilloscopes. I had my mind too stuck in space (you know, “3-D Space Battle”, the Apple II game) to listen to or learn from his vast knowledge. A couple of years and a paper route later, I had a proper 16K computer, the TI-99/4A. During this time, I dreamed of a supercomputer that could answer all my questions and solve all my problems. I thought the internet was the manifestation of this dream, but in fact it was the Large Language Models such as ChatGPT.
Friend or Foe
From a programmer’s perspective AI can be scary, because you might just find yourself out of a job. From this experiment, I think we are a few years away from this possibility. Quantum computing, whenever it arrives, might be a viable replacement, but for now I think we are okay.
The Tools You Will Need
You will need a full installation of Python along with NumPy and pandas installed on your computer if you want ChatGPT to do some serious coding for you. Python and its associated libraries are simply awesome. And Chat loves to use these tools to solve a problem. I pay $20 a month for Chat, so I don’t know if you could get the same code as I did if you have the free version. You should try before signing up.
The Project: A Portfolio Optimizer using an Exhaustive Search Engine
About ten years ago, I collaborated with Mike Chalek to develop a parser that analyzes the XML files TradeStation generates when saving strategy performance data. Trading a trend-following system often requires significant capital to manage a large portfolio effectively. However, many smaller traders operate with limited resources and opt to trade a subset of the full portfolio.
This approach introduces a critical challenge: determining which markets to include in the subset to produce the most efficient equity curve. For instance, suppose you have the capital to trade only four markets out of a possible portfolio of twenty. How do you decide which four to include? Do you choose the markets with the highest individual profits? Or do you select the ones that provide the best profit-to-drawdown ratio?
For smaller traders, the latter approach—prioritizing the profit-to-drawdown ratio—is typically the smarter choice. This metric accounts for both returns and risk, making it essential for those who need to manage capital conservatively. By focusing on risk-adjusted performance, you can achieve a more stable equity curve and better protect your account from significant drawdowns.
I enhanced Mike’s parser by integrating an exhaustive search engine capable of evaluating every combination of N markets taken n at a time. This approach allowed for a complete analysis of all possible subsets within a portfolio. However, as the size of the portfolio increased, the number of combinations grew exponentially, making the computations increasingly intensive. For example, in a portfolio of 20 markets, sampling 4 markets at a time results in 4,845 unique combinations to evaluate.
Using the combination formula, C(N, n) = N! / (n! × (N − n)!), with N = 20 and n = 4 gives 4,845 combinations. If you estimate each combination to take one second, then you are talking about roughly 81 minutes. Sampling N/2 markets produces the most combinations: 10 out of 20 would take 51.32 hours.
- 1 out of 20: 20 combinations
- 2 out of 20: 190 combinations
- 3 out of 20: 1,140 combinations
- 4 out of 20: 4,845 combinations
- 5 out of 20: 15,504 combinations
- 6 out of 20: 38,760 combinations
- 7 out of 20: 77,520 combinations
- 8 out of 20: 125,970 combinations
- 9 out of 20: 167,960 combinations
- 10 out of 20: 184,756 combinations
- 11 out of 20: 167,960 combinations
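The counts in this list can be reproduced with Python's standard library. The sketch below uses `math.comb` for the counts and converts them to run time under the one-second-per-combination estimate from the text. The helper name `hours_at_one_second_each` is made up for illustration.

```python
import math

N = 20  # total markets in the portfolio

# Number of ways to choose n markets out of N, for n = 1..11
counts = {n: math.comb(N, n) for n in range(1, 12)}


def hours_at_one_second_each(combos):
    """Run time in hours if every combination takes one second (the text's estimate)."""
    return combos / 3600


for n, combos in counts.items():
    print(f"{n:>2} out of {N}: {combos:>7,} combinations, "
          f"{hours_at_one_second_each(combos):6.2f} hours")
```

The counts are symmetric around N/2, which is why 9 out of 20 and 11 out of 20 both give 167,960.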
The exhaustive method is not the best way to go when trying to find the optimal portfolios across a broad search space. This is where a genetic optimizer comes in handy. I played around with that too. However, for this experiment I stuck with a portfolio of eleven markets. I used the Andromeda-like strategy that I published in my latest installment of my Easing Into EasyLanguage series.
Here is the output of the project when sampling four markets out of a total of eleven. All this was produced with Python and its libraries and ChatGPT.
Tabular Format
Graphic Format
Step 1 – Creating an XML Parser script
You can save your strategy performance report in RINA XML format. The ability to save in this format seems to come and go, but my latest version of TradeStation provides this capability. The XML files are ASCII files that contain every bar of data, market and strategy properties and trades. However, they are extremely large because each piece of data is wrapped in an opening and a closing tag.
Imagine the size of the XML when working with one or even five-minute bars.
Getting ChatGPT to create an XML parser for the specific TradeStation output
- The first thing I did was save the performance, in XML format, from a workspace in TradeStation that used an Andromeda-like strategy on eleven different daily bar charts.
- I asked Chat to analyze the XML file that I attached to a new chat. I started a new chat for this project. I discovered the chat session is also known as a “workflow”. This term emphasizes:
  - Collaboration: We work as a team to tackle challenges.
  - Iteration: We revisit and improve upon earlier steps as needed.
  - Focus: Each session builds upon the previous ones to move closer to a defined goal.
- Once it understood the mapping of the XML, I asked Chat to extract the bar data and output it to a csv file. And it did it without a hitch. When I say it did it, I mean it created a Python script that I loaded into my PyScripter IDE and executed.
- I then asked for an output of the trade-by-trade report, and it did it without a hitch. Notice these tasks do not require much in the way of reasoning.
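To give a feel for the kind of script involved, here is a minimal sketch of pulling bar data out of XML and writing it as CSV with `xml.etree.ElementTree`. This is not the script Chat produced, and the tag names (`Report`, `Bars`, `Bar`, `Date`, `Close`) are placeholders invented for this example; they are not the actual TradeStation RINA schema, which you would have to read from your own file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout, invented for illustration only.
SAMPLE_XML = """<Report>
  <Bars>
    <Bar><Date>2024-01-02</Date><Close>101.5</Close></Bar>
    <Bar><Date>2024-01-03</Date><Close>102.25</Close></Bar>
  </Bars>
</Report>"""


def bars_to_csv(xml_text):
    """Return CSV text with one row per <Bar> element found in the XML."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Date", "Close"])
    for bar in root.iter("Bar"):  # walks the whole tree, whatever the nesting
        writer.writerow([bar.findtext("Date"), bar.findtext("Close")])
    return out.getvalue()


print(bars_to_csv(SAMPLE_XML))
```

For a real report file you would use `ET.parse(path)` instead of `ET.fromstring` and write to a file opened with `newline=""`.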
Getting ChatGPT to combine the bar and trade data to produce a daily equity stream.
This is where Python and its number crunching libraries came in handy. Chat pulled in the following libraries:
- xml.etree
- pandas
- tkinter
- datetime
I love Python, but the level of abstraction with its libraries can make you dizzy. You don’t need to fully understand the pandas DataFrame to use it. Heck, I didn’t really know how it mapped and extracted the data from the XML file. I prompted Chat with the following:
[My prompts are bold and italicized.]
With the bar data and the trade data and the bigpointvlaue, can you create an equity curve that shows a combination of open trade equity and closed trade equity? Remember the top part of the xml file contains bar data and the lower part contains all the trade data.
It produced a script that hung up. I informed Chat that the script hung up and wasn’t working. It found the error and fixed it. Think about what Chat was doing. It was able to align the data so that open trades produced open trade equity and closed out trades produced closed trade equity. Well, initially it had a small problem. It knew what Buy and Sell meant, and the math involved with calculating the two forms of equity, open and closed. I didn’t inform Chat of any of this. But the equity data did not look exactly right. It looked like the closed trade equity was being calculated improperly.
Is the Script checking for LExit and SExit to calculate when a trade closes out?
Once it figured out that the equity stream must contain open and closed trade equity and learned the terms LExit and SExit, a new script was created that nearly replicated the equity curve from the TradeStation report. When Chat starts creating lengthy scripts it will open a side bar window called the “Canvas” and put the script in there. This makes it easier to copy and paste the code. I eventually noticed that the equity curve did not include commission and slippage charges.
Please extract the slippage and commission values and deduct this amount from each trade.
At this point the workflow remembered the mapping of the XML file and was able to incorporate these two values into the trade processing. I wanted the user to be able to select multiple XML files and have the script process these files and produce the three output files, bar_data, trade_data, and equity_data. I did have to explain that the execution costs must be applied to all the entries and exits.
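The equity logic described so far can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the script Chat wrote: trades are identified by bar position, entries and exits happen at the bar's close, and `cost_per_side` is a made-up parameter holding commission plus slippage, charged once at entry and once at exit. The `Trade` class and `daily_equity` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    entry_bar: int      # index of the bar where the trade is entered
    exit_bar: int       # index of the bar where the trade is closed out
    entry_price: float
    exit_price: float
    direction: int      # +1 for a long trade, -1 for a short trade


def daily_equity(closes, trades, big_point_value, cost_per_side):
    """Closed trade equity plus open trade equity for every bar."""
    equity = []
    for i, close in enumerate(closes):
        closed = 0.0
        open_equity = 0.0
        for t in trades:
            if i >= t.exit_bar:
                # Trade is finished: realized profit, costs on entry and exit
                points = t.direction * (t.exit_price - t.entry_price)
                closed += points * big_point_value - 2 * cost_per_side
            elif i >= t.entry_bar:
                # Trade is still open: mark to the close, entry cost only
                points = t.direction * (close - t.entry_price)
                open_equity += points * big_point_value - cost_per_side
        equity.append(closed + open_equity)
    return equity


closes = [100.0, 102.0, 105.0, 103.0, 101.0]
trades = [Trade(entry_bar=1, exit_bar=3, entry_price=102.0,
                exit_price=103.0, direction=1)]
print(daily_equity(closes, trades, big_point_value=50.0, cost_per_side=5.0))
```

Rescanning every trade on every bar is slow for long histories, but it keeps the open versus closed distinction easy to read.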
I would like the user to select multiple xml files and create the data and trade and equity files incorporating the system name and market name and their naming scheme.
A new library was imported, Tkinter, and a file open dialog was used for the user to select all the XML files they wanted to process. These few chores required very little interaction between Chat and me. I thought, wow, this is going to be a breeze. I moved on to the next phase of the project by asking Chat the following:
Can you create a script that will use Tkinter to open the equity files from the prior script and allow the user to choose N out of the total files selected to create all the combined equity curves using an exhaustive search method.
I knew this was a BIG ASK! But it swallowed this big pill without a hitch. Maybe we programmers will be replaced sooner rather than later. I wanted to fine tune the script.
Can you keep track of maximum draw down for each combination and then sort the combinations by the profit to draw down ratio. For the best combination can you create a csv file with the daily returns so i can plot them in Excel.
On a quick scan of the results, I did not initially notice that the maximum draw-down metric was not right. So, I pushed on with the fine tuning.
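The exhaustive search and the profit-to-drawdown ranking can be sketched with `itertools.combinations`. This is a simplified illustration, not the generated script: it assumes each market's equity curve is a list of cumulative profit values starting at zero, that all curves are already aligned on the same dates, and the names `rank_combinations` and `max_drawdown` are hypothetical.

```python
from itertools import combinations


def max_drawdown(equity):
    """Largest drop from a running peak to a later value."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def rank_combinations(equity_by_market, n):
    """Score every n-market subset and sort by profit-to-drawdown ratio."""
    results = []
    for combo in combinations(sorted(equity_by_market), n):
        # Sum the member curves day by day to get the combined curve
        curves = [equity_by_market[name] for name in combo]
        combined = [sum(day) for day in zip(*curves)]
        profit = combined[-1]  # final cumulative profit
        drawdown = max_drawdown(combined)
        ratio = profit / drawdown if drawdown > 0 else float("inf")
        results.append((combo, profit, drawdown, ratio))
    results.sort(key=lambda row: row[3], reverse=True)
    return results


markets = {
    "A": [0, 10, 4, 12],
    "B": [0, -2, 3, 6],
    "C": [0, 5, -1, 8],
}
for combo, profit, drawdown, ratio in rank_combinations(markets, 2):
    print(combo, profit, drawdown, round(ratio, 2))
```

A combination with no drawdown at all gets an infinite ratio here; a real tool would probably want to flag that case rather than let it top the list silently.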
This works great. In the output files can we delete the path name to the csv files. For the output I would just like to have the system name and the symbol for each combination.
Script created.
I had the following error message: NameError: name 'all_combinations' is not defined. I also asked Chat if it could add “Date” to the header for the combination daily files.
Many times, Chat will just recreate the part of the code that needs to be modified. This makes copying and pasting difficult, because you have to make sure you find the right location in the code to replace. I always ask Chat to recreate the entire script with the latest changes.
Hey Chat, the following error message was created: system, symbol = filename.replace(".csv", "").split("_") ValueError: too many values to unpack (expected 2)
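The unpacking error happens because a plain `split("_")` returns more than two pieces when the file name contains more than one underscore. Here is one way to handle it, as a sketch only: it assumes file names shaped like `Andromeda-Like_@SB_equity_curve.csv`, which is my guess at the naming scheme, and the function name `split_system_symbol` is made up. It needs Python 3.9 or later for `str.removesuffix`.

```python
def split_system_symbol(filename):
    """Split 'System_Symbol_equity_curve.csv' into (system, symbol)."""
    stem = filename.removesuffix(".csv").removesuffix("_equity_curve")
    # Split only on the LAST underscore, so a system name that itself
    # contains underscores still unpacks into exactly two values.
    system, symbol = stem.rsplit("_", 1)
    return system, symbol


print(split_system_symbol("Andromeda-Like_@SB_equity_curve.csv"))
```

Limiting the split to one cut is what guarantees exactly two values to unpack.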
The error was fixed, but I finally noticed the maximum draw down calculations were not right.
The draw down is still not calculated properly. I think there is an error in your reasoning. I calculate draw down by iterating through the equity curve and keeping track of peaks and subsequent troughs. The largest distance between the peaks and its associated trough is the max draw down.
Script created but looked a little primitive. So, I asked.
Is this the most efficient method?
Well, I should have kept my mouth shut. I felt like the Python libraries would be able to do this much quicker with list comprehension or with some other magic. And Chat agreed with me. Script created.
The draw down is still incorrect.
Script created.
It was still wrong, and instead of going back and using the original draw down script, I tried to nudge Chat in the right direction.
The equity curve will have multiple peaks and multiple troughs. You will need to keep track of all of the peaks and troughs and then find the maximum distance between the sets of peaks and troughs.
Script created, and it was still wrong. This is where you start going down a rabbit hole. Chat will produce debugging scripts to try to figure out where the error resides. But this is very time consuming, so I kept pushing Chat to do it right.
Each peak must be tied to its respective trough. Each distance between each respective peak and drawdown can then be measured for the largest value.
Script is wrong.
This still not working. I don’t think you are calculating the difference between peaks and subsequent troughs properly.
Script is wrong.
This is still not working. We should be able to do this in a single pass. Iterate through the daily values and track the highest level at that point in the series and then the subsequent deepest trough. Store the first difference. Continue iterating looking for a new peak and the subsequent deepest trough. Compare this value to the prior and take the largest of the two. Store this value and continue iterating.
The script from the original request was generated again. I finally got back to the starting point. What a waste of time. I will be more careful with my suggestions. However, the script was still wrong, arrgh! I decided to run the debugger on the draw down function, and the code was right, but the data was wrong.
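For reference, the single-pass approach from my prompt can be sketched as follows. It is a minimal version written for this article rather than the exact function from the generated script, and the name `max_drawdown` is my own. It tracks the highest equity seen so far and the deepest drop below that peak.

```python
def max_drawdown(equity):
    """Single pass: track the running peak and the largest drop below it."""
    values = list(equity)
    if not values:
        return 0.0
    peak = values[0]   # highest equity seen so far
    worst = 0.0        # largest peak-to-trough distance seen so far
    for value in values:
        if value > peak:
            peak = value            # new peak, later troughs measure from here
        drawdown = peak - value     # distance below the current peak
        if drawdown > worst:
            worst = drawdown
    return worst


curve = [100, 120, 90, 130, 80, 140]
print(max_drawdown(curve))
```

In this made-up curve the first drop (120 to 90) is 30 and the second (130 to 80) is 50, so the function keeps the second.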
The problem lies in the equity series. It should contain the combined equity for the combinations. There should be a master date and each combination populates the master date. If there is a missing date in the combinations, then the master date should copy the prior combinations combined value.
Warning message: Use obj.ffill() or obj.bfill() instead. df_aligned = df.reindex(master_date_index).fillna(method="ffill").fillna(0)
Chat created some deprecated code. This was an easy fix; I just had to replace one line of code. However, every iteration following this still had the same damn deprecated code.
Error or warning: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use ser.iloc[pos]
peak = equity_series[0] # Initialize the first value as the peak
Script updated.
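Both warnings have one-line fixes: `.ffill()` in place of `fillna(method="ffill")`, and `.iloc[0]` in place of `[0]`. The sketch below shows them in the context of the master-date alignment described above. The two equity curves and their names are made up for illustration, and this is not the generated script.

```python
import pandas as pd

# Two hypothetical equity curves with different trading dates
curves = {
    "market_a": pd.Series(
        [100.0, 150.0, 120.0],
        index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-05"]),
    ),
    "market_b": pd.Series(
        [50.0, 40.0],
        index=pd.to_datetime(["2024-01-03", "2024-01-04"]),
    ),
}

# Master date index: the sorted union of every curve's dates
master_date_index = None
for series in curves.values():
    if master_date_index is None:
        master_date_index = series.index
    else:
        master_date_index = master_date_index.union(series.index)

# Carry the prior value forward on missing dates; use 0 before a curve starts
aligned = pd.DataFrame({
    name: series.reindex(master_date_index).ffill().fillna(0.0)
    for name, series in curves.items()
})
combined = aligned.sum(axis=1)

peak = combined.iloc[0]  # positional access, avoids the FutureWarning
print(combined)
```

With a date index, `combined[0]` asks pandas to guess whether 0 is a label or a position, which is what the warning is about; `.iloc[0]` says position explicitly.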
Can we drop the “_equity_curve” from the name of the system and symbol in the Perfomance metrics file. Will excel accept the combination names as a single column or will each symbol have its own column because of the comma. I would like for each combination in the Performance_Metrics file to occupy just one column.
Script created and this is what it should look like. Notice it was remembering the wrong maximum draw down values I fed it earlier. Don’t worry it was right in the script.
Combination | Total Profit | Max Drawdown | Profit-to-Drawdown Ratio | Combination Number |
---|---|---|---|---|
“Andromeda-Like_@SB, Andromeda-Like_@EC” | 92141.85 | 92985.60 | 0.99 | 1 |
“Andromeda-Like_@S, Andromeda-Like_@JY” | 82450.55 | 70530.45 | 1.17 | 2 |
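On the Excel question: a field that contains a comma stays in one column as long as it is wrapped in double quotes, and Python's `csv` module adds those quotes automatically. The sketch below writes the two rows from the table above to CSV text and reads them back. It illustrates the quoting behavior and is not the generated script.

```python
import csv
import io

header = ["Combination", "Total Profit", "Max Drawdown",
          "Profit-to-Drawdown Ratio", "Combination Number"]
rows = [
    ("Andromeda-Like_@SB, Andromeda-Like_@EC", 92141.85, 92985.60, 0.99, 1),
    ("Andromeda-Like_@S, Andromeda-Like_@JY", 82450.55, 70530.45, 1.17, 2),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")  # default quoting: only when needed
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading it back shows the combination name is still a single field
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed[1][0])
```

Only the combination names get quoted, because they are the only fields containing the delimiter.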
Conclusion
I could have done the same thing ChatGPT did for me, but I wouldn’t have used the Numpy or Pandas libraries simply because I’m not familiar with them. These libraries make the exhaustive search manageable and incredibly efficient. They handle tasks much faster than pure Python alone.
To get ChatGPT to generate the code you need, being a programmer is essential. You’ll need to guide it through debugging, steer it in the right direction, and test the scripts it produces. It’s a back-and-forth process—running the script, identifying warnings or errors, and pointing out incorrect outputs. Sometimes, your programming insights and suggestions might inadvertently lead ChatGPT down a rabbit hole. Features that worked in earlier versions may stop working in subsequent iterations as modifications are applied to address earlier issues.
ChatGPT can also slow down at times, and its Canvas tool has a line limit, which can result in incomplete scripts. As a programmer, it’s easy to spot these issues—you’ll need to inform ChatGPT, and it will adjust by splitting the script into parts, some appearing in the Canvas and the rest in the chat window.
The collaboration between ChatGPT and me was powerful enough to replicate, in just one day, software that Mike Chalek and I spent weeks developing a decade ago. The original version had a cleaner GUI, but it was significantly slower compared to what we’ve achieved here.
If you’re a programmer, have Python installed with its libraries, and work with ChatGPT, the possibilities are endless. But there’s no magic—success requires thoughtful feedback and precise prompting.
Email me if you would like to have the Python scripts that accomplish the following tasks. If you are not familiar with pandas or XML processing, the code will look a little foreign, even if you are Python savvy. No worries, it just works.
- XML Parser – creates data, trades and equity files in .csv format.
- TradeStationExhaustiveCombos – creates all the combos when sampling n out of N markets.
- The simple Tkinter GUI and Matplotlib graphing tool to plot the combos.
There are three scripts in total. Remember you will need to have Python, pandas and Matplotlib already installed on your computer. If you have any questions on how to install these, just let me know.