METHODOLOGY

Creating an investment strategy is a complex and multi-stage process that requires the right approach. I personally use an eight-stage methodology, divided into two separate "blocks". The first block focuses on developing and testing the strategy, while the second covers its practical application.

This model was developed based on the experience of many market practitioners, as well as my own experience in creating and testing investment strategies. While there is no one-size-fits-all solution, the steps below are commonly used to varying degrees by industry professionals.

Block 1: Creating an investment strategy

The first block covers five key stages:

Formulation of investment strategy – defining the basic assumptions and objectives of the strategy.
Defining investment principles – developing final, objective and computer-testable investment rules.
Conducting a preliminary test of the investment strategy – a preliminary assessment of the effectiveness of the investment strategy based on historical data.
Optimizing and assessing the stability of an investment strategy – optimizing the range of strategy parameters to maximize the objective function (risk-adjusted return) and stabilize results.
Walk-Forward Analysis – assessing a strategy’s ability to generate profits in real time.

If the strategy successfully passes all five stages and the test results indicate its potential effectiveness in real trading, we move on to the second block.

Block 2: Putting the strategy into practice

The second block includes three consecutive steps:

Real-time strategy utilization – implementing strategies into a portfolio of investment strategies.
Monitoring results – comparing actual results with historical simulation results and analyzing them.
Improving your investment strategy – continually adjusting and improving your strategy based on your results.

Each of these eight stages depends on the success of the previous one, and the entire process is based on continuous feedback. Information obtained at later stages is used to improve earlier steps, which makes this model extremely flexible and effective.

Thanks to this approach, the strategy evolves continuously, leading to its improvement and refinement. It can also result in the development of completely new strategies resulting from the observation and analysis of the operation of the original model.

Each step is briefly discussed below.

Step 1: Formulate the investment strategy

Every investment strategy starts with an idea. The key principles on which the strategy is based must be clearly and precisely defined. The strategy description should also include a justification for why we believe it will have a positive expected value (the so-called edge) and will be effective in practice. This will build confidence in the strategy's validity, which is extremely important in the face of more difficult market moments.

For example, in the case of a trend-following strategy, the justification might be: "I believe that trend-following is effective over the long term because markets naturally tend to move in trends, and investors often close out profitable positions too early, allowing trends to continue." The more convincing these arguments are to us, the more confident we will be in the right investment approach.

An investment strategy can be simple or complex, but it should aim to minimize the number of parameters that require optimization (more on the risk of overfitting associated with optimization can be found in the document "Overfitting and optimization"). It is best if the number of such parameters does not exceed five. Additionally, the strategy should be easy to understand and communicate.

Ultimately, the investment strategy must be reduced to a set of precise rules and formulas. Such a structure allows for unambiguous implementation of the strategy and its objective verification.

Step 2: Define investment principles

In order for the strategy to be tested, it must be converted into a format that is understandable for the chosen testing platform. In our case, this is the Trading Blox Builder II platform, which uses an object-oriented programming language. Imprecise representation of the investment idea in the form of a script leads to imprecise historical simulations, which can ultimately distort the assessment of the strategy's performance. Without a correctly written script version of the investment strategy, both historical simulation and analysis of results become impossible.

A good solution at this stage is to create so-called pseudocode – that is, a step-by-step description of the actions that the strategy code is to perform. Pseudocode helps to maintain the clarity of assumptions and facilitates their implementation.

Example pseudocode for a dual moving average strategy:

Calculate the value of the fast moving average;
Calculate the value of the slow moving average;
Take a long position at the next day's open if the fast moving average was below the slow moving average yesterday and is above it today;
Stay in a long position until a sell signal occurs;
Take a short position at the next day's open if the fast moving average was above the slow moving average yesterday and is below it today;
Stay in the short position until a buy signal occurs.

Thanks to the use of pseudocode, the process of implementing the strategy in the selected platform is much more effective and the risk of interpretation errors is minimized.
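The pseudocode above can be sketched in Python as follows. This is only an illustration of the signal logic, not the Trading Blox script used in our tests; the function names (sma, crossover_signals) and the signal encoding (+1, -1, 0) are assumptions made for this example.

```python
from typing import List, Optional


def sma(prices: List[float], length: int) -> List[Optional[float]]:
    """Simple moving average; None until `length` bars are available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < length:
            out.append(None)
        else:
            window = prices[i + 1 - length : i + 1]
            out.append(sum(window) / length)
    return out


def crossover_signals(closes: List[float], fast: int, slow: int) -> List[int]:
    """One signal per bar: +1 = go long at the next open,
    -1 = go short at the next open, 0 = keep the current position."""
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    signals = [0] * len(closes)
    for t in range(1, len(closes)):
        if None in (fast_ma[t - 1], slow_ma[t - 1], fast_ma[t], slow_ma[t]):
            continue  # not enough history yet
        was_below = fast_ma[t - 1] < slow_ma[t - 1]
        was_above = fast_ma[t - 1] > slow_ma[t - 1]
        if was_below and fast_ma[t] > slow_ma[t]:
            signals[t] = 1   # fast crossed above slow: buy signal
        elif was_above and fast_ma[t] < slow_ma[t]:
            signals[t] = -1  # fast crossed below slow: sell signal
    return signals
```

Because the strategy is always in the market after the first signal, a position is simply held until the opposite signal appears, which is why bars without a crossover carry a 0.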

Step 3: Conduct a preliminary test of the investment strategy

After completing Step 2, which is defining investment rules and writing them down as scripts, the next step is to verify them. In practice, this means confirming that the code works correctly, as intended, and is consistent with the investment strategy assumptions.

At this stage, three key goals must be achieved:

Verify that all formulas and rules in the script are correct.
Determine whether formulas, rules, and their combinations behave as expected.
Confirm that the investment strategy meets theoretical expectations.

To achieve the first two goals, you need to generate a set of historical transactions and analyze them by answering four basic questions:

Are the formulas correct? We verify that the metrics that required manual creation are calculated correctly. For metrics that come with the testing platform, we can skip this step, provided we understand how they are calculated.
Are the trading rules correct? We check if the trading rules work as expected (e.g., a crossing of moving averages generates buy/sell signals).
Are the trades correct? We assess whether the system opens trades in the right direction, at the right price and at the right time.
Are the transactions in line with the theoretical assumptions of the strategy? We analyze whether the generated transactions reflect the expected behavior of the strategy. For example, in the case of a trend-following strategy, we expect that transactions will be made in the direction of the long-term trend.

To assess the above aspects, graphical presentations of financial instrument charts are helpful, with marked opening and closing positions and values of appropriate indicators. A review of several transactions will allow you to answer the questions posed.

To address the third goal, i.e. “does the strategy perform as theoretically expected?”, we perform a first test of the strategy on in-sample data. Although it may seem unusual to test the strategy before optimizing the parameters, we should already have some approximate parameters of the strategy that we believe should generate positive results.

At this stage, we primarily reject strategies that generate systematic losses. If a strategy steadily loses capital, this is a clear signal that further parameter optimization is unjustified. We expect the strategy to generate positive results, even at a low level. If this happens, it can be considered that the strategy works in accordance with theoretical assumptions.

If we consider that the system is working properly, we can move on to the next step. Otherwise, if the results are unsatisfactory or the strategy radically deviates from the theoretical assumptions, it may be necessary to reformulate the investment strategy (step 1) and investment principles (step 2).

At this stage, when we are already conducting the first tests of the strategy, it is necessary to determine many elements of the tests that, although not directly related to the strategies, have a significant impact on the final results. Such elements include, among others: the range of financial instruments, the time periods for in-sample and out-of-sample tests, the source of historical price data, and transaction costs.

All these elements are described in detail in the "Test Specification" document, which is a key reference point in the strategy testing process.

Step 4: Optimizing and assessing the stability of an investment strategy

This stage in the investment strategy development process is one of the most time-consuming, as it covers two key components: parameter range optimisation (based on in-sample data) and validation of this optimisation (on out-of-sample data). The goal of optimisation is to identify the parameter range that delivers the highest value of the objective function and to determine whether this range is robust (i.e. stable, as discussed below). The validation step assesses whether the optimised parameters also generate satisfactory results on out-of-sample data.

In the remainder of Step 4, we focus on two critical aspects of strategy testing: optimisation methods and stability (robustness). These elements are fundamental regardless of whether we’re working with a ready-made investment idea or building a strategy from scratch. Importantly, they are assessed simultaneously, as our goal is both:

to find a set of parameters that delivers the highest risk-adjusted return, and
to select the most stable solutions—those resistant to changing market conditions and less prone to overfitting.

Strategy Optimisation

The goal of strategy optimisation is to identify a parameter range that produces a risk-adjusted return that is both robust and as high as possible. This means we’re not only looking for optimal parameter values but also ensuring that small variations in those parameters do not lead to significant changes in performance.

We use two main optimisation methods in our process:

The Prioritised Step Search - This method involves optimising one variable (or two related variables) at a time, assuming all other variables remain constant. We then move on to the next variable. This approach is particularly useful when iteratively enhancing a strategy by adding new components during the testing and optimisation phase.
The Grid Search - This method involves full optimisation of all designated parameters by creating a wide range of possible parameter combinations. It is particularly effective for simpler strategies where all components are known upfront (e.g. strategies based on publicly available ideas), and we are primarily seeking the most effective parameter values under specific market conditions.
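The difference between the two methods can be sketched as follows. This is a simplified illustration: the objective callable stands in for a full historical simulation returning the objective function value, and the step search here handles only one variable at a time (not pairs of related variables). All names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, int]


def grid_search(
    space: Dict[str, List[int]], objective: Callable[[Params], float]
) -> List[Tuple[Params, float]]:
    """Evaluate every combination of the candidate parameter values."""
    names = list(space)
    results = []
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, objective(params)))
    return results


def prioritised_step_search(
    space: Dict[str, List[int]], start: Params, objective: Callable[[Params], float]
) -> Params:
    """Optimise one parameter at a time, in the order given by `space`,
    holding all other parameters at their current best values."""
    best = dict(start)
    for name in space:
        scored = [(objective({**best, name: value}), value) for value in space[name]]
        best[name] = max(scored)[1]
    return best
```

The grid search evaluates the product of all value lists, so its cost grows multiplicatively with each parameter, while the step search evaluates only the sum of the list lengths. The price of that saving is that the step search can miss interactions between parameters.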

Regardless of the method used, the overarching objective of optimisation is to find a stable parameter range. This means the value of the strategy’s objective function (in our case: MAR = CAGR%/Max Drawdown) should remain relatively consistent when the parameters are slightly changed within the optimised range. Stability is a key factor in ensuring that a strategy can withstand different market conditions and avoids excessive curve fitting to historical data.
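As a minimal sketch, the MAR ratio can be computed from an equity curve as shown below. The assumptions here are that the equity curve is a list of account values, that the test length is given in years, and that drawdown is expressed as a positive fraction of the preceding peak.

```python
from typing import List


def cagr(equity: List[float], years: float) -> float:
    """Compound annual growth rate over the whole equity curve."""
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def max_drawdown(equity: List[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def mar_ratio(equity: List[float], years: float) -> float:
    """MAR = CAGR / maximum drawdown."""
    drawdown = max_drawdown(equity)
    if drawdown == 0:
        return float("inf")  # no drawdown at all in the sample
    return cagr(equity, years) / drawdown
```

For an equity curve of 100, 120, 90, 150, 121 over two years, the CAGR is 10% and the maximum drawdown is 25% (from 120 to 90), which gives a MAR of about 0.4.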

Below is an example of unstable optimisation (commonly referred to as Spiky Space) for a strategy using two moving averages. The optimisation aims to find the optimal lengths of the moving averages that maximise the MAR ratio.

In unstable strategies, we observe isolated “peaks”—areas where MAR values are high but surrounded by significantly lower (often negative) results. This type of pattern signals instability, as even minor adjustments to the moving average lengths can lead to substantial changes in results. A strategy with such characteristics is harder to sustain under changing market conditions, calling into question its practical applicability.
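One simple way to flag such isolated peaks numerically is to compare each candidate cell of the optimisation grid with the average of its neighbours. The sketch below assumes the MAR values are stored in a rectangular list of lists (rows and columns being the two moving average lengths); the min_share threshold of 0.5 is an arbitrary illustrative value, not a criterion from our tests.

```python
from typing import List


def neighbour_mean(grid: List[List[float]], row: int, col: int) -> float:
    """Average MAR of the cells adjacent to (row, col), excluding the cell itself."""
    values = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if inside and (r, c) != (row, col):
                values.append(grid[r][c])
    return sum(values) / len(values)


def is_isolated_peak(
    grid: List[List[float]], row: int, col: int, min_share: float = 0.5
) -> bool:
    """True when the neighbours average less than `min_share` of the cell's value.
    Intended for cells with a positive MAR that are candidates for selection."""
    return neighbour_mean(grid, row, col) < min_share * grid[row][col]
```

A cell with a MAR of 1.5 surrounded by negative values is flagged as a peak, whereas a cell with a MAR of 0.7 surrounded by values of 0.5 to 0.6 is not, even though its own value is lower.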

Below is an example of a Donchian breakout strategy that we optimize for the number of breakout days used to open and close a position. The objective function, as before, is the MAR ratio (CAGR%/Max Drawdown).

In this strategy, MAR values remain stable over a wide range of parameter combinations, which means there is no so-called Spiky Space. This characteristic indicates greater stability of the strategy. The marked areas, corresponding to the global maximum of the objective function, indicate the most stable and potentially most effective parameter combinations.

A strategy with such stability is more suitable for use in real trading, as small changes in market conditions should not significantly affect its results or behavior.

Investment Strategy Robustness

As mentioned earlier, strategy robustness is a key element in the strategy development process. The term “robustness” refers to the ability to withstand and adapt to adverse conditions—a perfect analogy for an investment strategy. A robust strategy is one that is likely to remain profitable even under volatile and challenging market conditions, ensuring durability and long-term viability.

The five core types of strategy robustness:

Stability across a wide range of optimised parameters – the strategy should perform well not only with a single, perfectly tuned set of parameters but also across a broader range.
Monte Carlo simulation – assesses the impact of random changes in trade sequencing and market volatility to evaluate the strategy’s resilience to randomness.
Rolling time window stability – the strategy should deliver consistent results across different historical periods, not just in one selected timeframe.
Long/short bias stability – the strategy should generate a comparable number of long and short trades (assuming it is designed for both directions).
Cross-instrument robustness – the strategy should work across a wide basket of instruments, not just on a narrow group of assets.

Our objective is to build a strategy that is maximally robust, not just maximally profitable—those two qualities don’t always align. Robustness is our best safeguard in the unpredictable world of trading. As Larry Hite aptly put it: "We are not looking for the best strategy—we are looking for the most robust one."

Below is a brief overview of each robustness test, along with the success criteria that must be met for the test to be considered passed.

Stability Across a Wide Range of Optimised Parameters

Parameter stability is tested on both in-sample and out-of-sample data. Our objective is to ensure the strategy remains stable regardless of which dataset is used.

In the first step, we test parameter stability using in-sample data. Using the Grid Search method (or Prioritized Step Search when building a strategy from scratch), we define value ranges for all optimised parameters. These ranges should be wide enough to reliably assess the robustness of the strategy.

How to define these ranges?
Personally, I start with a heatmap and identify the areas where the objective function (MAR) is both stable and reaches its highest values. I then set the parameter ranges so that the ratio of the highest to the lowest value is at least 150%. For example, if the upper range for a moving average is 150 days, the lower bound should be set at 100 days, resulting in a 100–150 day range. I apply this approach to all parameters, ultimately creating wide value ranges for all optimised parameters.

Once all parameter ranges are set, I perform a full analysis of all possible parameter combinations using in-sample data. The key success criteria are:

All parameter combinations must produce a positive MAR, and
The maximum drawdown of any combination must not exceed 250% of the drawdown of the result with the highest MAR.

If any test produces a negative MAR or if the drawdown exceeds the 250% threshold, the strategy (or the tested component, in the case of Prioritized Step Search) is rejected. This indicates a lack of parameter stability, suggesting low robustness and insufficient suitability for dynamic market conditions.
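The in-sample criteria can be expressed as a short check. This sketch assumes each tested parameter combination has already been reduced to a pair (MAR, maximum drawdown), with drawdown as a positive fraction; the function name is hypothetical.

```python
from typing import List, Tuple

# One entry per parameter combination: (MAR, max drawdown as a positive fraction)
Result = Tuple[float, float]


def passes_in_sample_stability(results: List[Result], dd_limit: float = 2.5) -> bool:
    """Apply the two in-sample criteria:
    1) every combination has a positive MAR,
    2) no drawdown exceeds `dd_limit` times the drawdown of the best-MAR result."""
    _, best_dd = max(results, key=lambda result: result[0])
    all_positive = all(mar > 0 for mar, _ in results)
    drawdowns_ok = all(dd <= dd_limit * best_dd for _, dd in results)
    return all_positive and drawdowns_ok
```

For example, if the best combination has a MAR of 0.8 with a 20% drawdown, every other combination must keep its drawdown at or below 50% and its MAR above zero.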

Once the strategy passes the in-sample stability test, we move on to testing the same parameter ranges on out-of-sample data. Again, the key success criteria are:

All parameter combinations must show a positive MAR.
Although out-of-sample results are often weaker than in-sample, we establish a clear rejection threshold: the MAR may not decrease by more than 50% compared to the in-sample results (comparing the highest MAR values).
Additionally, the maximum drawdown on out-of-sample data must not exceed 150% of the in-sample drawdown.

If the strategy fails any of these criteria on out-of-sample data, it is deemed insufficiently robust and unfit for real-world implementation.
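A sketch of the out-of-sample check follows. Two interpretations are assumed here and should be adjusted to your own test specification: the MAR comparison uses the highest MAR in each dataset, and the drawdown comparison uses the largest drawdown across all tested combinations in each dataset.

```python
from typing import List, Tuple

# One entry per parameter combination: (MAR, max drawdown as a positive fraction)
Result = Tuple[float, float]


def passes_out_of_sample(
    in_sample: List[Result],
    out_of_sample: List[Result],
    max_mar_drop: float = 0.5,
    dd_limit: float = 1.5,
) -> bool:
    """Check positive MAR, limited MAR degradation and limited drawdown growth."""
    best_is_mar = max(mar for mar, _ in in_sample)
    best_oos_mar = max(mar for mar, _ in out_of_sample)
    worst_is_dd = max(dd for _, dd in in_sample)
    worst_oos_dd = max(dd for _, dd in out_of_sample)

    all_positive = all(mar > 0 for mar, _ in out_of_sample)
    mar_ok = best_oos_mar >= (1.0 - max_mar_drop) * best_is_mar
    dd_ok = worst_oos_dd <= dd_limit * worst_is_dd
    return all_positive and mar_ok and dd_ok
```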

This preliminary parameter stability assessment allows us to proceed to the next stage, where the strategy will undergo additional robustness testing—culminating in the Walk-Forward Analysis, which serves as the final validation. The ultimate goal is not merely to create a strategy that performs well historically, but one that possesses a genuine market edge and can operate effectively in future, under changing market conditions.

Monte Carlo Simulation

Monte Carlo simulation involves running a large number of simulated scenarios to assess how the strategy might perform under varying market conditions. The primary objective of this method is to evaluate the potential drawdown of the optimised strategy.

Monte Carlo simulation provides a more realistic view of potential fluctuations in the equity curve and the depth of possible drawdowns, offering a more accurate assessment of risk. It is also an ideal opportunity to compare the drawdown observed in the tests of optimised parameter ranges with the drawdown results generated through the Monte Carlo simulation, using a 99% confidence interval.

A strategy is considered robust if, in the Monte Carlo simulation:

The drawdown does not exceed 250% of the combined in-sample and out-of-sample drawdown (based on parameters optimised on in-sample data), and
The MAR ratio remains positive within the selected confidence interval.

In summary, the criteria for passing the Monte Carlo test are similar to those for parameter stability testing, with the key difference being that here, we are assessing performance under simulated market conditions rather than historical ones.
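A minimal sketch of the drawdown part of this test is shown below. It assumes the simplest form of Monte Carlo, which reshuffles the order of historical per-trade returns; it does not model changes in market volatility. Because reshuffling leaves the final equity unchanged, this variant cannot test the MAR criterion, which would need resampling with replacement or a similar method. Function names and the default of 1000 runs are illustrative.

```python
import random
from typing import List


def max_drawdown(equity: List[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def monte_carlo_drawdowns(
    trade_returns: List[float], runs: int = 1000, seed: int = 42
) -> List[float]:
    """Shuffle the trade order `runs` times; return the sorted max drawdowns."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    drawdowns = []
    for _ in range(runs):
        shuffled = trade_returns[:]
        rng.shuffle(shuffled)
        equity = [1.0]
        for trade_return in shuffled:
            equity.append(equity[-1] * (1.0 + trade_return))
        drawdowns.append(max_drawdown(equity))
    return sorted(drawdowns)


def percentile(sorted_values: List[float], q: float) -> float:
    """Value at quantile q (0..1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def passes_drawdown_criterion(
    simulated_dd: float, historical_dd: float, limit: float = 2.5
) -> bool:
    """Simulated drawdown must not exceed `limit` times the historical one."""
    return simulated_dd <= limit * historical_dd
```

In use, percentile(monte_carlo_drawdowns(trades), 0.99) would give the drawdown at the 99% level, which is then compared with the combined in-sample and out-of-sample drawdown.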

Rolling Time Window Stability

Rolling window stability testing involves evaluating the strategy’s annual and three-year returns across time windows that shift forward by one year. This analysis is conducted on both in-sample and out-of-sample data combined.

The process applies the strategy’s parameters optimised on in-sample data, sets a one-year or three-year trading window, and then shifts this window forward in one-year increments.

We then assess what proportion of these annual and three-year periods produced positive returns. A strategy is considered robust if it delivers profitable results in at least 70% of the annual and three-year rolling windows.
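The rolling window test can be sketched as follows, under the simplifying assumption that the strategy's results are already available as a list of consecutive annual returns (as fractions). Three-year returns are then obtained by compounding three consecutive annual returns. The function names are hypothetical.

```python
from typing import List


def rolling_window_returns(annual_returns: List[float], window_years: int) -> List[float]:
    """Compounded return of each window, shifted forward one year at a time."""
    out = []
    for start in range(len(annual_returns) - window_years + 1):
        growth = 1.0
        for yearly in annual_returns[start : start + window_years]:
            growth *= 1.0 + yearly
        out.append(growth - 1.0)
    return out


def share_positive(returns: List[float]) -> float:
    """Fraction of windows that ended with a profit."""
    return sum(1 for r in returns if r > 0) / len(returns)


def passes_rolling_window_test(annual_returns: List[float], min_share: float = 0.7) -> bool:
    """At least `min_share` of both the 1-year and 3-year windows must be profitable."""
    return all(
        share_positive(rolling_window_returns(annual_returns, years)) >= min_share
        for years in (1, 3)
    )
```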

Long/Short Bias Stability

In many markets, there is a natural tendency for prices to move upward—known as Long Bias—which makes building strategies focused on bullish scenarios often easier than those aimed at shorting. However, optimising a strategy solely based on bullish historical data can lead to problems if the market enters a prolonged downtrend. In such conditions, the strategy may suffer significant losses.

To determine whether a strategy exhibits a Long Bias—or less commonly, a Short Bias—you should analyse the historical distribution of long and short trades. Ideally, this distribution should be close to 50%/50%. If one direction is strongly favoured (e.g. 70%/30%), the strategy may prove unstable in real market conditions.

A strategy is considered robust if it shows no more than 60% bias in either direction.
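As a short illustration, the bias check reduces to counting trade directions. The sketch assumes each historical trade is represented by +1 (long) or -1 (short).

```python
from typing import List


def long_share(directions: List[int]) -> float:
    """Fraction of trades that were long (+1 = long, -1 = short)."""
    return sum(1 for d in directions if d > 0) / len(directions)


def passes_bias_test(directions: List[int], max_share: float = 0.6) -> bool:
    """Neither direction may account for more than `max_share` of all trades."""
    share = long_share(directions)
    return max(share, 1.0 - share) <= max_share
```

With 11 long and 9 short trades, the long share is 55% and the test passes; with 14 long and 6 short trades, the long share is 70% and the test fails.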

Instrument Portfolio Robustness

In this step, we assess how the strategy’s performance is distributed across different instruments in the portfolio. The goal is to avoid scenarios where the strategy’s positive results are driven solely by a small number of exceptionally well-performing instruments.

To evaluate this, we analyse both in-sample and out-of-sample data combined and calculate the percentage of instruments with a profit factor greater than 1—indicating a positive contribution to the overall performance of the strategy.

Our expectations are as follows:

For the portfolio with the highest MAR (based on in-sample data), at least 80% of instruments should have a profit factor > 1.
For the portfolio with the lowest MAR (also based on in-sample data), at least 70% of instruments should have a profit factor > 1.

If both conditions are met, the strategy can be considered robust across a broad basket of financial instruments.
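The instrument-level check can be sketched as below. It assumes the per-trade profit and loss figures are grouped by instrument in a dictionary; the instrument symbols in the usage note are made up for illustration.

```python
from typing import Dict, List


def profit_factor(trade_pnls: List[float]) -> float:
    """Gross profit divided by gross loss for one instrument."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


def share_profitable_instruments(pnl_by_instrument: Dict[str, List[float]]) -> float:
    """Fraction of instruments whose profit factor is greater than 1."""
    profitable = sum(
        1 for pnls in pnl_by_instrument.values() if profit_factor(pnls) > 1.0
    )
    return profitable / len(pnl_by_instrument)
```

The resulting share is then compared with the 80% threshold for the highest-MAR portfolio and the 70% threshold for the lowest-MAR portfolio. For instance, a hypothetical basket where four of five instruments (say ES, CL, ZB and EURUSD, but not GC) have a profit factor above 1 yields a share of 80%.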

Step 5: Walk-Forward Analysis

Walk-Forward Analysis (WFA) is considered the most reliable method to assess the stability of a strategy because it best reflects changing market conditions and the strategy's ability to adapt to these changes.

Additionally, it allows you to answer key questions:

What return can be expected from the strategy? The optimization result often shows an overstated return, which leads to unrealistic expectations. WFA provides a more reliable measure of return.
What set of parameters to use in the next period? Thanks to WFA it is possible to adjust the strategy parameters to the latest market changes.

Walk-Forward Analysis involves testing an investment strategy over multiple time periods, which reduces the risk of overfitting (fitting the strategy too closely to historical data). The WFA process consists of two iterative steps:

Optimization: The strategy is optimized on a training period (in-sample) where parameters are adjusted to obtain the best results.
Testing: The strategy, with parameters optimized in the previous step, is tested on a test period (out-of-sample) that was not used during optimization.
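The iteration over the two steps can be sketched by generating the window boundaries. This example assumes rolling (not anchored) in-sample windows that move forward by the length of the out-of-sample window; window lengths are expressed in bars and the function name is hypothetical.

```python
from typing import List, Tuple

# (in-sample start, in-sample end, out-of-sample start, out-of-sample end);
# all values are bar indices and the end indices are exclusive.
Window = Tuple[int, int, int, int]


def walk_forward_windows(n_bars: int, in_sample: int, out_of_sample: int) -> List[Window]:
    """Split the history into consecutive optimise/test window pairs."""
    windows = []
    start = 0
    while start + in_sample + out_of_sample <= n_bars:
        is_end = start + in_sample
        windows.append((start, is_end, is_end, is_end + out_of_sample))
        start += out_of_sample  # step forward by one out-of-sample period
    return windows
```

With 10 bars, an in-sample length of 4 and an out-of-sample length of 2, this produces three pairs: optimise on bars 0-3 and test on 4-5, optimise on 2-5 and test on 6-7, optimise on 4-7 and test on 8-9.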

The key element of WFA is the Walk-Forward Efficiency (WFE) measure, which allows us to assess whether the strategy can be effective in real market conditions. WFE compares the rate of return achieved in the out-of-sample window with the rate of return achieved in the in-sample window (where parameters were optimized); the same comparison is made for the drawdown. A strategy considered stable (robust) should have a WFE of at least 50% for the rate of return and at most 150% for the drawdown.
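As a minimal sketch, WFE can be computed as the ratio of the out-of-sample value to the in-sample value. The assumption here is that returns are annualised, so that windows of different lengths are comparable, and that drawdowns are positive fractions; the exact definition used by a given testing platform may differ.

```python
def walk_forward_efficiency(in_sample_value: float, out_of_sample_value: float) -> float:
    """Ratio of the out-of-sample result to the in-sample result."""
    return out_of_sample_value / in_sample_value


def passes_wfa(
    is_return: float, oos_return: float, is_drawdown: float, oos_drawdown: float
) -> bool:
    """WFE must be at least 50% for return and at most 150% for drawdown."""
    return_wfe = walk_forward_efficiency(is_return, oos_return)
    drawdown_wfe = walk_forward_efficiency(is_drawdown, oos_drawdown)
    return return_wfe >= 0.5 and drawdown_wfe <= 1.5
```

For example, an annualised return of 20% in-sample and 12% out-of-sample gives a return WFE of 60%, and a drawdown of 15% in-sample and 18% out-of-sample gives a drawdown WFE of 120%, so both criteria are met.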

We perform WFA on the range of parameters optimized in the step "Stability Across a Wide Range of Optimised Parameters". Thus, WFA is the next step in verifying the stability of the parameter ranges we have selected.

WFA also has a practical function – it allows you to determine what set of parameters to use in the next period in real transactions. In the first step, you need to determine the lengths of the in-sample and out-of-sample windows to use. To do this:

We analyze the WFE (Walk-Forward Efficiency) values for different combinations of in-sample and out-of-sample windows,
We choose the combination that gives the highest WFE for return and drawdown.
If the WFE results are similar, it is worth using the equity curve and choosing the window that gives more stable results, i.e. the most regular drawdown profile.

After selecting the length of the in-sample and out-of-sample windows, we check what set of parameters was returned by WFA in the last optimization period. This set can be used in the next period in real transactions.

The parameter sets obtained in this way are also presented in the strategy reports, provided they pass the WFA test and previous stability tests.

Step 6: Real-time strategy utilization

After extensive testing, implementing a real-time trading strategy becomes relatively easy. Buy/sell signals and stop-loss orders are generated automatically by the computer based on pre-established formulas and rules. The key, however, is to consistently execute all signals without exception, which in practice can be more difficult than it seems. As Larry Williams noted, “Trading strategies work. Traders do not.”

Before making a final decision to add a strategy to the portfolio of investment strategies, it is necessary to verify whether it brings real added value to the results of the entire portfolio. It does not make sense to introduce a strategy that generates signals similar to those of existing strategies or has a similar equity curve. Therefore, the assessment is made based on several key criteria:

Daily Return Correlation – The lower the correlation with other strategies, the better. The best results are obtained with a correlation close to zero or negative.
Reduction of maximum drawdown – if adding a strategy to a portfolio results in a reduction of the maximum drawdown, this is a very good signal.
Improvement of the objective function – in this case the measure is the MAR indicator. The improvement of MAR after adding a new strategy indicates its added value to the portfolio.
Better results in Monte Carlo simulation – Monte Carlo simulation, as mentioned earlier, is used to determine the potential maximum drawdown. If the simulation results improve after adding a new strategy, this is a strong positive signal.

The above elements often overlap – usually all of them are met or none of them are met.
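The first criterion, daily return correlation, can be sketched with a plain Pearson correlation. The example assumes that daily returns of the candidate and of each existing strategy are aligned on the same dates; the strategy names used as dictionary keys are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlations_with_portfolio(
    candidate: List[float], existing: Dict[str, List[float]]
) -> Dict[str, float]:
    """Correlation of the candidate's daily returns with each existing strategy."""
    return {name: pearson(candidate, returns) for name, returns in existing.items()}
```

Values close to zero or below zero for every existing strategy suggest that the candidate adds diversification; values close to one suggest that it largely duplicates something already in the portfolio.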

After deciding to include a strategy in a portfolio, the question may arise: when to start using it? Should it be done right away or is it better to wait? Some studies suggest introducing an "incubation" period of 3-6 months. During this time, the strategy is monitored without taking real transactions. The generated signals, positions and results are observed in order to detect potential irregularities in its operation.

In our case, the incubation period lasts from the moment the strategy is launched in the live environment until it experiences a drawdown of about half the maximum drawdown observed on historical data. Only after this threshold is reached does the strategy start to be applied using real funds.
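This incubation rule can be sketched as a simple threshold check. The example assumes the paper-trading equity curve is tracked from the launch date and that "about half" is taken as exactly 50% of the historical maximum drawdown; both the function name and the fraction parameter are illustrative.

```python
from typing import List


def max_drawdown(equity: List[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def incubation_complete(
    paper_equity: List[float], historical_max_dd: float, fraction: float = 0.5
) -> bool:
    """True once the paper-traded strategy has experienced a drawdown of at
    least `fraction` of the maximum drawdown seen in historical tests."""
    return max_drawdown(paper_equity) >= fraction * historical_max_dd
```

With a historical maximum drawdown of 30%, real funds would be committed once the paper equity curve has fallen at least 15% from a peak.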

Step 7: Monitoring Results

The investor should constantly monitor the effectiveness of the strategy in real time. It is crucial to compare historical indicators and measures of the strategy, such as: percentage of profitable transactions, average profit and loss of the position, profit factor, MAR, Sharpe, R3 and other indicators used during strategy optimization, with the results generated on an ongoing basis. Additionally, indicators describing the transaction profile of the strategy should be analyzed - including "Trading Performance" and "Win/Loss Statistics".

A properly designed and tested strategy should behave in real-world conditions similarly to how it did during testing. However, if deviations do occur, it is essential to understand their causes. An important limitation is time – before drawing any conclusions, it is necessary to wait several months to collect a sufficiently large sample of data for analysis.

Some market practitioners suggest introducing stop loss mechanisms for strategies, similarly to individual positions. Once such a threshold is reached, the strategy is no longer used. Examples of stop losses include:

a specific drawdown value (e.g. resulting from a Monte Carlo simulation at 99%),
number of losing trades in a row.

We currently do not have mechanisms in place to disable a strategy when certain thresholds are reached, but this is an interesting enough topic that it may be considered in the future.

Step 8: Improving your investment strategy

Continuously monitoring the performance of an investment strategy provides valuable information about its strengths and weaknesses. Such analysis, combined with observations of changing market conditions, can provide ideas for improving the strategy. However, any adjustments made should be carefully tested before being implemented in real transactions to avoid the risk of unforeseen consequences.

Additionally, regular Walk-Forward Analysis (WFA) allows the strategy parameters to be adjusted to changes in market behavior, which increases its effectiveness in a dynamically changing market environment.