...

Test Case Designer cannot do much about the first one, so we will talk primarily about the benefits related to data availability and data quality. After all, 80% of a data scientist’s time is spent preparing the training dataset.

...

  1. AI algorithm itself: Low
  2. Hyperparameter configuration: Low
  3. Training, validation, and test data: Medium
  4. Integration of the AI system with other workflow elements: High

The rest of the article covers phases 2-4 in more detail. Regarding phase 1, significant customization of the algorithm code is not as prominent and, to borrow a quote from Ron Schmelzer, “There’s just one way to do the math!”, so Test Case Designer’s core value proposition of exploring possible combinations is not as relevant (i.e., low applicability due to the “linear” nature of the operations).

...

Strength: Systematic approach to identifying relevant hyperparameter configuration profiles.

Weakness: May explore profiles with too many changes at a time or require numerous constraints to limit the scope.
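For illustration, here is a minimal sketch of how a hyperparameter space could be modeled as parameters and values and turned into a pairwise-covering set of configuration profiles. The hyperparameters and value lists are hypothetical, and the open-source allpairspy package is used only as a stand-in for TCD’s combinatorial engine.

# Hypothetical hyperparameter space modeled as parameters with value lists.
from allpairspy import AllPairs

hyperparameters = [
    [0.001, 0.01, 0.1],          # learning rate
    [32, 64, 128, 256],          # batch size
    ["adam", "sgd", "rmsprop"],  # optimizer
    [0.0, 0.2, 0.5],             # dropout rate
    [2, 3, 4],                   # number of hidden layers
]

# Every pair of values across any two hyperparameters appears in at least one
# profile, at a fraction of the full grid (3 * 4 * 3 * 3 * 3 = 324 combinations).
for i, profile in enumerate(AllPairs(hyperparameters), start=1):
    print(f"profile {i}: {list(profile)}")

Constraints could then be layered on top, for example to exclude profiles that change too many settings relative to a baseline, which is exactly where the weakness above tends to surface.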

...

Robo-advisors are a popular application of AI/ML systems in finance. They use online questionnaires to obtain information about the client’s degree of risk aversion, financial status, and desired return on investment. For this example, we will use Fidelity GO.

To build the corresponding model in TCD, you will need to (temporarily) set aside some of the lessons about parameter & value definitions that apply under other objectives. Instead of optimizing the scenario count, the goal of this data set is to be a representative sample of the real world and to eliminate as much human bias as possible. This means not just data quality, but also completeness.

Such a model would not only include all parameters regardless of their impact on the business outcome but also utilize lengthy, highly detailed value lists (often more than 10 values per parameter). To distinguish between the review and the “consumption” formats, value names or value expansions can be adjusted accordingly (e.g., the value name can be “sell some” for communication to stakeholders, while the expansion can be “3” given the data encoding).
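As a sketch of that review-versus-consumption distinction, the snippet below keeps stakeholder-friendly value names alongside their encoded expansions. The parameter names, values, and encodings are hypothetical illustrations of a robo-advisor questionnaire, not Fidelity GO’s actual inputs; only the “sell some” / “3” pairing comes from the example above.

# Hypothetical "review" value names mapped to their encoded "consumption" expansions.
REVIEW_TO_CONSUMPTION = {
    "risk_aversion": {"very low": 1, "low": 2, "moderate": 3, "high": 4, "very high": 5},
    "market_drop_reaction": {"sell all": 1, "sell some": 3, "hold": 5, "buy more": 7},
    "investment_horizon_years": {"under 3": 2, "3 to 10": 7, "over 10": 15},
}

def to_consumption_format(review_row: dict) -> dict:
    """Translate a human-readable scenario into the encoding the pipeline ingests."""
    return {
        parameter: REVIEW_TO_CONSUMPTION[parameter][value]
        for parameter, value in review_row.items()
    }

# "sell some" is what stakeholders review; its expansion "3" is what the model consumes.
print(to_consumption_format({
    "risk_aversion": "moderate",
    "market_drop_reaction": "sell some",
    "investment_horizon_years": "over 10",
}))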

...

This phase is the closest to TCD’s “bread and butter.” The model would serve a dual purpose: 1) smoke testing of the AI; 2) integration testing of how it is operationalized.
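A minimal sketch of that dual purpose is below, assuming the generated scenarios are exported to a CSV file; get_recommendation, ALLOWED_PORTFOLIOS, and downstream_received are hypothetical placeholders for the real system hooks, stubbed out here so the example runs.

import csv
import pytest

# Placeholder hooks; in a real suite these would call the AI system and the
# downstream workflow it feeds.
ALLOWED_PORTFOLIOS = {"conservative", "balanced", "aggressive"}

def get_recommendation(scenario: dict) -> dict:
    return {"portfolio": "balanced"}  # stub

def downstream_received(recommendation: dict) -> bool:
    return True  # stub

def load_scenarios(path: str = "tcd_scenarios.csv") -> list:
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

@pytest.mark.parametrize("scenario", load_scenarios())
def test_smoke_ai_returns_valid_recommendation(scenario):
    # 1) Smoke test of the AI: every generated profile yields a well-formed output.
    assert get_recommendation(scenario)["portfolio"] in ALLOWED_PORTFOLIOS

@pytest.mark.parametrize("scenario", load_scenarios())
def test_integration_recommendation_reaches_downstream(scenario):
    # 2) Integration test: the recommendation propagates to the consuming systems.
    assert downstream_received(get_recommendation(scenario))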

...

Given the execution setup, you would likely have to keep all the factors consumed by the AI system but, for this phase, reduce the number of values based on importance (both business- and algorithm-wise).

Scenario volume would still be largely driven by the “standard” integration priorities (i.e., key parameters affecting multiple systems), but the number of values and/or the average mixed-strength dropdown selection would be higher than typical.
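For instance, continuing the hypothetical robo-advisor parameters from Phase 3, the Phase 4 model could keep every factor but trim each value list before generating scenarios. allpairspy again stands in for TCD’s generator; an actual mixed-strength run (e.g., 3-way coverage on the key parameters only) would be configured in TCD itself.

from allpairspy import AllPairs

# Phase 3 used long, detailed value lists; Phase 4 keeps all factors but only
# the values that matter most for the integration workflow (hypothetical lists).
phase4_values = {
    "risk_aversion": ["low", "moderate", "high"],
    "market_drop_reaction": ["sell all", "hold", "buy more"],
    "investment_horizon_years": ["under 3", "over 10"],
    "account_type": ["individual", "ira"],
}

for i, scenario in enumerate(AllPairs(list(phase4_values.values())), start=1):
    print(f"scenario {i}: {dict(zip(phase4_values, scenario))}")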

...

  1. Test Case Designer at its best, with its thoroughness, speed, and efficiency benefits.
  2. Ability to quickly reuse model elements from Phase 3 and models related to other systems (e.g., the old version of the non-AI advisor for systems B and C).
  3. Higher control over the variety of data at the integration points and over the workflow as a whole.


Weakness: Similar to Phase 3 but usually more manageable given the difference in goals (volume in P3 vs. integration in P4).

Conclusion

...

  1. AI algorithm itself: Low
  2. Hyperparameter configuration: Low
  3. Training, validation, and test data: Medium
  4. Integration of the AI system with other workflow elements: High


From another perspective, using this stage classification from Infosys, Test Case Designer can deliver the most significant benefits in the highlighted testing areas:

...

Given the typical scale of AI projects, the number of possible input and output combinations is practically unbounded. Moreover, the techniques used to implement self-learning elements are very complex.
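As a rough, back-of-the-envelope illustration (the 20-parameter, 10-value figures are arbitrary assumptions, not project data):

# 20 input parameters with 10 values each already produce a full grid far
# beyond anything executable, while a pairwise-covering set stays small.
parameters, values_per_parameter = 20, 10

full_grid = values_per_parameter ** parameters
print(f"full grid: {full_grid:.2e} combinations")  # 1.00e+20

# Any pairwise set needs at least v^2 = 100 scenarios (all value pairs of any
# two parameters must appear), and the required count grows only slowly
# (roughly logarithmically) as more parameters are added.
pairwise_floor = values_per_parameter ** 2
print(f"pairwise scenarios needed: at least {pairwise_floor}")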

...