Training Myths in Endurance Sports, Part 6

A narrative essay by Nemon, December 2025 / January 2026

May 03, 2026

This article series was originally published in German on this Substack and is now being made available in English, one chapter at a time.

A note upfront: this is a private work. I do not follow every formal standard of academic publishing, nor are the texts polished to the last stylistic detail. I do this in my spare time, and I don’t have an institute with a diligent team of assistants to handle the tedious, time-consuming legwork for me ;)

01 Preface and Introduction
02 Lactate and “Muscle Burn”
03 Energy Systems and Thresholds
04 LIT/MICT/HIT and the “Aerobic Base”
05 Fat Metabolism and the Crossover Point
06 Surrogate Markers and Target Performance
07 High Intensity, Low Volume – Efficiency as a Principle
08 False Expectations and Misconceptions
09 Side Chapter: Anthropology and Evolution

06 – Surrogate Markers and Target Performance

After lactate, energy systems, Zone 2, and fat metabolism, the focus now shifts to the metrics that are most highly valued in endurance sports: VO₂max, lactate and ventilatory thresholds, Critical Power/FTP, heart rate variability (HRV), and related parameters. These metrics have their uses – as measurement and orientation tools – but in practice they are often elevated to standalone training goals, even though by their very nature they are surrogate markers: proxies for what actually matters, such as concrete race performance, real-world functional capacity, or a stable state of health. These actual outcomes are in principle directly observable, but they cannot be captured in a single number like VO₂max or a threshold value.

This chapter initially focuses on competitive sport, where the critical stance toward isolated marker optimization applies most sharply. For people who train without competitive ambitions, the situation is different – this will be specifically addressed at the end of the chapter.

What Surrogate Markers Can Do – and What They Cannot

Surrogate markers are measurable quantities that correlate with relevant outcomes and are therefore used as indirect targets – in the hope that optimizing them is the path to the actual goal. In endurance sports, these include above all VO₂max or cardiorespiratory fitness (CRF) as a measure of maximal oxygen transport and utilization capacity, lactate, ventilatory, and performance-based thresholds (e.g., LT1/LT2, VT1/VT2, MLSS, Critical Power) that mark specific points on exertion curves, as well as heart rate-based metrics and, increasingly, HRV-derived thresholds and steering parameters.

The practical value of surrogate markers lies first and foremost in enabling standardized testing, longitudinal comparisons, and the classification of training status and adaptations – for instance, before and after training blocks. They condense complex system states (cardiovascular capacity, metabolic shifting, autonomic balance) into manageable numbers that can be communicated and tracked. In this way, they provide a shared language among athlete, coach, and diagnostician.

At the same time, each marker captures only a slice of reality: VO₂max, for example, primarily describes maximal oxygen uptake – not technique, tactics, neuromuscular explosiveness, or mental toughness in competition. Moreover, their predictive power is context-dependent. In heterogeneous populations, CRF is a strong health and fitness marker; in elite samples that are already homogeneous on these criteria, VO₂max loses considerable discriminatory power – a phenomenon that runs through all endurance disciplines.
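The loss of discriminatory power in homogeneous samples is a general statistical effect (range restriction) and can be sketched with a small simulation. Everything in it is an illustrative assumption rather than measured data: the population mean and spread of VO₂max, the toy performance score, and the 5% “elite” cutoff.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=20000, elite_share=0.05, seed=1):
    rng = random.Random(seed)
    # Hypothetical mixed population: VO2max in ml/min/kg with a wide spread.
    vo2 = [rng.gauss(45, 10) for _ in range(n)]
    # Toy performance score: VO2max plus "everything else"
    # (economy, technique, tactics, ...), modeled as random noise.
    perf = [v + rng.gauss(0, 5) for v in vo2]
    r_all = pearson(vo2, perf)
    # "Elite" subsample: only the top share by VO2max, i.e. a narrow range.
    cutoff = sorted(vo2)[int(n * (1 - elite_share))]
    elite = [(v, p) for v, p in zip(vo2, perf) if v >= cutoff]
    r_elite = pearson([v for v, _ in elite], [p for _, p in elite])
    return r_all, r_elite


r_all, r_elite = simulate()
print(f"whole population: r = {r_all:.2f}")
print(f"top 5% by VO2max: r = {r_elite:.2f}")
```

In this toy setup the relationship between VO₂max and performance is identical for everyone, yet the correlation comes out clearly lower in the narrow top group, simply because the remaining spread in VO₂max is small relative to the other factors.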

Particularly relevant is a practical insight from training research and coaching: surrogate markers can be misleading when they are made the primary training objective. A parameter may correlate with performance, but optimizing it does not automatically lead to better race results – because, as noted, other factors (economy, technique, tactics, mental toughness, pacing ability) determine actual success at least as much. An athlete can raise their VO₂max without getting meaningfully faster. Another can deliver a markedly better performance without any notable VO₂max improvement, because they run more economically or race more tactically.

For this analysis, therefore, the central point is that surrogate markers are tools, not goals. They are suited to structuring and monitoring the path – but they should not define where the path leads.

Lactate, Thresholds, VO₂max, HRV, and Their Role in Training

Lactate and ventilatory thresholds were already critiqued in chapters 2 and 3: they treat arbitrary points on smooth, step-free curves as though they were natural boundaries between distinct operating modes. In reality, the lactate and ventilation curves are continuous – there is no hard edge or step where metabolism “switches over.” At most there are subtle changes in slope, but no true discontinuities. The various definition methods (4 mmol, Dmax, individual turnpoints, ventilatory markers) merely select different points on this smooth curve, which explains why threshold values differ considerably depending on the method used. On top of this come methodological problems with lactate measurement itself (sampling timepoints, sample collection, analytics), which introduce additional scatter. Both together – the artificial threshold concept and measurement uncertainty – substantially limit reproducibility and predictive power.
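How two definitions pick different points on the same smooth curve can be illustrated with a short sketch. The exponential curve shape, its three parameters, and the 100 to 350 W test range are hypothetical values chosen for illustration; the Dmax variant here is a simplified version that works directly on the assumed curve instead of a fitted polynomial.

```python
import math

# Hypothetical smooth lactate curve (mmol/l) as a function of power (W).
# The exponential form and all three parameters are illustrative assumptions.
BASELINE, SCALE, RATE = 1.0, 0.05, 0.015


def lactate(power_w):
    return BASELINE + SCALE * math.exp(RATE * power_w)


def fixed_threshold(target_mmol=4.0):
    """Power at which the curve crosses a fixed lactate value."""
    return math.log((target_mmol - BASELINE) / SCALE) / RATE


def dmax_threshold(p_start, p_end, step=0.1):
    """Dmax-style point: largest gap between the curve and the chord that
    joins the first and last test stage. The perpendicular distance to the
    chord is a constant multiple of the vertical gap, so maximizing the
    vertical gap finds the same point."""
    l_start, l_end = lactate(p_start), lactate(p_end)
    slope = (l_end - l_start) / (p_end - p_start)
    best_p, best_gap = p_start, 0.0
    steps = int(round((p_end - p_start) / step))
    for i in range(steps + 1):
        p = p_start + i * step
        gap = (l_start + slope * (p - p_start)) - lactate(p)
        if gap > best_gap:
            best_p, best_gap = p, gap
    return best_p


p_fixed = fixed_threshold()          # fixed 4 mmol/l criterion
p_dmax = dmax_threshold(100, 350)    # Dmax-style criterion
print(f"4 mmol/l threshold: {p_fixed:.0f} W")
print(f"Dmax-style threshold: {p_dmax:.0f} W")
```

On this made-up curve the two “thresholds” land more than 10 W apart, although the curve itself has no kink anywhere: the difference comes entirely from the definition.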

VO₂max: “Gold Standard” With Limits – and Epistemological Problems

VO₂max – maximal oxygen uptake – is often regarded as the gold standard in sports and clinical research. This claim typically rests on epidemiological data showing a robust association between measured CRF and mortality. However, considerable caution is warranted here: epidemiology can, at best, provide hints – and even that under problematic conditions. The method is highly susceptible to bias, methodological manipulation (p-hacking, selective data mining/cherry picking), and misinterpretation. Often enough, epidemiology itself generates the patterns it subsequently claims to describe. Moreover, very few people possess sufficient statistical competence to detect such distortions.

An instructive example is a recent Mendelian randomization study (MR study on VO₂max and longevity, Kjaergaard et al. 2024; see references): while VO₂max is consistently associated with longevity in observational studies, this genetic analysis – which is better suited to establishing causality than mere observation – showed that genetically predicted VO₂max had no significant association with lifespan. (Note: Mendelian randomization methods are themselves methodologically contested and should not be taken uncritically, but they at least point toward possible confounding.)

This suggests that the epidemiological association may not be causal but rather driven by other factors (better general health, higher income, lifestyle, overall activity level).

Added to this is a methodological problem specific to sports research. A systematic review of VO₂max intervention studies found that, across 27 included studies, bias risks were predominantly high or unclear, and only about 7% reported adequately randomized sequence generation (see references: “Risk of bias and reporting practices in studies comparing VO₂max outcomes,” 2021). Complementary current reviews on CRF diagnostics show that at the individual level, VO₂max is limited by considerable measurement variability and random fluctuation – there too, VO₂max is discussed as a population-level marker of reasonable utility but only conditionally stable at the individual level (see references: “Assessing cardiorespiratory fitness in clinical and community settings,” 2024).

The media routinely generate headlines with relative risks – “20% improvement!” – while the absolute numbers may be vanishingly small. A typical schematic example: a study shows that a particular form of training reduces mortality risk from 2% to 1.6%. In absolute terms, that is a reduction of 0.4 percentage points – but in relative terms it becomes a “20% risk reduction,” which turns into a headline sensation. The practical relevance for the individual remains questionable, and a headline of this kind does not amount to a robust statement about general or individual risk.
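The arithmetic behind the schematic example can be written out in a few lines. The function name and the group size of 1,000 people are illustrative choices; the two risk values are the ones from the example above.

```python
def risk_summary(baseline_risk, treated_risk, group_size=1000):
    """Express the same change in risk in absolute and relative terms."""
    absolute_reduction = baseline_risk - treated_risk
    relative_reduction = absolute_reduction / baseline_risk
    return {
        "absolute_reduction_pct_points": absolute_reduction * 100,
        "relative_reduction_pct": relative_reduction * 100,
        "deaths_per_group_before": baseline_risk * group_size,
        "deaths_per_group_after": treated_risk * group_size,
    }


# Schematic example: mortality risk drops from 2% to 1.6%.
summary = risk_summary(0.02, 0.016)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Per 1,000 people, that is 20 versus 16 deaths: a difference of 4 people, or 0.4 percentage points, which the relative framing turns into “20%.”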

Even the prominent Mandsager study (2018, >122,000 subjects; see references), often cited as proof that ever-higher CRF/VO₂max extends life expectancy, explicitly states in its limitations: “The association between CRF and mortality does not prove causation.” (Study authors are expected to describe the weaknesses of their work.) Unmeasured confounders – socioeconomic status, ethnic factors, overall activity level – could fully explain the observed association. The epidemiological rhetoric around VO₂max thus often suggests a certainty that is methodologically simply not there.

What matters here is less the caveat – almost formulaic in observational studies – that association does not prove causation, but rather the structural limitation of the design itself: even with >122,000 subjects and high statistical power, residual confounders and selection mechanisms remain in principle uncontrollable. Other methodological approaches, such as Mendelian randomization analyses or bias reviews, are likewise not free of problems, but they at least demonstrate how easily strong VO₂max gradients can be explained without any genuine causal “VO₂max lever.” Put simply: even when two things are closely linked in a massive study, that still does not mean one causes the other.
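A toy simulation can make this point concrete. In the sketch below, a hidden “general health” factor drives both VO₂max and lifespan, while VO₂max itself has no effect on lifespan at all. All names, effect sizes, and noise levels are invented for illustration and are not taken from any of the cited studies.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def residuals(ys, xs):
    """Residuals of a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=20000, seed=7):
    rng = random.Random(seed)
    # Hidden confounder (hypothetical "general health" score).
    health = [rng.gauss(0, 1) for _ in range(n)]
    # Both outcomes depend on health; lifespan does NOT depend on VO2max.
    vo2 = [40 + 6 * h + rng.gauss(0, 4) for h in health]
    lifespan = [78 + 4 * h + rng.gauss(0, 6) for h in health]
    raw = pearson(vo2, lifespan)
    # Correlation that remains once the confounder is accounted for.
    adjusted = pearson(residuals(vo2, health), residuals(lifespan, health))
    return raw, adjusted


raw, adjusted = simulate()
print(f"raw correlation VO2max vs lifespan: {raw:.2f}")
print(f"after adjusting for the confounder: {adjusted:.2f}")
```

The raw correlation is substantial even though no causal path from VO₂max to lifespan exists in the model, and it disappears once the confounder is adjusted for. In real cohorts the relevant confounders are typically unmeasured, so that adjustment step is not available.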

The Whole Package Matters

This does not mean that VO₂max is worthless. But its value lies not in epidemiological long-term studies or longevity theories, but in something more direct and more honest: CRF is a measurable marker indicating that the cardiovascular system is trained and adapted. A trained cardiovascular system works more economically, recovers faster, is more resilient in daily life, and reduces orthopedic and metabolic vulnerability – this can be directly observed and experienced in practice, not merely suspected epidemiologically or modeled on the grand mixing console of data. Whoever raises their VO₂max is investing in a more stably functioning organ and muscle system. That is a sufficient and honest justification, without resorting to methodologically questionable longevity rhetoric.

However, what was already emphasized in earlier chapters remains decisive: VO₂max alone is not an adequate marker for health or performance. Muscular strength, body composition, and functional capacity are at least equally important. An athlete with excellent VO₂max but weak musculature and poor mobility is no more robust or healthy than someone with moderate VO₂max but good strength and flexibility. For genuine functional health in daily life – let alone in competition – the whole package matters: sufficient endurance fitness, preserved muscle mass and strength, mobility, and everyday resilience.

A digression that cannot be fully explored here: practical “workload capacity” in everyday life is generally more important than long-distance endurance on a piece of sports equipment. Carrying crates of drinks or moving apartments illustrates the point: what matters from this perspective is which everyday demands can be met, and in what time. In most cases, it is not maximal strength or long-distance endurance that counts, but a strength-endurance and athletic profile of the kind that functional forms of Metcon and resistance training promote.

HRV and Other Steering Parameters

Similar considerations apply to HRV-based markers. HRV-guided training models have shown in several studies that they can improve VO₂max and performance at least as well as rigid fixed plans, and often lead to better recovery management. At the same time, HRV itself is susceptible to day-to-day variability, measurement conditions, and interpretive ambiguity. HRV-derived thresholds generally correlate highly with classical lactate and ventilatory thresholds, but they match them only approximately and agreement in detail is heterogeneous: depending on the reference method, HRV approach, and measured variable (heart rate, power, speed), the deviations can be practically relevant. HRV is thus a helpful monitoring tool, but not a magic steering parameter that could reduce the complex reality of training to a single number.

What the Markers Actually Deliver – in Competitive Sport

These findings yield a clear synthesis for competitive sport: VO₂max/CRF is a strong health and fitness marker, but a weak predictor of concrete race performance at the high end. Thresholds and HRV provide additional information, but are methodologically and conceptually limited. What should be decisive for training is what the competition actually demands – not which surrogate marker rises most impressively.

Designing Training From the Target Demand – for Different Groups

This raises the question: for whom does it make sense to put VO₂max and other markers at the center – and for whom does it not? A differentiated answer depends on context.

Competitive Athletes With Clear Performance Goals

For athletes training toward specific competitions – from 5K races through marathons to the widely varying race profiles in road cycling (sprinter stages, mountain stages, classics, time trials, puncheur profiles) – VO₂max is a useful framing marker but not a suitable primary goal. What matters in each case is the specific demand profile: a sprinter optimizes explosive peak outputs and positioning battles on short lead-ins. A puncheur focuses on repeated, relatively short maximal efforts with extreme muscular and systemic intensity on ramps and in race-deciding moments. A climber trains long high-performance segments in the mountains. A time trialist aims for aerodynamically stable, steady power over defined distances. No sprinter would seriously assume they could win a mountain stage with their specific profile – and vice versa.

Training is then designed along these demands: intensive, race-specific stimuli at the center, flanked by sufficient recovery and targeted easy sessions with a clear function. VO₂max, thresholds, and HRV serve as feedback loops – they help make adaptations visible and detect overload – but the evaluation of a training phase hinges primarily on whether target performance in the relevant profile has improved, not on whether VO₂max has risen by x ml/min/kg. In this group, the VO₂max critique applies most sharply: training that tries to push a number as high as possible without optimizing the race profile can misallocate resources and even cause harm – for instance, by neglecting technique, tactics, or neuromuscular resilience.

Recreational Athletes Without Competitive Goals – VO₂max as a Legitimate Training Target

For people who do not race but want to remain capable, robust, and healthy, different standards apply. Here, VO₂max/CRF is in fact a worthwhile and pragmatic training target – not for theoretical or epidemiological-statistical reasons, but for a combination of practical and functional arguments.

First: CRF is a measurable expression of a trained cardiovascular system and correlates in practical experience with stable health and everyday resilience. Anyone who measures their own endurance capacity (e.g., via a treadmill or field test) and then deliberately trains to improve it is investing in a functionally superior organ and muscle system. This is motivationally valuable and practically tangible. That said, a faster time would be the more intuitive and obvious target, even without competitive ambitions.

Second: targeted VO₂max training can be reliably and variably structured through high-intensity intervals (HIT/HIIT) – practically speaking, one to two sessions per week with more or less short, intense stimuli (e.g., 4×4 minutes at 90% VO₂max or 15/15 formats). This structure is efficient, time-saving, and produces lasting adaptations. Research suggests that even a single HIT session per week can yield measurable, if modest, VO₂max improvements, while two sessions per week are widely regarded in the literature as sufficient to capture the bulk of the effect at moderate time investment.

What Matters Is That You Train

It is worth noting, however, that there is no consensus on optimal protocols. Research shows different formats (30/30 intervals, 4×4 models, 15/15 formats) – all of which work under certain conditions. There is no need to adhere to a race-like protocol – the structure remains flexible. Anyone who trains at high intensity once or twice a week will see adaptations – regardless of the protocol. The practical efficiency lies in the training stimulus itself, not in loyalty to a specific formula. At the same time, this removes the mental burden of a “competition mindset” in everyday training: one or two focused intense sessions per week, with genuine recovery or light functional movement in between, is a formula that works.

What remains important, however, is that even in this group a pure VO₂max focus is not sensible. Strength, mobility, functional capacity, and everyday movement remain necessary components of a comprehensive fitness framework. Improving VO₂max/CRF is a legitimate intermediate goal within a broad program – ideally implemented through a mix of intense stimuli, strength work, and everyday movement, not as an isolated end in itself.

People With High Health Risk or Clear Need for Action

For individuals with overweight, metabolic syndrome, prediabetes, or manifest cardiovascular disease, CRF as well as muscular strength, body composition, and functional capacity are all important intervention targets. Here, surrogate markers play multiple roles: CRF/VO₂max-related measures serve as established clinical markers for prognosis and treatment success. Strength and functional markers (e.g., gait speed, sit-to-stand tests) help capture frailty and fall risk. Yet here too: the goal is not to maximize a VO₂max number at all costs, but to build a more robust, more resilient organism – with better endurance, more muscle mass and strength, better balance, and metabolic stability. The needs of this group often differ fundamentally from recreational and competitive sport; a medical-functional perspective takes priority.

Implications for Training Logic

Across all three groups, a consistent message emerges: surrogate markers like VO₂max, thresholds, and HRV are useful tools and health indicators, but not standalone training goals – with one important exception in the non-competitive fitness space, where VO₂max/CRF as a proxy for “general endurance health” is pragmatically and motivationally valuable. For competitive athletes, training should be designed primarily from the target performance backward; markers serve as feedback, not as trophies. For non-competitive exercisers, a deliberate increase in CRF – through simple, regular high-intensity intervals (once or twice per week) – can be a sensible and efficient training goal, without requiring a theoretical optimization chase.

This is how chapter 6 fits into the argumentative arc so far: after the deconstruction of hard thresholds, base zones, fat-burning zones, and a “too slow” fat metabolism, the favorite numbers of diagnostics are now also evaluated within a framework where performance and functional health form the primary goals – and surrogate markers retain their place as helpful but limited landmarks. For competitive athletes this means: do not design training around markers. For fitness enthusiasts it means: improving VO₂max is fine – but without overcomplicating the training program with theory.

References for Chapter 6

Ioannidis JPA (2005). “Why Most Published Research Findings Are False.” PLoS Med.

Foundational critique of publication bias, p-hacking, and the reproducibility of research findings. Shows that under realistic conditions, more than 50% of published effects may be false positives. https://pubmed.ncbi.nlm.nih.gov/16060722/

Stefan AM, Schönbrodt FD (2023). “Big little lies: a compendium and simulation of p-hacking strategies.” R Soc Open Sci.

Explicit demonstration of p-hacking techniques and their impact on research findings. Shows how easy it is to manipulate statistical significance. https://royalsocietypublishing.org/rsos/article/10/2/220346/92017/Big-little-lies-a-compendium-and-simulation-of-p

Bonafiglia JT et al. (2021). “Risk of bias and reporting practices in studies comparing VO₂max outcomes.” J Sport Health Sci.

Systematic review of 27 SIT-vs-MICT studies shows consistently unclear bias and poor reporting quality; only 7% report adequate randomization, no study reports adequate allocation concealment. https://pmc.ncbi.nlm.nih.gov/articles/PMC9532877/

Ross R et al. (2024). “Assessing cardiorespiratory fitness in clinical and community settings.” Prog Cardiovasc Dis.

Review of measurement methods, estimation procedures, and test protocols for CRF/VO₂max; emphasizes clinical relevance but also considerable measurement variability and limited individual-level reliability of many approaches. https://www.sciencedirect.com/science/article/abs/pii/S0033062024000306

Mandsager K et al. (2018). “Association of cardiorespiratory fitness with long-term mortality among adults undergoing exercise treadmill testing.” JAMA Netw Open.

Retrospective cohort study with 122,007 treadmill tests, often cited as “proof” that ever-higher CRF/VO₂max extends life expectancy, whose authors explicitly state that CRF here is only an associated surrogate marker and no causal effect can be demonstrated (confounding, selection). https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2707428

Spiering BA et al. (2021). “Maintaining physical performance: the minimal dose of exercise needed to preserve endurance and strength over time.” J Strength Cond Res.

Narrative review showing that endurance performance can be maintained for weeks with drastically reduced training volume (down to a few short, intense sessions per week), provided the stimulus intensity of the original training is preserved – volume and frequency are far less decisive than classical surrogate marker logic would suggest. https://pubmed.ncbi.nlm.nih.gov/33629972/

Lenk M et al. (2025). “Impact of weekly frequency of high-intensity interval training on cardiorespiratory, metabolic, and performance measures in recreational runners: an exploratory study.” Physiol Rep.

Six weeks of 4×4 HIIT at 1, 2, or 3 sessions per week show: 2–3 sessions markedly improve VO₂max and time to exhaustion, 1 session has only a weak effect; a clear additional benefit of 3 over 2 sessions is not discernible – supporting an efficiency window of roughly 2–3 intense sessions per week. https://pubmed.ncbi.nlm.nih.gov/40976973/

Bacon AP et al. (2013). “VO₂max trainability and high intensity interval training in humans: a meta-analysis.” PLoS One.

Meta-analysis of 37 studies showing that widely varying HIIT protocols (short, medium, long intervals, some combined with continuous training) all improve VO₂max as long as enough hard minutes per week and sufficient program duration are accumulated – the differences lie more in total volume than in any “magic” interval formula. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0073182

Podlogar T, Leo P, Spragg J (2022). “Using V̇O₂max as a marker of training status in athletes – can we do better?” J Appl Physiol.

Viewpoint emphasizing the limited predictive power of VO₂max in well-trained athletes and arguing for classifying training status and performance capacity through critical intensity (CP/CS), economy, and performance-based metrics rather than a single marker. https://pubmed.ncbi.nlm.nih.gov/35175104/

Commentaries on “Using V̇O₂max as a marker of training status in athletes – can we do better?” (2022).

Collection of commentaries highlighting the limitations of VO₂max as a standalone marker and the advantages of alternative or complementary metrics. https://pmc.ncbi.nlm.nih.gov/articles/PMC9306772/

Follador L et al. (2022). “Relationship of critical speed derived from a 10-min submaximal treadmill test to 5-km and 10-km running performances.” Appl Physiol Nutr Metab 47(2):159–164.

Submaximal 10-min treadmill test for determining critical speed; CS explains a large proportion of the variance in 5K and 10K personal bests and thus proves to be a more performance-proximal and practically convenient marker than classical VO₂max tests. https://pubmed.ncbi.nlm.nih.gov/34610270/

World Health Organization (2020). “WHO Guidelines on Physical Activity and Sedentary Behaviour.” Geneva: WHO.

Current guidelines that explicitly recommend muscle- and bone-strengthening exercises as well as balance and functional training alongside endurance activity – and that also drop the old 10-minute rule, explicitly stating that even very short, and indeed intense, activity bouts fully count. In other words: the concept of short, intense, and functional training advocated here has by now arrived even at the official WHO level. https://www.ncbi.nlm.nih.gov/books/NBK566040/

Bahls M et al. (2025). “Physical activity and mortality: towards healthspan-oriented metrics and outcomes. A Scientific Statement from the European Association of Preventive Cardiology (EAPC) of the ESC.” Eur J Prev Cardiol, zwaf578.

Scientific statement from the EAPC arguing for moving away from one-dimensional, purely mortality-focused surrogate markers and instead adopting healthspan-oriented metrics that jointly consider physical function, cardiorespiratory fitness, strength, mental and cognitive health, chronic disease, and quality of life – precisely the perspective advocated here has thus arrived in the cardiological mainstream. https://academic.oup.com/eurjpc/advance-article/doi/10.1093/eurjpc/zwaf578/8248968

Kjaergaard AD et al. (2024). “Cardiorespiratory Fitness, Body Composition, Diabetes, and Longevity.” J Clin Endocrinol Metab 110(1):dgae393.

Bidirectional Mendelian randomization of large GWAS datasets in which genetically predicted cardiorespiratory fitness (VO₂max) shows no clear causal relationship with type 2 diabetes or longevity, while body composition, physical activity, and diabetes itself appear causally relevant – suggesting that VO₂max here functions primarily as a surrogate marker and not as the actual “longevity lever.” https://pmc.ncbi.nlm.nih.gov/articles/PMC12012764/
