
Podcast Summary: Diagnosing Adrenal Insufficiency

Podcast: Endocrine Feedback Loop
Host: Dr. Chase Hendrickson (A)
Guests: Dr. Katie Guttenberg (B), Dr. Anand Vaidya (C)
Fellows: Natalia Freitas, Nikola Gorajevic, Rinky Panya, and a fourth fellow (various institutions)
Episode: EFL063
Date: July 24, 2025
Timestamps: Approximate, in MM:SS format

Overview: Main Theme and Purpose

This episode of Endocrine Feedback Loop examines a forthcoming JCEM (Journal of Clinical Endocrinology & Metabolism) study on diagnostic strategies for adrenal insufficiency, with a focus on the combined use of baseline cortisol and dehydroepiandrosterone sulfate (DHEAS) measurements. The discussion aims to unpack whether DHEAS, a stable and widely available marker, can supplement or even replace the need for dynamic testing (such as the cosyntropin stimulation test) in certain clinical situations. Hosted by Dr. Chase Hendrickson, with expert insights from Dr. Katie Guttenberg and Dr. Anand Vaidya, the episode contextualizes the clinical utility, limitations, and practical adoption of updated diagnostic algorithms for adrenal insufficiency.

Key Discussion Points and Insights

1. Challenges in Diagnosing Adrenal Insufficiency

Diagnostic Difficulty: Symptoms are vague, and traditional testing is complex and time-sensitive (02:28).
Morning Cortisol Testing: Classic cortisol cutoffs (<3 µg/dL consistent with AI, >15 µg/dL excludes AI) may no longer be reliable due to changes in assay technologies (02:28).
Dynamic Testing: Cosyntropin stimulation is standard but inherently limited by lack of a true gold standard and practical barriers (03:41).

Quote (Anand Vaidya, 03:41):

“The term accuracy is a tricky one... there is no gold standard for adrenal insufficiency. We don’t have an international biomarker or histopathology that makes the diagnosis.”

2. DHEAS – Physiology and Diagnostic Rationale

Characteristics of DHEAS: Secreted mainly by the adrenal cortex; ACTH-dependent; long half-life; stable (no diurnal variation) (05:25, 06:23).
Comparison to Cortisol: Both are ACTH-dependent; both low in all forms of adrenal insufficiency (06:23).
Surrogate Marker Analogy: DHEAS as a “long stable surrogate” for ACTH effect on the adrenals, akin to how A1C reflects blood glucose over time (06:23).

Quote (Anand Vaidya, 06:23):

“To summarize it, DHEAS is an ACTH-dependent, highly abundant, long half-lived, stable surrogate for cortisol... kind of like how we use A1C for glucose.”

3. The Study: Design, Methods, and Key Variables

Study Type: Retrospective, single-center (Mayo Clinic), using data from 2005–2023 (09:45).
Inclusion Criteria: Adults with cosyntropin test and DHEAS measurement within 3 months before or 1 week after the test (09:45).
Exclusion Criteria: Death within 6 months of the test, hospitalization within 30 days of the test, oral estrogen use at the time of testing, or congenital adrenal hyperplasia (CAH) (15:26).
Gold Standard: Cortisol <18 µg/dL at 60 minutes after cosyntropin as definition of adrenal insufficiency (17:06).
Subgroup Analyses: Timing of test, glucocorticoid use, age/sex-standardized DHEAS (20:19).

Quote (Anand Vaidya, 17:06):

“A cosyntropin-stimulated cortisol value of 18... has been around for many, many decades... but if you measured cortisol now on a modern assay, isn’t 18 saying you’re going to overcall adrenal insufficiency? ... Maybe the findings of this study are even stronger than stated.”

4. Key Results and Statistical Findings

Baseline Characteristics

n ≈ 1100; 78% women; AI patients were older with higher comorbidity index (21:49).

Diagnostic Accuracy

Baseline Cortisol:
- AUC: 0.81.
- <10 µg/dL: Sensitivity 96%, Specificity 30%.
DHEAS:
- AUC: 0.81 (overall); better accuracy without glucocorticoid use in the prior 2 months (AUC 0.83) than with recent use (AUC 0.72).
- <100 µg/dL: Sensitivity 90%, Specificity 43%.
- In postmenopausal women: Sensitivity 98%, but very low specificity (12%).
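As a reminder of how figures like these are derived, here is a minimal Python sketch that computes sensitivity and specificity for a "value below cutoff" rule against a reference diagnosis. The function name and the toy data are hypothetical and invented purely for illustration; they are not taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    values: Sequence[float],
    has_ai: Sequence[bool],
    cutoff: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for the rule 'value < cutoff means positive'.

    `has_ai` is the reference classification (in the study discussed here,
    a 60-minute stimulated cortisol below 18 µg/dL).
    Assumes at least one patient with and one without the reference diagnosis.
    """
    pairs = list(zip(values, has_ai))
    true_pos = sum(1 for v, ai in pairs if ai and v < cutoff)
    false_neg = sum(1 for v, ai in pairs if ai and v >= cutoff)
    true_neg = sum(1 for v, ai in pairs if not ai and v >= cutoff)
    false_pos = sum(1 for v, ai in pairs if not ai and v < cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical toy data (NOT study data): baseline cortisol in µg/dL.
toy_cortisol = [2.0, 4.5, 8.0, 11.0, 6.0, 9.5, 12.0, 15.0, 18.0, 7.0]
toy_has_ai = [True, True, True, True, False, False, False, False, False, False]

sensitivity, specificity = sensitivity_specificity(toy_cortisol, toy_has_ai, cutoff=10.0)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

A cutoff chosen for high sensitivity, like the ones reported above, will typically trade away specificity, which is why a low value alone does not establish the diagnosis.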

Quote (Katie Guttenberg, 21:49):

“Baseline cortisol and DHEAS cut off levels... Baseline cortisol less than 10 had a sensitivity of 96%... a DHEAS level less than 100 demonstrated a sensitivity of 90% for diagnosing adrenal insufficiency.”

Combined Algorithm (28:19; cortisol and DHEAS values in µg/dL)

Cortisol ≥10: AI excluded, no further testing.
Cortisol 5–9.9 + DHEAS ≥60: AI unlikely, no further testing.
Cortisol 5–9.9 + DHEAS <60: Proceed to stimulation test.
Cortisol <5 + DHEAS <25: Treat as AI.
Cortisol <5 + DHEAS ≥25: Stimulation test recommended.
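The five branches above can be written as a short decision function. This is a minimal sketch of the algorithm as summarized here, assuming both values are in µg/dL and that the 5–9.9 band means "at least 5 and below 10"; the function name and return strings are hypothetical. It does not model the caveats discussed in the episode (older age, recent glucocorticoid use), which may limit how much weight DHEAS can carry.

```python
from typing import Optional


def triage_adrenal_insufficiency(cortisol: float, dheas: Optional[float] = None) -> str:
    """Map baseline cortisol and DHEAS (both in µg/dL) to the summarized next step."""
    if cortisol >= 10:
        # DHEAS is not needed for this branch.
        return "AI excluded: no further testing"
    if dheas is None:
        raise ValueError("DHEAS is required when baseline cortisol is below 10 µg/dL")
    if cortisol >= 5:
        # Indeterminate cortisol: DHEAS decides whether dynamic testing is needed.
        if dheas >= 60:
            return "AI unlikely: no further testing"
        return "Proceed to stimulation test"
    # Cortisol below 5 µg/dL.
    if dheas < 25:
        return "Treat as AI"
    return "Stimulation test recommended"


# Illustrative inputs only (hypothetical values, one per branch).
for cortisol, dheas in [(12.0, None), (7.0, 80.0), (7.0, 40.0), (3.0, 10.0), (3.0, 50.0)]:
    print(cortisol, dheas, "->", triage_adrenal_insufficiency(cortisol, dheas))
```

Writing the branches out makes it easy to see that only two of the five paths still lead to a stimulation test.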

Quote (Chase Hendrickson, 28:19):

“If your baseline cortisol is 10 or above, no further testing is required... If you have indeterminate cortisol and DHEAS is 60 or above, again, AI is pretty unlikely.”

5. Limitations and Potential Pitfalls

Generalizability: Single-institution; patient populations and testing practices may differ elsewhere (12:56).
Assay Effect: Modern vs. older immunoassays significantly affect cutoffs (17:06).
DHEAS in Elderly/Postmenopausal Women: Age-related decline makes low values less specific; must interpret cautiously (25:09).
Effect of Glucocorticoids: Recent use lowers DHEAS independently of adrenal status (31:51).
No Adjustment Necessary: Age/sex standardization of DHEAS did not improve diagnostic accuracy (30:55).
Secondary AI: Study less well validated for secondary adrenal insufficiency (34:05).

Quote (Anand Vaidya, 25:09):

“DHEAS changes over time... As you get older, the lower you get the DHEAS... you have to think twice about what a low DHEAS means in a 75-year-old woman.”

6. Clinical Practice Implications

Simplified Workup: Adding DHEAS to morning cortisol may reduce need for dynamic testing (34:05).
Resource Stewardship: Cosyntropin stimulation is costly and logistically challenging; DHEAS/cortisol can often suffice (37:57).

Quotes:

Katie Guttenberg (40:27):

"I would [advocate routine DHEAS use]... the more we can get towards using other markers instead of dynamic testing... that's going to improve patient care."

Anand Vaidya (41:31):

"If you're going to assess for adrenal insufficiency, you should measure morning cortisol. With that a DHEAS and... add on an ACTH. With those three values you can almost always make the assessment..."

Notable Quotes & Memorable Moments

| Timestamp | Speaker | Quote & Context |
|-----------|---------|-----------------|
| 03:41 | Anand Vaidya | “There is no gold standard for adrenal insufficiency...” Discussing assay issues |
| 06:23 | Anand Vaidya | “DHEAS represents ACTH-dependent steroidogenesis in a long stable manner...” |
| 17:06 | Anand Vaidya | “A cosyntropin-stimulated cortisol value of 18... has been around for decades...” |
| 25:09 | Anand Vaidya | “The older you get, the lower the DHEAS... less prognostic value for AI diagnosis.” |
| 34:05 | Chase Hendrickson | “The diagnosis of adrenal insufficiency is unlikely to be missed...” |
| 37:57 | Anand Vaidya | “This algorithm... can be thought of in a more holistic way; you don’t want to be memorizing cutoffs.” |

Timeline of Important Segments

| Timestamp | Segment |
|-----------|---------|
| 00:00–03:41 | Introduction, background, and challenges in diagnosing adrenal insufficiency |
| 03:41–06:23 | Dynamic testing and physiology of DHEAS vs. cortisol |
| 09:45–15:26 | Study methods, inclusion/exclusion criteria, and generalizability concerns |
| 17:06–20:19 | Cosyntropin test cutoffs, assay issues, and the “gold standard” debate |
| 21:49–28:19 | Results overview: diagnostic accuracy, statistical findings |
| 28:19–34:05 | Interpretation, summary scheme for diagnosis, limitations (age, glucocorticoids, etc.) |
| 34:05–37:57 | Strengths, limitations, and concluding discussion |
| 40:27–42:31 | Should this approach become standard of care? Experts’ practical recommendations |

Flow and Tone

The conversation is collegial, evidence-oriented, and draws on both academic rigor and pragmatic clinical experience. Both Dr. Guttenberg and Dr. Vaidya emphasize an evolving, nuanced approach to endocrine diagnostics, balancing clear statistical data with real-world implementation. The tone is supportive, educational, and optimistic about improving care and reducing unnecessary procedures for patients.

Final Takeaways & Practice Pearls

DHEAS + Cortisol: The combination of morning cortisol and DHEAS measurements can safely eliminate the need for dynamic testing in many patients with suspected adrenal insufficiency, particularly if cortisol is ≥10 µg/dL or DHEAS is robust.
Indeterminate Range Adjudication: Patients with low-normal cortisol and low DHEAS may still need dynamic testing; older age and recent glucocorticoid use may limit DHEAS’s utility.
Assay Awareness: Practitioners should ensure assay cutoffs are updated for their local lab technology; classic thresholds may overdiagnose AI with modern immunoassays.
Broader Adoption: Both experts advocate for widespread adoption of this approach, with proper education and understanding of limitations.

Practice-Changing Quote (Anand Vaidya, 41:31):

"For those of you who are specialists, I would strongly recommend this. For those of you who interface with non-specialists... I would advise your colleagues to do that so... you can use those values to make the full adjudication..."

For further study:

See the cited JCEM article for details and figures
Review recent guidelines (Endocrine Society/European Society of Endocrinology) on glucocorticoid-induced AI

Episode Summary Prepared for Clinicians and Trainees – Endocrine Feedback Loop, July 2025


Transcript

[00:00]
A
This is Endocrine Feedback Loop. I am your host, Chase Hendrickson, and I welcome you to this journal club podcast series brought to you by the Endocrine Society. Thanks for joining us as we explore an important article recently published in one of the Society's clinical journals. Welcome again to the Endocrine Feedback Loop podcast for our 63rd episode. We're recording here at the ENDO 2025 conference in San Francisco. For this month's episode of the podcast, we review a recent JCEM study that suggests that we can substantially alter our diagnostic approach to adrenal insufficiency. This study and others have already led to a shift in recommendations from some experts, so we thought it would be of great interest to you all as our listeners. Four endocrinology fellows from programs throughout the United States joined us in analyzing this article and will be asking questions of our team today. By way of brief reminder, I host the Endocrine Feedback Loop podcast and work at the Vanderbilt University Medical Center in Nashville, Tennessee, as a general endocrinologist and medical director. With us again today as the regular contributor for this episode is the podcast's resident pituitary expert, Katie Guttenberg. She works at the University of Texas at Houston, where she is the director of their Endocrinology Fellowship program and focuses her clinical care on pituitary disorders. She is a master educator at McGovern Medical School, where she teaches extensively. Our guest expert today comes to us from the Brigham and Women's Hospital and Harvard Medical School in Boston, Massachusetts. Anand Vaidya is one of the world's experts in adrenal disease, both in clinical care and research. At the Brigham, he directs the Center for Adrenal Disorders. His research spans the breadth of adrenal disorders and guides the care we provide to our patients. So, as usual, the perfect pair of endocrinologists joins me to discuss a paper on adrenal insufficiency.
As is also always the case, everything we say is our opinions only and not those of our respective institutions or the Endocrine Society. Today we look at "Performance of Dehydroepiandrosterone Sulfate and Baseline Cortisol in Assessing Adrenal Insufficiency," which is a forthcoming article in the Journal of Clinical Endocrinology and Metabolism. Ashley Hahn and Malavika Suresh served as first authors for this paper, which comes to us from the Mayo Clinic. I will now hand the discussion over to Katie. She will highlight some of the key points that these authors make in their introduction and get Anand to give us some important insights along the way. Katie?
[02:29]
B
Thanks, Chase. So happy to be back and with everybody in San Francisco. So just to start, I think we can all acknowledge that diagnosing adrenal insufficiency is challenging. There is vague symptomatology and complex multistep testing. Traditionally the first step is to check a morning cortisol level, which is time sensitive. As we all know, cortisol levels drop by about 1.1 micrograms per deciliter per hour between 7 a.m. and 12 p.m. And the classic teaching has been that, let's say, a cortisol of less than 3 is consistent with adrenal insufficiency and a cortisol level of greater than 15 rules out adrenal insufficiency. We'll talk as we go through this article about how, with modern immunoassays, some of these cutoffs may no longer apply. But that's sort of our baseline, and that leaves a large number of patients who have indeterminate cortisol values. What do we do with them? Classically what we would do is dynamic testing. The most commonly used dynamic test is the cosyntropin stimulation test. And so with that I'm just going to pass it over to Anand, and he can talk to us a little bit about options for dynamic testing and specifically the high-dose cosyntropin stimulation test.
[03:42]
C
Sure. Happy to, and happy to be here. So I think accuracy with testing is a big issue with any diagnostic. And the term accuracy, which is used in this publication and many others, is a tricky one because once you hear the word accuracy, you then have to ask what's the gold standard comparator for that accuracy, meaning what did you adjudicate accuracy with? And the tricky thing in adrenal insufficiency, like in many endocrine disorders, is that there is no gold standard for adrenal insufficiency. We don't have an international biomarker or histopathology that makes the diagnosis. And so you can't really state the true accuracy. The accuracy ends up being based on some comparator that we've derived. And the tricky thing, as Katie mentioned, is you can measure cortisol. Cortisol is great. That's the hormone that we care about most in adrenal insufficiency. But cortisol circulates in nanomolar concentrations. It fluctuates dramatically throughout the day. So we leverage the fact that there's usually a diurnal rhythm and maybe we can capture a peak in the morning. But there's all sorts of caveats with that, as Katie mentioned. So you can get around that by saying, maybe we'll do a dynamic test. And the most commonly used one is a cosyntropin stimulation test with 250 micrograms, which is to say you're going to give synthetic supraphysiologic ACTH and then you will watch how much the adrenals get stimulated and pump out cortisol. And so this maneuver makes a lot of sense. It's going to tell you, do the adrenals work or do they not work? But then when it comes to accuracy, the real question is how much should they work to be normal? And without a gold standard, we have to kind of make up some standard. And over the years, I'm sure you all know this and we'll talk about it, there have been all sorts of standards and thresholds. And then a question ends up being, is that appropriate? When is it appropriate? How flexible can we be with those?
And I think we'll address some of those as we move along. Yeah.
[05:26]
B
Thank you. And that kind of gets us to the heart of the study. So is there another blood test or some other marker that we can use to help diagnose patients with adrenal insufficiency? And so we'll be focusing on the use of DHEAS here. DHEAS is almost exclusively secreted from the adrenal cortex under the control of ACTH. There are small contributions from the ovaries and the testes. Most importantly, DHEAS has a long half-life and lacks that diurnal variation. So that's really key here. Prior studies have shown that patients with adrenal insufficiency typically have low age- and sex-matched DHEAS levels. And androgen secretion is typically affected before cortisol secretion, which is important because DHEAS may be a more sensitive marker in detecting early-stage adrenal insufficiency. And again, I'm going to pass it back over to Anand, and he can talk to us a little bit about comparing the physiology of cortisol and DHEAS.
[06:23]
C
I mean, you actually mentioned some of the key things. So DHEA is an adrenal androgen coming from the zona reticularis. Most of the circulating DHEAS comes from the adrenal, but as you mentioned, a small proportion comes from the gonad. What's key is that it's ACTH dependent. Okay. So both the zona fasciculata and the zona reticularis, that is to say cortisol and adrenal androgens, depend on ACTH. So another way to think about it is if you don't have ACTH, the adrenal is not going to produce cortisol or DHEA. Okay, so no ACTH, no cortisol, no DHEA. So both cortisol and DHEA are deficient in all adrenal insufficiency. In primary adrenal insufficiency, they're both deficient because the cortex is either destroyed or inhibited. In secondary adrenal insufficiency or glucocorticoid-induced adrenal insufficiency, they're both deficient because there's no ACTH. So they go together. They're both low in every form of adrenal insufficiency. In addition, DHEA gets sulfated. Once it's sulfated, it has a long half-life and, as Katie mentioned, it's stable. It has no diurnal variation. If you look at frequent sampling or intra dialysate 24-hour sampling, DHEAS is relatively flat. I guess the third important point is it's abundant. Cortisol is nanomolar, aldosterone is picomolar, DHEAS is micromolar. So, three orders of magnitude: there's a thousand times more DHEAS in your circulation than there is cortisol, and a million times more than aldosterone. So to summarize it, DHEAS is an ACTH-dependent, highly abundant, long half-lived, stable surrogate for cortisol. This is not a perfect analogy, but think about glucose and A1C. We all want to know about glucose and glycemic surges, but glucose fluctuates all throughout the day. So sometimes we use an A1C that integrates all of that glucose area under the curve. But it has some caveats that you all know. Or you can think about catecholamines and metanephrines. We all want to know about catecholamines when we think about, say, pheo.
But catecholamines are picomolar and they fluctuate tremendously, so we don't measure them. Instead we measure metanephrines, a long stable surrogate for catecholamines. Metanephrines also have some caveats. DHEAS is kind of like that for cortisol. Okay, it's not even made in the same zone. But DHEAS represents ACTH-dependent steroidogenesis in a long stable manner that parallels cortisol, at least in adrenal insufficiency.
[08:40]
B
And so a variety of studies have already looked at the role of DHEAS in diagnosing adrenal insufficiency. However, they were typically small studies, and no consensus really exists in terms of what the optimal cutoff concentration should be for DHEAS to diagnose or exclude adrenal insufficiency. The current study had three aims. The first was to evaluate the accuracy of DHEAS in diagnosing adrenal insufficiency based on cosyntropin stimulation testing and assess the impact of glucocorticoid use. Two, to evaluate the accuracy of baseline cortisol in diagnosing adrenal insufficiency, again based on cosyntropin stimulation testing, and assess the impact of time. So here, was the cosyntropin stimulation test performed before or after 10 a.m.? And last, to determine the prevalence of adrenal insufficiency based on various cutoff concentrations of DHEAS and baseline cortisol. And with that, I'm going to hand it over to Chase, and he'll walk us through the methods.
[09:45]
A
All right, thanks Katie and Anand for giving us a nice overview. I think we have a good understanding of some of the challenges here: why we might want another test that we can use besides the cosyntropin stimulation test, but also these unanswered questions that I think we need a better understanding of before we can start doing that. So first, as we typically do before we go into this study, we'll just think about study design in general. I would describe this as a study for assessing the diagnostic performance of a test. And you all are very familiar with this. These sorts of studies are how you get sensitivity and specificity and then, in certain populations, how you can come up with a positive predictive value and a negative predictive value. Most of these are typically done as cross-sectional studies, and we'll think about how that relates to this study here shortly. But you're usually looking at just a single point in time. Sometimes, as was the case in this study, you may collect the data over a period of time, but you think of it as all being collected at a single point in time. It's how that analysis is done. You're splitting your subjects into groups based on this initial test, this new test that you're looking at, and then your outcome is looking at what the gold standard is. And Anand has already mentioned this. So the idea of a gold standard is really, really key. And in all of these studies it's an intrinsic weakness. So we'll talk about this and the challenges around it. It's nothing that the investigators here did incorrectly. It is baked in. One of these challenges is that when you do a test like this, you have to assume that the gold standard is 100% accurate. We heard already that it's not. And just the idea of what should the cutoff be? Is it 18, is it something else? What is a normal cosyntropin stimulation test?
That right there introduces the idea that we don't really know what that should be. And so we don't know what the right number is to be accurate all the time. So that's going to be a challenge that we'll have to think about. But again, it's a necessary step in these sorts of tests that you have to assume the gold standard is always right, because that's what you're comparing your new test to. And these new tests, they can be done for a variety of reasons. It might be easier, it might be cheaper, a lot of different reasons. But you've got to see how they perform based on what you do currently. So that's just in general. So this study: this is single center. I mentioned already, this comes to us from the Mayo. It's a retrospective study, and the authors describe it as a cohort study. It's one area where I would probably disagree with how the authors describe it. In cohort studies, typically you put people into at least two groups and then you follow them over time to see if they develop an outcome. I would describe this as a cross-sectional study because even though the data is collected for each individual potentially over a several-month period, it's analyzed as if it was collected at a single point in time. I don't think it makes a big difference, but it's a point that I would make. As far as the subjects, the patients who were included here, the inclusion criteria were, one, they were all adults who underwent a cosyntropin stimulation test in the outpatient setting, and that data was collected on individuals between 2005 and 2023. They had to have had a DHEAS measurement within three months prior to or a week after that cosyntropin stimulation test was done. All right, we'll have our first fellow question at this point. So Natalia Freitas will ask Anand to comment on these criteria and some related questions.
[12:47]
Natalia Freitas (fellow)
What are your thoughts regarding the timeframe used to include the DHEAS levels, and could this lead to potential biases or confounders?
[12:56]
C
Yeah, so it's a good question. Bias, confounding, and the study design. I think to step back and really to build on what Chase said: when you read a study, especially one that has clinical implications for your patients, what you really are asking as a clinician is, are these findings generalizable to my patients? Meaning, do these findings apply to the people I see? And if not, what are the reasons why? So if you were to design a study prospectively, okay, how would you do it? You could do a variety of things. Let me give you some examples. You could say, I'm going to prospectively evaluate every patient with adrenal insufficiency from now going forward for a period of time, say one year, and for each of them, I will either, making this up, randomize them to a cosyntropin stimulation test or a morning cortisol and DHEAS, and then I will evaluate outcomes using XYZ. Or you could say, I'll have every patient prospectively have a cosyntropin stimulation test and a cortisol and DHEAS, but I'll blind myself to one of those two, and then afterwards we'll see. So you could do a variety of these things. So this study, because it was retrospective, the authors weren't able to design it going forward. And the data that is available is what is available. So then the question is, is the data available generalizable? Because the data that had to be available was over 20 years, which in itself raises questions: what was the drift over 20 years, the assay shift over 20 years, the knowledge change in 20 years? That aside, who are these people who had a cosyntropin stimulation challenge? Is that a generalizable thing? And I would say in our institution, there's something different about people who had a cosyntropin stimulation test from those who just, say, had a morning cortisol. I don't know what that difference is. Maybe they're higher risk. Maybe they're evaluated by a person with a certain style. Maybe they're evaluated by someone who doesn't know as much about adrenal insufficiency.
Which one of those, or is it a mixture? We don't know. And a second feature, maybe the more, I wouldn't say red flag, but the one glaring one, is that you had to have a DHEAS level within a couple of weeks or months of the cosyntropin test. So DHEAS, I would consider it to be a more sophisticated and nuanced way of assessing adrenal insufficiency. And if many of you are doing that already, bravo. But in my experience, most people are not doing that. So why was that done in this study? Were those clinicians extra smart? So are these patients those of clinicians who are more nuanced and knowledgeable about adrenal insufficiency? Were they higher risk? Or, this study was done at one center, Mayo Clinic. Maybe their form of practicing and approaching adrenal insufficiency is different from my center or yours. So all of these are considerations, really, when you think about generalizability: can I take these findings and apply them directly to my patients? Good.
[15:26]
A
And thank you, Natalia, for that question. Natalia is coming from the Cleveland Clinic. All right, as we move back, those are our inclusion criteria and some of the things that we wrestled with there. Now, some exclusion criteria. So one would be if the patient died within six months of having the stim test; two, if they had been hospitalized within 30 days of having a stim test; three, if they were taking oral estrogen at the time of the testing; or four, if they had congenital adrenal hyperplasia. And that was defined as having a 17-hydroxyprogesterone that was greater than 400 at any point or a DHEAS that was greater than the upper limit of normal. We'll get into this in a second, but one of the details about the diagnostic criteria: for those individuals who were considered to have adrenal insufficiency, if ACTH was above 60, that was considered criteria for having primary adrenal insufficiency. Postmenopausal status was defined based on age, so women who were 50 years or older. And then finally, just a quick comment: as they looked at comorbidities here, the Elixhauser Comorbidity Index was the way that they assessed the burden of comorbidities. A few more details on the stim test itself. So fairly typical, but we'll go through the details. The cosyntropin stimulation test was performed using 250 micrograms of cosyntropin, so a high dose, as it's sometimes described. And cortisol was measured at baseline, which served as that baseline measurement that we'll refer to, but then also at 30 and 60 minutes after the cosyntropin was administered. And then for that gold standard, adrenal insufficiency was defined as a cortisol concentration of less than 18 at that 60-minute mark. So now we'll have our second fellow question. So Nikola Gorajevic from Pittsburgh will ask a question.
[17:07]
Nikola Gorajevic (fellow)
Can you please comment on this cosyntropin stimulation test cutoff? And do you have any concern regarding its use as the gold standard in this study?
C
I think we hit this theme before, which is there is no gold standard for adrenal insufficiency. So we kind of have to make something up. And a cosyntropin-stimulated cortisol value of 18 micrograms per deciliter has been around for many, many decades. So first I'll say I do not blame the authors at all for using this. I think they had to use this in the absence of a true gold standard. But you could certainly challenge this gold standard substantially. One of the tricky parts is once you've derived something like this, it's been around for decades. How do you override it? In the absence of a gold standard that challenges this, how do you override it? And I will just tell you, based on my experience in many other areas in the field of adrenal and endocrinology, it's almost impossible because it has been baked into all of your education since you were internal medicine residents, maybe even medical students. This is an opportunity to talk about assays. So this cutoff was derived at a time when the immunoassays used to measure cortisol were older. Almost certainly all of us practice at a center that uses more modern immunoassays. So how do you measure cortisol and other steroids? You can use immunoassay, a kind of antibody- or ELISA-based technique, or you can use mass spec, liquid chromatography-tandem mass spectrometry. Mass spec is really considered the gold standard for measuring steroids because you actually measure the mass of the molecule and you get a nice clean peak. You can differentiate cortisol from aldosterone. One oxygen atom moves from one place to another and you can differentiate that on mass spec. The problem with immunoassay, not problem, but the limitation with immunoassay, is you have an antibody that recognizes an epitope, and your antibody has to be as specific as possible for cortisol and not another steroid that differs by one atom. And as a result, almost every immunoassay will pick up some other similar steroids. You know, the better the specificity of that immunoassay, the better the measurement. But there's always going to be some pickup. And as a result, mass spec values tend to be lower than immunoassay values because the immunoassay picks up, say, 80% cortisol and 20% other stuff. So which immunoassay you use matters. And as it turns out, more modern immunoassays pick up fewer interfering steroids than older immunoassays, meaning they're better. So there are studies that show that a cortisol that would have been, say, 18 on an old immunoassay from 30 years ago now gets picked up at, say, 14 micrograms per deciliter. So that's important because if you actually say, well, why did the authors use a cutoff of 18? I think they had to. That's the, quote, made-up standard or historical standard for adrenal insufficiency. You could say, isn't that too strict? If you measured cortisol now on a modern assay, isn't 18 saying you're going to overcall adrenal insufficiency? And I would say yes, maybe the authors could have used something lower. But in a way that's a strength of the study because, as Chase and Katie will talk about later with the actual results, you could argue that maybe adrenal insufficiency was overcalled. And if you think about it that way, when we get to the results, maybe the findings of the study are even stronger than stated. So what's important to remember is when we say the diagnosis of adrenal insufficiency in this study, it means cosyntropin stimulation test below 18. Whether it's adrenal insufficiency or not is something we can all interpret differently.
[20:19]
A
We will wrap up the methods section with just a couple of comments on the statistical analysis and the way the authors do this. Fairly typical for this sort of study, they look at the area under the receiver operating characteristic curve, calculating that for both the baseline cortisol and the DHEAS. The optimal cutoff concentration to maximize both the sensitivity and the specificity was then calculated based on an approach of minimizing the absolute difference between sensitivity and specificity. On the subgroup analyses, you heard a little bit about that already. Subgroup analyses are ideally pre-planned, as they were here, and the idea is to look at different groups: not just taking your entire group as a whole, but subdividing it into at least two groups. And this was done in three different ways. So the first one was looking at the timing of that baseline cortisol concentration, whether that was done before 10 or after 10 in the morning when the stim test was done. The second subgroup analysis was looking at the effect of glucocorticoid use on the DHEAS concentration, and that was divided as to whether glucocorticoids had been used within two months of having the stim test versus no recent glucocorticoid use. And then the third and final subgroup analysis was looking at the DHEAS and standardizing it by dividing the DHEAS by the value of the lower limit of the normal range based on the patient's age and sex. So we're going to come back and unpack the rationale behind that and what the authors were looking for here. So just a few brief words on the stats there. I'm going to now turn it back over to Katie and she's going to walk us through these results.
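Editor's note: the statistical approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function names and the toy cortisol values are hypothetical, and a test is treated as "positive" when the measured value falls below the cutoff, since low values point toward adrenal insufficiency.

```python
def sens_spec(values, has_ai, cutoff):
    """Sensitivity and specificity of the rule 'value < cutoff means positive'."""
    tp = sum(1 for v, d in zip(values, has_ai) if d and v < cutoff)
    fn = sum(1 for v, d in zip(values, has_ai) if d and v >= cutoff)
    tn = sum(1 for v, d in zip(values, has_ai) if not d and v >= cutoff)
    fp = sum(1 for v, d in zip(values, has_ai) if not d and v < cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc_low_is_positive(values, has_ai):
    """Area under the ROC curve: the probability that a randomly chosen case
    has a lower value than a randomly chosen non-case (ties count as half)."""
    cases = [v for v, d in zip(values, has_ai) if d]
    controls = [v for v, d in zip(values, has_ai) if not d]
    wins = sum(
        1.0 if c < k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def balanced_cutoff(values, has_ai, candidates):
    """Pick the candidate cutoff minimizing |sensitivity - specificity|."""
    def gap(cutoff):
        sens, spec = sens_spec(values, has_ai, cutoff)
        return abs(sens - spec)
    return min(candidates, key=gap)


# Made-up baseline cortisol values (µg/dL); NOT data from the study.
cortisol = [2.0, 4.0, 9.0, 6.0, 11.0, 13.0, 16.0, 18.0]
has_ai = [True, True, True, False, False, False, False, False]

auc = auc_low_is_positive(cortisol, has_ai)
best = balanced_cutoff(cortisol, has_ai, [3, 5, 10, 12, 15])
print(f"AUC = {auc:.2f}, balanced cutoff = {best}")
```

Minimizing the gap between sensitivity and specificity is only one way to choose a cutoff; other criteria would generally select different values on the same data.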
[21:49]
B
So we start with the baseline characteristics. The study included about 1100 patients; 78% were women. Patients with adrenal insufficiency were slightly older, with an average age of about 50 compared to 41 for patients without adrenal insufficiency, and then also had a lower proportion of women, about 70% compared to 80%. Patients with adrenal insufficiency had a higher Elixhauser comorbidity index than patients without adrenal insufficiency. And then if we talk specifically about our two subgroup analyses: first for time, 43% of patients underwent the cosyntropin stimulation test prior to 10am, and for glucocorticoids, 19% of patients had an active prescription for glucocorticoids within two months prior to the cosyntropin stimulation test. And among patients with abnormal cosyntropin stimulation testing, 45% of patients were found to have recent glucocorticoid use. So when we move on to our biochemical markers, first talking about cortisol: cortisol had good diagnostic accuracy with an area under the curve of 0.81. Again, if we look at our two subgroup analyses, first talking about time: among patients with normal cosyntropin stimulation test results, as you would expect, the median baseline cortisol concentrations were higher before 10am compared to after 10am, and among patients with abnormal cosyntropin stimulation test results, the baseline cortisol levels did not differ based on time. Glucocorticoid use within the past two months had no significant effect on baseline cortisol concentration. Moving on to DHEAS: DHEAS also had good diagnostic accuracy with an area under the curve of 0.81, and DHEAS demonstrated better diagnostic accuracy in patients without glucocorticoid use within the past two months compared to patients with recent glucocorticoid use. So the area under the curve there was 0.83 compared to 0.72. And sort of interestingly, the standardized DHEAS demonstrated similar performance to non-standardized DHEAS.
So getting to kind of the meat of this: baseline cortisol and DHEAS cutoff levels. The authors investigated the accuracy of various cortisol cutoff values: 3, 5, 10, 12 and 15. Baseline cortisol less than 10 had a sensitivity of 96% for the diagnosis of adrenal insufficiency with a specificity of 30%. The authors also investigated the accuracy of various DHEAS cutoffs: 25, 40, 60, 100 and 120. Among all patients, a DHEAS level less than 100 demonstrated a sensitivity of 90% for diagnosing adrenal insufficiency. Specificity was 43%. And if we look specifically at postmenopausal women, a DHEAS measurement less than 100 had a good sensitivity of 98% but a lower specificity of 12%. And so I'm just going to hand that over to Anand and he'll talk to us a little bit about some of the potential limitations of using DHEAS in postmenopausal women.
[25:09]
C
Of course DHEAS has limitations. I had mentioned some other surrogates: A1C has limitations, you all know it can sometimes be falsely low. Metanephrines have limitations, sometimes they can be falsely high. DHEAS is no different. But I want to back up, because Katie just told you about the results and said a lot of statistics: ROC curves, areas under the curve, cutoffs. And before you get too stuck on any of those numbers, I just want to back up and say, well, you know, what are we doing here without a gold standard? The clinician's approach to adrenal insufficiency is about developing a pretest probability and a probabilistic approach, right? So you see a patient, you have some concern about adrenal insufficiency. Maybe they have unexplained hyponatremia or nausea, vomiting, something of that sort. So your pretest probability already rises. Then you're going to say, what tools can I use to increase or decrease my pretest probability? You might measure a cortisol, you might measure a DHEAS, you might measure both, you might use a cosyntropin stimulation test. Each one of these values will add some prognostic information, but each one has a limitation as well. We talked about cortisol at length, its variability and the measurement issues, but DHEAS changes over time. As we get older, DHEAS levels decline. So the older you get, the lower the DHEAS, which means the older you are, the less prognostic value a low DHEAS has in making a diagnosis of adrenal insufficiency, meaning the probabilistic value as an aid in that entire workup starts to decline. When you get through menopause, the gonadal contribution of DHEA, a small but relevant contribution, will also decline. So in postmenopausal women something similar might happen. So in older men and older women, low DHEAS values may not be as accurate. And then there's glucocorticoid use, which we may talk about later: if you are on long-term glucocorticoids and you have ACTH suppression, the prognostic value of DHEAS may also decline. So those are limitations you have to consider. It doesn't make the test useless, but you have to think twice about what a low DHEAS means in a 75-year-old woman.
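Editor's note: the probabilistic reasoning described above (start from a pretest probability, then let each test result move it up or down) can be made concrete with likelihood ratios. The sketch below is illustrative only. The function name is hypothetical, the 20% pretest probability is an assumed figure chosen for the example, and the 96% sensitivity and 30% specificity are the values quoted earlier in the episode for a baseline cortisol below 10.

```python
def post_test_probability(pretest, sensitivity, specificity, test_positive):
    """Update a pretest probability with one test result via likelihood ratios."""
    if test_positive:
        likelihood_ratio = sensitivity / (1 - specificity)
    else:
        likelihood_ratio = (1 - sensitivity) / specificity
    pre_odds = pretest / (1 - pretest)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Assumed pretest probability of 20% (hypothetical, not from the study).
pretest = 0.20

# "Positive" here means baseline cortisol < 10 µg/dL.
p_if_low = post_test_probability(pretest, 0.96, 0.30, test_positive=True)
p_if_not_low = post_test_probability(pretest, 0.96, 0.30, test_positive=False)

print(f"cortisol < 10:  {p_if_low:.1%}")
print(f"cortisol >= 10: {p_if_not_low:.1%}")
```

With a highly sensitive but poorly specific test, a result above the cutoff lowers the probability substantially, while a result below it raises the probability only modestly, which is why a second marker is useful in that lower range.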
[27:06]
B
The authors then looked at the prevalence of adrenal insufficiency based on baseline cortisol and DHEAS concentration. The next numbers exclude patients with recent glucocorticoid use. Among patients with a baseline cortisol level greater than 10, only 1.2% demonstrated a post-cosyntropin stimulation cortisol less than 18. So again, only 1.2% of those patients in this study were diagnosed with adrenal insufficiency. Then there are the patients that had more of those indeterminate values, so a cortisol level between 5 and 9.9: 6.3% of those patients were diagnosed with adrenal insufficiency. And then among patients with a cortisol of less than 5, a much higher proportion, 36% of patients, were diagnosed with adrenal insufficiency. Adrenal insufficiency was confirmed in 72% of patients with a cortisol value less than 5 and a DHEAS level less than 25. And then among patients with a baseline cortisol between 5 and 9.9 and a DHEAS greater than 60, only 1.3% of patients were diagnosed with adrenal insufficiency. And I'm going to hand it back over to Chase and he's going to walk us through the discussion section.
[28:19]
A
All right, Katie's done yeoman's work in getting us through a bunch of numbers and statistics here. Now we're going to wrestle with how to interpret this, and we'll start where the authors do as they summarize their findings. I'll quote them here, where they say baseline cortisol and DHEAS independently had good diagnostic performance and in combination were able to accurately identify patients with adrenal insufficiency based on the cosyntropin stimulation test. And the way the authors put all this together, they come up with a proposed interpretation and management scheme. And it's based on that combination of the cortisol and the DHEAS level. So first of all, if your baseline cortisol is 10 or above, they suggest no further testing is required. Adrenal insufficiency is pretty unlikely in that situation. If, however, your baseline cortisol is indeterminate, so in that 5 to 9.9 range that Katie mentioned already, and your DHEAS is 60 or above, then again, no further testing is required. It's pretty unlikely that you have adrenal insufficiency. A third category: again, you have an indeterminate baseline cortisol at 5 to 9.9, but your DHEAS is less than 60, so it's low, or at least low-normal. Then it's recommended that you obtain a stimulation test. As a fourth category, if your cortisol is low, less than 5, and your DHEAS is low, so it's less than 25, then you initiate treatment. It's highly likely that you have adrenal insufficiency. And then a final fifth category: again, the cortisol is low, less than 5, but your DHEAS is at least 25, so it's 25 or above. Then this is another situation where you need to obtain a stimulation test. There are a couple of points that the authors make here. They point out that at their institution, 17.2% of patients who underwent a stim test had adrenal insufficiency. And they report that that's lower when you compare it to other reports or other publications.
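Editor's note: the five-category scheme just described maps onto a short decision function. This is a sketch of the scheme as summarized in the episode, not a clinical tool. The function name and return strings are hypothetical, cortisol is taken in µg/dL, and the DHEAS units are assumed to match whatever units the quoted cutoffs of 25 and 60 refer to, which the episode does not state.

```python
def suggest_next_step(baseline_cortisol, dheas):
    """Map baseline cortisol and DHEAS onto the five categories described above.

    Thresholds are those quoted in the episode. The discussion presents the
    supporting prevalence figures for patients without recent glucocorticoid
    use, so this sketch assumes the same population.
    """
    if baseline_cortisol >= 10:
        # Category 1: adrenal insufficiency unlikely.
        return "no further testing"
    if baseline_cortisol >= 5:
        # Indeterminate cortisol (5 to 9.9): DHEAS decides.
        if dheas >= 60:
            return "no further testing"      # Category 2
        return "stimulation test"            # Category 3
    # Low cortisol (< 5).
    if dheas < 25:
        return "initiate treatment"          # Category 4
    return "stimulation test"                # Category 5


for cortisol, dheas in [(12, 40), (7, 80), (7, 30), (3, 10), (3, 50)]:
    print(cortisol, dheas, "->", suggest_next_step(cortisol, dheas))
```

Writing the scheme this way makes the boundary handling explicit: a cortisol of exactly 10 or a DHEAS of exactly 60 falls on the "no further testing" side, and a DHEAS of exactly 25 with a low cortisol leads to a stimulation test.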
It potentially reflects the widespread availability of stimulation tests at this institution. Anand mentioned already that a lot of these studies take on the characteristics of the institutions where they're done. So we do have to keep that in mind. And the authors do a good job of acknowledging that. The authors then go on to point out that we know that DHEAS levels decline with increasing age. However, standardizing the DHEAS according to that age- and sex-specific lower limit of normal did not improve the diagnostic accuracy. So I'll have our third fellow question. Rinky Panya from Oregon will ask a question of Anand.
[30:45]
B
As we already mentioned, the DHEAS does decline with age. So why did standardizing with age- and gender-specific standards of normal not improve diagnostic accuracy?
[30:56]
C
This finding didn't surprise me at all. I think the authors were being diligent and trying to make a robust analysis, but I'm not surprised at all. So there are many circulating factors that change with age. When somebody says something declines with age, what they mean is the distribution across a population declines with age. So if you imagine a bell curve distribution in the second, third, fourth, fifth, sixth decade of life, that distribution starts to shift down. Not all the values, just the overall distribution will skew, say, left. But that's not what we're looking at here. Look at the DHEAS levels that have prognostic value. They're really low. They're either on the very low end of what any reference range is or well below it. So we're not looking at the median of the distribution, we're not looking at the mean of the distribution. We're looking at whether the DHEAS is low-normal or frankly low. So I'm not surprised that adjusting for age-related distributions didn't make a difference here.
[31:52]
A
The authors then go on to point out that the diagnostic performance of DHEAS was better in patients without a recent history of glucocorticoid use. And they report that this is the first study to examine that association and that lower DHEAS levels in patients with recent glucocorticoid use may not be reflective of adrenal insufficiency. So a kind who is coming from here at UCSF is going to ask our fourth and final fellow question, get a bit more speculative, maybe go beyond what the authors talk about here, but with an interesting question for Anand.
[32:23]
B
Could you please comment on the pathophysiology and recovery of the HPA axis, contrasting cortisol and DHEAS? And do you speculate whether DHEAS might provide insight into recovery of the HPA axis after glucocorticoid use?
[32:38]
C
Actually I should give a plug. You know, the Endocrine Society and the European Society of Endocrinology had a joint guideline last year on glucocorticoid-induced adrenal insufficiency. It includes the author of this paper, Irina Bancos, and myself. And it has a lot of beautiful pictures on exactly your question. So just very briefly, glucocorticoids suppress the entire HPA axis. CRH and ACTH will turn off, and as a result the zona fasciculata is not stimulated, the zona reticularis is not stimulated, and adrenal cortisol and androgen production will decline. Once you've completed, let's say, a supraphysiologic glucocorticoid course that was suppressive for, say, several weeks or several months, you now have to wait for the corticotrophs to wake up and produce ACTH. That ACTH production often has to go into a supraphysiologic range to stimulate, kind of work out, the zona fasciculata to make cortisol again, and also the zona reticularis to produce androgens. What you care about is the recovery of cortisol, not DHEAS. And it turns out cortisol recovery precedes DHEAS. In fact, in my own personal practice many patients have a huge lag in when androgens wake up again from the adrenal. Sometimes they don't ever wake up. I don't really care, because clinically I want to make sure that they have an adrenal-sufficient status after we stop their glucocorticoids. So I would say DHEAS can be a very valuable prognostic marker, as this study shows, for making the diagnosis of adrenal insufficiency. But I don't think it has a major role or any role in looking at the recovery of the HPA axis after glucocorticoids.
[34:05]
A
One final point that the authors make is that among patients with abnormal stimulation tests, the timing of the baseline cortisol, whether it was before or after 10am, did not affect the accuracy of the results, suggesting that patients who have adrenal insufficiency have low cortisol levels without a significant diurnal variation. And the results also suggest that there is an increased risk of false positives when that stim test is performed after 10am, but the upshot of that would be that the diagnosis of adrenal insufficiency is unlikely to be missed. And that's a point that Anand made earlier. The strengths of this study, as the authors point out, are first of all the large sample size. The second one would be the uniform testing and biochemical evaluation at a single center. And then finally all those subgroup analyses that we've been talking about, particularly looking at the effect of the timing of the stimulation test and the history of glucocorticoid use. A couple of limitations that the authors also point out: one is that they may have failed to exclude CAH patients. And then the second one would be the limitations of stimulation tests. So we've talked about that already. Another one that the authors point out is that it doesn't perform as well in secondary adrenal insufficiency, for the reasons that we've talked about already. All right, so now we're going to wrap things up. We're going to start with where the authors conclude. We'll first of all report their summary, and we'll quote them again here, where they say, "DHEAS measurement is a valuable diagnostic test that can eliminate the need for dynamic testing in a large subset of patients when assessed in conjunction with baseline cortisol." And then second of all, maybe what I would describe as an implication, I'll quote them again here: "As DHEAS is a simple lab test that is widely available and does not have significant diurnal variation, its routine use alongside baseline cortisol may simplify evaluation for adrenal insufficiency."
And maybe just for our entertainment's sake, we'll spice it up and pretend maybe they said you should definitely be doing this. And we'll see if we agree with that, if they had said something like that. So we will get to whether this should change our practice, whether this should make a difference in how we're treating our patients. But before we do that, I just wanted us to spend a little bit of time talking about the quality of this report overall. So, Katie, let's start with you. You look at a lot of these studies as a pituitary expert. So what is your sense of the quality of this report overall?
[36:20]
B
Yeah, no, I think this was a really nice study, and it just builds on a variety of studies over the last several years that have looked at the use of DHEAS in diagnosing adrenal insufficiency. And we already talked about some of the factors that we want to think about when we're assessing the quality of the study. So mainly, you know, again, what was our gold standard? The cosyntropin stimulation test in this case. And then, more broadly, were there other factors that can affect the accuracy of cortisol? And I think the authors did a good job at excluding or acknowledging those things. So for example, they excluded patients that were on estrogen, which as we know can increase corticosteroid-binding globulin and lead you to miss the diagnosis of adrenal insufficiency. They also excluded patients that had been hospitalized within the past month. So thinking about maybe patients that had early or acute-onset central adrenal insufficiency, where cosyntropin stimulation testing might not be accurate. And then I think kind of the heart of this, and what we already talked about, has to do with the cutoff that we're using for cosyntropin stimulation testing. And I agree that we probably can be using a lower cutoff for the cosyntropin stimulation test with modern immunoassays. But like we said, I think that in this case that's probably going to overdiagnose adrenal insufficiency, not underdiagnose adrenal insufficiency, which would be our concern. You know, I personally use a DHEAS routinely in my practice and use very little cosyntropin stimulation testing. So I think this is a really nice study just to kind of give a little bit more evidence for that practice.
[37:53]
A
Right. Anand, same question for you: quality of this report overall. Any comments that you would add to Katie's?
[37:57]
C
Yeah, no, I think it's a great study. I mean, again, in full disclosure, the author of this study and I wrote a review in JAMA just three weeks ago on adrenal insufficiency that encompasses a lot of the things that I was saying, which is that the diagnosis is probabilistic, and this algorithm, which I think is a little bit complex, can be thought of in a more holistic way. You don't want to be memorizing cutoffs, but on a modern cortisol assay, approximately a two-digit cortisol makes the probability of clinically relevant adrenal insufficiency extremely unlikely. It's never zero. There are always exceptions, but extremely unlikely. On the other hand, a very low cortisol, below 5-ish, especially with a low DHEAS, makes adrenal insufficiency much more concerning. But even in that middle group where you kind of have a 5 to 9, 5 to 10 microgram per deciliter morning cortisol, if you look, at least in this study, only 6% had a cort stim that failed below 18. Even if this is a biased report and you double that value or triple that value, it still means single-digit cortisols between 5 and 10, almost all of the time, the vast majority of the time, are not adrenal insufficiency. So how are you going to adjudicate which ones of that minority do have clinically relevant adrenal insufficiency, which is a can't-miss diagnosis, and which ones don't? In most of our centers a cosyntropin stimulation test is a bottleneck. It's a procedure that's done on a separate day. It's costly, it requires a room, a nurse, et cetera. I do only adrenal clinic and I almost never do a cosyntropin stimulation challenge. You really want to avoid doing that if you can, for healthcare resource rationing and time and labor. And this DHEAS really helps you adjudicate that. Because if the DHEAS is not low, that low cortisol is almost definitely not adrenal insufficiency. And maybe just a simple repeat morning cortisol will convince you of that.
On the other hand, if the DHEAS comes back low-normal or low, maybe you're on the right track, and that's a place where you might want to use this kind of bottleneck test and use the extra labor, time and cost to make the right diagnosis.
[39:56]
A
I think you can get a sense already of where our two experts here have moved their practice. But I'm going to push them and say, well, again, let's imagine that these authors said you should definitely be doing this. So would you advocate that this be widely done? You guys do pituitary and adrenal. That's what you do all the time. There's a lot of us, myself included, who are general endocrinologists, and this is potentially something that would be done by a lot of people who are not endocrinologists, such as general practitioners. So would you advocate that this be widely adopted as an initial test for adrenal insufficiency? So Katie, we'll pose that question to you first of all.
[40:28]
B
Yeah, no, I would. I think you always need to be thinking with any test like we already discussed, you know, what your pre test probability of adrenal insufficiency is and what your clinical suspicion is. That's always the most important question when you're interpreting the testing results. But yes, I think it's an important part of our practice. And again, I think the more we can get towards using other markers instead of dynamic testing for all the reasons that we discussed, I think that's going to improve patient care.
[40:53]
A
All right, so Anand, same question for you. Maybe I'll make it a little bit harder. You've alluded to it already and said the scheme may be a little bit complicated, because you've got different categories: if it's this, but this is this, then you should do this. And in our analysis you mentioned that there are differences between assays, so these are not necessarily all going to be the same. So maybe I would pose this question to you, but with the added question of, is this too complicated? Is this going to be actually pretty difficult? For endocrinologists maybe this is fine, but does this get too hard to, say, push out to primary care providers? What are your thoughts about whether this should be widely adopted as a way to at least initially evaluate a patient for adrenal insufficiency?
[41:32]
C
No, I agree with Katie. I think this should be widely adopted. I mean, first, you should not be doing a test you don't know how to interpret. You know, we all measure things in practice, or we know colleagues of ours who measure something and then immediately ask you for help because they don't know how to interpret what they just measured. But if you're going to do it, at least do it right. So I would advocate that if you're going to assess for adrenal insufficiency, you should measure a morning cortisol and, with that, a DHEAS. And I would go even further and say add on an ACTH, all three of those. With those three values you can almost always make the assessment of whether you're worried about adrenal insufficiency or whether you can exclude the possibility. So for those of you who are specialists, I would strongly recommend this. For those of you who interface with non-specialists via e-consult, via consultations, I would advise your colleagues to do that so that if they ask you for help, you can use those values to make the full adjudication: I think this has been excluded, I think this needs a cosyntropin stimulation test, or I think you have made the diagnosis just on this morning set alone.
[42:32]
A
And with that I would like to thank Katie Guttenberg, Anand Vaidya and our four fellows for joining me for this month's edition of Endocrine Feedback Loop. I hope that you all learned as much as I did and that you will join us again next time. And now you're in the loop. This has been Endocrine Feedback Loop. Endocrine Feedback Loop is brought to you by the Endocrine Society with production oversight by Brandy Brown and Andrew Harmon. If you want to like and subscribe, you can find us on Apple, Spotify or wherever you get your podcasts. We'd love to hear your feedback on this episode or the podcast itself. Please email us at podcast@endocrine.org.
[43:18]
C
Endocrine Feedback Loop is a free service of the Endocrine Society. To learn more or to become a member, visit the society's website at www.endocrine.org.