Comparison of growth hormone stimulation tests in prepubertal children with short stature according to response to growth hormone replacement
Article information
Abstract
Purpose
Growth hormone (GH) stimulation tests are essential tools for diagnosing GH deficiency (GHD). We aimed to compare L-dopa, insulin, and arginine-induced stimulation tests based on response to GH replacement.
Methods
We retrospectively collected data from a review of patients who underwent the GH stimulation test. A total of 138 patients diagnosed with idiopathic short stature were categorized into group I. The remaining 135 patients, who were diagnosed with GHD and treated for 1 year, were classified into 2 subgroups: group IIa, consisting of patients with an increase of at least 0.5 in height standard deviation score (SDS), and group IIb, patients with an increase of less than 0.5 in height SDS.
Results
At the initial visit, group IIa exhibited significantly lower insulin-like growth factor binding protein-3 (IGF-BP3) and higher body mass index (BMI) SDS compared to the other groups. Following 1 year of treatment, group IIb showed significantly lower height SDS, height SDS gain, growth velocity, predicted adult height SDS, weight SDS, and insulin-like growth factor-1 (IGF-1) SDS than group IIa. Bone age and IGF-BP3 were inversely associated, and BMI SDS and IGF-1 were positively associated, with height SDS gain in GHD patients. The specificity and accuracy rates were 50.3% and 70.3% for the L-dopa-induced stimulation test, 72.3% and 86.6% for the insulin tolerance test (ITT), and 64.7% and 87.2% for the arginine-induced stimulation test (ArST).
Conclusions
The ArST demonstrated lower specificity compared to the ITT. However, patients undergoing ArST experienced fewer side effects, suggesting that a careful selection of stimulation tests is crucial in diagnosing GHD.
Highlights
· Growth hormone provocation tests can be compared based on the increase in height standard deviation score after 1 year of growth hormone treatment as the diagnostic standard.
· The arginine-induced stimulation test exhibits lower specificity yet maintains accuracy comparable to that of the insulin-induced stimulation test.
Introduction
The rising interest of parents in their children's growth and the advancements in growth hormone (GH) replacement therapy over the past 60 years have made short stature the primary reason for visits to pediatric endocrinology clinics [1-3]. Children exhibiting short stature or reduced growth velocity compared to their same age and sex peers warrant a referral for comprehensive evaluation. The decision to perform GH provocation testing should be made after meticulous consideration [1]. Only after ruling out other causes of poor growth through clinical assessment and basic laboratory tests should patients undergo GH provocation tests to determine GH deficiency (GHD) or sufficiency.
Due to the physiologic circadian variations in GH secretion and the typically low serum GH levels during daytime, GH release provocation tests are employed to assess the somatotropic axis, rather than relying on a single basal GH level measurement. GH stimulation tests are the accepted diagnostic method for GHD, which is identified by a serum peak GH concentration of less than 10 µg/L in at least 2 different tests using various stimuli. Given the low reproducibility of tests with physiological stimuli such as exercise or sleep, pharmacological agents like insulin, L-dopa, arginine, clonidine, glucagon, or GH-releasing hormone are commonly used for stimulation [4].
Years of research have focused on the superiority of various stimuli in GH stimulation tests and the correlation of these test results with different factors. However, debate persists regarding the most effective stimuli for diagnosing GHD or predicting treatment success [5]. GH stimulation tests may lead to unwarranted treatment in patients with false-positive results, as indicated by atypical therapeutic responses in treated children [6]. Favorable clinical responses to recombinant human GH (rhGH) treatment, specifically an increase in height standard deviation score (Ht SDS) of more than 0.3–0.5 after one year of therapy, can confirm hormonal deficiency [2,7]. In this context, we sought to compare L-dopa, insulin, and arginine-induced GH stimulation tests based on their response to GH replacement as a diagnostic standard, which is defined as an increase of at least 0.5 in Ht SDS during 1 year of treatment.
Materials and methods
1. Subjects
We retrospectively collected clinical data from a review of medical records in the digital database at Chonnam National University Hospital. Initially, we enrolled 429 prepubertal children with short stature who had undergone GH stimulation tests between January 2015 and December 2022. We excluded children with any genetic or chronic illness that might affect growth. Additionally, we excluded patients who had no records of growth velocity from the study. To examine the clinical responses to GH therapy, we also excluded GH-deficient patients lacking 1-year treatment results. Ultimately, 273 patients were included in this study.
We defined GHD as serum peak GH levels below 10 µg/L, determined by stimulation with at least 2 separate tests. We also classified GHD patients into 2 subgroups by stimulation test: complete GHD, as serum peak GH levels below 5 µg/L; and partial GHD, as serum peak GH levels of 5 µg/L or higher but below 10 µg/L. We diagnosed idiopathic short stature (ISS) as serum peak GH levels of 10 µg/L or higher on a stimulation test in children with a height below the third percentile for their sex and age.
Of these patients, 138 were diagnosed with ISS and were categorized into group I. The remaining 135 patients, diagnosed with GHD, received GH treatment for 1 year. We divided these GHD patients into group IIa (patients with an increase of at least 0.5 in Ht SDS) and group IIb (patients with an increase of less than 0.5 in Ht SDS). All patients were administered GH at a dose of 0.7 IU/kg/wk during 1 year of treatment.
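The grouping rules described in this section can be restated as a short sketch. It is an illustration only, not the analysis code used in the study: the function names and example values are hypothetical, and it assumes that the highest peak GH across a patient's tests decides the diagnostic category.

```python
from typing import Optional, Sequence

GHD_CUTOFF = 10.0       # µg/L, peak GH cutoff for GHD (from the text)
COMPLETE_CUTOFF = 5.0   # µg/L, peak GH cutoff for complete GHD (from the text)
RESPONSE_CUTOFF = 0.5   # gain in Ht SDS after 1 year of treatment (from the text)


def classify_diagnosis(peak_gh: Sequence[float]) -> str:
    """Return 'ISS', 'complete GHD', or 'partial GHD' from per-test peak GH (µg/L).

    Assumption: the highest peak across all tests decides the category, so GHD
    requires at least 2 tests that all peak below 10 µg/L.
    """
    highest = max(peak_gh)
    if highest >= GHD_CUTOFF:
        return "ISS"
    if len(peak_gh) < 2:
        raise ValueError("GHD requires at least 2 separate stimulation tests")
    return "complete GHD" if highest < COMPLETE_CUTOFF else "partial GHD"


def classify_group(diagnosis: str, ht_sds_gain: Optional[float] = None) -> str:
    """Return study group 'I', 'IIa', or 'IIb'."""
    if diagnosis == "ISS":
        return "I"
    if ht_sds_gain is None:
        raise ValueError("GHD patients need a 1-year Ht SDS gain")
    return "IIa" if ht_sds_gain >= RESPONSE_CUTOFF else "IIb"


# Hypothetical patient: peaks of 3.1 and 7.5 µg/L, Ht SDS gain of 0.8
diagnosis = classify_diagnosis([3.1, 7.5])
print(diagnosis, classify_group(diagnosis, 0.8))
```

Keeping the cutoffs as named constants makes it easy to see how the groups would shift under a different response threshold, such as the 0.3 gain that is also cited in the literature.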
2. Methods
1) Clinical and biochemical data
We analyzed the medical records of patients, focusing on auxologic factors such as height, midparental height (MPH), weight, sex, chronological age (CA), bone age (BA), and metabolic factors including concentrations of GH, insulin-like growth factor-1 (IGF-1), insulin-like growth factor binding protein-3 (IGF-BP3), hemoglobin (Hgb), hepatic transaminases (aspartate aminotransferase [AST] and alanine aminotransferase [ALT]), fasting glucose, total bilirubin, total cholesterol, free thyroxine (free T4), and thyroid-stimulating hormone (TSH). BA was assessed using the standard Greulich and Pyle method through left hand and wrist x-rays [8]. The pubertal stage was determined based on the criteria of Marshall and Tanner. MPH calculation involved adding 6.5 cm to the mean parental height in males and subtracting 6.5 cm in females. Predicted adult height (PAH) was estimated using the Bayley-Pinneau method, which considers the patient's height and BA [9].
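As a worked illustration of the MPH calculation described above, the following minimal sketch applies the 6.5 cm sex adjustment to the mean parental height. The function name and the example heights are hypothetical.

```python
def midparental_height(father_cm: float, mother_cm: float, sex: str) -> float:
    """Mean parental height plus 6.5 cm for males, minus 6.5 cm for females."""
    mean_parental = (father_cm + mother_cm) / 2
    if sex == "male":
        return mean_parental + 6.5
    if sex == "female":
        return mean_parental - 6.5
    raise ValueError("sex must be 'male' or 'female'")


# Hypothetical parents: father 172 cm, mother 160 cm (mean 166 cm)
print(midparental_height(172, 160, "male"))    # 172.5
print(midparental_height(172, 160, "female"))  # 159.5
```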
2) GH stimulation test
A venous catheter was inserted in the morning following an overnight fast. Stimulation tests were performed once daily over 2 consecutive days. For the L-dopa-induced stimulation test (DST), L-dopa (Perkin Tab, Myung In Pharm, Hwaseong, Korea) was administered orally. The dosage of L-dopa depended on body weight: >30 kg, 500 mg; 15 to 30 kg, 250 mg; <15 kg, 125 mg. Blood samples were drawn through the catheter before administration and at 30, 45, 60, 90, and 120 minutes after administration to measure serum GH levels at each time point. In the insulin tolerance test (ITT), 0.1 IU/kg of insulin was injected intravenously to induce hypoglycemia below 50 mg/dL or half the baseline glucose levels. Blood samples were drawn before administration and at 15, 30, 45, 60, 90, and 120 minutes afterward. During the arginine-induced stimulation test (ArST), arginine hydrochloride (GC Arginine Inj, GC Wellbeing, Seoul, Korea; 0.5 g/kg, maximum 30 g) was administered intravenously over 30 minutes. Blood samples were similarly drawn before injection and at 30, 60, 90, and 120 minutes after the completion of the injection. Priming with sex steroids before the provocation test was not conducted.
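The weight-based dosing rules in this protocol can be written out as a small sketch to make the band boundaries and the arginine cap explicit. It is for illustration only and restates the doses given in the text; the function names and example weights are hypothetical.

```python
def ldopa_dose_mg(weight_kg: float) -> int:
    """Oral L-dopa dose by body-weight band: >30 kg, 15 to 30 kg, <15 kg."""
    if weight_kg > 30:
        return 500
    if weight_kg >= 15:
        return 250
    return 125


def insulin_dose_iu(weight_kg: float) -> float:
    """Intravenous insulin dose for the ITT: 0.1 IU/kg."""
    return 0.1 * weight_kg


def arginine_dose_g(weight_kg: float) -> float:
    """Intravenous arginine hydrochloride dose: 0.5 g/kg, capped at 30 g."""
    return min(0.5 * weight_kg, 30.0)


# Hypothetical 20 kg child
print(ldopa_dose_mg(20), insulin_dose_iu(20), arginine_dose_g(20))
```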
3) Hormone assays
An immunoradiometric assay (IRMA) with a detection limit of 0.04 µIU/mL was performed to estimate GH levels. The intra-assay coefficients of variation (CVs) ranged from 1.0% to 3.5%, and the interassay CVs ranged from 2.0% to 17.9% (hGH [I-125] IRMA KIT, Institute of Isotopes Co. Ltd., Budapest, Hungary). IRMA with an analytical sensitivity of 4.55 ng/mL was utilized for measuring serum IGF-1 levels. The intra-assay CVs were at or below 5.6%, and interassay CVs were at or below 8.3% (IRMA IGF-1, Immunotech, Prague, Czech Republic). To measure serum IGF-BP3 levels, IRMA with an analytical sensitivity of 0.27 ng/mL was conducted. The intra-assay CVs were at or below 4.4%, and interassay CVs were at or below 13.5% (ACTIVE IGFBP-3 IRMA, Immunotech).
3. Statistical analysis
Categorical data are presented as numbers, and continuous data as the mean±standard deviation based on the characteristics of the variables. Comparisons between 2 groups were made using an independent-sample t-test, and comparisons among the 3 groups were performed using 1-way analysis of variance to compare the means of continuous variables. Comparisons of initial and final data within each group were made using a paired t-test. The Pearson correlation coefficient was applied in correlation analysis. Univariate linear regression analysis was conducted to determine variables associated with changes in Ht SDS in response to GH replacement. Independent predictors of changes in Ht SDS after treatment were identified using multiple regression analysis. Statistical significance was defined as P-values <0.05.
Standard deviation scores of height, weight, body mass index (BMI), and PAH were derived from the 2017 Korean national growth charts for children and adolescents from the Korea Disease Control and Prevention Agency [10]. IGF-1 SDS were calculated based on age- and sex-specific references of Korean children [11]. Specificity was defined as the number of true negative results divided by the sum of true negative and false-positive results. Accuracy was defined as the number of true positive and true negative results divided by the total number of tests. Statistical analysis was performed using IBM SPSS Statistics ver. 21.0 (IBM Co., Armonk, NY, USA).
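For reference, the two diagnostic metrics can be computed from a 2×2 table as in the sketch below, which uses the conventional definition of specificity, TN/(TN+FP), and accuracy, (TP+TN)/all tests. The counts are invented for illustration and are not the study's data; in the study's framing, a false positive is a test result indicating GHD in a child who did not gain at least 0.5 in Ht SDS.

```python
def specificity(true_neg: int, false_pos: int) -> float:
    """True negatives divided by all truly negative cases (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def accuracy(true_pos: int, true_neg: int, false_pos: int, false_neg: int) -> float:
    """Correct results (TP + TN) divided by the total number of tests."""
    return (true_pos + true_neg) / (true_pos + true_neg + false_pos + false_neg)


# Hypothetical counts, not taken from the study
tp, tn, fp, fn = 40, 30, 10, 20
print(f"specificity = {specificity(tn, fp):.1%}")      # 75.0%
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.1%}")  # 70.0%
```

Because accuracy pools both kinds of correct result, two tests can have similar accuracy while differing in specificity, which is the pattern reported for ITT and ArST.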
4. Ethics statement
The Institutional Review Board of Chonnam National University Hospital (CNUH) approved this study (approval number: CNUH-2023-234).
Results
1. Subject characteristics
The clinical and biochemical parameters at the initial visit and after 1 year of rhGH treatment are presented in Tables 1 and 2. Of the 273 patients who underwent GH stimulation tests, 138 patients (50.5%) were diagnosed with ISS and were classified into group I. The remaining 135 patients (49.5%) diagnosed with GHD were further categorized: 110 into group IIa (patients with an increase of at least 0.5 in Ht SDS) and 25 into group IIb (patients with an increase of less than 0.5 in Ht SDS). The GHD patients were also classified by peak GH level into 2 subgroups: 42 into the complete GHD group and 93 into the partial GHD group. Among all 273 patients, 166 were boys and 107 were girls. No significant differences in sex distribution were observed between the groups. The mean CA was 7.7±2.4 years, and the mean BA was 7.1±2.4 years.
2. Comparison of peak serum GH levels in response to GH stimulation test by 3 types of stimuli
In the cohort of 273 patients, a total of 464 GH stimulation tests were performed. L-Dopa, insulin, and arginine were employed as stimuli in 273 (58.8%), 97 (20.9%), and 94 of the cases (20.3%), respectively. As anticipated, peak GH levels in group I were significantly higher than those in groups IIa and IIb across all stimulation tests. In the ArST, a notable difference in peak GH concentration was observed between group IIa (4.6±2.5 µg/L) and group IIb (2.6±1.9 µg/L) (Table 1).
3. Comparison of parameters between or within groups before and after GH treatment
1) Comparison between groups before treatment
Prior to treatment, group IIa exhibited significantly lower levels of IGF-BP3 and higher BMI SDS compared to other groups. Weight SDS and ALT were significantly elevated, while IGF-BP3 SDS was reduced in group IIa relative to group I. Growth velocity was significantly elevated in group I compared to other groups. Total cholesterol was significantly elevated in group IIb relative to other groups. No differences were observed in Ht SDS, PAH, PAH SDS, MPH SDS, IGF-1, and IGF-1 SDS among groups (Table 1).
2) Comparison of initial and final data within group IIa and group IIb
In group IIa, we observed a significant increase in Ht SDS, growth velocity, PAH, PAH SDS, and weight SDS, and a significant decrease in CA–BA and BMI SDS, after one year of treatment. In group IIb, however, only Ht SDS, growth velocity, and weight SDS increased significantly. When comparing metabolic parameters, we noted a significant rise in IGF-1, IGF-1 SDS, IGF-BP3, and IGF-BP3 SDS within both groups IIa and IIb (Table 2).
3) Comparison of final data between groups IIa and group IIb
Consistent with our premise, the increase in Ht SDS after one year of treatment was significantly higher in group IIa than in group IIb (0.82±0.23 vs. 0.37±0.08). Group IIa also had significantly higher Ht SDS, growth velocity, PAH SDS, weight SDS, and IGF-1 SDS than group IIb (Table 2).
4. Comparison of parameters between or within complete GHD and partial GHD groups before and after GH treatment
1) Comparison between groups before treatment
Prior to treatment, the complete GHD group exhibited significantly higher CA, BA, height, and PAH compared to the partial GHD group. No differences were observed in CA–BA, Ht SDS, PAH SDS, weight SDS, BMI SDS, IGF-1, IGF-1 SDS, IGF-BP3, or IGF-BP3 SDS between groups (Table 3).
2) Comparison of initial and final data within groups
All variables showed a significant difference comparing initial to final data within each group (Table 3).
3) Comparison of final data between groups
After 1 year of GH replacement, the complete GHD group differed from the partial GHD group, with significantly higher CA, BA, height, and PAH. No differences were observed in Ht SDS gain or growth velocity (Table 3).
5. Correlations between variables and gain of height SDS after 1 year of treatment
The gain in Ht SDS in GHD groups after 1 year of treatment showed negative correlations with CA (r=-0.459, P<0.001), BA (r=-0.459, P<0.001), height (r=-0.436, P<0.001), IGF-1 (r=-0.185, P=0.036), and IGF-BP3 (r=-0.298, P=0.001). Conversely, weight SDS (r=0.271, P=0.001), BMI SDS (r=0.382, P<0.001), AST (r=0.258, P=0.003), and TSH (r=0.197, P=0.023) exhibited a positive correlation with Ht SDS gain. However, gain in Ht SDS did not demonstrate a statistically significant association with CA–BA, Ht SDS, MPH, PAH, PAH SDS, IGF-1 SDS, IGF-BP3 SDS, Hgb, ALT, total bilirubin, total cholesterol, fasting glucose, free T4, or peak GH concentrations in stimulation tests. When the subjects were divided into groups IIa and IIb according to whether Ht SDS increased by at least 0.5, group IIa showed correlations between variables and Ht SDS gain similar to those in the total GHD cohort, except that IGF-1 and TSH showed no correlation with Ht SDS gain. Group IIb demonstrated a negative correlation of Ht SDS gain with BA (r=-0.461, P=0.020), height (r=-0.419, P=0.037), MPH (r=-0.521, P=0.008), MPH SDS (r=-0.416, P=0.039), and TSH (r=-0.487, P=0.014). Only ALT (r=0.429, P=0.032) was positively correlated with Ht SDS gain in group IIb (Table 4).
6. Multiple regression analysis of variables and gain of height SDS after one year of treatment
CA, BA, height, weight SDS, BMI SDS, IGF-1, IGF-BP3, AST, and TSH were associated with height SDS gain in univariate regression analysis. Using multiple regression analysis, we found that BA and IGF-BP3 were negatively and BMI SDS and IGF-1 were positively associated with height SDS gain after adjusting for other factors in the total GHD groups. In group IIa, BMI SDS was positively associated with height SDS gain. In group IIb, BA and TSH were negatively associated with height SDS gain (Table 5).
7. Specificity and accuracy of each GH stimulation test
As previously defined, the specificity and accuracy were 50.3% and 70.3% for DST, 72.3% and 86.6% for ITT, and 64.7% and 87.2% for ArST, respectively.
Discussion
In recent years, there has been a notable increase in the number of children and adolescents receiving GH treatment, driven by heightened parental concern about their children’s height. Parents of children with short stature frequently seek evaluation at pediatric endocrine outpatient clinics. The initial assessment involves analyzing auxologic, basic laboratory, and radiologic data to rule out genetic, organic, or other secondary causes of short stature. Children whose height falls below the third percentile, with no abnormalities detected in initial tests, are further evaluated for GHD. Diagnosis of GHD is not based solely on basal GH levels, as these can vary due to physiological circadian rhythms and daytime fluctuations in GH secretion. Instead, GHD confirmation relies on the GH response to stimulation tests using at least 2 different stimulants.
However, the variability and reproducibility of GH stimulation tests have been subjects of debate, leading to an absence of a definitive "gold standard" for GHD diagnosis [4,12-15]. The risk of treating stimulation testing as definitive for GHD lies in the potential for overreliance on its results. The outcomes of GH stimulation tests are influenced by both the intrinsic nature of the test and patient-specific factors [1]. Extensive research has been conducted to enhance or ascertain the reliability of GH stimulation tests. This research includes comparing the diagnostic accuracy of tests using various stimulants [16-20], examining factors that affect test results [21-25], and investigating responses to GH treatment, as conducted in our study [26-29].
The reliability of the GH provocation test is low, and the cutoff point for diagnosing GHD has been set somewhat arbitrarily across various studies. The challenge in testing for GHD lies in how the disorder is defined. Monitoring the growth of patients who have undergone these tests, both with and without treatment, is a beneficial approach to detect false-positive or false-negative results in GH stimulation tests. Furthermore, optimizing growth during the first year of GH treatment in prepubertal children through evaluation of their response to GH therapy is crucial, as this response is a strong predictor of future growth outcomes [26,29]. If a patient's response to GH treatment continues to be poor, conducting a prompt and thorough investigation to consider increasing the GH dose or discontinuing treatment is advisable. This approach helps to avoid unnecessary daily painful injections, wasted expenditure, and exposure to potential adverse outcomes [4,26].
The assessment of outcomes following GH therapy requires meticulous consideration, as there exists a considerable variability in individual responses [1,4]. This variability might stem from factors such as suboptimal adherence to treatment, exposure to adverse environmental conditions, or inherent genetic differences in GH sensitivity. However, the exact reasons for this variation are not entirely clear [4]. Due to the continuous nature of response to GH therapy, establishing a definitive criterion for non-responsiveness remains subjective [2,7,30]. Proposed parameters for identifying inadequate response in the first year of treatment include a height velocity SDS below +1, an increase in height velocity of less than 3 cm per year, a height velocity less than -1 SDS relative to the average [27,29], and an increase in Ht SDS of less than 0.3 to 0.5 [28,29]. Our objective was to evaluate the efficacy of 3 different stimulation tests conducted in our Pediatric Endocrinology Department. This evaluation was based on the response to 1 year of rhGH treatment, specifically whether the change in Ht SDS was at least 0.5 or below 0.5. We also sought to identify factors influencing treatment responsiveness. The group that showed the weaker response was treated as the false-positive group to evaluate the effectiveness of the stimulation test.
In this study, patients who underwent GH stimulation testing were classified into 3 groups based on the presence of GHD and, in GHD patients, the increase in Ht SDS after one year of treatment. At the first visit, group IIa, identified as true GHD patients, exhibited lower IGF-BP3 levels and higher BMI SDS compared to the other groups. Prior research has established IGF-1 and IGF-BP3 as effective indicators for GHD screening and has shown their correlation with basal GH levels [13,31-33]. Furthermore, multiple studies have found a notable negative correlation between BMI and peak GH in the stimulation test, which supports the use of BMI-specific thresholds for GHD diagnosis [12,34-36]. As expected, peak GH levels in stimulation tests were higher in group I than in the other groups. However, peak GH levels in the ArST for group IIb, the false-positive GHD patients, were significantly lower than those in group IIa (2.6±1.9 µg/L vs. 4.6±2.5 µg/L). While arginine, a GH stimulus, has high intraindividual reproducibility, normal individuals without GHD often exhibit peak GH concentrations below 3 µg/L in ArST, as observed in our study, which is similar to patients with severe GHD [37]. The cause of this result is unclear, so additional research is necessary.
After 1 year of treatment, the increase in Ht SDS was significantly higher in group IIa compared to group IIb (0.83±0.23 vs. 0.37±0.08). In the total GHD patient cohort, the change in Ht SDS was negatively correlated with CA, BA, height, IGF-1, and IGF-BP3 at the initial visit, and positively correlated with weight SDS, BMI SDS, AST, and TSH. In multiple regression analysis of values obtained prior to GH replacement, BA and IGF-BP3 emerged as negative predictors, while BMI SDS and IGF-1 were positive predictors of Ht SDS change. Notably, BA, CA, and height, which are likely interrelated, were significantly higher in group IIb than in group IIa at the initial visit. In addition, BA and TSH showed a negative association with the change in Ht SDS in group IIb. Collett-Solberg et al. [2] have shown that GHD-induced persistent short stature, absent precocious puberty, is marked by a BA delay of at least 1–2 years. Consequently, children with a BA equal to or exceeding CA should not be diagnosed with GHD. This underscores that Ht SDS gain, a crucial factor in GHD diagnosis, is likely inversely related to higher BA. Additionally, obese children receiving GH replacement therapy exhibited higher IGF-1 levels than their normal-weight counterparts [31]. Despite the lower stimulated peak GH concentrations, which suggest an increased likelihood of false positives in GH stimulation tests, obese GHD patients demonstrated a more pronounced height increase compared to non-obese GHD children [36].
Based on our definition, DST demonstrated lower specificity and accuracy compared to the other 2 tests, with specificity at 50.3% and accuracy at 70.3%. The accuracies of ITT and ArST were notably similar (86.6% vs. 87.2%), yet they differed in specificity (72.3% for ITT vs. 64.7% for ArST). While it would be logical to select the test with the highest validity according to these results, practical considerations often dictate otherwise. The provocative tests are associated with certain side effects due to the stimulus. DST frequently causes gastrointestinal issues, predominantly nausea and vomiting. Insulin-induced hypoglycemia can emerge as a severe condition demanding immediate intervention. Compared to other tests, ArST has a lower incidence of side effects, which can include headaches, nausea, vomiting, hypoglycemia, and allergic reactions. In our study, of the 273 patients subjected to DST, 175 (64.1%) experienced gastrointestinal symptoms such as nausea, vomiting, and abdominal pain, yet none required urgent medical attention. In the ITT group of 97 subjects, the average nadir serum glucose level was 41.6 mg/dL, with the average time to reach the nadir being 28.8 minutes after insulin administration. Hypoglycemic symptoms like cold sweating, drowsiness, and dizziness were observed in 27 patients (27.8%), who were advised to restore glucose levels through oral supplements. However, one patient exhibiting severe hypoglycemia with mental alteration required intravenous dextrose administration. Among the 94 subjects undergoing ArST, only 3 cases (3.2%) reported complications, including 2 instances of vomiting and 1 of hypoglycemia.
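The side-effect percentages quoted above follow directly from the reported counts, as the short check below shows. The dictionary layout and variable names are illustrative; the ITT entry uses the 27 patients reported with hypoglycemic symptoms.

```python
# (patients with reported side effects, patients tested), as given in the text
adverse_events = {
    "DST": (175, 273),
    "ITT": (27, 97),
    "ArST": (3, 94),
}


def percent(events: int, tested: int) -> float:
    """Share of tested patients with a reported side effect, in percent."""
    return round(100 * events / tested, 1)


rates = {name: percent(events, tested) for name, (events, tested) in adverse_events.items()}
print(rates)  # {'DST': 64.1, 'ITT': 27.8, 'ArST': 3.2}
```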
Our study has several limitations. It was performed retrospectively at a single center, and the sample size is too small to represent the general pediatric population with short stature. Further studies with larger cohorts are needed to confirm our results.
We aimed to compare 3 GH stimulation tests based on the response to GH replacement as a diagnostic standard, defined as an increase of at least 0.5 in Ht SDS during one year of treatment. To assess the specificity and accuracy of the stimulation tests, we treated the group with the weaker response as false positives. In conclusion, the response to GH replacement indicates that ArST exhibits lower specificity yet maintains comparable accuracy to ITT. Notably, patients subjected to the arginine test experienced fewer side effects and did not require immediate treatment for recovery, unlike some of those tested with insulin. These findings underscore the importance of careful selection of stimulation tests when diagnosing GHD. Furthermore, all children undergoing GH treatment should receive periodic follow-up assessments to monitor their response to the treatment.
Notes
Conflicts of interest
No potential conflict of interest relevant to this article was reported.
Funding
This study received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data availability
The data that support the findings of this study can be provided by the corresponding author upon reasonable request.
Author contribution
Conceptualization: CJK; Data curation: SHC; Formal analysis: SHC; Methodology: SHC; Project administration: CJK; Visualization: SHC, CJK; Writing - original draft: SHC; Writing - review & editing: SHC