Phys. Ther. Korea 2021; 28(3): 186-193
Published online August 20, 2021
© Korean Research Society of Physical Therapy
1Sports Movement Artificial-Intelligence Robotics Technology (SMART) Institute, Department of Physical Therapy, 2Department of Physical Therapy, Yonsei University, Wonju, 3Rehabilitation Center, Chungnam National University Hospital, Daejeon, Korea
Correspondence to: Joshua (Sung) Hyun You
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background: While the formal test has been used to provide a quantitative measurement of core stability, studies have reported inconsistent results regarding its test-retest and intraobserver reliabilities. Furthermore, the validity of the formal test has never been established.
Objects: This study aimed to establish the concurrent validity and test-retest reliability of the formal test.
Methods: Twenty-two young adults with and without core instability (23.1 ± 2.0 years) were recruited. Concurrent validity was determined by comparing the muscle thickness changes of the external oblique, internal oblique, and transverse abdominal muscle to changes in core stability pressure during the formal test using ultrasound (US) imaging and pressure biofeedback, respectively. For the test-retest reliability, muscle thickness and pressure changes were repeatedly measured approximately 24 hours apart. Electromyography (EMG) was used to monitor trunk muscle activity during the formal test.
Results: The Pearson’s correlation analysis showed an excellent correlation between transverse abdominal thickness and pressure biofeedback unit (PBU) pressure as well as internal oblique thickness and PBU pressure, ranging from r = 0.856–0.980, p < 0.05. The test-retest reliability was good, intraclass correlation coefficient (ICC1,2) = 0.876 for the core stability pressure measure and ICC1,2 = 0.939 to 0.989 for the abdominal muscle thickness measure.
Conclusion: Our results provide clinical evidence that the formal test is valid and reliable when used concurrently with EMG and US measurements.
Keywords: Electromyography, Formal Test, Test-Retest Reliability, Ultrasonography, Validity
Spinal segmental stabilization exercises have been developed to provide effective and efficient exercise programs for athletes with sports injuries and patients with low back pain (LBP) associated with core instability [1-4]. The abdominal draw-in maneuver (ADIM), a form of core stabilization technique [5,6], involves selective neuromuscular recruitment control of the transverse abdominal (TrA), internal oblique (IO), and multifidus muscles together with the minimal contraction of the superficial external oblique (EO) and paraspinal muscles [7-9]. The co-contraction of deep local trunk muscles such as the TrA, IO, and multifidus muscles is considered essential to lumbar stability. It has been hypothesized that the co-activation of these muscles, together with the thoracolumbar fascia, generates an intra-abdominal pressure that transforms the abdomen into a mechanically rigid cylinder, thus providing spinal stability [10,11].
Accurate and reliable core stabilization testing is crucial for clinical decision-making regarding diagnostic procedures in managing athletes with sports injuries and patients with core instability-related LBP. Various tools, including real-time ultrasound (US) imaging, electromyography (EMG), and pressure biofeedback unit (PBU), have been widely used to provide feedback about deep trunk muscle recruitment and to measure a person’s ability to contract their core muscles, particularly their TrA muscles [11-16]. A PBU is an inflatable, inelastic bag connected to a pressure gauge and inflation devices. A pressure change in a PBU’s inflation bulb indicates the contraction or relaxation of deep local trunk muscles (e.g., TrA and IO muscles) during ADIM in a prone position, also referred to as the ‘formal test’ or ‘PRONE test’ in previous studies [16,17]. This test is used clinically to assess deep core muscles and for the re-education or selective contraction of deep local trunk muscles to improve core stability. However, reliability studies assessing deep core muscles via PBU have reported inconsistent results regarding test-retest and intraobserver reliability [17-19]. Furthermore, to the best of our knowledge, the formal test’s validity has never been established.
It remains unclear whether core stability can be assessed using a PBU alone in the formal test, which increases the selective co-activation of TrA and IO muscles while minimizing the activation of the EO and erector spinae (ES) muscles. Therefore, our specific aim was to establish the validity and reliability of the formal test by examining the relationship between abdominal muscle thickness and core stability pressure.
A convenience sample of 11 young adults with core instability (8 males and 3 females, 23.1 ± 2.0 years) and 11 young adults with core stability (8 males and 3 females, 23.0 ± 3.1 years) was recruited from a local university. The core instability group consisted of subjects who were unable to successfully perform the formal test in three consecutive trials at baseline. The core stability group subjects completed the ADIM core stabilization training, augmented by PBU-US-EMG feedback, for 20 minutes a day, 7 days a week, over two weeks. They successfully passed the formal test three consecutive times. Subjects who presented with any known diagnosis of neurological or musculoskeletal disease were excluded. The experimental study protocol was approved by the Yonsei University Institutional Review Board (1041849-202103-BM-048-02), and informed consent was obtained from all participants before participation in the study. The demographic and anthropometric data are presented in Table 1.
A PBU (Chattanooga Group, Hixson, TN, USA), comprising a combined gauge and inflation bulb connected to a pressure cell, was used to assess core stability during the formal test. Real-time US (Medison Inc., Seoul, Korea) was used to determine core stability by measuring the thickness of the abdominal muscles (including the TrA, IO, and EO muscles) on the dominant side and to provide accurate visual feedback during the 2-week ADIM core stabilization training. The inferior borders of the rib cage and iliac crest were palpated as reference points. A 4.5 cm linear transducer (L5-12EC; Samsung, Suwon, Korea) with a frequency of 10 MHz was positioned transversely, halfway between these reference points on the anterolateral abdominal wall, lateral to the midline [21,22]. The transducer head was manipulated until three distinct muscle layers (TrA, IO, and EO) were visible. All images were consistently obtained at the end of the expiration phase to prevent the potential influence of respiration on muscle thickness. The muscle thickness measurements (cm) were determined by an investigator using an on-screen caliper. Abdominal muscle thickness was defined by drawing a horizontal reference line located 1 cm from the muscle-fascia junction of the TrA muscle (benchmark) (Figure 1). The myofascial boundaries were defined as the target layers of the hypoechoic area, where the heterogeneous boundaries were manifested by contrasting pixels from dark to light [23,24]. Unacceptable data resulting from movement artifacts were discarded, and the US image was obtained repeatedly, measured using the image data, and then stored for further analysis.
Surface EMG (Noraxon Inc., Scottsdale, AZ, USA) was used to monitor the optimal performance of the ADIM in the formal test, which involved minimal activation of the superficial trunk muscles, including the EO and ES muscles, while maintaining the selective co-activation of the TrA and IO muscles [11,16]. We measured the EMG amplitudes of the superficial trunk muscles on the dominant side (including the EO and ES muscles) during ADIM in the prone position. Each participant’s skin was carefully prepared to reduce skin impedance to below 5 kΩ by dry-shaving hair with a disposable razor and cleansing the shaved skin with a 2% alcohol swab. Once the skin was dry, pre-gelled bipolar Ag/AgCl disposable electrodes (Bio-Protech Inc., Wonju, Korea) were placed over the EO and ES muscles with an inter-electrode distance of 2.0 cm. Two representative peak EMG amplitude values from the five trials were obtained at the maximal expiratory volume in a standing position by monitoring with a spirometer. The middle 0.35-second EMG signals were then averaged to provide a stable reference value. This reference anchor was used to calculate the percentage of reference voluntary contraction (RVC) to normalize the EMG amplitudes of the above muscles. The EO and ES muscles’ raw EMG signals were recorded at a sampling rate of 1,500 Hz and processed with a 60 Hz notch filter to reduce noise associated with electrical interference, including 60 Hz power lines, radio frequencies, and electric or magnetic devices. The root mean-square EMG amplitude for the superficial trunk muscles was computed using MyoResearch software (version 1.07; Noraxon Inc.). The EMG signal was full-wave rectified and filtered using a band-pass filter at 15–500 Hz.
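The normalization described above can be illustrated with a minimal sketch. It assumes that %RVC is the root mean-square (RMS) amplitude during the task divided by the RMS amplitude of the reference window, multiplied by 100. The function names (rms, middle_window, percent_rvc) and the sample values are hypothetical and are not taken from the study or from MyoResearch; filtering and rectification are omitted.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of EMG samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def middle_window(samples, fs_hz, seconds):
    """Return the centered window of the given duration (e.g. the middle 0.35 s)."""
    n = int(round(fs_hz * seconds))
    if n <= 0 or n > len(samples):
        raise ValueError("window does not fit in the recording")
    start = (len(samples) - n) // 2
    return samples[start:start + n]


def percent_rvc(task_samples, reference_samples):
    """Task RMS amplitude as a percentage of the reference (RVC) RMS amplitude."""
    ref = rms(reference_samples)
    if ref == 0:
        raise ValueError("reference amplitude must be non-zero")
    return 100.0 * rms(task_samples) / ref


# Made-up amplitudes in arbitrary units, for illustration only (not study data).
reference = [2.0, -2.0, 2.0, -2.0]   # RMS = 2.0
task = [0.5, -0.5, 0.5, -0.5]        # RMS = 0.5
print(f"{percent_rvc(task, reference):.1f}% RVC")  # prints "25.0% RVC"

# At the 1,500 Hz sampling rate, a 0.35 s window spans 525 samples.
one_second = [0.0] * 1500
print(len(middle_window(one_second, 1500, 0.35)))  # prints 525
```

In this sketch, a value of 25.0% RVC would exceed the 15% RVC limit that the study uses to define excessive superficial muscle contraction.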
Before the formal test, we ensured that all participants understood the ADIM and had practiced it for five minutes, after which they were sorted into core instability and core stability groups by a certified physical therapist (SJ). The formal test used a PBU to assess deep local trunk muscle function [16,17]. Additionally, we used real-time US (gold standard measurement) and surface EMG to measure deep abdominal muscle thickness and superficial trunk muscle activity (Figure 2). Initially, the subjects lay in a prone position on a plinth, arms at their sides, head and neck fully relaxed in the midline, with the PBU under their abdomen, navel in the center, and the distal edge of the pad in line with the right and left anterior superior iliac spine (ASIS). The subjects were instructed to relax their entire body before abdominal muscle contraction. The PBU was inflated to a pressure of 70 mmHg. Subjects were then verbally instructed to perform the ADIM using standard terminology, such as “pull your belly button up and in toward your spine without pelvic movement during exhalation and contract the deep abdominal muscles at a level of <20% of maximal contraction without excessively contracting the superficial muscles.” The ADIM was held for 10 seconds [16,17,27]. Richardson et al. described a positive result for the formal test as a pressure reduction of approximately 4 to 10 mmHg over 10 seconds. This positive result indicated a correct localized contraction of the TrA and IO muscles, independent of the other abdominal muscles.
In this study, the formal test for establishing concurrent validity was considered “successful” when the participant met the following criteria: (1) performed the correct ADIM (the selective co-activation of TrA and IO muscles as determined by muscle thickness changes using real-time US) while reducing the pressure by approximately 4–10 mmHg and maintaining this target pressure level monitored by the PBU and (2) performed the ADIM without substitutions such as the excessive contraction of the EO and ES muscles (<15% RVC) as evidenced in the EMG or any evasive movements (e.g., pelvic rotation, lumbar arching, shoulder elevation, or upper chest expansion). If the participant met only one of the above criteria, the formal test was operationally defined as a ‘failure.’
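The two criteria above can be expressed as a simple decision rule. The following is a minimal sketch under stated assumptions: the pressure range is treated as inclusive at 4 and 10 mmHg (the article says "approximately"), a trial meeting neither criterion is also treated as a failure, and the function name and parameters are hypothetical rather than part of the study protocol.

```python
def formal_test_result(pressure_drop_mmhg, deep_coactivation_on_us,
                       eo_percent_rvc, es_percent_rvc, evasive_movement,
                       low_mmhg=4.0, high_mmhg=10.0, rvc_limit=15.0):
    """Classify one formal-test trial as 'successful' or 'failure'.

    Criterion 1: correct ADIM seen on ultrasound while the PBU pressure
                 drops by 4-10 mmHg (bounds assumed inclusive).
    Criterion 2: EO and ES activity below 15% RVC and no evasive movement.
    """
    criterion_1 = (deep_coactivation_on_us
                   and low_mmhg <= pressure_drop_mmhg <= high_mmhg)
    criterion_2 = (eo_percent_rvc < rvc_limit
                   and es_percent_rvc < rvc_limit
                   and not evasive_movement)
    # Both criteria must hold; meeting only one (or neither) is a failure.
    return "successful" if criterion_1 and criterion_2 else "failure"


# Illustrative inputs only (not study data).
print(formal_test_result(8.0, True, 10.0, 12.0, False))  # prints "successful"
print(formal_test_result(8.0, True, 28.0, 12.0, False))  # prints "failure"
```

The second call shows the case the study highlights: the target pressure reduction is reached, but excessive EO activity means the trial is still classified as a failure.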
The formal test’s concurrent validity was determined by comparing the thickness of the abdominal (TrA and IO) muscles using real-time US (gold standard measurement) and the core stability pressure using PBU in young adults with core instability and stability. The procedural steps for validity measurements were as follows: first, surface EMG electrodes were placed over the EO and ES muscles to monitor minimal contraction of the superficial trunk muscles (less than 15% RVC). The EO electrodes were positioned halfway between the iliac crest and the twelfth rib at a slightly oblique angle, running parallel to the muscle fibers. The ES electrodes were placed over the muscle mass, approximately 2 cm from the third lumbar vertebra (L3). The ground electrode was then placed over the ASIS. Second, the subjects lay in a relaxed, prone position on a plinth with the PBU under their abdomen. The bulb of the PBU was inflated to a pressure of 70 mmHg. Third, the subjects performed the ADIM in a pressure reduction range of 4 to 10 mmHg [5,7,9,11], and the pressure changes were observed on the PBU pressure gauge. Fourth, the thickness of the TrA, IO, and EO muscles was measured using real-time US simultaneously with the surface EMG measurements of the EO and ES muscles’ amplitudes. Finally, all subjects performed three consecutive ADIM trials in a pressure reduction range of 4–10 mmHg.
The test-retest reliability of the formal test (a contraction of the deep abdominal muscles at a level of <20% of maximal contraction) was determined by repeatedly measuring core stability pressure and muscle thickness (TrA, IO, and EO muscles) using a PBU and real-time US on two separate occasions, approximately 24 hours apart, in the core stability group. To consistently measure muscle thickness using real-time US, the transducer head location was marked on a transparent sheet to ensure identical placement between the tests and retests. The test-retest reliability (PBU, US) and EMG data were collected by a certified physical therapist (SJ).
Standard statistical analysis included computations of means and standard deviations, Pearson’s correlation coefficient (r), and the intraclass correlation coefficient (ICC1,2). The normality of the data was established using the Shapiro–Wilk test. Pearson’s correlation coefficient (r) was calculated to determine the validity of the formal test by comparing the thickness of the TrA and IO muscles using real-time US (gold standard measurement) and the core stability pressure using a PBU in both groups (core instability vs. core stability). The ICC1,2 (95% confidence interval, [95% CI]) was used to determine the test-retest reliability of the formal test by repeatedly measuring core stability pressure and abdominal muscle thickness obtained on two separate days (24 hours apart) in the core stability group. The alpha level was set at 0.05, and IBM SPSS Statistics for Windows version 24.0 (IBM Co., Armonk, NY, USA) was used to conduct the statistical analysis.
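For readers who want to see how these two statistics are computed, the following is a minimal sketch rather than the SPSS procedure used in the study. The article does not state which ICC model "ICC1,2" denotes; the sketch assumes the one-way random-effects form of Shrout and Fleiss, ICC(1,k) with k = 2 sessions, and also returns the single-measure ICC(1,1). The function names and the small data set are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def icc_one_way(rows):
    """One-way random-effects ICC from an n-subjects x k-sessions table.

    Returns (ICC(1,1), ICC(1,k)), computed from the between-subject mean
    square (BMS) and the within-subject mean square (WMS).
    """
    n, k = len(rows), len(rows[0])
    if n < 2 or k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 subjects with k >= 2 sessions each")
    grand = sum(sum(r) for r in rows) / (n * k)
    means = [sum(r) / k for r in rows]
    bms = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    wms = sum((v - m) ** 2 for r, m in zip(rows, means) for v in r) / (n * (k - 1))
    icc_single = (bms - wms) / (bms + (k - 1) * wms)
    icc_average = (bms - wms) / bms
    return icc_single, icc_average


# Made-up test and retest values for three subjects (not study data).
test_retest = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
single, average = icc_one_way(test_retest)
print(round(single, 4), round(average, 4))  # prints "0.8824 0.9375"

# Perfectly linear made-up data gives r = 1.
print(round(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 4))  # prints "1.0"
```

The confidence intervals reported in the article are not reproduced here, since they depend on F-distribution quantiles that the Python standard library does not provide.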
Table 2 presents Pearson’s correlation coefficient (r) assessing the relationship between the thickness of the abdominal muscles and core stability pressure to measure the core stability function. The correlation (r) values of the core stability group were 0.963 (TrA-PBU) and 0.980 (IO-PBU), indicating very high validity. In the core instability group, the correlation (r) between the two measurements was also very high (TrA-PBU, r = 0.856; IO-PBU, r = 0.953) (Table 2). However, Table 3 shows that the normalized EMG activities of the EO muscle in a pressure reduction range of 6–10 mmHg in the core instability group were 21.4%, 28.3%, and 35.3% RVC, respectively. Hence, the core instability group only met one of the criteria for successful formal testing because of the EO muscle’s excessive contraction. It was operationally defined as a ‘failure’ or an ‘unsuccessful formal test’. Thus, the core instability group’s formal tests did not show very high validity in our findings.
The test-retest reliability of the formal test obtained by repeatedly measuring core stability pressure and abdominal muscle thickness in the core stability group is shown in Table 4. The ICC1,2 value of the core stability pressure was 0.876 (95% CI = 0.568–0.964). The ICC1,2 values for abdominal muscle thickness, including the TrA, IO, and EO muscles, were 0.971 (95% CI = 0.899–0.992), 0.939 (95% CI = 0.789–0.983), and 0.989 (95% CI = 0.961–0.997), respectively (Table 4).
To the best of our knowledge, this is the first study to investigate the concurrent validity of the formal test in young adults. The purpose of this study was to establish the concurrent validity of the formal test by comparing the thickness of abdominal (TrA and IO) muscles using real-time US (gold standard measurement) and the core stability pressure using a PBU in young adults with core instability and stability. We also aimed to establish the test-retest reliability of the formal test by repeatedly measuring core stability pressure and abdominal muscle thickness in young adults with core stability.
Pearson’s analysis demonstrated a high correlation between the thickness of the abdominal muscles (e.g., TrA and IO muscles) and the formal test’s core stability pressure, supporting an excellent validity (r = 0.856 to 0.980) in both groups. However, the normalized EMG activities of the EO muscle in a pressure reduction range of 6–10 mmHg were from 21.4% to 35.3% RVC, which reflected excessive contraction of the superficial trunk muscles in the core instability group. According to the operational definition of a successful formal test, normalized EMG activities should be < 15% RVC to confirm the minimal contraction of EO and ES muscles, as evidenced by EMG and according to Richardson et al. The formal test has been defined operationally as a ‘failure’ because the core instability group met only one of the criteria for a successful formal test. We cannot conclude that the relationship between deep abdominal muscle thickness and core stability pressure in the formal test indicates very high concurrent validity without confirming the EMG activities of the superficial EO and ES muscles in the core instability group.
The formal test’s test-retest reliability was good (ICC1,2 of core stability pressure = 0.876; ICC1,2, abdominal muscles = 0.939–0.989) in the core stability group. These findings are difficult to compare to the results of previous studies because ours is the first study investigating the concurrent validity of the formal test in young adults with core instability and stability. Previous studies have examined the test-retest reliability and inter-observer reliability of the formal test (also called the ‘PRONE test’) using only a PBU [17-19]. Moseley showed good test-retest reliability (ICC = 0.91; 95% CI = 0.71–0.99), and Costa et al. described fair test-retest reliability (ICC = 0.58; 95% CI = 0.28–0.78). von Garnier et al. reported moderate test-retest reliability (ICC = 0.81; 95% CI = 0.67–0.90) and low intraobserver reliability (ICC = 0.47; 95% CI = 0.20–0.67). Such inconsistent test-retest reliability findings may be attributable to the different test procedures and instruments used (palpation or PBU rather than real-time US or EMG) and varied criteria (ADIM-trained participants vs. novice participants). Interestingly, superficial EO muscle thickness in real-time US did not correlate well with the core stability pressure measured by the PBU, which suggests no increased EO muscle thickness in the core instability group. However, the EO muscle in a pressure reduction range of 6–10 mmHg showed an excessive contraction in normalized EMG activities. In support of these results, Hodges et al. established that real-time US imaging enabled the detection of EMG muscle activity changes up to 22% of the maximal voluntary isometric contraction in the IO muscle. However, the noted EO thickness changes did not correlate with the changes in the muscle activity amplitude. These findings suggest that US measurements of EO muscle thickness cannot be used to estimate or correlate with its EMG muscle activity results.
Specifically, our findings imply that incorrect ADIM with excessive contraction of the superficial trunk muscles is possible while reducing the PBU pressure from 70 mmHg by approximately 4–10 mmHg, which ultimately leads to conflicting results in the formal test. We suggest that using surface EMG is necessary to accurately monitor the superficial trunk muscles (i.e., the EO and ES muscles) and obtain reliable results from formal tests, thus improving these tests’ sensitivity.
Core stability describes the trunk’s ability to maintain force production and withstand the forces acting on it. It is specifically addressed in many sports programs as a part of fitness professionals’ athletic conditioning. Relationships between core stability deficits and increased risk of injury have been identified in several studies [30,31].
There are some limitations in the current study. First, the subjects included only young adults, and the sample size was small. Second, core stability and core instability groups could not be subdivided into groups with back pain and without back pain, because subjects with musculoskeletal disease were excluded. Third, inter-rater reliability was not investigated in the current study. Therefore, future studies should include larger sample sizes of different age groups, as well as inter-rater reliability. Subgrouping should also be done based on the presence or absence of back pain.
The formal test is an essential method for assessing core stability function in managing athletes with sports injuries or patients with core instability-related LBP. Our findings demonstrate that the formal test, using a PBU and real-time US imaging without confirming the minimal contraction of superficial trunk muscles, as evidenced by EMG, leads to incorrect results. For example, such results in adults or individuals with core instability pathology ultimately influence rehabilitation programs designed to improve core stability. Therefore, the combination of a PBU, real-time US imaging, and surface EMG measurements may help obtain more reliable results from the formal test and provide a better understanding of the neuromuscular control mechanisms of core stability.
No potential conflict of interest relevant to this article was reported.
Conceptualization: SY, NGL, CP, JHY. Data curation: SY, CP, JHY. Formal analysis: SY, NGL, CP, JHY. Investigation: SY, CP, JHY. Methodology: SY, NGL, CP, JHY. Project administration: SY, NGL, CP, JHY. Resources: SY, CP, JHY. Supervision: SY, NGL, CP, JHY. Validation: SY, NGL, CP, JHY. Visualization: SY, NGL, CP, JHY. Writing - original draft: SY, NGL, CP, JHY. Writing - review & editing: SY, NGL, CP, JHY.