
HK J Paediatr (New Series)
Vol 30. No. 1, 2025

HK J Paediatr (New Series) 2025;30:20-26

Original Article

Reliability and Validity of the Four Square Step Test in Adolescent Cerebral Palsy

F Tanriöğer Soyuer, U Şırayder


Abstract

Purpose: This study aimed to determine the validity and reliability of the Four Square Step Test (FSST) for evaluating adolescents with cerebral palsy (CP). Methods: The study included 66 participants (21 with hemiparetic CP, 19 with diplegic CP, and 26 healthy participants), all aged 12 to 18 years and with similar gender distributions. We assessed dynamic balance with the FSST and the Timed Up and Go (TUG) test. Findings: For internal consistency, Cronbach's alpha coefficient for the total score was 0.913. Test-retest measurement demonstrated excellent reliability (intraclass correlation coefficient, 95% confidence interval 0.805 to 0.999). There was a significant positive correlation between FSST and TUG in both CP groups (r=0.95, p<0.001). In receiver operating characteristic analysis for differentiating healthy participants from those with hemiparetic CP, the FSST showed 66.67% sensitivity and 84.62% specificity; for diplegic CP, the FSST showed 63.16% sensitivity and 100% specificity, and the TUG test showed 73.68% sensitivity and 100% specificity. Conclusion: The FSST is a valid and reliable test for evaluating dynamic standing balance in adolescents with CP.

Keywords: Adolescent; Cerebral Palsy; Four Square Step Test; Reliability; Validity


Introduction

Balance is the ability to hold and maintain the body's centre of gravity over a support surface. Static balance is the maintenance of a stationary position, while dynamic balance is the ability to actively control body position and posture during movement, without falling, in different environments and situations.1,2 Information from the visual, vestibular and somatosensory systems is very important for balance. To identify the causes of impaired balance and treat it effectively, it is necessary to understand the balance control systems and their interactions with each other. Posture in humans is under close control by a complex neuromuscular system that provides rapid postural adaptation to changes in the centre of gravity at rest and during activity.3,4 Poor postural control is a primary impairment in children, adolescents and adults with cerebral palsy (CP) and may affect activities of daily life, such as walking and sitting. One of the most important causes of poor walking in children and adolescents with CP is poor balance control. Adolescence and early adulthood are considered an important transition period for individuals with CP.5,6

Since balance is a complex sensorimotor function, a single and simple evaluation test is not sufficient. Balance tests are classified by their type. There is no single test that can evaluate all the parameters of the balance equation, and different aspects of postural control are measured with different tests. Also, to effectively treat balance problems in clinical practice, clinicians must identify different types of balance disorders.7,8

Among persons with CP, several clinical tests have documented validity and reliability for balance evaluation.9,10 Yet there remains a need for a range of clinical balance tests that can be used to study balance strategies more comprehensively.

To optimise gait performance for persons with CP, the focus should be on improving dynamic balance and thereby reducing fall risk. Physiotherapists in rehabilitation work therefore require a reliable and valid assessment tool to measure and monitor dynamic standing balance in persons with CP, and, for clinical feasibility, clinical balance tests should be more practical and less technical than laboratory-based tests.9,10

The Four Square Step Test (FSST) is a dynamic standing balance test for clinical practice that was designed to assess a patient's ability to rapidly step over obstacles and change direction. The FSST requires minimal time, space and equipment. It measures the time taken to step, in a predetermined sequence, over four bars arranged in a cross configuration. FSST measurements have been shown to be reliable and valid among adults with and without balance disabilities. In children with cerebral palsy and Down syndrome, the FSST has shown test-retest reliability estimates ranging from moderate to excellent (0.54, 0.78, and 0.89) and excellent interrater reliability (intraclass correlation coefficient [ICC]=0.79; 95% confidence interval [CI] 0.66, 0.89). The FSST is also a valid dynamic balance test with excellent reliability (ICC(2,1)=0.922) among ambulant individuals with multiple sclerosis.11-14 While there are many tests to evaluate balance in cerebral palsy, we found no adequate test for adolescents with CP that was easy to understand, simple, feasible and time-saving. We therefore believe that a validity and reliability study of the FSST was needed to make available to practitioners a short, comprehensible and objective evaluation of balance for adolescents with CP.

Our study thus aimed to evaluate whether the FSST is reliable and valid for the safe assessment of balance in adolescents with CP by physiotherapists and other related health workers.

Method

Participants
Between April 2018 and February 2019, we recruited 66 adolescents in three groups with similar age and gender distributions: (a) 21 with hemiparetic CP; (b) 19 with diplegic CP; and (c) 26 healthy participants with no CP. All participants were aged 12 to 18 years. All participants with CP were classified at level I or II of the Gross Motor Function Classification System (GMFCS). No participants had any other neurological or orthopedic problems or any visual, hearing, vestibular, communication, or cognitive problems that would prevent testing. All participants were considered independent in activities of daily living. No participants had a condition that affected lower extremity balance, and no participants with CP had received botulinum toxin (Botox) injections within the past six months.

The adolescents' sociodemographic and clinical findings were determined and recorded on a data form created for the study including the participants' age, height, body weight, body mass index, dominant extremity, and education status.

In addition, we recorded all participants' scores on the FSST and Timed Up and Go (TUG) tests (see below). We received approval for this research protocol from the Erciyes University Clinical Research Ethics Committee for Clinical Investigations (number 2016/527; 04.03.2016), and we obtained written informed consent from all subjects before the study.

Assessment Procedure
Before the test measurements, the physiotherapist informed the participants about the tests and test positions and allowed each participant a practice trial to become familiar with the measurement method and to reduce the learning effect. Since there was more than one test, we separated the tests with a 5-minute rest period to reduce any fatigue effect.

At the first test session, each study group performed the balance tests before three independent raters so that we could determine between-rater reliability. Participants were given another 5-minute rest period between each rater's assessments. We tested participants twice (approximately one week apart), and at the second testing session, participants repeated the balance tests with the most experienced rater (who was blinded to the results from session 1) so that we could determine within-rater reliability. The 1-week test interval was long enough to limit participant recall of earlier test scores while also being short enough to limit any change in clinical status. In the second session, the participants completed a self-report of their perceived degree of global change.

Test conditions were similar in all test sessions, including set up, instructions and time of day, and participants always wore comfortable clothes and their usual shoes.

Outcome Measures

FSST. For the FSST, participants had to step, in a predetermined sequence, over four rods, each 90 centimeters (cm) long, arranged in a cross on the ground. Their initial standing position was on squares 1 and 2. Participants then stepped clockwise through the quadrants (forward, right, backward and left) and then counterclockwise in the reverse order (i.e., 2, 3, 4, 1, 4, 3, 2, 1) (Figure 1). Both feet had to touch the floor in each quadrant. Participants were instructed to complete the sequence as quickly as possible without touching the rods. We recorded the time taken to complete the sequence, and we considered a trial unsuccessful whenever the participant lost balance or touched a rod. Each participant performed three trials with a 5-minute rest between trials to prevent fatigue. Test scores were taken as the participant's mean time on trials 2 and 3.15

TUG. We used the TUG test to measure functional mobility. For the TUG, participants sat in an armchair, stood up, walked three meters (m), returned to the chair and sat down again. We measured the time taken to complete this test with a stopwatch and gave each participant three trials. For data analysis, we took the participant's mean time on trials 2 and 3.16
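The scoring rule shared by both tests (the mean time of trials 2 and 3, with trial 1 left unscored) can be sketched as follows. This is a minimal illustration only: the function name `score_timed_test` and the trial times are hypothetical and are not study data.

```python
from statistics import mean


def score_timed_test(trial_times):
    """Return the test score: the mean time (s) of trials 2 and 3.

    trial_times: the three trial times in seconds, in the order performed.
    Trial 1 is not scored, matching the scoring rule described in the text.
    """
    if len(trial_times) != 3:
        raise ValueError("expected exactly three trial times")
    return mean(trial_times[1:])


# Hypothetical trial times in seconds for one participant (not study data).
fsst_score = score_timed_test([12.4, 11.0, 10.6])
tug_score = score_timed_test([7.5, 7.0, 6.8])
print(round(fsst_score, 2), round(tug_score, 2))
```

How a trial judged unsuccessful (loss of balance or a touched rod) was handled in scoring is not described here, so the sketch assumes three valid times.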

Figure 1 FSST set up and stepping sequence.

Statistical Analysis
For data analysis, we used the SPSS for Windows 22.0 statistical package. We evaluated the data distribution for normality with the Shapiro-Wilk test, histograms and Q-Q plots, and assessed the homogeneity of variances with the Levene test. We presented descriptive statistics as means (M), standard deviations (SD) or percentages. We used Student's t-tests to compare two groups and one-way analysis of variance (ANOVA) to compare more than two groups. The Tukey post hoc comparison was used to determine specific group differences. Chi-square analyses were used to compare categorical data. The relationship between quantitative variables was assessed with the correlation coefficient. For the reliability analysis, internal consistency and test-retest reliability were measured. To evaluate internal consistency, Cronbach's alpha values were calculated. A Cronbach's alpha coefficient of >0.7 was accepted as internally consistent. Test-retest reliability was evaluated using the ICC together with a 95% CI. We calculated intraclass correlation coefficients (ICCs) to assess the intra- and interrater reliability of the observers' measurements. For validity determinations, we used receiver operating characteristic (ROC) analyses. ROC curves were generated to assess the sensitivity and specificity of the FSST scores in differentiating between participants with and without CP based on their dynamic balance scores. The results were analysed using the R3.30 package (v 0.6.2; R Core Team) (www.r-project.org), with the significance level set at p<0.05.
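As an illustration of the internal consistency measure, Cronbach's alpha for k columns of scores is k/(k-1) multiplied by (1 minus the sum of the column variances divided by the variance of the row totals). The sketch below is not the authors' SPSS procedure; it assumes, hypothetically, that each row is one participant and each column is one evaluator's FSST time, and the numbers are invented.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of scores.

    rows: list of per-participant lists, one value per item (column).
    Uses sample variances (n - 1 denominator).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    columns = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical FSST times (s): rows are participants, columns are evaluators.
times = [
    [10.2, 10.0, 10.4],
    [14.1, 13.8, 14.3],
    [8.9, 9.1, 9.0],
    [12.5, 12.2, 12.6],
    [16.0, 15.7, 16.2],
]
alpha = cronbach_alpha(times)
print(f"alpha = {alpha:.3f}; internally consistent: {alpha > 0.7}")
```

The final comparison applies the >0.7 acceptance threshold stated above.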

Results

There were no statistically significant differences between the groups of participants with hemiparetic and diplegic CP in terms of age, height, weight, BMI, maternal diseases during prenatal development, maternal cigarette and drug use during prenatal development, delivery type (i.e., vaginal versus Caesarean section), single/multiple births, birthing complications and postnatal characteristics (Tables 1 and 2).

Table 1 Comparisons of demographic characteristics, FSST and TUG tests of diplegic, hemiparetic cerebral palsy and healthy groups

| Variables | Hemiparetic N (%) | Hemiparetic X±SD | Diplegic N (%) | Diplegic X±SD | Control N (%) | Control X±SD | P |
|---|---|---|---|---|---|---|---|
| Gender | | | | | | | 0.228 |
| Girls | 14 (66.7) | | 10 (52.6) | | 13 (50.0) | | |
| Boys | 7 (33.3) | | 9 (47.4) | | 13 (50.0) | | |
| Age (year) | 21 (31.81) | 14.80±2.18 | 19 (28.78) | 15.47±1.80 | 26 (39.39) | 15.65±1.23 | 0.243 |
| FSST (sec) | 21 (31.81) | 11.04±3.60 | 19 (28.78) | 14.07±6.85 | 26 (39.39) | 8.16±1.05 | <0.001 |
| TUG (sec) | 21 (31.81) | 7.07±1.29 | 19 (28.78) | 8.63±2.43 | 26 (39.39) | 6.24±0.52 | <0.001 |

FSST: Four Square Step Test; TUG: Timed Up & Go Test

Table 2 Comparison of cases with cerebral palsy in terms of categorical variables

| Categorical variable | Category | Total N (%) | Hemiparetic N (%) | Diplegic N (%) | P |
|---|---|---|---|---|---|
| Gender | Girls | 24 (60) | 14 (66.7) | 10 (52.6) | 0.366 |
| | Boys | 16 (40) | 7 (33.3) | 9 (47.4) | |
| Dominant Side | Right | 27 (67.5) | 11 (52.4) | 16 (84.2) | 0.032 |
| | Left | 13 (32.5) | 10 (47.6) | 3 (15.8) | |
| Diseases in Pregnancy | Diabetes | 1 (5.3) | 0 (0.0) | 1 (26.3) | 0.824 |
| | Other Diseases | 6 (15.8) | 3 (14.3) | 3 (28.6) | |
| | Absent | 33 (78.9) | 18 (85.7) | 15 (31.6) | |
| Cigarette Use in Pregnancy | Present | 12 (30.0) | 5 (23.8) | 7 (36.8) | 0.369 |
| | Absent | 28 (70.0) | 16 (76.2) | 12 (63.2) | |
| Drug Use in Pregnancy | Present | 8 (20.0) | 2 (9.5) | 6 (31.6) | 0.120 |
| | Absent | 32 (80.0) | 19 (90.5) | 13 (68.4) | |
| Delivery Method | Normal | 24 (63.2) | 15 (71.4) | 9 (52.9) | 0.240 |
| | Cesarean | 14 (36.8) | 6 (28.6) | 8 (47.1) | |
| Single or Multiple Pregnancy | Single | 34 (85.0) | 19 (90.5) | 15 (78.9) | 0.398 |
| | Twin | 6 (15.0) | 2 (9.5) | 4 (21.1) | |
| Complications | Asphyxia / Hypoxia | 15 (39.5) | 8 (38.1) | 7 (41.2) | 0.847 |
| | Absent | 23 (60.5) | 13 (61.9) | 10 (58.8) | |
| Oxygen Support in the Incubator | Present | 21 (52.5) | 8 (38.1) | 13 (68.4) | 0.55 |
| | Absent | 19 (47.5) | 13 (61.9) | 6 (31.6) | |
| Postnatal Other Features | Others | 3 (7.5) | 3 (14.3) | 0 (0.0) | 0.233 |
| | Absent | 37 (92.5) | 18 (85.7) | 19 (100.0) | |

Table 1 shows comparisons between adolescents in the hemiparetic CP, diplegic CP and no-CP control groups for age, gender, FSST, and TUG. There were statistically significant differences between the groups in FSST and TUG times (p<0.001): FSST times were longer for adolescents with diplegic CP than for those in the no-CP control group, and TUG times were longer for those with diplegic CP than for both the hemiparetic CP and control groups. There was no significant difference between groups in age or gender.

Table 3 shows the results of the ANOVA tests on FSST and TUG scores.

Table 3 ANOVA analysis results for groups with TUG and FSST tests

| Test | Group (a) | Compared with (b) | P |
|---|---|---|---|
| TUG | Diplegic | Hemiparetic | 0.006 |
| | | Control | <0.001 |
| | Hemiparetic | Diplegic | 0.006 |
| | | Control | 0.159 |
| | Control | Diplegic | <0.001 |
| | | Hemiparetic | 0.159 |
| FSST | Diplegic | Hemiparetic | 0.069 |
| | | Control | <0.001 |
| | Hemiparetic | Diplegic | 0.069 |
| | | Control | 0.060 |
| | Control | Diplegic | <0.001 |
| | | Hemiparetic | 0.060 |

a: Control group; b: Cerebral palsy group

According to the ICC results for the FSST measurements of the three evaluators, agreement was excellent in the hemiparetic group (ICC=0.991), the diplegic group (ICC=0.994) and the total sample (ICC=0.998). For internal consistency, Cronbach's alpha coefficients ranged between 0.761 and 0.864 for the groups and was 0.913 for the total score. Test-retest measurement demonstrated excellent reliability (ICC, 95% CI 0.805 to 0.999) (Table 4).

Table 4 FSST test reliability results

| | Total X±SD (n) | Hemiparetic X±SD (n) | Diplegic X±SD (n) | P |
|---|---|---|---|---|
| FSST 1. Evaluator | 12.48±5.53 (n=40) | 11.04±3.60 (n=21) | 14.07±6.85 (n=19) | 0.096 |
| FSST 2. Evaluator | 12.35±5.54 (n=40) | 10.89±3.59 (n=21) | 13.96±6.86 (n=19) | 0.093 |
| FSST 3. Evaluator | 12.42±5.54 (n=40) | 10.85±3.34 (n=21) | 14.17±6.94 (n=19) | 0.069 |
| ICC (95% confidence interval) | 0.998 | 0.991 (0.981-0.996) | 0.994 (0.987-0.997) | |
| Cronbach's alpha | 0.913 | 0.761 | 0.864 | |
| Re-test FSST | 11.97±4.94 (n=34) | 10.61±3.31 (n=17) | 13.33±5.95 (n=17) | |
| ICC (95% confidence interval, 0.805 to 0.999) | 0.970 | 0.920 | 0.924 | |
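The text does not state which ICC form was used for the interrater estimates in Table 4. Purely as an illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), from the mean squares of a participants-by-raters table. The choice of this form, the function name `icc_2_1` and the sample times are all assumptions, not the authors' method or data.

```python
def icc_2_1(rows):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    rows: list of per-participant lists, one measurement per rater.
    """
    n = len(rows)      # participants
    k = len(rows[0])   # raters
    grand_mean = sum(sum(row) for row in rows) / (n * k)
    row_means = [sum(row) / k for row in rows]
    col_means = [sum(col) / n for col in zip(*rows)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-participant mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical FSST times (s): rows are participants, columns are three raters.
ratings = [
    [10.2, 10.0, 10.4],
    [14.1, 13.8, 14.3],
    [8.9, 9.1, 9.0],
    [12.5, 12.2, 12.6],
    [16.0, 15.7, 16.2],
]
print(f"ICC(2,1) = {icc_2_1(ratings):.3f}")
```

Because this form penalises systematic differences between raters as well as random error, it is a conservative choice when raters are meant to be interchangeable.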

For the FSST, there was a significant positive correlation between the first and second evaluators and between the first and third evaluators, in both the hemiparetic and the diplegic groups (p<0.001) (Table 5).

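Table 5 Relations between evaluators in terms of FSST (correlations with FSST 1. Evaluator)

| | Hemiparetic (n=21) r | Hemiparetic (n=21) p | Diplegic (n=19) r | Diplegic (n=19) p | Diplegic (n=19) r | Diplegic (n=19) p |
|---|---|---|---|---|---|---|
| FSST 2. Evaluator | 0.960 | <0.001 | 0.977 | <0.001 | 0.979 | <0.001 |
| FSST 3. Evaluator | 0.967 | <0.001 | 0.988 | <0.001 | 0.985 | <0.001 |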

There was a significant positive correlation between FSST and TUG in both the hemiparetic and diplegic groups (p<0.001).

In the ROC analysis, an FSST time of 9.02 seconds or more indicated impaired balance in the hemiparetic group, with 66.67% sensitivity and 84.62% specificity. In the diplegic group, an FSST time of 10.18 seconds or more indicated impaired balance, with 63.16% sensitivity and 100% specificity (Table 6).

Table 6 ROC analysis

| Classification variable | AUC | p | SEN (%) | SPE (%) |
|---|---|---|---|---|
| TUG Mean | | | | |
| Control - Hemiparetic | 0.68 (0.53-0.81) | 0.031 | 47.62 (25.7-70.2) | 92.31 (74.9-99.1) |
| Control - Diplegic | 0.85 (0.71-0.94) | <0.001 | 73.68 (48.8-90.9) | 100 (86.8-100.0) |
| FSST Mean | | | | |
| Control - Hemiparetic | 0.82 (0.68-0.92) | <0.001 | 66.67 (43.0-85.4) | 84.62 (65.1-95.6) |
| Control - Diplegic | 0.75 (0.60-0.87) | 0.006 | 63.16 (38.4-83.7) | 100 (86.8-100) |

Values are expressed as estimates and 95% confidence intervals.
AUC: Area Under Curve; SEN: Sensitivity; SPE: Specificity; ROC: Receiver Operating Characteristic.

In the diplegic group, a TUG time of 7.03 seconds or more indicated impaired balance, with 73.68% sensitivity and 100% specificity (Table 6).
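To make the cutoff-based figures concrete, the sketch below classifies a time at or above a cutoff as impaired balance and computes sensitivity (the proportion of participants with CP flagged) and specificity (the proportion of controls not flagged). The function name and the sample times are hypothetical. The final lines show that the percentages in Table 6 are consistent with whole-participant counts given the group sizes (21 hemiparetic, 19 diplegic, 26 controls); those counts are back-calculated here and are not reported in the article.

```python
def sensitivity_specificity(cp_times, control_times, cutoff):
    """Treat a time >= cutoff (s) as 'impaired balance'.

    Returns (sensitivity, specificity) as proportions between 0 and 1.
    """
    true_positives = sum(t >= cutoff for t in cp_times)
    true_negatives = sum(t < cutoff for t in control_times)
    return true_positives / len(cp_times), true_negatives / len(control_times)


# Hypothetical FSST times in seconds (not study data), with the 9.02 s cutoff.
sens, spec = sensitivity_specificity(
    cp_times=[8.5, 9.5, 11.0, 13.2],
    control_times=[7.9, 8.3, 9.4, 8.8],
    cutoff=9.02,
)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")

# Back-calculated counts (an inference, not reported values) that reproduce
# the FSST percentages in Table 6.
hemiparetic_sensitivity = round(100 * 14 / 21, 2)   # 14 of 21 hemiparetic
hemiparetic_specificity = round(100 * 22 / 26, 2)   # 22 of 26 controls
diplegic_sensitivity = round(100 * 12 / 19, 2)      # 12 of 19 diplegic
print(hemiparetic_sensitivity, hemiparetic_specificity, diplegic_sensitivity)
```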

Discussion

Our results show that the FSST can be used as a valid and reliable tool for assessing dynamic balance in adolescents with CP.

Adolescents with cerebral palsy often struggle with a gait that prevents them from walking in the same direction they are facing. This is akin to needing to clear obstacles such as canes and ensuring proper foot placement before advancing to the next step. An evaluator provided visual and tactile cues to the subjects during the tests and also ensured the subjects' safety against falling. All subjects needed 2-3 trials before completing the test procedures successfully. This result suggests that more practice trials are needed when using this test. Furthermore, it has been emphasized that successful completion of the FSST requires specific cognitive and physical abilities.15 In our study, cognitive function was not evaluated since the adolescents had no communication problems. In addition, no adolescents at GMFCS level III participated in the study. Thus, the feasibility of the FSST should be investigated for specific cognitive levels and lower motor skill levels.

The FSST was found to have an excellent Cronbach's alpha coefficient and excellent test-retest reliability. These estimates were consistent with those reported for children with CP.11-15 Test-retest reliability of the FSST (ICC=0.93-0.98) has been reported in healthy elderly adults15 and people with vestibular disorders.12 Given the evaluators' limited experience in clinical practice and in using the FSST, these estimates support the potential benefit of the FSST for use by average service providers. The excellent test-retest reliability estimate suggests that the FSST may have the potential to assess longitudinal outcomes for adolescents with CP.

The validity results showed strong correlations of the FSST with the TUG in adolescents with CP. The TUG test is not a simple walking test. It includes several motor tasks, such as standing up, walking forward, turning and walking again, and lastly sitting down. Both the FSST and the TUG require effective postural adjustments to control the centre of gravity across a group of motor tasks. The TUG test has been correlated with cognitive ability in terms of movement planning and organisation, including the multi-tasking sequence, in healthy elderly groups.17 Similarly, in the FSST, participants must remember the movement sequence for 4 quadrants while trying to maintain balance and stepping over obstacles, which requires cognitive function.15 In this study, the number of groups may not be sufficient to provide a significant correlation.

Tests such as the Timed Up & Go, Pediatric Balance Scale, and Berg Balance Scale do not measure other important balance dimensions, such as stepping sideways or backward and making quick changes in direction. As a clinical test of dynamic standing balance, the FSST is easy to administer and score, requires very little space, and does not require special equipment. It involves forward, backward and sideways stepping over thin (2.5 cm) objects and requires rapid changes in the direction of movement. Because of these features, good physical supervision is required when administering the FSST.

Because of the complexity of the stepping sequence, the FSST may not be an appropriate test for people with cognitive impairment who are unable to follow the test instructions. All subjects in our study were able to understand the test comfortably and to complete it successfully.

A cutoff score on the FSST of 9.02 seconds or more generated a sensitivity of 66.67% and a specificity of 84.62% for the identification of subjects with 1 or more risk factors for falls in this sample of adolescents with hemiparetic CP. A cutoff score on the FSST of 10.18 seconds or more generated a sensitivity of 63.16% and a specificity of 100% for the identification of subjects with 1 or more risk factors for falls in this sample of adolescents with diplegic CP. These are the first such values determined for the FSST in adolescents with CP.

Since the FSST is a new balance test, more research is needed to evaluate its use in different age groups and different diagnoses. The FSST also requires investigation of its ability to detect changes in balance performance over time or after treatment. In particular, the more challenging nature of the FSST may make it useful for testing balance in younger groups and in cases where dynamic standing balance is less affected.

Study Limitations

Because the FSST is scored primarily on completion time, the quality of movement was not taken into account. FSST performance can be affected by several factors, such as level of spasticity, lower extremity proprioception and fear of falling, which were not measured in our study. Furthermore, the complex stepping design of the FSST may not be appropriate for patients with severe cognitive impairment. The selection criteria mean that the results for the adolescent CP group cannot be generalised to other populations.

Conclusions

There are many tests to evaluate balance in cerebral palsy. However, there are not enough clinical tests for adolescents with CP that are easy to understand, simple, feasible and time-efficient. For this reason, we think that this validity and reliability study of the FSST in adolescents with CP is important for providing short, understandable and objective results in the physiotherapy and neurology literature.

The FSST is a valid and reliable test for adolescents with CP.

Conflict of Interest

No conflict of interest was declared by the authors.

Financial Disclosure

No financial disclosure was declared by the authors.


References

1. Son MS, Jung DH, You JS, Yi CH, Jeon HS, Cha YJ. Effects of dynamic neuromuscular stabilization on diaphragm movement, postural control, balance and gait performance in cerebral palsy. NeuroRehabilitation 2017;41:739-46.

2. Panibatla S, Kumar V, Narayan A. Relationship Between Trunk Control and Balance in Children with Spastic Cerebral Palsy: A Cross-Sectional Study. J Clin Diagn Res 2017;11:YC05-8.

3. Kim DH, An DH, Yoo WG. Effects of 4 weeks of dynamic neuromuscular stabilization training on balance and gait performance in an adolescent with spastic hemiparetic cerebral palsy. J Phys Ther Sci 2017;29:1881-2.

4. Bar-Haim S, Al-Jarrah MD, Nammourah I, Harries N. Mechanical efficiency and balance in adolescents and young adults with cerebral palsy. Gait Posture 2013;38:668-73.

5. Domagalska-Szopa M, Szopa A. Postural orientation and standing postural alignment in ambulant children with bilateral cerebral palsy. Clin Biomech (Bristol) 2017;49:22-7.

6. Dewar R, Claus AP, Tucker K, Johnston LM. Perspectives on postural control dysfunction to inform future research: a Delphi study for children with cerebral palsy. Arch Phys Med Rehabil 2017;98:463-79.

7. Pavão SL, de Campos AC, Rocha NA. Age-related Changes in Postural Sway During Sit-to-stand in Typical Children and Children with Cerebral Palsy. J Mot Behav 2019;51:185-92.

8. Rodby-Bousquet E, Persson-Bunke M, Czuba T. Psychometric evaluation of the Posture and Postural Ability Scale for children with cerebral palsy. Clin Rehabil 2016;30:697-704.

9. Pavão SL, dos Santos AN, Woollacott MH, Rocha NA. Assessment of postural control in children with cerebral palsy: a review. Res Dev Disabil 2013;34:1367-75.

10. Saxena S, Rao BK, Kumaran S. Analysis of postural stability in children with cerebral palsy and children with typical development: An observational study. Pediatr Phys Ther 2014;26:325-30.

11. Bandong AN, Madriaga GO, Gorgon ER. Reliability and validity of the four square step test in children with cerebral palsy and Down syndrome. Res Dev Disabil 2015;47:39-47.

12. Whitney SL, Marchetti GF, Morris FO, Sparto PJ. The reliability and validity of the four square step test for people with balance deficits secondary to a vestibular disorder. Arch Phys Med Rehabil 2007;88:99-104.

13. Goh EY, Chua SY, Hong SJ, Ng SS. Reliability and concurrent validity of Four Square Step Test scores in subjects with chronic stroke: a pilot study. Arch Phys Med Rehabil 2013;94:1306-11.

14. Wagner JM, Norris RA, Van Dillen LR, Thomas FP, Naismith RT. Four Square Step Test in ambulant persons with multiple sclerosis: validity, reliability, and responsiveness. Int J Rehabil Res 2013;36:253-9.

15. Moore M, Barker K. The validity and reliability of the four square step test in different adult populations: a systematic review. Syst Rev 2017;6:187.

16. Shumway-Cook A, Brauer S, Woollacott M. Predicting the probability for falls in community-dwelling older adults using the Timed Up & Go Test. Phys Ther 2000;80:896-903.

17. Schoene D, Wu SM, Mikolaizak AS, et al. Discriminative ability and predictive validity of the timed up and go test in identifying older people who fall: systematic review and meta‐analysis. J Am Geriatr Soc 2013;61:202-8.

 
 

©2025 Hong Kong Journal of Paediatrics. All rights reserved.