pISSN 2288-6982
eISSN 2288-7105




Phys. Ther. Korea 2019; 26(3): 106-114

Published online October 1, 2019

© Korean Research Society of Physical Therapy

Introducing an Online Measurement System Using Item Response Theory and Computer Adaptive Testing Methods for Measuring the Physical Activity of Community-Dwelling Frail Older Adults

Bong-sam Choi

Dept. of Physical Therapy, College of Health and Welfare, Woosong University

Correspondence to: Bong-sam Choi

Received: March 7, 2019; Revised: March 7, 2019; Accepted: August 6, 2019


It is difficult to assess whether community-dwelling frail older adults remain in a pre-frail status or improve to a robust state unless they are directly examined by health care professionals. The health information perceived by older adults is considered one of the best sources for identifying potential concerns in the older adult population. An online measurement system combining item response theory (IRT) and computer adaptive testing (CAT) methods is likely to become a realistic approach to remotely monitoring the physical activity status of frail older adults.


This article suggests an approach to provide a precise and efficient means of measuring physical activity levels of community-dwelling frail older adults.


Relevant articles were reviewed and summarized.


In comparison to classical test theory (CTT), the IRT method focuses on the psychometric properties of individual test items rather than the test as a whole. These properties allow the creation of a large item pool that can capture a broad range of physical activity levels. The CAT method administers test items through an algorithm that selects items matched to the physical activity levels of the older adults.


An online measurement system combined with these two methods would allow adequate physical activity measurement that may be useful to remotely monitor the activity level of community-dwelling frail older adults.

Keywords: Computer adaptive testing, Frail older adult, Item response theory, Outcome measures, Physical activity

The world’s older adult population is growing at a rapid rate. It is projected to nearly double in size from 1 billion in 2017 to 1.9 billion in 2050, and to further increase to 3.2 billion in 2100. Moreover, the number of older adults aged 80 and older, the so-called oldest-old population, is growing even faster than any other age group of older adults (United Nations, 2015). The United Nations Statistics Division indicates that the older adult population in Korea is aging more rapidly than in other countries, and there is a growing consensus on the necessity of supporting this population (United Nations, 2015). A healthy later life is a crucial consideration for older adults because frailty rises with age and life expectancy with frailty is increasing (United Nations, 2015). In general, phenotypes of frailty in older adults arise from multiple factors and often result in low levels of daily physical activity. Consequently, frail older adults may require assistance with daily activities (Manini and Clark, 2012). Low physical activity can also lead to secondary risks in later life (Beswick et al, 2008; Stuck et al, 2002). In particular, community-dwelling frail older adults may be less likely to perform daily activities than non-frail older adults, and continued physical inactivity may result in other health problems (Beswick et al, 2008; Stuck et al, 2002). One of the most critical aspects of physical activity status is the presence of physical activity limitations, which is used to determine frailty; that is, the presence or absence of any physical activity limitation is a significant predictor of frailty (Fried et al, 2001; Lohne-Seiler et al, 2014).

Studies provide converging evidence supporting measurement systems that detect frailty in the community (Fried et al, 2001; Lewis et al, 2019; Woo et al, 2015; Yang et al, 2018). In particular, it is crucial to monitor physical activity limitation continuously as a risk factor for frailty and to detect frailty in community-dwelling older adults (Woo et al, 2015). A few authors note that measurements designed to monitor physical activity status, which are normally administered by health care professionals, require a significant conceptual leap before they can serve as a measurement system for community-dwelling older adults (Gloth et al, 1995; Washburn et al, 1993; Woo et al, 2015). Several studies encourage focusing on self-report measurement to determine whether older adults have recovered from a frail condition to a healthy condition or vice versa (Espinoza et al, 2012; Fallah et al, 2011). Versions of those measurements have been validated in other languages and are available to non-health care professionals (Espinoza et al, 2012). However, these assessments are often difficult to apply to frail older adults who reside in the community, because such adults are not often seen by health care professionals (Gloth et al, 1999).

As the World Health Organization (WHO) advocates the use of older adults’ perceived information on their overall health, the older adult’s view is considered one of the best sources for studying the community-based older adult population (United Nations, 2017). Additionally, self-report measurements of physical activity status have several distinct advantages over measures obtained from health care professionals: 1) the health status of older adults can be collected across a broad range of environmental settings over time; 2) periodic monitoring of the status of older adults can be conducted as needed; 3) detailed information on the difficulties that older adults experience in their daily lives can be collected; and 4) non-sensitive data are made publicly available and are easier to access than administrative sources (Gloth et al, 1999).

Consequently, self-report measurements have commonly been accepted due to their potential to detect health problems through viewing physical limitations perceived by the older adults themselves and identifying specific information in physical activity limitations (Lohne-Seiler et al, 2014).

While a myriad of classical test theory (CTT)-based self-report measurements are now available to detect physical activity limitations, selecting an optimal measurement remains a challenge. The challenge stems primarily from the tendency to select a measurement according to the preferences of investigators and clinicians. Consequently, this tendency often leads to a failure to justify the choice of measurement and sacrifices measurement quality (McHorney, 1999). For example, although many CTT-based measurements are developed to target average persons and have proper psychometric properties, they often demonstrate ceiling effects when administered to persons with mild disability. Among the numerous critical reviews indicating these challenges, renowned experts have pointed to two problematic properties of measurement (McHorney, 1999; McHorney, 1997; Velozo et al, 1999). First, although previously created measurements have adequate psychometric properties, they may not be sensitive to the wide range of physical activity perceived by community-dwelling frail older adults. That is, most CTT-based self-report measurements are likely to be sensitive only at the center of the wide range of the physical activity domain, since those measurements are commonly developed to target the “average” person (McHorney, 1997). Second, two scores obtained from two existing CTT-based measurements cannot simply be equated with each other. Even when both measurements are designed solely to assess physical activity, their scores are often incompatible in the sense that each has its own separate yardstick (Jette and Haley, 2005; McHorney, 1997; Velozo et al, 1999). Therefore, scores cannot be translated from one measurement to another.
These two drawbacks have been criticized for making measurements insensitive to small, subtle changes (Fisher, 1992; Jette and Haley, 2005; McHorney, 1999; McHorney, 1997; Velozo et al, 1999), and more detailed metrics may be required to capture such changes. This could lead to increasing the number of items within a measurement (McHorney, 1999; McHorney, 1997; Velozo et al, 1999). In reality, however, it is impractical to include a large number of items within a measurement, because many unnecessary items would be a burden on frail older adults.

Another drawback of existing measurements is imprecision, such as ceiling or floor effects. This measurement issue often leads to type II errors, that is, false negatives (McHorney, 1999; McHorney, 1997). Thus, imprecision often makes it impossible for measurements to capture substantial changes in health status. Measurement imprecision occurs when test items are poorly matched to the population being measured (McHorney, 1999). That is, precision decreases when easy items are administered to populations with high ability, or difficult items to populations with low ability. The precision problem can also stem partly from the use of a fixed number of items within a measurement (Velozo et al, 1999). That is, many existing measurement systems lack adequate breadth to assess the broad spectrum of health status (e.g., physical activity status). Therefore, such measurements often fail to capture changes in health status.
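The loss of precision for poorly matched items can be illustrated numerically. The following is a minimal sketch, assuming a dichotomous Rasch model (an assumption; this paper does not specify a model), in which the standard error of an ability estimate is the inverse square root of the summed item information. The function name and the item difficulties are hypothetical values chosen only for illustration.

```python
import math

def standard_error(ability, difficulties):
    """Standard error of an ability estimate for a fixed set of items.

    Assumes a dichotomous Rasch model: each item contributes information
    p * (1 - p), where p is the probability of endorsing the item.
    """
    info = 0.0
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(-(ability - b)))
        info += p * (1.0 - p)
    return 1.0 / math.sqrt(info)

# Hypothetical fixed-form test targeted at the "average" person (logits).
fixed_form = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Precision is best near the center of the form and worsens toward the
# floor (-3) and the ceiling (+3) of the ability range.
for ability in (-3.0, 0.0, 3.0):
    print(ability, round(standard_error(ability, fixed_form), 2))
```

Under these assumptions, the same five items yield a larger standard error for very low or very high ability than for average ability, which is one way to express floor and ceiling effects.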

To overcome these drawbacks of existing CTT-based measurements, a solution combining two methodologies, item response theory (IRT) and computer adaptive testing (CAT), has been introduced and has proven to be a promising means of addressing measurement precision and efficiency issues (Jette and Haley, 2005; McHorney, 1999; McHorney, 1997; Velozo et al, 1999; Beleckas et al, 2017; Gamper et al, 2016; Jette et al, 2008; Morris et al, 2017; Risk, 2016). These two methodologies have the potential to measure older adults with high precision and efficiency. By administering a smaller, targeted set of items rather than all items, an online measurement system combining IRT and CAT aims to target older adults across a broad range of physical activity levels. The online measurement system focuses on each test item rather than just the test as a whole. Thus, this paradigm shift frees the measurement system from the use of particular tests in which the size of a measuring unit cannot be varied by what and who is being measured.

The purposes of this study are 1) to explore innovative measurement methodologies based on IRT and CAT and 2) to introduce an online measurement system that can be applied to monitor physical activity transitions in community-dwelling frail older adults.

A comprehensive online literature search covering 1992 to 2017 was conducted in electronic databases including PubMed, SCOPUS, and the Cumulative Index to Nursing and Allied Health Literature (CINAHL), using combinations of medical subject headings (MeSH) terms for computer adaptive testing, item response theory, and frail older adult. A total of 72 articles were identified, and 39 articles were excluded from the literature review due to their irrelevance to this paper. In this paper, the pedagogical features of an online measurement system necessary to ensure precise and efficient measurement for community-dwelling frail older adults are discussed. Figure 1 represents the framework of the present study for an online measurement system classifying the level of physical activity of community-dwelling frail older adults.

Figure 1.

Conceptual framework of the study for an online measurement system. The figure represents the online measurement system combined with IRT and CAT methods and how the system is created for measuring the physical activity of community-dwelling frail older adults.

Health issues in community-dwelling frail older adults

The concept of ageing-in-place, that is, the preference to remain living in the community for as long as possible rather than in an institution, plays a central role in improving the care of older adults (Callahan, 1993; Frank, 2002; Wiles et al, 2001). There is also growing recognition of the need for an adequate support system for community-dwelling frail older adults (Frank, 2002; Donald, 2009). Although ageing-in-place depends on multiple factors, community-based support is the most important in enabling older adults to remain in their community. Remaining in the community is commonly favored by older adults themselves as well as by health care professionals. Successful ageing-in-place avoids costly alternatives such as institutional care (Wiles et al, 2001). Among the factors influencing whether older adults can stay in their community, investigators identify change in physical activity status as a crucial factor (Donald, 2009; Glass and Balfour, 2003). That is, any progressive loss or gain in the physical activity status of community-dwelling frail older adults is considered a key component of their frailty management. Thus, the measurement systems applied to older adults are in need of overhaul. Most existing assessments that report transitions in older adults’ physical activity status were developed to detect indicators of an impending acute illness or an exacerbation of a chronic illness. To identify such transitions, older adults generally must be physically present in health care clinics and be assessed by health care professionals. However, this care paradigm is challenged by the need to regularly assess community-dwelling frail older adults who may be unable to attend health care clinics.

In general, frailty has long been considered synonymous with disability, comorbidity, and other predisposing characteristics causing health problems (Fried et al, 2001). It is often referred to as a clinical syndrome that represents a transitional state in the dynamic progression from a healthy condition to functional decline. Since frailty is a dynamic progression encompassing transitions between predefined frailty states over time, there is a consensus that transitions to a less frail status can be observed with optimal measurement systems (Lally and Crome, 2007). That is, an optimal measurement system combined with a tailored rehabilitation intervention can help prevent substantial functional decline (Gill et al, 2002; Jette et al, 1999; Lally and Crome, 2007; Wilson, 2004). The measurement system may include comprehensive assessment as well as the selection of physical activities matching older adults’ needs in their community. In the context of comprehensive assessments to meet these needs, several questions arise about how to remotely measure physical activity status and monitor the transitions of frail older adults in the community. To date, measurements monitoring functional status in frail older adults have largely been designed for clinical settings with structured environments. These measurements have commonly been administered to frail older adults regardless of the different environments in which they function in the community. Since there is a strong association between function and environment in the care of community-dwelling frail older adults, community-based measurement is a beneficial strategy because older adults can perform their physical activity more safely and effectively in their own environment (Gill et al, 2002; Gill et al, 2003; Jette et al, 1999). Most older adults are also more confident in their own community.

An online measurement system

Figure 1 represents the conceptual framework of the online measurement system combining IRT and CAT methods. The IRT method is used to 1) identify older adults’ perceived information on their physical activity, 2) determine invariant item difficulty calibrations, and 3) create item pools for the physical activity domain. The CAT method uses a computer algorithm to choose the optimal test items with respect to older adults’ ability levels. Together, the IRT and CAT methodologies are likely to provide insights into physical activity issues. IRT focuses on the psychometric properties of the individual items constituting each measurement instead of the measurement as a whole. IRT also estimates the probability of selecting a particular rating on each item and places item difficulty and person ability on the same linear continuum. This connects an individual’s responses on particular items to his or her level of physical activity. This method subsequently leads to building large item pools that can be used to selectively administer measurement items to respondents through the CAT method (Velozo et al, 2006). While IRT provides a means for generating and connecting item difficulty and person ability, CAT provides a means for administering items in an algorithmic fashion that is both efficient and precise. Numerous studies have shown that CAT improves measurement efficiency (Jette et al, 2005; McHorney, 1999; McHorney, 1997; Velozo et al, 1999; Beleckas et al, 2017; Gamper et al, 2016; Jette et al, 2008; Kielhofner et al, 2005; Morris et al, 2017; Risk, 2016; Velozo and Peterson, 2001; Weiss, 2004), requiring on average fewer than 7 items while maintaining adequate precision in comparison to full-length measurement (Jette et al, 2005; Jette et al, 2008).
An online measurement system combining these methods can be delivered through a website and appears to be the most effective way to stably integrate community-dwelling older adults into a ubiquitous system and to monitor them over time.
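The idea of placing item difficulty and person ability on the same linear continuum can be made concrete with a small example. The sketch below assumes the one-parameter (Rasch) dichotomous model; this is an assumption made for illustration, since this paper does not state which IRT model the system would use. The function name, the item labels, and the difficulty values (in logits) are hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model.

    Ability and difficulty are expressed on the same logit scale, so only
    their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical physical activity items with illustrative difficulties.
items = {
    "walk across a room": -2.0,
    "climb one flight of stairs": 0.0,
    "walk 1 km": 2.0,
}

# A respondent with ability 0.0 is likely to endorse the easy item,
# has an even chance on the matched item, and is unlikely to endorse
# the hard item.
for name, difficulty in items.items():
    print(name, round(rasch_probability(0.0, difficulty), 2))
```

With these illustrative values the three probabilities are 0.88, 0.5, and 0.12, which shows how a response to a given item can be linked to a position on the physical activity continuum.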

The primary feature of IRT is that it yields invariant item difficulties by estimating the probability of selecting a particular rating for a measurement item. IRT also places item difficulty and person ability on the same linear continuum. These two properties permit “connecting” older adults’ responses to particular items with their ability level (Jette et al, 2005). The invariance property of IRT means that once measurements have been calibrated to a common metric, estimates of older adults’ ability and item difficulty do not vary across measurement items and persons. That is, the estimates of person ability and item difficulty are invariant, on an interval scale, regardless of the measurement items used and independent of the ability of the sample used (Velozo et al, 1999). This invariance is exceptionally valuable for measurement quality. It allows measurement items to be continually updated and replaced within large item pools, and it provides the capacity on which the online measurement system can be developed. The item pool makes it possible to target older adults’ abilities precisely and to capture even small transitions that conventional measurement systems often fail to measure. Since the item pool covers a wide range of physical activity traits with a large number of pre-generated items, an optimal item that matches the current ability estimate can be selected from it. For instance, to assess a severely frail older adult, easy items that closely match the older adult’s physical ability would be chosen; similarly, more challenging items would be selected to assess a mildly frail older adult. These two frail older adults with different physical abilities can be assessed on the same physical activity measurement with different sets of test items (Velozo et al, 1999; Jette et al, 2005).
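Selecting an item that matches the current ability estimate can be sketched as follows. The example assumes a dichotomous Rasch model and the common maximum-information selection rule; both are assumptions for illustration and are not prescribed by this paper. The pool contents, difficulty values, and function names are hypothetical.

```python
import math

def item_information(ability, difficulty):
    """Fisher information of a Rasch item at a given ability level."""
    p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    return p * (1.0 - p)

def select_item(ability, pool, administered):
    """Return the not-yet-administered item that is most informative."""
    candidates = [name for name in pool if name not in administered]
    return max(candidates, key=lambda name: item_information(ability, pool[name]))

# Hypothetical calibrated item pool (difficulties in logits).
pool = {
    "roll over in bed": -3.0,
    "rise from a chair": -1.5,
    "walk one block": 0.0,
    "climb two flights of stairs": 1.5,
    "jog 1 km": 3.0,
}

# A severely frail respondent receives an easy item, while a mildly
# frail respondent receives a more challenging one.
print(select_item(-2.5, pool, set()))
print(select_item(1.0, pool, set()))
```

Under the Rasch assumption, the most informative item is simply the one whose difficulty lies closest to the current ability estimate, so two respondents with different abilities are measured on the same scale with different items.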

In addition, meaningful information can be obtained by inspecting an older adult’s response pattern. Health care professionals can visually identify more or less challenging measurement items in an item pool in accordance with the older adult’s current ability level. That is, this primary feature of IRT can be used to determine whether the older adult can move on to more challenging physical activity or should return to less challenging physical activity. Health care professionals can logically expect that easy items pose less of a challenge while difficult items pose more. Moreover, community-dwelling older adults can get a general sense of the type of frailty they are currently experiencing. This logic can be built into the online measurement system.

The CAT method offers a means of administering test items through a computer algorithm in which item selection is tailored to an individual’s ability level, so that administration takes less time and fewer items while measurement precision is maintained (Jette et al, 2005; Beleckas et al, 2017; Gamper et al, 2016; Jette et al, 2008; Morris et al, 2017; Risk, 2016). The CAT algorithm plays a primary role in selectively administering the items that are most relevant for an individual of a particular ability level. Thus, measurement efficiency with fewer items can be achieved without sacrificing measurement precision relative to a full-version measurement. With these advantages, CAT technology combined with IRT has recently become an alternative to traditional fixed-format measurements (Jette et al, 2005; Beleckas et al, 2017; Gamper et al, 2016; Jette et al, 2008; Morris et al, 2017; Risk, 2016; Weiss, 2004).
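The overall CAT cycle of selecting an item, recording the response, and updating the ability estimate can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm of any particular system: it assumes a dichotomous Rasch model, an expected a posteriori (EAP) ability estimate with a standard normal prior on a fixed grid, and a stopping rule based on a target standard error or a maximum number of items. All names, thresholds, and the simulated respondent are hypothetical.

```python
import math

def prob(theta, b):
    """Rasch probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Ability grid from -4 to 4 logits in steps of 0.1 (assumed range).
GRID = [g / 10.0 for g in range(-40, 41)]

def eap_estimate(responses):
    """Posterior mean and SD of ability given (difficulty, answer) pairs."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior
        for b, x in responses:
            p = prob(theta, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(pool, answer_fn, se_target=0.5, max_items=10):
    """Administer items adaptively until the stopping rule is met."""
    responses, used = [], []
    theta, se = 0.0, float("inf")
    while len(used) < min(max_items, len(pool)) and se > se_target:
        remaining = [name for name in pool if name not in used]
        # Under the Rasch model the most informative item is the one
        # whose difficulty is closest to the current ability estimate.
        item = min(remaining, key=lambda name: abs(pool[name] - theta))
        used.append(item)
        responses.append((pool[item], answer_fn(item)))
        theta, se = eap_estimate(responses)
    return theta, se, used

# Hypothetical pool of 13 items spread from -3 to +3 logits.
pool = {f"item_{i:02d}": -3.0 + 0.5 * i for i in range(13)}

def simulated_respondent(item):
    """Deterministic stand-in for a frail respondent (illustration only)."""
    return 1 if pool[item] <= -1.0 else 0

theta, se, used = run_cat(pool, simulated_respondent)
print(round(theta, 2), round(se, 2), used)
```

In this sketch the respondent answers only a subset of the pool, and the items administered cluster around the respondent's ability level. A production system would typically handle polytomous rating scales and would need clinically validated stopping rules.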

The CAT method can also provide health care professionals with real-time data available for immediate use. After a few minutes of interaction with a touch screen monitor, results are immediately available to the older adult as well as to health care professionals. Once older adults are registered in the online measurement system, the physical activity status of a frail older adult can be continually monitored and updated as he or she progresses in the community or elsewhere. Thus, longitudinal follow-up measurements of physical activity status can be tailored to the previous status and recorded in remote item pools over time.

The focus of this review was primarily on an emerging approach to measuring and monitoring the physical activity status of community-dwelling frail older adults. A few initiatives targeting the underlying limitations of older adults’ physical activity have been attempted and embraced by investigators. The conventional approach often faces problems in measuring community-dwelling frail older adults. That is, most conventional measurement systems, if not all, aim only at classifying frail older adults in the context of physical activity limitations. In addition, health care professionals can no longer rely on conventional measurement systems to remotely monitor community-dwelling frail older adults because of the many drawbacks of those systems.

The online measurement system combining the two methodologies, IRT and CAT, can provide a means of continuously monitoring the physical activity of community-dwelling frail older adults. Moreover, older adults can monitor their own physical activity status over time in their community or elsewhere. The primary advantages of the online measurement system are 1) a significant reduction in measurement time (i.e., measurement efficiency), 2) precise outcomes regardless of current physical activity status (i.e., measurement precision), and 3) tailored feedback on the next level of physical activity immediately after the completion of each measurement. These features would lead frail older adults toward consensus decision-making that is congruent with health care professionals’ suggestions.

  1. Beleckas CM, Padovano A, Guattery J. Performance of patient-reported outcomes measurement information system (PROMIS) upper extremity (UE) versus physical function (PF) computer adaptive tests (CATs) in upper extremity clinics. J Hand Surg Am. 2017;42(11):867-874.
  2. Beswick AD, Reese K, Dieppe P. Complex interventions to improve physical function and maintain independent living in older adult people: A systemic review and meta-analysis. Lancet. 2008;371(9614):725-735.
  3. Callahan J. Aging in Place: Generations and aging series. Amityville, N.Y.: Baywood Pub. Co. 1993:65-70.
  4. Donald IP. Housing and health care for older people. Age Ageing. 2009;38(4):364-367.
  5. Espinoza SE, Jung I, Hazuda H. Frailty transitions in the San Antonio longitudinal study of aging. J Am Geriatr Soc. 2012;60(4):652-660.
  6. Fallah N, Mitnitski A, Searle SD. Transitions in frailty status in older adults in relation to mobility: A multistate modeling approach employing a deficit count. J Am Geriatr Soc. 2011;59(3):524-529.
  7. Fisher AG. Functional measures part 2: Selecting the right test, minimizing the limitations. Am J Occup Ther. 1992;46(3):278-281.
  8. Frank JB. The Paradox of Aging in Place in Assisted Living. Bergin & Garvey, London, UK. 2002:125-128.
  9. Fried LP, Tangen CM, Walston J. Frailty in older adults: Evidence for a phenotype. J Gerontol A Biol Sci Med Sci. 2001;56(3):M146-M157.
  10. Gamper EM, Petersen MA, Aaronson N. Development of an item bank for the EORTC role functioning computer adaptive test (EORTC RF-CAT). Health Qual Life Outcomes. 2016;14:72.
  11. Gill TM, Baker DI, Gottschalk M. A program to prevent functional decline in physically frail, older adult persons who live at home. N Engl J Med. 2002;347(14):1068-1074.
  12. Gill TM, Baker DI, Gottschalk M. A prehabilitation program for physically frail community-living older persons. Arch Phys Med Rehabil. 2003;84(3):394-404.
  13. Glass TA, Balfour JL. Neighborhoods, aging, and functional limitations. Neighborhoods and health. 2003:303-309.
  14. Gloth FM, Scheve AA, Shah S. The frail elderly functional assessment questionnaire: Its responsiveness and validity in alternative settings. Arch Phys Med Rehabil. 1999;80(12):1572-1576.
  15. Gloth FM, Walston J, Meyer J. Reliability and validity of the frail older adult functional assessment questionnaire. Am J Phys Med Rehabil. 1995;74(1):45-53.
  16. Jette AM, Lachman M, Giorgetti MM. Exercise-it’s never too late: The strong-for-life program. Am J Public Health. 1999;89(1):66-72.
  17. Jette AM, Haley SM. Contemporary measurement techniques for rehabilitation outcomes assessment. J Rehabil Med. 2005;37(6):339-345.
  18. Jette AM, Haley SM, Ni P. Creating a computer adaptive test version of the late-life function and disability instrument. J Gerontol A Biol Sci Med Sci. 2008;63(11):1246-1256.
  19. Kielhofner G, Dobria L, Forsyth K. The construction of keyforms for obtaining instantaneous measures from the occupational performance history interview rating scales. Occup Ther J Res. 2005;25(1):23-32.
  20. Lally F, Crome P. Understanding frailty. Postgrad Med J. 2007;83(975):16-20.
  21. Lewis ET, Dent E, Alkhouri H. Which frailty scale for patients admitted via emergency department? A cohort study. Arch Gerontol Geriatr. 2019;80:104-114.
  22. Lohne-Seiler H, Hansen BH, Kolle E. Accelerometer-determined physical activity and self-reported health in a population of older adults (65-85 years): A cross-sectional study. BMC Public Health. 2014;14:284.
  23. Manini T, Clark BC. Dynapenia and aging: An update. J Gerontol A Biol Sci Med Sci. 2012;67(1):28-40.
  24. McHorney CA. Health status assessment methods for adults: Past accomplishments and future challenges. Annu Rev Public Health. 1999;20:309-335.
  25. McHorney CA. Generic health measurement: Past accomplishments and a measurement paradigm for the 21st century. Ann Intern Med. 1997;127(8 Pt 2):743-750.
  26. Morris S, Bass M, Lee M. Advancing the efficiency and efficacy of patient reported outcomes with multivariate computer adaptive testing. J Am Med Inform Assoc. 2017;24(5):897-902.
  27. Risk N. The Impact of item parameter drift in computer adaptive testing (CAT). J Appl Meas. 2016;17(1):54-78.
  28. Stuck AE, Egger M, Hammer A. Home visits to prevent nursing home admission and functional decline in older adult people: systemic review and meta-regression analysis. JAMA. 2002;287(8):1022-1028.
  29. United Nations, Department of Economic and Social Affairs, Population Division. World Population Aging. 2015:211-215.
  30. Velozo CA, Choi B, Zylstra SE. Measurement qualities of a self-report and therapist-scored functional instrument based on the dictionary of occupational titles. J Occup Rehabil. 2006;16(1):109-122.
  31. Velozo CA, Kielhofner G, Lai JS. The use of Rasch analysis to produce scale-free measurement of functional ability. Am J Occup Ther. 1999;53(1):83-90.
  32. Velozo CA, Peterson EW. Developing meaningful fear of falling measures for community dwelling older adult. Am J Phys Med Rehabil. 2001;80(9):662-673.
  33. Washburn RA, Smith KW, Jette AM. The physical activity scale for the older adult (PASE): Development and evaluation. J Clin Epidemiol. 1993;46(2):153-162.
  34. Weiss DJ. Computerized adaptive testing for effective and efficient measurement in counselling and education. Meas Eval Couns Dev. 2004;37:70-84.
  35. Wiles JL, Leibing A, Guberman N. The meaning of “aging in place” to older people. Gerontologist. 2001;52(3):357-366.
  36. Wilson JF. Frailty and its dangerous effects might be preventable. Ann Intern Med. 2004;141(6):489-492.
  37. Woo J, Yu R, Wong M. Frailty screening in the community using the frail scale. J Am Med Dir Assoc. 2015;16(5):412-419.
  38. World Health Organization. Towards a common language for functioning, disability and health. Geneva. 2002:126-131.
  39. Yang L, Jiang Y, Xu S. Evaluation of frailty status among older people living in urban communities by Edmonton Frail Scale in Wuhu, China: A cross-sectional study. Contemp Nurse. 2018;54(6):630-639.