Evidence-Based Medicine and Outcomes Assessment

Chapter 11

For veterinary surgeons, making decisions about the optimal care of patients requires the integration of individual clinical experience with information gathered from the best available clinically relevant research. The proficiency and judgment that individual clinicians acquire through clinical experience and clinical practice are fundamental to the decision-making process; however, knowledge gained through unsystematically recorded personal experiences can be biased. Therefore, surgeons look externally to well-conducted clinical research to affirm their current decision-making process or to suggest an alternative.

Weighing the merits of published research when answering clinical questions can be challenging. Major technical innovations tend to be reported as a series of cases, which form the core of the clinical surgical literature. New operations are adopted on the basis of these case series, before comparisons are ever made with previous techniques. Surgical procedures are in constant evolution, with each surgeon adding small technical refinements, which makes comparisons of case series over time difficult. Even if the surgical technique remains the same from one case series to the next, other variables often differ, such as presurgical screening, perioperative management, and postoperative follow-up, again confounding comparisons between techniques. One of the most significant problems in evaluating the efficacy of surgical interventions in the literature is the lack of consistent, valid, reliable, and clinically relevant outcome assessment.

Outcome Measures

Outcome measures are the tools (also called instruments) used to measure the success of an intervention. For surgeons, the challenge is to define success for a surgical procedure and then to determine which outcome measures are most appropriate for studies evaluating the efficacy of that procedure. Occasionally, the outcome of interest is clear and straightforward; for example, a successful outcome for ovariohysterectomy is sterility of the animal. For most procedures, however, the outcome is less clearly defined. For many orthopedic surgical procedures, for example, a successful outcome is decreased pain and improved function of the animal. Ideally, therefore, outcome measures that capture pain and function would be used in efficacy studies. Many efficacy studies of orthopedic procedures use kinematic or force plate gait analysis. The theory behind the use of gait analysis is that it is an objective measure of lameness, and lameness, in turn, is an indirect indication of pain and function. Gait analysis is not a direct measure of function, which becomes obvious when a dog increases the peak vertical force it generates on its limb following a surgical procedure but is still unable to climb the stairs or jump onto the bed unassisted. The procedure may have been a success in terms of gait analysis, but the improvement in function that the owner or surgeon desired was not obtained. So why use gait analysis as an outcome measure if it does not directly assess the outcome of interest (i.e., the dog’s ability to function in its home environment)? Gait analysis is a valid, reliable, and objective measure of lameness. Two of those properties, validity and reliability, are required for all outcome measures, and objectivity in a measure is desirable whenever possible. If no valid and reliable measures of a dog’s function in its home environment are available, then another measure, such as gait analysis, must be chosen.
The problem is that most direct measures of the success of a surgical intervention, such as the dog’s ability to function in its home environment, require assessment of a subjective attribute. The development of outcome assessment tools for subjective outcomes is not easy and requires considerable investment of both mental and fiscal resources.

Outcome Assessment in Veterinary Medicine

A sound method is available for the development and application of tools to assess subjective states. Unfortunately, much of this literature is virtually unknown to most veterinary researchers. Health science libraries do not routinely catalog journals or textbooks that address the concepts of measurement because they are predominantly directed at educational and psychologic audiences for the development of achievement, intelligence, or personality tests and scales. References on the topic that focus on attributes of interest to researchers in the health sciences, such as subjective assessment of health states and response to illness, are uncommon. It is not surprising, therefore, that the vast majority of measures reported in the veterinary literature to collect data on a subjective outcome were devised specifically for a given study, with no clear indication that the process of questionnaire development assessed the reliability and validity of the measure. Given the number and variety of unpublished scales that appear in the veterinary literature as outcome measures, it is clear that most investigators believe that devising a series of reasonable-looking questions and then averaging or summing the responses to get a score is all that is required to use that tool (i.e., series of questions) as an outcome measure in their study. However, questions that appear very reasonable on the surface, and that have been used in published studies, prove on closer inspection to be very problematic. Example questions are provided here:


For this question, the focus will be on problems with the question itself:

Problem No. 1: This is what is known as a double-barreled question, that is, it actually asks two different questions at once. It is asking an owner “How much difficulty does your dog have going up the stairs?” and “How much difficulty does your dog have going down the stairs?” The answer may not be the same for both. For example, dogs with front limb orthopedic disease often will have more difficulty going down the stairs, and dogs with hindlimb orthopedic disease will have greater difficulty going up the stairs. So how will an owner of a dog with elbow osteoarthritis answer this question, if the dog goes up the stairs relatively easily, but has more difficulty going down? Will he choose to average the difficulty of the two activities, or will he choose the most extreme value of the two? If the owner is asked this question again later in the course of the study, will he use the same thought process to answer the question?

Problem No. 2: How will an owner answer this question if the dog does not live in an environment with stairs? Many dogs, depending on geography or the physical abilities of their owners, live in an environment without stairs. Will the owners of these dogs make a “guess” as to how much difficulty their dog would have if it did live in an environment with stairs? Will these owners just choose to leave the question blank and then leave it up to the investigators to manage missing data?

Problem No. 3: Over what time frame should the owner make this assessment—today, the past week, the past month? Many conditions have a waxing and waning clinical course (i.e., “good days” and “bad days”). If an owner of a dog with coxofemoral osteoarthritis is asked this question, will he choose the worst of the dog’s most recent days, or will he average some series of days?
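Problem No. 2 has a direct consequence for scoring. As a minimal sketch, assuming a hypothetical four-item questionnaire scored 0 to 4 per item with a blank answer recorded as `None` (the items, the scale, and the function names are illustrative and are not taken from any published instrument), the code below shows how a single blank item changes a summed score and an averaged score in different ways:

```python
def sum_score(responses):
    """Sum the answered items; blank items (None) contribute nothing."""
    return sum(r for r in responses if r is not None)


def mean_score(responses):
    """Average the answered items only; return None if every item is blank."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return None
    return sum(answered) / len(answered)


# Hypothetical owners giving the same answers, except that the second
# owner has no stairs at home and leaves the stairs item blank.
home_with_stairs = [3, 2, 4, 3]
home_without_stairs = [3, None, 4, 3]

for label, responses in [("with stairs", home_with_stairs),
                         ("without stairs", home_without_stairs)]:
    print(label, sum_score(responses), mean_score(responses))
```

Neither rule is correct by default. The point is only that whichever rule the investigator picks changes the score for dogs that live without stairs, even when nothing about the dog differs.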

For this question, the focus will be on problems with the response options:

Problem No. 1: Some of the response options are asking for an assessment of change, while others are asking for a current assessment of health status. If a question is to be used to assess change, two options are available. The first option is to assess the current health status of the animal at two different points in time during the study, and then calculate the change. The second option is to ask about change at some point following an intervention or during the progression or resolution of disease. Mixing both types of responses in the same question causes confusion for the respondent.

Problem No. 2: Response options do not cover all of the possible choices for which the owner may be looking. Two “change” options are provided: “no change” and “increased,” but no option is given for “decreased.” Three health status options are listed: “indifferent,” “little attention,” and “needy”; no options are offered for a normal or average amount of attention, or for a lot of attention without the negative connotation of being needy.

Problem No. 3: The response options are not mutually exclusive. If a dog pays little attention to the family, but that is not any different than the dog’s normal behavior, will the owner choose option number 2 or option number 4? If the owner circles both on the questionnaire, how will the investigator manage the data?
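The first option described under Problem No. 1, assessing current health status at two points in time and then calculating the change, can be sketched as follows. The dog identifiers, the scores, and the function name `change_scores` are hypothetical and serve only to illustrate the arithmetic:

```python
def change_scores(baseline, follow_up):
    """Return follow-up minus baseline for each dog assessed at both visits.

    Dogs missing from either visit are skipped rather than guessed at.
    """
    return {
        dog: follow_up[dog] - baseline[dog]
        for dog in baseline
        if dog in follow_up
    }


# Hypothetical status scores recorded with the same question at two visits.
baseline = {"dog_a": 7, "dog_b": 4, "dog_c": 6}
follow_up = {"dog_a": 3, "dog_b": 4}  # dog_c was lost to follow-up

print(change_scores(baseline, follow_up))
```

Because the same status question is asked at both visits, the respondent is never asked to judge change directly, which avoids mixing the two kinds of response in one question.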

Designing clear questions with appropriate response options is more challenging than it would appear at first glance. In addition, it is only the first step in the development of an appropriate outcome assessment tool. Devising the questions must be followed by confirmation that those questions make up a valid and reliable instrument.

Stepwise Development of a Health Measurement Instrument*

Step One: Devising the Items (Questions)

The first step in designing a scale or questionnaire is devising the questions themselves. This is far from a trivial task in that no amount of statistical manipulation after the fact can compensate for poorly chosen questions, that is, those that are badly worded, ambiguous, irrelevant, or even not present.¹³ Two of the more relevant techniques for developing questions in a rigorous and systematic manner include the use of focus groups and key informant interviews. Focus groups are discussions in which a small group of people (typically six to twelve) with traits of interest are guided by a facilitator to talk about themes that are important to the investigation. For example, if one wanted to develop an outcome assessment scale to measure the ability of dogs with osteoarthritis to function in their home environment, owners of dogs with osteoarthritis would be assembled to discuss the behaviors they associate with their dog’s ability to function and how their dog’s osteoarthritis affects those behaviors. Key informant interviews are interviews with a small number of people who possess unique knowledge. These could be owners of animals with the condition of interest, but usually the group consists of three to ten clinicians who have extensive experience evaluating and managing those patients.

Once a series of questions has been devised, a method by which the responses will be obtained must be chosen. This is dictated in part by the nature of the question, but a very large body of research in this area describes the advantages and disadvantages of the wide variety of scaling methods that can be used. One form of question frequently used requires only a categorical judgment by the respondent, indicated by a “yes/no” response or a simple check. This results in a nominal scale (Figure 11-1). However, many of the variables of interest to health care researchers are continuous rather than categorical, and many options for scaling these responses may be chosen. The visual analogue scale is a line of fixed length with anchors at the extreme ends and with no words describing the intermediate positions (Figure 11-2). Respondents place a mark on the line corresponding to their perceived state. This approach is simple for investigators but often is not well understood or completed appropriately by respondents, and other methods may yield more precise measurement.³,⁶,⁷,¹³ To minimize the problems that arise when some respondents inappropriately complete visual analogue scales, those response options can be converted relatively easily to numeric rating scales by converting the visual analogue line to a 0 to 10 choice, as in the example (Figure 11-3). Adjectival scales use descriptors along a continuum rather than simply labeling the endpoints (Figure 11-4). Likert scales are bipolar scales measuring a continuum of positive to negative responses to a statement (Figure 11-5). When Likert scales are constructed, consideration should be given to several issues, such as the number of scale divisions to provide, and whether or not a neutral category should be included. For example, sometimes it is desirable to use a four- or six-point scale and exclude the neutral response option, to force the respondent to make a choice in the negative or positive direction.
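As a rough illustration of how a visual analogue response relates to a 0 to 10 numeric rating, the sketch below maps the distance of a respondent's mark along the line to the nearest whole number. The 100 mm line length and the function name `vas_to_nrs` are assumptions made for illustration and are not taken from the chapter. In practice, the conversion described above is made by changing the response format shown to the respondent, not by rescoring marks afterward.

```python
import math


def vas_to_nrs(mark_mm, line_length_mm=100.0):
    """Map a mark on a visual analogue line to a 0-10 numeric rating.

    mark_mm is the distance of the mark from the left anchor. The 100 mm
    default line length is an assumption for this sketch.
    """
    if not 0 <= mark_mm <= line_length_mm:
        raise ValueError("mark must lie on the line")
    # Scale to 0-10, then round half up to the nearest whole number.
    return math.floor(10 * mark_mm / line_length_mm + 0.5)


for mark in (0, 32, 68, 100):
    print(mark, "mm ->", vas_to_nrs(mark))
```

A mark that falls off the line raises an error here instead of being silently clipped, since an off-line mark is one of the ways a respondent can complete a visual analogue scale inappropriately.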

Posted Jul 18, 2016, in Pharmacology, Toxicology & Therapeutics
