
1 General Laboratory Concepts




Test Selection and Asking the Right Question


Veterinarians have many choices regarding laboratory testing. Important factors include availability of reference laboratory testing, reliability and ease of in-clinic testing, cost-effectiveness, accuracy, and turnaround time. One must determine what tests to perform in-clinic and what tests to send out to a veterinary reference laboratory or to a local human laboratory. Recent improvements in the automation and ease of use of analyzers designed for in-clinic use are changing what is acceptable. Correct choices vary with the needs and patient population of each veterinary clinic. No one answer fits all situations.


To get a specific and meaningful answer from laboratory testing, the diagnostician must ask a specific and meaningful question and understand whether a particular laboratory test is likely to yield a useful answer. As an example, compare the likely outcome of asking the following questions: “Is the animal anemic?” “What is wrong with the animal?” A microhematocrit procedure (in addition to knowledge of the animal’s hydration status) will usually answer the first specific question, but a serum chemistry profile, complete blood count (CBC), urinalysis, and fecal examination may or may not answer the second vague, nonspecific question. A clinician should ask, “What will a high, low, or normal test result specifically mean in terms of making a correct diagnosis, providing accurate prognostic information, or choosing an appropriate therapeutic plan?” If the answer is meaningful (i.e., it will change some action taken by the clinician), the test is worth the cost. Normal laboratory results may eliminate certain diseases (i.e., have high negative predictive value [NPV]) and can be as valuable as abnormal results.




Simple Statistics and Practical Interpretations


A reasonable level of skepticism about laboratory results should be maintained. Clinicians should not believe all numbers. All laboratory data should be interpreted in the context of the history, physical examination, and other diagnostic findings in a patient. Unexpected results are common and should stimulate the clinician to reevaluate the provisional diagnosis and look for additional diseases or consider possible causes for erroneous laboratory results. Trends over several days are often more informative than test results on a single day. Typically, not all test results that “should be” abnormal in a disease situation are abnormal in each affected patient.



When interpreting laboratory tests, it is important to keep in mind that reference intervals include the results expected in 95% of normal animals. Thus 5% of results in normal animals (i.e., 1 of 20) are expected to fall outside the reference interval. If a profile of 20 tests is performed, only about 36% of normal animals would have all 20 results within their reference intervals. Diagnosticians must expect some false-positive and false-negative test results. No test is 100% sensitive and 100% specific for a disease.
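The arithmetic behind the 36% figure can be sketched in a few lines, assuming the 20 tests are statistically independent (real analytes often are not, so this is an idealization):

```python
# Probability that a healthy animal falls inside the 95% reference
# interval for every test on an n-test panel: 0.95 ** n.
def prob_all_within_reference(n_tests, coverage=0.95):
    """Chance that all n independent results are 'normal' by chance alone."""
    return coverage ** n_tests

# A 20-test profile: only ~36% of healthy animals have no flagged result.
p20 = prob_all_within_reference(20)

# The chance of at least one "abnormal" result is the complement.
p_at_least_one_flag = 1 - p20
```

As the panel grows, the chance of at least one spurious flag grows quickly, which is why an isolated, mild deviation on a large profile should be interpreted cautiously.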



Abnormal results in normal animals are often only slightly above or below the reference interval. The magnitude of a change helps determine one’s confidence that a disease is present. Large alterations usually allow greater confidence that the animal is abnormal, because they are less likely the result of statistical chance. With many tests, increasing magnitude of deviation from normal also reflects a more severe disease and worsening prognosis.


Laboratory methods vary in their ability to provide the same result when a sample is repeatedly analyzed (i.e., analytical precision). The coefficient of variation (CV) is often used to indicate the precision of an assay. Assays with a low CV have a high degree of precision; small changes in results can be attributed to changes in the patient and not random variation in the assay itself. Assays with a high CV have poorer precision; small changes in results may be due to variation in the assay and not indicative of disease in the patient. For example, because of the great imprecision of a manual leukocyte, platelet, or erythrocyte count, results can vary 10% to 20% only because of technique; therefore mild changes from one day to the next may reflect only imprecision in the procedure rather than actual changes in the patient.
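The CV calculation itself is simple; a minimal sketch, using hypothetical replicate glucose measurements (the values and units are invented for illustration):

```python
# Coefficient of variation (CV) of repeated measurements of one sample:
# CV% = (standard deviation / mean) * 100.
def coefficient_of_variation(results):
    n = len(results)
    mean = sum(results) / n
    # Sample standard deviation (n - 1 denominator).
    variance = sum((x - mean) ** 2 for x in results) / (n - 1)
    return (variance ** 0.5 / mean) * 100.0

# Ten hypothetical replicate glucose measurements (mg/dL) of one sample:
replicates = [98, 101, 99, 102, 100, 97, 103, 100, 99, 101]
cv = coefficient_of_variation(replicates)  # ~1.8% -> a precise assay
```

An assay with a CV near 2% can distinguish much smaller day-to-day patient changes than a manual count varying 10% to 20% on technique alone.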


Evaluating populations of apparently healthy animals with screening tests is much different from testing individual sick animals. The predictive value of a test is strongly affected by the prevalence of disease in a population.3 For example, if a disease occurs in 1 of 1000 animals and a test is 95% specific and sensitive for the disease, what is the chance that an animal with a positive test result actually has the disease (i.e., positive predictive value [PPV])? Most students, residents, and clinicians answer this question incorrectly; in one study the average response was 56%, with a range of 0.095% to 99%. If the test is 95% sensitive, 95% of all animals with the disease should be detected. Therefore the one animal in 1000 that has the disease should be positive. If the test has a specificity of 95%, then 5% of the 999 animals in 1000 that do not have the disease, or about 50, will have a false-positive test result. The PPV (i.e., the number of true-positive tests/total number of positive test results) of this test is only about 2%, because only 1 of those 51 animals with a positive test result will have the disease. Nearly all of the positive results are false positives that must be interpreted and explained to the animals' owners.


Screening tests with high sensitivity are often useful to rule out a disease. In the example above, with a test that has a sensitivity of 95%, 5% of animals with the disease will not be detected and will be false negatives. If 1 in 1000 animals has the disease, then on average 0.05 animals per 1000 will have a false-negative result. If the specificity is 95%, then about 949 (949.05) of the 999 animals per 1000 that do not have the disease will be true negatives. Thus the NPV (number of true-negative tests/total number of negative test results; 949.05/949.10) is greater than 99.9%.



If a test is performed only when the disease is likely instead of screening all animals (including those with no clinical signs) for a disease, then the frequency of diseased animals in the test population is higher. Testing for disease in sick patients is exemplified by heartworm testing. Consider an example in which a test for heartworm disease is 99% sensitive and 90% specific and is used in 100 outside dogs in a heartworm-endemic area.5 If the prevalence of disease is 50%, then one should identify 49.5 of the 50 infected dogs and obtain 5 false-positive results in the 50 dogs without heartworms. Thus the PPV in this situation is 49.5/54.5, or 91%. There are still false positives to interpret, but far fewer.
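Both worked examples follow from the same arithmetic, and a short sketch makes the effect of prevalence explicit (the function name is illustrative, not from the text):

```python
# Predictive values from sensitivity, specificity, and disease prevalence.
def predictive_values(sensitivity, specificity, prevalence):
    tp = sensitivity * prevalence              # true positives (per animal tested)
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# Screening: 95% sensitive/specific test, disease in 1 of 1000 animals.
ppv_screen, npv_screen = predictive_values(0.95, 0.95, 0.001)

# Targeted testing: 99% sensitive, 90% specific, 50% prevalence.
ppv_sick, npv_sick = predictive_values(0.99, 0.90, 0.50)
```

With identical test performance, PPV rises from roughly 2% in the screening scenario to roughly 91% when the test is reserved for animals likely to have the disease.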


Receiver operating characteristic (ROC) curves (Figure 1-1) are used to determine the effectiveness of a test in diagnosis. ROC curves plot the true-positive rate (as indicated by the diagnostic sensitivity of an assay) against the false-positive rate (1 − the diagnostic specificity of an assay) calculated at various concentrations over the range of the test’s results. A good test has a great increase in the true-positive rate along the y axis for a minimal increase in the false-positive rate along the x axis. The 45-degree line in Figure 1-1 would indicate an ineffective test, which would have an equal increase in false positives and in true positives. Whether a positive result on such a test was a true positive or a false positive would be random chance, like tossing a coin. Figure 1-1 illustrates that serum creatinine and urea (measured as blood urea nitrogen [BUN]) are very good tests of renal failure in dogs and very similar in effectiveness.2 The urea/creatinine ratio is noticeably worse than either creatinine or urea, as illustrated by its curve lying closer to the 45-degree line (and having less area under the curve).



FIGURE 1-1 A ROC curve is a way to show the effectiveness of a test. Increased serum urea concentration, creatinine concentration, and urea/creatinine ratios were compared in the diagnosis of 417 dogs with renal failure, 1463 normal dogs, and 2418 sick dogs without renal disease.2 The areas under the ROC curves show that serum creatinine and urea concentrations were very similar in diagnostic accuracy, but the urea/creatinine ratio was obviously worse than either of them.


ROC curves are also useful in selecting upper and lower decision thresholds that can be used to decide when a diagnosis can be ruled in or ruled out. Note that decision (or diagnostic) thresholds are different from reference intervals. Animals with a test result below the lower decision threshold limit are unlikely to have the disease being tested for; animals with a test result above the higher decision threshold limit are likely to have the disease.


Diagnostic thresholds for renal failure are suggested where the creatinine or urea ROC curves in Figure 1-1 rapidly change their upward angle and begin to turn and plateau to the right. Lower to the left along the curve is a higher concentration threshold with greater specificity and lower sensitivity. More to the upper right is a lower threshold with greater sensitivity and lower specificity. At the bend in the curve, the test has optimal sensitivity (increase in true positives) with minimal loss of specificity (increase in false positives).
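One common formalization of "the bend in the curve" is Youden's J statistic (sensitivity + specificity − 1), which balances the gain in true positives against the gain in false positives. A minimal sketch follows; the creatinine cutoffs and their sensitivity/specificity pairs are invented for illustration, not taken from Figure 1-1:

```python
# Pick the decision threshold that maximizes Youden's J = sens + spec - 1,
# one common way to locate the "bend" of a ROC curve.
def best_threshold(points):
    """points: list of (threshold, sensitivity, specificity) tuples."""
    return max(points, key=lambda p: p[1] + p[2] - 1)

# Hypothetical creatinine cutoffs (mg/dL) with invented sens/spec pairs:
candidates = [
    (1.4, 0.98, 0.75),   # low cutoff: very sensitive, many false positives
    (1.8, 0.92, 0.90),   # near the bend of the curve
    (2.4, 0.80, 0.97),   # high cutoff: very specific, misses mild disease
]
threshold, sens, spec = best_threshold(candidates)  # selects the 1.8 cutoff
```

In practice a laboratory may deliberately choose a cutoff above or below the bend, depending on whether false positives or false negatives are more costly for the disease in question.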



Reference Values


Reference values (i.e., reference ranges, reference intervals, “normal” ranges) are used to determine if a test result appears normal or abnormal. A laboratory result is meaningless without knowing what values normal animals in that situation should have. It is not unusual for a veterinarian to request that a test be performed in a species for which the laboratory has no reference values, nor is it unusual to find that the laboratory has not validated the test for accuracy in disease diagnosis in species not commonly tested. Reference intervals may be presented as a range or a mean (or median) plus and minus 2 standard deviations. Reference intervals should optimally also have 95% confidence intervals around the upper and lower values to help show that the limits can be “fuzzy.” Too often veterinarians use an upper or lower value as an exact breakpoint between normal and abnormal. For example, if a serum sodium reference interval is 146 to 156 mmol/L, a common error is to consider 146 mmol/L normal but 145 mmol/L as indicating hyponatremia despite the fact that imprecision in the method or rounding off of values may mean that these values are essentially the same. Another example is that less than 60,000 reticulocytes/µl is often incorrectly given as a breakpoint between regenerative and nonregenerative anemia. The 60,000 should be considered an approximate, rule-of-thumb, mean reference value.
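A minimal sketch of the parametric mean ± 2 standard deviation interval mentioned above, which covers roughly 95% of a Gaussian-distributed reference population (the sodium values are invented for illustration):

```python
# Parametric reference interval: mean +/- 2 standard deviations,
# covering ~95% of a Gaussian-distributed reference population.
def reference_interval(values):
    n = len(values)
    mean = sum(values) / n
    sd = (sum((x - mean) ** 2 for x in values) / (n - 1)) ** 0.5
    return mean - 2 * sd, mean + 2 * sd

# Hypothetical serum sodium results (mmol/L) from healthy dogs:
sodium = [148, 151, 150, 152, 149, 153, 150, 151, 152, 150,
          149, 151, 150, 152, 148, 153, 151, 150, 149, 152]
low, high = reference_interval(sodium)
```

The resulting limits are estimates with their own uncertainty, which is exactly why the upper and lower values should be treated as "fuzzy" rather than as exact breakpoints between normal and abnormal.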


One uses the mean or range of reference values in different situations. Upper and lower reference values are best for individual patients without a previous evaluation. The best reference values are a patient’s own values (if available) before an illness, because individual animals or members of special groups (e.g., sight hounds, puppies) may have unique characteristics. When comparing groups of animals (e.g., a research project), one should use a mean or median value for the groups for interpretation of changes (e.g., packed cell volume [PCV] 45%) instead of published reference values (e.g., PCV 37% to 54%).


Specific reference values should be used for different methods and instruments. One laboratory’s reference interval for canine reticulocytes on the ADVIA 120 instrument (Siemens Healthcare Diagnostics) is 11,000 to 111,000/µl (see Appendix II). The XT-2000iV analyzer (Sysmex Corporation) reports higher numbers of reticulocytes than the ADVIA 2120 and should have a different set of canine reference values (19,400–150,100/µl). A current problem with available automated reticulocyte results on most hematologic samples is that many nonanemic dogs appear to have a regenerative erythropoietic response because they have reticulocyte counts higher than currently available reference values. This may occur because, even with properly established reference values, there may be variations in how some samples were collected (excited dog versus calm dog), or changes in instrument software may change the sensitivity of detection of reticulocytes.


Mean values are used to detect trends that depart from the norm. For example, if a dog has a low carbon dioxide partial pressure (Pco2) (indicating a respiratory alkalosis trend) and a low bicarbonate (HCO3) concentration (indicating a metabolic acidosis trend), the diagnostician uses pH to indicate which is the disease change and which is likely compensation. A pH within the reference range can still be informative based on how it deviates from the mean for pH. For example, a low-normal pH indicates an acidifying tendency and that the disease process is more likely metabolic acidosis with respiratory compensation, rather than respiratory alkalosis with metabolic compensation. Means are also used when increases in enzyme activity are reported as an x-fold increase (e.g., a tenfold increase over the mean). It is more common to use the fold increase over the upper reference value, because mean values are often not available.


Reference values are often suboptimal. New reference intervals should, theoretically, be established whenever a laboratory changes instruments, methods, or even types of reagents. The expense is considerable and often prohibitive considering the number of species involved; the variety of breeds; the effect of age, sex, and other factors; and the number of “normal” animals optimally needed in a reference population for each category. An ideal reference population should include 120 individuals for parametric and 200 individuals for nonparametric distributed values. A robust method for determining reference intervals is recommended when only 20 to 40 individuals are available.9


Unfortunately, use of readily available animals for reference populations is often found to be inappropriate because of later discovery of subclinical disease or deviations from “the typical adult dog or cat” caused by factors such as breed, age, and sex. Results from any reference population should be examined for animals whose values deviate from the main group, to see whether those animals came from one kennel (e.g., breed-related deviations such as those in greyhounds, or a nutritional or toxic disorder in the kennel population) or whether there is any other reason those animals should be removed from the reference population.


An alternative method to generate reference intervals when new techniques, reagents, or instruments are added to laboratories is to perform at least 20 to 60 duplicate analyses with both the new and the previous or “standard” procedure. Regression analysis is used to predict the new procedure’s reference values from the previous reference values, assuming the previous values were properly established from an appropriate reference population. It is very important to include a wide range of low and high results in the group of duplicate samples.
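The regression-transfer idea can be sketched in a few lines; the duplicate results, analyte, and reference limits below are invented for illustration:

```python
# Transfer reference limits to a new method via ordinary least-squares
# regression on duplicate samples run by both methods (old = x, new = y).
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical duplicate glucose results spanning low to high values:
old = [60, 80, 100, 140, 200, 300, 400]   # previous "standard" method
new = [66, 87, 108, 150, 213, 318, 423]   # new method

slope, intercept = fit_line(old, new)

# Map the old method's (hypothetical) reference limits onto the new scale:
old_low, old_high = 70, 120
new_low = slope * old_low + intercept
new_high = slope * old_high + intercept
```

Spanning a wide range of low and high results, as the text advises, is what keeps the fitted slope and intercept stable enough to extrapolate to the reference limits.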


Reference values in hematology or clinical chemistry books and articles will differ from those of a clinic’s own instruments and methods but are useful to identify factors that typically cause deviations from “the typical adult dog or cat” due to breed, age, sex, and the like. The number of tests analyzed in most clinics for some species (e.g., pet birds, wildlife, zoo animals) might be too low to justify establishing reference values. Literature values are often used for many tests if a laboratory does not have its own. One example is the International Species Information System (ISIS) Physiologic Data Reference Values for zoo animals. Many of the species are uncommon, and only a few may be present in a state or country. The ISIS values were derived from normal animals at 65 institutions so that a reasonably sized database could be established.


Selected or “groomed” hospital patient data may be used to reevaluate reference values for one or more parameters that come under question. For example, a reagent company may change the formulation of reagents for a test (e.g., calcium) and suddenly many patients appear to have abnormally high or low calcium concentrations. Patient values are not from animals proven to be normal but are a readily available source of recently obtained, inexpensive, and locally produced data. These data represent the laboratory’s current patient population. In the above example, if values for calcium concentration in the laboratory’s current patient population (minus patients having a disease affecting calcium) are compared to previous reference values, then new reference values derived from patient results can be a temporary adjustment. One would expect to find a shift in the quality control (QC) results that chronologically matches obtaining new reagents.
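Deriving such temporary limits from groomed patient data can be sketched with crude nonparametric 2.5th and 97.5th percentiles; the calcium values below are invented, and this simple index-based percentile is no substitute for the robust methods recommended for small reference groups:

```python
# Crude nonparametric central 95% limits (2.5th/97.5th percentiles)
# from "groomed" patient data, i.e., after excluding patients with
# diseases known to affect the analyte.
def central_95(values):
    ordered = sorted(values)
    n = len(ordered)
    lo_i = max(0, round(0.025 * (n - 1)))
    hi_i = min(n - 1, round(0.975 * (n - 1)))
    return ordered[lo_i], ordered[hi_i]

# Hypothetical groomed calcium results (mg/dL) from 40 patients:
calcium = [9.8, 10.1, 9.6, 10.4, 9.9, 10.2, 9.7, 10.0, 10.3, 9.5,
           10.1, 9.9, 10.2, 9.8, 10.0, 10.5, 9.4, 10.1, 9.9, 10.3,
           10.0, 9.7, 10.2, 9.8, 10.4, 9.6, 10.0, 10.1, 9.9, 10.2,
           8.9, 10.0, 9.8, 10.3, 9.7, 10.1, 11.4, 9.9, 10.2, 10.0]
low, high = central_95(calcium)  # trims the single lowest and highest values
```

Limits derived this way are only a stopgap until the reagent or instrument change is resolved and proper reference values can be reestablished.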




Sources of Laboratory Error


Laboratory error is common and needs to be detected early to avoid misdiagnosis. Laboratories should be asked to recheck results that do not make sense in the context of the animal’s history, physical examination, or other diagnostic findings, such as marked hyperkalemia or hypoglycemia in a clinically normal animal. A variety of artifacts may cause the measured concentration or activity of an analyte or multiple analytes in a panel to be falsely increased or decreased. Spurious results make it difficult to accurately interpret laboratory results; artifacts may cause abnormal results in a healthy animal or mask abnormal results in a sick animal. Depending on the cause, artifacts may make it impossible to determine the real concentration or activity of an analyte. Anytime one spurious result is found in a panel of tests, all results should be closely evaluated to determine whether they have also been affected.



Preanalytical Errors


When an artifact is suspected, it is useful to determine if the spurious findings resulted from a preanalytical or analytical error.12 Preanalytical problems occur before the laboratory analyzes the sample and are the most common cause for laboratory errors.1,11 Common types of preanalytical errors are listed in Box 1-1. Most preanalytical errors are the result of sample collection or handling problems that can be avoided. Whenever possible, new samples should be collected if these types of preanalytical errors are suspected. Treatment with drugs often causes artifacts in laboratory testing. For example, chloride concentration usually cannot be accurately measured in patients receiving potassium bromide because the most commonly available assays cannot distinguish bromide from chloride. Certain drugs are insoluble in urine (e.g., sulfa drugs), causing crystalluria.


Sep 10, 2016 | Posted in SMALL ANIMAL
