Measuring the performance of anesthetic depth indicators

W D Smith, R C Dutton, N T Smith

Abstract

Background: An appropriate measure of performance is needed to identify anesthetic depth indicators that are promising for use in clinical monitoring. To avoid misleading results, the measure must take into account both desired indicator performance and the nature of available performance data. Ideally, anesthetic depth indicator value should correlate perfectly with anesthetic depth along a lighter-deeper anesthesia continuum. Experimentally, however, a candidate anesthetic depth indicator is judged against a "gold standard" indicator that provides only quantal observations of anesthetic depth. The standard anesthetic depth indicator is the patient's response to a specified stimulus. The resulting observed anesthetic depth scale may consist only of patient "response" versus "no response," or it may have multiple levels. The measurement scales for both the candidate anesthetic depth indicator and observed anesthetic depth are no more than ordinal; that is, only the relative rankings of values on these scales are meaningful.

Methods: Criteria were established for a measure of anesthetic depth indicator performance and the performance measure that best met these criteria was found.

Results: The performance measure recommended by the authors is prediction probability P_K, a rescaled variant of Kim's d_y·x measure of association. This performance measure shows the correlation between anesthetic depth indicator value and observed anesthetic depth, taking into account both desired performance and the limitations of the data. Prediction probability has a value of 1 when the indicator predicts observed anesthetic depth perfectly, and a value of 0.5 when the indicator predicts no better than a 50:50 chance. Prediction probability avoids the shortcomings of other measures. For example, as a nonparametric measure, P_K is independent of scale units and does not require knowledge of underlying distributions or efforts to linearize or to otherwise transform scales. Furthermore, P_K can be computed for any degree of coarseness or fineness of the scales for anesthetic depth indicator value and observed anesthetic depth; thus, P_K fully uses the available data without imposing additional arbitrary constraints, such as the dichotomization of either scale. And finally, P_K can be used to perform both grouped- and paired-data statistical comparisons of anesthetic depth indicator performance. Data for comparing depth indicators, however, must be gathered via the same response-to-stimulus test procedure and over the same distribution of anesthetic depths.

Conclusions: Prediction probability P_K is an appropriate measure for evaluating and comparing the performance of anesthetic depth indicators.
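
The abstract does not spell out how prediction probability is calculated. As an illustration only, the sketch below assumes the usual pairwise form of the measure: among all pairs of observations with different observed anesthetic depths, count concordant pairs (the indicator ranks them the same way as observed depth), discordant pairs, and pairs tied in indicator value only, then take (concordant + 0.5 × tied) / (concordant + discordant + tied). The function name `prediction_probability` and its arguments are hypothetical, and the code covers only the point estimate, not the standard errors or the grouped- and paired-data comparisons mentioned in the Results.

```python
from itertools import combinations


def prediction_probability(indicator, depth):
    """Pairwise estimate of prediction probability (illustrative sketch).

    indicator: candidate depth-indicator values (ordinal or finer).
    depth:     observed anesthetic depth levels (ordinal, e.g. 0 = response,
               1 = no response), ordered so larger means deeper.
    Assumes the indicator is expected to rise with depth; an indicator that
    falls with depth would give a value below 0.5.
    """
    if len(indicator) != len(depth):
        raise ValueError("indicator and depth must have the same length")

    concordant = discordant = tied_indicator = 0
    for (x1, y1), (x2, y2) in combinations(zip(indicator, depth), 2):
        if y1 == y2:
            # Same observed depth: the pair says nothing about ordering.
            continue
        if x1 == x2:
            tied_indicator += 1   # indicator cannot separate the two depths
        elif (x1 < x2) == (y1 < y2):
            concordant += 1       # indicator orders the pair correctly
        else:
            discordant += 1       # indicator orders the pair incorrectly

    total = concordant + discordant + tied_indicator
    if total == 0:
        raise ValueError("need at least two distinct observed depth levels")

    # A tie in indicator value counts as a coin flip (half a correct call).
    return (concordant + 0.5 * tied_indicator) / total


if __name__ == "__main__":
    # Made-up values: two "response" (0) and two "no response" (1) cases.
    print(prediction_probability([1, 3, 2, 4], [0, 0, 1, 1]))
```

Because only rank comparisons enter the calculation, any monotone rescaling of the indicator leaves the result unchanged, which matches the nonparametric property described in the Results. The same function accepts a multi-level depth scale without modification.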

Source: PubMed
