- To contrast statistical thinking with the standard statistical method.
- To compare evidence-based medicine with improvement science.
- To provide an integral view of statistical thinking within improvement science.
- To explore the theory of measurement underlying improvement science.
- To highlight the informatics of improvement.
- To provide a conceptual understanding of the need for the control chart and the rules for its interpretation.
- To describe the various types of control charts.
- To provide a decision map for the use of control charts.
- To describe the routines for control charts using Microsoft Excel.

Statistics is the science of extracting information from a collection of data. The field of statistics as it is currently known to healthcare professionals is an offline, back-end activity, often limited to the analysis of data collected under the controlled conditions of a research project. The methods of statistics are thus used to compare the relative benefit of particular interventions (e.g. drugs, tests), and the results are traditionally expressed as ‘p’ values or confidence intervals. Such a traditional use of statistical methods is presumably an analysis of the cause-effect relationship and establishes the efficacy of the intervention. Such research, owing to its offline nature, has often been referred to as ‘meta research’.

Statistical thinking, on the other hand, views statistics as a science of prediction. The very purpose of analysis is the production of actionable data, by looking at data over time. Action in practice is progressive and dynamic, and thus not under controlled conditions. Such an analysis is oriented towards the change-effect relationship and establishes the effectiveness of the intervention. The tools used here are simpler; they are concerned with predicting results within limits and are analogous to the confidence interval approach. Such research, owing to its ‘real-time’ nature, is often referred to as improvement science, practice-based research, or simply ‘chrono research’.

The paradigm of evidence-based medicine is founded on the more rigorous but less friendly techniques of statistical methods, while the paradigm of improvement science uses statistical thinking as one of its essential modes of understanding practice and effecting valid change at the workplace. Even though a much more complex and problematic epistemology is involved in justifying its use, I have used the simplistic distinction of efficacy and effectiveness to encourage the reader. The theoretical assumption underlying statistical methods is ‘certainty’, i.e. an approximation towards truth, whereas statistical thinking is content with the pragmatic objective of what works and thus has a broader practical value.

As a parody of the statistical method, statistical thinking can be depicted thus:

In our section on the informatics of improvement we will use the above framework, borrowed from the work of Richard Scoville and David Kibbe in the mid-1990s at Future Healthcare in Chapel Hill, North Carolina.

However, statistical thinking rests on several theoretical and practical assumptions that differ from those of statistical methods, and these are schematically elaborated as:

The use of statistical thinking is relatively new in healthcare, though its origins in industry go back to the work of Walter Shewhart in the 1920s, later elaborated by W. Edwards Deming as a component of his ‘System of Profound Knowledge’: systems thinking, statistical thinking, psychology, and epistemology. This should clarify that improvement science involves more than statistical thinking. The precise methodology that integrates statistical thinking with the practice of improvement science is analogous to the scientific method and is well known as the PDSA cycle.

The PDSA cycle, known outside healthcare as PDCA, was intended as a systematic process for carrying out improvement on the industrial shop floor. It was meant to bear the role of the scientific method. An interesting description of the process was offered by progress ltc. At this point I wish merely to clarify the confusion between the words ‘do’ and ‘act’: ‘do’ is a verified plan, while ‘act’ is a validated plan, one that has demonstrated impact and is thus based on knowledge.

Based on the PDSA method, we define improvement science as the cumulative effectiveness of systems change under bounded predictability. The limited predictability may be the result of uncertainty, complexity, or limited resources. Statistical thinking addresses the component of uncertainty, or unwanted variation, in the process. The idea of bounded predictability is also a description of the nature of statistics as a field, as predictability works only for a group of data, over a short time span, and even then only within a reasonable set of boundaries or limits. This principle is the basis of the analysis of time series and of Shewhart’s conceptualisation of the control chart.

The American Society for Quality defines statistical thinking as: ‘Statistical thinking is a philosophy of learning and action based on the following fundamental principles: 1. All work occurs in a system of interconnected processes; 2. Variation exists in all processes; and 3. Understanding and reducing variation is key to success.’

The idea of actionable data has several assumptions:

- It is enumerative data, not historical data. Increasingly, the time frame for the data may be progressively decreased, for example from week, to day, to shift.
- That data is collected for a specific purpose, based on a practice hypothesis. [Improvement study design is based on system principles and thus always has a hypothesis. However, the methodology does not involve hypothesis testing. It is a common mistake to say there is no hypothesis in audit and improvement work. The improvement study is usually of a ‘before-after’ design.]
- The data sets are generally small and random sampling techniques may be used.
- That the data includes ‘contextual variables’, which in a standard research design would usually be excluded as confounding variables.
- Graphical Techniques are prominent in both analysis and display.
- Individual identification data may be retained in the analysis, for example physician, shift, etc. This may require stratification or risk adjustment.
- Data is viewed over time.
- The data is provided by the process.
- The data plotted on a control chart is the outcome not the process.
- Usually, the data collection process is terminated at the end of the project. Data for testing and data for monitoring have different purposes.
- While standard statistical methods are concerned with ‘quality of inference’, statistical thinking is also particular about ‘quality of data’.

The principles of data collection are not dealt with in this blog. The idea of cumulative impact underlines the fact that the effect is analysed formatively, not summatively. The classical use of summative analysis usually involves some form of hypothesis testing and controlled comparisons. The idea of cumulative impact also highlights the multifactorial approach to problem solving.

It is convenient to review the distinction between improvement science and traditional research:

Despite the emphasis on variation in the statistical thinking category, it is well to remember that the business of both the statistical method and statistical thinking is variation. The common textbook example I have used in the classroom to answer the question of why we need statistical manoeuvres is worth considering here:

Consider the following table of experiment results: the number of words recalled with and without a memorization technique.

By looking at the data, without doing any calculations, what conclusions can you draw about the results of the four different experiments?

We could employ various strategies to draw conclusions about the intervention compared to performance without the intervention. Our intuitions are not equally helpful in all four experiments. This has to do with the centering and spread of the datasets in each of the four experiments. It is likely that most people are convinced of the positive effect of the intervention in experiment 1 and the negative effect of the intervention in experiment 4. This would also be supported by comparing the averages.

Merely looking at the data alone may not have been useful in the case of experiments 2 and 3. Needless to say, this difficulty is due to the presence of variation in the datasets. It is this overlap, caused by the variation in the datasets, that requires us to use statistics to make sense of the results. If the four experiments were repeated, we would get different datasets and thus different averages.

Our intuitive conclusions would have been even more certain if there had been absolutely no overlap of the data in these cases. Statistical tests are useful for making sense of data in the presence of variation. If you are curious about what type of statistical test would be applicable, this YouTube video would be interesting: Choosing a statistical test.
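Since the recall table itself is not reproduced above, a small sketch with entirely hypothetical scores can illustrate the point: when the ranges of the two groups do not overlap (as in experiment 1), intuition suffices; when they do overlap, we need statistical tools. The experiment names and all numbers below are invented for illustration only.

```python
import statistics

# Hypothetical word-recall scores (the original table is not reproduced here):
# first list = recall without the technique, second = with it.
experiments = {
    "Experiment 1 (clear effect)": ([5, 6, 5, 6, 5], [12, 13, 12, 13, 12]),
    "Experiment 2 (overlap)":      ([5, 9, 6, 10, 7], [7, 11, 8, 12, 9]),
}

for name, (without, with_) in experiments.items():
    m0, m1 = statistics.mean(without), statistics.mean(with_)
    overlap = max(without) >= min(with_)  # crude check: do the ranges overlap?
    print(f"{name}: mean without={m0:.1f}, with={m1:.1f}, ranges overlap={overlap}")
```

In the first case intuition and averages agree with no overlap; in the second, the overlap created by variation is exactly what forces us to reach for a formal statistical test.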

**Theory of Measurement in Improvement Science:** The idea of statistical thinking as cumulative effectiveness.

Most tools of quality improvement begin as concepts and over a period of time evolve into tools. Measurement can be distinguished as ‘measurement for judgement’ or ‘measurement for improvement’. Traditional quality assurance and conventional research fall into the first category, while improvement science belongs to the latter. The improvement philosophy underlying the former is ‘trimming the tails’, while the latter is concerned with ‘shifting the curve’. The underlying theory of ‘trial and learn’ serves to drive the process of improvement.

Computer surveillance of the electronic medical record will soon be able to identify three patterns of deficient care, only the first of which was identifiable in the past.

The first pattern—a dramatic deviation from the expected outcome, as in the sentinel event of unexpected death—was the subject of historical quality assurance. The second—a reasonable outcome despite a less-than-optimal process—is a warning that the system is not in control and may deteriorate further, resulting in an adverse event or a pattern of preventable morbidity. As an example, consider increasing delays in emergency department triage time. At first, the increased delays are merely disturbing; however, should a critically ill patient present, the previously disturbing delay can prove catastrophic. The delay is a latent error—an accident waiting to happen.

The third category—a reasonable outcome with an acceptable process that might be further improved beyond the prevailing standard—will soon be identifiable with computerized clinical tracking. As an example, there has been a common practice of keeping patients with uncomplicated community-acquired pneumonia in the hospital on intravenous antibiotics for 1 day after their fever has remitted. Given that few of these patients require intervention on their last day, a careful review would show that an earlier discharge would arguably be appropriate, thus lowering costs and reducing the risks of iatrogenic complications.

We can represent a data set both as a histogram and as a run chart. The histogram is convenient and informative for particular purposes, such as comparisons. However, its disadvantage is that it provides only a cross-section of the data and thus hides the sequence.

The run chart can schematically be viewed as a histogram turned on its side, with the same data displayed over time. Time-based data provides us a chance to intervene. Vahe Kazanjian of the Maryland Quality Improvement Collaborative often uses an analogy to emphasize this point. Imagine a shooting target with the following result:

While we can make a judgement, based on the location of the bullet marks, as to whether the person was a good shot or not, the availability of the sequence would have allowed us to correct our shooting. If black was the first shot, then the shooter has been improving. Looking at data over time creates room for action.

The above two examples are given only to emphasize the importance of time-based data and are not offered as a methodological concept. Most processes are by nature stochastic and thus require specific tools to analyze.

Having said this, statistical thinking gives us a way of describing the process in terms of variation. But this requires a richer description of variation in general. Shewhart’s thesis of the function of the control chart is to distinguish between variation that is always present in the process and variation that is idiosyncratic; he uses the terms common cause and special cause. The control chart is often used as the totem for the philosophy of statistical thinking. The central question of statistical process control is: given data with variation, how can we establish thresholds for action that accurately distinguish special cause from common cause variation?
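A minimal sketch of how such thresholds are derived, assuming an individuals (XmR) chart, the simplest of the Shewhart charts: the limits sit three sigma from the centre line, with sigma estimated from the average moving range via the conventional 2.66 constant. The triage-delay data and the function name are hypothetical, for illustration only.

```python
import statistics

def xmr_limits(data):
    """Centre line and 3-sigma control limits for an individuals (XmR)
    chart, estimated from the average moving range. Points outside the
    limits signal special-cause variation; points inside reflect the
    common-cause variation always present in the process."""
    centre = statistics.mean(data)
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    avg_mr = statistics.mean(moving_ranges)
    # 2.66 is the standard XmR constant (3 / d2, where d2 = 1.128 for n = 2)
    return centre - 2.66 * avg_mr, centre, centre + 2.66 * avg_mr

# Hypothetical daily emergency-department triage delays (minutes):
delays = [12, 15, 11, 14, 13, 16, 12, 30]
lcl, centre, ucl = xmr_limits(delays)
specials = [x for x in delays if x > ucl or x < lcl]
print(f"LCL={lcl:.1f}, centre={centre:.1f}, UCL={ucl:.1f}, special causes={specials}")
```

Here the final delay of 30 minutes falls beyond the upper limit computed from the process's own baseline, which is precisely the distinction between idiosyncratic and ever-present variation that Shewhart's chart is designed to make.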

Brian Joiner explores this idea in his classic, Fourth Generation Management.

Here it may be useful to introduce two approaches to establishing a limit or a threshold. The first is a standard or norm, which is usually applicable to all places and times; the other is a reference threshold, which is limited to a particular place and time. The control chart is not a diagnostic tool but rather a tool for monitoring the health of the process and for assessing interventions on the process. Most physicians and nurses have at some point come across or used a growth monitoring chart, and thus one could, for pedagogical convenience, draw an analogy to the children’s growth monitoring chart, even though its curves represent percentiles and are derived from population-based studies. The WHO growth charts use standards-based thresholds and thus fix the expected growth profiles for the different age groups. On the other hand, control chart thresholds are an example of reference thresholds and are derived locally from the prior or baseline performance of the process.

**Using Data to Represent Systems:**

The system of interest can be represented in several ways. A flow chart is a representation of the system. A histogram is also a representation of the system. A fishbone diagram is a representation of the system. The run chart and the control chart are representations of the system. An organization chart and a team are also representations of the system.

Data can be used to represent two properties of systems: first, coherence or interdependence, and second, flow.

Representation of Coherence or Process dispersion:

A. Fish-Bone Diagram.

B. Histogram.

Representations of flow:

A. Process Flow Chart.

B. Control Chart.

Efficiency is a part of any scientific method.


Most clinicians have had sufficient exposure to statistical ideas during their school days. Thus students in the initial years of medical and nursing education have a greater aptitude for statistics than in their later years. It could therefore be advantageous to introduce statistical thinking in the initial years of clinical education.

One of the modus operandi in this blog is the use of small data sets so that the reader can visualize the impact of the manipulations.

In understanding the cognitive development of statistical thinking among clinicians, we can learn much from frameworks for understanding the evolution of statistical reasoning in children. As much of this blog promotes a data-based approach to statistics, we will follow the framework of Jones et al.:

a. Reading the data.

b. Reading between the data.

c. Reading beyond the data.

In this post we will address some preliminary etiquette to get by:

1. Scientific Notation.

2. Statistical terms and their Greek symbols.

3. Conventions in grouping data.

4. Arithmetic manipulation of Percentages.

5. Standardization Maneuvers.

In building frequency distributions there is some confusion about class intervals. A class interval such as 60-62 is a symbol for that particular class: 60 is called the lower class limit and 62 the upper class limit. However, what happens if we have a value of 62.5? In practice we therefore use what are called class boundaries, or true class limits, obtained as the mean of adjoining limits: (62 + 63)/2 = 62.5, so the class 60-62 has boundaries 59.5-62.5. The real issue is that class boundaries should not coincide with actual observations. If an observation were 62.5, it would not be possible to decide whether it belongs to the class with boundaries 59.5-62.5 or to the adjacent class with boundaries 62.5-65.5. The class midpoint, or class mark, is computed by adding the upper and lower limits and dividing by 2; for mathematical purposes all members of a class are assumed to have the value of the midpoint. Thus all members of the class 60-62 are given the value 61. This often leads to grouping error, as there is loss of information due to grouping. The error is minimised by choosing class intervals such that observations coincide with the midpoints.
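The boundary and midpoint rules above can be sketched as two small helper functions (the names are mine, and the half-unit adjustment assumes the limits are recorded to whole units, as in the 60-62 example):

```python
def class_boundaries(lower, upper):
    """True class limits for an interval like 60-62, assuming unit gaps
    between adjacent recorded limits (62 to 63): each boundary is the
    mean of the adjoining limits, e.g. (62 + 63) / 2 = 62.5."""
    return lower - 0.5, upper + 0.5

def class_midpoint(lower, upper):
    """Class mark: for grouped calculations, every member of the class
    is treated as having this value (hence the grouping error)."""
    return (lower + upper) / 2

print(class_boundaries(60, 62))  # (59.5, 62.5)
print(class_midpoint(60, 62))    # 61.0
```
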

Observations should not coincide with class boundaries but should coincide with class midpoints.

**Understanding Sampling Distributions:**

Let us first assume that we have access to a population that has a normal distribution. Imagine you take a sample of, say, size 10. You can calculate various descriptive statistics for this sample, such as the mean, standard deviation, median, or variance. The sample size is 10 and the number of samples is 1.

Now if you take 20 such samples, each of sample size 10 as above, we can calculate the descriptive statistics such as the mean, median, etc., as we did for the first sample. Say we calculate the mean alone. We now have the means of 20 samples, each of which has a sample size of 10. Please do not confuse the sample size [here 10] with the number of samples [here, at present, 20].

If we plot the means of the 20 samples, we have what is called a sampling distribution. This is a theoretical distribution in which the number of samples can be imagined to extend from 20, in our case above, to infinity.

There are several important features of a sampling distribution.

As the number of samples increases from 20 towards infinity, the sampling distribution will approximate the normal curve. This is not surprising, as in our case we have been sampling from a population with a normal distribution. What is interesting is that even if the population does not have a normal distribution, as the sample size increases the sampling distribution will still approximate a normal distribution. This behaviour of the sampling distribution is called the central limit theorem and is the fundamental idea in statistics.
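A brief simulation can make this concrete. The sketch below (in Python, purely for illustration; the function name and seed are my own choices) draws repeated samples from a uniform, i.e. decidedly non-normal, population on [0, 1) and shows that the sample means nonetheless pile up around the population mean of 0.5:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

def sampling_distribution(sample_size, n_samples, draw=random.random):
    """Means of repeated samples drawn from a uniform population on
    [0, 1): by the central limit theorem these means cluster around
    the population mean, more tightly as the sample size grows."""
    return [statistics.mean(draw() for _ in range(sample_size))
            for _ in range(n_samples)]

means = sampling_distribution(sample_size=30, n_samples=1000)
print(f"mean of the sample means ≈ {statistics.mean(means):.3f} (population mean 0.5)")
print(f"sd of the sample means   ≈ {statistics.stdev(means):.3f} (the standard error)")
```

Plotting a histogram of `means` would show the familiar bell shape, even though no single observation was drawn from a normal distribution.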

Any statistic of the sample can have its own sampling distribution. The above description is of the sampling distribution of the mean; we can similarly have the sampling distribution of the median, the sampling distribution of the standard deviation, etc. The mean, being the most mathematically malleable statistic, is the one usually used.

There are several derivations from the sampling distribution that are used for statistical testing, but let us now pause and better understand the behaviour of the sampling distribution.

From the above discussion we can identify several variable components:

The size of the sample: here 10, but it could be any practical number.

The number of samples: we began with one, then increased to 20; theoretically this can be extended to infinity.

The particular statistic computed from the sample: we computed the mean, but we could use the median, etc.

The distribution of the population: here we assumed we knew it to be a normal distribution, but we have stated, as the central limit theorem, that the population distribution is immaterial; the sampling distribution will approximate the normal distribution as the sample size increases.

Now, this is only a theoretical exercise: in practice we do not take an infinite number of samples, not even twenty samples, but only one sample. What is in your control is only the sample size, not the number of samples, which we assume to be infinite.

Now, the sampling distribution not only gives us an assurance of a normal distribution, which is a required assumption of all parametric tests, but also provides us with a new statistic called the standard error.

The sampling distribution is a distribution of sample means. We can thus calculate the mean and standard deviation of the sampling distribution as well. The standard deviation of the sampling distribution is called the standard error. The central limit theorem also provides us with other assurances:

The mean of the sampling distribution will approximate the mean of the population.

The standard deviation of the population can be derived from its relationship to the standard error: SE = SD/√N, so SD = SE × √N. Here N is the sample size, not the number of samples.

Now the question is: what happens if we increase the sample size?

We will explore the effect of a change in sample size on the standard error with the help of a simulation in Excel.
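For readers without Excel at hand, the same simulation can be sketched in Python (the function name, seed, and parameters are my own choices, not part of the original exercise). It shows the simulated standard error tracking SD/√N as the sample size grows:

```python
import math
import random
import statistics

random.seed(1)  # reproducible sketch

def standard_error_by_simulation(sample_size, n_samples=2000, sd=10.0):
    """Empirical standard error: the sd of the means of many samples
    drawn from a normal population with mean 0 and the given sd."""
    means = [statistics.mean(random.gauss(0, sd) for _ in range(sample_size))
             for _ in range(n_samples)]
    return statistics.stdev(means)

for n in (4, 16, 64):
    se_sim = standard_error_by_simulation(n)
    se_theory = 10.0 / math.sqrt(n)  # SE = SD / sqrt(N), N = sample size
    print(f"n={n:3d}: simulated SE={se_sim:.2f}, SD/sqrt(N)={se_theory:.2f}")
```

Each quadrupling of the sample size roughly halves the standard error, which is the √N relationship the exercises below ask you to discover.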

Determine how the standard error is affected by sample size. Plot the standard error of the mean as a function of sample size for different standard deviations. Can you discover a formula relating the standard error of the mean to the sample size and the standard deviation? If so, see if it holds for distributions other than the normal distribution.

Redo #2 above for the median. Find a distribution/sample size combination for which the sample median is a biased estimate of the population median. Is the sample variance an unbiased estimate of the population variance? If not, see if you can find a correction based on sample size. Does the correction hold for distributions other than the normal distribution?

A statistic is unbiased if the mean of the sampling distribution of the statistic is the parameter. Test to see if the sample mean is an unbiased estimate of the population mean. Try out different sample sizes and distributions.

For what statistic is the mean of the sampling distribution dependent on sample size?

For a normal distribution, compare the size of the standard error of the median and the standard error of the mean. Find a relationship that holds (approximately) across sample sizes.

Does this relationship hold for a uniform distribution?

Find a distribution for which the standard error of the median is smaller than the standard error of the mean. (You may find this difficult, but don’t give up.)

Compare the standard error of the standard deviation and the standard error of the mean absolute deviation from the mean (MAD). Does the relationship depend on the distribution?


1. In the exploration of most concepts in life and in science, processes can be characterised as one of two types: A. deterministic processes; B. random processes. Random processes produce random variables; the random variable is also referred to as a stochastic variable. Deterministic processes produce deterministic variables.

1.1. A random process is one in which the outcomes are entirely random or unpredictable. The outcome is unknown and not fixed, but can take one of a set of values depending on the process producing it. A coin toss can have one of two values; a child’s sex at birth also has two possibilities, male or female; a roll of a die any one of six possibilities; and the disposition of a patient at discharge can have X values with probability p(X), expressed as relative frequencies stabilising in the long run.

2. A deterministic variable simply means one that is not random. A deterministic process behaves as a mathematical function and thus repeatedly produces a predictable outcome depending on the particular inputs.

2.1. However, even though deterministic processes produce predictable outcomes [i.e. deterministic variables], there is always a small degree of random behaviour: for example, due to error in the measuring instrument, otherwise called measurement error. Historically, it is the analysis of this error, the theory of errors, that evolved into the field of statistics.

3. In the case of purely random processes and that small part of deterministic processes made up of random errors, the variables cannot be characterized individually, but will follow some probability distribution.

3.1. The value of the random variable can never be known in advance. However, we can know the set of possible values: e.g. in the roll of a die, the possibilities are 1 to 6, as discussed in 1.1. This information is given to us by the probability distribution of the random variable, made up of the long-run relative frequencies of each possible value.

3.2. Such a probability distribution will also describe the relative frequencies of occurrence of different-sized errors in the case of the random component of deterministic processes.

4. QUANTITATIVE RANDOM VARIABLES

A random quantitative variable results when numerical values are assigned to results of measurement or counting. It is called a discrete random variable if the assignment is based on counting. It is called a continuous random variable if the numerical assignment is based on measurement.

The numerical continuous random variable can be expressed as fractions and decimals. The numerical discrete random variable can only be expressed as whole numbers.

4.1. QUALITATIVE RANDOM VARIABLES

Qualitative variables (nominal, ordinal, and ranked) are attribute or categorical with no intrinsic numerical value. The nominal has no ordering for example male or female. The ordinal has ordering for example first class, second class, third class. The ranked has observations arrayed in ascending or descending orders of magnitude for example 1st, 2nd, 3rd.

4.2. RANDOM VARIABLES: MATHEMATICAL TRANSFORMATIONS

Quantitative variables can be transformed into qualitative ones. Qualitative variables can be transformed into quantitative ones but this is less desirable. The continuous variable can be transformed into the discrete variable. Transformation of the discrete into the continuous may be misleading.

5. RANDOM VARIABLES: PROPERTIES

A random variable has six properties. The expectation, or average, of a random variable is a central value around which it hovers most of the time. The variation of the random variable around the average is measured by its variance. Covariance measures the co-variability of two random variables. Correlation measures the linear relation between two random variables. Skewness measures the bias of the distribution of the random variable away from the center. Kurtosis measures how peaked the distribution of the random variable is at the point of its expectation.


Expectation also differs from probability: probability is the chance of an event happening, while expectation calculates the long-run value of the probabilistic event. It closely approximates the sample mean as the sample size approaches infinity. We usually deal with samples from a population, which is theoretically of infinite size: a population of hypertensives would be considered infinite because it refers not only to all hypertensives in the present, whom we cannot for practical reasons count, but in principle also to all hypertensives in the past and in the future. It is in this sense that the expected value [which arithmetically is an average] is considered the same as the population mean.

If one is dealing with outcomes of equal probability, then the expected value is merely the simple average of the values attached to the outcomes. If the probabilities differ, then the expected value is computed as the weighted average of the values, each weighted by its probability.

Expectation is the value in the long run. It is a function of two variables: the probability of the event and the payoff attached to the event. If one tosses a coin a reasonably sufficient number of times, we know the probability of getting a head or a tail approximates 0.5. If we were to win one rupee for each occurrence of heads and nothing for tails, then in the long run the expectation would be 0.5, i.e. [1 × probability of heads + 0 × probability of tails].

Let us use a simple example to calculate expectation:

The probability of occurrence of each face of a six-sided die is 1/6. Let us agree to attach a value equal to the face of the die: thus 1 [say, rupee] if the side with 1 turns up, 2 if the side with 2 turns up, and so on up to 6. If we calculate the expectation: 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 21/6 = 3.5.
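The same calculation can be written as a short sketch (the helper name is mine, for illustration); using fractions keeps the arithmetic exact:

```python
from fractions import Fraction

def expectation(values_and_probs):
    """Expected value: the weighted average of the payoffs,
    each weighted by its probability."""
    return sum(v * p for v, p in values_and_probs)

# Fair six-sided die, payoff equal to the face shown:
die = [(face, Fraction(1, 6)) for face in range(1, 7)]
print(expectation(die))  # 7/2, i.e. 3.5

# Coin bet: win 1 rupee on heads, nothing on tails:
coin = [(1, Fraction(1, 2)), (0, Fraction(1, 2))]
print(expectation(coin))  # 1/2
```
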

The only situation in which the probability of an event and the mathematical expectation of the event are the same is when considering the indicator random variable of the event. Probability can be defined as the expectation of an indicator random variable.

**“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write”**

**~ HG Wells (1866-1946)**

Existentially, we are ‘thrown’ into uncertainty. Yet we must necessarily make choices, continuously reducing the margin of error in our decisions and actions.

There are several reasons why I believe that this blog would be of interest to you:

- Developments in the field of statistical pedagogy have come to recognize a separate domain of statistical thinking, besides the traditional notion of statistical methods. This revolution can be summarized as follows:
  a. The field of exploratory data analysis, which gives priority to visual tools to gain insight from data.
  b. The availability and use of technology, which has made statistical methods more accessible to all. This has made possible an earlier exposure to advanced techniques like regression, but not without the risk of mindless misapplication and disregard for the quality of data.
  c. The development of simpler statistical methods in statistical process control, which has further democratized the use of statistics in real-time practice. This has implications for medicine, which has traditionally seen statistics purely as a tool in medical research.

- The fields of cognitive and developmental psychology have contributed greatly to the understanding of how we make sense of data, though they do not always agree! Statistical thinking in a broader sense is also a favorite contender among contemporary models of general reasoning in cognitive science.

These developments offer the greatest advantages to introductory courses in statistics, and yet there has been very little impact on the teaching of statistics in medical and healthcare institutions. Furthermore, the application of statistics in clinical practice is still not as widespread as it should be. The aim of this blog is the advocacy of statistical thinking in medicine and healthcare.

I have been a keen student of these developments for quite some time now and I wish to share this ‘basic’ knowledge with medical and other healthcare students and practitioners. The position I assume in this exercise is as a fellow student and I hope professionals in statistics, psychology and informatics would be willing to contribute to the discussion, as well as correct any errors in my understanding.

The strength I hopefully bring to this blog is a broad-based interest in the psychology, history, and philosophy of statistics, of science, and of information, and the conviction that statistical thinking is indispensable to all parts of medical and healthcare science and practice. I hope to pull together as many resources as would be invigorating to anyone beginning their journey of self-learning in statistical thinking. To travel along, all you need is a little enthusiasm and an open mind.