Guest Blog: Professor Stephen Senn
About The Author
Currently Head of the Competence Centre for Methodology and Statistics with the Luxembourg Institute of Health, Stephen Senn was previously Professor of Statistics at a number of UK universities as well as holding senior statistical roles in the Swiss pharmaceutical industry.
He has written numerous papers in the area of healthcare statistics and three books: Statistical Issues in Drug Development (Wiley: 1997, 2007), Dicing with Death (Cambridge: 2003) and Crossover Trials in Clinical Research (Wiley: 1993, 2002).
He is recognised as a leading global authority on clinical biostatistics. Stephen Senn is a former member of the Statsols statistical advisory board and he currently serves as a consultant to our statistical development team.
N-of-1 trials are the application of the machinery of randomised clinical trials to the individual patient. Episodes of treatment, rather than individual patients, become the fundamental unit of inference, and patients are repeatedly randomised to an experimental and a control treatment (say) in order to establish the effect of treatment. Popular designs involve cycles of episodes of treatment, with the patient being assigned to a random order within cycles. Usually only two treatments are compared and so cycles of pairs of episodes are employed. If patients are treated in k cycles, this means that 2ᵏ possible sequences can be used.
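The count of possible sequences can be checked by enumeration. The following is a minimal Python sketch, assuming two treatments labelled A (experimental) and B (control) and an independent coin toss for the order within each cycle; the labels and function names are illustrative and not taken from any particular trial software.

```python
import itertools
import random

def all_sequences(k):
    """List every allocation for k cycles of paired episodes.

    Each cycle is either 'AB' or 'BA', so there are 2**k sequences.
    """
    return ["".join(cycles)
            for cycles in itertools.product(("AB", "BA"), repeat=k)]

def random_sequence(k, rng=random):
    """Draw one allocation, randomising the order independently per cycle."""
    return "".join(rng.choice(("AB", "BA")) for _ in range(k))

k = 3
sequences = all_sequences(k)
print(len(sequences))                        # number of sequences, 2**k
print(random_sequence(k, random.Random(1)))  # one randomly drawn allocation
```

Passing a seeded `random.Random` instance makes the allocation reproducible, which is convenient for checking a randomisation list.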
Within the medical literature, emphasis has been on the independent analysis of the effect of treatment for each patient. This may seem intuitively reasonable but it overlooks what is done more widely in clinical research. When parallel group trials are run, no patient receives both the experimental and the control treatment; each is assigned to one of the two at random. Despite this, we believe that such trials can deliver useful information, and this in turn implies that averages have value. Nobody denies that the effect of treatment may vary from patient to patient but, nevertheless, it is in the belief that the effects often show some similarity that we do research at all.
In fact, there are a number of purposes of n-of-1 trials and for none of these is it a good idea to analyse patients independently. For the first, we simply regard repeatedly treating patients as a useful way of obtaining more information. This can be particularly valuable for rare chronic diseases where repeated experimentation on the same patients is a way of making up for the fact that patient numbers are small. For this purpose, just as would be the case for a parallel group trial, the main emphasis is on the overall average effect. The fact that this effect itself may differ from patient to patient may be ignored by choice just as it would be ignored by necessity in a parallel group trial. The objective is to see whether it can be demonstrated that there is a difference between the average effects of treatments being compared.
For this purely causal purpose the treatment-by-patient interaction can be removed from any estimate of the variance of the overall treatment effect. Since pure between-patient effects are removed by virtue of each patient acting as his or her own control, what is left is within-patient, within-cycle error from episode to episode. If the variance of this error is σ², then the variance of a given within-cycle contrast is 2σ², the variance of the mean of k cycles is 2σ²/k and, finally, the variance of the mean of these over n patients is 2σ²/(nk). This, together with the fact that the unknown variance σ² can itself be estimated using n(k-1) degrees of freedom, forms the basis for any power calculation.
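As a rough illustration of how this variance feeds a power calculation, here is a minimal Python sketch. It assumes a two-sided test and uses a Normal approximation in place of the t-distribution with n(k-1) degrees of freedom, so it will be somewhat optimistic for small n and k; the function names and the example values of delta, sigma2, n and k are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def variance_fixed(sigma2, n, k):
    """Variance of the overall mean contrast: 2*sigma^2 / (n*k)."""
    return 2.0 * sigma2 / (n * k)

def approx_power(delta, sigma2, n, k, alpha=0.05):
    """Approximate two-sided power for a true mean difference delta.

    Normal approximation (assumption): a t-based calculation with
    n*(k-1) degrees of freedom would give slightly lower power.
    """
    se = sqrt(variance_fixed(sigma2, n, k))
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

n, k = 12, 3                          # hypothetical design
print(variance_fixed(1.0, n, k))      # variance of the estimated mean effect
print(n * (k - 1))                    # degrees of freedom for estimating sigma^2
print(approx_power(0.5, 1.0, n, k))   # approximate power at delta = 0.5
```

Because n and k enter only through the product nk, this approximation treats extra cycles and extra patients as interchangeable for this purpose.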
For constructing confidence intervals for the average effect, however, it might be more reasonable to regard each patient as providing a random witness to the overall effect. Since different patients might have been studied, the variance of the true treatment effect from patient to patient now becomes relevant. If we call this variance Ψ² then, if we could establish the true treatment effect for each of n patients and assuming independence, the mean of these effects would have variance Ψ²/n. In practice, however, we do not know the true value but can only estimate the treatment effect for a given patient, and if this is done using k cycles then it is estimated with a variance of 2σ²/k. Putting these two together, it then follows that an estimate of the overall mean based on n patients, each of whom has been studied in k cycles, has a variance of (Ψ² + 2σ²/k)/n.
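The practical consequence of this formula is that adding cycles cannot push the variance below the between-patient component divided by n. A minimal Python sketch, assuming both variance components are known and using a Normal quantile for the interval (both simplifying assumptions; the function names and numbers are illustrative only):

```python
from math import sqrt
from statistics import NormalDist

def variance_random(psi2, sigma2, n, k):
    """Variance of the overall mean when patients are a random sample.

    psi2   : variance of the true effect from patient to patient
    sigma2 : within-patient, within-cycle error variance
    """
    return (psi2 + 2.0 * sigma2 / k) / n

def ci_halfwidth(psi2, sigma2, n, k, level=0.95):
    """Half-width of a Normal-theory interval for the average effect."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sqrt(variance_random(psi2, sigma2, n, k))

# Hypothetical values: more cycles help, but with diminishing returns,
# because the psi2 / n term is untouched by k.
for k in (1, 2, 4, 8):
    print(k, variance_random(0.5, 1.0, 10, k), ci_halfwidth(0.5, 1.0, 10, k))
```

Here, unlike the purely causal case, doubling n and doubling k are not equivalent: only more patients reduce the contribution of the between-patient variance.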
Finally, if we are interested in making inferences about a given patient in a situation in which many patients have been previously studied, we have a choice between two radically different approaches. One is to use the estimate based on the patient’s values only. If the patient has been studied in k cycles, then we have an estimate with a variance of 2σ²/k. At the other extreme we have the average treatment effect based on patients previously studied. If this estimate has been based on a large sample of patients, the uncertainty of the overall mean can be ignored but, even where this is so, it will provide an uncertain prediction of the effect in a given patient, since such effects vary from the average with variance Ψ². Clearly, which is better depends on the relative size of the two variances and the number of cycles in which the given patient has been studied. In fact, it is possible to produce a weighted average of the two that is superior to either. The two weights are
Ψ²/(Ψ² + 2σ²/k) for the patient’s own estimate and (2σ²/k)/(Ψ² + 2σ²/k) for the overall average,
so that, other things being equal, the larger the overall true variation Ψ² from patient to patient the greater the weight that applies to the given patient’s estimate but, other things being equal, the greater the within-patient variation σ² the more weight applies to the overall average. The more cycles that can be studied, the greater the weight given to the individual’s own estimate. Thus, a good estimate of the effect for any given patient requires appropriate weighting of the global and the individual.
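The weighting just described can be sketched in a few lines of Python. This assumes that the weights are inversely proportional to the two variances involved (2σ²/k for the patient's own estimate and the between-patient variance for the overall average) and that the overall mean is known without error; the function and argument names are hypothetical.

```python
def shrinkage_weights(psi2, sigma2, k):
    """Return (weight on patient's own estimate, weight on overall average).

    psi2   : variance of the true effect from patient to patient
    sigma2 : within-patient, within-cycle error variance
    k      : number of cycles in which the patient was studied
    """
    within = 2.0 * sigma2 / k            # variance of the patient's own estimate
    w_patient = psi2 / (psi2 + within)
    return w_patient, 1.0 - w_patient

def shrunk_estimate(patient_mean, overall_mean, psi2, sigma2, k):
    """Weighted average of the individual and the overall estimates."""
    w_patient, w_overall = shrinkage_weights(psi2, sigma2, k)
    return w_patient * patient_mean + w_overall * overall_mean

# Hypothetical numbers: equal variances give equal weights.
print(shrinkage_weights(1.0, 1.0, 2))
print(shrunk_estimate(4.0, 2.0, 1.0, 1.0, 2))
```

Increasing k or psi2 moves the result towards the patient's own mean, while increasing sigma2 moves it towards the overall mean, in line with the behaviour described in the text.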
Such a weighted estimate is often referred to as a shrunk estimate since it shrinks the patient’s individual estimate towards the overall mean. It has a variance equal to the reciprocal of the sum of the reciprocals of the two variances. Working through the algebra, this gives a variance of
2σ²Ψ²/(kΨ² + 2σ²).
In short, due to their more complex nature, n-of-1 trials permit one to answer various questions. Knowing which question one wishes to answer is an important part of developing any plan and of calculating a target sample size.
Statsols recently hosted a guest lecture by Professor Stephen Senn of the Luxembourg Institute of Health.
Title: 'Response, Quality and Variation - What Drug Development May Be Missing'
Location: Hayfield Manor Hotel, Cork, Ireland