nQuery Head of Statistics & Lead Researcher, Ronan Fitzpatrick recently sat down with Professor Stephen Senn to talk about Effect Size & Sensitivity Analysis.
This video is an excerpt from a feature length video titled "nQuery Interviews Professor Stephen Senn". The full video interview is available to watch on-demand by clicking here.
Ronan Fitzpatrick is Head of Statistics at Statsols and the Lead Researcher for nQuery Sample Size Software. He is a guest lecturer for many institutions including the FDA.
Stephen Senn is a statistical consultant for the Pharmaceutical Industry. Stephen has worked as an academic in a statistical capacity at the Luxembourg Institute of Health, University of Glasgow and University College London. A leader of one of the work packages on the EU FP7 IDEAL project for developing treatments in rare diseases, his expertise is in statistical methods for drug development and statistical inference.
Ronan: So one other component of Sample Size Determination is selecting your Effect Size, and this is often one of those cases where there are differing views. Sometimes people reverse calculate this, and there's some stuff around that, but more importantly I felt there are probably two major schools of picking an effect size in a more principled manner. One would be picking it as the expectation of what you think is going to happen. But I think you generally come from another point of view: that it should be based on clinical significance, that it has some relationship to what you want to happen in a trial, what success would mean to the trial, and maybe a lower bound or a disappointment bound on what you want the effect size to be. So do you want to discuss how discussing the effect size can help you design a better trial, using it within the sample size calculation process?
Stephen: Yes, I think that many people imagine that the effect size is somehow a property of the drug, and of course you're trying to estimate what the drug will do, so it's understandable why they fall into what I consider an error. I prefer to think of it as being a feature of the disease, a property of the disease. In other words, if you're studying a particular type of hypertension, then for that particular type of hypertension there should be a particular effect size that many people agree, or we hope they agree, is roughly important. For example, it might be 10 millimeters of mercury if diastolic blood pressure is being measured, or something like that. Something in that particular area might be what one was interested in, and if that is the case, essentially it's a yardstick against which you're measuring the candidate treatments.
You don't change the yardstick just because you fear the treatment will change. It's basically a way of separating the wheat from the chaff, and it should be a function of the disease area. The way in which I like to think of it is as the difference I would not like to miss. When I was working in the pharmaceutical industry I used to tell the young statisticians working for me that the way to think of it is like this: if the trial fails, then probably this treatment will never be studied again. This drug will be lost to mankind forever.
At what point does that fact begin to hurt? At what point does it begin to hurt that it will be lost forever? If you set the effect size far too large, if you demand something absolutely spectacular, then inevitably what will happen is that a large number of treatments which are moderately interesting will never be studied again. So you have to think of it that way.
On the other hand, if you make it ridiculously small you'll spend huge amounts of resources looking for stuff that might not be interesting in the end. It's a judgment call, but in my opinion it's a feature of the disease and not of the drug. Now to pick up on your second point, let's return to the business of the effect size as a feature of the drug. Yes, I think there's a sense in which it can be useful, but not for the purpose of fixing the sample size; rather for the purpose of deciding whether the project is worth doing at all.
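The trade-off Stephen describes between demanding too large a Delta and chasing a ridiculously small one can be made concrete with a minimal sketch. It assumes a two-arm parallel trial with a continuous outcome analysed by a two-sided z-test with known standard deviation; the standard deviation of 15 mmHg, the 5% significance level, the 90% power and the function name `n_per_arm` are illustrative assumptions, not values from the interview. Only the 10 mmHg figure comes from Stephen's example.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sigma, alpha=0.05, power=0.9):
    """Patients per arm for a two-sided, two-sample z-test (normal approximation).

    delta: the difference one would not like to miss (same units as sigma).
    sigma: assumed common standard deviation (a hypothetical planning value).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


# Hypothetical planning values: sigma = 15 mmHg, Delta from 2.5 to 20 mmHg.
for delta in (2.5, 5.0, 10.0, 20.0):
    print(f"Delta = {delta:>4} mmHg -> {n_per_arm(delta, sigma=15.0)} per arm")
```

Because the sample size scales with one over Delta squared, halving Delta roughly quadruples the number of patients, which is the resource cost Stephen refers to when Delta is set too small.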
So the way in which I like to think of it is that we choose Delta based on what will be interesting in this particular area. Now we ask people, OK, what do you actually think it is for this particular drug, what's it likely to be? Then of course there will be a distribution that one might think of, and there we're very naturally drawn to a Bayesian approach. Assurance is one of the methods, which of course is incorporated in nQuery, and it basically allows you to consider power integrated over all the possible values that there will be for Delta, for the true treatment effect. If everybody was certain it would be absolutely equal to something, there'd be no problem, because that something would be either above or below what is considered to be the clinically relevant difference. But if it is a distribution, then there'll be some probability it's above and some probability it's below. So now what we do is we calculate the integrated power, and that gives us, in a sense, the probability that the trial will succeed. That can be used to decide to cancel the project or to approve the project, but it should not in my opinion be used as a decision to increase the sample size, because that would then have the effect that if the effect was believed to be moderate we would then spend more money, and that strikes me as illogical.
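The idea of power integrated over the possible values of the true treatment effect can be sketched as follows. This is a minimal illustration, not nQuery's implementation: it assumes a two-arm z-test with known standard deviation, a normal prior for the true effect, and "success" defined as a significant result in favour of the treatment. All numerical values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def power(delta, n, sigma, alpha=0.05):
    """Probability of a significant result in favour of treatment at true effect delta."""
    se = sigma * sqrt(2 / n)  # standard error of the difference, n per arm
    return STD.cdf(delta / se - STD.inv_cdf(1 - alpha / 2))


def assurance(prior_mean, prior_sd, n, sigma, alpha=0.05, steps=2000):
    """Power averaged over a normal prior for the true effect (midpoint rule)."""
    prior = NormalDist(prior_mean, prior_sd)
    lo = prior_mean - 8 * prior_sd
    width = 16 * prior_sd / steps
    total = 0.0
    for i in range(steps):
        d = lo + (i + 0.5) * width
        total += power(d, n, sigma, alpha) * prior.pdf(d) * width
    return total


def assurance_closed_form(prior_mean, prior_sd, n, sigma, alpha=0.05):
    """Closed form for the same quantity under the same normal assumptions."""
    se = sigma * sqrt(2 / n)
    z = STD.inv_cdf(1 - alpha / 2)
    return STD.cdf((prior_mean - z * se) / sqrt(prior_sd ** 2 + se ** 2))


# Hypothetical inputs: 48 per arm, sigma = 15, prior for the effect N(10, 5^2).
print("power at the prior mean:", round(power(10.0, 48, 15.0), 3))
print("assurance (numerical):  ", round(assurance(10.0, 5.0, 48, 15.0), 3))
print("assurance (closed form):", round(assurance_closed_form(10.0, 5.0, 48, 15.0), 3))
```

With these inputs the assurance is lower than the power evaluated at the prior mean, because the prior puts weight on effects smaller than the one the trial was sized for. In Stephen's framing, that number informs whether to approve or cancel the project, not how large the sample should be.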
Ronan: To be fair, effect size is kind of acting as the bridge between the statisticians and the people who have actual expertise on the particular candidate drug or whatever the outcome of interest is. A lot of the other parameters that they're worried about, like your alpha error, your power, nuisance parameters, the variance, don't really have a tangible value or meaning, perhaps, to everyone involved. Having people come together and talk about the effect size, which is something most people can grasp on a fundamental level as how much difference there will be, can be useful. Indeed, within the context of assurance, one of the first things you need is a prior distribution, and what a lot of people who are advocates of assurance suggest is doing elicitation. Elicitation by itself is very much just a consultation process, where these people are talking to experts, talking to people who have the best understanding of what this treatment's doing, and asking them to come together and combine this knowledge base into what we expect to happen, because it's very hard beforehand. You need to define what success is for your trial.
Stephen: I mean, I want to make clear I'm not suggesting that assurance is not a valuable exercise to go through as part of understanding what is realistic to expect about the project, what experts think and so forth. What I'm suggesting is that it's not a valuable way to determine Delta. You should still consider that Delta is a feature of the disease. What the drug is capable of achieving, what people believe it is capable of achieving, is useful in order to decide whether one thinks it is worth betting on the project being successful or not, but ultimately we have to accept that it's the data that will tell us what the drug will do. We're going to rely on the data to inform us in the end. Of course, assurance can also be useful for the purpose of nuisance parameters.
You mentioned some of the other things which are important in a calculation. One of them, typically for a continuous outcome, if we go back to what I was talking about before, would be the supposed variance in the population, whatever that might be. That itself will be based upon previous studies, not with this drug but in this particular area, and that in itself may enable one to say, well, we're not sure what the variance would be, but there is some distribution for it. So these are things that one can incorporate in it.
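Uncertainty about a nuisance parameter such as the standard deviation can be handled in the same averaging spirit. The sketch below is only an illustration under assumed inputs: it fixes Delta at 10 mmHg and averages the power of a two-arm z-test over three hypothetical scenarios for the standard deviation, with hypothetical weights standing in for what previous studies in the disease area might suggest.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def power(delta, n, sigma, alpha=0.05):
    """Probability of a significant result in favour of treatment (two-arm z-test)."""
    se = sigma * sqrt(2 / n)  # standard error of the difference, n per arm
    return STD.cdf(delta / se - STD.inv_cdf(1 - alpha / 2))


def expected_power(delta, n, sigma_scenarios, alpha=0.05):
    """Power averaged over a discrete distribution for the standard deviation.

    sigma_scenarios maps each candidate sigma to its weight.
    """
    total_weight = sum(sigma_scenarios.values())
    weighted = sum(w * power(delta, n, s, alpha) for s, w in sigma_scenarios.items())
    return weighted / total_weight


# Hypothetical scenarios for sigma (mmHg) and their weights.
scenarios = {12.0: 0.25, 15.0: 0.50, 18.0: 0.25}

for sigma in scenarios:
    print(f"sigma = {sigma}: power = {power(10.0, 48, sigma):.3f}")
print(f"expected power over scenarios: {expected_power(10.0, 48, scenarios):.3f}")
```

A continuous distribution for the variance could replace the three scenarios; the discrete version is used here only to keep the arithmetic easy to follow.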
Ronan: I suppose finally, in a discussion about sample size, the last step, or what is often left as the last step, is your sensitivity analysis: trying out different values. That's kind of similar to the idea of trying out different designs. Designs probably should be part of the sensitivity analysis, so instead of just trying different values we should say, well, we're going to try some different models as well.
Stephen: Yeah, I think that's certainly true, and I think that's perhaps more often done than we realise. I may be wrong about this, but there may well be areas in which the statisticians are more used to doing this. For example, in survival analysis you could change the length of follow-up. You might then make the surprising discovery that if you change the length of follow-up, making it longer, the trial will actually not take as long to complete. This may sound counter-intuitive, but it could be that if there's sufficient reduction in the number of patients that have to be recruited, and if the process of recruiting itself is something that takes a great deal of time, then a trial with two-year follow-up might actually complete more quickly than a trial with one-year follow-up, simply because the recruitment period dominates. Those cases are not particularly common, but this can happen, so this is a sort of example of how playing around with different scenarios can help you to think about what's important and what the options are.
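Stephen's follow-up example can be explored with a rough scenario calculation. The sketch below rests on several simplifying assumptions that are not from the interview: exponential survival, 1:1 allocation, Schoenfeld's approximation for the required number of events, every patient followed for the same fixed length of time, a constant recruitment rate, and the trial ending when the last patient completes follow-up. The hazard, hazard ratio and recruitment rates are hypothetical.

```python
from math import ceil, exp, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.8):
    """Schoenfeld's approximation for the number of events (1:1 allocation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(4 * z_sum ** 2 / log(hazard_ratio) ** 2)


def trial_plan(follow_up, control_hazard, hazard_ratio, recruits_per_month):
    """Return (patients needed, total duration in months) for a fixed follow-up.

    follow_up is in months; control_hazard is a monthly exponential hazard.
    """
    events = required_events(hazard_ratio)
    p_control = 1 - exp(-control_hazard * follow_up)
    p_treated = 1 - exp(-control_hazard * hazard_ratio * follow_up)
    p_event = (p_control + p_treated) / 2  # average event probability per patient
    patients = ceil(events / p_event)
    duration = patients / recruits_per_month + follow_up
    return patients, duration


# Hypothetical scenario: control hazard 0.02 per month, hazard ratio 0.7.
for rate in (20, 200):  # slow and fast recruitment (patients per month)
    for follow_up in (12, 24):
        patients, duration = trial_plan(follow_up, 0.02, 0.7, rate)
        print(f"rate {rate:>3}/month, follow-up {follow_up} months: "
              f"{patients} patients, about {duration:.0f} months in total")
```

Under the slow recruitment rate the two-year follow-up finishes sooner because far fewer patients have to be found; under the fast rate the recruitment period no longer dominates and the shorter follow-up wins. Which of these applies to a real trial depends entirely on the assumed inputs.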
Of course, it's so intimately related to design that we should think of ways of using it in that way. I sometimes get a bit irritated because I sometimes think that all that people think the statistician is useful for is determining the sample size. But if we understand that the sample size is closely related to the analysis chosen and the design chosen, then immediately we see that those two things are also part of the job as well, and they're actually much more interesting. So if one can use the sample size determination as a way to encourage people to think about these things, then maybe that is something we should be doing more often.
Ronan: Yeah, and if you're trying to work in a collaborative manner and trying to make your life better, then that's a very good way to do that. I think you have a quote to the effect that power is that which statisticians always seek but rarely have.
Stephen: Oh yes, power is something statisticians are always talking about but don't have.