46 min read

Written by nQuery Sample Size Software Team

February 21, 2020

"2020 Trends In Biostatistics - What you should know about study design" examines how industry changes are impacting clinical trials. Watch the recording of our recent webinar now.

What you should know about study design

- Adaptive designs in confirmatory trials
- Using external data in study planning
- Innovative designs in early-stage trials

**To watch the recording of this webinar, just click the image below.**

**Slide preview from the webinar**

**Duration:** 60 minutes

Transcripts of webinar

*Please note, this is auto-generated; some spelling and grammatical differences may occur.*

2:05

Alright, so today's webinar is 2020 Trends in Biostatistics: What You Should Know About Study Design. Basically, what we're going to cover here is three of the topics which, as we go into 2020 (or, I suppose, we're already a bit into 2020), are likely to be talked about a lot over the upcoming year.

2:35

There was some level of finality in 2019, which will hopefully allow much wider usage of the methods in these particular topics, or they are things under active consideration, with workshops etc. about them. So before we get started I should introduce myself. My name is Ronan Fitzpatrick. I'm the Head of Statistics here at Statistical Solutions, the developers of nQuery, and I have been the nQuery lead researcher since nQuery 3.0, which is about five or six years.

3:05

I've given talks at places like the FDA, or more accurately workshops at the FDA. I've also given talks and workshops at places like JSM, and recently I was over in London doing one for PSI on adaptive clinical trial design.

3:20

In terms of what we'll be covering today, the three main areas will be adaptive designs in confirmatory trials, using external data in study planning, and innovative designs in early-stage trials. These areas are seeing a lot of interesting work at the moment, and there are some changes, particularly in the regulatory space, that make them perhaps more interesting in terms of the choices available.

3:43

You have many options when designing and analyzing a clinical trial, and this is true going forward into 2020. Obviously today's webinar is presented by nQuery, a complete trial design platform that allows you to design trials that are faster, less costly and more successful. So primarily we're looking at sample size determination problems here, though there are other tools in the study design arena that are of interest. And of course nQuery has been around for over 25 years at this stage. It's very widely respected and very widely used within the pharmaceutical and biotechnology industries.

4:19

We have had some good feedback particularly since we moved on to nQuery Advanced, the current version of the software. nQuery version 8.5.1 will be the version being used today.

4:32

So let's get straight into part one: adaptive designs for confirmatory trials.

4:57

So the first part is about adaptive design in confirmatory trials, and I think there is something important to note here.

5:03

We will be talking later about adaptive design in early-stage trials (Phase 1, Phase 2 and preclinical research), but I think it's important to note that confirmatory trials, i.e. Phase 3 trials, are where most of the big decisions are being made. That's where the money is being spent: in the vast majority of cases, about 90% of the total clinical trial cost (not to be confused with the total development cost,

5:28

just the clinical trial cost) is in the Phase 3 arena, in confirmatory trials. And of course adaptive clinical trials have been emerging as a favored solution, something that is definitely worth wider consideration as we try to tackle some of the obvious issues that exist, primarily around increasing costs but also some of the general inefficiencies of the classic randomized controlled trial with fixed-term follow-up. So, just in case anyone here is not familiar with the area of adaptive design or adaptive trials:

6:03

Basically, an adaptive design or trial is any trial where a change or decision is made to the trial while it's still ongoing. I think it's important to note that we're mostly talking about changes, because that's where the more interesting, newer work is coming from.

6:19

But of course, if you think about the humble group sequential design, which gives you the option to stop your trial early if you have very strong evidence for or against the treatment, that is also a type of adaptive design, where you have a decision that you can make earlier than you would otherwise have been able to. And of course we have this wide variety of potential adaptations. We'll be focusing today on sample size re-estimation, but you have a great number of options, like choosing to enrich certain subgroups or choosing to change your hypothesis, so you have a wide degree of flexibility. The idea here is that you give more control to the trialist to improve their trial based on all available information.

6:58

The way I often try to frame this is: if you knew a priori what was going to happen, you could design a trial that was perfectly optimized or efficient for that particular outcome in terms of things like sample size, test, etc. The adaptive design paradigm is one where we move closer to that as information becomes available; we move towards that platonic ideal of what the trial would have been if we'd known everything a priori. Of course, we do clinical trials because we don't know what's going to happen beforehand.

7:31

And so it's also important to ensure that we still maintain the credibility of our trial. There is a great deal of emphasis on making sure that you do it in the appropriate way, and in a way that makes sense for the particular things you expect to happen. And of course the whole idea is that we can reduce costs and get more efficient inferences, and hopefully get these new treatments to patients faster.

7:59

This is a small selection of the types of adaptive designs, from group sequential designs to enrichment designs to endpoint designs, etc. I'm not going to cover this much today because that's not the main thing we're looking at. We're focusing more on why, in 2020, we're more interested in adaptive designs.

8:17

And on why, I suppose, the area has changed: it all goes back to 2010, or at least 2007 regarding the European situation. The first FDA guidance, draft guidance I should say, was published in 2010. Regarding the criticism that they took at the time, to be fair to them, they thought this was basically over-interpreted by potential sponsors. But the reality of what happened, based on what people in industry have been saying, is that the categorization into "well understood" and "less well understood" designs, which was explicitly used to talk about different designs

8:58

in that draft guidance, left sponsors afraid that if they went into the less well understood part of the guidance they would face harder barriers to getting approval for that design. And you can see, based on the statistics from industry on how many people are doing adaptive designs, that up to 2010

9:27

they were increasing and then reached a plateau around that stage. You can't just infer the causal mechanism, as statisticians are wont to point out, but there's probably at least some evidence to suggest that it didn't have the intended effect, which was to provide certainty and hopefully encourage adaptive design rather than putting people off. And again, with clinical trials, costs have been increasing and success rates have been decreasing at the same time, which is quite a bad combination of factors, and that has meant there's been a lot of push

9:59

to encourage more innovative approaches or paradigms when considering trial design. We'll talk about one of the other areas, real-world evidence, in part 2, but for now adaptive design obviously falls within the remit of innovative designs. You see the 21st Century Cures Act in the US and the Adaptive Pathways program in the EU, and I think there's one thing that's worth mentioning.

10:21

These came from legislative bodies, obviously, like Congress, or from the EMA in the EU's case. But I think one of the things to note is that a lot of this push for innovation in general is coming from patients. Patients are actually one of the stakeholders who have the strongest interest in pushing for more adaptive design. You could take an obvious example like enrichment designs, where we enrich certain subgroups or certain arms because they're more successful or, based on what we've seen so far, more likely to be successful. Patients like the idea

10:58

of being put into the most successful arm based on the current evidence, so it's easy to see why they're pushing for that type of thing. That doesn't remove the need to explain things like why a control group is important, but it does mean that when you're thinking about this it is worth considering your patients as well.

11:16

Sometimes they are forgotten because we're statisticians and trial designers trying to think in abstract, high-level terms. But talking to patient groups and thinking about what they think should be happening is definitely one way you can talk about how to improve innovation in the area. The regulatory context, though, since 2010 and particularly over the last couple of years, has become a lot more certain. We had the updated version of the CDER/CBER guidance published in late 2018, and that was actually finalized just in the last few months of last year. So that's a huge step up, and I'll talk about that in the next slide. But what that's also allowed to happen

11:56

is that the ICH, which is basically the body responsible for creating harmonized guidance across the international drug regulatory agencies such as the FDA, the EMA and their equivalents in Japan and the UK, has now set up an expert working group, after its informal working group finished its work in late 2019. That's been established, and the rough estimated time of arrival for a guideline from that is around three years, based on what we know from previous working groups.

12:28

So this is a very important step towards making it easier to get approval for adaptive designs across different regulators, because there are some small differences between how the FDA and the EMA perceive this area. But compromising on that and creating a unified vision within the ICH has been a very successful paradigm in other areas, and we expect it to be similar here. As for the FDA guidance, it was finalized in 2019, which fits the theme of today's webinar, and the FDA is probably also considered the main regulator that people emphasize, in terms of who to satisfy, when it comes to confirmatory trials. This guidance was required of them by legislation. I think the important takeaway is that this was a far less categorical approach than the 2010 guidance, much more emphasizing that, look, we have certain rules or criteria that are important to us. So, blinding: showing us how you're going to keep the blind is very important.

13:28

A type 1 error of 0.025 at the one-sided level is still a key way that we're going to judge whether your design will be allowed or not. But if you come to us early and collaborate with us early, and you ensure that you follow our basic rules around pre-specifying when, why and how you're going to make your adaptations, and you show via simulation that type 1 error is controlled, then we're happy to talk to you. And here are some of our perspectives and some examples. There are some really nice practical examples where they go through a study that they thought did this the right way, and why this type of design might be seen as good. They go in depth on certain of those adaptive designs, and there are also some interesting words on things like Bayesian design and how to deal with time-to-event data, which are very useful as well. And of course, with the initial draft guidance,

14:23

Scott Gottlieb, the FDA commissioner at the time in 2018, talked about the potential of adaptive design, as quoted on the left-hand side here. Okay, so I'm not going to beat around the bush much more. I think it's now worth just looking at an adaptive design. And this is almost a two-for-one, because we'll be looking at sample size re-estimation, specifically unblinded sample size re-estimation, which is what we focus on today. If you're interested in blinded sample size re-estimation, I have covered that in previous webinars, so I'll be happy to share material, videos and examples from that; just get in touch either via the questions tab or via email afterwards. But today we're going to focus on unblinded sample

15:04

size re-estimation. As we'll find out in a moment, most methods of doing unblinded sample size re-estimation are tied into the group sequential paradigm. Basically, if you're going to do unblinded sample size re-estimation, it's probably best to think of it as an extension of the group sequential design that you're already planning to do. So what we're going to do is just take a simple

15:34

survival example and extend it to a group sequential design. I'm not going to go over the details of this; they're available on the slide here. But we're looking at a situation where we have a six-month versus nine-month median survival, a hazard ratio of around 0.666, and we have these other parameters about how long the study will take. Our group sequential design will be a three-look design with an O'Brien-Fleming efficacy bound and a Hwang-Shih-DeCani

16:03

(basically gamma family) futility bound with a parameter of -1.25. I'll be doing all of this in nQuery, obviously, because this webinar is sponsored by nQuery.

16:25

So when we look at this,

16:29

what we're seeing in nQuery is basically that we split up the sample size problem for a group sequential design into two parts. The top window here, the main window, is where we're going to enter the fixed-term parameters. You can see these are the exact same parameters we would be entering if we were doing a fixed-term sample size calculation.

16:50

So a group sequential design is really just an extension of a fixed-term design. I won't go into the theory today, but basically it is all about how much you need to increase the sample size to get power equivalent to a fixed-term analysis; that is, how much efficiency you lose, or how much you need to compensate for the fact that you're "cheating", in inverted commas, by looking at your data multiple times. To illustrate that, it's probably worth very briefly replicating the original analysis done by these authors, where they had a 0.025 significance level. You can see on the left-hand side here that we have the names of the parameters

17:28

required for our calculation, and each column corresponds to an individual calculation. So this is much like filling in a custom Excel sheet to get a calculation out, with those rows highlighted in yellow being the ones that we can solve for. You'll see that in this case we can solve for sample size or we could solve for power. We're going to solve for sample size this time. Don't worry, you don't need to select it explicitly; it'll automatically know which one to pick once you've filled in enough of the table.

17:54

So if we go back to the original slides here, you'll notice they have the accrual period and the minimum follow-up period as 74 and 39. So there's a bit of translation to do here. One important thing to note in survival analysis, just as a small note, is that if you have things denominated in time, like your accrual time and your median survivals, they all need to be in the same time unit. In this case you can see that the accrual period and the follow-up were in weeks, yet the median survivals were in months.

18:25

So we need to convert the accrual period and the follow-up into months, and the obvious way to do that is to divide them by four. If we open up a calculator (we can do that from the Windows calculator option under the Assistants menu here, if you're interested), we can just take 74 and divide that by 4, and we should get 18.5.

18:48

And here's a small note: nQuery is asking for the maximum length of follow-up, not the minimum length of follow-up. So in fact it's looking for 74, our accrual,

18:57

plus 39, the minimum follow-up period. The minimum follow-up period is what would happen to someone recruited at the very end of the accrual period, whereas what we're filling in here, the maximum length of follow-up, is how long someone could theoretically be followed for if recruited at the very start of the accrual period. I'm not going to do the math in the calculator again, because I've covered this example before, but basically 74 + 39 is 113, and 113/4 is 28.25.
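The unit conversion described here can be written out as a short sketch. The variable names are illustrative, and the factor of four weeks per month is the rough conversion used in the webinar rather than a calendar-exact value.

```python
# Rough conversion used in the webinar (an approximation, not calendar-exact).
WEEKS_PER_MONTH = 4

accrual_weeks = 74          # accrual period from the original design
min_follow_up_weeks = 39    # minimum follow-up from the original design

# Accrual period in months, to match the median survival times.
accrual_months = accrual_weeks / WEEKS_PER_MONTH

# Maximum follow-up = accrual + minimum follow-up: the longest a subject
# recruited on day one could be followed.
max_follow_up_months = (accrual_weeks + min_follow_up_weeks) / WEEKS_PER_MONTH

print(accrual_months)        # 18.5
print(max_follow_up_months)  # 28.25
```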

19:27

You would then see here that we're required to enter these exponential rates, or exponential parameters. I won't belabor the point, but there is a nice tool down here: this is a side table, what we call a helper side table.

19:39

What we'll see in group sequential is an example of a mandatory side table, that is, one where the inputs are actually used in the solver. This helper side table is primarily there to ensure that you can very easily and quickly get the values that you want. So we can put in a median survival of 6 for group 1 and a median survival of 9 for group 2, and we get this hazard ratio. Don't worry about this value here; that's just the reciprocal, so 1.5 and 0.666 are the same thing for sample size purposes. We enter 92.6 for the power, and we get 176 per group and 282 events.
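As a minimal sketch of what the helper side table is doing, assuming exponential survival (so the rate is ln 2 divided by the median), the conversion from medians to rates and a hazard ratio looks like this. The function name is hypothetical and is not part of nQuery.

```python
import math

def exponential_rate(median_survival):
    """Exponential hazard rate implied by a median survival time."""
    return math.log(2) / median_survival

rate_group1 = exponential_rate(6)  # median survival of 6 months
rate_group2 = exponential_rate(9)  # median survival of 9 months

# The two parameterizations of the hazard ratio are reciprocals of each other.
hazard_ratio = rate_group2 / rate_group1   # about 0.667
reciprocal = rate_group1 / rate_group2     # 1.5

print(round(rate_group1, 4), round(rate_group2, 4))
print(round(hazard_ratio, 3), round(reciprocal, 3))
```

Either parameterization gives the same number of events, because only the squared log hazard ratio enters the calculation.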

20:16

And if you go to the original paper that I reference here, you'd find that's what they get there. But if we go back across to the group sequential design, you can see that the parameters required here are basically the same as those required there, the only difference being that we now have the option to have unequal sample sizes per group. So I'll actually just copy and paste the two exponential parameters to save time in this case.

20:45

We're using the other parameterization, where it's lambda 2 divided by lambda 1 rather than the reciprocal, lambda 1 divided by lambda 2. Okay, we'll enter the other parameters required. We're still using a 0.025 one-sided analysis, and we still have 18.5 for accrual and 28.25 for follow-up. One thing to note here is that for group sequential design, and for power in general, survival analysis is really only interested in the number of events, not the sample size. So in actuality the only parameters we're really interested in are the hazard ratio,

21:20

the significance level and whether it's one- or two-sided, and what the sample size ratio is. All the other parameters are basically just there to give you an estimate of how many people you will need to get that number of events. So the action is really happening in these two cells for the number of events; everything else is just an estimate of how many people we would need to get that number of events. But before we fill in this fixed parameter set, we want to set what our group

21:50

sequential design is going to be. We do that in the group sequential design side table, which will appear automatically below the main table.
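To illustrate the point that power depends on the number of events, here is a sketch using Schoenfeld's approximation for the log-rank test. This is a swapped-in textbook formula, not necessarily the method nQuery uses, and the function name is hypothetical. With the webinar's inputs it gives roughly 282 events, in line with the fixed-term design.

```python
import math
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, alpha_one_sided, power, allocation_ratio=1.0):
    """Approximate total events needed for a log-rank test (Schoenfeld formula).

    allocation_ratio is n2 / n1; 1.0 means equal allocation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    r = allocation_ratio
    # (1 + r)^2 / r equals 4 under equal allocation.
    return (1 + r) ** 2 / r * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2

# Inputs from the example: HR = 6/9, one-sided alpha 0.025, power 92.6%.
events_fixed = schoenfeld_events(6 / 9, 0.025, 0.926)
print(round(events_fixed, 1))
```

Note that accrual and follow-up times do not appear in the formula at all; they only determine how many subjects are needed to observe that many events.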

21:59

You can see on the left-hand side we have something that looks quite similar to the main table. This is basically setting the rules for our group sequential design, so we're going to define what we want our group sequential design to be. On the right-hand side we get the outputs for various things that will be important to actually run the group sequential design. We said we're going to do a three-look design, so we'll change the number of looks to three. That includes the final look, for reference, so this is two interim analyses and a final analysis.

22:27

We said we'd use an O'Brien-Fleming spending function. We could have used user-defined upper and lower bounds, or you can choose not to calculate an efficacy bound, just as by default we don't calculate a futility bound. But we'll stick with the default of an O'Brien-Fleming spending function. This is a conservative bound, which basically means it's very difficult to stop early for efficacy.

22:49

And that's generally what's recommended, because a conservative bound is one that won't spend too much of your alpha early on. That means the final look will still have most of the alpha, most of the type 1 error, associated with it. I'm not going to go into too much detail here, but typically we use a conservative efficacy bound because you want to reduce the chance of getting what would have been a significant result if we hadn't done a group sequential design, but which ends up being non-significant because we chose to be too aggressive in our spending function. Imagine you got a Z statistic

23:27

equivalent to a p-value of around 0.045, and then you find that it wasn't significant because you spent a lot of alpha early on with your efficacy bound. That's the kind of thing that is more likely to lead to troubles, for lack of a better term, whereas with a conservative bound you're not really going to have much of that. In terms of futility bounds, you can use a binding or a non-binding bound, or choose not to have one. We'll use a non-binding one because that's basically the recommendation. It means that we can cross a futility bound

23:57

but still continue on to the next look if we so choose, whereas with a binding one you'd actually have to stop the trial, because if you continued after a binding futility bound was crossed you would affect the overall type 1 error rate of the design. You can also get some weird artifacts where you end up being able to stop early for efficacy based on a p-value that's greater than 0.05, or greater than 0.025 at the one-sided level. So we said we're going to use the Hwang-Shih-DeCani

24:27

spending function, so we can select that here, and then we can enter our gamma parameter, in this case -1.25. For these two spending functions, Hwang-Shih-DeCani and power, an additional parameter is required, so you have to enter that additional parameter to make it work.
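The two spending functions named here have standard closed forms, sketched below. The function names are illustrative, and it is an assumption (not stated in the webinar) that the futility bound spends a total beta of 1 minus the 92.6% power, and that the three looks are equally spaced.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()

def obrien_fleming_spend(t, alpha):
    """Lan-DeMets O'Brien-Fleming-type spending function.

    Returns the cumulative one-sided alpha spent by information fraction t (0 < t <= 1).
    """
    z = _NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _NORMAL.cdf(z / math.sqrt(t)))

def hwang_shih_decani_spend(t, total, gamma):
    """Hwang-Shih-DeCani (gamma family) spending function."""
    if gamma == 0:
        return total * t
    return total * (1 - math.exp(-gamma * t)) / (1 - math.exp(-gamma))

looks = [1 / 3, 2 / 3, 1.0]  # assumed equally spaced looks
alpha_spent = [obrien_fleming_spend(t, 0.025) for t in looks]
beta_spent = [hwang_shih_decani_spend(t, 1 - 0.926, -1.25) for t in looks]

for t, a, b in zip(looks, alpha_spent, beta_spent):
    print(f"t={t:.2f}  cumulative alpha={a:.5f}  cumulative beta={b:.5f}")
```

The O'Brien-Fleming-type function spends almost nothing at the first look, which is what makes it conservative; the gamma-family function with a negative parameter also back-loads its spending, but less severely.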

24:42

So then we go to the power, we enter 92.6, and we'll get a sample size close to what we had originally. Sorry, I just forgot to set the sample size ratio. So we have 191 per group, which is not a huge increase over 176 per group.

25:04

So we're looking at 15 more per group there, and you can see that we now require around 307 events versus 282, so not an insignificant increase. Finally, you can see here on the right-hand side a table of our group sequential design parameters. The important ones to note are the upper efficacy bounds. These are on the scale of the standardized Z statistic, which in this case is really just the log of the hazard ratio divided by its standard error, the square root of 1 over e1 plus 1 over e2, with e1 and e2 being the number of events in group 1 and group 2. That's basically the standard error on the log scale, which is why you can use a normal distribution.

25:50

So the bounds for the standardized Z statistic are around 3.71, 2.51 and 1.993, the last of which is quite close to 1.96, the 0.025 value. You can see the futility bound is a bit more aggressive. Let's say you had a hazard ratio around one: you'd be stopping early here. And at look two the hazard ratio would need to be below 1,

26:14

moving closer to 0.666, or you'd be stopping there. And if you want to think in terms of p-values from a simple z-test, that's what the nominal alpha rows are here. So if I got a p-value from a simple z-test on that standardized Z statistic of 0.006 or less at look two, two-thirds of the way through, I would stop early for efficacy, and if it was above 0.152 then I would stop early for futility, or rather I have the option to stop early for futility.
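The link between the Z-scale bounds and the nominal alpha values can be checked directly, and the overall type 1 error can be confirmed by a small simulation of the kind regulators ask for. This is a sketch under stated assumptions: equally spaced looks, the rounded efficacy bounds quoted above, and the non-binding futility bound ignored (the usual convention when checking type 1 error). The function name is hypothetical.

```python
import random
from statistics import NormalDist

# Efficacy bounds on the Z scale as quoted in the webinar (rounded).
efficacy_bounds = [3.71, 2.51, 1.993]

# Nominal one-sided p-value thresholds implied by each bound.
nominal_alpha = [1 - NormalDist().cdf(z) for z in efficacy_bounds]
print([round(p, 4) for p in nominal_alpha])

def simulate_type_one_error(bounds, n_sims=100_000, seed=2020):
    """Monte Carlo estimate of the chance of crossing any efficacy bound under H0.

    Assumes equal information increments between looks.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        score = 0.0
        for look, bound in enumerate(bounds, start=1):
            score += rng.gauss(0.0, 1.0)   # independent increment under H0
            z = score / look ** 0.5        # standardized statistic at this look
            if z >= bound:
                rejections += 1
                break
    return rejections / n_sims

type_one_error = simulate_type_one_error(efficacy_bounds)
print(round(type_one_error, 4))  # should be close to 0.025
```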

26:51

Okay.

26:55

So, we're going to talk about sample size re-estimation.

26:57

Now, as I mentioned, there's unblinded and blinded, or comparative and non-comparative in the new terminology of the FDA guidance. I suppose the main thing to take away here is that it's an obvious target for adaptation, because when we do sample size estimation in the first place we're usually making some guesses about certain parameters, whether that be the effect size (the hazard ratio) or some other parameter like the variance in the case of a means analysis. The important thing in unblinded sample size re-estimation is that we're taking a look at the interim effect size, and therefore we are basing this on the comparative, labeled data.

27:35

So when are we going to choose to increase the sample size? When we get a result that is, in inverted commas, "promising". There are different opinions about what that should mean, but in basic terms it's a result that means you don't believe you're about to get a significant result, but where the final value for the effect size, the final hazard ratio, would still be considered clinically relevant. And as I say, it's really just an extension of group sequential design where, rather than just having the choice to continue or stop early, you now have the choice to increase the sample size for those promising results: results that are worse than your initially expected hazard ratio (higher than it, in this case, since it's below 1) but which are still clinically relevant.

28:24

And so this is a design that can provide efficiency. The basic idea might be that you power up front for an optimistic design, or one based on what you expect to happen, but still have the option to increase the sample size for those lower than expected but still clinically relevant effect sizes. So you cost for the optimistic case, but have the option to still find relevant effects. The common criterion here is conditional power,

28:54

which is basically the probability of significance given the interim data. There are two methods here: Chen, DeMets and Lan, and Cui, Hung and Wang. Chen, DeMets and Lan is a much more restrictive approach, but it is one where you can use the same test statistics and approaches as for a standard group sequential design, whereas the Cui, Hung and Wang approach uses a weighted statistic, one which otherwise works more or less the same but where you have a lot more options about what you can do.

29:24

With Chen, DeMets and Lan, the re-estimation has to be at the penultimate look, basically your last interim analysis, and it has to be only for a select range of conditional powers, whereas with Cui, Hung and Wang you can basically do what you want. For the sake of time, I'm only going to cover Chen, DeMets and Lan today, but if you are interested in Cui, Hung and Wang, I have covered that in previous webinars and I'll be happy to share information from that.

29:51

So the main thing to take from this slide is that we're going to assume that the interim hazard ratio ended up being 0.8 rather than 0.666. You could trivially do the calculation for that. Based on that, we're going to do a sample size re-estimation at look two using Chen, DeMets and Lan and see what we get.

30:16

So we return to nQuery. In nQuery (this is only available in nQuery Pro) we have this interim monitoring and sample size re-estimation button. We select that and it will open up this tool, the interim monitoring and unblinded sample size re-estimation tool, and you can see here that it's very similar to that side table

30:37

we just looked at. On the left-hand side we have the rules and on the right-hand side we have our outputs, but in this case it'll actually be for our inputs as well. You can see that certain things have been inherited from our previous design, like the efficacy bounds, futility bounds and some information about the bounds, etc. I'm not going to go through this in too much detail today, but if you do want any more detail let me know. Basically, we're going to default to the Chen, DeMets and Lan approach (the other approaches are available here) at look two, because that's required for Chen, DeMets and Lan.

31:09

We're going to allow an increase of up to three times our original number of events, so up to 927 events. Remember this is the number of events, not the sample size; that's what the power is based on. We're going to use the exact events rule, that is to say we're going to increase the events until the conditional power reaches our original target power. The max events only rule is also available,

31:32

if you're worried about people who aren't supposed to, basically people who aren't on the data monitoring committee, being able to back-calculate the interim results. And we're going to use a range derived from the Mehta and Pocock approach, where if the conditional power at look two is between 28% and 92%, we're going to increase the sample size until the conditional power equals the target power of 92.6%. Once we've set our rules, we just go to the right-hand table here and quickly enter the summary statistics needed to calculate the standardized statistic. This is your classic Wald statistic, your Z statistic, whatever you want to call it; I've already given the calculation for that previously. And as we said, we're going to say the interim hazard ratio is 0.8.
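The rules just described amount to a small decision procedure at each interim look. The sketch below is an illustration only: the function and its return strings are hypothetical, the 28% and 92% thresholds are the ones quoted in the webinar, and the look-two Z value and futility bound are illustrative numbers that are not quoted in the talk.

```python
def interim_decision(z, efficacy_bound, futility_bound, conditional_power,
                     ssr_allowed, cp_lower=0.28, cp_upper=0.92):
    """Hypothetical summary of the interim rules described in the webinar."""
    if z >= efficacy_bound:
        return "stop for efficacy"
    if z <= futility_bound:
        # Non-binding bound: crossing it permits, but does not force, stopping.
        return "futility bound crossed (may stop)"
    if ssr_allowed and cp_lower <= conditional_power <= cp_upper:
        return "increase events"
    return "continue as planned"

# Look one: values quoted in the webinar; re-estimation is not available yet.
look_one = interim_decision(z=1.132, efficacy_bound=3.71, futility_bound=-0.107,
                            conditional_power=0.48, ssr_allowed=False)

# Look two: z and futility_bound are illustrative assumptions, not quoted values.
look_two = interim_decision(z=1.6, efficacy_bound=2.51, futility_bound=1.03,
                            conditional_power=0.47, ssr_allowed=True)

print(look_one)  # continue as planned
print(look_two)  # increase events
```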

32:28

In this case, we can see that a hazard ratio of 0.8 is equivalent to a standardized statistic of 1.132, which is obviously between these two values of 3.7 and -0.107, and we have a conditional power of 48 percent. So we're saying there's about a 48% chance, based on what we know right now, that we get a significant result at the end of the study based on this hazard ratio of 0.8. We go on to look 2, and now we have a third option that wasn't available at look 1, which is that if the result falls into a certain range of conditional power, we're going to choose to increase the number of events.

33:02

That increase applies at look three, the final analysis, so basically between now and the final analysis. You can see here that the conditional power was 47%, which is between the two bounds of our range, roughly 28% and 92%. It says that to get the power to go from 47% up to around 92%, our original target power, we need to increase the number of events from 309 to 815. This is a very aggressive sample size re-estimation. You probably wouldn't be recommended to do this in real life, but I'm just doing this

33:32

as a toy example for illustration purposes. You can see that if we go to the final analysis and enter 0.8 (sorry, we go to the side table that appeared below and enter 0.8 there), we get a test statistic that's very significant. Therefore we have a power of a hundred percent, because it's basically above the upper efficacy bound.

33:53

And of course I could redo this without the sample size re-estimation, and you'd easily find that for this toy example it wouldn't have been significant if we hadn't increased the sample size. Now, it's a toy example, so the outcome is pretty much preordained, but it's just an illustration of how this would work.
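The standardized statistic quoted in the demonstration can be approximated from the hazard ratio and the number of events. The sketch below uses the usual Schoenfeld approximation for the log hazard ratio; the function name is hypothetical, and the figure of 103 events at the first look is an assumption (one third of the 309 planned events), since this part of the talk does not state it.

```python
import math


def logrank_z_from_hr(hazard_ratio, events, allocation_ratio=1.0):
    """Approximate standardized statistic for a log hazard ratio.

    Positive values favour the treatment arm when hazard_ratio < 1.
    """
    p = allocation_ratio / (1.0 + allocation_ratio)  # share randomised to treatment
    information = events * p * (1.0 - p)              # events / 4 under 1:1 allocation
    return -math.log(hazard_ratio) * math.sqrt(information)


# Assumed: 103 events at the first interim look, interim hazard ratio of 0.8.
z_look1 = logrank_z_from_hr(0.8, 103)
print(round(z_look1, 3))  # 1.132
```

Under that assumption the result agrees with the 1.132 shown in the software, which sits between the efficacy and futility bounds at that look.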

34:10

Okay, that's the practical approach to it. But mostly we're talking here about the trends and why they're important. Hopefully you've got an idea of how sample size re-estimation could be important to you for finding those promising results, allowing you to plan for an optimistic or expected result while still being able to find clinically relevant results, for this particular type of adaptive design. Obviously, there's a wide variety not covered today. And the main takeaway from today is that adaptive design is open for business.

34:40

Certainty now exists around it. If you talk to the FDA early, their ears are open, so I would recommend that you do, and that you consider these types of designs and the efficiencies they could bring. Just a small note that survival analysis does have a couple of additional complications for adaptive design, which I won't cover in too much detail.

34:58

Basically, there are some additional assumptions that are sometimes required for those conditional power calculations, for example. There's also the unknown follow-up and the unknown timing of your interim analysis: because you're waiting for a number of events, you can't predict precisely in calendar time when you're actually going to be doing an interim analysis. That means there's more uncertainty, and it also means you could end up in a situation where, before you make your interim decision, you might have recruited a lot of additional patients or had a lot of additional events happen as well.

35:26

So there are definitely practical considerations to keep in mind, and then one very technical statistical point. In sample size re-estimation in particular, you have two choices for how you might increase the number of events: you could increase the sample size or you could make the study run longer. Just note that the paper from Freidlin and Korn talks about how you could bias your results by choosing which of these two dimensions to emphasize. Increasing the sample size brings more new patients in, so if your interim effect size was stronger in the earlier time period for each patient, you might be biased towards that; whereas if the effect size increased towards the tail, the end of someone's time in the study, you might choose to increase the time.

36:10

So you have a two-dimensional problem here that you could optimize, I suppose negatively, basically to optimize your chance of getting a good p-value rather than trying to get the right result, the one that's inferentially correct. Obviously, if this is done by independent data monitoring committees, that shouldn't happen, but it is something worth keeping in mind as a slight risk.

36:32

So adaptive design is probably the big area. The other two areas are ones which are maybe more in flux and are perhaps less interesting for study design, at least for phase three, which is what nQuery is mostly focused on, but I think they are worth talking about as well.

36:47

The next one is using external data in study planning. This feeds into the general idea of real-world data and evidence, RWD or RWE: the trend we're seeing where we want to take real-world evidence, data that exists from previous studies, from pilot studies, from other areas, and use it to make our designs more efficient and more accurate and hopefully improve the process of getting regulatory approval. That applies at the high level of phase 3 trials, maybe as a stretch goal, but perhaps more importantly in more niche areas. Post-marketing approval would be an area where this makes sense.

37:31

Synthetic controls in the context of earlier-stage designs, and alternative uses, are others. The FDA's programme has been active on real-world data and evidence since 2016. This is them setting out the kinds of things they think this can be used for and how they want to collaborate and talk with industry partners on it. They've been running lots of different workshops and have published several guidance documents in the area since 2016, all actually catalyzed by that same

38:01

Cures Act that we talked about earlier in the context of adaptive design. It's important to note that, like all things, nothing happens overnight, but we are now seeing real successes in this. In terms of alternative uses, 2019 had a very highly publicized success case: Ibrance, a drug from Pfizer, was approved for use in men, whereas previously it had only been approved for women with breast cancer.

38:26

That was based on the real-world data and real-world evidence they had gathered in-house, based on their surveillance and on previous studies. That's the kind of success which should accelerate interest in using real-world data and evidence in trial design and in trial analysis.

38:46

When we talk about designing trials, we talk about sample size a lot, of course, which assumes fixed values for unknown parameters. But all of these parameter estimates have uncertainty around them. We're supposed to do things like sensitivity analysis, but those are kind of ad hoc and post hoc. If we're talking in real terms about what we're interested in, we're interested in how likely our trial is to succeed, because our traditional power analysis is making very fixed

39:16

assumptions about what's going to happen. We have to fix those, and we can only vary them a little bit, ad hoc, to see what could happen. But what if we could do that in a more formal way? We could formalize the idea of how likely we think this trial is to succeed based on our uncertainty around those estimates, and then perhaps provide a new way to talk about how good this trial is and how useful this trial is in that light. One way we can do this, and this is obviously only one of many ways, is the idea of assurance.

39:50

What assurance is, really, is just a way of doing power analysis where, instead of assuming fixed values for parameters such as the effect size, we choose to consider those parameter estimates to have some uncertainty around them. Basically, it's the classic Bayesian thing of turning a fixed point into a distribution. This is based on the work of O'Hagan

40:11

et al.; obviously a lot more work has been done in this area since then. Basically, it's the expectation of power averaged over some prior distribution for one or more of the parameters. I would actually say that better candidates are things like the variance, but the effect size is probably the one that still gets the most attention. That's why this is sometimes called the true probability of success, because you're now saying: I'm not just assuming the hazard ratio is 0.66; I think it could be somewhere between, say, 0.5 and 0.7. That's the 95% interval I want to consider, and that's where most of the probability mass should be averaged over.
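Written as a formula, and using notation that is not from the talk ($\theta$ for the uncertain parameter such as the effect size, $\pi(\theta)$ for its prior density), the definition just given is:

$$
\text{Assurance} = \int \text{Power}(\theta)\,\pi(\theta)\,d\theta
$$

If the prior puts all of its mass on a single value of $\theta$, the integral collapses to the ordinary power at that value, which is why a fixed-parameter power calculation can be read as a special case.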

40:49

So think of it as a Bayesian analogue to your classic sensitivity analysis, except that instead of just picking ten different scenarios, you're now thinking of all the scenarios at once, because you've thought of a distribution that covers all of them. Just note that there is an analogous idea for conditional power, known as predictive power, which is basically doing the same thing, except that you're combining your prior with the interim data to create a new posterior for your best estimate, which you average

41:19

over in that case. That's a little different from assurance, which is purely based in the area of pre-study design, your classic sample size determination for a fixed-term trial. I've covered this in a lot more detail in previous webinars.

41:32

So if there's anything you want more detail on, feel free to get in touch; I'd be happy to share previous webinars or answer any individual questions. To illustrate the practical way of doing this in nQuery, I'm going to take a very simple cyclophosphamide study from the New England Journal of Medicine, where they were doing a two-sample t-test. We can very quickly illustrate this example by doing the calculation, which was just a 0.05 two-sided alpha level,

42:03

a mean difference of 9, a standard deviation of 16 per group and a power of 90%, I believe. That gives a sample size of 68 per group, which is the sample size here. If you adjust for dropout, note that there is a dropout allowance in the 163 figure. If we look at assurance, similar to what we saw in group sequential design, most of the parameters we require are pretty much the same. So this is an assurance table, and it looks pretty familiar if you're looking at the right things. We've got the significance level; okay, that's the same. We've got this

42:40

prior mean difference, so we'll assume the prior mean difference is the same as the mean difference we used in the sample size calculation, 9. We have these within-group standard deviations of 16, and we'll assume those are the same. And then we have a sample size of 68. Okay, that's pretty similar. So the only thing that's really left at this point is this prior variance for the difference.

43:05

We also get a little tau figure here. This tau figure is basically just your squared standard error: the sum of the variances divided by the sample sizes. You would usually take the square root of that for the standard error; this is what you have before you take the square root, the variance of the parameter estimate.
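The two quantities just mentioned can be checked by hand. The sketch below uses a normal approximation with a small-sample correction rather than the exact noncentral-t calculation a sample size package would run, so agreement with the 68 per group quoted in the talk is an approximation; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_sample(delta, sd, alpha=0.05, power=0.90):
    """Approximate per-group n for a two-sided, two-sample t-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_normal = 2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2
    # Guenther-style correction for using the t rather than the normal distribution
    return math.ceil(n_normal + z_alpha ** 2 / 4)


n = n_per_group_two_sample(delta=9, sd=16)

# Squared standard error of the mean difference, as the tau figure is described:
# the two group variances divided by their sample sizes, summed, no square root.
tau_squared = 16 ** 2 / n + 16 ** 2 / n

print(n, round(tau_squared, 2))  # 68 7.53
```

A prior variance of around 10, as used next in the talk, is on the same scale as this squared standard error.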

43:31

That's useful just for scaling, because we'd probably expect the prior variance to be on the same scale as this tau variable. So the only thing that's really new is this prior variance of the difference. Basically, if we go back to this distribution here, instead of assuming that the mean difference is exactly 9 (just imagine this is 9 on the number line),

43:50

we're actually going to assume that it's a distribution that's normally distributed and centered on 9, but with a custom variance. In this case, we'll pick a variance of around 10, and you'll see here an assurance of around 80%. So if we assume the mean difference is better characterized by a distribution with a mean of 9 and a variance of 10, then the assurance is around 80%. This is on the decimal scale, so just multiply by a hundred for 80 percent assurance. That's about a 10% drop versus the previous case. I would say it's usually recommended in assurance that you go for the one-sided equivalent.

44:31

That would be 0.025 at the one-sided level, primarily because there's an issue if we have a very liberal prior. We'll see this in a moment.
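For a normal prior on the mean difference there is a closed form that reproduces the order of magnitude of the figure above. This is a sketch under a known-variance (z-test) assumption, not the calculation nQuery performs; the function and argument names are hypothetical.

```python
import math
from statistics import NormalDist

_norm = NormalDist()


def assurance_normal_prior(prior_mean, prior_var, sd, n_per_group,
                           alpha=0.05, two_sided=True):
    """Assurance for a two-sample z-test with a normal prior on the difference."""
    tau2 = 2 * sd ** 2 / n_per_group  # squared standard error of the estimate
    tail = alpha / 2 if two_sided else alpha
    threshold = _norm.inv_cdf(1 - tail) * math.sqrt(tau2)
    # Averaged over the prior, the estimate is normal with variance tau2 + prior_var.
    marginal_sd = math.sqrt(tau2 + prior_var)
    upper = _norm.cdf((prior_mean - threshold) / marginal_sd)
    lower = _norm.cdf((-prior_mean - threshold) / marginal_sd) if two_sided else 0.0
    return upper + lower


two_sided = assurance_normal_prior(9, 10, 16, 68, alpha=0.05, two_sided=True)
one_sided = assurance_normal_prior(9, 10, 16, 68, alpha=0.025, two_sided=False)
fixed_power = assurance_normal_prior(9, 0, 16, 68)  # zero prior variance = plain power

print(round(two_sided, 2), round(one_sided, 2), round(fixed_power, 2))
```

With a prior variance of 10 the two versions are nearly identical, because almost none of the prior mass sits far enough below zero to trigger the lower tail; the difference only appears with much wider priors.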

44:41

I'll show you by plotting. If we had a very liberal prior, say a prior variance for the difference of a thousand, you could get very different results because of the lower-tail effect. That is to say, results in the lower tail, where you had very negative differences, would still count as significant in the two-sided case, but in clinical trials those really aren't of much importance; we usually don't think it's brilliant if our result was in the exact opposite direction of what we expected. So practically speaking there's an issue here that isn't really there for classic sample size determination: that lower-tail effect is worth considering, and I would recommend doing the one-sided analysis instead.

45:21

We can illustrate that by plotting. If we select our two columns by holding down the Control key and selecting the cells within each column, and go to the Plot User-Selected Rows option here in the toolbar (we can also do it from the Plot menu, of course), we can see the effect of plotting the prior variance for the difference against the assurance.

45:43

We're going to vary the x-axis, the variance for the difference, which is V0 (it's actually the Greek symbol that looks like a V), from 0.01 up to, as we said, a thousand. So let's go all the way up to a thousand, in increments of 1.

46:06

And what do you see here? Well, you see that effect, basically. Column one here is the one in dark and the other one isn't, so we'll make this a little easier: we can go to the series here and, for column one, just give that a colour.

46:23

We'll say red, just to make things clear. So column 1 is the two-sided case. You can see here that you get this trend where, as the variance increases, it reaches a low point but then goes back up, and what you'll actually find is that it climbs back up toward the original power of ninety percent, whereas the one-sided case will go all the way down and end up around 50 percent. The reason that happens is basically that, for the two-sided case,

46:51

if you imagine an infinite uniform prior, where all possible mean differences were equally likely under our prior, then for the two-sided case the vast majority of cases will be significant, because the non-significant region is just a small enclosed area around a mean difference of zero under a superiority hypothesis. For the one-sided case, you can imagine that half of them will be in the half that's significant and half will be in the other. Of course, we could extend this result easily to other cases. For example, we could look at a prior not only over the mean difference; we could also consider what the prior would be if we didn't know the variance with certainty, say if the variance came from a pilot study of around 10 or 20 people. Let's assume it's 16: we think it's actually 16, but our current estimate of 16 only came from a sample size of 20.
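The shape of the two curves in the plot can be explored numerically by splitting the two-sided assurance into its upper-tail and lower-tail contributions. This is again a known-variance sketch with hypothetical names, using the example's values (mean difference 9, standard deviation 16, 68 per group), not the software's own output.

```python
import math
from statistics import NormalDist

norm = NormalDist()
tau = math.sqrt(2 * 16 ** 2 / 68)       # standard error of the mean difference
threshold = norm.inv_cdf(0.975) * tau   # estimate needed for significance


def tail_contributions(prior_var, prior_mean=9.0):
    """Probability of a significant result in the upper and in the lower tail."""
    marginal_sd = math.sqrt(tau ** 2 + prior_var)
    upper = norm.cdf((prior_mean - threshold) / marginal_sd)
    lower = norm.cdf((-prior_mean - threshold) / marginal_sd)
    return upper, lower


for v in (0.01, 10, 100, 1000, 1e6):
    upper, lower = tail_contributions(v)
    print(f"{v:>10}: one-sided {upper:.3f}, two-sided {upper + lower:.3f}")
```

In this sketch the one-sided value falls toward 0.5 as the prior widens, while the two-sided value dips and then recovers as the lower tail starts to contribute. Pushing the variance far beyond the plotted range of 1,000 takes the two-sided value toward 1 in this closed form, which is the "vast majority of cases will be significant" argument taken to its limit.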

47:51

This is a simulation approach, so there are just some simulation parameters, and we have 68 here, and you can see the assurance updates. There are various other assurance options, such as a custom prior, where we have to assume some set of estimates for these values of 9 and 16. We can also, instead of setting a

48:21

prior based on a specific distribution, just say: well, I think there's a 50% chance that the mean difference of 9 is true. Then we'll look at 10, 11 and 12, and maybe a small chance of an underestimate.

48:41

And then we assign each of those a probability, say 0.1, 0.1, 0.1 and 0.2 respectively.

48:55

Sorry, that should be 0.5 up here. Then we set our sample size of 68 and we get an assurance of 91%. In this case, obviously, we've said that our original estimate is very likely correct, and therefore we end up with a result that's actually a little higher than the power, because we've been kind of optimistic here.
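A custom prior of this kind is just a weighted average of the power at each candidate value. In the sketch below the support points and their pairing with the probabilities are partly hypothetical: the talk gives 9 with probability 0.5 and the values 10, 11 and 12, but does not state the lower value, so 8 with probability 0.2 is an assumption made purely for illustration. Power is computed with a known-variance approximation.

```python
import math
from statistics import NormalDist

norm = NormalDist()


def power_two_sample_z(delta, sd, n_per_group, alpha=0.05):
    """Two-sided power of a two-sample z-test for a true difference delta."""
    se = math.sqrt(2 * sd ** 2 / n_per_group)
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(delta / se - z_crit) + norm.cdf(-delta / se - z_crit)


def assurance_discrete(prior, sd, n_per_group, alpha=0.05):
    """Average the power over a discrete prior given as {difference: probability}."""
    if abs(sum(prior.values()) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")
    return sum(p * power_two_sample_z(d, sd, n_per_group, alpha)
               for d, p in prior.items())


# Hypothetical custom prior: the value 8 and its weight are assumptions.
custom_prior = {9: 0.5, 10: 0.1, 11: 0.1, 12: 0.1, 8: 0.2}
result = assurance_discrete(custom_prior, sd=16, n_per_group=68)
print(round(result, 3))
```

Because most of the prior weight sits at or above the planning value of 9, the average comes out slightly above the fixed-parameter power, which is the optimistic behaviour described above.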

49:13

I've covered assurance in much greater detail in previous webinars. But if you're interested, this is an area where real-world data and real-world evidence come in: you can use real-world data, or you can talk to experts and use elicitation. There's a really good elicitation framework from O'Hagan called SHELF, the Sheffield Elicitation Framework, about how to combine real-world evidence and data to create priors for your expected results. You can feed these into assurance and then have a conversation about how likely you actually think you are to get a significant p-value given your uncertainty, rather than having to make these very strong or ad hoc assumptions for sample size or sensitivity analysis in general.

49:56

Finally, just in the final ten minutes or so,

50:00

I think there's one final trend we can talk about: early-stage adaptive design. This is really more of an ongoing trend, because over the last 10 years or so early-stage study designers have really been leading the charge in adaptive design, in terms of innovation and doing new things. When I talk about early stage, I mostly mean phase 1, our first safety evaluation, done in healthy volunteers, where we want to evaluate what the maximum tolerated dose, or MTD, is, and where the traditional approaches were based on simple rules; or phase 2 trials, where we have proof of concept, where we want to find out if there's a signal, and dose finding, where we want to find the doses appropriate for phase three. Those are usually separated into phase 2A and phase 2B for those two purposes. What we're seeing is lots of innovation tying these together into seamless designs, but also, even if you're still doing them separately, in how to do them in a more efficient manner and in a way that better reflects what you're actually trying to achieve. And because you have these high failure rates and higher uncertainty at this point, it makes a lot of sense to use adaptive design. If you have a high failure rate, it's kind of a selection problem, right? You want to get to the correct results quickly, because you expect most of these to be bad results. That's particularly a problem in phase two, where 80% of your candidates are likely to be failures.

51:24

You think, I just want to pick the other 20 percent as quickly as possible, and adaptive design is an obvious way to do that. In phase one, we literally know nothing about how this works in humans in an empirical sense, so let's have an adaptive design that reflects that and updates what we think in real time, basically. For phase 1 designs, which I won't be able to cover today but

51:45

have covered previously, we've seen Bayesian models like the continual reassessment method and the Bayesian logistic model, which are basically adaptively updating what the best dose to pick next is, and their efficiency has been shown to be better than the rule-based approaches in most situations. In phase 2, we've seen futility stopping, obviously, with Simon's design being ever-present for ages. But we also have this problem: if we're going to do dose selection, do I really want to do these two different designs?

52:20

The thing that was brought up was MCP-Mod, the Multiple Comparisons Procedure - Modelling approach, which basically combines the two pre-existing ways of doing a proof-of-concept phase 2A design and phase 2B dose finding. MCP is a very good way to quickly establish that a signal exists, but modelling is the way to go if you want to talk about what you actually expect to happen at a given dose, even including ones that you haven't considered so far. You can see here that MCP is robust but restricted to the selected doses, whereas modelling is flexible but requires you to make an explicit model choice. MCP-Mod allows you to design a study where you can consider multiple different models, say exponential, linear, logistic and so on, but you also have that power, that robustness, of being able to say: okay, an effect exists.

53:16

We've established that; only now can we consider the model of interest, and when we use that model we've done it in a way that ensures we aren't increasing our type 1 error rate, which would happen if we just did that willy-nilly without thinking about it, as opposed to the formalized approach proposed by Bretz et al. The EMA and FDA have both considered this to be fit for purpose. For that reason, this is one of the many methods, along with many variants of it, which are increasingly popular in phase 2 design.

53:46

So this is an ongoing trend of using things like MCP-Mod and its variants, and I think that's expected to grow even further in 2020; hopefully you can see why based on this very short presentation. I have covered this in more detail in a recent webinar, so if you're interested, I can pass that on. This example is taken from the foundational paper on designing with MCP-Mod, but due to time constraints we're going to go through it quite quickly. Once again, I have covered this in much more detail in previous webinars, so if you are interested in seeing the details, feel free to get in touch.

54:22

In this case, the first thing is our significance level, at the one-sided level, very similar to what we had before. Then we pick our number of doses, which includes our placebo dose, that is to say a dose where you're not getting any of the proposed treatment, and we're going to look at six different models in this case. We're going to say that we want the mean power across these six models to be equal to our target power, which is 80 percent.

54:50

We have a placebo effect of zero, a treatment effect of 0.4 and a standard deviation of one. Basically, a standard deviation of one means it's a standardized treatment effect, so I suppose we'd consider this a medium effect on the standardized scale. That mostly fills in the main table; we just need to fill in the side table, which is very similar to what we've seen before. The different dose levels here were 10, 25, 50, 100 and 150, and what's important

55:18

to note here is that this isn't the expected response at the given dose; this is the actual amount of the dose that you're getting, so 10 milligrams or 10 millilitres, that type of thing. Obviously the dose for placebo has to be 0 of the thing of interest. Then we quickly enter the models. Many of the models require additional parameters; the linear model is an example of one that doesn't, it kind of defines itself, but other models such as the Emax require you to specify things like the ED50.

55:46

That's what the thing in the brackets is. The ED50, of course, is just the dose at which we expect 50% of the maximum effect to occur. The other ones were the logistic, which also needs an ED50, in this case 50, so you can tell that under different models

56:02

you have different assumptions about what you expect the ED50 to be, so the Emax is expecting a more aggressive effect up front. Then there's the exponential model, which just has your exponential rate, here 85.
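To see how these candidate shapes differ, the sketch below evaluates standardized versions of four of the models at the dose levels quoted in the talk, rescaled so that placebo gives 0 and the top dose gives the assumed maximum effect of 0.4. The functional forms are the standard textbook ones; the Emax ED50 of 25 and the logistic slope of 10 are hypothetical values chosen for illustration, while the logistic ED50 of 50 and the exponential parameter of 85 are the ones mentioned above.

```python
import math

DOSES = [0, 10, 25, 50, 100, 150]  # dose levels from the talk, placebo = 0


def linear(dose):
    return float(dose)


def emax(dose, ed50):
    return dose / (ed50 + dose)


def logistic(dose, ed50, delta):
    return 1.0 / (1.0 + math.exp((ed50 - dose) / delta))


def exponential(dose, delta):
    return math.exp(dose / delta) - 1.0


def standardised_curve(model, doses, max_effect=0.4, **params):
    """Rescale a monotone model so placebo maps to 0 and the top dose to max_effect."""
    raw = [model(d, **params) for d in doses]
    span = raw[-1] - raw[0]
    return [max_effect * (r - raw[0]) / span for r in raw]


curves = {
    "linear": standardised_curve(linear, DOSES),
    "emax": standardised_curve(emax, DOSES, ed50=25),                     # ED50 hypothetical
    "logistic": standardised_curve(logistic, DOSES, ed50=50, delta=10),   # slope hypothetical
    "exponential": standardised_curve(exponential, DOSES, delta=85),
}

for name, curve in curves.items():
    print(f"{name:>12}", [round(v, 3) for v in curve])
```

The printout makes the "more aggressive effect up front" point concrete: at the low doses the Emax curve has already reached a large share of the maximum effect, while the exponential curve has barely moved.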

56:20

And then we have our two beta models. Our beta models are our two most flexible models; they allow for things like non-monotonic relationships. I can't go into them in too much detail today. In this case, we're selecting parameters of 0.33 and 2.31. If you're familiar with the beta distribution, these are basically the same parameters as are used for the beta distribution.

56:45

So the peak, the most likely dose, will be at parameter one divided by the sum of the two parameters, so 0.33 / (0.33 + 2.31). Based on that you'd expect, unsurprisingly, that

57:05

this is one which assumes the most likely dose is very early, and this other one is where it's around the middle, because obviously something divided by the sum of itself and itself, so itself times 2, is going to be 50%. The scaling parameter means you're stretching the classic beta range, which is 0 to 1, up to 0 to 200.
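The peak location described above follows from the usual form of the beta dose-response model, which is sketched here with hypothetical function names. The first parameter pair (0.33, 2.31) and the scale of 200 are from the talk; the second pair is not stated, so equal parameters of 1.0 are used purely to illustrate the "around the middle" case.

```python
def beta_model(dose, delta1, delta2, scale):
    """Standardized beta dose-response shape, normalised so its maximum is 1."""
    x = dose / scale
    total = delta1 + delta2
    normaliser = total ** total / (delta1 ** delta1 * delta2 ** delta2)
    return normaliser * x ** delta1 * (1.0 - x) ** delta2


def peak_dose(delta1, delta2, scale):
    """Dose at which the beta model reaches its maximum effect."""
    return scale * delta1 / (delta1 + delta2)


early_peak = peak_dose(0.33, 2.31, 200)   # parameters quoted in the talk
middle_peak = peak_dose(1.0, 1.0, 200)    # hypothetical equal parameters

print(round(early_peak, 1), round(middle_peak, 1))  # 25.0 100.0
print([round(beta_model(d, 0.33, 2.31, 200), 3) for d in (0, 10, 25, 50, 100, 150)])
```

With the quoted parameters the effect peaks at a dose of about 25 and then declines across the higher doses, which is the non-monotonic behaviour the beta models are included to capture.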

57:26

So we enter our power of 80%, and it takes a while to run, because we're using a multivariate t distribution, which is a little bit slower. That's going to run for a while, and I'll return to it, because we're running a little bit behind time. So just some notes on MCP-Mod and some of the things that are available in nQuery. Of course, you don't have to select just one model; you could select multiple models, and you could select the models based on criteria such as the p-value you get for each model, but there are other criteria such as the AIC or

57:59

BIC, which are probably the two main ones; I'd say AIC is probably the one that people default to. And as I mentioned, there are various extensions that have been proposed which may be of interest to you.

58:10

On to the discussion and conclusions. This is running in the background; I'll check the results at the end, just for completeness.

58:16

There's been a desire to improve trial efficiency and innovation at all stages of the clinical trial process, coming from industry, of course, coming from legislative bodies like Congress, and coming from patients, which I think is an underestimated point. In phase three, adaptive trials are growing with that certainty coming from the FDA, and of course the ICH group's work could accelerate that even further in a few years' time. Sample size re-estimation is a classic easy starting point, but obviously the world is your oyster when you choose to move beyond that. Real-world data is being increasingly used to increase trial efficiency, and assurance is an example of where real-world data could be used to improve how you talk about trial design, or how you consider trial design from the sample size perspective. But obviously post-marketing surveillance and alternative uses are

59:09

other ways that real-world evidence is being used in the real world. Early-stage adaptive designs are already very prevalent, and innovation is happening here at basically breakneck speed. You should definitely keep abreast of the latest innovations in this area; sample size determination and design for those are obviously important as well.

59:36

So while the MCP-Mod calculation finishes up, if you have any questions based on anything we covered today, feel free to ask them now, but you are also free to email me at info@statsols.com. Before we finish up, I just want to mention that if you do not have a license for nQuery, or are interested in trying out a feature that's not currently included in yours, such as nQuery Pro,

1:00:00

you can go to statsols.com/trial, where you can get a free trial of the software just by filling in some basic details about yourself. It will start within the browser, you can play around with it for a week or so and then decide whether you want to purchase the software or not. It's very easy and very clean. So feel free to use statsols.com/trial if you want to try anything, either if you don't have nQuery or if there are features that aren't on your current license. And nQuery is obviously expanding all the time

1:00:36

into different areas. So if there's any area of interest to you, or any designs that aren't being covered by nQuery, do get in touch. We are always looking for feedback on what we should be covering next.

1:00:48

So just for completeness.

1:00:49

Here's the MCP-Mod analysis we ran. The main point to take away: we had a sample size of 372, or 62 per dose level, and you can see here that we have these powers per model. So the logistic has about 90% power, with about 70-something percent for most of the other ones, and it's the average of these that is displayed here, but we could also choose to use the minimum, the maximum or the median instead.

1:01:14

The important point to take away is that we have these details about each model, with additional details here. If you want to know more about this, do feel free to get in touch and we can send on videos; I covered this for a whole webinar in a previous instalment of our webinar series.

1:01:31

The references are also available here, and there's also some introductory material that you can find in the webinars tab at statsols.com/stars. The references are available in the slide deck, and the slides and the recording will be sent to you later today, hopefully. I'll just take a moment here to look at any questions that came in, and I'll be back to you in a moment. But for anyone who has to leave, thanks for attending and I hope to see you in the future.

1:02:04

Okay, there's a couple of questions. One main question, from Frank Poitier: do you think it could be possible to use real-world evidence data as a prior in the context of a proof-of-concept study design? Well, assurance is a very flexible concept.

1:02:21

A proof-of-concept design would probably be using something like a Simon's design or something similar, and I don't see any reason why not, although obviously that's not covered in nQuery right now. I see no real reason why you wouldn't be able to take a simple prior for, say, your expected treatment proportion for a proof-of-concept design and see what the different powers would be based on, say, fixing the stage 1 and stage 2 results. That would certainly be very possible, and it is something you could probably do now if you're using a more trivial one-sample design, let's say just looking at a proportion versus a historical control as your proof-of-concept design, rather than something more complicated like the Simon design.

1:03:05

So, say, a one-sample design: that is covered in nQuery already, so you could go to...

1:03:18

You go to proportions, fixed-term design, and then you can see Bayesian assurance here with a beta prior for one sample. You know, if you want a simple one-sample design, you could do an assurance calculation based on that as well, and you could perhaps even use that as a kind of pseudo-Simon's design to cover that too.
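To make the idea of assurance with a beta prior concrete, here is a minimal sketch for a one-sample proportion compared against a historic control. It assumes a one-sided exact binomial test, and the sample size, null proportion, and prior parameters are made-up illustration values. It is not nQuery's implementation, just the textbook idea: assurance is power averaged over the prior, which for a beta prior is a beta-binomial tail probability.

```python
from math import comb, exp, lgamma


def binom_tail(n, c, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def critical_value(n, p0, alpha):
    """Smallest c with P(X >= c | p0) <= alpha (one-sided exact test)."""
    for c in range(n + 2):
        if binom_tail(n, c, p0) <= alpha:
            return c


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binom_pmf(k, n, a, b):
    """P(X = k) when X | p ~ Binomial(n, p) and p ~ Beta(a, b)."""
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))


def assurance(n, p0, alpha, a, b):
    """Power of the exact test averaged over a Beta(a, b) prior on p."""
    c = critical_value(n, p0, alpha)
    return sum(beta_binom_pmf(k, n, a, b) for k in range(c, n + 1))


# Illustration values only (assumptions, not from the talk):
n, p0, alpha = 40, 0.20, 0.05
c = critical_value(n, p0, alpha)
print("power at p = 0.4:", round(binom_tail(n, c, 0.4), 3))
print("assurance, Beta(4, 6) prior:", round(assurance(n, p0, alpha, 4, 6), 3))
```

A prior built from real-world evidence would enter through the choice of `a` and `b`; a more concentrated prior pulls the assurance towards the classical power at the prior mean.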

1:03:37

And there is one other question there about predictive power as an alternative in unblinded sample size re-estimation. And yeah, there is a lot of debate, there is a lot of criticism of conditional power because it makes these very strong assumptions about what the real effect size is. But most of the work happening up to this point has focused on using conditional power as the criterion, and so the operating characteristics if you use predictive power may be different, and there's also the obvious question coming from that.

1:04:06

Well, how do you pick the initial prior that you're going to use to then generate the interim posterior, effectively? Because the prior you choose to combine with the interim data to create the posterior that you average over for predictive power could end up being very important. So that's another place where the real-world evidence stuff could come into it. So I think predictive power is probably one of the things that people think we should be doing, but it isn't currently the way that people are doing it, because people haven't investigated it enough or don't have enough certainty about it to do it yet.
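The contrast between conditional and predictive power can be sketched under a standard normal approximation. Everything in this block is an illustrative assumption rather than the method used in any particular software: the effect is parameterised as a drift `theta` (the expected final z-statistic), `t` is the information fraction at the interim, and the prior on `theta` is normal. Conditional power fixes `theta` at one value; predictive power averages over the interim posterior, so the choice of prior feeds directly into the result.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def conditional_power(z_interim, t, theta, alpha=0.025):
    """P(final Z > z_crit | interim data), assuming the drift equals theta."""
    z_crit = _STD.inv_cdf(1 - alpha)
    b_t = z_interim * sqrt(t)  # score process B(t) at the interim
    return _STD.cdf((b_t + theta * (1 - t) - z_crit) / sqrt(1 - t))


def posterior_drift(z_interim, t, prior_mean, prior_var):
    """Normal posterior (mean, variance) for theta, using B(t) ~ N(theta*t, t)."""
    precision = 1 / prior_var + t
    post_mean = (prior_mean / prior_var + z_interim * sqrt(t)) / precision
    return post_mean, 1 / precision


def predictive_power(z_interim, t, prior_mean, prior_var, alpha=0.025):
    """Conditional power averaged over the interim posterior of theta."""
    z_crit = _STD.inv_cdf(1 - alpha)
    m, v = posterior_drift(z_interim, t, prior_mean, prior_var)
    b_t = z_interim * sqrt(t)
    # Remaining increment has variance (1 - t) given theta, plus the
    # contribution of posterior uncertainty in theta.
    sd = sqrt((1 - t) + v * (1 - t) ** 2)
    return _STD.cdf((b_t + m * (1 - t) - z_crit) / sd)


# Illustration values only: interim z = 1.0 at half the information.
z1, t = 1.0, 0.5
print("CP at the planned drift 3.0:", conditional_power(z1, t, theta=3.0))
print("CP at the current trend:", conditional_power(z1, t, theta=z1 / sqrt(t)))
print("PP, vague prior:", predictive_power(z1, t, prior_mean=0.0, prior_var=1e6))
print("PP, informative prior:", predictive_power(z1, t, prior_mean=3.0, prior_var=1.0))
```

A very tight prior makes predictive power collapse to conditional power at the prior mean, which is one way to see why the choice of prior matters so much.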

1:04:36

So, I think that's us done for today. Once again, I just want to thank you so much for coming to today's webinar. And once again, if you want to get in contact, it's info at statsols.com. So once again, thank you so much for coming, and have a good day. And I hope to see you soon.


Copyright © Statsols 2020, All Rights Reserved. Privacy Policy
