44 min read

Written by nQuery Sample Size Software Team

March 18, 2020

**"Designing Studies with Recurrent Events - Model choices, pitfalls and group sequential design"** examines the important design considerations for analyzing recurring events and counts.

Model choices, pitfalls and group sequential design

Recurring events are common in clinical trials (e.g. COPD exacerbations, MS relapses) but have often been analysed using survival models or other approximations. But these simple approaches fail to use every event.

This has led to increasing interest in recurring event and count models and how these allow us to analyse all recurring events or counts and thus provide additional insight and power.

In this webinar, we will cover the different options for the analysis of recurring events and counts, the sample size calculations available for this area including adjustments for unequal follow-up and dropout, and extensions of the group sequential design framework to the recurring event endpoint.

- What are recurring events and how should we analyse them
- The design issues, models and pitfalls in recurring events
- Sample size determination for recurring events and counts
- Group sequential design for recurring events

**Slide preview from the webinar**

**To watch the recording of this webinar, just click the image below.**

**Duration:** 60 minutes

Transcripts of webinar

*Please note, this is auto generated, some spelling and grammatical differences may occur*

0:03

Hello everyone, and welcome to today's webinar, "Designing Studies with Recurrent Events: Model Choices, Pitfalls and Group Sequential Design". Today's webinar will cover the area of recurrent events, counts and incidence rates.

There are a lot of different names for this, but basically it is any type of data point that can occur multiple times over a given time unit or some other type of unit, say a spatial unit.

....

1:41

So in terms of what we'll be covering today: we will first look at recurrent event data and the types of models that are common in this space. We will then look at sample size for recurrent event models, and then at something a bit newer, which is group sequential design for recurrent events, and some of the particular issues that come with it. Perhaps it's a kind of opener to what we can expect in terms of adaptive design for recurrent event models as we go forward in time. And then at the end we'll have time for discussion and conclusions, and to answer

2:11

questions that you may have had. Just so you know, today's webinar is sponsored by nQuery. nQuery is one of the world's leading sample size platforms, which can help you design a trial that's faster, less costly and more successful. It is the number one solution for optimizing your clinical trial design process and is used by over 90% of organizations who had clinical trials approved by the FDA. Some of the feedback we've had from users is here on the screen, along with some of the companies and organizations we work with at the bottom here.

2:43

So, because this has been sponsored by nQuery, I'll be demonstrating most of the decisions and ideas via nQuery when we get to the software demonstration part, but most of the things being talked about here will be relevant regardless of whether you are using nQuery or not.

3:00

So before we get started in today's webinar, it's important to note that recurrent event data is something that's common but which has perhaps been underexplored, despite the prevalence of this data type in clinical trials. Recurrent event processes are any endpoint where a subject could have more than one informative event in a given time period. The classic example is your London buses: how many of them will appear

3:30

in 10 minutes or an hour? That's the kind of thing we mean by recurring events. These are your classic Poisson-type or negative binomial-type processes. In chronic diseases particularly, but also in a variety of other areas, it is quite a common endpoint: something happening multiple times in a time unit isn't particularly unusual, whether that's admissions to a hospital or something in a more specific area, like respiratory diseases.

3:55

It's very common to have a chronic disease such as COPD or asthma where you have these exacerbations, asthma attacks for example, and several of those could occur in any given time period. Other areas where this might happen would be relapses in multiple sclerosis; headaches or migraines would be a classic example; and seizures, say epileptic seizures, where you can have multiple seizures in a given time period. What's common to all these cases is that we have a time scale, say a year, and these events could happen multiple times, and each one is equally, or around equally, as important as the first or the last one. So we're getting all this information, and what's happened in the past is basically that this has been ignored. Now, it's important to note that most of what we're talking about here is pretty much the same for what we call count data. Recurrent events, or incidence rates as another name, means these kinds of things that happen over a time unit. A count, I suppose, is what we'll talk about if the unit is not a time unit, say a spatial unit. In imaging studies, for example, you may look at how many abnormalities there are in a given square unit of an image and treat that as some kind of Poisson-type process. Historically there have been lots of examples of this.

5:22

For example, a very famous example during World War Two showed that V2 rocket attacks were actually completely consistent with random chance in terms of the area being attacked, and that was a great reassurance at the time. Epidemiology is another case where those kinds of counts may be important. So for recurrent events think time unit; for counts think another type of unit, usually a spatial unit.

5:47

And as I said here, we should be treating these kinds of data, if we can, basically as they exist. Maybe it's a general point that when we're modeling real-world phenomena statistically, we should be trying to get a model that aligns as closely as possible with the reality of what exists in the actual field or what happens in the actual trial. If these events are happening multiple times and each of them is important, as you would say for asthma attacks, where each one is equally important, or where how many attacks of a certain type you are likely to get

6:22

is equally important, then it makes sense to do that rather than to simplify this down to other endpoints. Classically, what would happen is that instead of considering every event you might only consider time to the first event, which is basically survival analysis. Or you might take the count rates, say 1.5 events per year in group 1 and 1 event per year in group 2, consider those to be normally distributed, and use a t-test or something else like that,

6:52

or a simple square root transformation, if you're particularly interested. But all of these simplify things down and remove the subtlety implied by all of these events occurring, and that basically limits the kind of inferences you can make about, say, not just how many events occurred but how long it took between events, and other such issues.

7:16

And so when we get to actually analyzing recurrent events as they are, effectively with models that take account of the fact that you're going to be looking at multiple events, I think you can split them into three broad categories. The first one, and the one you're probably most familiar with, is what we call event rate models. These are basically just trying to give you an estimate of how many events you expect on average per time unit, so you're basically getting a rate out of them: 10 buses per hour,

7:44

or three asthma attacks per year, those types of scales. Most of these models tend to be parametric: they're either Poisson models, negative binomial models or similar models, which basically impose a parametric distribution on how this rate behaves, or how

8:05

this rate is likely to be distributed given different assumptions about the mean, the variance, etc. The Poisson models also include a lot of variations. Probably the most commonly used is the quasi-Poisson, which we'll talk about a bit later on, but basically it's one that can account for overdispersion. The negative binomial is another model that can take care of overdispersion. Just note that there are also variations like zero-inflated and zero-truncated models to deal with the issue of excess zeros or of not wanting to have zeros. We're not going to go into too much detail on those special cases, but they really are there to

8:44

deal with the situation where you have too many zeros or you don't want to think about zeros. These are your classic event rate models, I suppose. We've all heard of the Poisson model, and the negative binomial is in a very similar space. These are about thinking of event rates and of inferences around the comparison of event rates. What's less common in statistical analysis, or at least less likely to have come up in your classic statistics course, are these extensions of time-to-event models to time-to-events models.

9:14

You'll note there that it's events, not event. So now we're not thinking about the time to the first event, which is your classic survival analysis. We're going further and thinking about the time until two events or three events, or perhaps the time until the next event, or the time between events.

9:37

And these are the kind of different choices you could have in terms of what you're targeting.

9:40

But all of them are basically extending your survival analysis framework, and in particular taking inspiration from things like the Cox regression model: semi-parametric models which don't make assumptions about what the underlying event rate is, but do make some assumptions, proportional hazards for example, which basically says that the ratio of the hazards in the two groups is kept constant over time even if the underlying rate changes over time. And so you have models like the Andersen-Gill, the Wei-Lin-Weissfeld and the Prentice-Williams-Peterson here. I suppose the big difference is that the Andersen-Gill is probably closer to your parametric models in terms of how it thinks about things,

10:23

whereas these other two models are more interested in the time to a specific set of events, say the first three events or the first two events, with that number of events being pre-specified. We're not going to talk about it in today's webinar, but it is worth noting that there is also the nonparametric option of simply creating your mean cumulative function, which is basically just how many events you expect a year from now, two years from now, and so on. This is very similar to your Kaplan-Meier analysis or your Nelson-Aalen-type statistic: it's basically counting up how much has happened by time x and then making inferences based on that. As for sample size calculation, these nonparametric approaches don't generally have sample size calculations, except perhaps for precision.
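To make the "counting up how much has happened by time x" idea concrete, here is a minimal sketch of a Nelson-Aalen-type estimate of the mean cumulative function. The function name, the input layout and the three toy subjects are all hypothetical and are not taken from the webinar or from nQuery; the sketch assumes each subject is under observation from time 0 until their own follow-up end.

```python
from collections import Counter


def mean_cumulative_function(subjects):
    """Nelson-Aalen-type estimate of the mean cumulative function (MCF).

    subjects: list of (event_times, follow_up_end) pairs, one per subject.
    Returns a list of (time, mcf) pairs at each distinct event time.
    """
    # Count events at each distinct time, ignoring any recorded after follow-up ends.
    event_counts = Counter(
        t for times, end in subjects for t in times if t <= end
    )
    mcf, running_total = [], 0.0
    for t in sorted(event_counts):
        # Subjects still under observation at time t.
        at_risk = sum(1 for _, end in subjects if end >= t)
        running_total += event_counts[t] / at_risk
        mcf.append((t, running_total))
    return mcf


# Hypothetical toy data: (event times in years, end of follow-up in years).
toy_subjects = [([1.0, 3.0], 4.0), ([2.0], 2.5), ([], 5.0)]
for time, value in mean_cumulative_function(toy_subjects):
    print(f"t={time:.1f}  MCF={value:.3f}")
```

Each step adds the events observed at that time divided by the number of subjects still being followed, so subjects with short follow-up stop contributing to the denominator once they leave.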

11:07

I think it's with these time-to-events models and event rate models that you're talking about inferences, power, p-values, etc. in this case.

11:18

And so there are a lot of different options out there, and the question you must answer is: how am I going to pick between these models, and between the different design decisions? There are some questions you would have in any situation, such as what is the primary question or target of inference. Even with event rates there are multiple ways to ask the same question. Classically we might default

11:48

to looking at the event rates or the event rate ratio, so the ratio between 1.5 events per year in group 1 and 1 event per year in group 2; we want to make inferences comparing those two. But you could also look at inferences around the number of events that will have happened by a certain time, or the time until two events or three events occur, or what the average time between events is.

12:15

Those are all different questions that you could ask. We will mostly focus today on the first one, the idea that you're comparing event rates. The event rate ratio is the primary idea, along with the quite similar idea of the intensity or hazard ratio, which for our purposes we'll say is more or less the same thing. Then there are also these questions: are we sure that we're interested in all events equally, or are we only interested in the first K events? Are we more interested in later events? Are those more severe?

12:48

That is something to worry about. For example, if you're looking at cardiology and at admissions for heart disease, you often expect that later admissions will be more severe. That creates complications for the model. For this webinar's purposes we will mostly focus on models that assume that events are equally important. And even once we get beyond what we're interested in, we often have to ask about the assumptions we're making in terms of which model is appropriate for that.

13:17

Are we looking at independent events, or is there, say, an effect from a day ago on how likely the event is to happen today? We also have to deal with non-informative censoring, and we still have to deal with overdispersion in the case of, say, the parametric models.

13:34

And that comes down to: do we expect the average event rate to be different per person, or do we roughly expect everyone in the same category to more or less have the same event rate? Does the event rate process change over time, so that more events happen later in the year than earlier in the year? Terminal events are also quite important for the primary analysis: do some events just end the story, basically? For example, if someone has died, they're obviously not going to have a repeat of that event. These are all considerations. Today we don't have time to go into these fully, but these are all complications that you will need to worry about. But there are also the classic cases, where we have a chronic disease in which events occur at a certain rate per year and we want to reduce the number of times a particular negative event, like an asthma attack or other type of exacerbation, occurs. And then things are relatively simple.

14:34

That is the kind of case that most of today will be about designing for, but do be aware of these other complications that exist. If you wanted to look at a small survey of count model examples, we have the Poisson model and negative binomial models, which are very similar except that one assumes that the event rate is the same for everyone in that group, on average I should say, whereas the negative binomial says people have their own different event rates per person,

15:02

and you also have a parameter for that. Then we have the Andersen-Gill, Wei-Lin-Weissfeld and Prentice-Williams-Peterson, these time-to-events models, with the Andersen-Gill looking at the intensity of events, kind of averaging over how likely an event is to happen in terms of time. That's the intensity or the hazard, which gives you a hazard ratio, basically. The Wei-Lin-Weissfeld and Prentice-Williams-Peterson are about how long until X events or K events, or the gap between K events. Then there is the nonparametric MCF,

15:34

as I mentioned here. We also mentioned these approximate models: survival models like time to the first event, or your t-test if you're going to take a normal assumption for comparing two rates. There's also one I would very much recommend not doing, where you say one group had 50% of people with an event and the other group had 60% with an event, and compare those as if they were binomial. I very much wouldn't recommend that if you have this repeated events type of data, and yet it's still very common. For today's webinar we're going to focus on these first three models: the Poisson model, the negative binomial and the Andersen-Gill.

16:14

So that's a nice little overview of recurrent event and count models and the different considerations that go into designing them. What we're going to do now is talk about sample size for recurring event models, and hopefully by looking at sample size we'll also get an idea of the effects of some of these design decisions that we've already considered, and of some of the ones that aren't really assumptions but are actually just design considerations for a model.

16:41

So as I mentioned, the three we'll look at are the Poisson, negative binomial and Andersen-Gill models. In terms of sample size determination papers, which is what's on the slide in terms of these references, for the Poisson model there are many, many papers, but the main ones we'll focus on today are Signorini for Poisson regression in '91,

17:00

Lehr in '92 with a very simple approximation, and Gu et al., later work with a more flexible approach to Poisson data. All of these are for the Poisson model, and in all these cases we're using the two-sample design, where we want to compare rates in two groups. The Poisson model is a parametric model, and it assumes a constant rate ratio, but it also assumes the rate is constant over time and the same per subject. This is basically your equidispersion assumption, which is that the variance and the mean rate are the same.
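As a rough illustration of the "very simple approximation" style of calculation, the sketch below uses the textbook square-root rule for two Poisson rates: the square root of a Poisson count has variance of about 1/4, which leads to roughly 4 / (sqrt(rate1) - sqrt(rate2))^2 subjects per group for 80% power at a two-sided 5% level. The function name and arguments are hypothetical, and this is an approximation under the pure Poisson assumption, not the method used by nQuery or by any particular paper cited in the webinar.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sqrt_rule_n_per_group(rate1, rate2, alpha=0.05, power=0.80, exposure=1.0):
    """Per-group sample size from the square-root (variance-stabilising) rule.

    Assumes Poisson counts with mean rate * exposure in each subject, so that
    sqrt(count) has variance of about 1/4 regardless of the rate.
    """
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    diff = sqrt(rate1 * exposure) - sqrt(rate2 * exposure)
    return ceil(z_total ** 2 / (2 * diff ** 2))


# Hypothetical usage with the rates discussed later in the webinar example.
print(sqrt_rule_n_per_group(1.4, 1.05))              # 80% power
print(sqrt_rule_n_per_group(1.4, 1.05, power=0.90))  # 90% power
```

Because this ignores overdispersion, it will generally suggest fewer subjects than a negative binomial calculation with the same rates.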

17:36

So basically, if everyone's rate is on average the same, if everyone's coming from the same mean rate in each group, and that's the same for everyone, then that's what the Poisson model is assuming.

17:50

Unless you use a quasi-Poisson, which we won't talk about too much today. That is a model that extends the Poisson to allow for overdispersion, where the variance is higher than the mean rate. Basically, we don't get that nice behavior we expect where everyone's coming from the same rate; people are having different rates on average within each group. Then we have the negative binomial model, where the initial work on sample size is from Zhu and Lakatos, and then there were extensions by Tang over the

18:19

last six years or so. This is very similar to the Poisson, but instead of assuming that every subject comes from the same Poisson process, or has the same Poisson rate, we instead assume that everyone gets their own Poisson rate, and that those Poisson rates have been sampled from a gamma distribution.
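The gamma-mixed Poisson picture can be checked with a small simulation. The sketch below draws each subject's own rate from a gamma distribution and then draws that subject's count from a Poisson with that rate. It assumes the common parameterisation in which a dispersion parameter k gives variance = mean + k * mean^2; the function names, the seed and the chosen values (mean rate 1.4, k = 0.7) are illustrative assumptions, not output from nQuery.

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_counts(n, mean_rate, dispersion, seed=1):
    """Simulate n subject-level counts from a gamma-mixed Poisson."""
    rng = random.Random(seed)
    shape = 1.0 / dispersion        # gamma shape
    scale = mean_rate * dispersion  # gamma scale, so the gamma mean is mean_rate
    return [poisson_draw(rng, rng.gammavariate(shape, scale)) for _ in range(n)]


counts = simulate_counts(20_000, mean_rate=1.4, dispersion=0.7)
print("sample mean     :", round(mean(counts), 3))
print("sample variance :", round(variance(counts), 3))
print("mean + k*mean^2 :", round(1.4 + 0.7 * 1.4 ** 2, 3))
```

Under a pure Poisson model the sample variance would sit close to the sample mean; here it should sit well above it, which is the overdispersion the negative binomial is designed to capture.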

18:39

That's the most common way of thinking about the negative binomial model when talking about this type of recurrent event or count modeling: just think of it as everyone getting their own Poisson rate, with those Poisson rates having been pulled from the gamma bag, the gamma distribution bag I suppose. And then finally we'll talk about the Andersen-Gill, which is an example of these semi-parametric models. It's an extension of the Cox regression model, but instead of looking at the hazard for the first event, we're now going to look at the hazard combined across the time to each event, basically averaging those over each other to get something like the average time to event, and then using that as an estimate of the expected hazard.

19:19

That is the equivalent of the hazard ratio, the rate ratio, whatever you want to call it, that exists for this model. You're actually looking at, I suppose, the combination of multiple hazard ratios into a kind of weighted mean hazard ratio, but for practical purposes in today's webinar it's basically equivalent to the rate ratio being used for these other two models. I think the nice thing about this, the big thing that it has over the negative binomial and Poisson, is that you don't have to assume that the rates

19:49

are constant over time. In the negative binomial and Poisson you still have to assume that, for each individual, their underlying event rate, how likely they are to have an event, does not change over time; basically, I'm as likely to have an event at the start as I am at the end. Whereas with the Andersen-Gill and these other semi-parametric models, you have the flexibility to change what the underlying rate is, and it can be arbitrary as long as the proportional hazards assumption is met. So for example, you could have a Weibull-type distribution instead

20:19

of the constant event rate assumption.

20:26

And so you have this much greater flexibility, and this is also reflected in the sample size determination, especially in the work from Tang and Fitzpatrick from 2019.

20:38

In terms of sample size for recurrent events, sample size determination in the early days was very much focused on simple approximations. I think Lehr in 1992 gave a very nice one, very much a beginner's version, the kind you see in your introductory textbooks, using the square root transformation. But as time has gone on we've seen a lot more done for Poisson regression and Poisson models in general, including for the quasi-Poisson,

21:06

albeit the work on that I would consider to be still ongoing. Then what's happened in the last five to ten years or so is that there's been a move away from these classic Poisson models and from the two-sample or one-sample case, extending to the other types of commonly used recurrent event and count models like the negative binomial and the Andersen-Gill, and also extending to a much wider variety of different design decisions.

21:36

So, more than two groups, things like stepped-wedge or crossover designs, and also far more flexibility on things like whether there is unequal follow-up per subject. For example, similar to survival studies, the time you're followed up for depends on when you were recruited to the study. Let's say you recruit over a year and follow up for a number of years: someone recruited at the very start of accrual is followed for two years, while someone recruited at the very end of accrual is only followed for one year.

22:07

And dropout, and lots of other considerations that we'll talk about more in a moment.

22:11

But recent work has also extended to things like non-inferiority and equivalence analysis. So there's been a great amount of work in recent years on extending sample size determination methods to a much wider variety of these designs, and also on making them far more flexible than some of the initial derivations from the 90s and early 2000s were. I think these have reached a level of maturity over the last five years that is a major step forward in terms of these sample size calculations reflecting the practical constraints that exist in designing these studies.

22:45

So hopefully that's given you an overview of what we're covering today and of the sample size methods. This isn't really going to be a math class, because most of you are probably more interested in the practicalities, not the mathematics of deriving these equations. So what I'm going to do right now is go straight into an example: a study of once-daily fluticasone furoate and vilanterol combination therapy versus vilanterol only for reducing exacerbations in COPD.

23:14

And this is a fairly simple example. You can see the sample size calculation on the left-hand side here and the tabular summary on the right-hand side. This paper was from The Lancet Respiratory Medicine in 2013, and they were assuming a negative binomial model in this case, because we're worried about overdispersion, meaning that the variance will be higher than our mean rate. What we're interested in doing here is effectively just seeing what sample size is required, assuming we're

23:43

following everyone up for around a year. We have a control incidence rate of 1.4 and a rate ratio of 0.75. You could do the calculation yourself, but basically you multiply 1.4 by 0.75 to get an event rate of 1.05 in the treatment group. And we have this dispersion parameter of 0.7.

24:06

So the first thing that we'll do is replicate that example in nQuery. nQuery is a sample size determination platform that allows you to do sample size calculations for a wide variety of different scenarios. For anyone who hasn't seen nQuery before, on the left-hand side here you'll see the inputs required for a calculation: our significance level, our dispersion parameter, and so on. Each column is an individual calculation,

24:35

and the yellow rows are those for which a calculation can occur, assuming enough information has been given. So we can calculate the sample size, the power or the rate ratio here. There are also these drop-downs where we can pick different options, but we won't worry too much about them at the moment.

24:56

You'll also notice that as we select different rows, a little help card on the right here gives you some additional information and context on how you may want to derive or estimate that input.

25:10

So the first thing we'll do is replicate the previous example, where they had a 5% or 0.05 alpha level and a group 1 mean incidence rate of 1.4. So they are saying there are 1.4 COPD exacerbations per year in the control group, which was vilanterol only, and a rate ratio of 0.75, which, as I said, if you multiply 1.4 by 0.75 gives you 1.05. So before their study they were expecting about 1.05

25:39

COPD exacerbations on average in the treatment group. So a fairly significant reduction in those exacerbations, which are quite unpleasant, for anyone who isn't familiar with COPD.

25:56

A mean exposure time of 1, so we're going to follow everyone for a year. Then they had a dispersion parameter of 0.7. This dispersion parameter is basically controlling how much we expect that overdispersion problem to exist: how much are we deviating from the Poisson assumption of the mean and the variance being equal to each other, basically that idea that everyone's rate was the same. With the Poisson we expect that on average everyone's got the same mean rate,

26:24

but now we're saying no, people will differ somewhat, and this dispersion parameter describes the gamma distribution that you're picking these different Poisson rates from in the negative binomial case. There are different ways of doing this, but the negative binomial is probably one of the most classic ways. There's also, as mentioned here, some guidance on how to get this from, say, the overdispersion parameter from a Poisson model, or quasi-Poisson model I should say, and from other data

26:54

sources and so on. It's a very good paper, and I'll mention it briefly when we talk about the Poisson regression equivalent to this analysis. In this case we'll select true rates, which basically means that when we're calculating the variance under the null hypothesis we're going to use the true rates, a sample size ratio of 1, so equal sample size per group, and a power of 90%. That gives us 390 people per group, which replicates exactly what they give here, namely that a sample size of 390 assessable

27:24

patients per group in each study will provide each study with 90% power.
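For readers who want to see the arithmetic behind the 390, the sketch below implements the two-group negative binomial sample size formula as I understand it from Zhu and Lakatos, for equal allocation and equal follow-up. The function name, argument names and the three `null_variance` labels are my own and are not nQuery's; the inputs (control rate 1.4, rate ratio 0.75, dispersion 0.7, exposure 1 year, two-sided alpha 0.05, 90% power) are the ones quoted in the transcript. Treat it as a hedged illustration rather than the software's implementation.

```python
from math import ceil, log, sqrt
from statistics import NormalDist


def nb_n_per_group(rate0, rate_ratio, dispersion, exposure=1.0,
                   alpha=0.05, power=0.90, null_variance="true"):
    """Per-group n for comparing two negative binomial rates (1:1 allocation).

    null_variance picks which rates are used for the variance under the null:
    "true" (the design rates), "reference" (control rate in both groups) or
    "mle" (the average of the two rates in both groups).
    """
    rate1 = rate0 * rate_ratio
    # Variance of the log rate ratio under the alternative (per-group scale).
    v1 = (1 / rate0 + 1 / rate1) / exposure + 2 * dispersion
    if null_variance == "true":
        v0 = v1
    elif null_variance == "reference":
        v0 = 2 / (exposure * rate0) + 2 * dispersion
    elif null_variance == "mle":
        v0 = 4 / (exposure * (rate0 + rate1)) + 2 * dispersion
    else:
        raise ValueError("null_variance must be 'true', 'reference' or 'mle'")
    z = NormalDist().inv_cdf
    numerator = (z(1 - alpha / 2) * sqrt(v0) + z(power) * sqrt(v1)) ** 2
    return ceil(numerator / log(rate_ratio) ** 2)


print(nb_n_per_group(1.4, 0.75, 0.7))  # true-rates option, as in the example
```

Switching `null_variance` to `"mle"` or `"reference"` changes only the variance used under the null hypothesis, which is the choice discussed next in the transcript.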

27:33

And of course, if you're curious, we selected this null variance option of true rates, and maybe you'd consider that to be quite arbitrary. So we could quickly copy and paste these inputs across and see what maximum likelihood and the reference group rate give us in terms of our calculations. We can see that the maximum likelihood and true rates end up being quite similar, but the reference group ends up being a little nicer, basically because the reference group rate in this case is a little bit higher.

28:07

If we were to reverse the calculation, it would actually end up being the opposite.

28:13

And so this is basically just replicating the previous example, nothing too complicated here. But of course this is really a very simple study: we've assumed that everyone is followed for on average the same amount of time, the dispersion parameter is the same in each group (each group could have its own dispersion parameter in reality), sample sizes are equal per group, and so on. So what are some of the complications that we could look at that aren't being covered here, but that have been covered in subsequent work?

28:42

Basically, we can look at this other table here. That's for the same two negative binomial rates situation, but in this case we have a lot of additional things that weren't there previously. We'll enter the things that are the same first, just to keep things simple. The significance level hasn't changed, we're still going to be looking at a sample size ratio of 1, and we'll have a power of 90%, but we'll leave that to the end. But everything else here is looking pretty different.

29:11

The last thing is just that the rates have not changed, but conveniently in this table you can see that event rate 2 is displayed for you as well, so you can see the event rates that we have. So what has changed? Well, firstly we've got these accrual period and minimum treatment period options here, and we also have this equal versus unequal follow-up choice.

29:33

Equal follow-up, which is the default, is basically saying we want to do the same thing as in the previous analysis. Unequal follow-up is very similar to a survival analysis: what we're saying is that the amount of time you will be followed in the study, and available to have an event, will differ depending on when you were recruited into the study. So let's say we had an accrual period of half a year and then a minimum treatment period of half a year. This is the same story in that the study is still only going to be a year long, but instead of assuming that everyone is going to be followed for a year regardless of when they entered the study,

30:13

we're just going to say we want the study to be only one year long. That's a hard limit on our study. But we expect to have to recruit patients over the first half of that year.

30:22

That's basically saying that someone could be followed for somewhere between half a year and one year, depending on when they were recruited into the study. We could also then make assumptions about what we expect our recruitment profile to be. Are we expecting uniform accrual, where people come in at around the same rate until we get to the required number of people? Are we going to recruit aggressively initially and then tail off, just to get the last few people in? Or do we expect to have to bring people in late, or open up new centers, to deal with the fact that early recruitment has been slower than expected? We can play around with all of those different assumptions using the recruitment parameter, where zero means uniform accrual and negative values mean that you recruit faster towards the beginning. In this case we'll just assume uniform accrual. Then we have these dropout rates. These basically work exactly the same as for the survival case: we're assuming there's a parallel process of censoring happening, where certain people just drop out for various reasons. They decide they don't want to be in the study anymore, or other things show up.
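
To make the unequal follow-up idea concrete, here is a minimal sketch in Python. It assumes uniform accrual and no dropout, and the function name `follow_up_range` is hypothetical (it is not an nQuery option); it simply returns the shortest, longest and average follow-up for a fixed study length.

```python
def follow_up_range(study_length, accrual_period):
    """Return (shortest, longest, mean) follow-up per subject.

    Assumes uniform accrual over `accrual_period` and no dropout,
    with everyone followed until the study ends at `study_length`.
    """
    longest = study_length                     # recruited on day one
    shortest = study_length - accrual_period   # recruited at the end of accrual
    mean = (longest + shortest) / 2.0          # uniform entry times give the midpoint
    return shortest, longest, mean


# One-year study with recruitment over the first half year.
shortest, longest, mean = follow_up_range(study_length=1.0, accrual_period=0.5)
print(shortest, longest, mean)
```

With a one-year study and half a year of accrual, follow-up runs from half a year to one year, so the average is three quarters of a year under these assumptions.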

31:49

And so on. This is a process that, as in other cases, you can actively model as if this were a survival analysis study. The additional thing here is that we have these dispersion parameters that can be different per group. Previously we had to assume they were the same in each group; now we can set them to be different in each group, so we could have more dispersion in the treatment group versus the control group, for example, or less, if we were particularly interested.

32:19

And so you can see in this case

32:20

we're roughly changing the previous design to one where the accrual happens uniformly over the first half of the study, and we've added about a 5% dropout per unit of time. Basically, if we had the same cohort without any effect from the events, say if everyone survived, then about five percent would drop out based on this exponential dropout rate. It's not exactly that, but I'm not going to cover the exact calculation today; I've covered this a lot in the survival webinars and I'm happy to discuss it further if you're interested. Then we've entered the rest the same here. This is mostly the same except for the accrual period and the dropout, and you can see we've had a fairly substantial increase from 390 per group up to 470 per group. Now, we basically get back the exact same analysis as last time if we keep the equal follow-up assumption, where we have a treatment follow-up of one. These accrual

33:18

options basically become optional in this case. We assume no dropout, we have our event rate of 1.4, 1.05 for event rate 2, and our dispersion parameters are the same in each group.

33:34

And we can see here that we get the 390 that we got previously from the previous table. So we can replicate this analysis; it is basically a subset of this bigger one. But you now have all these additional options that you didn't have before, to see what the effect would be of unequal follow-up per subject, of different dispersion parameters per group, and of dropout or censoring.

34:04

That kind of flexibility in nQuery is, basically, not available in comparable software at the moment.
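
As a rough cross-check of the 390 per group quoted above, here is a minimal sketch of a Wald-type sample size formula for comparing two negative binomial rates. It assumes equal allocation, equal follow-up, a common dispersion parameter and the variance evaluated under the true rates; the function name is hypothetical and this is not presented as nQuery's exact implementation.

```python
import math
from statistics import NormalDist


def nb_sample_size_per_group(rate_control, rate_treatment, dispersion,
                             follow_up=1.0, alpha=0.05, power=0.90):
    """Per-group sample size for a two-sided Wald test on the log rate ratio.

    Assumes 1:1 allocation, equal follow-up for every subject and the
    same negative binomial dispersion parameter in both groups.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta = normal.inv_cdf(power)
    log_rate_ratio = math.log(rate_treatment / rate_control)
    # Per-subject variance of the log rate ratio: Poisson part plus dispersion part.
    variance = (1.0 / (follow_up * rate_control)
                + 1.0 / (follow_up * rate_treatment)
                + 2.0 * dispersion)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / log_rate_ratio ** 2)


print(nb_sample_size_per_group(1.4, 1.05, dispersion=0.7))
```

With rates of 1.4 and 1.05, a dispersion of 0.7, one year of follow-up, a two-sided 0.05 significance level and 90% power, this sketch returns 390 per group. Unequal follow-up and dropout need a more involved variance term, which is not attempted here.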

34:12

So that gives a good idea of the negative binomial. That's most of the negative binomial material we're covering today, though there are additional options for the negative binomial related to equivalence analysis and non-inferiority analysis.

34:25

For example, this is an equivalence table, which looks more or less the same as our prior table, but of course you can see here that the event rate ratio now has these additional equivalence limit options. Everything else is more or less the same. There are some slight differences in the inputs based on the papers the tables are based on, but these are basically doing the same thing, except for the equivalence analysis case.

34:50

Now, we talked about Poisson. Poisson is the simplest case, which all of this is building on. If we were to take this analysis, but instead of having a dispersion parameter of 0.7 we had a dispersion parameter of effectively zero (zero itself isn't actually allowed in this table, so we'll set it to effectively zero, like 1e-10 here), we'll see that the sample size for a Poisson-type analysis will be around 195 to 210, depending on the assumptions you want to use.

35:32

And of course we could confirm that by actively doing a Poisson analysis. For example, if we go to the Poisson regression table, this is what it looks like here. We enter a 0.05 significance level, two-sided; a baseline rate of 1.4; and a rate ratio of 0.75. The rate ratio is basically what the exponentiated parameters from a Poisson

36:00

regression are; it's equivalent to that, especially for a binary covariate. Then we enter a mean exposure time of 1 and an overdispersion parameter of one. In the case of the quasi-Poisson model, which is basically what this is equivalent to, one is equivalent to no overdispersion. Technically you can actually have underdispersion if you set this to less than 1, whereas values greater than 1, which is probably what you're more worried about, indicate overdispersion. And we're talking here about regression, so we actually have very many

36:30

options for what type of covariate we're interested in. But because we were looking at the equivalent of a binary covariate analysis in the previous tables, we'll use binomial here for the distribution of X1, our covariate, with a proportion of 0.5, which unsurprisingly is equivalent to an equal sample size per group.

36:52

In this table we have the option to account for the effect of other covariates, but we'll assume no other covariates for now. Then we enter a power of 90%, and we see we get 387 here for the total sample size.

37:07

If we go back to the previous example and enter just one here, just to go quickly, you can see that 194 by 2 is equal to 388. So those are very, very similar, as you can tell.

37:19

So these are very similar in the limiting case, where the negative binomial effectively becomes the Poisson distribution as the dispersion parameter effectively becomes zero. Basically, the variance of the gamma distribution becomes equal to zero. You can think of it that way: if that variance comes down to zero, then the gamma distribution returns the same value every single time, and that's basically the same as the Poisson assumption.
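
The limiting behaviour can be illustrated with a short Python sketch. It assumes the common parameterisation in which the negative binomial arises from a Poisson rate multiplied by a gamma term with mean 1 and variance equal to the dispersion parameter, so that the count variance is the mean plus the dispersion times the mean squared; the function name is hypothetical.

```python
def negative_binomial_variance(mean, dispersion):
    """Variance of a negative binomial count: mean + dispersion * mean**2.

    `dispersion` is the variance of the mean-one gamma mixing term;
    as it goes to zero the variance collapses to the Poisson value (the mean).
    """
    return mean + dispersion * mean ** 2


# Shrinking the dispersion towards zero recovers the Poisson variance.
for dispersion in (0.7, 0.1, 1e-10):
    print(dispersion, negative_binomial_variance(1.4, dispersion))
```

With a mean of 1.4 the variance moves from about 2.77 at a dispersion of 0.7 down to essentially 1.4, the Poisson value, at a dispersion of 1e-10.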

37:48

Now, there is actually a way to go between the overdispersion parameter given here for the quasi-Poisson model and the dispersion parameter given for the negative binomial, and this is talked about in several places.

38:04

But the paper I would recommend is this 2017 paper, "Sample size for comparing negative binomial rates in noninferiority and equivalence trials with unequal follow-up times", by Yongqiang Tang. The reference is available at the end of the slides as well, so don't worry about that. There's a little equation here for kappa, the dispersion parameter from the negative binomial: it is equal to the overdispersion parameter from the quasi-Poisson minus 1, divided by the mean number of events expected across both groups, which is relatively easy to calculate, as you can see here. What we'll do, very briefly, is use this calculation to go from our kappa

38:47

to an overdispersion parameter for the quasi-Poisson. This is very simple algebra: we take kappa times mu, that is, kappa times the mean number of events, and add 1, and that gives us the overdispersion parameter. This is a fairly trivial calculation. For the total number of expected events, we have 390 people being followed for a year in each group.

39:17

So 1.4 by 390 years of follow-up, plus 1.05 by 390 for the treatment group, gives us the number of events expected to happen across our two groups. Divided by the total amount of time they've been followed, that is equal to about 1.225. So we're expecting about 1.225 events per subject in this particular study.

39:48

We then just have to multiply that by our kappa, our dispersion parameter from the negative binomial model, and add one to that, and we get this 1.8575.
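
The back-of-the-envelope conversion described above can be written out as a small Python sketch. It assumes one year of follow-up for every subject, so that person-years equal the number of subjects; the function names are hypothetical and the relationship is only the rough equivalence discussed in the talk.

```python
def mean_events_per_subject(rates, group_sizes):
    """Average expected events per subject across groups (one year of follow-up each)."""
    total_events = sum(rate * n for rate, n in zip(rates, group_sizes))
    return total_events / sum(group_sizes)


def overdispersion_from_kappa(kappa, mean_events):
    """Quasi-Poisson overdispersion implied by a negative binomial dispersion kappa."""
    return 1.0 + kappa * mean_events


def kappa_from_overdispersion(overdispersion, mean_events):
    """Inverse of the above: kappa = (overdispersion - 1) / mean events."""
    return (overdispersion - 1.0) / mean_events


mu = mean_events_per_subject(rates=(1.4, 1.05), group_sizes=(390, 390))
phi = overdispersion_from_kappa(kappa=0.7, mean_events=mu)
print(mu, phi)
```

With rates of 1.4 and 1.05 and 390 subjects per group, the mean is 1.225 events per subject and the implied overdispersion is 1.8575, matching the figures in the talk.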

40:11

Now, of course, the 390 has only been used on its own because the follow-up was one. If the follow-up time were, say, three, then we'd be taking 390 times three, because the real sample size for the recurring events case isn't really the number of people; it's the amount of time that people are exposed in the study.

40:35

So really, person-years is the real sample size in this case. The trivial way to get that is to multiply the follow-up time by the number of people to get your raw person-years, then multiply that by the rate to get the expected number of events in a given group, and then take the average as explained here. What we'll do now is go back to the Poisson regression model.
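
The person-years bookkeeping can be sketched in a few lines of Python. The function names are hypothetical, and the three-year follow-up is just the illustrative value mentioned in the talk, with every subject assumed to be followed for the full period.

```python
def person_years(n_subjects, follow_up_years):
    """Total exposure when every subject is followed for the same length of time."""
    return n_subjects * follow_up_years


def expected_events(rate_per_year, n_subjects, follow_up_years):
    """Expected event count in a group: rate multiplied by person-years."""
    return rate_per_year * person_years(n_subjects, follow_up_years)


# 390 subjects followed for three years each give 1170 person-years per group.
exposure = person_years(390, 3.0)
control_events = expected_events(1.4, 390, 3.0)
treatment_events = expected_events(1.05, 390, 3.0)
print(exposure, control_events, treatment_events)
```

The point of the sketch is only that exposure, not head count, drives the expected number of events.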

41:10

And the only thing we're going to change is the overdispersion parameter. We're going to use our new value of 1.8575 that we just calculated. So we're saying this is roughly equivalent. Very roughly: these aren't the same model. There are differences between the negative binomial and the quasi-Poisson that I won't go into much today, but they do give different results for the same data.

41:32

But if we're going to go between them roughly, then this 0.7 dispersion parameter is equivalent to about 1.858 for this set of assumptions around the sample size and the event rates. For 90% power, we're saying we need 718 people, which is a bit lower than the 780 required here, but these aren't a million miles away from each other. Obviously, when you change the underlying model, you also change the underlying assumptions used in deriving the sample

42:08

size calculation, and lots of other things, and all of these have an effect. In this case the Poisson regression, the quasi-Poisson, seems to require a smaller sample size for equivalent levels of dispersion, but that may just be an artifact of the underlying assumptions made by each in their derivations. They do seem to align perfectly in the case where we're talking about a pure Poisson model, where there's no overdispersion concern.

42:37

I don't really have time to cover it today, because I'm going to go on to group sequential design, but there are also lots of other scenarios covered in nQuery. For example, there are a lot more options for the two-sample case for the Poisson model. So here are some replications of that Poisson regression example in column one, and this table is based on the work of Gu et al.

42:56

The reference is just here. Also, for Poisson and for recurring event models, we're including lots more options as time goes on. So if you want to do it for a recurring event in a single group, we have tables for that. We have crossover analysis, like your two-by-two crossover analysis for recurring events under the Poisson assumption. We have options for the stepped-wedge cluster randomized design and for other cluster randomized designs as well. So the last thing we should just mention here is the Andersen-Gill model.

43:33

I'll briefly mention this as well, because I said I'd cover it. Practically speaking, the only major difference you'll see here versus the table we were looking at previously for the negative binomial case is that, instead of just assuming a constant event rate like 1.4 and 1.05,

43:56

we have much greater flexibility to deal with the event rate changing over time. In this case, we can use a Weibull distribution to deal with this situation. So let's set up the same design as we have in column one here.

44:21

You'll see that this looks very similar to the previous table. These are actually from the same author, so unsurprisingly he's used the same type of parameterisation here. These are all the same parameters you saw previously.

44:36

The only real difference is that we now use these Weibull scale and shape parameters to define what we expect to happen in the control arm; given the proportional hazards assumption, that's how the event rate is then derived for the treatment arm. And of course we could replicate the previous negative binomial analysis by setting the Weibull shape parameter equal to 1, which gives an exponential distribution. If you're taking a survival equivalent of the constant event rate assumption, then the exponential distribution is the obvious way to do that.

45:12

And then in this case, we would have 1.4 here,

45:18

which becomes the event rate for the exponential. Here we have 90% for power and 0.5 for allocation, and we get something slightly different in this case. I think to get this to be fully the same, you would have to translate the survival parameterisation to the other rates.

45:43

So after one year you would expect

45:49

1.4 events, and so on, and you could get the number right there.

46:01

It's a very different way of thinking about it, basically, and you can think about how to get that done there.
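
To illustrate how a Weibull-type event rate reduces to the constant-rate case, here is a minimal Python sketch. It assumes the mean cumulative number of events is `scale * t**shape`; the transcript does not state the exact parameterisation used in the nQuery table, so both the form and the function names are assumptions.

```python
def weibull_mean_events(t, scale, shape):
    """Expected cumulative events by time t, assuming the form scale * t**shape."""
    return scale * t ** shape


def weibull_event_rate(t, scale, shape):
    """Instantaneous event rate at time t (derivative of the mean function)."""
    return scale * shape * t ** (shape - 1.0)


# Shape 1 gives a constant rate equal to the scale: the exponential special case.
print(weibull_event_rate(0.25, scale=1.4, shape=1.0))
print(weibull_event_rate(0.75, scale=1.4, shape=1.0))
print(weibull_mean_events(1.0, scale=1.4, shape=1.0))
```

Under this assumed form, a shape of 1 and a scale of 1.4 give a constant rate of 1.4 and an expected 1.4 events after one year, while a shape other than 1 lets the rate rise or fall over time.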

46:09

Okay, so the last thing we want to cover today is group sequential design for count data or recurrent events data. I'm not going to cover group sequential design in detail today, because it is an area we've covered in detail in previous webinars and in some of our online material. Basically, a group sequential design is a type of adaptive trial where we have the option to stop the trial early based on very strong evidence for or against the treatment: either strong evidence of efficacy, or evidence that the trial is not likely to succeed, i.e. futility. If we don't stop early, we continue to the next look until we get to the final look, and then decide, as we do in a fixed-term trial, whether to reject the null hypothesis or not. Group sequential design is a very widely used adaptive design, and

47:00

it has lots of operating characteristics that are very beneficial, and therefore the FDA and other regulators have no problems with group sequential design.

47:13

This is an overview of group sequential design details. For us here, we're only really interested in knowing that we're going to be using these spending functions to decide how much of our alpha and how much of our beta we're going to spend at each look.

47:26

What those are saying is that, because we have this chance to stop early, we're doing multiple analyses, and if we didn't adjust for that by spending a certain portion of the error at each look, we'd end up with an inflated type I error or type II error. So we're adjusting for those. For efficacy, we're usually going to use a conservative spending function like the O'Brien-Fleming. For futility we have more flexibility, I suppose, because from a regulatory perspective stopping early for efficacy is something the FDA is involved in, whereas futility is more a matter of the sponsor's perspective on when they want to cut their losses or not.
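
As an illustration of what a conservative spending function looks like, here is a minimal Python sketch of the Lan-DeMets O'Brien-Fleming-type alpha spending function. The one-sided alpha of 0.025 and the equally spaced looks are illustrative assumptions, and the function name is hypothetical; it is not claimed to be the exact function used in nQuery.

```python
from statistics import NormalDist


def obf_type_alpha_spent(information_fraction, alpha=0.025):
    """Cumulative one-sided alpha spent by a given information fraction.

    Lan-DeMets O'Brien-Fleming-type spending:
    2 - 2 * Phi(z_{1 - alpha/2} / sqrt(information_fraction)).
    """
    normal = NormalDist()
    z = normal.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 - 2.0 * normal.cdf(z / information_fraction ** 0.5)


# Three equally spaced looks: very little alpha is spent early on.
for fraction in (1 / 3, 2 / 3, 1.0):
    print(round(fraction, 3), obf_type_alpha_spent(fraction))
```

Almost no alpha is spent at the first look and the full alpha is reached only at the final analysis, which is why this kind of function is regarded as conservative for efficacy.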

48:05

So in terms of group sequential design issues unique to recurrent events, in terms of getting the theory to work, what has been found is that if you follow every subject for the same amount of time, as in the simple example we started with where everyone is followed for a year, then things are actually not too bad. More or less, your simple

48:33

statistics, even for the negative binomial, are more or less going to work for you, including a method of moments estimator, which is basically where you take the number of events and divide it by the time. But the MLE and the method of moments estimators don't play nicely, especially if you have unequal follow-up and especially when you're dealing with overdispersion. What you'll see on the right-hand side here is the idea of information, which I won't go into in too much detail.

49:03

You can think of these as being very much related to the standard errors that you would see for any kind of Z approximation.

49:13

It's based on sums, so it's not based on a closed-form solution. There's no way to derive this without actually summing up what happened to each person over time, or I should say what you expect to happen over time for each person, and then summing that all together to get your Fisher information at a given time, and then also using that for your derivation of the sample size and the power. The group sequential design theory is probably too much to go into right now, but it's all related to this

49:42

Fisher information, to these ideas of units of information. Importantly, for recurrent events this isn't based on closed-form solutions, as it is for means or proportions or even survival, but on these summations, which are a bit more complicated. There is also an issue here that, because of that, it's not trivial, especially if you're dealing with unequal follow-up times and high dispersion,

50:12

to derive what real time, or calendar time, would mean in terms of information time. There's work, and packages like gscounts, based on the paper from Mütze et al. referenced on this slide, which talk about how to translate between those, basically using these equations that we're looking at here.

50:30

If you have equal follow-up time per subject and dispersion isn't something you're worried about, that's the situation where you end up having closed-form solutions, where the variance for your standard error calculation is basically just the rate, since for the Poisson the rate is equal to the variance, and you divide that by n and take the square root to get the standard error. That's a special case which I won't talk about too much today, but I'll show you briefly what it looks like in nQuery.
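
The closed-form standard error for this special case can be sketched in Python as follows. It assumes Poisson counts with no overdispersion and the same follow-up for every subject; the function name and the example inputs are illustrative rather than taken from the webinar's design tables.

```python
import math


def poisson_rate_standard_error(rate, n_subjects, follow_up=1.0):
    """Standard error of an estimated Poisson event rate.

    Total events are Poisson with mean rate * n * follow_up, so the
    estimated rate (events / exposure) has variance rate / (n * follow_up).
    """
    return math.sqrt(rate / (n_subjects * follow_up))


# Illustrative values: rate of 1.4 per year, 390 subjects, one year each.
print(poisson_rate_standard_error(1.4, 390))
```

No summation over individual subjects is needed here, which is what makes the Poisson equal follow-up case behave like the familiar means and proportions designs.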

51:03

For that Poisson, equal follow-up case, things more or less work like the means case, the proportions case and the survival case, with closed-form solutions, and the calendar time isn't too hard to derive: it's basically, wait for 20% of your people to have finished their follow-up. But when you get all these other issues around unequal follow-up, that's where things get complicated. It's also important to note that these ways of getting to the Z statistic, even using the summations talked about here by Mütze, start to struggle as you go to low sample sizes or as you have very high

51:33

overdispersion, like a dispersion of 5 or 10. That's where these things break down: the Z approximation starts to break down, the asymptotics break down, and you might want to consider using different statistics, different types of Wald statistics for example, or something like the t distribution instead of the Z distribution. There's already an equivalent type of t distribution

51:57

group sequential test for means, and you'd be using the same kind of thing to adjust your boundaries and your test statistics. So that's just a flavor of the type of thing involved. It's also important to note that a lot of these issues are relevant for adaptive design in general for recurrent events.

52:13

And I imagine that, in nQuery and in the papers and research beyond, you can expect a lot of additional work in areas like sample size re-estimation and other similar things for this type of endpoint going forward. But there are these additional complications that need to be considered, even, as you can see here, for the relatively simple case of group sequential design. So, just to illustrate group sequential design for this case: basically, it's more or less taking a fixed-term trial and adding some spending function and interim look assumptions on top of that. We'll take the previous example but extend it to have an O'Brien-Fleming efficacy bound and a Hwang-Shih-DeCani, or gamma, spending function for the futility bound, with gamma equal to -1.25. We look at the data two times before we get to the final look, so we

53:05

have three total looks in this design. So we go into nQuery. I think I mentioned briefly that if you have the Poisson assumption with equal follow-up, things are a lot simpler, and that basically means your statistic itself is very simple. For the negative binomial case, we end up with parameters that look very similar to what we had previously. We have our test significance level, which we'll set to 0.025, the one-sided level.

53:32

Basically, this is what we really want for a futility analysis: if we're going to do a futility analysis in a group sequential trial, we take half the two-sided alpha as the one-sided level, which we all know is pretty much equivalent for practical purposes. For the accrual period, we could say 0.5 and take the more complicated example. Actually, no, we'll keep things simple and keep it exactly the same as the earlier example: an accrual period equal to zero, so the accrual parameter becomes irrelevant.

54:04

We'll set that to 1. Then a minimum follow-up period of one, so we're going to follow everyone for one year in this case; our event rates of 1.4 and 1.05; a dispersion parameter of 0.7; and a sample size ratio of 1. Then, before we enter the power and get our sample size, let's fix our group sequential design parameters. We have three total looks, two interim and one final, and an O'Brien-Fleming efficacy bound (we could also look at the other ones you see here), and then a non-binding futility bound.

54:35

For futility we use the Hwang-Shih-DeCani spending function. Non-binding means that if we cross the futility bound, we don't have to stop, and if we choose to continue, it doesn't affect the type I error, whereas with a binding bound it would. Therefore, in practice, most people use non-binding futility bounds.
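
For reference, the Hwang-Shih-DeCani (gamma) spending function can be sketched in Python as below. The gamma of -1.25 comes from the example; the total beta of 0.10 (from 90% power) and the equally spaced information fractions are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def hsd_error_spent(information_fraction, total_error, gamma):
    """Cumulative error spent under the Hwang-Shih-DeCani gamma family.

    total_error * (1 - exp(-gamma * t)) / (1 - exp(-gamma)) for gamma != 0,
    and linear spending (total_error * t) when gamma is exactly 0.
    """
    if gamma == 0:
        return total_error * information_fraction
    numerator = 1.0 - math.exp(-gamma * information_fraction)
    denominator = 1.0 - math.exp(-gamma)
    return total_error * numerator / denominator


# Beta spending for futility with gamma = -1.25 over three equally spaced looks.
for fraction in (1 / 3, 2 / 3, 1.0):
    print(round(fraction, 3), hsd_error_spent(fraction, total_error=0.10, gamma=-1.25))
```

Negative gamma values spend less of the error early and more towards the end, while positive values do the opposite; at the final look the full error is always spent.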

54:55

And you can see that with this type of spending profile, accounting for the fact that we're spending our errors earlier on, we go from 390 per group to 426. And if we were to replicate our other example that we did,

55:13

where we had 0.5 and 0.5, this is the more complicated example, albeit we'll need to get rid of the dropout to make it fully comparable.

55:31

So in this case we're looking at around 439.

55:46

465. I forgot to change that dispersion parameter.

55:51

We want to set this up the same, of course; we don't want this to be non-comparable.

55:58

The non-binding Hwang-Shih-DeCani for futility and, obviously, O'Brien-Fleming as the default assumption for the efficacy bound.

56:14

You can see that the sample size actually increases quite substantially, which is not surprising, because we've gone from an average follow-up of one year to an average follow-up of three quarters of a year, effectively. That's not an insubstantial thing. Of course, as with all our group sequential tables, you get more information here about what Z statistics would lead you to stop early based on these bounds, and the nominal alpha here is basically the p-value that would allow you to stop early.

56:44

So in this case, a p-value of 0.006 or less at look two would allow us to stop early for efficacy, but one greater than 0.151 would indicate we should stop early for futility.
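
The link between a nominal p-value and the corresponding Z boundary can be shown with a short Python sketch. It assumes one-sided p-values measured on the upper tail of a standard normal test statistic; the function name is hypothetical.

```python
from statistics import NormalDist


def z_from_one_sided_p(p_value):
    """Z statistic whose upper-tail probability equals the given one-sided p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)


efficacy_z = z_from_one_sided_p(0.006)   # stop for efficacy at or above this Z
futility_z = z_from_one_sided_p(0.151)   # stop for futility below this Z
print(efficacy_z, futility_z)
```

Under these assumptions, the nominal p-values of 0.006 and 0.151 correspond to Z values of roughly 2.51 and 1.03 respectively.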

57:01

Hopefully that gives you an idea of group sequential design in nQuery. An important thing to take away from this is that, practically speaking, the complications I talked about are relevant for the operating characteristics and are things you should consider when designing your study. But from a sample size perspective, the simple "how many people do I need" question, with software like nQuery it basically becomes

57:22

not really that hard to get a feel for. So don't consider the sample size, the design parameters, or the efficacy bound calculations to be a barrier. The other things I've talked about in the slides are the more practical considerations to think about, and they may point to cases where this type of design is less appropriate, like low sample sizes.

57:51

So I think that's pretty much us done for the day. I think I've gone slightly over.

57:57

So what did we cover today? We covered that recurrent events and count data are very common in clinical trials. And in the last, I'd say, 10 to 20 years,

58:07

there's definitely been a move away from just using approximations for that type of analysis, like your survival models, your t-tests, your binary models or your chi-square tests, towards actually modeling these recurring event processes fully, especially in areas like chronic disease where this has been a major issue for many years. There's a variety of different models depending on the scientific question you want to ask: the parametric models, and Andersen-Gill as well, ask about the event rate, while the semi-parametric models look at the time to or between events. There are sample size methods available for the most common models, like the negative binomial and Poisson.

58:46

That's one less barrier to using these types of models. And group sequential design methods are becoming more widely available, and they work adequately for, I'd say, typical phase III trial scenarios, where you have a high sample size and you're probably not dealing with very out-of-control overdispersion.

59:06

So I'd like to thank you for attending today's webinar. I hope you've learned something useful. If you want to ask any questions after the webinar, you can get in contact at info@statsols.com, or you can visit statsols.com for further information. If there's anything you've seen today that you would like to try, and you either don't have nQuery or have a version of nQuery whose license doesn't include these features, feel free to use statsols.com

59:34

/trial, where you can try the nQuery software in your browser with the full suite of options for a few days and see if some of these options are important to you. You can also get some of the video tutorials and worked examples from previous webinars at statsols.com/start, and the references are also available at the end of these slides. What we're going to do now is take a few moments to answer some questions that came in today, and any of them

1:00:03

I don't get to, I will make sure to answer via email afterwards.

1:00:14

So just a couple of questions; I'll take them one after another. Would sample size re-estimation be possible for recurring events? Yes, definitely. There are actually already papers out on blinded sample size re-estimation for the negative binomial model and for other recurring event models, so if you're looking at blinded sample size re-estimation, it's basically already out there. And as I've said in previous webinars, unblinded sample size re-estimation is really an extension of group sequential

1:00:44

design, for most of the ways it's done. Therefore, to me, extending the current group sequential design calculations to that scenario shouldn't be too much of a stretch. But of course the work there is in evaluating the operating characteristics and making sure that just extending the group sequential design approach to sample size re-estimation, like, say, the Chen-DeMets-Lan or the Cui-Hung-Wang approach, still retains the operating

1:01:13

characteristics we expect from our working assumptions. That's the kind of work that's being done at places like nQuery, but obviously at other places in the clinical trial research community as well.

1:01:25

There was also a question about the dispersion parameter back-calculation. The Tang paper that I talked about here goes into a lot more detail. What I showed is just for when you happen to have the overdispersion parameter, say from the placebo arm, and you want to get to kappa, the dispersion parameter from the negative binomial model; this is what you would use. But it's important to note that this paper goes into a lot more detail about how you could get these different parameters from data. In the paper there is even a whole section on back-calculating based on results from another negative binomial regression, based on the confidence interval. So there are many different ways to get this, as well as just assuming

1:02:13

or testing a value for kappa or for the overdispersion parameter. I would recommend reading that paper, or its appendices, to get a flavor for how that's done. For reference, that's this 2017 paper by Tang here, from the Journal of Biopharmaceutical Statistics.

1:02:37

Okay, I think that's us more or less done. I just want to thank you once again for coming to today's webinar. As I say, if you have any additional questions, email us at info@statsols.com.

Thank you so much for attending and goodbye.


Copyright © Statsols 2020, All Rights Reserved. Privacy Policy
