nQuery Predict uses your trial data to project when key actions will occur. This gives you the ability to identify roadblocks and take action to keep your trial on schedule.
nQuery Predict is a new tool that uses simulation to help trialists make better predictions about when key trial milestones will occur. nQuery Predict will provide tools to predict key enrollment and event milestones both before and while a trial is ongoing.
In this webinar, we introduce clinical trial milestone prediction and the tools available in nQuery Predict for this.
We show how to make more informed decisions based on real trial data as it becomes available.
Clinical trials rely on key study milestones being reached before interim or final analyses can be conducted.
Enrollment targets are commonly missed in clinical trials. In survival (time-to-event) analysis studies, event targets can be unpredictable and complex to model.
While pre-trial assumptions are used to project when these milestones will occur, it is vital to know if enrollment and survival trends are varying from the pre-trial assumptions. Re-evaluation of the trial’s trajectory and required resources can then be based on real-time interim predictions.
*Please note, this is auto-generated, some spelling and grammatical differences may occur*
So, hello everyone. And welcome to today's webinar, Introducing nQuery Predict: a New Tool for Clinical Trial Milestone Prediction.
So today's webinar will be us here at nQuery introducing our latest new feature, nQuery Predict, a new tool that uses simulation to allow you to predict when clinical trial milestones will occur, both for enrollment targets and for event targets in the case of survival analysis.
OK, so, let's get into today's webinar, Introducing nQuery Predict.
So, the last two webinars that I have done I have focused broadly on clinical trial milestone prediction, mostly focusing on enrollment prediction.
And then last month's webinar focused on survival event prediction milestones, mainly predicting when the required number of events will occur, such as deaths in survival analysis in oncology or cardiology, or something like that. And I suppose these have kind of been building up to today's webinar, which is showing off the new tool that we're hoping you will enjoy, called nQuery Predict, which is part of our new Expert tier in the nQuery Advanced software suite.
And so what I'm hoping to do today is to very briefly give you, at the start, the background from those previous two webinars, in case you didn't attend them, and then give you a rundown of all the things that will be available in nQuery Predict 1.0. But I think it's worth saying upfront that this is just our initial release.
Our plan is to continue to add a lot of new features and options as we go forward. As we develop this over time, similar to how every six months we provide a new set of tables for a variety of different endpoints, from adaptive design and sample size to core or fixed-term trials, our hope is to use this as a starting point to build up a much more fully comprehensive suite, reacting to the demands that you have. So, there'll be an e-mail in the slides, ... dot com. If there's anything here that you're particularly interested in and want to see taken further, e-mail us there. And if you have ideas about what kinds of things you think should be in the software going forward, or anything you feel is missing, we're really happy to get that kind of feedback as well. But I think we have a really, really useful tool that kind of nicely ... with the current sample size and adaptive design offerings in nQuery Advanced.
With all that said, just for anyone who hasn't attended these webinars before, my name is Ronan Fitzpatrick and I'm the Head of Statistics here at nQuery, ..., the lead researcher on nQuery since nQuery 3.0, which I think is about 7 or 8 years ago at this point. I've given workshops in places like the FDA and JSM. And of course, with the pandemic hopefully having reached a kind of steady state, and vaccination continuing to be high,
we'll hopefully get to go and see people out there again. But for now, these webinars, and being able to attend webinars, have been very useful.
For example, I was busy with the development of nQuery Predict, but I'm hoping to catch up on a lot of the information from the BIOP workshop that occurred earlier this week.
So, in terms of what we're covering today: first is clinical trial milestone prediction at a high level, really just going through the CliffsNotes of what we covered in the last two webinars. Then, an overview of what will be in nQuery Predict in its initial release. And then there'll be a worked example showing off those features and showing some of the UX and UI.
And after that, then, just some conclusion discussion, and probably answering a few questions.
So, obviously, just to mention that this webinar is presented by nQuery, your complete solution for optimizing clinical trial design. We cover a variety of different clinical trials, ranging from pre-clinical research to early phase, to your confirmatory Phase III trials and post-marketing. Of course, in the context of what we're talking about today, we'll be covering some aspects that are related to trial design, perhaps, but more related to the question of what you do once you've designed your study. How do you make sure that it continues to be on track?
And obviously nQuery is widely used in the biopharmaceutical and biotechnology industry, with 91% of organizations with clinical trials approved by the FDA having a license for nQuery.
You can see some of those here, some of the reviews.
So, part one: clinical trial milestone prediction. I suppose, just at the very highest level, what are we talking about when we talk about these key trial milestones? Well, look, when we're designing a clinical trial, we're focusing on a variety of different operational and statistical aspects. We're thinking about sample size calculations and stuff like that. Depending on where you are, in terms of being a statistician, or being someone involved in the more logistical end, or being a clinician, you'll have different emphases. But a lot of thought goes into generating the protocols that are used for clinical trials, particularly for later-stage clinical trial designs, Phase III in particular. So, you know, you have a lot of different choices in terms of adaptive design, power, et cetera.
So, once you've decided what your trial is going to be, you will tend to have these key milestones that need to be reached before you can make certain key decisions. If I've done a sample size calculation and it says I need 500 people, then obviously I need to get follow-up from those 500 people before I can end the study. Or, if I've planned an interim analysis, say using a group sequential design, then I might decide that 250 people in, we're going to do an interim analysis and see how we're doing, and perhaps choose to stop early, or increase the sample size, or continue the trial as planned.
And all of those require you to have some expectation of when they're going to occur. Because you know, in the abstract 500 people sounds fine. But we know that you know, one of the biggest challenges in clinical trials is actually getting people into the trial. And there's a lot of really good research happening right now in terms of how to make that process smoother and stuff like that. But one aspect that would be useful in that case would be to have a better understanding of how your trial is doing while it's ongoing.
So, whether that's thinking about your sample size, or your enrollment targets, or, in the case of survival analysis, how many events are coming in based on the current rates, those are the kinds of considerations that are important. And what we know is that a lot of trials are delayed, around 80% of trials are delayed, and around 85% never even reach their recruitment goal. I think these figures are based on the industry survey that's in the references included at the end of the slides.
Now, the big thing here is: well, look, we have data coming in. We can see how people are coming in. We can see sites open and sites close. So why don't we use that to more accurately model what's going to happen? And there's a lot of interest in using statistical approaches, like simulation and machine learning and similar, to try and do that better and more often.
So, as I said, there are two major categories of key trial milestones that we're going to talk about.
We're going to talk about enrollment, or sample size, prediction, and then we're going to talk about event prediction in the context of survival analysis. Enrollment prediction is pretty straightforward.
We need to recruit a certain number of people; when is that going to happen? That's basically what's of interest for us as a trialist, as a sponsor, as a regulator. These are the kinds of things that it's important to have some reasonable estimate for. Of course, before the trial, you will have some estimates, some idea about it. But obviously, as a trial actually unfolds, those estimates may end up being overly optimistic or overly pessimistic, and being overly optimistic is probably the more problematic situation.
So, when we look at this, the question is: OK, we want to model the enrollment process. How do we do this? Well, there's obviously a variety of different ways you can think about it. You can think of it as a global problem, where basically everyone is coming from the same process. But, of course, we often have individual hospitals, individual sites. We will have regional-level effects, or regional caps, and we're trying to get a diverse set of people into our trial. So there are often other considerations: not just getting 500 people into my study, but there has to be a certain gender split, a certain regional split, and stuff like that. But the easiest way to think of it is at the site level: this hospital is open, we're getting this many patients per month, so how long is it going to take us to finish if we model each hospital individually? And once you've decided what kinds of levels, considerations, and covariates you want to include, there's a variety of different options that you could use to do that modeling,
ranging from just using simple equations, to simulations, to Bayesian approaches, and some combination thereof. And the main point here is that if we have these predictions, if we have this information about how sites are doing and how different regions are doing, then we have a much better chance of being able to make choices that improve our chance of reaching our recruitment goal. That could be adding more sites if we're not doing as well as expected, or adding sites in areas or regions that seem to be doing better than expected, or dropping sites which aren't really living up to expectations, where we're not really getting the required recruitment.
So, this is just a very brief summary of a variety of different approaches that could be used to do enrollment prediction, from the more statistical or technical side. I'm not going to cover these today, but just to mention that the two nQuery focuses on in its initial release are global simulation, which is just treating enrollment as a global process coming from, say, a Poisson process, and per-site simulation. For the latter, you treat each site as if it had its own Poisson rate, simulate each of those individually, and then use that as the basis for your enrollment targets.
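As a rough illustration of the global and per-site approaches just mentioned, here is a minimal Python sketch. It is not nQuery Predict's implementation; the function names, the site rates, and the number of subjects still needed are all hypothetical values chosen for the example, and every site is assumed to stay open at a constant rate.

```python
import random

def global_time_to_target(rate, needed, rng):
    """Months until `needed` more subjects arrive from one global Poisson process."""
    t = 0.0
    for _ in range(needed):
        t += rng.expovariate(rate)  # gaps between Poisson arrivals are exponential
    return t

def per_site_time_to_target(site_rates, needed, rng):
    """Months until `needed` more subjects arrive when each site has its own Poisson rate."""
    arrivals = []
    for rate in site_rates:
        t = 0.0
        # No single site can supply more than `needed` of the first `needed` arrivals.
        for _ in range(needed):
            t += rng.expovariate(rate)
            arrivals.append(t)
    arrivals.sort()
    return arrivals[needed - 1]  # time of the `needed`-th arrival across all sites

def summarize(times):
    """Median and central 95% interval of the simulated milestone times."""
    ordered = sorted(times)
    n = len(ordered)
    return ordered[n // 2], ordered[int(0.025 * n)], ordered[int(0.975 * n) - 1]

rng = random.Random(2021)
site_rates = [0.2, 0.5, 0.8, 1.0, 2.5]  # hypothetical subjects per month, one entry per site
needed = 58                             # hypothetical number of subjects still to enroll
n_sims = 2000

global_times = [global_time_to_target(sum(site_rates), needed, rng) for _ in range(n_sims)]
site_times = [per_site_time_to_target(site_rates, needed, rng) for _ in range(n_sims)]
print("global  :", summarize(global_times))
print("per site:", summarize(site_times))
```

With constant rates the two versions follow the same distribution, because independent Poisson processes add up to a Poisson process with the summed rate. The per-site version starts to matter once sites open, close, or are capped at different times, since those changes can then be applied site by site.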
So, that's enrollment prediction. I covered it in a lot more detail in the webinar two months ago, which would be the July webinar, so if you want to know more, have a look at that. Obviously, I don't have time to go through everything we covered in those previous webinars in detail today.
So then we move on to survival event prediction. When we talk about enrollment prediction, that's pretty much relevant for, you know, 99% of all clinical trials. We have some enrollment target, we want to reach it, and in most cases things either go as expected or maybe are a little slower than expected. There are exceptions where things actually go better than expected. For example, quite famously in the COVID-19 trials, the sample size ended up being higher than they really needed, because the event rate was much higher than expected for COVID-19, due
to the kind of outbreak peak that we had in the US and other jurisdictions during the middle of last summer. But if we're talking about survival analysis, there's now an extra level, because we're talking here about inference on the time until some event, such as death, using methods like Cox regression or the log-rank test. And we're obviously talking about things like overall survival or progression-free survival.
You know, people dying, or progressing to the next stage of cancer: obviously things that are quite unfortunate.
But, obviously, we're doing clinical trials to try and obviate these for future patients. So, when we talk about the key milestones for survival analysis, the important thing to note, similar to the point that I make in a lot of sample size calculation webinars on survival analysis, is that the milestones are related to the number of events, not to the sample size itself. If we recruit 100 people, but no one has the event, then we have no idea what the time to event is, for obvious reasons, because no one has had the event so far. All we can infer right now is that the time to event is probably greater than however long the study has been running.
So, that's obviously not completely useful information. From a statistical point of view, it isn't very useful at all.
So, what we really need from that perspective is to target some set number of events for an interim analysis, or for when we're going to end the study. And then, in our case, we want to accurately estimate: OK, based on how these events are coming in thus far, how long is it going to take for the target number of events to occur?
And in this case, we're not only dealing with the events process. We also have to deal with the enrollment process, if the enrollment period is still ongoing when we choose to do this analysis. And we also have to deal with other competing processes, such as dropout, censoring, cure, or other competing risks that might exist in our study.
So, the enrollment modeling that we talked about earlier is often a constituent component of the survival event prediction. You kind of need to do enrollment modeling to be able to do survival modeling. And then there are all the other things that you may wish to model, which are basically things that prevent you from having the event going forward.
Now, just to note that, from our perspective, we're not really that interested in what the specific non-event was,
whether it was a dropout, or a cure, or a competing risk, et cetera, because usually we have set our event targets agnostic of what those particular things are. Sometimes, like if we use a cure model, we might have some idea of how many cures we expect to happen. But we'll still have a target number of events that we need to reach before we're willing to make our inference, before we're willing to end the study and right-censor the remaining subjects.
So just to note, from a practical point of view, when we're predicting events, the alternative things that can happen to you aren't important in themselves.
All we need to know is that that kind of event would stop you having the event of interest. So if I drop out, I can't have the event of interest.
If I have a competing risk occur, I can't have the event of interest, et cetera.
Or if I'm censored for whatever reason, then I can't have the event of interest. So that's what we need to be focused on in this case. It's not really as much about the statistical or inferential aspects for this particular type of problem.
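The idea that every non-event simply removes a subject from the risk set can be sketched as two competing exponential clocks. This is an illustrative simplification with hypothetical hazard values, in which dropout, cure, and competing risks are lumped into one "removed" outcome; it is not taken from the webinar.

```python
import random

def next_outcome(event_hazard, removal_hazard, rng):
    """Return (time, outcome) for one at-risk subject under two competing exponential clocks."""
    event_time = rng.expovariate(event_hazard)
    removal_time = rng.expovariate(removal_hazard)
    if event_time < removal_time:
        return event_time, "event"
    # Dropout, cure, competing risk: for milestone prediction they all just
    # mean the subject can no longer contribute an event of interest.
    return removal_time, "removed"

rng = random.Random(7)
outcomes = [next_outcome(0.03, 0.01, rng)[1] for _ in range(20000)]  # hypothetical monthly hazards
share_events = outcomes.count("event") / len(outcomes)
print("share of subjects who contribute an event:", round(share_events, 3))
```

For exponential clocks, the share of subjects who contribute an event should be close to event_hazard / (event_hazard + removal_hazard), which is 0.75 for the values used here.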
Just to mention, there is, as for the enrollment case, a variety of different methods and processes available to do this type of prediction, ranging from parametric modeling, to simulation, to machine learning algorithms. And obviously, some of the special cases that have become of particular interest in recent years, such as immunotherapies with their delayed effect, may require a slightly different approach as well. And there are approaches that use, let's say, Bayesian information from other trials to get better starting estimates. But obviously, if the trial has been ongoing for a while, those will become less important.
So, hopefully, that gives you a reasonable overview of what we're talking about here and what nQuery Predict is really targeting. It's really about: can we predict, based on what's happened so far, or potentially pre-trial,
what the length of our study is going to be, based on the information that we have available? We want to know how long this study is going to take based on what we know thus far, effectively.
And so nQuery Predict is our tool that will focus on clinical trial milestone prediction with a user-friendly interface,
an interface which taps into the years of knowledge that we've accumulated in creating nQuery, a product which makes sample size calculations easy to use, easy to communicate, and easy to trust as having been validated and tested to a high level.
So, initially, we're focusing on enrollment and event prediction via simulation. We're hoping that we can provide the flexible options that you need to cater to whatever particular set of information you happen to have, and provide visualizations and summary data that allow you to get a quick, accurate idea of where you are at the moment, based on what's happened thus far.
So, depending on what data you have access to, and depending on the granularity of that data, we're hoping nQuery Predict can provide a tool which fits, and maximizes the value that you can extract from, the data you happen to have available to you.
So we're hoping that nQuery Predict is a useful complement to our existing sample size determination and adaptive design tools. Because, from a practical point of view, nQuery right now gives you: here's how many people I need in my study, here's the adaptive design, here's when the interim analysis is going to happen.
Once you've made those decisions, once you've gotten to doing the actual trial, we're hoping nQuery can now provide you a tool that lets you know, practically speaking, how long it's going to take. That's really important,
obviously, to the trialists themselves, to clinicians, to sponsors, to everyone involved. So now nQuery provides an integrated tool to allow you to make those decisions as the data becomes available.
So we think it's, I wouldn't say orthogonal, but it synergizes quite well, I think, with the offering that we have in nQuery, which has obviously been one of the leading sample size determination software packages for over 20 years at this point. We've obviously done a lot on adaptive design over the last 10 years as well.
So, in terms of what the nQuery Predict features will be at a high level: on the next slide, I'll summarize this in a tabular format. But basically, there are two primary approaches to prediction that will be available in nQuery Predict 1.0: using interim data, and using summary data. Interim data is where you have real clinical trial data, which includes things like when people arrived into the study and, for survival analysis, when they had an event, or when they dropped out, or had some other competing process occur. So, what is your current status: have you had the event, have you dropped out, or are you still available to have the event at the time that the data was taken? And which treatment group you're in might also be available. So, these are the types of information that you might have available on a per-subject basis.
But also, if we're talking about the enrollment process, you might have information that summarizes what's happening in each site. So you have access to what's happening on a per patient basis, but you also have access to what's happening on a per site basis. And you can then use both of those to more accurately simulate what's happening with the enrollment process thus far.
So that's probably the ideal scenario, and the one that nQuery Predict probably caters to most in its initial iteration. It's obviously, to some extent, more interesting, more complex, and more valuable to be able to use the real data to do your prediction as accurately as possible. But nQuery Predict also covers summary data, for when you don't have access to the interim data on a per-subject or per-site basis. So, for example, if you only know that the study has been going for 10 weeks, that 100 people have been recruited, and that 10 events and, say, five dropouts have occurred, then that information is sufficient for nQuery Predict to also make predictions. They won't be able to estimate some things as accurately, and certain things will be more difficult to do. For example, it would be more difficult to use
something like a Weibull model in that case, because under the Weibull model the chance of an event depends on how long you've been in the study. You wouldn't have that information with summary data, but you would with interim data. So, Weibull modeling with interim data is possible. For summary data, you'd have to make some strong assumptions about when the people already in the study actually arrived and how long they've been available to have the event. So, you could do that, but you would be making synthetic choices at that point, or more substantial synthetic choices.
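To make the point about time on study concrete, here is a small sketch using the standard Weibull survival function S(t) = exp(-(t / scale)^shape). The shape and scale values are hypothetical. With shape 1 the model reduces to the exponential, where time already spent on study makes no difference; with any other shape the conditional event probability depends on it, which is the information summary data lacks.

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

def prob_event_within(window, time_on_study, shape, scale):
    """P(event in the next `window` | still event-free after `time_on_study`)."""
    s_now = weibull_survival(time_on_study, shape, scale)
    s_later = weibull_survival(time_on_study + window, shape, scale)
    return 1.0 - s_later / s_now

for shape in (1.0, 1.5):  # 1.0 is the exponential special case
    new_subject = prob_event_within(3.0, 0.0, shape, scale=24.0)
    long_subject = prob_event_within(3.0, 12.0, shape, scale=24.0)
    print(f"shape={shape}: just enrolled {new_subject:.4f}, 12 months on study {long_subject:.4f}")
```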
In terms of the two targets, as I mentioned, it's enrollment and events. For enrollment, we've focused initially just on the simple Poisson process, but we will be extending that to numerous other models, probably focusing primarily on a piecewise Poisson type model. And then for the event milestones, you can do it on a blinded or unblinded basis, which is basically just whether you're treating the survival process as coming from a single group or as coming from the individual groups, depending on whether you have access to the treatment ID in your interim data. Practically speaking, in most clinical trials, and based on who's likely to be doing these types of predictions, the blinded situation is more likely. And, of course, for that reason, you're probably more likely to end up in a situation where the prediction is not as accurate, but it's still better than guessing, or just using simple calculations.
And in the initial release we'll be focusing on exponential-type models, including piecewise exponential, and on the Weibull model. Obviously, we know there's a wide variety of parametric survival models in use, but we also know that piecewise exponential can be used to approximate pretty much any distribution, to some extent, for survival processes.
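For readers unfamiliar with the piecewise exponential model, the sketch below shows one common way to draw event times from it: draw a unit exponential and find the time at which the cumulative hazard reaches that value. The interval start times and hazards are hypothetical, and this is only an illustration of the model, not a description of how nQuery Predict samples.

```python
import math
import random

def sample_piecewise_exponential(starts, hazards, rng):
    """Draw one event time; hazards[i] applies from starts[i] until the next start."""
    target = rng.expovariate(1.0)  # cumulative hazard at which the event happens
    accumulated = 0.0
    for i, hazard in enumerate(hazards):
        end = starts[i + 1] if i + 1 < len(starts) else math.inf
        if hazard <= 0.0:
            continue  # no events can occur in a zero-hazard piece
        width = end - starts[i]
        if accumulated + hazard * width >= target:
            return starts[i] + (target - accumulated) / hazard
        accumulated += hazard * width
    return math.inf  # the event never occurs

rng = random.Random(11)
starts = [0.0, 6.0, 12.0]     # hypothetical interval start times in months
hazards = [0.02, 0.04, 0.08]  # hypothetical hazard per month within each interval
draws = [sample_piecewise_exponential(starts, hazards, rng) for _ in range(20000)]
surviving_12 = sum(t > 12.0 for t in draws) / len(draws)
print("simulated P(T > 12):", round(surviving_12, 3))
print("analytic  P(T > 12):", round(math.exp(-(0.02 * 6 + 0.04 * 6)), 3))
```

An increasing sequence of hazards like this one behaves roughly like a Weibull with shape above 1, which is the sense in which finer and finer pieces can approximate other survival distributions.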
And of course, that's what you can do. What you get out at the end of it is visualizations and reports. And then you can get access to the simulation results themselves, either the individual simulations or a summary of what happened in every single simulation, both at the subject level and, if you included site data, at the site level as well. And we'll probably be adding more visualizations for that type of information too. You can see an example of the visualization here on the right-hand side, just showing the trend for this study for the event process. I'll show this in a moment anyway, but in this case, around 21 months is the current time, at the black line.
And then, going from around 21 months onwards, you can see that we have these predictions using simulation, and a 95% prediction interval for what we think is likely to happen based on what's happened thus far. And that's integrating the effect of enrollment, and the effect of everything else, like the dropout process, which I'll be able to show during the actual demonstration in a moment.
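A simulation-based prediction interval of the kind described here can be sketched as follows. This is a deliberately simplified, blinded, exponential-only illustration and not nQuery Predict's algorithm. The current time, event counts, and enrollment counts echo figures quoted in this webinar, while the number currently at risk, the enrollment rate, and both hazards are hypothetical values invented for the example.

```python
import random

def simulate_milestone(current_time, at_risk, events_so_far, target_events,
                       to_enroll, enroll_rate, event_hazard, dropout_hazard, rng):
    """One simulated calendar time at which the target event count is reached, or None."""
    still_needed = target_events - events_so_far
    event_times = []

    def follow(start):
        # Exponential clocks are memoryless, so subjects already on study
        # and newly enrolled subjects can be handled the same way.
        event = rng.expovariate(event_hazard)
        dropout = rng.expovariate(dropout_hazard)
        if event < dropout:
            event_times.append(start + event)

    for _ in range(at_risk):        # subjects already enrolled and event-free
        follow(current_time)
    arrival = current_time
    for _ in range(to_enroll):      # future enrollment as a Poisson process
        arrival += rng.expovariate(enroll_rate)
        follow(arrival)

    event_times.sort()
    if len(event_times) < still_needed:
        return None                 # target never reached in this simulation
    return event_times[still_needed - 1]

rng = random.Random(42)
sims = [
    simulate_milestone(
        current_time=21.0,          # months
        at_risk=200,                # hypothetical: enrolled, no event, not dropped out
        events_so_far=187, target_events=374,
        to_enroll=58,               # 460 planned minus 402 enrolled
        enroll_rate=5.0,            # hypothetical subjects per month
        event_hazard=0.06,          # hypothetical, per month
        dropout_hazard=0.002,       # hypothetical, per month
        rng=rng,
    )
    for _ in range(1000)
]
reached = sorted(t for t in sims if t is not None)
n = len(reached)
median = reached[n // 2]
lower, upper = reached[int(0.025 * n)], reached[int(0.975 * n) - 1]
print(f"target reached in {n} of {len(sims)} simulations")
print(f"median {median:.1f} months, 95% prediction interval {lower:.1f} to {upper:.1f}")
```

The interval here comes from percentiles of the simulated milestone times, so it reflects only the randomness of the assumed processes; uncertainty in the rates themselves would widen it.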
So, this slide is just a tabular summary of what I've discussed, giving an idea of what you can do for interim data and what you can do for summary data. For interim data, you can pretty much do everything. We don't currently include a Weibull model for the dropout process. It just seemed a little bit unnecessary, but if that's something of interest, we're happy to add it. And obviously, you can't do a pre-trial prediction with interim data.
You can't make milestone predictions based on real data if the trial hasn't started yet. That's fairly obvious, but I just thought I'd put it in there. Whereas if you're using summary data, you're just using fixed values, like sample size equals 100, events equals 10, dropouts equals one, current time equals 10. You can set all of those to zero, and then you're basically doing a pre-trial estimation or simulation of what you think will happen from the starting point, let's say using the estimates that you have from the sample size calculation itself.
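As a back-of-the-envelope illustration of how summary values alone can drive a projection, the sketch below estimates a constant enrollment rate from the counts and falls back to a planning assumption when everything is zero, which corresponds to the pre-trial case. The target of 250 and the planned rate of 8 per week are hypothetical, and this simple expected-value calculation stands in for the simulation that the tool performs.

```python
def enrollment_rate(enrolled, elapsed, planned_rate):
    """Constant Poisson rate from summary counts; uses the planning rate pre-trial."""
    if elapsed <= 0 or enrolled <= 0:
        return planned_rate    # nothing observed yet, so rely on the design assumption
    return enrolled / elapsed  # maximum likelihood estimate for a constant Poisson rate

def expected_time_to_target(enrolled, elapsed, target, planned_rate):
    """Expected calendar time at which the enrollment target is reached."""
    rate = enrollment_rate(enrolled, elapsed, planned_rate)
    return elapsed + max(target - enrolled, 0) / rate

# In-trial: 100 subjects after 10 weeks, hypothetical target of 250.
in_trial = expected_time_to_target(enrolled=100, elapsed=10, target=250, planned_rate=8.0)
# Pre-trial: all summary values set to zero, so only the planning rate is used.
pre_trial = expected_time_to_target(enrolled=0, elapsed=0, target=250, planned_rate=8.0)
print("in-trial projection (weeks): ", in_trial)
print("pre-trial projection (weeks):", pre_trial)
```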
So, just to mention here, for interim data, you can model the enrollment process at both the subject level and the site level, but the survival process is currently modeled exclusively at the subject level. We were interested in looking at survival processes varying by site.
But I think, from our perspective at the moment, pre-trial, or certainly in-trial, you're probably hoping that the effect on a per-site basis is not that significant for the event process itself. If you think it would be interesting to have a survival process which does vary by the covariate of site, then we'd be happy to look at that in future development. But for now, survival is treated as a global process, not a site-level process.
And obviously, for summary data, right now, we don't cover the site level modeling. So you can't create your own artificial sites at the moment, but that is on the development docket, probably for one of the very early updates, either later this year or early next year.
As mentioned earlier, we have blinded and unblinded, or non-comparative and comparative, depending on which terminology you prefer. The comparative and non-comparative terminology was highlighted in the FDA adaptive design guidance. And then for survival, we can also deal with both the case where enrollment is still ongoing and the case where enrollment is already complete.
So, this is it will make a huge difference, but just just to note that both those cases are covered.
In terms of the enrollment model, we are, as I mentioned, focusing exclusively on the Poisson-type model at the moment, but we will be extending that significantly in future releases. And then for survival, we will be looking at the exponential, piecewise exponential, and Weibull models. Just to mention, as I said previously, that the Weibull doesn't really make sense for summary data, because the probability of having the event is conditional on how long you have been available to have the event on the ... Model. Which, unfortunately, means that if you're using summary data, you don't know how long the people who are still available have been in the study. So to some extent that makes it impossible to do Weibull without making some strong assumptions, or having to simulate some arrival times based on information you don't have available to you,
presumably, if you're using summary data. Just a small note on survival analyses in which you're going to have a fixed follow-up. In the classic survival analysis, everyone typically stays in the study until it ends. There is some enrollment period, let's say 12 months, and then the study continues for maybe another 12 months. And then, at the point that the study is finished, the remaining subjects are censored: those who, at that point, have not had the event, have not dropped out, and have not had some other competing process occur.
But there is also the alternative, where in certain cases it makes more sense to have a fixed follow-up, which means that regardless of when you joined the study, you will have some fixed follow-up, let's say 12 months. That's unlike the other case, where your maximum amount of time in the study will differ depending on when you were enrolled. So, in the 12-month, 12-month example, if I was recruited at the very start of the study, I could be in the study for up to 24 months. If I was recruited at the very end of the enrollment period, I could only have a maximum time in the study of 12 months. So that depends on when I was recruited. With fixed follow-up, that would be fixed: you would always have a follow-up of only 12 months, after which you're censored and no longer considered.
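The difference between the two follow-up schemes comes down to how the maximum time in the study is computed for each subject. The following small sketch uses the 12-month enrollment, 12-month follow-up example from the talk; the function name and arguments are hypothetical.

```python
def max_time_in_study(enrolled_at, study_end, fixed_follow_up=None):
    """Longest a subject can be observed before being administratively censored."""
    remaining = study_end - enrolled_at    # classic design: followed until the study ends
    if fixed_follow_up is None:
        return remaining
    return min(remaining, fixed_follow_up)  # fixed follow-up caps the observation time

study_end = 24.0  # 12 months of enrollment plus 12 further months, as in the example
for enrolled_at in (0.0, 6.0, 12.0):
    classic = max_time_in_study(enrolled_at, study_end)
    fixed = max_time_in_study(enrolled_at, study_end, fixed_follow_up=12.0)
    print(f"enrolled at month {enrolled_at:>4}: classic {classic} months, fixed {fixed} months")
```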
As I mentioned, pretrial prediction is just literally there's no sample size. There's no events, there's no dropout. The current time is zero. We can handle that situation for the summary data, and you can kind of make some assumptions and modeling before the study occurs even.
So, this is the example I've used in the previous two webinars. I'll be using a different example in future webinars on this topic, but I thought I would bring this out one final time to make sure everyone is aware of it. Basically, this is an example where we have a survival analysis in which 50% of the total required number of events has occurred, so 187 out of 374. But 87% of our expected enrollment has occurred, so 402 out of 460. And, in this case, 118 out of the 127 sites that we have available have already been opened. So, we have an additional nine sites that could be opened if we need them at this point.
And, in this example, I've covered a lot of these scenarios before, but I'm hoping to show off some, maybe, some of the more obscure things as we're going along here.
But, basically, we're focused on: what's the effect of looking at a subject-level enrollment process versus a site-level enrollment process? What happens if we have unblinded data, where we actually know the treatment group, versus the probably more common situation where we would not know that information? And then the effect of different survival models, so Weibull versus exponential, for example. That's just to give you a preview of what's going to come out of that.
The process itself here is basically the summary of what's happening at the current point in this dataset, and then some of the targets that you might have for that, based on this paragraph here on the left.
So, this is what nQuery predict will look like when you first open it. When you activate NQuery predict or nQuery advanced expert, you'll see a button here called Create Prediction Double appear in the Home quadrant of the home screen.
And then, you can also select it from the toolbar, using a prediction with subject level data, or prediction with fixed parameters, and then you can also, in, you'll be prompted to import data, although you cannot just skip that if you're planning to use summary data.
So I think what I'm going to do is start with what's probably going to be the most likely situation for this particular trial, where we want to predict events.
That's fairly obvious from the description of the problem. We don't have access to which group each subject is in. That's probably, as I mentioned, the more likely context for anyone who's involved on the ground with the actual trial themselves. Obviously, if it's a data monitoring committee, or you want to get them involved, that's where the unblinded data might become useful. And I suppose, to some extent, if you're doing an interim analysis for a group sequential design, you're going to have to look at the comparative unblinded data anyway. So in that case you might request that the DMC use predictions such as will be shown here to give you an estimate of how long the trial is going to take based on the information thus far. As I said, it might just be a useful additional complement for them, and obviously they might find that information useful themselves in terms of making logistical decisions for the trial.
We're going to assume that we have access to site data. So we don't just have access to subject data; we also have site data.
And we're going to assume that enrollment is ongoing; that's kind of implied from how we've discussed the problem previously. So, just to give you an idea of what the data we're looking at will look like.
This is some data taken from, I think, an example I found of some real, mostly cleaned-up clinical trial data, to which I've added some noise to create a different problem.
You can basically see here on the right you have region, which we won't be covering in nQuery Predict 1.0, but it is one of the first things that we'll be looking at in future updates. You can see S ID here stands for site ID. So, we have, for each subject, which site that particular subject came from.
We have their arrival time. So, this is the time that they arrived into the study, starting from around 0.2
months until around 24 or 25 months in. We have the follow-up time, which is only relevant for survival predictions: it's just how long you were in the study until something happened. And the definition of something in this case is your current status.
Your current status is basically either you had an event, you dropped out, or you are still available to have the event. That is defined by this current status column here, where 1 is an event and 0 is someone who would be censored if we ended the study now,
or, more simply, who is still available to have the event. -1 is a dropout, but as we'll show in a moment, you can customize what those particular codes are.
They don't have to be 1, 0 and -1; they could be, you know, E, A and D for event, available and dropout.
It doesn't really matter, as long as there are three categories, or, if one of the categories is missing, you can specify that using the software itself.
In this case, we have the treatment group: 0 for the control group and 1 for the treatment group. We won't be using that for now, but we will be using it when we show the effect of using unblinded data instead.
So the follow up here, as I said, depends on your current status.
So if you've had the event, this is just the amount of time you were in the study until you had the event.
The same follows if you have dropped out. So if you are 1 or -1 in the current status column, this is how long you were in the study until something happened, that is, until the event or dropout occurred. And if you have 0, it indicates how long you've been in the study until the current time, at which point you would be censored if we ended the study right now. Or, in this case, you will continue in the study onwards.
So you have how long you've been in the study,
in addition to how long we're going to simulate you being in the study going forward, until you have the event that we simulate for you.
And obviously we're going to be simulating some brand new subjects as well, and we're going to give those an event and a dropout time as well.
Just to mention the site data once again: we need that. Each one of these rows corresponds to a site. And the information we need in this case is just what the ID is, so we can link these two together. So, you can see site ID 101 is here, ... 101 is here, so that tells us that this person came from 101, and this is what we know about 101, or what we want to know about 101, to be able to simulate a site. We have a cap here, which is the maximum number of people we are willing to recruit at that particular site; the time that the site opened, for the case where the site has already opened; and the rate that we've seen in that site, or the rate that we expect in that site. In this case, it is the actual rate seen in that site, except for the handful of sites in this dataset that are unopened. And you can see that for unopened sites, there's no open time. But we do have a start and an end time.
This is basically the window of time during which we believe that this site will be allowed to open, or we're going to be opening this site, and we'll be randomly picking between these two values to pick the actual opening time, the simulated opening time, for these particular unopened sites.
And nQuery Predict will automatically detect this: any site that has an open time is assumed to already be open or, more accurately, any site with an open time that's less than the current time implied from the subject data is already open. Then, any site that isn't open needs to have a window of time during which it could open, and you could set these equal to each other if you want to have a fixed opening time going forward. But we do allow some randomness, in terms of a window of time when it can open, in nQuery at the moment.
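The site-opening rule just described can be sketched in a few lines of Python. This is only an illustration: the dictionary keys (`open`, `start`, `end`) are hypothetical column names, and drawing the opening time uniformly within the window is an assumption, since the talk only says the time is picked randomly between the two values.

```python
import random


def simulated_opening_time(site, current_time, rng):
    """Return the opening time to use for one site in one simulation run."""
    open_time = site.get("open")
    # A site counts as already open if it has an open time before the current time.
    if open_time is not None and open_time < current_time:
        return open_time
    # Otherwise draw a time from the site's opening window (uniform is assumed here).
    return rng.uniform(site["start"], site["end"])


rng = random.Random(2024)
opened_site = {"open": 3.5, "start": None, "end": None}
unopened_site = {"open": None, "start": 27.0, "end": 30.0}
fixed_site = {"open": None, "start": 28.0, "end": 28.0}  # start == end gives a fixed time

print(simulated_opening_time(opened_site, 25.0, rng))
print(simulated_opening_time(unopened_site, 25.0, rng))
print(simulated_opening_time(fixed_site, 25.0, rng))
```

Setting the start and end of the window to the same value reproduces the fixed opening time mentioned above.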
So, once we select the type of prediction that we're interested in, and obviously, I mentioned here, enrollment is ongoing, it's not complete.
Then, we move on to our next steps, which is usually selecting the datasets that we want or, for the case when we're using summary data, entering the fixed values that we have for the sample size recruited so far, the current time, et cetera. So, we're going to select our subject data.
And it automatically detects, in this case, the arrival time, because it's the only column that corresponds to what you need for arrival time, which is real values that are greater than zero.
We need our status indicator, which was basically the current status column, and it detects that 1, -1 and 0 are our respective categories, but if that was wrong, you can obviously change these to be anything else. I believe it's just done alphabetically by default, but you can obviously change it: if the event status was actually a different value, we can select that. And 1, -1 and 0 are correct. Obviously, this is data that I've optimized for nQuery Predict today.
Time on study is the follow up. So this is how long you've been on the study. And as I said, that definition depends on whether you've had the event or not yet.
And then the site ID, which is what we're going to use to link our site data and our subject data, so that we know what's happened in each site so far by counting up how many times each site ID occurs in our subject data.
We then get to our site data. Because we only have two datasets at the moment, it automatically just assumes the second dataset is the correct one. If there were more than two, then it'll either be empty or might select one of them by default. And in this case, this happens to be optimized to select the right ones. We see the site ID is correct, and we have the rate, the recruitment rate
we expect in each site, the enrollment cap in each site, and the opening time for all of our open sites. And then, as mentioned here, we don't have to have unopened sites. We can choose to ignore that and leave it; that is optional. But in this case, we do have unopened sites.
And for them, we want to specify that their opening window of time is taken from the start and end columns in that dataset.
So when we're thinking about the prediction for survival, we split it into two parts. First, what's the recruitment process going to be? Obviously, this will be mostly skipped if we had enrollment set to complete. And then, what will the survival process be? So let's focus here on the accrual options first. The main thing here on the main screen is just: what is our target sample size? We said that was 460, I believe.
So, we already have 402 people in our study and we want to have a total of 460. We can see here there are 212 people who are still available to have the event. So, if we take 58, which is the additional number of people we're going to recruit into the study, and add 212, we can see that we have 270 subjects for whom we're going to simulate an event/dropout time. So, we're going to have 58 new people that we're going to recruit, and we're going to simulate what their recruitment time is. And then we're going to have a total of 270 people for whom we're going to predict what we think their event/dropout time is going to be. I'm not going to focus on it too much here, but just note that, as I mentioned, for the case where you have a survival analysis and you want to have a fixed follow-up, you can set that on this screen as well.
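The bookkeeping behind those counts is simple enough to write down. The sketch below just restates the arithmetic from the talk in Python; the variable names are illustrative.

```python
target_sample_size = 460  # target enrollment for the trial
enrolled_so_far = 402     # subjects already in the subject-level data
still_available = 212     # enrolled subjects with neither an event nor a dropout yet

# New subjects whose arrival times have to be simulated.
to_recruit = target_sample_size - enrolled_so_far

# Subjects who need a simulated event/dropout time:
# the new recruits plus those already enrolled and still available.
to_simulate_survival = to_recruit + still_available

print(to_recruit)            # 58
print(to_simulate_survival)  # 270
```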
But for now, let's just stick with the more common scenario where all subjects will be followed until the end of the study, at which point those who have not had the event or dropped out will be censored.
So we can click on this tab here, but this next step is important. If you click Next, it will automatically bring us to this anyway. And this is our accrual information per site tab.
Basically, we see here at the top that we can specify the total number of sites. And we can add new sites or remove the unopened sites, if that's of interest, as will be shown in a moment. But just to focus on the table here for a moment, we'll ignore this for now, because it will automatically put the open sites at the top of the table.
And we can see here that we get the accrual rate per site, which is what's being taken from the site level data.
So, that corresponds to this column here, the rate column, and you can see here that from the workspace tab on the left, we can easily go back and forth to each step if we need to.
We have the enrollment cap, which was also taken from the data, from the cap column in our site data. And then we have the planned accrual rate per site, which is just a read-only version of the site data column, so that if you make changes to this column, you know what it was originally meant to be. Then there's the site initiation time, which is when the site actually opened, and we have the number of accruals that have occurred in this site so far. Just a small note here that this number of accruals is based on a count of how many times each S ID occurs in the subject-level data.
While in this case we've calculated the planned accrual rate per site based on this number of accruals, that is not necessarily true in general. So, note that this column here is based on the subject-level data, on the count of the linking site ID column.
And the implied rate from what's happened so far could differ from this. So, for example, there are open sites which have had no people so far, but you can see they do have a planned recruitment rate that's not equal to zero.
So, just a small note here: this rate is based on the site-level data, as a kind of pre-trial estimate or maybe calculated using this number of accruals, but they don't necessarily have to correspond to each other.
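As a rough illustration of how the accrual count and an implied rate can be derived from the linking site ID column, here is a small Python sketch. The data values are made up, and the implied-rate formula (accruals divided by months open) is an assumption for illustration, not a statement of how nQuery Predict computes anything.

```python
from collections import Counter

# Hypothetical subject-level site ID column and site opening times (months).
subject_site_ids = [101, 101, 102, 101, 103, 102]
site_open_times = {101: 1.0, 102: 5.0, 103: 15.0, 104: 20.0}
current_time = 25.0

# Number of accruals per site = how often each site ID occurs in the subject data.
accruals = Counter(subject_site_ids)


def implied_rate(site_id):
    """Observed accruals per month since the site opened (assumed formula)."""
    months_open = current_time - site_open_times[site_id]
    return accruals[site_id] / months_open


for site_id in sorted(site_open_times):
    print(site_id, accruals[site_id], round(implied_rate(site_id), 3))
```

Site 104 here is an open site with no accruals yet, so its implied rate is zero even though its planned rate in the site data could well be positive, which is the mismatch described above.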
Then you can see that for the unopened sites, things are pretty much the same, except we obviously don't know the site initiation time. And logically, no one can have been recruited at that site. And then the main thing is we have these two editable columns where we can change the window of time during which the site could be opened. And obviously, you need the start time here to be greater than the current time.
And in the case of the current calendar time, basically the amount of time since the study started, that is automatically calculated for you from the subject-level data, by taking the maximum of the arrival plus follow-up times for the available subjects. So for anyone who has a zero, we take the sum of these two and see which one is the maximum.
Logically, if the data has been prepared correctly, the sums for all the zeros should actually be identical. They should equal each other. That makes sense if the current time is correct. But if they aren't, we'll take the maximum and assume that's correct.
Just note the current calendar time is around 25 months into this study. That's been calculated automatically based on the subject-level data, as the maximum, for those whose current status is available, of the arrival plus the follow-up time. So remember, the follow-up time is the amount of time you've been on the study since you arrived into the study. So the amount of time
in total, until the current point, is the arrival time plus the follow-up time.
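The current-time rule described here translates directly into code. The following Python sketch uses made-up subject rows and hypothetical field names (`arrival`, `follow_up`, `status`), with 0 as the "available" code as in the example dataset.

```python
# Hypothetical subject-level rows: arrival and follow-up times in months.
subjects = [
    {"arrival": 0.5, "follow_up": 24.5, "status": 0},    # still available
    {"arrival": 10.0, "follow_up": 15.0, "status": 0},   # still available
    {"arrival": 3.25, "follow_up": 4.0, "status": 1},    # had the event
    {"arrival": 8.0, "follow_up": 2.5, "status": -1},    # dropped out
]


def current_calendar_time(rows, available_code=0):
    """Maximum of arrival + follow-up over subjects who are still available."""
    return max(
        row["arrival"] + row["follow_up"]
        for row in rows
        if row["status"] == available_code
    )


print(current_calendar_time(subjects))  # 25.0
```

In clean data every available subject gives the same sum, as the two available rows do here; taking the maximum is just the fallback when they disagree.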
And you can also see that for every single site, we can edit what we actually think the recruitment rate is going to be in this particular site. We can change the enrollment cap, if that's of interest; we can set this to a high value if we don't want to have an enrollment cap, for example.
And, as I said, we can change the window for the unopened sites. And, as I mentioned, if we want to add new sites that aren't from the original site-level data, that's very easy to do. We just change the total number of sites to some value greater than in the original data. Then we have an additional row here, and obviously we need to specify the window of time during which we think
this particular site may open, let's say 27 to 30 months.
Then let us say, we want a ...
rate of one per month, and then an enrollment cap of 10.
So that's very easy to add.
And you can subtract sites as needed. But just a note here that, obviously, any site that's already opened can't be deleted.
In that case, you would need to set the accrual rate to zero to do effectively the same.
But let's get rid of that artificial site and remain with our original dataset.
I'll move on to our final step, which is the event and drop out information.
And so, in this case, we can see that we still have the target sample size, which is inherited from the previous step. We can still edit that at this stage if we want to, and this will override the previous value.
And we can now see there are 187 events that have happened so far, and 374 events, by default, is our target number of events. That's just twice the current number of events.
Basically, we just assume that the target sample size is double the current sample size, and the target events is twice the current events, as a starting assumption, but, obviously, you'll need to change that to whatever is the correct value for your particular study.
And you can see here that, below this, we then select the response distribution, the survival model that we want to use to simulate our event process. Our main two choices are exponential and Weibull. For the Weibull, you can see we have our scale and shape parameters. There's not too much going on here; if you're familiar with these distributions, this should be relatively easy to understand. But note, for the exponential, we can also add a piecewise model, where we can change how we think the event rate would change over time, and there's a lot of flexibility, obviously, that comes from a piecewise exponential distribution.
Let's for now assume the default exponential, and you can see here that it has a rate that's shown by default. This is the estimated hazard rate based on the subject-level data.
Using each subject's current status, that is, whether they have had the event, we take all their follow-up times and then do a fairly simple calculation to derive what the estimated event rate would be, using the exponential distribution. And you can see we're also doing the same for the Weibull distribution here.
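The talk does not spell out the "fairly simple calculation", so the sketch below shows the standard maximum-likelihood estimate for an exponential hazard with censoring: the number of events divided by the total follow-up time across all subjects. Treat it as an assumption about the kind of calculation involved, not as nQuery Predict's documented method; the function name and data are made up.

```python
def exponential_hazard_estimate(follow_up_times, statuses, event_code=1):
    """Standard exponential MLE: events / total time at risk.

    Subjects without the event (available or dropped out) still contribute
    their follow-up time to the denominator.
    """
    n_events = sum(1 for status in statuses if status == event_code)
    total_time = sum(follow_up_times)
    return n_events / total_time


follow_up = [2.0, 4.0, 6.0, 8.0]   # months on study
status = [1, 0, 1, -1]             # 1 = event, 0 = available, -1 = dropout

print(exponential_hazard_estimate(follow_up, status))  # 0.1
```

The same function with `event_code=-1` would give an analogous dropout rate under an exponential dropout model.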
So these values that are shown by default will change, depending on what data you put into this, into the, into the algorithm, and they will automatically kinda pick the best estimate that we have so far based on those.
And then it's very similar for dropout, except that we're limited to the exponential distribution. But we can have a piecewise distribution if we want, as well.
So once we specify our exponential event and dropout models, in this case, we then move on to the simulation controls. This just allows you to select how many simulations you want. We have a percentile summary, where you can change the percentiles of interest, and you can even save these optional datasets as well, which give you summary statistics for each simulation run. You can output as many of the individual simulations as you want, using this option here. You can see what happened in each site, on average, using the site-wise summary, and then you can actually see, in any given individual simulation, what happened in each site, using this option here.
So, we click Run, and then we'll get this simulation progress bar, which will just give us a live update of what's happening in our simulation, as well as obviously giving us an idea of how progress is going. The calculation here isn't too slow at all, and then we'll finally have one step at the end, just letting you know that we're creating reports and all of the additional plots, et cetera, that we're going to show you in a second.
Now, obviously, if you increased the sample size significantly, then this simulation would take a lot longer, but in this case we're only simulating an additional 58 people, and we're only simulating survival times for around 200 people. So, unsurprisingly, for this case, the calculation doesn't take too long.
So, I think, first thing, first, let's just look at the simulation summary.
So this is really just an overall summary. On the left-hand side, we'll see what went into the simulation. This is very detailed, giving everything that went into the simulation. And then on the right-hand side are the results of the simulation, telling you that in this case the average sample size was 460. So we reached our sample size target in the vast majority of cases. In this case, the accrual duration took approximately another two months or so. So remember, we're around 25 months in at the point in the study this analysis is happening. But it was 43 months for us to reach the target number of events that we needed, 374. And we can see, in this case, the 374 target for events was reached in every single simulated study.
Just a small note here that we include this, because if you had an aggressive dropout process, for example, then there's a chance that you wouldn't reach your target events. And therefore, this would tell you how many times you didn't reach the target of events.
In this case, you see there are around six dropouts, on average. The average follow-up time for each subject was around 12. So, that's how long they were in the study until something happened: either they dropped out, had the event, or reached the end of the study. And I can see, in this case, that all sites end up being open. So, for all the sites for which we specified windows, the randomly assigned initiation time, based on the start and end times we gave them, was below around 27, so these all opened in this case. And you can see that we have the 5% to 95% percentiles, the 25% to 75%, the interquartile range effectively, for what happened in the simulation, and the median or 50% percentile, for the events, sample size, dropouts, accrual duration, study duration, et cetera. Obviously, the events and sample size targets were reached in all cases, so these don't tell you anything interesting.
But you can see the dropouts range from 4 to 9, the accrual duration from 26.6 to 27.5, and the study duration from 40.98 up to 45.75, and in all cases the sites opened fully.
And you can also see that there are a lot of plots available here that you may be interested in. So, you can see here the enrollment prediction. You can see that maybe there's reason to rethink our current estimates for the recruitment rate, because this looks a bit out of place compared to what we had previously. So, we may need to make some new estimates here.
Or perhaps there's just some lull here. This is where your domain expertise and understanding of what's happened in the trial so far is important: whether this is an actual plateau, or whether this was just a lull due to some other exogenous effect, or you weren't recruiting people over this period and we're expecting to recruit more going forward.
You then have the events prediction, which we see has this kind of concave or convex shape. You can see here that, in this case once again, probably due to the enrollment process, it kind of goes off a little here. But if you were to draw a line through the process, it kind of fits the overall event process well. But perhaps there's a case here that this is tapering off, and perhaps it should have been lowered as well.
And then, for dropout, just not that much information, because there's so few dropouts, but you can see it's mostly consistent here.
And as I mentioned, the tables that are used for each of these graphs are available in the Tables tab here.
So if you want to see what went into that graph, you can see it in detail in these prediction tables here. And then, as I mentioned, the optional elements are the summary statistics for the simulations. So for each simulation we can see exactly what happened, like how many people were censored and how long the study took.
We have the per-simulation subject-level data, so we can see that in simulation one, for subject four, who was already in the study, we simulated an event time of 26.994 and a dropout time of 1.97. And just note that this is the sum of the simulated value plus the time that they were in the study up until this point,
which, for subject four, I believe is equal to around nine, I think.
But regardless, the important thing is that this is the total time, not just the time from the current time.
And you can see here that that time happens to be less than the simulated event time, and it's less than the dropout time as well.
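The point that the reported time is "time already in the study plus the simulated value" can be illustrated with a short Python sketch for the exponential case. Everything here is hypothetical: the function name, the hazard values, and the choice to draw independent event and dropout times and take whichever comes first. It is meant as a picture of the idea rather than the tool's actual algorithm.

```python
import random


def simulate_available_subject(time_on_study, event_hazard, dropout_hazard, rng):
    """Simulate total event and dropout times for a subject who is still available.

    Under an exponential model the additional time is drawn afresh and added
    to the time already spent on study, so both totals exceed time_on_study.
    """
    event_time = time_on_study + rng.expovariate(event_hazard)
    dropout_time = time_on_study + rng.expovariate(dropout_hazard)
    outcome = "event" if event_time <= dropout_time else "dropout"
    return event_time, dropout_time, outcome


rng = random.Random(7)
event_time, dropout_time, outcome = simulate_available_subject(
    time_on_study=9.0, event_hazard=0.05, dropout_hazard=0.002, rng=rng
)
print(round(event_time, 3), round(dropout_time, 3), outcome)
```

Both simulated totals are measured from the subject's arrival, not from the current calendar time, which matches the distinction drawn above.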
We can see what happened in each site on average.
We can also see what happened in simulation one for each site, like when that site opened, what the observed enrollment rate was, how many accruals occurred, et cetera.
OK, so hopefully that gives you a good idea of what's happening in nQuery Predict. I suppose, with just the final few moments that we have, with maybe five or ten minutes left, and I apologize that we've run over a little bit,
we could look at what the effect would be if we'd gone for an unblinded events process instead. So, I'll just go over to the prediction in this other tab, which we opened along with the datasets that we have.
Like so. And then we'll restart this prediction from the beginning, except we'll have unblinded events data instead.
And most everything is more or less the same, where We have the same inputs and the same values, of course.
The site ID, and frankly the site settings, we don't need to select; they're done for us automatically.
The accrual process is basically identical. We don't need to worry about the recruitment, the accrual process. We just need to change the target to be 460.
And then we can see here that the big difference is in the event process where now, instead of treating the events process as coming from a single global survival process, we're treating it as coming from two per-group survival processes, based on the per-group indicator, the treatment indicator we had in the subject-level data.
So, you can see here that we've automatically calculated what the hazard ratio would be. Our best estimate of the hazard ratio, based on what's happened so far, is around 0.7, and we can see what the control and treatment rates are here.
And we can see the dropout model; we can do it on a per-group basis as well.
But from a practical purpose, what we're interested in here is just kind of seeing what were the, what's the effect of having this additional information on our time estimates.
That's a distance a little bit.
And so, You know, we also might want to see some other, I think there's a lot of other things we might be able to consider.
I think, know, we're running a little bit over time, so I won't be able to go into all of them, but I'm hoping to show you just kind of flavor of the type of options that you have, the type of effect you might have, depending on the type of information you have available.
And so what we'll probably see in this case is that the event process ends up being a little more accurate. And that probably ends up helping us get a better idea of what we expect the actual timing of the process to be.
So you can see in this case that the accrual process isn't really affected; it's basically identical. But you can see that the study duration here is around 43.76 months. If we compare that to the report from the blinded case, that was around 43.28. So not a huge effect, but not no effect either. If we take the unblinded result as being more accurate, then we have a slight underestimate from using the blinded data compared to the unblinded data.
You can see here there was actually a small case where one site didn't open in time in one of the simulations. But for practical purposes, the sites nearly all opened. You can also see that we get an idea of what happened on a per-group basis. Note that, by default in nQuery Predict right now, it assumes that the sample size ratio in the subject-level data is the sample size ratio at the end. That's why these are slightly not exactly equal at this point, but you can see that there are obviously more events in the control group than the treatment group, as expected.
And in this case, there are slightly fewer dropouts in the control group versus the treatment group but, overall, this didn't end up making a huge difference. But let us say: what if we use the Weibull distribution instead of the exponential distribution?
In that case, the Weibull is a more flexible survival model. We know that the exponential is a special case of the Weibull model, when the shape parameter is equal to one.
And you can see the shape parameters here. If they are equal to one, that tells you that your data is fairly exponential. These aren't very close to one, but they're not ridiculously far from one either. So, we're not dealing with data where the exponential model is likely to have been incredibly inaccurate, but it is likely, based on this data, that the Weibull model will be a little bit off, or some way off, from
what we got from the exponential. You can see here that the study duration has increased to around 50 months, compared to the 47 months, so we can come to the conclusion that we were being a little bit optimistic
when we were using the more simplified exponential-type process. Perhaps if you're using a more flexible Weibull-type approach, where we're using how long you were in the study to give you an idea of how long you're likely to survive going forward,
then you might end up coming to the conclusion that perhaps things aren't quite as rosy as 47 months or whatever that is. And, of course, you can play around with all of these different things.
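Two of the claims above can be checked numerically: that a Weibull with shape one is just an exponential, and that with a shape other than one the time already spent on study changes the outlook going forward. The Python sketch below assumes the common scale/shape parameterisation S(t) = exp(-(t/scale)^shape), which may differ from the one the software uses, and all the numbers are made up.

```python
import math


def weibull_survival(t, scale, shape):
    """Survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def exponential_survival(t, rate):
    """Survival function S(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


def conditional_survival(extra, already, scale, shape):
    """P(T > already + extra | T > already) under the Weibull model."""
    return weibull_survival(already + extra, scale, shape) / weibull_survival(already, scale, shape)


rate = 0.05
# Shape 1 with scale 1/rate reproduces the exponential survival curve.
print(weibull_survival(10.0, 1.0 / rate, 1.0), exponential_survival(10.0, rate))

# With shape 1 the time already on study is irrelevant (memoryless)...
print(conditional_survival(6.0, 0.0, 20.0, 1.0), conditional_survival(6.0, 9.0, 20.0, 1.0))
# ...whereas with shape 1.3 a subject already 9 months in is less likely to last 6 more.
print(conditional_survival(6.0, 0.0, 20.0, 1.3), conditional_survival(6.0, 9.0, 20.0, 1.3))
```

This is why the exponential and Weibull predictions can diverge even when both are fitted to the same interim data.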
And, of course, if we were to look at the situation where we had only subject data, you know, things are more or less the same from a practical point of view.
But just to note here that we're obviously dealing with a situation where we only have a single global recruitment rate, instead of the site-specific information that we were using in this case. So, instead of simulating each site, we're just assuming some kind of global recruitment process. And then, from our perspective, everything after that is more or less the same.
I'll just briefly mention the summary data case.
Look, we could do two things. We could, in some cases, just assume that the study hasn't even started yet. Or we could take the data and split it out: take 402
at 201 per group, and then the number of events is 187.
We can split that in some way. So, let us say 100 in the control group and 87 in the other group.
And so on and so forth. But, one thing to note: we can then have the dropouts be 1 and 2, which I believe is actually the value. And then it's around 25 months in. And then we could use this to do a simulacrum, effectively, of what we did previously.
But unfortunately, I think, we've reached the end of this. But, if you have any questions about entering shown here, and if you want to see some, like, the survival stopped on the more detail on the enrollment stuff done in more detail. I did cover both of those separately in the previous two webinars, so, there's lots of useful information in those webinars as well, if you want those there.
So, just a very briefly, like discussion discussion here, is just like your delays are common. in clinical trials, the tools are going to allow us to, you know, more accurately, have an idea of how the trial is actually doing.
In real time, or useful, Anchor Predict is our solution for that, using simulation.
Prediction models are needed for a variety of different levels of data that might be available. Summary data to interim data. We might have site level data, and you know, it's a variety of different methods that could be propose, some of which are available, then cryptic many more which will add in the future. We'll hopefully get a flavor of what might be available. And nQuery predict. And maybe you have ideas of things that we should be looking at going forward as well.
If you want any more information on nQuery in general, and on some of our workshops and previous webinars, you can go to ...
So if you want to see those previous two webinars, which cover milestone prediction at a kind of technical, broad level, those are two good resources: the last two webinars, our July and August webinars. Just to mention that there are references at the end of the slides for pretty much everything we've covered here today.
So, I'm going to take a couple of seconds here just to take any questions we had. I think, because of time, I probably won't be able to answer very many of them. But I'll make sure to answer any that you have by e-mail later today, after this webinar is complete.
So, there's just one question. Well, there are several questions. But the one question I'll mention here, which I think just came in late there, when I showed off the subject-level data, is just asking: can you change the period of time over which you're selecting the recruitment rate, basically? Because you had the situation in the dataset where you kind of have this flick in the enrollment prediction rather than it following the trend. So, that is one of the things we're very closely looking at. Obviously, you can calculate this yourself from the original data if you need to, and you can input that manually.
But my hope is that, in one of the first updates, we'll provide more flexibility: not only being able to select which period of time will be used to give you the default rate, but also options for a piecewise recruitment process. So, basically, this table would extend to multiple rows, and you could say that, OK, from time 25 to 28, I want the rate to be 16, but from then on, I want it to be 32 or something like that.
That's basically very similar to how the exponential table works here when I increase the number of hazard pieces. So, this is basically what we'll be doing for the recruitment process on a global basis. And, you know, it'll be more difficult on a site-level basis, mostly for UI reasons, but we'll be looking to do that on a per-site basis as well, going forward, rather than just having a single recruitment rate per site.
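To make the piecewise recruitment idea concrete, here is a minimal sketch using the rates mentioned in the answer (16 per month from time 25 to 28, then 32 per month afterwards). It assumes a Poisson arrival process with a piecewise-constant rate. The function name, the table layout as a list of `(piece_start, rate)` tuples, and the target of 100 further subjects are all hypothetical and are not taken from nQuery Predict.

```python
import random


def simulate_piecewise_enrollment(pieces, start_time, n_needed, rng):
    """Return the time at which n_needed further subjects have been enrolled.

    pieces: list of (piece_start, rate_per_month), sorted by piece_start;
            the last piece is open-ended. start_time must not be earlier
            than the first piece_start.
    Assumption: arrivals are Poisson with a piecewise-constant rate.
    """
    t = start_time
    enrolled = 0
    while enrolled < n_needed:
        # Locate the piece that contains the current time.
        idx = max(i for i, (start, _) in enumerate(pieces) if start <= t)
        rate = pieces[idx][1]
        end = pieces[idx + 1][0] if idx + 1 < len(pieces) else float("inf")
        if rate <= 0:
            if end == float("inf"):
                raise ValueError("final piece has zero rate; target is never reached")
            t = end  # nobody can arrive in a zero-rate piece
            continue
        wait = rng.expovariate(rate)
        if t + wait >= end:
            # No arrival before the rate changes. Because exponential waits
            # are memoryless, we can jump to the boundary and redraw.
            t = end
        else:
            t += wait
            enrolled += 1
    return t


if __name__ == "__main__":
    rng = random.Random(0)
    table = [(25.0, 16.0), (28.0, 32.0)]  # rates quoted in the answer above
    print(simulate_piecewise_enrollment(table, 25.0, 100, rng))
```

Redrawing at each boundary keeps the code short and mirrors how a multi-row rate table reads; a per-site version would need one such table per site.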
OK, there's a number of other questions. But that said, if you have any questions about purchasing nQuery Predict, or technical issues, or any other questions about it, I would recommend e-mailing us, and we'll be happy to provide you with all the information that you may need for that. And, as I said, for any questions I didn't get to today, I'll make sure to e-mail you later on with my response and provide you access to the information as needed.
So, I just want to apologize for overrunning a little bit here, but I want to thank you so much for attending today, and I hope you have a very good day. Goodbye.