
Biostatistics


14.2 TIME-TO-EVENT DATA AND CENSORING

The probability distribution function, just as defined in Chapter 4, is represented by the set of probabilities that specify the possible values of a random variable. In the context of survival analysis, this density function represents the probability of an event occurring in a defined interval of time. We might ask, for example, what is the probability of surviving 2 months? Although fully appreciating the intricacies of this probability distribution requires knowledge of calculus, we can illustrate its meaning conceptually by remembering a concept from our discussion of the normal distribution in Chapter 4. When we calculated probabilities for the normal distribution, we were interested in calculating the area under a curve that was bounded by two values. Similarly, in survival analysis we are interested in calculating the probability of an event bounded by an interval of time, say $\Delta t$, and then finding our probability as the interval becomes very small, that is, as $\Delta t \to 0$. Hence, the probability distribution function, $f(t)$, is defined by

$$f(t) = \frac{P(t \le T < t + \Delta t)}{\Delta t}, \quad \text{as } \Delta t \to 0 \qquad (14.2.3)$$

That is, the set of probabilities of events that occur in an infinitesimally small interval of time defines the probability function. It is also possible to find this function by examining what happens during a change in $F(t)$, say $\Delta F(t)$, or a change in $S(t)$, say $\Delta S(t)$, in a given interval of time. Because $S(t) = 1 - F(t)$, an increase in $F(t)$ is matched by an equal decrease in $S(t)$, so the change in $S(t)$ enters with a negative sign. That is,

$$f(t) = \frac{\Delta F(t)}{\Delta t} = -\frac{\Delta S(t)}{\Delta t} \qquad (14.2.4)$$
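The relationship between the density and the changes in $F(t)$ and $S(t)$ can be checked numerically. The following is a minimal sketch, assuming an exponentially distributed survival time with a hypothetical rate of 0.5 events per month; the distribution, the rate, and all function names are illustrative choices and do not come from the text. It approximates $f(t)$ by the change in $F(t)$, and by the negative of the change in $S(t)$, over shrinking intervals.

```python
import math

RATE = 0.5  # assumed event rate per month (hypothetical value)


def cdf(t):
    """F(t) = P(T <= t) for an exponential survival time."""
    return 1.0 - math.exp(-RATE * t)


def survival(t):
    """S(t) = 1 - F(t) for the same distribution."""
    return math.exp(-RATE * t)


def density_from_cdf(t, dt):
    """Approximate f(t) by the change in F(t) divided by dt."""
    return (cdf(t + dt) - cdf(t)) / dt


def density_from_survival(t, dt):
    """Approximate f(t) by the negative change in S(t) divided by dt."""
    return -(survival(t + dt) - survival(t)) / dt


def exact_density(t):
    """Closed-form exponential density, used only as a reference value."""
    return RATE * math.exp(-RATE * t)


if __name__ == "__main__":
    t = 2.0  # evaluate the density at 2 months
    for dt in (1.0, 0.1, 0.001):
        print(dt, density_from_cdf(t, dt), density_from_survival(t, dt))
    print("exact", exact_density(t))
```

Both approximations move toward the closed-form density as the interval shrinks, which is the sense in which $f(t)$ is obtained as $\Delta t \to 0$.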

Finally, a function that is often encountered in survival analysis is the hazard function, $h(t)$. This function is used to define the instantaneous probability of an event occurring given that the subject has survived up to a given time, $t$. This function is defined as

$$h(t) = \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}, \quad \text{as } \Delta t \to 0 \qquad (14.2.5)$$

Note that this function is based on a conditional probability, wherein we are interested in calculating the probability of an event occurring given that the subject has already survived to a defined time. The condition of having already survived to a given time means that the probability of surviving into the future is influenced by having already survived previous time periods. This idea can be very important in some instances, where surviving the early stages of a disease may dramatically decrease the chance of an event occurring in the near future. As an example, consider cancer, where nonrecurrence, or remission, for a period of 5 years generally increases survivorship. The hazard function can also be expressed in terms of two functions previously defined. This expression is

$$h(t) = \frac{f(t)}{S(t)} \qquad (14.2.6)$$
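The agreement between the conditional-probability definition (14.2.5) and the ratio form (14.2.6) can be illustrated numerically. The sketch below assumes a Weibull survival time with hypothetical shape and scale parameters; these values and the function names are chosen only for illustration and are not taken from the text. The conditional probability is computed as $[S(t) - S(t + \Delta t)] / S(t)$ for a small $\Delta t$.

```python
import math

SHAPE = 1.5  # assumed Weibull shape parameter (hypothetical value)
SCALE = 0.5  # assumed Weibull scale parameter (hypothetical value)


def survival(t):
    """S(t) for a Weibull survival time."""
    return math.exp(-((t / SCALE) ** SHAPE))


def density(t):
    """f(t) for the same Weibull distribution."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1) * survival(t)


def hazard_from_ratio(t):
    """h(t) computed as f(t) / S(t), following (14.2.6)."""
    return density(t) / survival(t)


def hazard_from_definition(t, dt=1e-6):
    """h(t) approximated from (14.2.5) with a small but finite dt."""
    # P(t <= T < t + dt | T >= t) = [S(t) - S(t + dt)] / S(t)
    conditional_prob = (survival(t) - survival(t + dt)) / survival(t)
    return conditional_prob / dt


if __name__ == "__main__":
    t = 1.0
    print("ratio form:     ", hazard_from_ratio(t))
    print("definition form:", hazard_from_definition(t))
```

With these assumed parameters the hazard at $t = 1$ is about 4.24. The value is a rate per unit of time, which is why it is not confined to the interval from 0 to 1.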

Because the hazard function can exceed 1, it is not truly a probability, though it is based on the conditional probability of an event occurring. The hazard function is often defined in survival analysis by a known distribution such as the lognormal, exponential, or Weibull
