STOCHASTIC


W. T. ZIEMBA

the use of the dynamic programming approach. The paper presents a general framework and a mathematical analysis that attempts to isolate those problems for which the dynamic programming approach is valid. The exposition is an adaptation of Denardo and Mitten (1967) that is based in part on some of their earlier work (see the literature [4, 5, 8]).²

We will utilize the term sequential decision processes (SDPs) to describe the phenomenon under study. It will be assumed that the mechanism governing the evolution of the process is explicit and that the SDPs satisfy certain termination and monotonicity assumptions. The termination assumption guarantees that the process is completed in a (fixed) finite number of transitions. The basic result for such processes is that there exists an optimal policy. The development also yields an algorithm for determining an optimal policy and its return function.
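As a rough illustration of the kind of algorithm meant here, the following is a minimal sketch of backward induction for a terminating process, assuming a deterministic transition rule and an additive return (both are simplifying assumptions, not the general framework of the text). The names `backward_induction`, `decisions`, `transition`, and `reward`, and the toy problem itself, are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

State = Hashable
Decision = Hashable


def backward_induction(
    n_stages: int,
    states: Iterable[State],
    decisions: Callable[[int, State], Iterable[Decision]],
    transition: Callable[[int, State, Decision], State],
    reward: Callable[[int, State, Decision], float],
) -> Tuple[Dict[Tuple[int, State], float], Dict[Tuple[int, State], Decision]]:
    """Return (value, policy), both keyed by (stage, state).

    Assumes a fixed finite number of stages (termination) and an
    additive return, which is nondecreasing in the continuation value.
    """
    states = list(states)
    # After the last transition nothing more can be earned.
    value = {(n_stages, x): 0.0 for x in states}
    policy = {}
    # Work from the last stage back to the first.
    for n in range(n_stages - 1, -1, -1):
        for x in states:
            best_d, best_v = None, float("-inf")
            for d in decisions(n, x):
                v = reward(n, x, d) + value[(n + 1, transition(n, x, d))]
                if v > best_v:  # ties keep the first decision listed
                    best_d, best_v = d, v
            value[(n, x)] = best_v
            policy[(n, x)] = best_d
    return value, policy


# Hypothetical toy problem: states 0, 1, 2 and four stages.
# Decision 0 ("stay") earns the current state index;
# decision 1 ("advance") earns nothing now but moves one state up.
value, policy = backward_induction(
    n_stages=4,
    states=[0, 1, 2],
    decisions=lambda n, x: [0, 1] if x < 2 else [0],
    transition=lambda n, x, d: x + d,
    reward=lambda n, x, d: float(x) if d == 0 else 0.0,
)
print(value[(0, 0)], policy[(0, 0)])
```

In this toy problem the computed policy advances twice from state 0 and then stays, for a total return of 4. Every (stage, state) pair is visited exactly once, so the procedure stops after a fixed number of steps, mirroring the termination assumption.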

The analysis of terminating SDPs requires less formidable mathematical tools than are required for the analysis of general SDPs. However, the more complex models, such as infinite horizon models, share many features of the terminating models. The language used to describe terminating models may be used for general SDPs. Also, the monotonicity property plays a central role in the analysis of these models as well as of the terminating models. Furthermore, the property of evolution dependence, discussed in Section V, facilitates the analysis of a large class of infinite horizon models using the methods discussed here. The development illustrates the central role in dynamic programming played by the notion of an optimal policy. This notion may be accepted as a premise called the "optimality postulate" in which it is assumed that an optimal policy does exist. Utilizing this postulate and the monotonicity assumption, the functional equations are derived.
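For orientation, one familiar special case of such functional equations is the additive, deterministic, finite-horizon recursion. The notation below is an assumption introduced for illustration, not taken from the text: $f_n(x)$ is the optimal return from state $x$ at stage $n$, $D_n(x)$ the set of admissible decisions, $r_n$ the stage return, and $t_n$ the transition rule.

```latex
f_N(x) = 0, \qquad
f_n(x) = \max_{d \in D_n(x)} \bigl[\, r_n(x, d) + f_{n+1}\bigl(t_n(x, d)\bigr) \bigr],
\quad n = N-1, N-2, \dots, 0 .
```

In this special case, monotonicity amounts to the observation that the bracketed expression is nondecreasing in the continuation value: if $u \ge v$ pointwise, then $r_n(x, d) + u(t_n(x, d)) \ge r_n(x, d) + v(t_n(x, d))$ for every $x$ and $d$.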

II. Sequential Decision Processes

A review of the wide variety of problems that have been treated by dynamic programming reveals that the vast majority (if not all) may be characterized as processes that pass through a set of states in response to a sequence of choices of decisions. The values associated with the process typically depend on both the states traversed and the decisions made. The following five factors may be identified as the basic elements of dynamic programming problems: stages, states, decisions, transitions, and returns. These terms will be given

² Throughout this book, references cited by numbers in square brackets will be found in the reference lists at the ends of the articles in which they occur.

44 PART I MATHEMATICAL TOOLS
