Information Sharing in a Multi-Echelon Inventory System
ordering costs are charged at both the supplier and the retailer, whereas fixed ordering costs only occur at the highest echelon in Clark and Scarf’s model. The assumption of fixed ordering costs is realistic but makes the analysis difficult, because the model no longer possesses structural properties for optimal inventory policies.
The system was modeled as a discrete-time Markov decision process (DTMDP) to obtain the optimal inventory policies. The state space of the centralized system is
$$S = \{(x_0, x_1) \mid 0 \le x_i \le U_i,\ i = 0, 1\} \tag{1}$$
The action space of the system at state $(x_0, x_1)$ is
$$A(x_0, x_1) = \{(a_0, a_1) \mid 0 \le a_i \le U_i - x_i,\ i = 0, 1\} \tag{2}$$
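The state and action spaces in Eqs. (1) and (2) can be enumerated directly. The following is a minimal sketch, assuming integer inventory levels; the function names and the capacities `U0 = 3`, `U1 = 2` are hypothetical and chosen only for illustration.

```python
from itertools import product


def state_space(U0, U1):
    """All states (x0, x1) with 0 <= xi <= Ui, as in Eq. (1)."""
    return list(product(range(U0 + 1), range(U1 + 1)))


def action_space(state, U0, U1):
    """All order pairs (a0, a1) with 0 <= ai <= Ui - xi, as in Eq. (2)."""
    x0, x1 = state
    return list(product(range(U0 - x0 + 1), range(U1 - x1 + 1)))


# Hypothetical capacities, not taken from the paper.
U0, U1 = 3, 2
states = state_space(U0, U1)
actions_at_1_1 = action_space((1, 1), U0, U1)
```

With these capacities the state space has (3 + 1)(2 + 1) = 12 elements, and a stage that is already at capacity can only order zero.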
The one-step transition probability from state $(x_0, x_1)$ to state $(x_0', x_1')$ with action $(a_0, a_1)$ is
$$P\{(x_0', x_1') \mid (x_0, x_1); (a_0, a_1)\} = P\{(x_0', x_1') \mid (\bar{x}_0, \bar{x}_1)\} = P\{x_0' \mid \bar{x}_0\} \cdot P\{x_1' \mid \bar{x}_1\} \tag{3}$$
where $\bar{x}_i$ ($i = 0, 1$) is the state of stage $i$ after the shipment from the supplier to the retailer and before the customer receives his order. Since $x_1 \le \bar{x}_1 \le x_1 + a_1$, then
$$P\{x_0' \mid \bar{x}_0\} = \begin{cases} 1, & \text{if } x_0' = \bar{x}_0; \\ 0, & \text{else} \end{cases} \tag{4}$$
For the retailer,
$$P\{x_1' \mid \bar{x}_1\} = \begin{cases} P\{D = \bar{x}_1 - x_1'\}, & \text{if } 0 < x_1' \le \bar{x}_1; \\ P\{D \ge \bar{x}_1\}, & \text{if } x_1' = 0; \\ 0, & \text{if } x_1' > \bar{x}_1 \end{cases} \tag{5}$$
The transition probability depends only on the current state for a given policy.
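The two factors of the transition probability can be sketched in code. This is an illustration under stated assumptions, not the authors' implementation: the demand distribution is taken to be Poisson purely as an example, and the names `x0_post` and `x1_post` are hypothetical labels for each stage's inventory after the shipment from the supplier to the retailer and before demand is served.

```python
import math


def poisson_pmf(k, lam):
    """P{D = k} for Poisson demand (an assumed demand model)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def supplier_transition(x0_next, x0_post):
    """Eq. (4): the supplier's post-shipment state carries over unchanged."""
    return 1.0 if x0_next == x0_post else 0.0


def retailer_transition(x1_next, x1_post, demand_pmf):
    """Eq. (5): retailer inventory after demand D is served from x1_post."""
    if x1_next < 0 or x1_next > x1_post:
        return 0.0
    if x1_next == 0:
        # P{D >= x1_post} = 1 - P{D <= x1_post - 1}
        return 1.0 - sum(demand_pmf(d) for d in range(x1_post))
    return demand_pmf(x1_post - x1_next)


# Hypothetical mean demand of 1.5 units per period.
demand = lambda k: poisson_pmf(k, 1.5)
row = [retailer_transition(x, 4, demand) for x in range(5)]
```

The entries of `row` form a probability distribution over the retailer's next state, which is a quick sanity check on any demand model plugged in here.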
Let $\mathbf{1}\{\cdot\}$ denote the indicator function. The one-step cost is
$$c((x_0', x_1') \mid (x_0, x_1); (a_0, a_1)) = c_0((x_0', x_1') \mid (x_0, x_1); (a_0, a_1)) + c_1((x_0', x_1') \mid (x_0, x_1); (a_0, a_1)) \tag{6}$$
where
$$c_0((x_0', x_1') \mid (x_0, x_1); (a_0, a_1)) = h_0 x_0' + p_0 (a_1 - \bar{x}_1 + x_1) + q_0 a_0 + K_0 \cdot \mathbf{1}\{a_0 > 0\} \tag{7}$$
Tsinghua Science and Technology, August 2007, 12(4): 466-474
$$c_1((x_0', x_1') \mid (x_0, x_1); (a_0, a_1)) = h_1 x_1' + q_1 (\bar{x}_1 - x_1) + K_1 \cdot \mathbf{1}\{\bar{x}_1 - x_1 > 0\} + p_1 \max\{0, d - \bar{x}_1\} \cdot \mathbf{1}\{x_1' = 0\} \tag{8}$$
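The two cost components can be written out as plain functions. This is a hedged sketch: the reading of $h$, $q$, $K$, and $p$ as holding, per-unit ordering, fixed ordering, and penalty cost parameters is an assumption, since their definitions fall outside this section, and `x1_post` is a hypothetical name for the retailer's inventory after the shipment arrives.

```python
def supplier_cost(x0_next, x1, x1_post, a0, a1, h0, p0, q0, K0):
    """Supplier's one-step cost, following Eq. (7)."""
    unshipped = a1 - (x1_post - x1)  # part of the retailer's order not delivered
    fixed = K0 if a0 > 0 else 0      # K0 * 1{a0 > 0}
    return h0 * x0_next + p0 * unshipped + q0 * a0 + fixed


def retailer_cost(x1_next, x1, x1_post, d, h1, p1, q1, K1):
    """Retailer's one-step cost, following Eq. (8); d is the realized demand."""
    received = x1_post - x1                               # units actually shipped
    fixed = K1 if received > 0 else 0                     # K1 * 1{received > 0}
    shortage = max(0, d - x1_post) if x1_next == 0 else 0
    return h1 * x1_next + q1 * received + fixed + p1 * shortage


# Hypothetical numbers for illustration only.
c0 = supplier_cost(x0_next=2, x1=1, x1_post=3, a0=4, a1=3, h0=1, p0=5, q0=2, K0=10)
c1 = retailer_cost(x1_next=0, x1=1, x1_post=3, d=5, h1=1, p1=4, q1=2, K1=3)
total = c0 + c1  # Eq. (6)
```

In this example the retailer ordered 3 units but received only 2, so the supplier pays a penalty on 1 unshipped unit, and the retailer faces a shortage of 2 units against a demand of 5.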
The long-run average cost of the system, given policy $\varphi = \{(a_0(x_0, x_1), a_1(x_0, x_1)) \in A(x_0, x_1),\ (x_0, x_1) \in S\}$, is then given by
$$C(\varphi) = \sum_{(x_0, x_1) \in S} \pi(x_0, x_1) \cdot EC((x_0, x_1); (a_0, a_1)) \tag{9}$$
where $\pi(x_0, x_1)$ is the stationary distribution of state $(x_0, x_1)$ for policy $\varphi$ and $EC((x_0, x_1); (a_0, a_1))$ is the expected one-step cost at state $(x_0, x_1)$.
$$EC((x_0, x_1); (a_0, a_1)) = \sum_{(x_0', x_1') \in S} P\{(x_0', x_1') \mid (x_0, x_1); (a_0, a_1)\} \cdot c((x_0', x_1') \mid (x_0, x_1); (a_0, a_1)) \tag{10}$$
Thus the optimal policy of the centralized system is the policy that minimizes the long-run average cost $C(\varphi)$ in Eq. (9). According to Puterman [11], the policy can be obtained using a standard value iteration algorithm or a policy iteration algorithm.
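For the average-cost criterion, one common variant of value iteration is relative value iteration. The sketch below is a generic illustration on a hypothetical two-state, two-action MDP, not the authors' implementation. The arrays `P` and `c`, the function name, and the stopping rule are all assumptions, and convergence typically requires a unichain, aperiodic model.

```python
def relative_value_iteration(P, c, tol=1e-9, max_iter=10_000):
    """Average-cost relative value iteration.

    P[s][a] is the list of transition probabilities to each next state,
    c[s][a] is the expected one-step cost of action a in state s.
    Returns (gain, policy, h) with h the relative value function.
    """
    n = len(c)
    h = [0.0] * n
    for _ in range(max_iter):
        # Q-values under the current relative values h
        q = [[c[s][a] + sum(p * h[t] for t, p in enumerate(P[s][a]))
              for a in range(len(c[s]))] for s in range(n)]
        Th = [min(row) for row in q]
        policy = [row.index(min(row)) for row in q]
        diff = [Th[s] - h[s] for s in range(n)]
        h = [v - Th[0] for v in Th]  # normalize against reference state 0
        if max(diff) - min(diff) < tol:  # span stopping criterion
            break
    gain = 0.5 * (max(diff) + min(diff))
    return gain, policy, h


# Hypothetical toy MDP: 2 states, 2 actions per state.
P = [[[0.5, 0.5], [0.9, 0.1]],
     [[0.5, 0.5], [0.5, 0.5]]]
c = [[1.0, 3.0],
     [4.0, 2.0]]
gain, policy, h = relative_value_iteration(P, c)
```

For the inventory model, `P` and `c` would be filled from the transition probabilities and expected one-step costs defined above, with each state-action pair mapped to an index.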
3 Decentralized System
In a decentralized system, the supplier and the retailer make their decisions independently. Each stage seeks to minimize its own expected long-run average cost.
3.1 Full information sharing
If full information sharing is available in the system, the supplier’s ordering policies and state are known to the retailer, and vice versa. Therefore, the state spaces of the supplier and the retailer are the same.
$$S_0 = S_1 = \{(x_0, x_1) \mid 0 \le x_i \le U_i,\ i = 0, 1\} \tag{11}$$
$$A_0(x_0, x_1) = \{a_0(x_0, x_1) \mid 0 \le a_0(x_0, x_1) \le U_0 - x_0\} \tag{12}$$
$$A_1(x_0, x_1) = \{a_1(x_0, x_1) \mid 0 \le a_1(x_0, x_1) \le U_1 - x_1\} \tag{13}$$
Assume that the supplier adopts a stationary policy $\varphi_0 = \{a_0(x_0, x_1) \in A_0(x_0, x_1),\ (x_0, x_1) \in S_0\}$, which is known to the retailer. At state $(x_0, x_1)$, the retailer orders $a_1(x_0, x_1)$ and knows that the supplier will order $a_0^{\varphi_0}(x_0, x_1)$. Therefore, the one-step transition probability of the retailer from state $(x_0, x_1)$ to state $(x_0', x_1')$ with action $a_1(x_0, x_1)$ is determined by