
Information Sharing in a Multi-Echelon Inventory System


ordering costs are charged at both the supplier and the retailer, whereas fixed ordering costs only occur at the highest echelon in Clark and Scarf’s model. The assumption of fixed ordering costs is realistic, but it complicates the analysis: the model no longer possesses structural properties for optimal inventory policies.

The system was modeled as a discrete time Markov decision process (DTMDP) to obtain the optimal inventory policies. The state space of the centralized system is

$$S = \{ (x_0, x_1) \mid 0 \le x_i \le U_i,\ i = 0, 1 \} \tag{1}$$

The action space of the system at state $(x_0, x_1)$ is

$$A(x_0, x_1) = \{ (a_0, a_1) \mid 0 \le a_i \le U_i - x_i,\ i = 0, 1 \} \tag{2}$$
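The sets in Eqs. (1) and (2) are finite and can be enumerated directly. The following is a minimal sketch, assuming integer inventory levels and integer capacities; the function names `state_space` and `action_space` are hypothetical and chosen only for illustration.

```python
def state_space(U0, U1):
    """All states (x0, x1) with 0 <= xi <= Ui, following Eq. (1)."""
    return [(x0, x1) for x0 in range(U0 + 1) for x1 in range(U1 + 1)]


def action_space(state, U0, U1):
    """All order pairs (a0, a1) with 0 <= ai <= Ui - xi, following Eq. (2)."""
    x0, x1 = state
    return [(a0, a1)
            for a0 in range(U0 - x0 + 1)
            for a1 in range(U1 - x1 + 1)]


if __name__ == "__main__":
    # Illustrative capacities only (assumed values, not from the paper).
    print(len(state_space(2, 3)))
    print(action_space((1, 2), 2, 3))
```

The number of states grows as $(U_0 + 1)(U_1 + 1)$, so full enumeration is only practical for moderate capacities.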

The one-step transition probability from state $(x_0, x_1)$ to state $(x'_0, x'_1)$ with action $(a_0, a_1)$ is

$$P\{ (x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1) \} = P\{ (x'_0, x'_1) \mid (\bar{x}_0, \bar{x}_1) \} = P\{ x'_0 \mid \bar{x}_0 \} \cdot P\{ x'_1 \mid \bar{x}_1 \} \tag{3}$$

where $\bar{x}_i$ ($i = 0, 1$) is the state of stage $i$ after the shipment from the supplier to the retailer and before the customer receives his order. Since $x_1 \le \bar{x}_1 \le x_1 + a_1$, then

$$P\{ x'_0 \mid \bar{x}_0 \} = \begin{cases} 1, & \text{if } x'_0 = \bar{x}_0; \\ 0, & \text{else} \end{cases} \tag{4}$$

For the retailer,

$$P\{ x'_1 \mid \bar{x}_1 \} = \begin{cases} P\{ D = \bar{x}_1 - x'_1 \}, & \text{if } 0 < x'_1 \le \bar{x}_1; \\ P\{ D \ge \bar{x}_1 \}, & \text{if } x'_1 = 0; \\ 0, & \text{if } x'_1 > \bar{x}_1 \end{cases} \tag{5}$$

The transition probability only depends on the current state for a given policy.
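Equations (3) to (5) translate almost line by line into code. The sketch below assumes that the demand $D$ is a non-negative integer whose distribution is supplied as a list `demand_pmf` with `demand_pmf[d]` equal to $P\{D = d\}$; that representation and all function names are illustrative assumptions, not part of the original model description.

```python
def supplier_transition(x0_next, x0_bar):
    """Eq. (4): the supplier's post-shipment level carries over unchanged."""
    return 1.0 if x0_next == x0_bar else 0.0


def retailer_transition(x1_next, x1_bar, demand_pmf):
    """Eq. (5): probability of ending at x1_next from post-shipment level x1_bar."""
    if x1_next < 0 or x1_next > x1_bar:
        return 0.0                       # stock cannot grow through demand
    if x1_next == 0:
        return sum(demand_pmf[x1_bar:])  # P{D >= x1_bar}: all stock is consumed
    d = x1_bar - x1_next                 # demand that leaves exactly x1_next units
    return demand_pmf[d] if d < len(demand_pmf) else 0.0


def joint_transition(next_state, bar_state, demand_pmf):
    """Eq. (3): product of the supplier and retailer factors."""
    return (supplier_transition(next_state[0], bar_state[0])
            * retailer_transition(next_state[1], bar_state[1], demand_pmf))


if __name__ == "__main__":
    pmf = [0.25, 0.5, 0.25]  # assumed demand distribution on {0, 1, 2}
    print([retailer_transition(k, 3, pmf) for k in range(4)])
```

For any post-shipment level, the retailer probabilities over all next states sum to one, which is a quick sanity check on a chosen demand distribution.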

Denote function $1\{\cdot\}$ as an indicator function. The one-step cost is

$$c((x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1)) = c_0((x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1)) + c_1((x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1)) \tag{6}$$

where

$$c_0((x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1)) = h_0 x'_0 + p_0 (a_1 - \bar{x}_1 + x_1) + q_0 a_0 + K_0 \cdot 1\{ a_0 > 0 \} \tag{7}$$

Tsinghua Science and Technology, August 2007, 12(4): 466-474

$$c_1((x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1)) = h_1 x'_1 + q_1 (\bar{x}_1 - x_1) + K_1 \cdot 1\{ \bar{x}_1 - x_1 > 0 \} + p_1 \max\{ 0, d - \bar{x}_1 \} \cdot 1\{ x'_1 = 0 \} \tag{8}$$
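The two cost components in Eqs. (7) and (8) can be evaluated with a few lines of arithmetic. The sketch below reads $h_i$, $p_i$, $q_i$ and $K_i$ as holding, shortage, per-unit ordering and fixed ordering cost rates; that reading is an assumption, since the definitions fall outside this section. The argument `x1_bar` stands for the retailer's level after the shipment, and `d` for the realized demand. All function and parameter names are hypothetical.

```python
def supplier_cost(x0_next, x1, x1_bar, a0, a1, h0, p0, q0, K0):
    """Supplier cost c0 of Eq. (7)."""
    unfilled = a1 - (x1_bar - x1)        # part of the retailer's order not shipped
    fixed = K0 if a0 > 0 else 0          # K0 * 1{a0 > 0}
    return h0 * x0_next + p0 * unfilled + q0 * a0 + fixed


def retailer_cost(x1_next, x1, x1_bar, d, h1, p1, q1, K1):
    """Retailer cost c1 of Eq. (8)."""
    received = x1_bar - x1               # units actually delivered this period
    fixed = K1 if received > 0 else 0    # K1 * 1{x1_bar - x1 > 0}
    shortage = p1 * max(0, d - x1_bar) if x1_next == 0 else 0
    return h1 * x1_next + q1 * received + fixed + shortage


def one_step_cost(x0_next, x1_next, x1, x1_bar, a0, a1, d, sup, ret):
    """Total cost of Eq. (6); sup and ret are (h, p, q, K) tuples."""
    return (supplier_cost(x0_next, x1, x1_bar, a0, a1, *sup)
            + retailer_cost(x1_next, x1, x1_bar, d, *ret))


if __name__ == "__main__":
    # Assumed cost rates, for illustration only.
    print(one_step_cost(3, 0, 1, 3, 2, 4, 5, (1, 5, 2, 10), (1, 4, 2, 3)))
```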

The long-run average cost of the system, given policy $\varphi = \{ a_0(x_0, x_1), a_1(x_0, x_1) \in A(x_0, x_1),\ (x_0, x_1) \in S \}$, is then given by

$$C(\varphi) = \sum_{(x_0, x_1) \in S} \pi(x_0, x_1) \cdot EC((x_0, x_1); (a_0, a_1)) \tag{9}$$

where $\pi(x_0, x_1)$ is the stationary distribution of state $(x_0, x_1)$ for policy $\varphi$ and $EC((x_0, x_1); (a_0, a_1))$ is the expected one-step cost at state $(x_0, x_1)$.

$$EC((x_0, x_1); (a_0, a_1)) = \sum_{(x'_0, x'_1) \in S} P\{ (x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1) \} \cdot c((x'_0, x'_1) \mid (x_0, x_1); (a_0, a_1)) \tag{10}$$
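For a fixed policy, Eqs. (9) and (10) reduce to a weighted sum over a finite Markov chain. The sketch below assumes that the states are indexed $0, \dots, n-1$, that the policy's transition matrix `P` is given as a list of rows, and that the induced chain is aperiodic with a unique stationary distribution, so that simple power iteration converges. The function names are hypothetical.

```python
def expected_one_step_cost(prob_row, cost_row):
    """Eq. (10): probability-weighted cost over all next states."""
    return sum(p * c for p, c in zip(prob_row, cost_row))


def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Power iteration pi <- pi P, started from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def long_run_average_cost(P, expected_costs):
    """Eq. (9): stationary probabilities times expected one-step costs."""
    pi = stationary_distribution(P)
    return sum(p * c for p, c in zip(pi, expected_costs))


if __name__ == "__main__":
    # Toy two-state chain with assumed numbers, not taken from the paper.
    P = [[0.5, 0.5], [0.25, 0.75]]
    print(long_run_average_cost(P, [3.0, 0.0]))
```

In the toy chain the stationary distribution is $(1/3, 2/3)$, so the average cost is close to 1.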

Thus the optimal policy of the centralized system is the policy that minimizes the cost in Eq. (10). According to Puterman [11], the policy can be obtained using a standard value iteration algorithm or a policy iteration algorithm.
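As an illustration of the value iteration route, the sketch below implements relative value iteration, a common variant for the long-run average cost criterion. It is not necessarily the exact procedure used by the authors. It assumes a finite model that is unichain and aperiodic, and it takes the model through three hypothetical callbacks: `actions(s)` returns the feasible actions, `trans(s, a)` returns a dictionary of next-state probabilities, and `cost(s, a)` returns the expected one-step cost.

```python
def relative_value_iteration(states, actions, trans, cost,
                             tol=1e-9, max_iter=100_000):
    """Return (average cost, policy) for an average-cost MDP."""
    h = {s: 0.0 for s in states}   # relative value function
    ref = states[0]                # reference state used for normalization
    gain, q = 0.0, {}
    for _ in range(max_iter):
        # One Bellman backup for every state-action pair.
        q = {s: {a: cost(s, a)
                 + sum(p * h[t] for t, p in trans(s, a).items())
                 for a in actions(s)}
             for s in states}
        v = {s: min(q[s].values()) for s in states}
        gain = v[ref]              # estimate of the optimal average cost
        new_h = {s: v[s] - gain for s in states}
        converged = max(abs(new_h[s] - h[s]) for s in states) < tol
        h = new_h
        if converged:
            break
    policy = {s: min(q[s], key=q[s].get) for s in states}
    return gain, policy


if __name__ == "__main__":
    # Toy model with assumed numbers: both actions mix the two states evenly.
    costs = {(0, "x"): 2.0, (0, "y"): 4.0, (1, "x"): 0.0, (1, "y"): 1.0}
    gain, policy = relative_value_iteration(
        [0, 1],
        lambda s: ["x", "y"],
        lambda s, a: {0: 0.5, 1: 0.5},
        lambda s, a: costs[(s, a)],
    )
    print(gain, policy)
```

For the inventory model, the states and actions would come from Eqs. (1) and (2), the transition dictionaries from Eqs. (3) to (5), and the expected costs from Eq. (10).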

3 Decentralized System

In a decentralized system, the supplier and the retailer make their decisions independently. The decision objective of both stages is to minimize their own expected long-run-average costs.

3.1 Full information sharing

If full information sharing is available in the system, the supplier’s ordering policies and state are known to the retailer, and vice versa. Therefore, the state spaces of the supplier and the retailer are the same.

$$S_0 = S_1 = \{ (x_0, x_1) \mid 0 \le x_i \le U_i,\ i = 0, 1 \} \tag{11}$$

$$A_0(x_0, x_1) = \{ a_0(x_0, x_1) \mid 0 \le a_0(x_0, x_1) \le U_0 - x_0 \} \tag{12}$$

$$A_1(x_0, x_1) = \{ a_1(x_0, x_1) \mid 0 \le a_1(x_0, x_1) \le U_1 - x_1 \} \tag{13}$$

Assume that the supplier adopts a stationary policy $\varphi_0 = \{ a_0(x_0, x_1) \in A_0(x_0, x_1),\ (x_0, x_1) \in S_0 \}$, which is known to the retailer. At state $(x_0, x_1)$, the retailer orders $a_1(x_0, x_1)$ and the retailer knows that the supplier will order $a_0^{\varphi_0}(x_0, x_1)$. Therefore, the one-step transition probability of the retailer from state $(x_0, x_1)$ to state $(x'_0, x'_1)$ with action $a_1(x_0, x_1)$ is determined by
