Download - Academy Publisher

A second-order centered difference scheme is applied to approximate the time derivatives, and a fourth-order staggered scheme with centered differences is applied to approximate the spatial derivatives. From Eqs. (1), for example, $\sigma_{xy}^{n+1/2}$, $r_{xy}^{n+1/2}$ and $v_x^{n+1}$ can be approximated as:

$$\sigma_{xy}^{n+1/2}(i,j) = \sigma_{xy}^{n-1/2}(i,j) + \frac{\tau\,\mu(i,j)\,\tau_\varepsilon^{s}(i,j)}{\tau_\sigma(i,j)}\left(\frac{\partial v_x^{n}(i,j)}{\partial y} + \frac{\partial v_y^{n}(i,j)}{\partial x}\right) + \frac{\tau}{2}\left(r_{xy}^{n-1/2}(i,j) + r_{xy}^{n+1/2}(i,j)\right) \tag{5}$$

$$r_{xy}^{n+1/2}(i,j) = \left(1 + \frac{\tau}{2\tau_\sigma(i,j)}\right)^{-1}\left(\left(1 - \frac{\tau}{2\tau_\sigma(i,j)}\right) r_{xy}^{n-1/2}(i,j) - \frac{\tau\,\mu(i,j)}{\tau_\sigma(i,j)}\left(\frac{\tau_\varepsilon^{s}(i,j)}{\tau_\sigma(i,j)} - 1\right)\left(\frac{\partial v_x^{n}(i,j)}{\partial y} + \frac{\partial v_y^{n}(i,j)}{\partial x}\right)\right) \tag{6}$$

$$v_x^{n+1}(i,j) = v_x^{n}(i,j) + \frac{\tau}{\rho(i,j)}\left(\frac{\partial \sigma_{xy}^{n+1/2}(i,j)}{\partial y} + \frac{\partial \sigma_{xx}^{n+1/2}(i,j)}{\partial x} + f_x\right) \tag{7}$$

where i, j, k, n are the indices for the three spatial directions and time, respectively, and τ denotes the size of a time step. The discretization of the spatial differential operator, for example, is given by:

$$\frac{\partial v(i,j)}{\partial x} = \frac{1}{l}\Big(c_1\big(v(i+1,j) - v(i,j)\big) + c_2\big(v(i+2,j) - v(i-1,j)\big)\Big) \tag{8}$$

where l denotes the grid spacing and c1, c2 denote the differential coefficients.

B. PML absorbing boundary condition

To simulate an unbounded medium, an absorbing boundary condition (ABC) must be implemented to truncate the computational domain in numerical algorithms. Since the highly effective perfectly matched layer (PML) method [13] was proposed for electromagnetic waves, the PML has been widely used in finite-difference and finite-element methods [14]. The effectiveness of the PML for elastic waves in solids, and the absence of reflections at the interface between the PML and the regular elastic medium, have been proved [15].

III. IMPLEMENTATION ON GPUS

A. Overview

To compute the staggered-grid FD and PML equations on graphics hardware, we divide the geophysical model into regions represented as textures that hold the input and output parameters.
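As an illustration of the spatial operator in Eq. (8), the following minimal Python sketch evaluates the fourth-order staggered difference along x. The coefficient values c1 = 9/8 and c2 = -1/24 are the usual fourth-order staggered-grid choice and are an assumption here, since the text names c1 and c2 without giving values; the function name `staggered_dx` and the list-of-lists layout are likewise hypothetical.

```python
# Fourth-order staggered-grid first derivative along x, following Eq. (8).
# ASSUMPTION: C1 and C2 are the standard fourth-order staggered coefficients;
# the text names c1 and c2 but does not state their values.
C1 = 9.0 / 8.0
C2 = -1.0 / 24.0


def staggered_dx(v, i, j, l):
    """Approximate dv/dx at the staggered point between columns i and i+1.

    v is a 2D list indexed as v[i][j]; l is the grid spacing.
    Valid for 1 <= i <= len(v) - 3.
    """
    return (C1 * (v[i + 1][j] - v[i][j])
            + C2 * (v[i + 2][j] - v[i - 1][j])) / l


# Usage: sample v(x) = x**3 on a uniform grid. With these coefficients the
# stencil differentiates cubics exactly at the midpoint x = (i + 1/2) * l.
l = 0.5
v = [[(i * l) ** 3] for i in range(8)]
i = 3
x_mid = (i + 0.5) * l
approx = staggered_dx(v, i, 0, l)
exact = 3.0 * x_mid ** 2
```

In the GPU version described in this paper, the same arithmetic would run per texel inside a fragment program; the sketch only shows the stencil itself.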
Fig. 1 shows the division of a 2D model into a collection of regions. Two sets of alternating textures bound to an FBO represent the parameters of the frames at time t and t+1, respectively. All other input-only variables are stored similarly in 2D textures. To update the FBO textures at each time step, we render quadrilaterals (regions) mapped with the textures in order, so that a fragment program running at each texel computes the FD equations. After each frame, the FBO attachments are swapped and the most recently updated textures are used as inputs for the next simulation step.

B. Domain decomposition

The computational domain is divided into an interior region and PML regions. Outgoing waves are absorbed by the PML through high attenuation. For a 2D model, the size of the interior region is N1*N2, which is the interior part of the whole textures. The PML region covers 8 blocks, which can be divided into three types according to the computation performed: PML1, which applies the absorbing condition in both the x and z directions and has size NBC*NBC; PML2, which applies it only in the z direction; and PML3, which applies it only in the x direction. The vertex processors transform the computational domains, and the rasterizer determines all the pixels in the output buffer that they cover and generates fragments. In our approach, the velocity and stress fields are staggered in both space and time. Therefore, the velocity field is updated after the stress field is updated, and the subdomains are processed in the order interior region, PML2 (3), PML3 (2), PML1.

[Figure 1. Panel (A), vertex computational region: an interior region of size N1 = Width - 2*NBC by N2 = Height - 2*NBC, bordered by PML_2 blocks above and below, PML_3 blocks on the left and right, and PML_1 blocks of width NBC at the four corners, with axes X and Z; frames t and t+1 are swapped with N = 1-N. Panel (B), fragment processing: P_FBO, VX_FBO and VZ_FBO, holding P-Texture[N], P-Texture[1-N], Vx-Texture[N], Vx-Texture[1-N], Vz-Texture[N] and Vz-Texture[1-N], labelled WriteOnly and ReadOnly.]

Fig. 1. Implementation of the staggered-grid FD method with PML conditions on GPUs. Vertex processing switches computing regions and renders quadrilaterals (regions) mapped to textures; a fragment program runs at each texel to compute the frames at time t and t+1. FBO attachments are swapped and the most recently updated textures are then used as inputs for the next simulation step (the loop repeats).

C. Data representation

For 2D viscoelastic FD schemes, the computation on the interior region at each frame yields P_X, P_Z, P_XZ, V_X, V_Z, R_X, R_Z and R_XZ; the computation on the PML regions needs two extra components for each of P_X, P_Z, P_XZ, V_X and V_Z on each of the 4 absorbing boundaries. Considering both time t and t+1, there are actually (8+5*2*4)*2 = 96 arrays storing these components over the whole process.
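The domain decomposition of Section III.B (Fig. 1) can be sketched as a small Python helper that returns the rectangles to be rendered. This is only an illustrative layout under stated assumptions: the function name `decompose`, the (x0, z0, width, height) tuple convention and the 512*512 example grid are hypothetical and not taken from the paper; only the block types, their sizes and the processing order follow the text.

```python
def decompose(width, height, nbc):
    """Return {name: [(x0, z0, w, h), ...]} for the interior and PML blocks.

    The dictionary is built in the processing order given in the text:
    interior region, PML2, PML3, PML1.
    """
    n1 = width - 2 * nbc    # interior width  (N1 in Fig. 1)
    n2 = height - 2 * nbc   # interior height (N2 in Fig. 1)
    if n1 <= 0 or n2 <= 0:
        raise ValueError("grid too small for the requested PML thickness")
    right, bottom = nbc + n1, nbc + n2
    return {
        "interior": [(nbc, nbc, n1, n2)],
        # PML2: absorbing in z only (strips above and below the interior)
        "PML2": [(nbc, 0, n1, nbc), (nbc, bottom, n1, nbc)],
        # PML3: absorbing in x only (strips left and right of the interior)
        "PML3": [(0, nbc, nbc, n2), (right, nbc, nbc, n2)],
        # PML1: absorbing in both x and z (four NBC*NBC corners)
        "PML1": [(0, 0, nbc, nbc), (right, 0, nbc, nbc),
                 (0, bottom, nbc, nbc), (right, bottom, nbc, nbc)],
    }


# Usage with NBC = 10 (the value used in this work) on an assumed 512*512 grid.
regions = decompose(width=512, height=512, nbc=10)
pml_blocks = sum(len(r) for name, r in regions.items() if name != "interior")
covered = sum(w * h for rects in regions.values() for (_, _, w, h) in rects)
```

The two derived values give a quick consistency check: the PML consists of 8 blocks, and the nine rectangles together cover every texel of the viewport exactly once.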
To reduce the number of textures and to switch computational domains quickly, 16 textures are used to store the interior stress-velocity parameters. Each is called a whole-texture and has size width*height, where width and height denote the number of grid points in the x and z directions, respectively; this equals the size of the rendering viewport. The input seismic parameters Vp, Vs, Qp, Qs and ρ are also stored in whole-textures. 5*2 extra textures, called x-direction-textures, hold all velocity and stress parameters in the x absorbing direction; the size of each is (2*NBC, height), where NBC is the number of grid points in the absorbing boundary. In our work, NBC is 10. Similarly, there are 5*2 extra arrays in the z absorbing direction, called z-direction-textures, of size (width, 2*NBC). Another 20 arrays are needed for time t+1. From the above analysis, the memory requirement MEMReq (in M) for 2D viscoelastic modelling on an N*N grid is given by

$$\mathrm{MEM}_{Req} = \left(N^{2} \cdot C + 2N \cdot NBC \cdot D\right) \cdot F_z / 1024^{2} \tag{11}$$

where the total number C of whole-textures is 21, the total number D of x- and z-direction-textures is 40, and Fz, the size of a float, is 4 bytes. The factor 2N*NBC is the number of texels in one direction-texture of size (2*NBC, N); with it, Eq. (11) reproduces the values in Table I.

TABLE I
MEMORY REQUIREMENTS FOR 2D VISCOELASTIC MODEL

Grid size (N^2)    Memory requirement
512^2              22.56 M
1024^2             87.12 M
2048^2             342.25 M
4096^2             1356.50 M

Table I shows that only 2D viscoelastic models smaller than 2048*2048 grid points can be supported directly on current GPUs, since video memory is generally about 256-512 M.

D. Fragment processing

The results of each step are written to a two-dimensional memory location called the framebuffer. However, the GPU does not allow a buffer to be read and written at the same time. Fortunately, the OpenGL extension FrameBuffer Object (FBO) allows the framebuffer content to be reused as a texture for further processing in a subsequent rendering pass, and the "ping-pong" method lets two buffers swap roles after each write operation [3].
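The ping-pong idea can be illustrated on the CPU with a minimal Python sketch: two buffers per field, one read-only and one write-only during a pass, with their roles exchanged by toggling an index as N = 1 - N. The class name `PingPong`, the convention that buffer N is the write target, and the toy averaging update are assumptions made for the sketch; they stand in for the FBO attachments and the FD fragment program and are not the paper's implementation.

```python
class PingPong:
    """Two buffers for one field; buffer n is written, buffer 1 - n is read."""

    def __init__(self, size, fill=0.0):
        self.buffers = [[fill] * size, [fill] * size]
        self.n = 0

    @property
    def read(self):
        return self.buffers[1 - self.n]

    @property
    def write(self):
        return self.buffers[self.n]

    def swap(self):
        # Exchange roles after a pass: N = 1 - N
        self.n = 1 - self.n


def step(field):
    """Stand-in for one render pass: read-only source, write-only target."""
    src, dst = field.read, field.write
    for k in range(len(dst)):
        left = src[k - 1] if k > 0 else 0.0
        right = src[k + 1] if k < len(src) - 1 else 0.0
        dst[k] = 0.5 * (left + right)  # toy update, NOT the FD scheme
    field.swap()


field = PingPong(5)
field.read[2] = 1.0  # the initial condition goes into the buffer read first
for _ in range(2):
    step(field)
result = field.read  # latest frame after two passes
```

Because a pass never reads the buffer it writes, every output element depends only on the previous frame, which is the property the GPU restriction enforces.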
To generate and store intermediate results for each simulation step on the GPU, our strategy is to bind each texture that is rendered to as one attachment of an assigned FBO; two textures bound to the same FBO represent one stress-velocity component at time t and t+1, respectively. After each frame, the FBO attachments are swapped by setting N = 1 - N (N = 0, 1), and the most recently updated textures are used as inputs for the next simulation step.

IV. NUMERICAL EXAMPLES

We tested our GPU-accelerated method, implemented in C++/OpenGL and the Cg shader language, on a PC with an Intel Core(TM)2 Duo E7400 2.8 GHz CPU and 2 G of main memory. The graphics chip is an NVidia GeForce 9800GT with 512 M of video memory and a 550 MHz core frequency. The OS is Windows XP. All GPU results are based on 32-bit single-precision computation in the rendering pipeline. For comparison, we also implemented a software version of the seismic simulation using single-precision floating point and measured its performance on the same machine.

Two models are used. One is a three-layer model (model1) and the other is similar to the Kelly model [14] (model2). The source is located at the top of the medium (Fig. 2) and emits a downward-propagating Ricker wavelet with a dominant frequency of approximately 50 Hz.

(a) Model1: Three-layer model

Layer   Vp (m/s)   Vs (m/s)   Qp    Qs    ρ (g/cm^3)
1       2600       1800       120   110   2.4
2       3050       2400       230   220   2.7
3       4000       3300       350   267   3.0

(b) Model2: Modified Kelly model

Region  Vp (m/s)   Vs (m/s)   Qp    Qs    ρ (g/cm^3)
1       1200       800        80    60    1.4
2       2440       1330       100   90    2.2
3       3210       2389       219   190   2.3
4       3500       2173       200   170   2.5
5       3225       1800       294   220   2.5
6       3822       2980       490   340   2.6
7       4785       3800       450   400   3.4

Fig. 2. Two test models; both have the same size of 2500 m * 2500 m.

The computation runs 2000 steps on both the GPU and the CPU. The results for the two models are plotted in Fig. 3.
The time of the viscoelastic wave simulation is measured as a function of the grid size. For the performance comparison, the GPU-based method does not transfer the velocity-stress field or the parameters between main memory and graphics memory. The time spent on advecting and rendering the regions is negligible relative to the simulation. The run time of the staggered-grid FD method depends only on the grid size and is independent of the complexity of the geological structure, as shown for both the GPU-based (Fig. 3a) and CPU-based (Fig. 3b) implementations. The performance advantage of the GPU-based implementation grows significantly as the number of grid points increases. The speedup factor, defined as the ratio of CPU time to GPU time for the same grid size, increases from about 2 (128^2) to 50 (2048^2). That is mainly due to the advantages of GPU's parallel
Second International Symposium on N
ISBN 978-952-5726-09-1 (Print) Proc