15.01.2013 Views

U. Glaeser

U. Glaeser

U. Glaeser

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Lower Bounds on I/O<br />

The most trivial batched problem is that of scanning (a.k.a. streaming or touching) a file of N data items,<br />

which can be done in a linear number O(N/DB) = O(n/D) of I/Os. Permuting is one of several simple<br />

problems that can be done in linear CPU time in the (internal memory) RAM model, but require a<br />

nonlinear number of I/Os in PDM because of the locality constraints imposed by the block parameter B.<br />

The proof of the permutation lower bound (32.4) of Theorem 32.6.2 is due to Aggarwal and Vitter<br />

[15]. A stronger lower bound is obtained from a more refined argument that counts input operations<br />

separately from output operations [111]. For the typical case in which B log m = ω(log N), the I/O lower<br />

bound, up to lower order terms, is 2n log mn. For the pathological, in which B log m = o(log N), the I/O<br />

lower bound, up to lower order terms, is N/D. Permuting is a special case of sorting, and hence, the<br />

permuting lower bound applies also to sorting. In the unlikely case that B log m = o(log n), the permuting<br />

bound is only Ω(N/D), and the comparison model must be used to get the full lower bound (32.3) of<br />

Theorem 6.1 [15]. The reader is referred to [199] for further discussion and references on lower bounds<br />

for sorting and related problems.<br />

32.7 Matrix and Grid Computations<br />

Dense matrices are generally represented in memory in row-major or column-major order. Matrix<br />

transposition, which is the special case of sorting that involves conversion of a matrix from one representation<br />

to the other, was discussed in the subsection on “Permuting and Transposition.” For certain<br />

operations such as matrix addition, both representations work well; however, for standard matrix multiplication<br />

(using only semiring operations) and LU decomposition, a better representation is to block<br />

the matrix into square B × B submatrices, which gives the upper bound of the following theorem:<br />

Theorem 32.7.1 [108,177,202,212] The number of I/Os required for standard matrix multiplication of<br />

two K × K matrices or to compute the LU factorization of a K × K matrix is Θ(K 3 /min{K, M}DB).<br />

Hong and Kung [108] and Nodine et al. [155] give optimal EM algorithms for iterative grid computations,<br />

and Leiserson et al. [134] reduce the number of I/Os of naive multigrid implementations by a<br />

Θ(M 1/5 ) factor. Gupta et al. [102] show how to derive efficient EM algorithms automatically for computations<br />

expressed in tensor form.<br />

If a K × K matrix A is sparse, that is, if the number N z of nonzero elements in A is much smaller than<br />

K 2 , then it may be more efficient to store only the nonzero elements. Each nonzero element A i,j is represented<br />

by the triple (i, j, A i,j). Unlike the dense case, in which transposition can be easier than sorting (e.g.,<br />

see Theorem 32.6.3 when B 2 ≤ M), transposition of sparse matrices is as hard as sorting.<br />

Theorem 32.7.2 For a matrix stored in sparse format and containing N z nonzero elements, the number of<br />

I/Os required to convert the matrix from row-major order to column-major order, and vice-versa, is Θ(Sort(N z)).<br />

The lower bound follows by reduction from sorting. If the ith item in the input of the sorting instance<br />

has key value x ≠ 0, there is a nonzero element in matrix position (i, x).<br />

For further discussion of numerical EM algorithms, the reader is referred to the survey by Toledo<br />

[190]. Some issues regarding programming environments are covered in [59] and Section 32.14.<br />

32.8 Batched Problems in Computational Geometry<br />

Problems involving massive amounts of geometric data are ubiquitous in spatial databases [131,172,<br />

173], geographic information systems (GIS) [131,172,194], constraint logic programming [119,120],<br />

object-oriented databases [215], statistics, virtual reality systems, and computer graphics [90]. NASA’s<br />

Earth Observing System project, the core part of the Earth Science Enterprise (formerly Mission to Planet<br />

Earth), produces petabytes (10 15 bytes) of raster data per year [76]. Microsoft’s TerraServer online database<br />

© 2002 by CRC Press LLC

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!