Signal Processing and Coding for TDMR Channels - IDEMA

1 Front matter

1.a. Date

April 29, 2011

1.b. Abstract

Signal Processing and Coding for TDMR Channels

Industry is approaching the limit of the data storage density possible by reading and writing single tracks on conventional magnetic disks. The proposed project considers an alternate approach called two dimensional magnetic recording (TDMR), wherein bits are read and written in two dimensions on conventional magnetic disks. These disks have magnetic grains of different sizes packed randomly onto the disk surface. A key problem in TDMR is that a given magnetic grain retains the polarization of the last bit written on it; hence, if a grain is large enough to contain two bit centers, the older bit will be overwritten by the newer one. Bits are read from the disk by a 2D read head. Because bits are stored at such high density, the signal read from a given bit will suffer 2D intersymbol interference (2D ISI) from adjacent bits in both down and cross track directions. As the result of a recent NSF-funded research project, the Co-PIs have developed iterative 2D ISI equalization algorithms that are among the best performing reported in the open literature. Many of the techniques developed at WSU for the 2D-ISI channel can also be used for the TDMR detection problem. To enable TDMR to become viable, advanced 2D signal processing and coding techniques must be developed to combat the combination of grain overwriting and 2D ISI.

This proposal’s objective is to investigate signal processing and coding techniques for TDMR channels that can approach the recently estimated channel capacity of about 0.5 bit/grain. Specific objectives include:

1. Investigate detectors for TDMR magnetic grain channels based on three channel models of successively increasing complexity: the rectangular-grain, discrete-grain Voronoi, and Voronoi models.
2. Integrate TDMR detection, 2D ISI equalization and channel coding in an overall turbo-detection architecture.
3. Evaluate our developed algorithms using statistically generated TDMR data and experimental TDMR data from member companies of the ASTC.

A version of this proposal has recently been submitted to NSF as a GOALI proposal with industry partner Hitachi; the GOALI requests support for two Ph.D. students for three years. The requested ASTC funding would support one additional Ph.D. student for three years. The ASTC student would work on the interface and interaction between channel coding, TDMR detection, and 2D ISI equalization, while the two GOALI students would work on TDMR detection and 2D ISI equalization. The ASTC student would investigate issues such as: (a) interleaver design to spread commonly encountered error patterns out over 2D space to facilitate error correction; (b) symbol mapping to allow the channel code to easily detect and correct high probability error patterns generated by the 2D ISI equalizer, and the interaction between the symbol mapper and the channel decoder; (c) introduction of intentional correlation to the channel coded bit stream in order to improve the performance of the overall system; and (d) channel capacity computation for the combined magnetic grain and 2D ISI channel models.

1.c. Proponents and affiliations

Dr. Benjamin Belzer, Principal Investigator, and Dr. Krishnamoorthy Sivakumar, Co-Principal Investigator, Washington State University School of Electrical Engineering and Computer Science.

1.d. Designated contact person

1. Technical Point of Contact: Dr. Benjamin Belzer, Washington State University School of EECS, P.O. Box 642752, Pullman, WA 99164-2752. Phone: 509-335-4970. Email: belzer@eecs.wsu.edu.
2. Contractual Point of Contact: Dan Nordquist, Washington State University Office of Grant and Research Development, 423 Neill Hall, PO Box 643140, Pullman WA 99164-3140. Phone: 335-7717. Email: nordquist@wsu.edu.

2 Subject of research and relevance to issues to be solved

2.a. Description of the research matter and its connection with ASTC stated goals

The proposed project addresses the ASTC TDMR topic number 3: joint 2D practical soft detection and coding design for TDMR. Fig. 1, from [1], shows an overview of a proposed TDMR system that could potentially achieve 10 Terabits/in², which is about 10 times the maximum density limit of current single-track magnetic disk storage systems. A standard magnetic disk has magnetic grains of different sizes packed randomly onto the disk surface. In the proposed TDMR system, information bits are channel coded to a density of two bits per magnetic grain, and written by a special shingled write process that enables high density recording. A key problem with this process is that a given magnetic grain retains the polarization of the last bit written on it; hence, if a grain is large enough to contain two bit centers, the older bit will be overwritten by the newer one if they have different polarities. Bits are read from the disk by a 2D read head capable of reading multiple bits at a time in both down- and cross-track directions. Because the bits are stored at such high density, the signal read from a given bit will suffer interference from adjacent bits in both down and cross track directions; this is called 2D intersymbol interference (2D ISI). To recover the original data bits, advanced 2D signal processing and coding techniques must be developed to combat the combination of grain overwriting and 2D ISI.

Figure 1: Overview of TDMR (from [1]; used by permission).

Proposal Objectives: The objective of this proposal is to investigate signal processing and coding techniques for TDMR channels that can approach the recently estimated channel capacity of about 0.5 bit/grain. Specific technical objectives include:

1. Investigate detectors for TDMR magnetic grain channels based on three channel models of successively increasing complexity: the rectangular-grain, discrete-grain Voronoi, and Voronoi models.
2. Integrate TDMR detection, 2D ISI equalization and channel coding in an overall turbo-detection architecture.
3. Evaluate our developed algorithms using statistically generated TDMR data and experimental TDMR data from member companies of the ASTC.

2.a.i. Related work by other groups

Previous systems-level work in TDMR has mainly considered channel modeling and channel capacity. There has been relatively little work on joint detection and decoding algorithms for TDMR; although there have been a number of proposed approaches, very few have been pursued to the point where actual bit error rate (BER) performance can be quantified. (One exception is the work by Pan and Ryan et al. in [2]; this paper is discussed further in subsection 2.b.i. below.) Even less work has been done on signal processing and coding to combat the combined effects of magnetic grain overwrite and 2D ISI.

Among the simplest TDMR models that capture the 2D nature of the TDMR channel is the rectangular discrete-grain model (DGM). As shown in Fig. 2 (a), the rectangular DGM consists of four distinct grain types consisting of unions of the smallest grain type; relative to the smallest type their sizes are 1 × 1, 2 × 1, 1 × 2 and 2 × 2. The four grain types occur with probabilities $P_1, \ldots, P_4$. Typically it is assumed that there is one coded bit per 1 × 1 grain. The rectangular DGM was introduced by Kavcic et al. in [3], where capacity upper and lower bounds for this model were derived showing a potential density of 0.6 bits per grain, translating to 12 Terabits/in² at typical media grain densities of 20 Teragrains/in². This paper also showed that the rectangular DGM could easily be generalized into a model that included 2D ISI, and could also be generalized into a model with rectangular grains but non-integer grain boundaries; non-integer grain boundaries are a simple attempt to model the random alignment of actual magnetic grains.

A more general model, called the Voronoi-based discrete-grain model (VDGM) in [4], was introduced in [5]. In the VDGM, shown in Fig. 2 (b), the grains (shown gray-shaded in the figure) are constructed of arbitrary unions of discrete tiles (“tiny squares” in the figure), and can therefore assume a much wider variety of shapes and sizes than the rectangular DGM.

A still more general model, shown in Fig. 2 (c), is called the Voronoi model.
Here the grain centers are constructed by adding a (usually fairly small) random offset to a regular grid [6], and the grains are simply the Voronoi regions about the grain centers. To our knowledge, TDMR detection and coding schemes have not been developed for either the VDGM or Voronoi models to the point where BER performance can be quantified, although several detection schemes are proposed in [4].

Figure 2: Magnetic grain models considered in this proposal: (a) rectangular discrete grain model; (b) Voronoi discrete grain model; (c) Voronoi model.

2.a.ii. The 2D ISI Problem

Consider the detection of an M × N binary equiprobable two dimensional (2D) independent and identically distributed (i.i.d.) image f with elements f(k, l) ∈ {−1, 1} from received image r with elements (pixels)

$$r(m, n) = \sum_{k} \sum_{l} h(m - k, n - l) \, f(k, l) + w(m, n). \tag{1}$$

In (1), h(k, l) are elements of a finite impulse response 2D mask h, and the w(m, n) are zero mean i.i.d. Gaussian random variables (r.v.s). The model in (1) applies, e.g., to 2D data storage systems, which suffer from 2D intersymbol interference (ISI) at high storage densities. Such systems are under active development for next generation optical disk storage (e.g., [7]) and holographic data storage (e.g., [8]), and have been proposed for two-dimensional magnetic storage [1].

Direct maximum likelihood (ML) detection of f from r requires comparison of r with $2^{MN}$ candidate images, and is therefore impractical for typical image dimensions of M, N ≥ 64. For one dimensional signals, the Viterbi algorithm (VA) provides an efficient method for ML detection of ISI-corrupted data [9], but the VA does not generalize to two or higher dimensions. Nonetheless, asymptotically tight union bounds on the performance of 2D ML detection have been developed in [10]; these bounds are useful in assessing the performance of 2D detection algorithms at high SNR.

A number of 2D decision-feedback VAs (DFVAs) have been constructed, based on row-by-row raster scanning of the image (e.g. [11–14]). To our knowledge, [13, 14] employed the first iterative algorithm for 2D ISI reduction; the DFVA was run on rows and columns, and bits agreeing in both directions were fixed for subsequent iterations. Subsequent work employed the turbo principle (after turbo coding [15]). In [16], the 2D convolution is decomposed into two 1D computations, and an iterative algorithm exchanges soft information between 1D soft-in soft-out (SISO) detectors. In [17], the binary source image is coded with a low-density parity check (LDPC) error correcting code before transmission over a separable 2D-ISI channel. Separability is exploited to construct an iterative row-column algorithm in which a non-binary column SISO detector is followed by a binary row SISO detector, followed by an iteration of the LDPC decoder, etc. The LDPC’s coding gain enables [17] to approach within less than 1 dB the bit error rate (BER) curve for the non-ISI channel. In [18], soft information is exchanged between maximum a posteriori (MAP) row and column detectors; this scheme avoids decision feedback by making decisions on multiple rows/columns, rather than one row/column at a time.
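As an illustration of the channel model in (1) and of why direct ML detection scales as $2^{MN}$, the following minimal sketch simulates a tiny 2D ISI channel and detects the image by exhaustive search. The function names, the causal mask support (indices starting at 0), the same-size output, the −1 boundary pixels, and the example mask values are all assumptions made for this example; they are not taken from the cited algorithms.

```python
import random
from itertools import product


def isi_output(f, h, boundary=-1.0):
    """Noiseless part of (1): sum_k sum_l h(m-k, n-l) f(k, l).

    Assumptions of this sketch: h has causal support h[a][b] with a, b >= 0,
    the output has the same size as f, and pixels outside f equal `boundary`.
    """
    M, N = len(f), len(f[0])

    def pixel(k, l):
        return f[k][l] if 0 <= k < M and 0 <= l < N else boundary

    return [[sum(h[a][b] * pixel(m - a, n - b)
                 for a in range(len(h)) for b in range(len(h[0])))
             for n in range(N)] for m in range(M)]


def add_awgn(x, sigma, rng):
    """Add zero-mean i.i.d. Gaussian noise w(m, n) with std. deviation sigma."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in x]


def ml_detect(r, h, M, N):
    """Exhaustive ML detection: test all 2**(M*N) candidate images."""
    best, best_dist, tested = None, float("inf"), 0
    for bits in product((-1, 1), repeat=M * N):
        cand = [list(bits[m * N:(m + 1) * N]) for m in range(M)]
        y = isi_output(cand, h)
        # For AWGN, the ML image minimizes the squared Euclidean distance.
        dist = sum((r[m][n] - y[m][n]) ** 2
                   for m in range(M) for n in range(N))
        tested += 1
        if dist < best_dist:
            best, best_dist = cand, dist
    return best, tested


if __name__ == "__main__":
    rng = random.Random(0)
    h = [[1.0, 0.5], [0.5, 0.25]]      # hypothetical 2 x 2 mask
    f = [[1, -1, 1], [-1, -1, 1]]      # 2 x 3 source image
    r = add_awgn(isi_output(f, h), sigma=0.1, rng=rng)
    f_hat, tested = ml_detect(r, h, 2, 3)
    print(f_hat, tested)
```

Even this 2 × 3 image requires 64 candidate comparisons; the count doubles with every added pixel, which is what motivates the reduced-complexity detectors surveyed in this subsection.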
An iterative row-column soft decision feedback (SDF) algorithm (IRCSDFA) similar to that of [18], which outperforms the algorithms of [14], [16], and [18], was also recently developed [19].

Approximations of generalized belief propagation (GBP) are developed for the 2D-ISI and related problems in [20]. The GBP-based 2D-ISI equalizer uses exact inference over the sub-region of the image covered by the ISI mask, and passes messages between adjacent sub-regions. The GBP equalizer achieves ML performance for the cases tested in [20], but has been demonstrated only on small (20 × 20 or smaller) images and with 2 × 2 ISI masks whose coefficients have relatively low magnitude compared to the main tap h(0, 0); such cases are easily handled because the nearby boundary conditions greatly aid the estimation, and because the ISI is relatively mild compared to masks with a flatter coefficient energy distribution.

A brief description of the trellis construction used in the IRCSDFA is presented next (see [19] for details).

2.a.iii. Trellis definition

Two-dimensional convolution can be viewed as the inner product of image f(m, n) with inverted mask h(−k, −l), with mask coefficient h(0, 0) at pixel position (m, n). The inverted mask raster-scans through the image row-by-row or column-by-column. For the row-by-row case we define the IRCSDFA trellis states and inputs as in Figure 3. Trellis generation for the 3 × 3 mask on the mth image row is initiated by placing the input marked (m, n) in Figure 3 at the left end of the row, where the initial values of the six state pixels are −1 due to the boundary conditions, and the vector of three input pixels can take eight different values. The entire state/input block is then shifted right to pick up the next three input pixels, and the previous three input pixels become the middle three state pixels. The trellises for each row are terminated at the right end of the row by extra shifts into the boundary pixels.
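The shift structure just described can be checked with a short enumeration. The sketch below is only an illustration under an assumed encoding (a state is two columns of three pixels, nearest column first); the function name and data layout are hypothetical and are not taken from [19]. It reproduces the 64-state trellis with eight branches entering and leaving each state and no parallel branches.

```python
from itertools import product


def build_row_trellis(rows=3, state_cols=2):
    """Enumerate states, inputs, and branches of the row-scan trellis.

    A state is a tuple of `state_cols` pixel columns (nearest column first);
    each column holds `rows` pixels in {-1, +1}. This encoding is an
    assumption of the sketch, chosen to match the six state pixels and
    three input pixels described for the 3 x 3 mask.
    """
    columns = list(product((-1, 1), repeat=rows))        # 8 possible columns
    states = list(product(columns, repeat=state_cols))   # 8**2 = 64 states
    branches = {}
    for state in states:
        for u in columns:                                # input column
            # Shift right: the input becomes the nearest state column and
            # the oldest state column is dropped.
            branches[(state, u)] = (u,) + state[:-1]
    return states, columns, branches


if __name__ == "__main__":
    states, inputs, branches = build_row_trellis()
    initial = ((-1, -1, -1), (-1, -1, -1))               # boundary pixels
    print(len(states), len(inputs), len(branches), initial in states)
```

Because the next state is fully determined by the current state and the input column, each pair of connected states is joined by exactly one branch.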
For the 3 × 3 mask, the 64-state trellis has eight branches entering and leaving each state, with no parallel branches. At each position (m, n), the trellis branch output vector consists of three 3 × 3 inner products between the inverted mask and the pixel values defined by the trellis; the upper inner product uses two feedback rows, the middle uses one feedback row, and the lower uses received pixels only. The branch metric is the squared Euclidean distance between the branch output and the received pixel vector [r(m, n), r(m + 1, n), r(m + 2, n)]. The column-by-column case is similar to the row-by-row case. As the image pixels are i.i.d., the above-described trellis constructions impose the Markov condition that, given the current trellis state, subsequent states and branch outputs are independent of past states or outputs. This Markov condition allows the use of a modified BCJR [21] algorithm for detection.

Figure 3: Trellis definition.

Subsequently, we developed an iterative soft-decision feedback zigzag algorithm (ISDFZA) [22]. In contrast to the IRCSDFA, which is based on row or column scanning, the ISDFZA employs a zig-zag scan to construct a space-varying trellis spanning the entire image; the longer trellis provides improved performance at low SNR. The IRCSDF detector was then concatenated with the ISDFZA, with soft information exchanged between the two detectors in an iterative fashion. The concatenated system [22] provides further improvement, with overall performance close to the corresponding maximum likelihood (ML) bound. Extensive comparisons in [22] show that the concatenated ISDFZA-IRCSDFA provides superior or competitive performance to other recently proposed 2D-ISI equalization algorithms that constituted the state of the art at the time of publication of [22]. Specifically, in [22] direct comparisons are made to the previously published algorithms of [16], [17], [20], and also to our own previous IRCSDFA of [19] (which had previously been shown to have better performance than [18]). For this reason our more recent results, presented next in subsection 2.a.iv., are compared only to the best results reported in [22], as we believe that [22] established a new state of the art for equalization of 2 × 2 or 3 × 3 2D ISI channels at the time it was published.

2.a.iv. Recent work by the PIs

In traditional iterative row-column soft decision feedback (IRCSDF) algorithms [14, 16, 19] for a two-dimensional intersymbol interference channel with additive white Gaussian noise, the pixel estimates used as extrinsic information exchanged between detectors were assumed to be statistically independent. Although this assumption simplifies the overall algorithm, it has no theoretical basis. Indeed, the dependence between extrinsic information estimates of pixels was verified experimentally [23]. Motivated by this, the PIs have redesigned the traditional IRCSDFA [19] by dropping the independence assumption on the extrinsic information exchanged between the row and column decoders. In the joint version of the IRCSDF, referred to as the block (BLK) algorithm, we estimate and exchange joint statistics for the pixels involved in the extrinsic information exchange. To address the increased computational and storage complexity introduced by the joint statistics, we have also developed a simplified version of the block (SBLK) algorithm. Experimental results demonstrate that the SBLK algorithm performs almost as well as the BLK algorithm. In the following, we present a brief description of the BLK algorithm; see [23] for more details.

Block Algorithm

Following the notation and derivation of the IRCSDFA (assuming independence of pixels in extrinsic information) from [19], [22], we present only the most relevant equations from [19] that are modified in the BLK algorithm. Figure 4 parts (a) (2 × 2 mask) and (g) (3 × 3 mask) show the state (labeled s), input (labeled i), and feedback (labeled ω) pixels for the row decoder; parts (d) and (j) show similar information for the column decoder.
The main modification is in the computation of γ in the traditional BCJR algorithm [21]. Computation of γ in the traditional BCJR algorithm [19] (see equations (2) and (3) below) requires the a priori probability of the input block (two pixels for the 2 × 2 mask and three pixels for the 3 × 3 mask) and the a priori probability of the feedback block (two pixels for the 2 × 2 mask and six pixels for the 3 × 3 mask), from the extrinsic information obtained from the previous decoder in the overall iterative scheme. In the sequel, we present these equations for the 3 × 3 mask; obvious changes lead to expressions for the 2 × 2 case.

$$\gamma_i(y_k, s', s) = p(y_k \mid U = i, S_k = s, S_{k-1} = s') \times P(U = i \mid s, s') \times P(s \mid s'), \tag{2}$$

where

$$p(y_k \mid U = i, S_k = s, S_{k-1} = s') = P(y_{k2} \mid i_0, i_1, i_2, s, s') \times \sum_{\Omega_2} \Big[ P(\Omega_2) \, P(y_{k1} \mid i_0, i_1, s, s', \Omega_2) \times \sum_{\Omega_1} P(\Omega_1) \, P(y_{k0} \mid i_0, s, s', \Omega_1, \Omega_2) \Big]. \tag{3}$$

Here k represents the trellis stage, $y_k = (y_{k0}, y_{k1}, y_{k2})$ is the received vector, $i = (i_0, i_1, i_2)$ is the input vector, $s$ and $s'$ are the current and previous states, and $\Omega_1$, $\Omega_2$ represent the two rows of feedback pixels. In [19], [22], for simplicity, we assumed that the pixels in the input/feedback block were statistically independent. Consequently, marginal probabilities (from the extrinsic information) were multiplied to obtain the required joint probabilities. In other words, we set (for the input pixel block):

$$P(s \mid s') = P(U = i) = P(i_0) \times P(i_1) \times P(i_2) \tag{4}$$

and for the feedback pixel block:

$$P(\Omega_1) = P(\omega_0) \times P(\omega_1) \times P(\omega_2) \quad \text{and} \quad P(\Omega_2) = P(\omega_3) \times P(\omega_4) \times P(\omega_5). \tag{5}$$

The independence assumption in (4) and (5) is, strictly speaking, not valid in practice; this was verified based on the joint probabilities obtained in our block algorithm. The key idea is to replace equations (3)–(5) with equations (6)–(7) shown below, respectively.

$$p(y_k \mid U = i, S_k = s, S_{k-1} = s') = P(y_{k2} \mid i_0, i_1, i_2, s, s') \times \sum_{\Omega_1, \Omega_2} P(\Omega_1, \Omega_2) \, P(y_{k1} \mid i_0, i_1, s, s', \Omega_2) \, P(y_{k0} \mid i_0, s, s', \Omega_1, \Omega_2), \tag{6}$$

$$P(s \mid s') = P(U = i) = P(i_0, i_1, i_2), \quad \text{and} \quad P(\Omega_1, \Omega_2) = P(\omega_0, \omega_1, \omega_2, \omega_3, \omega_4, \omega_5). \tag{7}$$

We now address the problem of obtaining the joint probabilities in equation (7). In the BCJR algorithm we compute $\alpha_k(s) = P(S_k = s, y_1^k)$, $\beta_k(s) = P(y_{k+1}^{N_r} \mid S_k = s)$ and $\gamma_i(y_k, s', s) = P(U = i, S_k = s, y_k \mid S_{k-1} = s')$ by a forward-backward procedure, where $N_r$ denotes the number of stages in the trellis. We then compute $\lambda_k^i(s) = P(U = i, S_k = s, y_1^{N_r}) = \sum_{s'} \alpha_{k-1}(s') \, \beta_k(s) \, \gamma_i(y_k, s', s)$, which gives the unnormalized probability $P(U = i)$. To estimate the pixel located at (m, n) from the vector of λs, we marginalize the λs over the other two input pixels $i_1$ and $i_2$:

$$\lambda_k^{i_0}(s) = \sum_{i_1, i_2} \lambda_k^i(s). \tag{8}$$

The corresponding log likelihood ratio (LLR) is given by (in the binary case, there is one LLR per pixel, the “other LLR” being 0):

$$L(k) = \log \frac{\sum_s \lambda_k^{i_0 = +1}(s)}{\sum_s \lambda_k^{i_0 = -1}(s)}. \tag{9}$$

In our block (BLK) algorithm, we compute the joint probability of the block of pixels that constitute the state and input (a block of four pixels for the 2 × 2 mask and a block of nine pixels for the 3 × 3 mask) as shown in Figure 4 (a) and (g). We first compute the same α and β; γ is computed based on (2), (6)–(7). For computing the 16- or 512-valued LLRs $L_k(\mathbf{i}, \mathbf{s})$, we do not sum the $\lambda_k^i(s)$ probabilities over the inputs $i_1$, $i_2$ and the state $s$ as in (8) and (9). Instead of computing two LLRs (one of them being equal to zero) for each pixel, for each pixel we now have $2^4 = 16$ LLRs for the 2 × 2 mask, and $2^9 = 512$ LLRs for the 3 × 3 mask, as follows:

$$L_k(\mathbf{i}, \mathbf{s}) = \log \frac{\lambda_k^{\mathbf{i}}(\mathbf{s})}{\lambda_k^{\mathbf{i} = \mathbf{i}_0}(\mathbf{s} = \mathbf{s}_0)}. \tag{10}$$
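The difference between the per-pixel LLR of (8)–(9) and the joint LLRs of (10) can be made concrete with a toy table of λ values. The sketch below is illustrative only: the dictionary layout (keys are pairs of an input tuple and a state tuple), the function names, and the numerical values are assumptions for this example; the forward-backward recursions that would produce the λs are not shown.

```python
import math
from itertools import product


def pixel_llr(lam):
    """Per-pixel LLR as in (8)-(9): sum lambda over i_1, i_2 and all states."""
    num = sum(v for (i, s), v in lam.items() if i[0] == +1)
    den = sum(v for (i, s), v in lam.items() if i[0] == -1)
    return math.log(num / den)


def joint_llrs(lam):
    """Joint LLRs as in (10), referenced to the all -1 input and state."""
    any_i, any_s = next(iter(lam))
    ref = lam[((-1,) * len(any_i), (-1,) * len(any_s))]
    return {key: math.log(v / ref) for key, v in lam.items()}


if __name__ == "__main__":
    # Hypothetical unnormalized lambda values for the 2 x 2 mask:
    # two input pixels and two state pixels give 2**4 = 16 entries.
    pairs = list(product((-1, 1), repeat=2))
    lam = {(i, s): 1.0 + idx
           for idx, (i, s) in enumerate(product(pairs, pairs))}
    print(pixel_llr(lam))
    print(len(joint_llrs(lam)))
```

The per-pixel routine collapses the table to a single number, whereas the joint routine keeps one value per block configuration, which is the source of the extra storage in the BLK algorithm.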

In (10), for the 2 × 2 mask, $\mathbf{i} = (i_0, i_1)$, $\mathbf{i}_0 = (-1, -1)$, $\mathbf{s} = (s_0, s_1)$, and $\mathbf{s}_0 = (-1, -1)$. For the 3 × 3 case, $\mathbf{i} = (i_0, i_1, i_2)$, $\mathbf{i}_0 = (-1, -1, -1)$, $\mathbf{s} = (s_0, s_1, s_2, s_3, s_4, s_5)$, and $\mathbf{s}_0 = (-1, -1, -1, -1, -1, -1)$.

Before passing this information to the next decoder (in an iterative scheme), the input extrinsic LLR is subtracted from the LLR computed in (10). It is customary to also multiply the extrinsic LLR by a weight factor less than one, since the extrinsic information is not reliable, at least in the initial iterations. Designing an optimal weight schedule (as a function of iteration number) has been discussed in [22], under the independence assumption. Finally, note that the joint probability of the four/nine-pixel block can be suitably marginalized to obtain the joint input probability and joint feedback probability required in equation (7). This marginalization is shown in Figure 4 (b, e) and (h, k).

Simplified Block Algorithm

The BLK algorithm requires the exchange of a 16-valued (512-valued) LLR for each pixel for the 2 × 2 (3 × 3) mask. This involves significantly more storage and computation compared to our original IRCSDFA, where there was a single LLR for each pixel. A simplified version of the block algorithm, which we call the SBLK algorithm, has been developed [23] to address this problem.

The key idea is to store and exchange LLRs only for the joint pairs that we need in the next decoder. This is illustrated in Figure 4 parts (c, f) and (i, l). For the 2 × 2 mask, we only need to store two pixel pairs, which is $2^2 \times 2 = 8$ LLRs, half the 16 LLRs required in the BLK algorithm. For the 3 × 3 mask, we only need $2^6 + 2^3 = 72$ LLRs, almost one seventh of the 512 LLRs required in the BLK algorithm. Since we do not need subtraction of the input LLR, the performance of the simplified algorithm is almost as good as that of the block algorithm.
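The storage figures quoted above, and the marginalization of a joint block probability, can be reproduced with a few lines. This is a minimal sketch: the function names are hypothetical, the pixel indices passed to `marginalize` are placeholders (the actual pixel groupings are those of Figure 4), and the routine operates on probabilities rather than on LLRs.

```python
from itertools import product


def blk_llr_count(block_pixels):
    """Number of joint LLRs per pixel when a whole block is kept (BLK)."""
    return 2 ** block_pixels


def sblk_llr_count(group_sizes):
    """Number of LLRs per pixel when only selected pixel groups are kept (SBLK)."""
    return sum(2 ** g for g in group_sizes)


def marginalize(joint, keep):
    """Sum a joint pmf over all block pixels except those at indices `keep`."""
    out = {}
    for block, p in joint.items():
        key = tuple(block[j] for j in keep)
        out[key] = out.get(key, 0.0) + p
    return out


if __name__ == "__main__":
    print(blk_llr_count(4), sblk_llr_count([2, 2]))   # 2 x 2 mask
    print(blk_llr_count(9), sblk_llr_count([6, 3]))   # 3 x 3 mask
    # Hypothetical uniform joint pmf over a four-pixel block; keep pixels 0 and 1.
    joint = {b: 1.0 / 16 for b in product((-1, 1), repeat=4)}
    print(marginalize(joint, keep=(0, 1)))
```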
Figure 4: Structure definition of the BLK and SBLK algorithms for the 2 × 2 mask ((a) through (f)) and the 3 × 3 mask ((g) through (l)): (a), (g) state and input pixels definition of row decoder; (b), (h) joint block from the row decoder and the marginalization applied in the column decoder for the BLK algorithm; (c), (i) joint blocks from the row decoder and their roles in the column decoder for the SBLK algorithm; (d), (j) state and input pixels definition of column decoder; (e), (k) joint block from the column decoder and the marginalization applied in the row decoder for the BLK algorithm; (f), (l) joint blocks from the column decoder and their roles in the row decoder for the SBLK algorithm. (In (b), (e), (h), and (k), ‘X’ indicates the pixels which are marginalized out.)

Soft-Decision Feedback Zigzag Algorithm with Joint Extrinsic Information Exchange

The idea of estimating and exchanging joint extrinsic information can be applied to other types of detectors as well. Towards that end, the PIs have recently developed an iterative soft-decision feedback zigzag algorithm using joint extrinsic information, which is based on a zigzag scan of the pixels [22], instead of a row-by-row or column-by-column scan of the pixels. The key ideas and motivation are similar to those of the BLK and SBLK algorithms. For brevity, we omit the detailed equations, which are presented in our conference paper [24]. We refer to this new algorithm as the block zigzag (BLKZ) algorithm.

Experimental Results

We now present Monte Carlo simulation results for the BLK, SBLK, BLKZ and joint concatenated algorithms, and compare their performance with the previous ISDFZA and concatenated algorithms in [22] (with the independence assumption) and also the ML bound. All simulations employ a random 64 × 64 binary image f(m, n) with pixel values chosen from the alphabet {−1, +1} and a 3 × 3 averaging mask h (where all mask coefficients are 1/9). The plots presented below show the bit-error-rate (BER) of the estimated binary input image, versus signal-to-noise ratio (SNR). The SNR is defined as in [14]:

$$\mathrm{SNR} = 10 \log_{10} \frac{\mathrm{Var}[f * h]}{\sigma_w^2}, \tag{11}$$

where ∗ denotes 2D convolution between the image f and the mask h and $\sigma_w^2$ is the variance of the additive Gaussian noise. To compute the received image r(m, n), we assume a boundary of −1 pixels around f(m, n); the receiver uses this known boundary condition to simplify the trellis near edge pixels.

Figure 5 shows that both the BLK and SBLK algorithms provide almost 1.2 dB gain over the original row-column algorithm (IRCSDF); their performance is only 0.3 dB away from the ML bound. A constant weight was used (0.3 for both BLK and SBLK) with ten iterations in each case. Figure 5 also shows that the BLKZ algorithm gains more than 1 dB SNR over the independent ISDFZA at BER $10^{-3}$, and has a steeper slope. The BLKZ algorithm presented in Fig. 5 applies a constant weight of 0.01 (over all iterations) to the extrinsic LLRs, and uses a total of eight iterations. This is in contrast to the original ISDFZA of [22], which requires an iteration-dependent weight schedule optimized with EXIT charts.
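For readers who want to reproduce the simulation setup, the SNR definition in (11) can be inverted to obtain the noise standard deviation for a target SNR. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the averaging mask is applied with causal alignment and a −1 boundary, and Var[f ∗ h] is estimated as the population variance of the noiseless output; the original simulations may differ in these details.

```python
import math
import random
import statistics


def blur(f, size=3, boundary=-1.0):
    """f * h for a size x size averaging mask (all coefficients 1/size**2)."""
    M, N = len(f), len(f[0])

    def pixel(k, l):
        return f[k][l] if 0 <= k < M and 0 <= l < N else boundary

    w = 1.0 / size ** 2
    return [[w * sum(pixel(m - a, n - b)
                     for a in range(size) for b in range(size))
             for n in range(N)] for m in range(M)]


def noise_sigma(y, target_snr_db):
    """Solve (11) for sigma_w, using the population variance of y = f * h."""
    var = statistics.pvariance([v for row in y for v in row])
    return math.sqrt(var / 10 ** (target_snr_db / 10))


def snr_in_db(y, sigma):
    """Evaluate (11) for a given noiseless output y and noise std. deviation."""
    var = statistics.pvariance([v for row in y for v in row])
    return 10 * math.log10(var / sigma ** 2)


if __name__ == "__main__":
    rng = random.Random(1)
    f = [[rng.choice((-1, 1)) for _ in range(64)] for _ in range(64)]
    y = blur(f)
    sigma = noise_sigma(y, target_snr_db=12.0)
    r = [[v + rng.gauss(0.0, sigma) for v in row] for row in y]
    print(sigma, snr_in_db(y, sigma))
```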
Figure 5 also shows the performance of the serial concatenation of the BLKZ algorithm with the SBLK algorithm [23]; the joint concatenated system’s details are omitted due to space limitations. Most significantly, the 3 × 3 joint concatenated system outperforms the independent concatenated system of [22] by about 1 dB at BER $10^{-4}$, outperforms the joint IRCSDFA of [23] by about 0.3 dB at BER $2 \times 10^{-5}$, and performs within 0.2 dB of the ML bound at BER $10^{-5}$; to our knowledge this is the best published performance to date for equalization of the 3 × 3 averaging mask channel.

Figure 5: Monte Carlo simulation results for the 3 × 3 averaging mask. (Plot of BER versus SNR in dB, from 8 to 16 dB. Curves: Indep. IRCSDFA; ML bound; Joint IRCSDFA BLK; Joint IRCSDFA SBLK; Indep. ISDFZA; Indep. concatenated system; Joint ISDFZA BLKZ; Joint Concatenated System.)

2.b. Proposed research approaches

This subsection describes new work to be done under the proposed project. First, the overall effort proposed in the NSF GOALI proposal is described. Then, the ASTC supported research is explained within the context of the GOALI project, as it both depends upon and extends the scope of the GOALI project. The ASTC research plan also includes a contingency plan in case the GOALI proposal is declined.

Figure 6 provides an overview of the TDMR channel model. Information bits are channel encoded and interleaved. The coded bits π(u(i, j)) are then written onto the magnetic grains of the recording medium, causing overwrite errors as previously described in subsection 2.a.i. The combination of writing and reading the bits at high density also gives rise to 2D ISI, which we model as 2D convolution followed by AWGN.

Figure 7 is an example signal processing and decoding architecture for processing the signal r(i, j) from the magnetic disk. This iterative equalizer and decoder has a double loop structure. The signal first passes into a 2D ISI equalization block, which provides an LLR estimate $L_e[y(i, j)]$ of the bits y(i, j) output from the magnetic grain model (MGM); this estimate is then passed as extrinsic information to an MGM estimator, which in turn passes its own estimates of the y(i, j) back to the 2D ISI equalizer as a priori information. The MGM estimator passes LLR estimates of the channel coded bits u(i, j) to the channel decoder, which in turn passes its own estimates of the u(i, j) back to the MGM estimator to help refine the MGM estimates. The use of soft information in the form of LLRs, which are estimated by modified sum-product algorithms (SPAs, [25]) running in the estimation blocks, is a key feature of the proposed signal processing and coding framework. Both the MGM estimator and the 2D ISI equalizer accept timing and position error information from the joint timing and position estimation module to be developed under TDMR research topic 5, and use that information to correct their estimates. We emphasize that the architecture shown in Fig. 7 is only one of several configurations we propose to investigate; in particular, it may prove advantageous to combine the functions of one or more of the detection blocks shown in this figure and run the sum product algorithm on the relevant combined factor graphs.

Figure 6: Transmitter block diagram for TDMR.

Figure 7: Receiver block diagram for TDMR.

2.b.i. Magnetic grain model detectors

We propose to investigate three magnetic grain model detectors of successively increasing complexity and modeling accuracy: a discrete-grain rectangular boundary detector, a discrete-grain Voronoi model-based detector, and a Voronoi-model based detector. This approach is taken for two reasons: (1) progressing from relatively simpler to more complex models allows the later investigations into more complicated model detectors to benefit from insights learned from the simpler models; and (2) our goal is to identify the tradeoffs between algorithmic complexity and performance, which will allow a designer to select the best design given the processing resources available. The accuracy of the models will be assessed by comparison with actual TDMR signals provided by Hitachi or other ASTC member companies.

Two-row discrete grain rectangular boundary BCJR algorithm

We propose a two-row BCJR-type detector that assumes the discrete grain rectangular model shown in Fig. 2 (a). By looking at two rows of the input signal simultaneously, it should be possible to substantially improve performance (in terms of lower BER for a given code rate) over previously proposed algorithms that look at only one row at a time (e.g., [2]). In our previous work on row-column scanned 2D-ISI detectors, we have observed substantial performance gains by considering multiple input rows at a time [19].

The state-input block for the two-row DGM algorithm is shown in Fig. 8. This state-input block scans through a given 2D data block row-by-row in raster order, corresponding to the scan order of typical shingled writing heads proposed for TDMR [26]. The input bits at the kth trellis stage are $(u_{k0}, u_{k1})$, and the model outputs are $(y_{k0}, y_{k1})$. The bit marked 'X' is feedback from the previously detected row; the probabilities associated with the 'X' pixel are used to modify the state transition probabilities in a soft-decision feedback scheme somewhat similar to our 2D ISI row-column algorithm described in [19].

Figure 8: The state and feedback definition for the two-row rectangular discrete-grain model detector.

In the BCJR algorithm, the first and most important step is to compute the gamma probability:

$$\gamma_i(y_k, s', s) = p(y_k \mid U = i, S_k = s, S_{k-1} = s') \cdot P(U = i \mid s, s') \cdot P(s \mid s'). \tag{12}$$

The restrictions on connectivity between current and next states that result from the grain geometries are shown in Table 1. In (12), the state transition probabilities $P(s \mid s')$ can be computed from Table 1 and the grain probabilities shown in Fig. 2 (a).
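As a sanity check on the connectivity restrictions of Table 1, the sketch below enumerates the admissible two-row states and the geometrically reachable next states. The dictionaries transcribe Table 1; the function names and the reachability rule (the next state must satisfy the horizontal restriction for each row and the vertical restriction within the new column) are our reading of the table for illustration purposes, and feedback probabilities are ignored.

```python
FREE_NEXT_ROW = "ABDEFH"   # labels allowed at (m+1, n) when s_0 forces nothing
FREE_NEXT_COL = "ABCDFG"   # labels allowed at (m, n+1) when s_0 forces nothing

# Table 1, second column: label at (m, n) -> allowed labels at (m+1, n).
NEXT_ROW = {"A": FREE_NEXT_ROW, "B": "C", "C": FREE_NEXT_ROW,
            "D": FREE_NEXT_ROW, "E": FREE_NEXT_ROW, "F": "G",
            "G": FREE_NEXT_ROW, "H": "I", "I": FREE_NEXT_ROW}

# Table 1, third column: label at (m, n) -> allowed labels at (m, n+1).
NEXT_COL = {"A": FREE_NEXT_COL, "B": FREE_NEXT_COL, "C": FREE_NEXT_COL,
            "D": "E", "E": FREE_NEXT_COL, "F": "H",
            "G": "I", "H": FREE_NEXT_COL, "I": FREE_NEXT_COL}


def two_row_states():
    """All label pairs (s_0, s_1) consistent with the vertical restrictions."""
    return [s0 + s1 for s0 in NEXT_ROW for s1 in NEXT_ROW[s0]]


def reachable(current):
    """Next states allowed by geometry alone, given the current two-row state."""
    c0, c1 = current
    return [n0 + n1
            for n0 in NEXT_COL[c0]
            for n1 in NEXT_ROW[n0]
            if n1 in NEXT_COL[c1]]


if __name__ == "__main__":
    print(len(two_row_states()))
    print(reachable("AA"))
```

Under this reading, the enumeration yields the 39 trellis states used for the transition table, and the next states reachable from 'AA' coincide with the nonzero entries of Table 2.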
The $P(s \mid s')$ probabilities can be stored in a 39 × 39 table, since we have 39 possible states in one trellis stage based on Table 1. The table's rows are the current state $s'$ and the columns are the next state $s$. Since the $P(s \mid s')$ table is very sparse (i.e., more than half of the elements in this table are zero), we show only one typical row in Table 2. In Table 2, $P(\bar{B}, \bar{F})$, $P(\bar{B})$, $P(\bar{F})$ specify probabilities of the feedback pixel 'X' in Fig. 8, where '$\bar{B}$' means 'not equal to B'; these three feedback probabilities are computed from the LLRs from the detection of previous rows. From this table, we could compute the $P(s \mid s')$ in the gamma probability in (12), and the other two probabilities could also be computed from the connectivity restriction in Table 1, and the a-priori probability received from previous scans. We note that since only one column in the past is considered, the 39 grain states should be sufficient to describe the trellis, with the inputs $u_k$ and outputs $y_k$ described via branch transition probabilities.

Table 1: Connectivity restrictions between $s_0$ at location (m,n) and $s_1$ at (m+1,n), and between $s_0$ at (m,n) and $s_0$ at (m,n+1).

s_0: (m,n)    s_1: (m+1,n)         s_0: (m,n+1)
A             A, B, D, E, F, H     A, B, C, D, F, G
B             C                    A, B, C, D, F, G
C             A, B, D, E, F, H     A, B, C, D, F, G
D             A, B, D, E, F, H     E
E             A, B, D, E, F, H     A, B, C, D, F, G
F             G                    H
G             A, B, D, E, F, H     I
H             I                    A, B, C, D, F, G
I             A, B, D, E, F, H     A, B, C, D, F, G

Table 2: Conditional probabilities of next states given current state is 'AA'.

s     P(s | s' = AA)                              s     P(s | s' = AA)
AA    $P_1 \cdot P_1 \cdot P(\bar{B},\bar{F})$    AB    $P_1 \cdot P_2 \cdot P(\bar{B},\bar{F})$
AD    $P_1 \cdot P_3 \cdot P(\bar{B},\bar{F})$    AE    0
AF    $P_1 \cdot P_4 \cdot P(\bar{B},\bar{F})$    AH    0
BC    $P_2 \cdot P(\bar{B},\bar{F})$              CA    $P_1 \cdot P(B)$
CB    $P_2 \cdot P(B)$                            CD    $P_3 \cdot P(B)$
CE    0                                           CF    $P_4 \cdot P(B)$
CH    0                                           DA    $P_1 \cdot P_3 \cdot P(\bar{B},\bar{F})$
DB    $P_2 \cdot P_3 \cdot P(\bar{B},\bar{F})$    DD    $P_3^2 \cdot P(\bar{B},\bar{F})$
DE    0                                           DF    $P_4 \cdot P_3 \cdot P(\bar{B},\bar{F})$
DH    0                                           EA    0
EB    0                                           ED    0
EE    0                                           EF    0
EH    0                                           FG    $P_4 \cdot P(\bar{B},\bar{F})$
GA    $P_1 \cdot P(F)$                            GB    $P_2 \cdot P(F)$
GD    $P_3 \cdot P(F)$                            GE    0
GF    $P_4 \cdot P(F)$                            GH    0
HI    0                                           IA    0
IB    0                                           ID    0
IE    0                                           IF    0
IH    0

To benchmark the performance of our DGM detector, we will combine it with a high performance channel code, e.g., an irregular repeat accumulate (IRA) code, and compare the maximum code rate at which a low BER can be achieved to the upper and lower capacity bounds for the 4-grain channel derived in [3]. Such a comparison in [2] showed that the one-row DGM algorithm in [2] achieved code rates up to 65% of the average of the upper and lower capacity bounds; this result demonstrates that there is still substantial room for improvement even for this relatively simple discrete rectangular grain model.

We also plan to investigate three generalizations to the above-described algorithm. First, we will explore running two DGM detectors iteratively along rows (down-track) and columns (cross-track) and exchanging extrinsic information similar to our 2D ISI row-column approach. As TDMR blocks are expected to be significantly longer in the down-track direction than in the cross-track direction, we will not realize as much gain from the cross-track detector as from the down-track detector, but our prior experience still suggests that worthwhile gains should occur. An additional advantage is that the feedback probabilities for the 'X' pixel shown in Fig. 8 can come from the other scanning direction, rather than from previous rows (or columns) of a given scan direction; experience with our 2D ISI algorithms shows that this is also advantageous.

Second, we will generalize to a rectangular DGM with non-integer block boundaries as described in [3]. In the simplest non-integer model, there are nine grain types of size m × n, where m ∈ {1, 3/2, 2} and n ∈ {1, 3/2, 2}. The equations for the DGM and for the 2D ISI convolution are modified by appropriate upsampling and downsampling, respectively [3].
Using non-integer boundaries will increase the complexity of the proposed BCJR algorithm, but will give a more accurate modeling of actual magnetic grains.

Third, we will look into joint estimation of state and input grain and bit states, similar to the joint estimation used in the 2D ISI algorithms described in subsection 2.a.iv. above. As in the 2D ISI algorithms, we expect that joint estimation will give significant performance gains.

Voronoi discrete grain model-based algorithms

To allow more general grain shapes and sizes, we propose a probabilistic Voronoi discrete grain model (PVDGM). In this model, square bits are subdivided into small tiles ("tiny squares" in Fig. 2(b)), and grains are generated probabilistically as unions of tiles according to the four connection probabilities $P_0, \ldots, P_3$ shown in Fig. 9. The connection probabilities can be adjusted to model different grain size distributions and anisotropies; also, absolute limits on grain size in down- and cross-track directions can be specified.

Figure 9: Connectivity configuration of the probabilistic Voronoi discrete grain model; connection probabilities $P_0, \ldots, P_3$ can be adjusted to model different grain size distributions and anisotropies.

An example of grains generated by the PVDGM is shown in Fig. 10(b), where tiles in the same grain share the same grain number; here a grain size restriction of 8 tiles in each dimension was applied. In this example, we assume that each bit is subdivided into 4×4 tiles, and that the original bits alternate by row and column in the pattern [0 1 0; 1 0 1; 0 1 0] shown in part (a) of the figure. We simulate the write operation by raster scanning tile-wise the original bit pattern of part (a) over the grain tile pattern of part (b), such that the last tile written in a given grain determines the state of the entire grain. The resulting binary pattern is seen in part (c) of the figure. In part (c), the state of a given bit is determined by the state of the majority of its tiles, so that the bits would be read as [0 1 1; 1 1 1; 0 1 0]; the uneven grain distribution has caused two bit errors. We note that the write model can easily be changed to the more realistic case where grains take on the values that are written in the exact center of the bit; in this case some of the smaller grains (e.g., grain 26) would not be written at all. Also, the PVDGM has the ability to accommodate multiple grains per bit, which is consistent with current state-of-the-art, and facilitates comparison with experimental data from prototype TDMR systems.

Figure 10: Voronoi discrete grain model with grain size restriction of 8 tiny squares in each dimension, where the original pattern has alternate bits on each row and column (i.e. [0 1 0; 1 0 1; 0 1 0]): (a) the initial binary image; (b) the grain model; (c) the binary image after writing to the grain model.

For detection with the PVDGM, we propose a BCJR-like algorithm that raster scans over tiles; trellis states at the position of a given tile would correspond to the possible connectivities of that tile and its neighbors in its causal past. The detector would find grain patterns that are most consistent with the bits read from the channel (or passed in from the 2D ISI detector), and with the a priori information about the data bits received from either the 2D ISI detector or the channel decoder.
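The write and read-back model used in the Fig. 10 example (tile-wise raster write in which the last tile written sets the whole grain, followed by a per-bit majority vote) can be sketched as follows. This is a minimal illustration, not the simulator used for the figure: the function name `write_then_read`, the nested-list grain map, and the tie rule (a tie reads as 0) are assumptions made for the example, and in practice the grain map would come from the PVDGM generator.

```python
def write_then_read(bits, grain_id, tiles_per_bit):
    """Write `bits` onto a tiled grain map, then read each bit by majority vote.

    bits:      2D list of 0/1 values, one entry per bit cell.
    grain_id:  2D list of grain numbers, one entry per tile.
    """
    t = tiles_per_bit
    n_rows, n_cols = len(bits), len(bits[0])

    # Raster-scan write: every tile pushes its bit value into its grain, so the
    # last tile written in a grain determines the state of the entire grain.
    grain_state = {}
    for r in range(n_rows * t):
        for c in range(n_cols * t):
            grain_state[grain_id[r][c]] = bits[r // t][c // t]

    # Read-back: a bit takes the value held by the majority of its tiles.
    # Assumption: a tie is read as 0 (the source does not specify a tie rule).
    read = []
    for br in range(n_rows):
        row = []
        for bc in range(n_cols):
            ones = sum(grain_state[grain_id[r][c]]
                       for r in range(br * t, (br + 1) * t)
                       for c in range(bc * t, (bc + 1) * t))
            row.append(1 if 2 * ones > t * t else 0)
        read.append(row)
    return read


# Two bits side by side, 2x2 tiles each; grain 0 straddles the bit boundary
# and covers three of the four tiles of the first bit.
example_bits = [[0, 1]]
example_grains = [[0, 0, 0, 2],
                  [1, 0, 0, 2]]
print(write_then_read(example_bits, example_grains, tiles_per_bit=2))  # [[1, 1]]
```

In this toy case the first bit is overwritten because the grain it mostly sits on is written last by the second bit, which is the same overwrite mechanism that produces the two bit errors in Fig. 10(c).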
In estimating the original data bits, the detector would take into account the estimated grain pattern and associated grain-overwrite effects when computing a posteriori probabilities (APPs). Trellis state complexity could be controlled by limiting the extent of the tiles considered to be part of the causal past of the current tile; as in the 2D ISI detectors, we expect that including larger regions in the causal past would lead to better performance.

Initial experiments with the PVDGM would be conducted without 2D ISI in order to obtain quantitative comparisons with the rectangular block DGM detector described previously. In addition, the relatively simple two-dimensional Markov structure of the PVDGM and its finite-state nature suggest that its channel capacity could be estimated using the generalized Blahut-Arimoto algorithm of Vontobel et al. [27], similar to the way that a capacity lower bound for the rectangular grain DGM was estimated in [3]. We propose to attempt to derive bounds on the capacity of the PVDGM channel, in order to facilitate benchmarking of detection algorithms for the PVDGM.

Voronoi-model algorithms

To generate a more general Voronoi model that is still finite state, we propose to restrict the grain centers to a finite number of locations on a fine grid about each bit center. The displacements of grain centers from bit centers would be drawn randomly according to a 2D probability mass function (PMF) peaked about the origin. However, large enough displacements would be allowed so that grains could span more than one bit.

This is important for accurate simulation of grain size distributions in magnetic media. It is also important because when R. Wood et al. (in follow-up work to [28]) looked into establishing TDMR timing and position from the edges of well-defined data patterns on the disk, they found that the jitter did not reduce asymptotically to zero when averaging over an increasing number or length of the edges. This is because the edges are still correlated with the underlying grid and this long-range order becomes apparent when looking over very long spans. We believe this effect can be reduced by allowing larger random displacements of grain centers from bit centers. An example finite-state Voronoi model with a density of one bit per grain appears in Fig. 11. In this figure grain centers (solid dots) are allowed to range within a 7 × 7 square grid about the bit centers (asterisks). Note that a grain center can end up outside its associated bit's boundary; this allows grains to enclose more than one bit center, as for example grain 2 encloses bits 2 and 5.

We propose to investigate two types of detection algorithms based on the above finite-state Voronoi model. First, a BCJR algorithm would raster scan over the bits and their associated grain centers. The grain states associated with a particular trellis stage at the current grain would be the collection of possible center positions of adjacent grains in the causal past of the current grain. The Voronoi boundaries between the current grain and its causal past are easily inferred from a given grain state, as seen in Fig. 11. It is sufficient to consider only the causal past to account for grain overwrite effects. Since any given bit has four adjacent bits in its causal past (the bit to its left and three bits in the previous row), the number of states for the Voronoi model in Fig. 11 would be $49^4 = 5{,}764{,}801$.
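The state count follows from raising the number of grid positions per grain center to the number of causal-past neighbors. A short sketch (the function name `voronoi_state_count` and its parameters are illustrative only) makes it easy to see how quickly the trellis grows with the grid size:

```python
def voronoi_state_count(grid_side=7, causal_neighbors=4):
    """Unpruned number of trellis states: one grid position per causal neighbor."""
    positions_per_center = grid_side * grid_side
    return positions_per_center ** causal_neighbors


print(voronoi_state_count(7, 4))  # 5764801, the 7 x 7 grid of Fig. 11
print(voronoi_state_count(5, 4))  # 390625
print(voronoi_state_count(3, 4))  # 6561
```

These are upper bounds before any pruning of inadmissible center configurations.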
Some states would be eliminated because we would not allow the model to generate a grain center for the current bit in the causal past of any previously generated grain centers; for example, the grain center 5 in Fig. 11 would not be allowed to be both to the left of and above grain center 2.

If the grain displacement PMF is non-uniform (e.g., a sampled Gaussian), then another method to reduce the number of states would be to use a non-uniform grain-center grid with equiprobable bins, which would still allow large displacements. The BCJR algorithm would estimate APPs for the bits read from the disk based on the most likely grain center positions inferred from the channel data and the a priori data passed into it, and would account for grain overwrite effects.

Figure 11: Voronoi boundaries of the causal past of bit 5. The grain centers (solid dots) are chosen randomly within a 7×7 square grid about the bit centers (asterisks). Note that a grain center can end up outside its associated bit's boundary, thereby allowing grains to enclose more than one bit center; e.g., grain 2 encloses bits 2 and 5.

The second detection algorithm for the finite-state Voronoi model would be based on generalized-belief-propagation (GBP) [20, 29]. In this detector, a given local block of grain centers would pass information about their locations among each other, and thereby establish the most likely grain boundaries consistent with the channel and a priori information passed into the detector. Messages would also be passed between local blocks along their borders. The size of the local blocks would be limited by the property that Voronoi boundaries can be established by knowing only the position of immediately adjacent grain centers. It is possible that a GBP-based algorithm could perform better than the BCJR-based algorithm, or at least offer a greater range of trade-offs between complexity and performance.
A probabilistic graphical model-based detector for the Voronoi model was previously proposed in [4] and some details of the relevant sum-product algorithm were provided; however, it was also pointed out that the proposed algorithm suffered from loops in its factor graph and that GBP could be used to eliminate the loops.
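To illustrate the equiprobable-bin grid suggested above for a non-uniform displacement PMF, the sketch below computes bin edges along one axis. It rests on assumptions that are ours, not the proposal's: the displacement along each axis is modeled as a continuous zero-mean Gaussian with standard deviation `sigma` (rather than a sampled PMF), the two axes are treated independently, and the function names are hypothetical.

```python
from statistics import NormalDist


def equiprobable_edges(num_bins, sigma=1.0):
    """Interior bin edges that split a zero-mean Gaussian into equal-mass bins."""
    dist = NormalDist(mu=0.0, sigma=sigma)
    return [dist.inv_cdf(k / num_bins) for k in range(1, num_bins)]


def bin_probabilities(edges, sigma=1.0):
    """Probability mass of each bin, including the two unbounded outer bins."""
    dist = NormalDist(mu=0.0, sigma=sigma)
    cdf = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


edges = equiprobable_edges(7, sigma=2.0)
print([round(e, 3) for e in edges])
print([round(p, 4) for p in bin_probabilities(edges, sigma=2.0)])
```

The outer bins are unbounded, so large displacements remain representable even though the number of bins per axis stays fixed; narrow bins cluster near the bit center where the density is highest.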

2.b.ii. Integrated TDMR detectors

We propose to study the integration of the magnetic grain model detectors described in the previous subsection with 2D ISI detection and channel decoding. Initially, a serial decoding architecture with separate detectors for TDMR and 2D ISI as in Fig. 7 will be considered. Routing of extrinsic information between detectors, and scheduling of outer and inner iterations between and within the detectors, will be optimized via use of EXIT charts [30]; the Co-PIs have experience in optimizing multidimensional parameter spaces for iterative detection systems with EXIT charts and have published novel schemes for doing so in [22].

The advantages of joint estimation of bits and grain states in the TDMR models, and exchange of joint LLRs between joint TDMR estimators and joint 2D ISI estimators (such as those proposed by the Co-PIs in [23, 24]) will also be investigated. Of particular importance will be novel schemes for subtracting the effects of input joint extrinsic information from the output extrinsic information for 2D TDMR detectors and for interfacing the TDMR detectors to the 2D ISI detectors in such a way as to reduce loops in the factor graph of the overall detector; these issues are more complicated for 2D joint detectors than they are for standard 1D turbo-equalizers. In [24], the Co-PIs and their student Yiming Chen introduce two-dimensional extrinsic information flow (TEXIF) charts to visualize extrinsic information flow in 2D ISI detectors that employ joint detection; these charts allow derivation of novel LLR subtraction and interfacing schemes.

We also propose to investigate combining magnetic grain model detection and 2D ISI detection by running the sum-product algorithm on the combined factor graph of the two models.
In previous work on turbo-equalization or combined channel estimation and decoding where two different Markov models were employed, the approach of combining the factor graphs for the two models has sometimes proved advantageous.

We will also explore reduced complexity versions of the above algorithms. In particular, we will leverage the results of our conference paper [31], where we showed that a 95% complexity reduction in the 2D ISI IRCSDFA equalizer could be achieved with almost no performance penalty by sorting the soft-decision feedback configuration probabilities (e.g., the probabilities of the two rows marked $\Omega_1$ and $\Omega_2$ in Fig. 3) and retaining only the top few most probable configurations. In addition to greatly reducing the complexity of the 2D ISI equalizer in Fig. 7, it is possible that variations of this technique could also be used to reduce the complexity of the various proposed magnetic grain model detectors, especially in cases where marginalization must be done over a large number of soft-decision feedback or a priori configurations.

2.b.iii. ASTC-sponsored research: integrating channel coding with turbo-TDMR detection and equalization

We propose to have the ASTC-supported student work with the Co-PIs on the interface and interaction between channel coding, TDMR detection, and 2D ISI equalization.

For channel codes, we propose to use existing high performance soft-input/soft-output codes such as irregular repeat-accumulate (IRA) codes [33] or serially-concatenated convolutional codes (SCCCs) [32]. SCCCs are known to have better performance at high SNRs than parallel concatenated convolutional codes (PCCCs), and hence are more appropriate for applications such as magnetic disk storage that require very low BERs. It is known that IRA codes (which are a sub-class of low-density parity-check (LDPC) codes that admit fast encoding) outperform SCCCs and PCCCs at low and moderate SNRs; furthermore, IRA codes have much lower decoding complexity than SCCCs.
Unfortunately, IRA codes are subject to error floors at high SNR. However, recent papers (e.g., the extended IRA (e-IRA) codes of Yang and Ryan [34]) have demonstrated how to lower the error floor of IRA codes. In a recent M.S. thesis by a student of the Co-PIs [35], a serial concatenation scheme for IRA codes is shown to significantly reduce their error floors; this scheme has the apparent feature that it should lower the error floor of any component LDPC code. Hence, we propose to try the coding scheme of [35] with the e-IRA codes of [34], in order to arrive at a low complexity coding scheme with good performance at high SNRs.

The interaction between the channel code, the equalizer/grain-estimator and the interleaver will also be investigated. Initially, the focus will be on a simple interleaver design that maps bits that are closely spaced before interleaving to bits with high 2D Euclidean distance after interleaving, so that correlated errors generated by the combined magnetic grain and 2D ISI channel are spread widely apart by the de-interleaver before channel decoding, making them easier to correct. A starting point for this interleaver design will be the well-known 1D S-interleaver, which spreads bits apart by at least S positions.

Correlated channels such as the combined magnetic grain and 2D ISI channel usually have certain high probability error patterns that limit the performance of the entire equalizer/detector. We will investigate methods to analytically find these error patterns, and to design the code and interleaver so that they are easily corrected. One way to do this is to use a symbol mapper (essentially a specialized interleaver) between the encoder and the magnetic grain model, such that the high probability error patterns become easily correctable after de-mapping.

For ISI channels, it is well known that pre-coding can reduce the bit error rate and simplify the receiver architecture. In recent work on the capacity of magnetic grain channel models by Kavcic, there has been some suggestion that intentional introduction of correlation into the channel codewords could help achieve channel capacity; essentially this is a combination pre-coder and channel encoder. We will investigate techniques for introducing this correlation using simple rate-1 finite state machines so that the resulting correlated codewords remain linear and also retain their original code rate.

We also propose to investigate methods to compute tight bounds for the channel capacity of the combined magnetic grain and 2D ISI channels. These bounds would be of great use in benchmarking the proposed detection algorithms.
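For reference, the 1D S-interleaver named above as the starting point for the interleaver design can be sketched with a greedy randomized construction. This is one common variant, written under our own assumptions: positions closer than `s` in the output order must carry input indices that differ by at least `s`, the search restarts when it gets stuck, and the names `s_random_interleaver` and `is_s_spread` are hypothetical. It does not address the 2D Euclidean spreading that the proposed design targets.

```python
import random


def s_random_interleaver(n, s, seed=0, max_restarts=1000):
    """Greedy search for a permutation of range(n) with spread at least s."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        pool = list(range(n))
        rng.shuffle(pool)
        perm = []
        while pool:
            # Only the last s-1 placed indices constrain the next choice.
            recent = perm[max(0, len(perm) - (s - 1)):]
            for idx, cand in enumerate(pool):
                if all(abs(cand - prev) >= s for prev in recent):
                    perm.append(pool.pop(idx))
                    break
            else:
                break  # no remaining candidate fits: abandon this attempt
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-interleaver found; try a smaller s or more restarts")


def is_s_spread(perm, s):
    """Check that entries fewer than s positions apart differ by at least s."""
    return all(abs(perm[i] - perm[j]) >= s
               for i in range(len(perm))
               for j in range(i + 1, min(i + s, len(perm))))


perm = s_random_interleaver(64, 4)
print(is_s_spread(perm, 4))  # True
```

The greedy search usually succeeds quickly when `s` is below roughly the square root of `n/2`, a commonly quoted rule of thumb; larger spreads need more restarts or a different construction.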
Channel capacity lower and upper bounds have been developed by Kavcic for the discrete grain rectangular model [3], and lower and upper bounds on the capacity of the finite state 2D ISI channel with AWGN have been developed by Chen and Siegel in [36]. We will use these prior bounding techniques as a starting point to develop bounding techniques on the capacity of the combined channel. If progress is made on the capacity of the discrete-grain Voronoi channel, then we will also investigate methods to compute the capacity of the combined discrete-grain Voronoi and 2D ISI channels.

In the event that the NSF GOALI proposal is declined, then work will be redirected to the discrete rectangular grain model with the goal of creating an entire turbo-detection-equalization-and-decoding architecture such as that shown in Fig. 7, starting with the two-row BCJR algorithm outlined in subsection 2.b.i. above. Preliminary results from this work will then be used to strengthen future re-submissions of the GOALI proposal.

2.b.iv. Evaluation plan

The developed TDMR detection and decoding algorithms will be evaluated in several ways. First, each detector will be evaluated with simulated TDMR data generated from the same magnetic grain model that the algorithm uses. The performance of the system will be compared to the channel capacity of the combined magnetic grain and 2D ISI channel, if known. But to test the robustness of the algorithms, they will also be evaluated with data generated using a truly random grain model with the same areal density.
Finally, the detector/decoders will also be evaluated against actual TDMR data supplied by Hitachi, or other ASTC member companies, in order to refine the channel models and their associated detection algorithms. (Hitachi's letter of support for the GOALI proposal mentions that they may provide TDMR data for this purpose.) This will also help the algorithms to correctly model and take into account noise due to bit position and timing inaccuracies, which have been previously studied in a number of papers (e.g., [28]).

2.c. Likely outcomes of research

The most likely outcomes of this research are (i) complete turbo-detection-equalization-decoding algorithms for the discrete-grain rectangular and Voronoi discrete-grain models; and (ii) progress towards a turbo-detection-equalization-decoding algorithm for the general Voronoi model. In the event that the GOALI is not funded, then only the discrete-grain rectangular turbo-detection-equalization-decoding algorithm would likely be completed. The simulation codes for these algorithms, implemented in C and C++, will be another important project outcome of value to the ASTC. In a broader sense, successful completion of the proposed project could enable TDMR to become commercially viable within the next few product generations. The signal processing and coding techniques developed in this project could potentially be included in future commercial hard disk drives employing 2D magnetic recording.

3 Resources required to perform project

3.a. Personnel

The co-PIs Belzer and Sivakumar will be the senior personnel working on this project. They will be responsible for the overall conduct of the project as well as for the day-to-day management and supervision of the graduate students working on this project. Three graduate students will be working on this project (20 hours/week each). They will be responsible for developing the algorithms and conducting related experiments to test their performance.

3.b. Laboratory and office space

Usual laboratory and office space will be required to house the PIs and graduate students, as well as the associated computing equipment. We do not anticipate a need for any specialized equipment. Two copies of the Matlab software package along with the Signal Processing toolbox will be required.

3.c. Computational

A cluster of computers will be required to run simulation experiments. In addition, a personal desktop computer for each of the project personnel and peripheral devices (e.g., printers) will be required.

4 Resources other than ASTC funding dedicated to perform project

4.a. Grants

The PIs have a pending proposal with the National Science Foundation (NSF) for work related to that proposed in this project. If awarded, this grant will provide salary support for the PIs as well as a stipend for two graduate students.

4.b. Contracts

None

4.c. Other

The PIs have an existing computing cluster with 17 nodes. One of the nodes has a quad-core processor, while the rest of them have dual-core processors.
This cluster will be used to run most of the simulation experiments for this project.

5 Resources requested from ASTC and how they will be utilized

5.a. Funding

We are requesting total funding in the amount of $182,000 from ASTC for the three year project period. The breakdown of the budget is given below. A detailed budget spreadsheet is included in the Appendix.

i. Overhead: Washington State University's current federally negotiated rate for overhead costs (also referred to as facilities and administrative costs) for on-campus research is 49.5%. This rate is applied to the total direct cost, less tuition for graduate students. This overhead amounts to $50,376 over 3 years.
ii. Direct project cost: The total direct cost, over three years, for this project is $131,624. This includes graduate student stipend, tuition assistance (paid as fringe benefits), equipment, travel, software licenses, and miscellaneous laboratory supplies.
iii. Facility use fees: None
iv. Equipment and software: One PC workstation for the student, and one additional cluster node PC to run simulations, at $1200 each. Two MATLAB licenses with Signal Processing Toolbox, at $900 each; MATLAB is used for rapid prototyping of algorithms before they are converted into C++, and also for plotting and data visualization. This equipment and software are purchased during the first year for use during the entire project period.
v. Materials: $2,250 over three years. This includes miscellaneous laboratory supplies (e.g., paper, printer toner, books, memory), and fees for computer services.
vi. Student stipends: $59,394 over three years, for one graduate student. In addition, fringe benefits, including tuition and health insurance, in the amount of $35,780 will be charged as direct costs.
vii. Travel: Travel to quarterly ASTC meetings has been budgeted at $1,500 each (includes airfare and three days of hotel accommodation and per diem). This adds up to $18,000 (12 meetings) over three years. In addition, travel to two technical conferences (one person each) per year has been budgeted at $2,000 each, for a total of $12,000 over three years.

5.b. Expected technical cooperation with sponsors

We expect that Hitachi or one of the other ASTC sponsors will provide access to measured data from TDMR equipment (e.g., noisy bits) along with the ground truth (the true value of the bits), which we will use to assess the performance of our algorithms.

5.c. Sponsors' facility utilization

The proposed algorithm development work will be performed in computing labs at WSU. No utilization of sponsor facilities is planned.

5.d. Expected students' internships

We believe that one summer internship for the ASTC-supported student over the course of the three-year project time period would be beneficial to both the student and the project.

6 Time line

As mentioned earlier in our research plan (see Section 2), this proposed project will be conducted in conjunction with a project proposal recently submitted to the National Science Foundation (NSF), with industry partner Hitachi. The specific tasks and associated time line for the ASTC project are outlined below:

• Year 1: Design interleaver between magnetic grain estimator and channel decoder to spread commonly encountered error patterns out over 2D space. These errors will subsequently be corrected by the channel code. Investigate serially concatenated IRA and e-IRA codes for low complexity

• Year 2: Design symbol mapping to allow the channel code to easily detect and correct high probability error patterns generated by the 2D ISI equalizer, and investigate the interaction between the symbol mapper and the channel decoder. Methods for computing upper and lower bounds on the capacity of the combined rectangular-grain and 2D ISI channel will also be investigated.
• Year 3: Investigate the introduction of intentional correlation to the channel coded bit stream in order to improve the performance of the overall system. Investigate methods for computing upper and lower bounds on the capacity of the combined discrete-Voronoi and 2D ISI channel.

7 Home institutions & resources

The PIs have available extensive research facilities including a 17-node 36-core Linux Beowulf cluster with dedicated cluster head server for high-performance computing and Monte-Carlo simulations, and various PCs, workstations, printers and scanners in a dedicated 1000 square foot laboratory with desk space for six graduate students. The PIs and their students are members of the Information Research Lab (IRL). IRL includes 5 Linux/WinXP PC workstations, SGI RAID-based video editing station, Abekas real-time video disk, two scanners, one B/W laser printer and one color inkjet printer. A full time computer systems support staff with part-time student help is available for computer system issues.

8 Contact information and biographical sketches

Biographical Sketch: B. Belzer

Contact Information: belzer@eecs.wsu.edu, 509-335-4970, 509-335-3818 (Fax).
Education: University of California at Los Angeles, Ph.D. in Electrical Engineering, 1996.
Current Appointment: Associate Professor, School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA.

Recent Publications Most Closely Related to the Proposed Project:
• T. Cheng, B. J. Belzer, and K. Sivakumar, “Row-column soft-decision feedback algorithm for two-dimensional intersymbol interference,” IEEE Signal Processing Letters, vol. 14, no. 7, pp. 433–436, July 2007.
• Y. Zhu, T. Cheng, K. Sivakumar, and B. J. Belzer, “Markov random field detection on two-dimensional intersymbol interference channels,” IEEE Transactions on Signal Processing, vol. 56, no. 7, pp. 2639–2648, July 2008.
• Y. Chen, P. Njeim, T. Cheng, B. J. Belzer, and K. Sivakumar, “Iterative soft decision feedback zig-zag equalizer for 2D intersymbol interference channels,” IEEE Journal on Selected Areas in Communications, special issue on “Data Communication Techniques for Storage Channels and Networks,” vol. 28, no. 2, pp. 167–180, February 2010.
• Y. Chen, B. J. Belzer, and K. Sivakumar, “Iterative row-column soft decision feedback algorithm using joint extrinsic information for two-dimensional intersymbol interference,” 44th Annual Conference on Information Sciences and Systems, Princeton, March 2010.
• Y. Chen, B. J. Belzer, and K. Sivakumar, “Iterative soft decision feedback zig-zag algorithm using joint extrinsic information for two-dimensional intersymbol interference,” 45th Annual Conference on Information Sciences and Systems, Baltimore, March 2011.

Biographical Sketch: K. Sivakumar

Contact Information: siva@eecs.wsu.edu, 509-335-4969, 509-335-3818 (Fax).
Education: The Johns Hopkins University, MSE and PhD in Electrical and Computer Engineering and MSE in Mathematical Sciences.

Current Appointment: Associate Professor, School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA.

Recent Publications Most Closely Related to the Proposed Project:
• T. Cheng, B. J. Belzer, and K. Sivakumar, “Row-column soft-decision feedback algorithm for two-dimensional intersymbol interference,” IEEE Signal Processing Letters, vol. 14, no. 7, pp. 433–436, July 2007.
• Y. Zhu, T. Cheng, K. Sivakumar, and B. J. Belzer, “Markov random field detection on two-dimensional intersymbol interference channels,” IEEE Transactions on Signal Processing, vol. 56, no. 7, pp. 2639–2648, July 2008.
• Y. Chen, P. Njeim, T. Cheng, B. J. Belzer, and K. Sivakumar, “Iterative soft decision feedback zig-zag equalizer for 2D intersymbol interference channels,” IEEE Journal on Selected Areas in Communications, special issue on “Data Communication Techniques for Storage Channels and Networks,” vol. 28, no. 2, pp. 167–180, February 2010.
• Y. Chen, B. J. Belzer, and K. Sivakumar, “Iterative row-column soft decision feedback algorithm using joint extrinsic information for two-dimensional intersymbol interference,” 44th Annual Conference on Information Sciences and Systems, Princeton, March 2010.
• Y. Chen, B. J. Belzer, and K. Sivakumar, “Iterative soft decision feedback zig-zag algorithm using joint extrinsic information for two-dimensional intersymbol interference,” 45th Annual Conference on Information Sciences and Systems, Baltimore, March 2011.

9 Appendix: detailed budget

IF ANY INTERNATIONAL COLLABORATION OR FOREIGN INVOLVEMENT IN THIS PROPOSAL IS EXPECTED, PLEASE REVIEW THE INFORMATION FOUND AT http://www.ogrd.wsu.edu/international.asp

PI's Name: Ben Belzer
Agency: Advanced Storage Technology Consortium
Budget periods: Year 1, 08/16/11 to 08/15/12; Year 2, 08/16/12 to 08/15/13; Year 3, 08/16/13 to 08/15/14; Cumulative, 08/16/11 to 08/15/14. The Year 4 and Year 5 columns of the form are blank and are omitted below.

| Item | Year 1 | Year 2 | Year 3 | Cumulative |
|---|---|---|---|---|
| SALARIES - 00 | | | | |
| Benjamin Belzer (pay rate 10,113.12; 0.00 mos.; 100.00% FTE): Salary | - | - | - | - |
| Benjamin Belzer: Benefits (20.0%) | - | - | - | - |
| Co-PI: K. Sivakumar (pay rate 10,135.08; 0.00 mos.; 100.00% FTE): Salary | - | - | - | - |
| Co-PI: K. Sivakumar: Benefits (20.0%) | - | - | - | - |
| Co-PI (three unnamed lines; 2.00 mos.; 100.00% FTE): Salary | - | - | - | - |
| Co-PI (three unnamed lines): Benefits (20.0%) | - | - | - | - |
| Post-Doc/Research Assoc: Salary | - | - | - | - |
| Post-Doc/Research Assoc: Benefits (29.4%) | - | - | - | - |
| Classified Staff: Salary | - | - | - | - |
| Classified Staff: Benefits (29.4%) | - | - | - | - |
| PhD Students (50.00% FTE; one student @ 12 months): Salary | 19,027 | 19,788 | 20,579 | 59,394 |
| PhD Students: QTR | 9,470 | 9,943 | 10,441 | 29,854 |
| PhD Students: Health | 1,613 | 1,678 | 1,745 | 5,035 |
| PhD Students: 1.5% | 285 | 297 | 309 | 891 |
| Master Student (50.00% FTE): Salary | - | - | - | - |
| Master Student: QTR | - | - | - | - |
| Master Student: Health | - | - | - | - |
| Master Student: 1.5% | - | - | - | - |
| WAGES - 01 | | | | |
| Student ($10.00 per hr.; 0 hrs/wk; 30 wks): Wages | - | - | - | - |
| Student: Benefits (2.1%) | - | - | - | - |
| Student: Wages | - | - | - | - |
| Student: Benefits (9.7%) | - | - | - | - |
| *Non-Student Temporary: Wages | - | - | - | - |
| *Non-Student Temporary: Benefits (9.7%) | - | - | - | - |
| **Non-Student Temporary: Wages | - | - | - | - |
| **Non-Student Temporary: Benefits (18.0%) | - | - | - | - |
| ***Non-Student Temporary: Wages | - | - | - | - |
| ***Non-Student Temporary: Benefits (60.6%) | - | - | - | - |
| Total Salary | 19,027 | 19,788 | 20,579 | 59,394 |
| Total Wages | - | - | - | - |
| Total Salary & Wages | 19,027 | 19,788 | 20,579 | 59,394 |
| BENEFITS - 07 | | | | |
| Total Benefits | 11,368 | 11,918 | 12,494 | 35,780 |
| Total Salaries/Wages/Benefits | 30,395 | 31,706 | 33,073 | 95,174 |
| CAPITAL EQUIPMENT - 06 ($5,000 +) | | | | |
| Total Capital Equipment | - | - | - | - |
| GOODS/SERVICES - 03 | | | | |
| Publication costs, incidental supplies | - | - | - | - |
| Two Matlab licenses with Signal Processing toolbox | 1,800 | | | 1,800 |
| One desktop PC for graduate student | 1,200 | | | 1,200 |
| One cluster node | 1,200 | | | 1,200 |
| Misc. lab supplies | 750 | 750 | 750 | 2,250 |
| Total Goods/Services | 4,950 | 750 | 750 | 6,450 |
| TRAVEL - 04 | | | | |
| Domestic | 10,000 | 10,000 | 10,000 | 30,000 |
| Foreign | - | | | |
| Total Travel | 10,000 | 10,000 | 10,000 | 30,000 |
| SUBCONTRACTS/RESTRICTED - 14 | | | | |
| Total Subcontracts/Restricted | - | - | - | - |
| COMMUNICATIONS - 11 | | | | |
| Total Phone Equip Rental/Line/Cell Charges | - | - | - | - |
| PERSONAL SERVICES CONTRACTS - 02 | | | | |
| Total Personal Services Contracts | - | - | - | - |
| COMPUTER SERVICES - 05 | | | | |
| Total Computer Services | - | - | - | - |
| STIPENDS/SUBSIDIES - 08 | | | | |
| Total Stipends/Subsidies | - | - | - | - |
| TOTAL DIRECT COSTS | 45,345 | 42,456 | 43,823 | 131,624 |
| EXCLUSIONS | | | | |
| QTR | 9,470 | 9,943 | 10,441 | 29,854 |
| Equipment (Over 5k) | - | - | - | - |
| Subcontracts (After Initial $25K For Each Subcontract). YR 2+ SUB EXCL ENTER BY HAND: 0, 0 | | | | |
| Other (Off-Site Rental & Stipends, Etc) | - | - | - | - |
| Base | 35,875 | 32,513 | 33,382 | 101,770 |
| TOTAL F&A - 13 (F&A Rate 49.50%; *see note at bottom of page) | 17,758 | 16,094 | 16,524 | 50,376 |
| TOTAL COSTS | 63,103 | 58,550 | 60,347 | 182,000 |

F&A Base Type: MTDC TD TC SWB Other X
Approved By: Joy Robbins
Date: 04/27/2011

Summary by category:

| Category/Object | Year 1 | Year 2 | Year 3 | Total |
|---|---|---|---|---|
| Salaries - 00 | 19,027 | 19,788 | 20,579 | 59,394 |
| Wages - 01 | - | - | - | - |
| Personal Service Contract - 02 | - | - | - | - |
| Goods/Services - 03 | 4,950 | 750 | 750 | 6,450 |
| Travel - 04 | 10,000 | 10,000 | 10,000 | 30,000 |
| Computer Services - 05 | - | - | - | - |
| Equipment - 06 | - | - | - | - |
| Benefits - 07 | 11,368 | 11,918 | 12,494 | 35,780 |
| Stipends/Subsidies - 08 | - | - | - | - |
| Phone - 11 | - | - | - | - |
| Subcontracts/Restricted - 14 | - | - | - | - |
| Total Direct Costs | 45,345 | 42,456 | 43,823 | 131,624 |
| F&A - 13 | 17,758 | 16,094 | 16,524 | 50,376 |
| Total Costs | 63,103 | 58,550 | 60,347 | 182,000 |

The Non-Student Temporary rate is shown with and without PERS and medical insurance. Non-Student Temporary Employees (NSTEs) become eligible for PERS if they work 70 or more hours per month in any five months of a 12-month period. NSTEs become eligible for medical insurance in the seventh month if they work 480 or more hours in a consecutive six-month period. They must work in the first month of the six-month period. (These WSU contributions are absorbed by the departments in subobjects HE, HF, HM and MD.)

*No PERS, No Health (less than 70 hrs a month): 9.70%
**PERS with No Health (more than 70 hours for 5 mths): 18.00%
***PERS, Health, Med (6 consecutive mths PT work): 60.60%

Please see Guideline 2 for further information.

Approved by 4/28/2011
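The F&A figures in the budget can be re-derived from the direct costs. The Python sketch below is only a cross-check and is not part of the original spreadsheet. It assumes that the F&A base equals total direct costs minus the QTR exclusion (which is what the Base row implies) and that amounts are rounded to the nearest dollar; the function and variable names are illustrative.

```python
# Cross-check of the F&A arithmetic in the budget table.
# Assumption: base = total direct costs - QTR exclusion, rounded to whole dollars.
F_AND_A_RATE = 0.495  # 49.50% rate stated in the budget

# Figures copied from the TOTAL DIRECT COSTS and EXCLUSIONS (QTR) rows.
direct_costs = {"Year 1": 45_345, "Year 2": 42_456, "Year 3": 43_823}
qtr_exclusion = {"Year 1": 9_470, "Year 2": 9_943, "Year 3": 10_441}


def f_and_a_breakdown(direct, excluded, rate=F_AND_A_RATE):
    """Return (base, F&A, total cost) for one budget year."""
    base = direct - excluded
    f_and_a = round(base * rate)
    return base, f_and_a, direct + f_and_a


rows = {
    year: f_and_a_breakdown(direct_costs[year], qtr_exclusion[year])
    for year in direct_costs
}
grand_total = sum(total for _, _, total in rows.values())

for year, (base, f_and_a, total) in rows.items():
    print(f"{year}: base={base:,}  F&A={f_and_a:,}  total={total:,}")
print(f"Cumulative total costs: {grand_total:,}")
```

Under these assumptions the computed base, F&A, and total for each year can be compared line by line with the Base, TOTAL F&A, and TOTAL COSTS rows of the budget.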

