IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 59, NO. 3, MARCH 2013

Error Correction for Index Coding With Side Information

Son Hoang Dau, Vitaly Skachek, and Yeow Meng Chee, Senior Member, IEEE

Abstract—A problem of index coding with side information was first considered by Birk and Kol in 1998. In this study, a generalization of the index coding scheme, in which the transmitted symbols are subject to errors, is studied. Error-correcting methods for such a scheme, and their parameters, are investigated. In particular, the following question is discussed: given the side information hypergraph of an index coding scheme and the maximal number of erroneous symbols δ, what is the shortest length of a linear index code such that every receiver is able to recover the required information? This question turns out to be a generalization of the problem of finding a shortest-length error-correcting code with a prescribed error-correcting capability in classical coding theory. The Singleton bound and two other bounds, referred to as the α-bound and the κ-bound, for the optimal length of a linear error-correcting index code (ECIC) are established. For large alphabets, a construction based on the concatenation of an optimal index code with a maximum distance separable classical code is shown to attain the Singleton bound. For smaller alphabets, however, this construction may not be optimal. A random construction is also analyzed. It yields another inexplicit bound on the length of an optimal linear ECIC. Further, the problem of error-correcting decoding by a linear ECIC is studied. It is shown that in order to decode the desired symbol correctly, the decoder is required to find one of the vectors belonging to an affine space containing the actual error vector. Syndrome decoding is shown to produce the correct output if the weight of the error pattern is less than or equal to the error-correcting capability of the corresponding ECIC. Finally, the notion of a static ECIC, which is suitable for use with a family of instances of an index coding problem, is introduced. Several bounds on the length of static ECICs are derived, and constructions for static ECICs are discussed. Connections of these codes to weakly resilient Boolean functions are established.

Index Terms—Broadcast, error correction, index coding, minimum distance, network coding, side information.

Manuscript received May 10, 2011; revised March 05, 2012; accepted September 04, 2012. Date of publication November 15, 2012; date of current version February 12, 2013. This work was supported by the National Research Foundation of Singapore under Research Grant NRF-CRP2-2007-03. This paper was presented in part at the 2011 IEEE International Symposium on Information Theory.

S. H. Dau was with the Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371. He is now with the SUTD-MIT International Design Centre, Singapore University of Technology and Design, 20 Dover Drive, Singapore 138682 (e-mail: dausonhoang84@gmail.com).

V. Skachek was with the Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371.
He is now with the Institute of Computer Science, Faculty of Mathematics and Computer Science, University of Tartu, Tartu 50409, Estonia (e-mail: vitaly.skachek@gmail.com).

Y. M. Chee is with the Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371 (e-mail: ymchee@ntu.edu.sg).

Communicated by A. Ashikhmin, Associate Editor for Coding Techniques.

Digital Object Identifier 10.1109/TIT.2012.2227674

I. INTRODUCTION

A. Background

THE problem of index coding with side information (ICSI) was introduced by Birk and Kol [1], [2]. During the transmission, each client might miss a certain part of the data, due to intermittent reception, limited storage capacity, or other reasons. Via a slow backward channel, the clients let the server know which messages they already have in their possession and which messages they are interested in receiving. The server has to find a way to deliver to each client all the messages he requested, while spending a minimum number of transmissions. As was shown in [1], the server can significantly reduce the number of transmissions by coding the messages.

The toy example in Fig. 1 presents a scenario with one broadcast transmitter and four receivers. Each receiver requires a different information packet (we sometimes simply call it a message). The naive approach requires four separate transmissions, one transmission per information packet. However, by exploiting the knowledge of the subsets of messages that the clients already have, and by using coding of the transmitted data, the server can just broadcast one coded packet.

Fig. 1. Example of the ICSI problem.
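The specific packets and side-information sets of Fig. 1 are not reproduced in this text, so the following minimal sketch in Python uses an assumed configuration only to illustrate the mechanism: four receivers each demand a different one-bit message and happen to own the other three, so a single XOR broadcast replaces four uncoded transmissions.

```python
# Hypothetical configuration (not the paper's figure): receiver Ri demands xi
# and owns the other three messages as side information. One broadcast of
# x1 ^ x2 ^ x3 ^ x4 then satisfies all four receivers.

messages = [1, 0, 1, 1]                      # x1, x2, x3, x4
coded = 0
for x in messages:                           # the server broadcasts one coded packet
    coded ^= x

for i in range(4):                           # each receiver cancels its side information
    decoded = coded
    for j in range(4):
        if j != i:
            decoded ^= messages[j]
    assert decoded == messages[i]            # Ri recovers its demanded message xi
```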


Possible applications of index coding include communications scenarios in which a satellite or a server broadcasts a set of messages to a set of clients, such as daily newspaper delivery or video-on-demand. ICSI can also be used in opportunistic wireless networks. These are networks in which a wireless node can opportunistically listen to the wireless channel. The client may obtain packets that are not designated to it (see [3]–[5]). As a result, a node obtains some side information about the transmitted data. Exploiting this additional knowledge may help to increase the throughput of the system.

The ICSI problem has been the subject of several recent studies [3], [6]–[13]. This problem can be viewed as a special case of the network coding (NC) problem [14], [15]. On the other hand, it was shown that every instance of the NC problem can be reduced to an instance of the ICSI problem in the following sense. For each NC instance, we can construct an ICSI instance such that there exists a scalar linear network code for the NC instance if and only if there exists a perfect scalar linear index code for the corresponding ICSI instance (see [3] and [11] for more details).

B. Our Contribution

The preceding works on the ICSI problem consider scenarios where the transmissions are error-free. In practice, of course, this might not be the case. In this study, we assume that the transmitted symbols are subject to errors. We extend some known results on index coding to a case where any receiver can correct up to a certain number of errors. It turns out that the problem of designing such error-correcting index codes (ECICs) naturally generalizes the problem of constructing classical error-correcting codes.

More specifically, assume that the number of messages that the server possesses is n, and that the designed maximal number of errors is δ. We show that the problem of constructing an ECIC of minimal possible length is equivalent to the problem of constructing a matrix L which has n rows and the minimal possible number of columns, such that every vector z in a certain subset of F_q^n \ {0} satisfies wt(zL) ≥ 2δ + 1. Here, wt(z) denotes the Hamming weight of the vector z, F_q stands for a finite field with q elements, and 0 is the all-zeros vector. If all the side information sets are empty, this problem becomes equivalent to the problem of designing a shortest-length linear code of a given dimension and minimum distance.

In this study, we establish an upper bound (the κ-bound) and a lower bound (the α-bound) on the shortest length of a linear ECIC which is able to correct any error pattern of size up to δ. More specifically, let H be the side information hypergraph that describes the instance of the ICSI problem. Let N_q(H, δ) denote the length of a shortest linear ECIC over F_q such that every receiver can recover the desired message if the number of errors is at most δ. We use the notation N_q[k, d] for the length of an optimal linear error-correcting code of dimension k and minimum distance d over F_q. We obtain

N_q[α(H), 2δ + 1] ≤ N_q(H, δ) ≤ N_q[κ(H), 2δ + 1],   (1)

where α(H) is the generalized independence number and κ(H) is the min-rank (over F_q) of H.

For linear index codes, we also derive an analog of the Singleton bound. This result implies that (over a sufficiently large alphabet) the concatenation of a standard maximum distance separable (MDS) error-correcting code with an optimal linear index code yields an optimal linear ECIC. Finally, we consider random ECICs. By analyzing their parameters, we obtain an upper bound on their length.

When the side information hypergraph is a pentagon and the alphabet is binary, the inequalities in (1) are shown to be strict. This implies that a concatenated scheme based on a classical error-correcting code and on a linear non-error-correcting index code does not necessarily yield an optimal linear error-correcting index code. Since the ICSI problem can also be viewed as a source coding problem [6], [13], this example demonstrates that sometimes designing a single code for both source and channel coding can result in a smaller number of transmissions.

The decoding of a linear ECIC is somewhat different from that of a classical error-correcting code. There is no longer a need for a complete recovery of the whole information vector. We analyze the decoding criteria for ECICs and show that syndrome decoding, which might be different for each receiver, produces a correct result, provided that the number of errors does not exceed the error-correcting capability of the code.

An ECIC is called static under a family of instances of the ICSI problem if it works for all of these instances. Such an ECIC is interesting since it remains useful as long as the parameters of the problem vary within a particular range.
Bounds and constructions for static ECICs are studied in Section VIII. Connections between static ECICs and weakly resilient vectorial Boolean functions are also discussed.

The problem of error correction for NC was studied in several previous works. However, these results are not directly applicable to the ICSI problem. First, there is only a very limited variety of results for nonmulticast networks in the existing literature. The ICSI problem, however, is a special case of the nonmulticast NC problem. Second, the ICSI problem can be modeled by the NC scenario [3]; yet, this requires that there are directed edges from particular sources to each sink, which provide the side information. The symbols transmitted on these special edges are not allowed to be corrupted. By contrast, for error-correcting NC, symbols transmitted on all edges can be corrupted.

C. Organization

This paper is organized as follows. Basic notation and definitions used throughout the paper are provided in Section II. The problem of index coding with and without error correction is introduced in Section III, and some basic results are presented in that section. The α-bound and the κ-bound are derived in Section IV. The Singleton bound is presented in Section V. Random codes are discussed in Section VI. Syndrome decoding is studied in Section VII. A notion of static ECICs is presented in Section VIII; several bounds on the length of such codes are derived, and connections to resilient functions are shown in that section. Finally, the results are summarized in Section IX, and some open questions are proposed therein.

II. PRELIMINARIES

In this section, we introduce some useful notation. Here, F_q is the finite field of q elements, where q is a power of a prime, and F_q^* is the set of all nonzero elements of F_q.


Let [n] denote the set {1, 2, ..., n}. For two vectors of the same length over F_q, the (Hamming) distance between them is defined to be the number of coordinates in which they differ. If one of the two arguments is a set of vectors (or a vector subspace), the definition is extended in the natural way, namely as the smallest distance to an element of that set.

The support of a vector is defined to be the set of its nonzero coordinates. The (Hamming) weight of a vector, denoted wt(·), is defined to be the number of its nonzero coordinates.

A k-dimensional subspace of F_q^n whose minimum distance is d is called a linear [n, k, d] code over F_q; sometimes we use the shorter notation [n, k, d]_q for the sake of simplicity. The vectors in the code are called codewords. It is easy to see that the minimum weight of a nonzero codeword in a linear code is equal to its minimum distance. A generator matrix of an [n, k, d] code is a k × n matrix whose rows are linearly independent codewords of the code; the code is then the row space of this matrix. A parity-check matrix of the code is an (n − k) × n matrix over F_q whose null space is the code. Given k, d, and q, let N_q[k, d] denote the length of the shortest linear code over F_q which has dimension k and minimum distance d.

We use e_i to denote the unit vector, which has a one at the ith position and zeros elsewhere. For a vector and a subset of its coordinate positions, the corresponding subscripted symbol denotes the subvector obtained by keeping only the coordinates indexed by that subset. For a matrix L, L_i denotes its ith row, and for a set E of row indices, L_E denotes the matrix obtained from L by deleting all the rows of L that are not indexed by the elements of E. For a set of vectors, we use the notation span(·) to denote the linear space spanned by the vectors in the set; we use the analogous notation for the linear space spanned by the columns of a matrix.

Let G = (V, E) be a graph with a vertex set V and an edge set E. The graph is called undirected if its edges are unordered pairs of distinct vertices, and directed if every edge is an ordered pair of distinct vertices. A directed graph is called symmetric if, whenever it contains an edge from one vertex to another, it also contains the edge in the opposite direction. There is a natural correspondence between an undirected graph and a directed symmetric graph, obtained by replacing each undirected edge with the two directed edges in opposite directions; we refer to this correspondence as (2).

A subset of vertices of an undirected graph is called an independent set if no two of its vertices are adjacent. The size of the largest independent set in G is called the independence number of G and is denoted by α(G). The complement of G is the graph on the same vertex set in which two distinct vertices are adjacent if and only if they are not adjacent in G. A coloring of G using c colors is a function from the vertex set to a set of c colors such that adjacent vertices receive different colors. The chromatic number of G is the smallest number c such that there exists a coloring of G using c colors, and it is denoted by χ(G). By using the correspondence (2), the definitions of independence number, graph complement, and chromatic number are trivially extended to directed symmetric graphs.

III. INDEX CODING AND ERROR CORRECTION

A. Index Coding With Side Information

The ICSI problem considers the following communications scenario. There is a unique sender (or source) S, who has a vector of messages x = (x_1, ..., x_n) in his possession. There are also m receivers R_1, ..., R_m, receiving information from S via a broadcast channel. For each i, R_i has side information, i.e., R_i owns the subset of messages indexed by a set X_i ⊆ [n]. Each R_i is interested in receiving the message x_{f(i)} (we say that R_i requires x_{f(i)}), where the mapping f satisfies f(i) ∉ X_i for all i. An instance of the ICSI problem is given by a quadruple (m, n, X, f), where X is the collection of the side information sets. It can also be conveniently described by a directed hypergraph [13].

Definition 3.1: Let (m, n, X, f) be an instance of the ICSI problem. The corresponding side information (directed) hypergraph H is defined by the vertex set [n] and a (directed) hyperedge set with one hyperedge per receiver. Each hyperedge represents the side information and the demand of one receiver.
We often refer to an instance of the ICSI problem described by the hypergraph H simply as an instance H. For example, consider an ICSI instance with three messages and four receivers. The hypergraph that describes this instance has three vertices 1, 2, 3, and four directed hyperedges, one per receiver, each consisting of the demanded vertex together with the set of vertices owned by that receiver. This hypergraph is depicted in Fig. 2(a).

Each side information hypergraph H can be associated with a directed graph in the following way: each directed hyperedge is replaced by a set of directed edges, one from the demanded vertex to every vertex in the corresponding side information set. For instance, the graph associated with the hypergraph of Fig. 2(a) is depicted in Fig. 2(b). When m = n and f(i) = i for all i, this graph is, in fact, the side information graph defined in [6].
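As a concrete illustration of Definition 3.1, the small sketch below encodes a hypothetical instance (the numerical sets of the example above are not reproduced in this text, so the sets used here are made up) and lists its directed hyperedges together with the edges of the associated directed graph.

```python
# Hypothetical instance: n = 3 messages, m = 4 receivers. X[i] are the indices
# receiver i already owns; f[i] is the index it demands. Values are illustrative.
n = 3
X = {1: {2}, 2: {3}, 3: {1}, 4: {2, 3}}      # receiver -> owned message indices
f = {1: 1, 2: 2, 3: 3, 4: 1}                 # receiver -> demanded message index

# Hyperedge of receiver i: (f(i), X_i). The associated directed graph contains
# an edge f(i) -> j for every j in X_i.
hyperedges = [(f[i], X[i]) for i in X]
graph_edges = {(f[i], j) for i in X for j in X[i]}
print(hyperedges)
print(sorted(graph_edges))
```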


Fig. 2. (a) Hypergraph H and (b) its corresponding directed graph.

The goal of the ICSI problem is to design a coding scheme that allows S to satisfy the requests of all receivers in the least number of transmissions. More formally, we have the following definition.

Definition 3.2: An index code over F_q for an instance of the ICSI problem described by H (or just an H-IC over F_q) is an encoding function mapping the message vector to a transmitted word such that for each receiver R_i there exists a decoding function which, given the transmitted word and the side information of R_i, returns the requested message x_{f(i)}.

Sometimes we refer to such a code as a non-error-correcting index code. The parameter N is called the length of the index code. In the scheme corresponding to this code, S broadcasts a vector of length N over F_q.

Definition 3.3: A linear index code is an index code for which the encoding function is a linear transformation over F_q. Such a code can be described as x ↦ xL, where L is an n × N matrix over F_q. The matrix L is called the matrix corresponding to the index code; the code is also referred to as the linear index code based on L.

Hereafter, we assume that the collection of side information sets is known to S. Moreover, we also assume that the code is known to each receiver R_i. In practice, this can be achieved by a preliminary communication session, in which the knowledge of the sets X_i and of the code is disseminated between the participants of the scheme.

Definition 3.4: Suppose H corresponds to an instance of the ICSI problem. Then, the min-rank of H over F_q, denoted κ(H), is defined as the smallest rank of a matrix that fits H: one row per receiver, nonzero in the column of the demanded message, and supported only on that receiver's side information and demand.

Observe that κ(H) generalizes the min-rank of the side information graph, which was defined in [6]. More specifically, when m = n and f(i) = i for all i, H becomes the side information graph, and the two quantities coincide. The min-rank of an undirected graph was first introduced by Haemers [16] to bound the Shannon capacity of a graph, and was later proved in [6] and [7] to be the smallest number of transmissions in a linear index code. It was shown by Peeters [17] that finding the min-rank of a general graph is an NP-hard problem. Studies on graph parameters related to min-ranks can also be found in the works of Peeters [18], [19].

The intuition behind the concept of min-rank is explained as follows. For each i, if R_i obtains a linear combination of the messages whose support lies in X_i ∪ {f(i)} and whose coefficient at f(i) is nonzero, then R_i is able to determine x_{f(i)}. Indeed, as R_i possesses x_j for every j ∈ X_i, he can compute the contribution of his side information and subtract it. Hence, R_i can retrieve x_{f(i)}. Thus, in order to satisfy all the demands, the sender may broadcast one such linear combination for every receiver. In fact, it is sufficient to broadcast only κ(H) packets. Therefore, the minimum number of packets (transmissions) required in this way is κ(H). It turns out that κ(H) is the smallest possible number of transmissions required if scalar linear index codes are used, according to Lemma 3.5. This lemma was implicitly formulated in [6] for the case where q = 2, m = n, and f(i) = i for all i, and generalized to its current form in [20].

Lemma 3.5: Consider an instance of the ICSI problem described by H.
1) The matrix L corresponds to a linear H-IC over F_q if and only if for each receiver R_i there exists a vector in the span of the columns of L that is nonzero in coordinate f(i) and is zero outside X_i ∪ {f(i)}.
2) The smallest possible length of a linear H-IC over F_q is κ(H).

For example, it is straightforward to verify the min-rank of the hypergraph H depicted in Fig. 2(a): one may select one fitting row for each of the four receivers and compute the rank of the resulting matrix.
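For toy instances, the min-rank of Definition 3.4 can be computed by brute force. The sketch below assumes the binary case and the "fitting matrix" description given above: one admissible row per receiver, nonzero in the demanded column and supported only on the side information; the min-rank is the smallest rank over all such choices. In line with the NP-hardness result of Peeters [17], the search is exponential and is intended only for very small examples.

```python
from itertools import product

def rank_gf2(rows):
    """Rank over GF(2) of rows given as integer bitmasks (Gaussian elimination)."""
    pivots = {}                              # leading-bit position -> stored row
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]
    return len(pivots)

def min_rank_gf2(n, X, f):
    """Exhaustive min-rank over GF(2): minimum rank over all matrices that fit
    the instance (row i nonzero in column f(i), zero outside X_i and f(i))."""
    def rows_for(i):
        free = sorted(X[i] - {f[i]})
        for bits in product((0, 1), repeat=len(free)):
            row = 1 << f[i]
            for b, col in zip(bits, free):
                row |= b << col
            yield row
    best = n
    for choice in product(*(rows_for(i) for i in sorted(X))):
        best = min(best, rank_gf2(list(choice)))   # exponential, toy sizes only
    return best

# Hypothetical 0-indexed instance (not the paper's example): receiver i demands
# message i and owns the other two, so one broadcast of x0+x1+x2 is enough.
X = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
f = {0: 0, 1: 1, 2: 2}
print(min_rank_gf2(3, X, f))   # -> 1
```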
B. Error-Correcting Index Code With Side Information

Due to noise, the symbols received by R_i may be subject to errors. Consider an ICSI instance, and assume that S broadcasts a vector of length N over F_q. Let an error vector of the same length affect the information received by R_i. Then, R_i actually receives the sum of the transmitted vector and this error vector, instead of the transmitted vector itself.


The following definition is a generalization of Definition 3.2.

Definition 3.6: Consider an instance of the ICSI problem described by H. A δ-error-correcting index code (δ-ECIC) over F_q for this instance is an encoding function such that for each receiver R_i there exists a decoding function which returns the requested message x_{f(i)} for every pair of an information vector and an error vector of weight at most δ.

The definitions of the length, of a linear index code, and of the matrix corresponding to an index code are naturally extended to an ECIC. Note that an H-IC is exactly a 0-error-correcting ECIC, and vice versa.

Definition 3.7: An optimal linear δ-ECIC over F_q is a linear δ-ECIC over F_q of the smallest possible length.

Consider an instance of the ICSI problem described by H. For each receiver we define the set of vectors that vanish on the receiver's side information positions and are nonzero in the position of its demand; the collection of supports of all such vectors, taken over all receivers, is given in (3). (Observe that R_i is interested only in the symbol x_{f(i)}, not in the whole vector x.)

The necessary and sufficient condition for a matrix to be the matrix corresponding to some δ-ECIC is given in the following lemma.

Lemma 3.8: The matrix L corresponds to a δ-ECIC over F_q if and only if every such vector z satisfies wt(zL) ≥ 2δ + 1 (condition (4)); equivalently, if and only if condition (5) holds for all receivers and for all choices of nonzero coefficients.

Proof: For each information vector x, we define the set of all vectors resulting from at most δ errors in the transmitted vector associated with x. Then, the receiver R_i can recover x_{f(i)} correctly if and only if these sets are disjoint for every pair of information vectors that agree on the positions in X_i but differ in position f(i). Therefore, L corresponds to a δ-ECIC if and only if the condition in (6) is satisfied for all receivers and all such pairs. Denoting the difference of two such information vectors by z, the condition in (6) can be reformulated as in (7); the two sets of differences arising in this way coincide, which yields an equivalent condition. Since the transmitted vectors corresponding to x and to x + z differ by zL, condition (4) can be restated as (5).

Corollary 3.9: For all receivers, let


for all receivers and all choices of coefficients. We deduce that the rows of the resulting matrix also satisfy the corresponding condition. By Corollary 3.10, this matrix corresponds to a linear H-IC. Therefore, by Lemma 3.5, part 2, it has at least κ(H) columns. We deduce the claimed inequality, which concludes the proof.

The following corollary from Proposition 4.6 and Theorem 5.1 demonstrates that, for sufficiently large alphabets, a concatenation of a classical MDS error-correcting code with an optimal (non-error-correcting) index code yields an optimal ECIC. However, as was illustrated in Example 4.8, this does not hold for index coding schemes over small alphabets.

Corollary 5.2 (MDS Error-Correcting Index Code): For sufficiently large alphabets, the equality in (11) holds.

Proof: From Theorem 5.1, we have the lower bound. On the other hand, from Proposition 4.6, the matching upper bound follows for sufficiently large q (by taking doubly extended Reed–Solomon (RS) codes). Therefore, for these q, (11) holds.

Remark 5.3: Consider an instance whose side information hypergraph is a (symmetric) odd cycle; its min-rank is known from [7]. The length implied by the α-bound and, by contrast, the length implied by Theorem 5.1 differ in this case. As there are no nontrivial binary MDS codes, the α-bound is at least as good as the Singleton bound for all these choices of parameters.

VI. RANDOM CODES

In this section, we prove an inexplicit upper bound on the optimal length of ECICs. The proof is based on constructing a random ECIC and analyzing its parameters.

Theorem 6.1: Let H describe an instance of the ICSI problem. Then, there exists a δ-ECIC over F_q of length N if a certain inequality, stated in terms of the quantity in (12), is satisfied, where (12) is the volume of the q-ary sphere in F_q^N.

Proof: We construct a random n × N matrix L over F_q, row by row. Each row is selected independently of the other rows, uniformly over F_q^N. For each receiver we define a vector space spanned by the relevant rows of L, and the event E_i that the receiver R_i cannot recover x_{f(i)}. By Corollary 3.9, the event E_i is equivalent to the existence of a low-weight vector of the forbidden form. Therefore, the union bound gives (13), and the probability of a particular event is given by (14). There exists a matrix that corresponds to a δ-ECIC if the probability that some event E_i occurs is less than one. It is enough to require that the right-hand side of (13) is smaller than 1. By plugging in the expression in (14), we obtain a sufficient condition on the existence of a δ-ECIC over F_q.

Remark 6.2: The bound in Theorem 6.1 does not take into account the structure of the sets X_i, other than their cardinalities. Therefore, this bound generally is weaker than the α-bound. On the other hand, for a particular instance of the ICSI problem, it is easier to compute this bound, while calculating the α-bound in general is an NP-hard problem.
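The volume of a q-ary Hamming sphere, which enters Theorem 6.1 and reappears in Theorem 8.17 as (12), has the standard closed form V_q(N, r) = Σ_{j=0}^{r} C(N, j)(q − 1)^j. The exact existence condition of Theorem 6.1 is not reproduced in this rendering; the snippet below only shows how the volume term itself is computed.

```python
from math import comb

def sphere_volume(q, N, r):
    """Number of vectors of F_q^N within Hamming distance r of a fixed vector."""
    return sum(comb(N, j) * (q - 1) ** j for j in range(r + 1))

# Example: the binary sphere of radius 2 in length 10.
print(sphere_volume(2, 10, 2))   # 1 + 10 + 45 = 56
```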


Remark 6.3: The bound in Theorem 6.1 implies a bound on the min-rank, which is tight for some instances. Indeed, fix δ = 0. The bound implies that there exists a linear index code of length N whenever (15) holds. Consider an instance whose side information hypergraph is the complement of a (symmetric directed) odd cycle, so that all side information sets have the same cardinality; then (15) takes a simple form, and in a particular case the value obtained coincides with the known min-rank (see [13, Claim A.1]), and thus the bound is tight.

VII. SYNDROME DECODING

Consider the δ-ECIC based on a matrix L. Suppose that the receiver R_i receives the vector in (16), where the first term is the codeword transmitted by S and the second term is the error pattern affecting this codeword.

In classical coding theory, the transmitted vector, the received vector, and the error pattern are related by the fact that the error is their difference. Therefore, if the received vector is known to the receiver, then there is a one-to-one correspondence between the values of the two unknown vectors. For index coding, however, this is no longer the case. The following results show that, in order to recover the message from the received vector using (16), it is sufficient to find just one vector from a set of possible error patterns, defined next. We henceforth refer to this set as the set of relevant error patterns.

Lemma 7.1: Assume that the receiver R_i receives a vector as in (16).
1) If R_i knows the message x_{f(i)}, then it is able to determine the set of relevant error patterns.
2) If R_i knows some vector in the set of relevant error patterns, then it is able to determine x_{f(i)}.

Proof:
1) From (16), if R_i knows x_{f(i)}, then it is also able to determine the quantity in (17); since R_i has knowledge of its side information, it is also able to determine the whole set of relevant error patterns.
2) Suppose that R_i knows a vector from the set of relevant error patterns. We show that R_i is then able to determine x_{f(i)}. Indeed, we rewrite (17) as (18). The receiver R_i can find some solution of equation (19) with respect to the unknowns; observe that (19) has at least one solution due to (18). From (18) and (19), we deduce an equality which implies that the recovered value coincides with x_{f(i)} (otherwise, by Corollary 3.9, the sum on the right-hand side would have nonzero weight). Hence, R_i is able to determine x_{f(i)}, as claimed.

We now describe a syndrome decoding algorithm for linear ECICs. From (17), after subtracting the contribution of the side information, the receiver obtains a word whose unknown part lies in a fixed linear code; let a parity check matrix of that code be fixed, and let the column vector in (21) be the corresponding syndrome, as in (20). Observe that each receiver is capable of determining its own syndrome. Then, we can rewrite (20) as an equation in the unknown error-related vector. This leads us to the formulation of the decoding procedure for R_i, which is presented in Fig. 5.

Remark 7.2: The solution in (22) might not be unique. Nevertheless, any such solution of the lowest Hamming weight yields the correct answer in the algorithm.

Remark 7.3: Gaussian elimination can be used to solve (23). However, since L also corresponds to an H-IC, there is a more efficient way to do so. From Lemma 3.5, there exists a vector with the required support and a nonzero coefficient at f(i); with the knowledge of the received word, the side information, and the recovered vector, R_i can determine the quantities in (23). Therefore, it can also determine x_{f(i)}. Note that (23) may have more than one solution of weight at most δ. However, as shown in the next theorem, if at most δ errors occur, the procedure always recovers the requested message.
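The procedure of Fig. 5 itself is not reproduced in this rendering. As a stand-in, the following brute-force decoder implements the underlying decoding criterion directly: receiver R_i keeps every information vector that agrees with its side information and can explain the received word with at most δ symbol errors, and it succeeds when all surviving candidates agree in the demanded coordinate. The matrix and the instance in the usage example are hypothetical.

```python
from itertools import product

def encode(L, x, q=2):
    """Broadcast word xL over F_q (L is n x N, x has length n)."""
    N = len(L[0])
    return [sum(x[k] * L[k][j] for k in range(len(L))) % q for j in range(N)]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def brute_force_decode(L, y, side, demand, delta, q=2):
    """Return the demanded symbol if it is uniquely determined by the received
    word y, the side information 'side' (dict j -> x_j), and the bound delta."""
    n = len(L)
    candidates = set()
    for x in product(range(q), repeat=n):
        if any(x[j] != v for j, v in side.items()):
            continue                        # inconsistent with side information
        if hamming(encode(L, list(x), q), y) <= delta:
            candidates.add(x[demand])       # x could explain the received word
    return candidates.pop() if len(candidates) == 1 else None

# Toy check: two binary messages, receiver 1 owns x2 and demands x1.
# Broadcasting x1 + x2 three times tolerates one symbol error for this receiver.
L = [[1, 1, 1],
     [1, 1, 1]]
x = [1, 0]
y = encode(L, x)
y[2] ^= 1                                   # one channel error
print(brute_force_decode(L, y, side={1: x[1]}, demand=0, delta=1))  # -> 1
```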


Example 7.4: Consider the ICSI instance presented in Example 4.8, and suppose a binary ECIC based on a suitable matrix is used. The sender broadcasts a codeword; the receiver knows in advance the symbols corresponding to its side information and requests one further message. Suppose two errors occur and the receiver obtains an erroneous vector. The receiver computes the syndrome with respect to a parity check matrix of the relevant code. In this case, the actual error-related vector is the unique lowest-Hamming-weight vector with this syndrome. Choosing it, and using the knowledge of the received word and the side information, the receiver computes, as discussed in Remark 7.3, a value which is equal to the message that it requested.

Fig. 5. Syndrome decoding procedure.

Theorem 7.5: Let y_i be the vector received by R_i, and assume that the number of errors is at most δ. Assume that the procedure in Fig. 5 is applied to y_i. Then, its output is the requested message x_{f(i)}.

Proof: By Lemma 7.1, it is sufficient to prove that the vector found by the procedure belongs to the set of relevant error patterns. Indeed, the actual error pattern has weight at most δ and satisfies (24). Since the vector found in (22) is a solution of the same syndrome equation and has the lowest weight, its weight is also at most δ. Hence the difference of the two vectors is mapped by L to a word of weight at most 2δ. Therefore, by Corollary 3.9, this difference cannot be of the forbidden form, and the output of the procedure belongs to the set of relevant error patterns, as desired.

Remark 7.6: We anticipate Step 2 in Fig. 5 to be computationally hard. Indeed, the problem of finding a vector of the lowest weight satisfying (25) for a given binary vector is at least as hard as the decision problem COSET WEIGHTS, which was shown in [23] to be NP-complete.

VIII. STATIC CODES AND RELATED PROBLEMS

A. Static Error-Correcting Index Codes

In the previous sections, we focused on linear δ-error-correcting index codes for a particular instance of the ICSI problem. When some of the parameters m, n, X, and f are variable or not known, it is very likely that an ECIC for the instance with particular values of these parameters cannot be used for instances with different values of some of these parameters. Therefore, it is interesting to design an ECIC which will be suitable for a family of instances of the ICSI problem.


Definition 8.1: Let J be a set of instances of an ICSI problem. A δ-error-correcting index code over F_q is said to be static under the set J if it is a δ-error-correcting H-IC over F_q for all instances in J.

Recall that an instance can be described by its side information hypergraph. For a set of instances, we take the union of the corresponding collections of supports defined as in (3), and we also define the collection in (26).

Lemma 8.2: The matrix L corresponds to a δ-error-correcting index code which is static under J if and only if the condition of Lemma 3.8 holds simultaneously for all instances in J, that is, for all receivers of all instances and for all choices of coefficients.

Proof: The proof follows from Definition 8.1 and Lemma 3.8.

Notice that when L is used for an instance with fewer messages, the last rows of L are simply discarded.

One particular family of interest is the family that contains all instances in which each receiver owns at least a prescribed number of messages as its side information. A δ-error-correcting index code which is static under this family will provide successful communication between the sender and the receivers under the presence of at most δ errors, despite a possible change of the collection of the side information sets, a change of the set of receivers, and a change of the demand function, as long as each receiver still possesses at least the prescribed number of messages. In the rest of this section, we restrict attention to this family.

Definition 8.3: An n × N matrix L is said to satisfy the Property if any nontrivial linear combination of at most a prescribed number of its rows has weight at least 2δ + 1.

Proposition 8.4: The matrix L corresponds to a δ-error-correcting linear index code which is static under the family above if and only if L satisfies the Property of Definition 8.3.

Proof: Let L be a matrix that satisfies the Property of Definition 8.3. We show that this is equivalent to the condition that L corresponds to a δ-error-correcting linear index code which is static under the family. By Lemma 8.2, it suffices to show that the relevant collection of supports is the collection of all nonempty subsets of [n] whose cardinalities do not exceed the prescribed bound. Consider an instance of the family. For all of its receivers, the demand is not contained in the side information and the side information is large; thus, by (3), the cardinality of each set in the collection is at most the prescribed bound, and therefore, due to (26), every set in the collection has at most that many elements. It remains to show that every nonempty subset of [n] whose cardinality is at most the prescribed bound belongs to the collection. Consider an arbitrary such subset, and consider an instance of the family with a receiver whose demand is an element of this subset and whose side information is the complement of the subset; this receiver contributes exactly this subset as a support. The proof follows.

B. Application: Weakly Resilient Functions

In this section, we introduce the notion of weakly resilient functions. Hereafter, we restrict the discussion to the binary alphabet.

The concept of binary resilient functions was first introduced by Chor et al. [24] and independently by Bennet et al. [25].

Definition 8.5: A function f from n-bit inputs to m-bit outputs is called t-resilient if f satisfies the following property: when t arbitrary inputs of f are fixed and the remaining n − t inputs run through all the (n − t)-tuples exactly once, the value of f runs through every possible output m-tuple an equal number of times. Moreover, if f is a linear transformation, then it is called a linear t-resilient function. We refer to the parameter t as the resiliency of f.

The applications of resilient functions can be found in fault-tolerant distributed computing, quantum cryptographic key distribution [24], privacy amplification [25], and random sequence generation for stream ciphers [26]. Connections between linear error-correcting codes and resilient functions were established in [24].

Theorem 8.6 [24]: Let G be a binary matrix.
Then, G is a generator matrix of a linear error-correcting code with minimum distance d if and only if the linear function defined by G is (d − 1)-resilient.

Remark 8.7: Vectorial Boolean functions with certain properties are useful for the design of stream ciphers. These properties include high resiliency and high nonlinearity (see, for instance, [26]). However, linear resilient functions are still particularly interesting, since they can be transformed into highly nonlinear resilient functions with the same parameters. This can be achieved by a composition of the linear function with a highly nonlinear permutation (see [27] and [28] for more details).
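As a small, self-contained illustration of Definition 8.5, the following exhaustive check verifies t-resiliency of a linear map x ↦ xG over GF(2) by fixing every choice of t input positions and values and confirming that the outputs are uniformly distributed over the remaining inputs. The matrix G used here is a placeholder example, not one taken from the paper.

```python
from itertools import combinations, product
from collections import Counter

def apply_gf2(G, x):
    """y = x G over GF(2); G has len(x) rows."""
    m = len(G[0])
    return tuple(sum(x[i] * G[i][j] for i in range(len(x))) % 2 for j in range(m))

def is_resilient(G, t):
    """Exhaustively check t-resiliency of x -> xG (Definition 8.5)."""
    n, m = len(G), len(G[0])
    for fixed_pos in combinations(range(n), t):
        for fixed_val in product((0, 1), repeat=t):
            counts = Counter()
            free = [i for i in range(n) if i not in fixed_pos]
            for rest in product((0, 1), repeat=len(free)):
                x = [0] * n
                for p, v in zip(fixed_pos, fixed_val):
                    x[p] = v
                for p, v in zip(free, rest):
                    x[p] = v
                counts[apply_gf2(G, x)] += 1
            if len(counts) != 2 ** m or len(set(counts.values())) != 1:
                return False            # some output tuple is missed or biased
    return True

# Placeholder 3-input, 2-output example: outputs are x1+x3 and x2+x3.
G = [[1, 0],
     [0, 1],
     [1, 1]]
print(is_resilient(G, 1))   # True
print(is_resilient(G, 2))   # False
```

Every nonzero combination of the two output bits of this placeholder map depends on two inputs, which is consistent with the code-theoretic characterization recalled in Theorem 8.6.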


Below we introduce the definition of a weakly resilient function, which is a weaker version of a resilient function.

Definition 8.8: A function f is called k-weakly t-resilient if f satisfies the property that every set of k coordinates in the image of f runs through every possible output k-tuple an equal number of times, when t arbitrary inputs of f are fixed and the remaining inputs run through all the tuples exactly once.

Remark 8.9: A k-weakly t-resilient function f can be viewed as a collection of different t-resilient functions, each obtained by taking some k coordinates in the image of f. Similarly to [24], consider a scenario in which two parties share a secret key consisting of randomly selected bits. Suppose that at some moment, some of the bits of the key are leaked to an adversary. By applying a t-resilient function to the current key, the two parties are able to obtain a completely new and secret key, without requiring any communication or randomness generation. However, if the parties use various parts of the key for various purposes, they may only require one of the shorter secret keys (instead of the larger key). In that case, a k-weakly t-resilient function can be used. By applying a k-weakly t-resilient function to the current key, the parties obtain a set of different shorter keys, each of which is new and secret (however, these keys might not be independent of each other).

Theorem 8.10: Let L be a binary matrix. Then, L satisfies the Property of Definition 8.3 if and only if the function defined by the corresponding linear map is weakly resilient with the corresponding parameters.

Proof:
1) Suppose that L satisfies the Property of Definition 8.3. Take any subset of output coordinates of the prescribed size. By Definition 8.3, the corresponding submatrix of L is a generator matrix of an error-correcting code with the required minimum distance. By Theorem 8.6, the function defined by this submatrix is resilient. Since the subset is arbitrary, the function defined by L is weakly resilient.
2) Conversely, assume that the function defined by L is weakly resilient. Take any subset of output coordinates of the prescribed size. Then, the function defined by the corresponding submatrix is resilient. Therefore, by Theorem 8.6, this submatrix is a generator matrix of a linear code with the required minimum distance. Since the subset is arbitrary, by Proposition 8.4, L satisfies the Property of Definition 8.3.

C. Bounds and Constructions

In this section, we study the problem of constructing a matrix satisfying the Property of Definition 8.3. Such a matrix with the minimal possible number of columns is called optimal. First, observe that from Proposition 8.4, the relevant collection of supports is the set of all nonempty subsets of [n] of cardinality at most the prescribed bound. Next, consider an instance satisfying (27), where the side information hypergraph corresponds to that instance. Such an instance can be constructed as follows. For each admissible subset, we introduce a receiver which requests a message indexed by an element of the subset and has the complement of the subset as its side information. It is straightforward to verify that we indeed obtain an instance satisfying (27). The problem of designing an optimal matrix satisfying the Property of Definition 8.3 then becomes equivalent to the problem of finding an optimal ECIC for this instance. Thus, the optimal length for the family is equal to the number of columns in an optimal matrix which satisfies the Property of Definition 8.3.

The corresponding α-bound and κ-bound can be stated as follows.

Theorem 8.11: Let the quantity in question be the smallest number such that a linear code with the corresponding parameters exists.
Then, we have the stated chain of inequalities.

Proof: The first inequality follows from the α-bound and from the fact, due to (27), that the generalized independence number of this instance equals the relevant parameter. For the second inequality, it suffices to bound the min-rank. By Corollary 3.10, a matrix corresponds to an H-IC if and only if the corresponding sets of its rows are linearly independent for every admissible subset. Since the collection consists of all nonempty subsets of cardinality at most the prescribed bound, this is equivalent to saying that every set of at most that many rows of the matrix is linearly independent. This condition is equivalent to the condition that the matrix is a parity check matrix of a linear code with minimum distance at least the corresponding value [29, Ch. 1]. Therefore, a linear index code of the required length exists if and only if a linear code with the corresponding parameters exists. Since the quantity in the theorem is the smallest number for which such a code exists, the second inequality follows.

Corollary 8.12: The length of an optimal δ-error-correcting linear index code over F_q which is static under the family satisfies the stated bounds, where the relevant quantity is the smallest number such that a code with the corresponding parameters exists.

Proof: This is a straightforward corollary of Theorem 5.1 (the Singleton bound) and Theorem 8.11.

Corollary 8.13: For sufficiently large alphabets, the length of an optimal δ-error-correcting linear index code over F_q which is static under the family is determined exactly.

Proof: For sufficiently large q, there exists a linear MDS code with the required parameters (for example, one can take an extended RS code [29, Ch. 11]). Due to the Singleton bound, we conclude that its length is the smallest value for which such a linear code exists. Following the lines of the proof of Theorem 8.11, there exists a δ-error-correcting index code of the corresponding length which is static under the family, and a matching code of this length exists (for example, by taking an extended RS code). Due to Corollary 8.12, this static ECIC is optimal.


Remark 8.14: We observe from the proof of Theorem 8.11 that the problem of constructing an optimal linear (non-error-correcting) index code which is static under the family is, in fact, equivalent to the problem of constructing a parity check matrix of a classical linear error-correcting code.

Example 8.15: Consider the binary case with (at most) 20 messages, where each receiver owns at least 10 of them as side information. From [21], the smallest possible dimension of a binary linear code of length 20 and minimum distance 11 is 3. Theorem 8.11 implies the existence of a one-error-correcting binary index code of length 22 which can be used for any instance of the IC problem in which each receiver owns at least 10 out of (at most) 20 messages as side information. It also implies a lower bound on the length of any such static ECIC. Corollary 8.12 provides a better lower bound on the minimum length.

Example 8.16: Below we show that, with the same number of inputs and outputs, a weakly resilient function may have strictly higher resiliency. From Example 8.15, there exists a linear vectorial Boolean function which is 10-weakly 2-resilient. According to [21], an optimal linear code with the corresponding parameters has a small minimum distance; hence, due to Theorem 8.6, the resiliency of any linear vectorial Boolean function with the same input and output sizes cannot exceed one.

The problem of constructing a matrix that satisfies the Property of Definition 8.3 is a natural generalization of the problem of constructing the parity check matrix of a linear code. Indeed, a matrix is a parity check matrix of a code with minimum distance d if and only if every set of d − 1 of its columns is linearly independent; equivalently, any nontrivial linear combination of at most d − 1 columns has weight at least one. For comparison, a matrix satisfies the Property of Definition 8.3 if and only if any nontrivial linear combination of at most the prescribed number of columns has weight at least 2δ + 1.

Some classical methods for deriving bounds on the parameters of error-correcting codes can be generalized to the case of linear static ECICs. Below we present a Gilbert–Varshamov-like bound.

Theorem 8.17: Let V_q(N, r) denote the volume of the q-ary sphere of radius r in F_q^N, given by (12). If a certain counting inequality holds, then there exists a matrix of the corresponding size which satisfies the Property of Definition 8.3.

Proof: We build up the set of rows of the matrix one by one. The first row can be any vector of weight at least 2δ + 1. Now suppose we have chosen some rows so that no nontrivial linear combination of at most the prescribed number of these rows has weight less than 2δ + 1. There are at most a bounded number of vectors which are at distance less than 2δ + 1 from some linear combination of fewer than the prescribed number of the chosen rows (this includes vectors at distance less than 2δ + 1 from the all-zeros vector). If this quantity is smaller than the total number of vectors, then we can add another row to the set so that no nontrivial linear combination of at most the prescribed number of rows has weight less than 2δ + 1. The claim follows.

Remark 8.18: If we apply Theorem 6.1 to the instance defined at the beginning of this section, then we obtain a bound which is somewhat weaker than its counterpart in Theorem 8.17; namely, the matrix as above exists under a slightly stronger counting condition.
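The counting inequality of Theorem 8.17 is not reproduced above, so the sketch below only mirrors the greedy argument of its proof in the binary case: rows are added one at a time, and a candidate row is accepted only if every nontrivial combination of at most k of the rows chosen so far still has weight at least 2δ + 1 (here k is a hypothetical name for the elided parameter). The random candidate search and the parameter values are illustrative.

```python
from itertools import combinations
import random

def weight(v):                      # Hamming weight of an integer bitmask
    return bin(v).count("1")

def property_holds(rows, k, dmin):
    """Every nontrivial GF(2)-combination of at most k rows has weight >= dmin."""
    for r in range(1, min(k, len(rows)) + 1):
        for subset in combinations(rows, r):
            acc = 0
            for v in subset:
                acc ^= v
            if weight(acc) < dmin:
                return False
    return True

def greedy_matrix(n_rows, length, k, dmin, tries=20000, seed=1):
    """Greedily grow a row set preserving the property; may fail (return None)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        for _ in range(tries):
            cand = rng.randrange(1, 1 << length)
            if property_holds(rows + [cand], k, dmin):
                rows.append(cand)
                break
        else:
            return None
    return rows

# Illustrative call: 5 rows of length 15, all combinations of up to 2 rows
# required to have weight at least 3 (i.e., delta = 1).
print(greedy_matrix(5, 15, k=2, dmin=3))
```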
IX. CONCLUSION

In this work, we generalize the Index Coding with Side Information problem toward a setup with errors. Under this setup, each receiver should be able to recover its desired message even if a certain number of errors occur in the transmitted data. This is the first work that considers such a problem.

A number of bounds on the length of an optimal error-correcting index code are derived. As shown in Example 4.8, a separation of the error-correcting code and the index code sometimes leads to a nonoptimal scheme. This raises the question of designing coding schemes in which the two layers are treated as a whole. Therefore, the question of constructing error-correcting index codes with good parameters is still open.

A general decoding procedure for linear error-correcting index codes is discussed. The difference between decoding of a classical error-correcting code and decoding of an error-correcting index code is that, in the latter case, each receiver does not require complete knowledge of the error vector. This difference may help to ease the decoding process. Finding an efficient decoding method for error-correcting index codes (together with corresponding constructions) is also still an open problem.

The notion of an error-correcting index code is further generalized to a static index code. The latter is designed to serve a family of instances of the error-correcting index coding problem. The problem of designing an optimal static error-correcting index code is studied, and several bounds on the length of such codes are presented.

APPENDIX

Lemma A.1: If the side information hypergraph is symmetric, then its generalized independence number is the independence number of the corresponding undirected graph.

Proof: It suffices to show that if the hypergraph is symmetric, then the set of its generalized independent sets and the set of independent sets of the corresponding graph coincide.

Let a set of vertices be a generalized independent set. If it consists of a single vertex, then obviously it is an independent set. Assume that it has at least two vertices. For any pair of its vertices, the pair belongs to the relevant collection; by the definition of that collection, either there is no edge from the first vertex to the second, or there is no edge from the second to the first. Since the graph is symmetric, there are no edges between the two vertices in either direction. Therefore, the set is an independent set in the corresponding graph.


DAU et al.: ERROR CORRECTION FOR INDEX CODING WITH SIDE INFORMATION 1531are no edges between and , in neither directions. There<strong>for</strong>e,is an independent set in .Conversely, let be an independent set in . For each, since there are no edges from to all other vertices in ,wededuce that . Due to (3), every subset of whichcontains belongs to . This holds <strong>for</strong> an arbitrary .There<strong>for</strong>e, every nonempty subset of belong to .Weobtain that is a generalized independent set of .ACKNOWLEDGMENTThe authors would like to thank the authors of [7] <strong>for</strong> providinga preprint of their paper.REFERENCES[1] Y. Birk and T. Kol, “In<strong>for</strong>med-source coding-on-demand (ISCOD)over broadcast channels,” in Proc. IEEE Conf. Comput. Commun.,SanFrancisco, CA, 1998, pp. 1257–1264.[2] Y. Birk and T. Kol, “<strong>Coding</strong>-on-demand by an in<strong>for</strong>med source(ISCOD) <strong>for</strong> efficient broadcast of different supplemental datato caching clients,” IEEE Trans. Inf. Theory, vol. 52, no. 6, pp.2825–2830, Jun. 2006.[3] S. E. Rouayheb, A. Sprintson, and C. Georghiades, “On the indexcoding problem and its relation to network coding and matroid theory,”IEEE Trans. Inf. Theory, vol.56,no.7,pp. 3187–3195, Jul. 2010.[4] S.Katti,H.Rahul,W.Hu,D.Katabi,M.Médard, and J. Crowcroft,“XORs in the air: Practical wireless network coding,” in Proc. ACMSIGCOMM, 2006, pp. 243–254.[5] S. Katti, D. Katabi, H. Balakrishnan, and M. Médard, “Symbol-levelnetwork coding <strong>for</strong> wireless mesh networks,” ACM SIGCOMMComput. Commun. Rev., vol. 38, no. 4, pp. 401–412, 2008.[6] Z. Bar-Yossef, Z. Birk, T. S. Jayram, andT.Kol,“<strong>Index</strong>codingwithside in<strong>for</strong>mation,” in Proc. 47th Annu. IEEE Symp. Found. Comput.Sci., 2006, pp. 197–206.[7]Z.Bar-Yossef,Z.Birk,T.S.Jayram,and T. Kol, “<strong>Index</strong> codingwith side in<strong>for</strong>mation,” IEEE Trans. Inf. Theory, vol. 57, no. 3, pp.1479–1494, Mar. 2011.[8] E. Lubetzky and U. Stav, “Non-linear index coding outper<strong>for</strong>ming thelinear optimum,” in Proc.48thAnnu. IEEE Symp. Found. Comput. Sci.,2007, pp. 161–168.[9] Y. Wu, J. Padhye, R. Chandra, V. Padmanabhan, and P. A. Chou, “Thelocal mixing problem,” presented at the presented at the Inf. TheoryAppl. Workshop, San Diego, CA, 2006.[10] S. E. Rouayheb, M. A. R. Chaudhry, andA.Sprintson,“Ontheminimum number of transmissions in single-hop wireless codingnetworks,” in Proc. IEEE Inf. Theory Workshop, 2007, pp. 120–125.[11] S. E. Rouayheb, A. Sprintson, and C. Georghiades, “On the relationbetween the index coding and the network coding problems,” in Proc.IEEE Symp. Inf. Theory, Toronto, ON, Canada, 2008, pp. 1823–1827.[12] M. A. R. Chaudhry and A. Sprintson, “Efficient algorithms <strong>for</strong> indexcoding,” in Proc. IEEE Conf. Comput. Commun., 2008, pp. 1–4.[13] N. Alon, A. Hassidim, E. Lubetzky, U. Stav, and A. Weinstein, “Broadcastingwith side in<strong>for</strong>mation,” in Proc. 49th Annu. IEEE Symp. Found.Comput. Sci., 2008, pp. 823–832.[14] R.Ahlswede,N.Cai,S.Y.R.Li, and R. W. Yeung, “Network in<strong>for</strong>mationflow,” IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216,Jul. 2000.[15] R. Koetter and M. Médard, “An algebraic approach to network coding,”IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 782–795, Oct. 2003.[16] W. Haemers, “An upper bound <strong>for</strong> the Shannon capacity of a graph,”Algebr. Methods Graph Theory, vol. 25, pp. 267–272, 1978.[17] R. 
[17] R. Peeters, “Orthogonal representations over finite fields and the chromatic number of graphs,” Combinatorica, vol. 16, no. 3, pp. 417–431, 1996.
[18] R. Peeters, “Uniqueness of strongly regular graphs having minimal p-rank,” Linear Algebra Appl., vol. 226–228, pp. 9–31, 1995.
[19] R. Peeters, “On the p-ranks of net graphs,” Des. Codes Cryptogr., vol. 5, pp. 139–153, 1995.
[20] S. H. Dau, V. Skachek, and Y. M. Chee, “On the security of index coding with side information,” IEEE Trans. Inf. Theory, vol. 58, no. 6, pp. 3975–3988, Jun. 2012.
[21] M. Grassl, Bounds on the Minimum Distance of Linear Codes and Quantum Codes [Online]. Available: http://www.codetables.de
[22] M. Chudnovsky, N. Robertson, P. Seymour, and R. Thomas, “The strong perfect graph theorem,” Ann. Math., vol. 164, pp. 51–229, 2006.
[23] E. R. Berlekamp, R. J. McEliece, and H. C. A. van Tilborg, “On the inherent intractability of certain coding problems,” IEEE Trans. Inf. Theory, vol. IT-24, no. 3, pp. 384–386, May 1978.
[24] B. Chor, O. Goldreich, J. Håstad, J. Freidmann, S. Rudich, and R. Smolensky, “The bit extraction problem or t-resilient functions,” in Proc. 26th Annu. IEEE Symp. Found. Comput. Sci., 1985, pp. 396–407.
[25] C. H. Bennet, G. Brassard, and J. M. Robert, “Privacy amplification by public discussion,” SIAM J. Comput., vol. 17, pp. 210–229, 1988.
[26] C. Carlet, Vectorial Boolean Functions for Cryptography (ser. Boolean Models and Methods in Mathematics, Computer Science and Engineering). Cambridge, U.K.: Cambridge Univ. Press, 2010, ch. 9.
[27] X.-M. Zhang and Y. Zheng, “On nonlinear resilient functions,” in Proc. 14th Annu. Int. Conf. Theory Appl. Cryptogr. Technol., 1995, pp. 274–288.
[28] K. Gupta and P. Sarkar, “Improved construction of nonlinear resilient S-boxes,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 339–348, Jan. 2005.
[29] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam, The Netherlands: North-Holland, 1977.

Son Hoang Dau received the B.S. degree in applied mathematics and informatics from the College of Science, Vietnam National University, Hanoi, Vietnam, in 2006. He received the M.S. degree (2009) and Ph.D. degree (2012) in mathematical sciences from the Division of Mathematical Sciences, Nanyang Technological University, Singapore. During 2011–2012, he held research positions with Nanyang Technological University. He is currently a research fellow at the SUTD-MIT International Design Centre, Singapore University of Technology and Design, Singapore. His research interests are coding theory, network coding, and combinatorics.

Vitaly Skachek received the B.A. (Cum Laude), M.Sc., and Ph.D. degrees in computer science from the Technion—Israel Institute of Technology, in 1994, 1998, and 2007, respectively. In the summer of 2004, he visited the Mathematics of Communications Department at Bell Laboratories under the DIMACS Special Focus Program in Computational Information Theory and Coding. During 2007–2012, he held research positions with the Claude Shannon Institute, University College Dublin; with the School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore; with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana; and with the Department of Electrical and Computer Engineering, McGill University, Montreal. He is now a Senior Lecturer with the Institute of Computer Science, University of Tartu.
Dr. Skachek is a recipient of the Permanent Excellent Faculty Instructor award, given by the Technion.

Yeow Meng Chee (SM’08) received the B.Math. degree in computer science and in combinatorics and optimization, and the M.Math. and Ph.D. degrees in computer science, from the University of Waterloo, Waterloo, ON, Canada, in 1988, 1989, and 1996, respectively. Currently, he is an Associate Professor at the Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore. Prior to this, he was Program Director of Interactive Digital Media R&D at the Media Development Authority of Singapore, Postdoctoral Fellow at the University of Waterloo and IBM’s Zürich Research Laboratory, General Manager of the Singapore Computer Emergency Response Team, and Deputy Director of Strategic Programs at the Infocomm Development Authority, Singapore. His research interest lies in the interplay between combinatorics and computer science/engineering, particularly combinatorial design theory, coding theory, extremal set systems, and electronic design automation.
