
Actas JP2011 - Universidad de La Laguna


Proceedings of the XXII Jornadas de Paralelismo (JP2011), La Laguna, Tenerife, 7-9 September 2011

        LI    $11, 5            // set max. retries = 5
        LI    $13, HW_OFLOW     // reg 13 has err. code
        J     $TX
$ABORT:
        MFTM  $12, $TM2         // check error code
        BEQ   $12, $13, $ERR    // jump if HW overflow
        ADDIU $10, $10, 1       // retries++
        SLTU  $12, $10, $11     // max. retries?
        BEQZ  $12, $ERR2        // jump if max. retries
$TX:
        XBEGIN($ABORT)          // provide abort address
        XLW   $8, 0($a0)        // transactional LD word
        ADDI  $8, $8, 1         // a++
        XSW   $8, 0($a0)        // transactional ST word
        XCOMMIT                 // if abort go to $ABORT

Fig. 2. TMbox MIPS assembly for atomic{a++} (NOPs and branch delay slots are not included).

... read with the MFTM (move from TM) instruction. The $TM0 register contains the abort address, $TM1 holds a copy of the stack pointer to be restored when a transaction is restarted, $TM2 contains the bit field for the abort cause (overflow, contention or explicit), and $TM3 stores a 20-bit abort code provided by TinySTM, e.g. an abort due to a malloc, syscall or interrupt inside a transaction, or because the maximum number of retries was reached.

Aborts in TMbox are processed like an interrupt, but they do not cause any traps; instead, they jump to the abort address and restore $sp (the stack pointer) in order to restart the transaction. Regular loads and stores should not be used with addresses previously accessed in transactional mode; it is therefore left to the software to provide isolation of transactional data if desired.
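The decision taken on the $ABORT path of the Fig. 2 listing can be restated as a minimal Python sketch. The function name `on_abort`, the numeric value of `HW_OFLOW`, and the assumption that the retry counter ($10) starts at zero are illustrative only; the source names the constant but does not give its value or show the counter's initialization.

```python
MAX_RETRIES = 5   # from Fig. 2: LI $11, 5
HW_OFLOW = 0x1    # assumed value; the source only names the constant


def on_abort(tm2, retries):
    """Mirror the $ABORT path of Fig. 2 (hypothetical helper).

    Returns (action, retries), where action is "overflow_error",
    "max_retries_error", or "retry".
    """
    if tm2 == HW_OFLOW:               # BEQ $12, $13, $ERR
        return "overflow_error", retries
    retries += 1                      # ADDIU $10, $10, 1
    if not retries < MAX_RETRIES:     # SLTU, then BEQZ -> $ERR2
        return "max_retries_error", retries
    return "retry", retries           # fall through to $TX
```

Under these assumptions, a counter that starts at zero lets the transaction run at most five times: the first four aborts fall through to $TX again, and the fifth takes the $ERR2 branch.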
LL/SC can be used simultaneously with TM provided that they do not access the same address.

Figure 2 shows an atomic increment in TMbox MIPS assembly. In this simple example, the abort code is responsible for checking whether the transaction has been retried the maximum number of times and whether there is a hardware overflow (the TM cache is full); in either case it jumps to error-handling code (not shown).

B. Bus Extensions

To support HTM, we added a new type of request, COMMIT REQ, and a new response type, LOCK BUS. When a commit request arrives at the DDR, it causes a backwards LOCK BUS message on the ring, which destroys any incoming write requests from the opposite direction and locks the bus to grant exclusive access for a serialized commit. All writes are then committed through the "channel" thus created, after which the bus is unlocked with another LOCK BUS message and normal operation resumes. More efficient schemes can be supported in the future to enable parallel commits [3].

C. Cache Extensions

Fig. 3. Cache state diagram. Some transitions (LL/SC) are not shown for visibility. [Diagram states: Ready, RDcheck, WRcheck, UncachedRD, UncachedWR, WaitMemRD, WRback, WaitMemWR, TMbusCheck, TMlockBus, TMwrite.]

The cache state machine reuses the same hardware for transactional and non-transactional loads and stores; however, a transactional bit dictates whether the line should go to the TM cache. Apart from regular cached RD/WR, uncached accesses are also supported, as shown in Figure 3. A cache miss first makes a memory read request to bring in the line and waits in the WaitMemRD state. In the case of a store, the WRback and WaitMemWR states manage the memory write operations. While in these two states, if an invalidation arrives for the same address, the write is cancelled. In the case of a store-conditional instruction, the write is not re-issued and the LL/SC fails.
Otherwise, the cache FSM will re-issue the write after such a write cancellation on invalidation.

While processing a transactional store inside an atomic block, an incoming invalidation to the same address causes an abort and possibly a restart of the transaction. Currently our HTM system supports lazy version management: memory is updated at commit time, at the end of a transaction, as opposed to updating in place and keeping an undo log for aborts. We also provide lazy conflict detection, which implies that data inconsistencies are detected only after the speculative data is committed to memory. Each successfully committed transactional write causes an invalidation signal, which aborts the transactions that already have those lines in the TM cache. A transaction can therefore only be aborted due to data conflicts during its execution (between XBEGIN and XCOMMIT/XABORT).

To support HTM, the cache state machine is extended with three new states: TMbusCheck, TMlockBus and TMwrite. One added function is to dictate the locking of the bus prior to committing. Another is to perform burst writes on a successful commit, which runs through the TMwrite-WRback-WaitMemWR-TMwrite loop. The TMwrite state is also responsible for the gang
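The lazy version management and commit-time invalidation described in this section can be modeled in software as a rough sketch. The `Transaction` class, its `tracks` method, and the `commit` function are hypothetical names for illustration, not TMbox interfaces; a plain dictionary stands in for memory, and the bus lock is assumed to be represented simply by `commit` running as one uninterrupted call.

```python
class Transaction:
    """Toy model of a lazily versioned transaction (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory      # shared dict: address -> value
        self.read_set = set()     # addresses read transactionally
        self.write_buffer = {}    # speculative stores, invisible until commit
        self.aborted = False

    def load(self, addr):
        self.read_set.add(addr)
        if addr in self.write_buffer:      # read our own speculative value
            return self.write_buffer[addr]
        return self.memory.get(addr, 0)

    def store(self, addr, value):
        self.write_buffer[addr] = value    # memory is not touched yet

    def tracks(self, addr):
        return addr in self.read_set or addr in self.write_buffer


def commit(tx, others):
    """Write back tx's buffer; abort other transactions holding those lines."""
    if tx.aborted:
        return False
    for addr, value in tx.write_buffer.items():
        tx.memory[addr] = value            # memory updated only at commit
        for other in others:
            if other.tracks(addr):         # models the invalidation signal
                other.aborted = True
    return True
```

For example, if two such transactions both increment the same address and the first one commits, the second is marked aborted and its buffered store never reaches memory, so it has to be re-executed.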
