Actas JP2011 - Universidad de La Laguna

Actas XXII Jornadas de Paralelismo (JP2011), La Laguna, Tenerife, 7-9 September 2011

Fig. 10. Managing read requests for a private block

Fig. 11. Managing read requests when a migratory sharing pattern has been detected by the L2 cache bank

to the same memory location (variable). Reads and writes are performed sequentially following the migratory sharing pattern. As Table II shows, execution time is roughly halved by the improved coherency protocol (1269949 versus 645099 cycles). Indeed, with the updated protocol practically all write operations are performed in exclusive mode, so they have exclusive access and require no coherence action. The number of packets injected into the network is likewise roughly halved.

TABLE II
Statistics for two different protocol implementations in gMemNoCsim

                        Inv.       Inv. with MS
Cycles                  1269949    645099
Stores in Shd mode      9999       1
Stores in Excl mode     1          9999
Injected packets        229997     110021
L1 misses               19999      10001
L1 hits                 1          9999

V. Conclusions

In this paper we have presented the gMemNoCsim simulator. Memory coherency and the on-chip network can be co-designed with the new tool, as it allows detailed and accurate modeling of both components. Memory controllers are also modeled. Coherency protocols are defined through simple definitions of events, actions, and transitions, coded in a text-based file following simple rules. The network is also defined by many parameters, such as the topology, the routing algorithm, and the pipelined switch design.

gMemNoCsim has been developed to overcome the long simulation times required by complex system-wide environments such as GEMS/SIMICS. With gMemNoCsim, memory access traces are used to speed up simulation. To preserve simulation accuracy, we instrument the traces with memory access dependencies, which temporarily block processors at synchronization events and barriers. As future work, we plan to analyze the impact of the memory dependencies computation on simulation accuracy. We also plan to use the platform as an effective tool for network and memory hierarchy co-design.

Acknowledgment

This work was supported by the Spanish MEC and MICINN, as well as European Commission FEDER funds, under Grant TIN2009-14475-C04-01. It was also partly supported by the project NaNoC (project label 248972), which is funded by the European Commission within the Research Programme FP7.

References

[1] Luca Benini, Giovanni De Micheli. Networks on Chips: Technology and Tools, Academic Press, 2006.
[2] Jose Flich, Davide Bertozzi. Designing Network On-Chip Architectures in the Nanoscale Era, Chapman & Hall/CRC Computational Science Series, 2010.
[3] P. S. Magnusson, M. Christensson, J. Eskilson, et al. Simics: A full system simulation platform, IEEE Computer, 35, Feb. 2002.
[4] M. M. Martin, D. J. Sorin, B. M. Beckmann, et al. Multifacet general execution-driven multiprocessor simulator (GEMS) toolset, Computer Architecture News, 33, 2005.
[5] Niket Agarwal, Li-Shiuan Peh, and Niraj K. Jha. Garnet: A Detailed Interconnect Model Inside a Full-System Simulation Framework, CE-P08-001, Dept. of Electrical Engineering, Princeton University, 2008.
[6] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: Characterization and methodological considerations, 22nd Int. Symp. on Computer Architecture (ISCA), 1995.
[7] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The PARSEC Benchmark Suite: Characterization and Architectural Implications, Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008.
[8] Per Stenstrom, Mats Brorsson, Lars Sandberg. An adaptive cache coherence protocol optimized for migratory sharing, ISCA '93: Proceedings of the 20th Annual International Symposium on Computer Architecture, 1993.
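As an illustration of the migratory-sharing experiment summarized in Table II, the following is a minimal sketch of a toy directory model for a single block that cores read and then write in turn. It is not gMemNoCsim code and does not reproduce its protocol tables. The function `simulate`, the `Stats` fields, the two-core default, and the detection rule (an upgrade request that finds exactly two cached copies, with a different core as last writer, in the spirit of Stenstrom et al. [8]) are all assumptions made for this example. Cycles and injected packets are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    stores_shared: int = 0      # stores issued while the block is in Shared state
    stores_exclusive: int = 0   # stores issued while the block is Exclusive/Modified
    l1_misses: int = 0
    l1_hits: int = 0


def simulate(iterations: int, cores: int = 2, migratory_opt: bool = False) -> Stats:
    """Toy model (assumption, not gMemNoCsim): one block, cores take turns
    performing a load followed by a store to it."""
    state = ["I"] * cores   # per-core L1 state of the block: I, S, E or M
    last_writer = None      # core that performed the most recent store
    migratory = False       # set once the directory detects the pattern
    stats = Stats()

    for i in range(iterations):
        c = i % cores

        # Load
        if state[c] == "I":
            stats.l1_misses += 1
            holders = [k for k in range(cores) if state[k] != "I"]
            if not holders:
                state[c] = "E"              # no other copy: grant exclusive
            elif migratory_opt and migratory:
                for k in holders:           # hand the block over exclusively
                    state[k] = "I"
                state[c] = "E"
            else:
                for k in holders:           # plain invalidation protocol:
                    state[k] = "S"          # previous owner downgrades
                state[c] = "S"
        else:
            stats.l1_hits += 1

        # Store
        if state[c] in ("E", "M"):
            stats.l1_hits += 1
            stats.stores_exclusive += 1
        else:
            # Store to a Shared copy: needs an upgrade, counted as a miss here.
            stats.l1_misses += 1
            stats.stores_shared += 1
            sharers = [k for k in range(cores) if state[k] != "I"]
            if len(sharers) == 2 and last_writer is not None and last_writer != c:
                migratory = True            # assumed detection rule
            for k in sharers:
                if k != c:
                    state[k] = "I"          # invalidate the other copy
        state[c] = "M"
        last_writer = c

    return stats
```

Calling `simulate(10000)` and `simulate(10000, migratory_opt=True)` yields store and L1 hit/miss counts that coincide with the corresponding rows of Table II, assuming the benchmark performs 10000 read-write iterations. In this toy model the single shared-mode store in the optimized run is the one that triggers detection.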
