FY2010 - Oak Ridge National Laboratory


Director’s R&D Fund: Ultrascale Computing and Data Science

Perumalla, K. S. 2010. µπ: A Scalable and Transparent System for Simulating MPI Programs. ICST International Conference on Simulation Tools and Techniques. Torremolinos, Spain.

Perumalla, K. S., and C. Carothers. 2010. Compiler-based Automation Approaches to Reverse Computation. Reverse Computation Workshop (in conjunction with IEEE/ACM/SCS PADS'10). Atlanta, GA, USA, IEEE Computer Society.

Perumalla, K. S., and S. K. Seal. 2010. Reversible Parallel Discrete Event Execution of Large-scale Epidemic Outbreak Models. IEEE/ACM/SCS International Workshop on Principles of Advanced and Distributed Simulation. Atlanta, GA, USA, IEEE Computer Society (Best Paper Finalist).

05550 Computational Biology Toolbox for Ultrascale Computing
Igor B. Jouline, Bhanu Rekepalli, Andrey A. Gorin, and Christian Halloy

Project Description

Insufficient capability to translate the exponentially growing genomic data into useful knowledge is the single most pressing grand challenge in biology. The goal of this project is to dramatically improve biological function prediction by building new and improved models for mining genomic data. This goal will be achieved by using the most sensitive data mining tools, organized in a robust, massively parallel computational infrastructure. We will port these tools to a Cray XT5 supercomputer and adapt their use to developing cloud computing platforms, thus enabling the mining of not only existing genomic data but also future data sets that will be larger by orders of magnitude. The project has two types of deliverables: (i) a newly developed toolbox containing the most useful computational biology software implemented for Cray supercomputers and (ii) a set of new and improved models for biological function prediction that will become available worldwide through major national and international databases.
By investing in this project, ORNL will seize the opportunity to become a leader in ultrascale computational biology and will position our team strategically to compete successfully for major funding from the National Institutes of Health (NIH) and DOE.

Mission Relevance

This project aims at establishing ORNL as a world leader in dynamic knowledge discovery based on capabilities for handling diverse genomic data. It will also contribute to developing focused research communities in biology, because the computational biology toolbox developed by the project will be used by a large community of biomedical scientists. This project also addresses the major problems of DOE (bioenergy) and NIH (human health), because improved biological function prediction is urgently needed to solve these problems.

Results and Accomplishments

Nearly all deliverables planned for Year 1 have been met or exceeded.

BLAST. We optimized the BLAST code. First, we installed the BLAST tool on the Kraken supercomputer and profiled the code to understand its I/O. We optimized the broadcast of the database to all the nodes in the job and then optimized the I/O in two phases. In the first phase, a buffer was created to hold all the input query sequences, and a dynamic load balancing algorithm was designed to distribute the work optimally among all the cores of the node. In the second phase, the output from each core was placed in a separate buffer to keep the results contiguous. All buffers were then combined and sent to the Lustre file system in optimal chunks. This helped us scale the code to 50,000 cores.

HMMER. We made code changes to HMMER 3.0 that achieved a 100× speedup on Kraken. We tested an ideally parallel approach in which each core works on a different dataset and holds the entire reference database in its own memory. With this implementation we were able to scale up to only 16,000 cores, because the bandwidth saturates as communication increases. We improved this implementation by dedicating a node to I/O: all the other nodes send their results to this node, and once its buffer is full the results are sent to the Lustre file system in optimal chunks. With this implementation, parallel HMMER3 achieves near-linear scaling up to 48,000 cores.

05561 Evaluating the Role of Cloud Computing for Scientific Discovery
Rob Gillen and Sudharshan S. Vazhkudai

Project Description

Delivering Open Scientific Interfaces to the Cloud for Climate and Biology. We will make available petascale biology and climate datasets through open cloud interfaces and integrate them with all three major cloud providers (Google, Amazon, and Microsoft). Through a partnership with the vendors and the scientific community, we will develop optimum methods to process these datasets with distributed cloud resources. We will further explore open parallel extensions to all three vendors’ cloud computing APIs.

Mission Relevance

The DOE has an interest in mid-sized computing through the use of cloud computing. Our project specifically targets this area by conducting initial exploratory work to study the suitability of cloud platforms for high-performance computing applications, data, and workloads.
In addition to DOE, the National Science Foundation (NSF) also directs a broad scope of work related to cloud computing that is germane to our proposed work.

Results and Accomplishments

In the area of data movement, we evaluated cloud data movement APIs and enhanced the code of the vendor-supplied storage access libraries. We developed techniques including parallelized file transfers, chunked transfers, cloud-local data proxies, and adaptive compression. We made minor changes to the vendor-supplied libraries, not only to reduce the total number of bits transferred but also to significantly improve bandwidth utilization when accessing an entire file, thereby reducing overall transfer time. We evaluated intra-cloud data access further by studying transfers from EC2 to data stored in S3 and comparing them with transfers from EC2 to a local distributed file system. Finally, we tested and evaluated the existing FUSE over S3 provider and built a FUSE over Azure provider.

Supporting our goal of publishing data in the cloud, we developed a set of tools that, in conjunction with Walrus (the storage component of Eucalyptus, an open source cloud infrastructure), enables an organization to take any data available to the Walrus server and expose it in situ as an Amazon S3-compatible storage endpoint. Additionally, we published a subset of the U.S. contribution to the CMIP3 climate data set and exposed it as an OData service.
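
The data movement techniques named in the 05561 results (chunked transfers, parallelized transfers, and adaptive compression) can be illustrated with a minimal Python sketch. This is not the project's code and does not use any vendor library: an in-memory dictionary stands in for a cloud object store, and CHUNK_SIZE, split_chunks, encode_chunk, upload, and download are hypothetical names chosen for illustration. The "adaptive" rule shown here (keep the compressed form only when it is smaller) is one simple assumption about what such a scheme might do.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024  # hypothetical chunk size; real services impose their own limits


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def encode_chunk(chunk):
    """Adaptive compression: keep the compressed form only if it is smaller."""
    packed = zlib.compress(chunk)
    if len(packed) < len(chunk):
        return True, packed
    return False, chunk


def upload(store, data, workers=4):
    """Send all chunks concurrently; return the number of bytes 'transferred'."""
    def send(item):
        index, chunk = item
        store[index] = encode_chunk(chunk)  # stand-in for a real network request
        return len(store[index][1])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send, enumerate(split_chunks(data))))


def download(store):
    """Reassemble chunks in index order, decompressing where needed."""
    parts = []
    for index in sorted(store):
        compressed, payload = store[index]
        parts.append(zlib.decompress(payload) if compressed else payload)
    return b"".join(parts)


if __name__ == "__main__":
    store = {}
    data = b"ACGT" * 5000 + bytes(range(256)) * 8
    sent = upload(store, data)
    assert download(store) == data
    print(f"{len(data)} byte payload, {sent} bytes transferred")
```

Indexing each chunk lets the transfers complete in any order while the receiver still reassembles the file correctly, which is what makes the per-chunk parallelism safe in this sketch.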