Program 2010 - Facing the Multicore-Challenge

Previous Conference Program 2010

Facing the Multicore-Challenge I
March 17 - 19, 2010
Heidelberg Academy of Sciences

Wednesday, March 17

13:30h - 14:00h	Registration
14:00h - 14:15h	Welcome Hermann H. Hahn, President of the Heidelberg Academy of Sciences; Willy Jäger, Conference mentor
	Tutorials - Afternoon Session
14:15h - 15:00h	Multicore to Manycore: Technologies and Programming Concepts Jan-Philipp Weiss, Karlsruhe Institute of Technology, Germany
15:00h - 15:30h	Coffee Break
15:30h - 16:45h	Programming Ct – Part I: Scaling Towards Future Multicore Michael Klemm, Intel, Germany
16:45h - 17:15h	Coffee Break
17:15h - 18:30h	Programming Ct – Part II: Porting Applications to Multicore Michael Klemm, Intel, Germany

	Evening Session
19:00h - 20:00h	Evening Reception Bel Etage of the Heidelberg Academy of Sciences
20:00h - 20:15h	Conference Opening
20:15h - 21:15h	- Invited Talk - Analyzing Massive Social Networks Using Multicore and Multithreaded Architectures David A. Bader, Georgia Institute of Technology, USA (Abstract)

Thursday, March 18

	Morning Session: Computer Architecture and Parallel Programming
8:30h - 8:50h	Registration
8:50h - 9:00h	Welcome and Information
9:00h - 10:00h	- Invited Talk - MareIncognito: A Perspective Towards Exascale Jesus Labarta, Barcelona Supercomputing Centre, Spain (Abstract)
10:00h - 10:30h	Coffee Break and Poster Session
10:30h - 10:55h	RapidMind: Portability across Architectures and its Limitations Iris Christadler, Leibniz Supercomputing Centre Munich, Germany
10:55h - 11:20h	A Majority-Based Control Scheme for Way-Adaptable Caches Masayuki Sato, Tohoku University, Japan
11:20h - 11:45h	Improved Scalability by Using Hardware-Aware Thread Affinities Sven Mallach, University of Cologne, Germany
11:45h - 12:10h	Thread Creation for Self-aware Parallel Systems Oliver Mattes, Karlsruhe Institute of Technology, Germany

12:10h - 13:50h	Lunch Break Kulturbrauerei Heidelberg

	Afternoon Session: Applications on Multicore I
13:50h - 14:15h	Putting Personality Into High Performance Computing John Leidel, Convey Computer, USA
14:15h - 14:45h	Where Does Manycore Lead Us? Moderated Discussion
14:45h - 15:10h	G-means Improved for Cell BE Environment Aislan Foina, Barcelona Supercomputing Centre, Spain
15:10h - 15:40h	Coffee Break and Poster Session
15:40h - 16:05h	Parallel 3D Multigrid Methods on the STI Cell BE Architecture Fabian Oboril, Karlsruhe Institute of Technology, Germany
16:05h - 16:30h	FPGA vs. Multi-Core CPUs vs. GPUs: Hands-on Experience with a Sorting Application Cristian Grozea, Fraunhofer Institute FIRST Berlin, Germany
16:30h - 17:10h	Short Talks - Session I (see Table at bottom)
17:10h - 17:30h	Coffee Break
17:30h - 18:30h	Short Talks - Session II (see Table at bottom)

18:45h - 20:15h	Guided Tour: Old City of Heidelberg

20:15h - 22:30h	Conference Dinner Vetter's Alt Heidelberger Brauhaus

Friday, March 19

	Morning Session: Applications on Multicore II
8:50h - 9:00h	Welcome and Information
9:00h - 10:00h	- Invited Talk - The Natural Parallelism Robert Strzodka, MPI Informatik, Saarbrücken, Germany (Abstract)
10:00h - 10:30h	Coffee Break and Poster Session (see Table at bottom)
10:30h - 10:55h	Lattice-Boltzmann Simulation of the Shallow-Water Equations with Fluid-Structure Interaction on Multi- and Manycore Processors Markus Geveler, Technical University of Dortmund, Germany
10:55h - 11:20h	Applying Classic Feedback Control for Enhancing the Fault-Tolerance of Parallel Pipeline Workflows on Multi-Core Systems Tudor Ionescu, University of Stuttgart, Germany
11:20h - 12:05h	Programming for Manycore - Challenges and Solutions Moderated Discussion

12:05h - 13:30h	Lunch Break Kulturbrauerei Heidelberg

	Afternoon Session: GPGPU Computing
13:30h - 13:55h	Considering GPGPU for HPC Centers: Is it Worth the Effort? Hans Hacker, Technical University Munich, Germany
13:55h - 14:20h	Real-time Image Segmentation on a GPU Alexey Abramov, University of Goettingen, Germany
14:20h - 14:45h	Parallel Volume Rendering Implementation on Graphics Cards using CUDA Jens Fangerau, University of Heidelberg, Germany
14:45h - 15:00h	Coffee Break
15:00h - 15:50h	Short Talks - Session III (see Table at bottom)
15:50h - 16:00h	Conference Closing and Farewell

Abstracts of Invited Talks

Analyzing Massive Social Networks using Multicore and Multithreaded Architectures
Wednesday, March 17, 20:15h - 21:15h
David A. Bader, Georgia Institute of Technology, USA

Emerging real-world graph problems include detecting community structure in large social networks, improving the resilience of the electric power grid, and detecting and preventing disease in human populations. Unlike traditional applications in computational science and engineering, solving these problems at scale often raises new challenges because of sparsity and the lack of locality in the data, the need for additional research on scalable algorithms and development of frameworks for solving these problems on high performance computers, and the need for improved models that also capture the noise and bias inherent in the torrential data streams. The explosion of real-world graph data poses a substantial challenge: How can we analyze constantly changing graphs with billions of vertices? Our approach leverages the Cray XMT's fine-grained parallelism and flat memory model to scale to massive graphs. On the Cray XMT, our static graph characterization package GraphCT summarizes such massive graphs, and our ongoing STINGER streaming work updates clustering coefficients on massive graphs at a rate of tens of thousands of updates per second.

Short Bio of David A. Bader:
David A. Bader is a Full Professor in the School of Computational Science and Engineering, College of Computing, at Georgia Institute of Technology. Dr. Bader has also served as Director of the Sony-Toshiba-IBM Center of Competence for the Cell Broadband Engine Processor. He received his Ph.D. in 1996 from The University of Maryland and was awarded a National Science Foundation (NSF) Postdoctoral Research Associateship in Experimental Computer Science. He is an NSF CAREER Award recipient, an investigator on several NSF and NIH awards, was a distinguished speaker in the IEEE Computer Society Distinguished Visitors Program, and a member of the IBM PERCS team for the DARPA High Productivity Computing Systems program. Dr. Bader serves on the Research Advisory Council for Internet2, the Steering Committees of the IPDPS and HiPC conferences, and is the General Chair of IPDPS 2010 and Chair of SIAM PP12. He is an associate editor for several high impact publications including the ACM Journal of Experimental Algorithmics (JEA), IEEE DSOnline, and Parallel Computing, has been an associate editor for the IEEE Transactions on Parallel and Distributed Systems (TPDS), is an IEEE Fellow and a Member of the ACM. Dr. Bader's interests are at the intersection of high-performance computing and computational biology and genomics. He has co-chaired a series of meetings, the IEEE International Workshop on High-Performance Computational Biology (HiCOMB), co-organized the NSF Workshop on Petascale Computing in the Biological Sciences, written several book chapters, and co-edited special issues of the Journal of Parallel and Distributed Computing (JPDC) and IEEE TPDS on high-performance computational biology. He has co-authored over 100 articles in peer-reviewed journals and conferences, and his main areas of research are in parallel algorithms, combinatorial optimization, and computational biology and genomics.

MareIncognito: A Perspective Towards Exascale
Thursday, March 18, 9:00h - 10:00h
Jesus Labarta, Barcelona Supercomputing Centre, Spain

MareIncognito is a cooperative project between IBM and the Barcelona Supercomputing Center (BSC) targeting the design of relevant technologies on the way towards exascale. The initial challenge of the project was to study the potential design of a system based on a next generation of Cell processors. Even so, the approaches pursued are general purpose, applicable to a wide range of accelerator-based and homogeneous multicores, and holistically address a large number of components relevant to the design of such systems.
The programming model is probably the most important issue when facing the multicore era. We need to offer support for asynchronous data flow execution and decouple the way the source code looks from the way the program is executed and its operations (tasks) are scheduled. In order to ensure a reasonable migration path for programmers, the execution model should be exposed to them through a syntactical and semantic structure that is not very far away from current practice. We are developing the StarSs programming model, which we think addresses some of the challenges of targeting the future heterogeneous / hierarchical multicore systems at the node level. It also integrates nicely into coarser level programming models such as MPI and, what is more important, in ways that propagate the asynchronous dataflow execution to the whole application. We are also investigating how some of the features of StarSs can be integrated in OpenMP.
At the architecture level, interconnect and memory subsystem are two key components. We are studying in detail the behavior of current interconnect systems and in particular contention issues. The question is to investigate better ways to use the raw bandwidth that we already have in our systems and can expect to grow in the future. Better understanding of the interactions between the raw transport mechanisms, the communication protocols and synchronization behavior of applications should help avoid the exploding need for bandwidth that is often claimed. The use of the asynchronous execution model that StarSs offers can help in this direction, as a very high overlap between communication and computation should be possible. A similar effect of reducing sensitivity to latency, as well as the actual off-chip bandwidth required, should be supported by the StarSs model.
The talk will present how we target the above issues, with special details on the StarSs programming model and the underlying idea of the project: that tight cooperation between architecture, run time, programming model, resource management and application is needed in order to achieve exascale performance in the future.

The Natural Parallelism
Friday, March 19, 9:00h - 10:00h
Robert Strzodka, MPI Informatik, Saarbrücken, Germany

With the advent of multi-core processors, a new, unwanted way of parallel programming is required, which is seen as a major challenge. This talk will argue exactly the opposite: that our accustomed programming paradigm has been unwanted for years and that parallel processing is the natural scheduling and execution model on all levels of hardware.
Sequential processing is a long-outdated, illusory software concept, and we will expose its artificiality and absurdity with appropriate analogies from everyday life. Multi-core appears as a challenge only when looking at it from the crooked illusion of sequential processing. There are other important aspects such as specialization or data movement, and admittedly large-scale parallelism also has some issues, which we will discuss. But the main problem is changing our mindset and helping others to do so with better education, so that parallelism comes to us as a friend and not an enemy.

Short Bio of Robert Strzodka:
Robert Strzodka has been the head of the research group Integrative Scientific Computing at the Max Planck Institute for Computer Science in Saarbrücken since 2007. His research focuses on efficient interactions of mathematical, algorithmic and architectural aspects in heterogeneous high performance computing. Previously, Robert was a visiting assistant professor in computer science at Stanford University and, until 2005, a postdoc at the Center of Advanced European Studies and Research in Bonn. He received his doctorate in numerical mathematics from the University of Duisburg-Essen in 2004.

	Overview of Short Talks
Session I Thursday, March 18, 2010 16:30h - 17:10h
	Performance Modeling and Multicore-aware Optimization for 3D Parallel Lattice Boltzmann Simulations, Johannes Habich, University of Erlangen-Nuremberg, Germany
	A Multicore Implementation of the Lattice Boltzmann Method for Non-uniform Grids, Kostyantyn Kucher, TU Braunschweig, Germany
	A Lattice Boltzmann CUDA-GPU-Implementation on Non-uniform Grids, Martin Schönherr, TU Braunschweig, Germany
	Survey of the QPACE Architecture, Nils Meyer, University of Regensburg, Germany

Session II Thursday, March 18, 2010 17:30h - 18:30h
	A Pipelined, Multicore-aware Approach to Parallel Temporal Blocking of Stencil Codes for Shared and Distributed Memory, Markus Wittman, University of Erlangen-Nuremberg, Germany
	Autotuning Parallel Stencil Computations, Mathias Christen, University of Basel, Switzerland
	Mixed Precision in Computational Fluid Dynamics - An Error Correcting Approach for Solving Linear Systems, Hartwig Anzt, Karlsruhe Institute of Technology, Germany
	MPI, OpenMP and CUDA Approaches for Solving Large Sparse Linear Systems of Equations, Benedikt Galler, Karlsruhe Institute of Technology, Germany
	Large Sparse Exact Matching Algorithms for Massive Graph Analysis, Madan Sathe, University of Basel, Switzerland
	Numerical Simulation in Computational Finance: Option Pricing with Monte Carlo Methods, Philipp Werner, Karlsruhe Institute of Technology, Germany

Session III Friday, March 19, 2010 15:00h - 15:50h
	Applying Software Engineering Methods and Tools to Scientific Research Projects-ATLAS Project, Hoda Naguib, TU Munich, Germany
	Applying Software Engineering Methods and Tools to the SeisSol Project, Yang Li, TU Munich, Germany
	Single Pattern Multi Value LU Decomposition - Basic Ideas and Parallelization, Martin Köhler, TU Chemnitz, Germany
	Efficient Stereo-image-sequence Segmentation on the GPUs, Alexey Abramov, University of Goettingen, Germany
	Moist Planetary Boundary Layer Simulation Using OpenGL and GLSL, Stefan Horn, Leibniz Institute for Tropospheric Research, Leipzig, Germany

Overview of Posters

C.M.E.S.S. Solving Large Scale Matrix Equations on Multicore Processors, Jens Saak, TU Chemnitz, Germany
Efficient Stereo-image-sequence Segmentation on the GPUs, Alexey Abramov, University of Goettingen, Germany
Mixed Precision in Computational Fluid Dynamics - An Error Correcting Approach for Solving Linear Systems, Hartwig Anzt, Karlsruhe Institute of Technology, Germany
MPI, OpenMP and CUDA Approaches for Solving Large Sparse Linear Systems of Equations, Benedikt Galler, Karlsruhe Institute of Technology, Germany
Numerical Simulation in Computational Finance: Option Pricing with Monte Carlo Methods, Philipp Werner, Karlsruhe Institute of Technology, Germany
The QPACE Network Processor, Thilo Maurer, University of Regensburg, Germany
Implementation of a Sparse Linear System Solver Utilizing Commodity GPU Hardware and Application to Combustion Simulation, Markus Meingast, TU Berlin, Germany