Previous Conference Program 2010

Facing the Multicore-Challenge I

March 17 - 19, 2010
Heidelberg Academy of Sciences
                       



Wednesday, March 17


13:30h - 14:00h Registration
14:00h - 14:15h  Welcome
Hermann H. Hahn, President of the Heidelberg Academy of Sciences
Willy Jäger, Conference mentor

Tutorials - Afternoon Session
14:15h - 15:00h

Multicore to Manycore: Technologies and Programming Concepts
Jan-Philipp Weiss, Karlsruhe Institute of Technology, Germany

15:00h - 15:30h
Coffee Break
15:30h - 16:45h

Programming Ct – Part I: Scaling Towards Future Multicore
Michael Klemm, Intel, Germany

16:45h - 17:15h
Coffee Break
17:15h - 18:30h

Programming Ct – Part II: Porting Applications to Multicore
Michael Klemm, Intel, Germany


   Evening Session
19:00h - 20:00h

Evening Reception
Bel Etage of the Heidelberg Academy of Sciences                                   

20:00h - 20:15h
Conference Opening
20:15h - 21:15h

- Invited Talk -
Analyzing Massive Social Networks Using
Multicore and Multithreaded Architectures

David A. Bader, Georgia Institute of Technology, USA

(Abstract)



Thursday, March 18

  Morning Session: Computer Architecture and Parallel Programming
 8:30h - 8:50h
Registration
 8:50h - 9:00h
Welcome and Information
 9:00h - 10:00h 

- Invited Talk -
MareIncognito: A Perspective Towards Exascale
Jesus Labarta, Barcelona Supercomputing Centre, Spain

(Abstract)

10:00h - 10:30h
Coffee Break and Poster Session
10:30h - 10:55h 

RapidMind: Portability across Architectures and its Limitations
Iris Christadler, Leibniz Supercomputing Centre Munich, Germany

10:55h - 11:20h

A Majority-Based Control Scheme for Way-Adaptable Caches
Masayuki Sato, Tohoku University, Japan

11:20h - 11:45h

Improved Scalability by Using Hardware-Aware Thread Affinities
Sven Mallach, University of Cologne, Germany

11:45h - 12:10h

Thread Creation for Self-aware Parallel Systems
Oliver Mattes, Karlsruhe Institute of Technology, Germany



12:10h - 13:50h Lunch Break
Kulturbrauerei Heidelberg                                                                                                       


   Afternoon Session: Applications on Multicore I                                    
13:50h - 14:15h
Putting Personality Into High Performance Computing
John Leidel, Convey Computer, USA

14:15h - 14:45h

Where Does Manycore Lead Us?
Moderated Discussion

14:45h - 15:10h

G-means Improved for Cell BE Environment
Aislan Foina, Barcelona Supercomputing Centre, Spain

15:10h - 15:40h Coffee Break and Poster Session
15:40h - 16:05h

Parallel 3D Multigrid Methods on the STI Cell BE Architecture
Fabian Oboril, Karlsruhe Institute of Technology, Germany

16:05h - 16:30h

FPGA vs. Multi-Core CPUs vs. GPUs: 
Hands-on Experience with a Sorting Application
Cristian Grozea, Fraunhofer Institute FIRST Berlin, Germany

16:30h - 17:10h

Short Talks - Session I (see Table at bottom)

17:10h - 17:30h
Coffee Break
17:30h - 18:30h
Short Talks - Session II (see Table at bottom)

18:45h - 20:15h  Guided Tour: Old City of Heidelberg                                                                                

20:15h - 22:30h  Conference Dinner
Vetter's Alt Heidelberger Brauhaus                                                                                                     


Friday, March 19

 

Morning Session: Applications on Multicore II                    

 8:50h - 9:00h           Welcome and Information
 9:00h - 10:00h

- Invited Talk -
The Natural Parallelism
Robert Strzodka, MPI Informatik, Saarbrücken, Germany

(Abstract)

10:00h - 10:30h
Coffee Break and Poster Session (see Table at bottom)
10:30h - 10:55h

Lattice-Boltzmann Simulation of the Shallow-Water Equations
with Fluid-Structure Interaction on Multi- and Manycore Processors
Markus Geveler, Technical University of Dortmund, Germany

10:55h - 11:20h

Applying Classic Feedback Control for Enhancing the Fault-Tolerance
of Parallel Pipeline Workflows on Multi-Core Systems

Tudor Ionescu, University of Stuttgart, Germany

11:20h - 12:05h

Programming for Manycore - Challenges and Solutions                                                                                
Moderated Discussion


12:05h - 13:30h  Lunch Break
Kulturbrauerei Heidelberg                                                                                                                                    

  Afternoon Session: GPGPU Computing      
13:30h - 13:55h  Considering GPGPU for HPC Centers: Is it Worth the Effort?                                        
Hans Hacker, Technical University Munich, Germany  
13:55h - 14:20h  Real-time Image Segmentation on a GPU
Alexey Abramov, University of Goettingen, Germany  
14:20h - 14:45h Parallel Volume Rendering Implementation on
Graphics Cards using CUDA

Jens Fangerau, University of Heidelberg, Germany
 
14:45h - 15:00h
Coffee Break
15:00h - 15:50h Short Talks - Session III (see Table at bottom)
15:50h - 16:00h Conference Closing and Farewell


 Abstracts of Invited Talks

Analyzing Massive Social Networks using Multicore and Multithreaded Architectures
Wednesday, March 17, 20:15h - 21:15h
David A. Bader, Georgia Institute of Technology, USA

Emerging real-world graph problems include detecting community structure in large social networks, improving the resilience of the electric power grid, and detecting and preventing disease in human populations.  Unlike traditional applications in computational science and engineering, solving these problems at scale often raises new challenges because of sparsity and the lack of locality in the data, the need for additional research on scalable algorithms and development of frameworks for solving these problems on high performance computers, and the need for improved models that also capture the noise and bias inherent in the torrential data streams.  The explosion of real-world graph data poses a substantial challenge: How can we analyze constantly changing graphs with billions of vertices?  Our approach leverages the Cray XMT's fine-grained parallelism and flat memory model to scale to massive graphs.  On the Cray XMT, our static graph characterization package GraphCT summarizes such massive graphs, and our ongoing STINGER streaming work updates clustering coefficients on massive graphs at a rate of tens of thousands of updates per second.

Short Bio of David A. Bader:
David A. Bader is a Full Professor in the School of Computational Science and Engineering, College of Computing, at Georgia Institute of Technology.  Dr. Bader has also served as Director of the Sony-Toshiba-IBM Center of Competence for the Cell Broadband Engine Processor.  He received his Ph.D. in 1996 from The University of Maryland and was awarded a National Science Foundation (NSF) Postdoctoral Research Associateship in Experimental Computer Science.  He is an NSF CAREER Award recipient, an investigator on several NSF and NIH awards, was a distinguished speaker in the IEEE Computer Society Distinguished Visitors Program, and was a member of the IBM PERCS team for the DARPA High Productivity Computing Systems program.  Dr. Bader serves on the Research Advisory Council for Internet2 and the Steering Committees of the IPDPS and HiPC conferences, and is the General Chair of IPDPS 2010 and Chair of SIAM PP12.  He is an associate editor for several high-impact publications including the ACM Journal of Experimental Algorithmics (JEA), IEEE DSOnline, and Parallel Computing, has been an associate editor for the IEEE Transactions on Parallel and Distributed Systems (TPDS), and is an IEEE Fellow and a Member of the ACM.  Dr. Bader's interests are at the intersection of high-performance computing and computational biology and genomics.  He has co-chaired a series of meetings, the IEEE International Workshop on High-Performance Computational Biology (HiCOMB), co-organized the NSF Workshop on Petascale Computing in the Biological Sciences, written several book chapters, and co-edited special issues of the Journal of Parallel and Distributed Computing (JPDC) and IEEE TPDS on high-performance computational biology.  He has co-authored over 100 articles in peer-reviewed journals and conferences, and his main areas of research are in parallel algorithms, combinatorial optimization, and computational biology and genomics.
 

MareIncognito: A Perspective Towards Exascale
Thursday, March 18, 9:00h - 10:00h
Jesus Labarta, Barcelona Supercomputing Centre, Spain

MareIncognito is a cooperative project between IBM and the Barcelona Supercomputing Center (BSC) targeting the design of relevant technologies on the way towards exascale. The initial challenge of the project was to study the potential design of a system based on a next generation of Cell processors. Even so, the approaches pursued are general purpose, applicable to a wide range of accelerator and homogeneous multicores and holistically addressing a large number of components relevant in the design of such systems.
The programming model is probably the most important issue when facing the multicore era. We need to offer support for asynchronous data flow execution and to decouple how the source code looks from how the program is executed and its operations (tasks) are scheduled. In order to ensure a reasonable migration path for programmers, the execution model should be exposed to them through a syntactic and semantic structure that is not far from current practice. We are developing the StarSs programming model, which we think addresses some of the challenges of targeting future heterogeneous / hierarchical multicore systems at the node level. It also integrates nicely into coarser-level programming models such as MPI and, more importantly, does so in ways that propagate the asynchronous dataflow execution to the whole application. We are also investigating how some of the features of StarSs can be integrated into OpenMP.
At the architecture level, the interconnect and the memory subsystem are two key components. We are studying in detail the behavior of current interconnect systems, in particular contention issues. The question is how to make better use of the raw bandwidth that we already have in our systems and can expect to grow in the future. A better understanding of the interactions between the raw transport mechanisms, the communication protocols, and the synchronization behavior of applications should help avoid the exploding need for bandwidth that is often claimed. The asynchronous execution model that StarSs offers can help in this direction, as a very high overlap between communication and computation should be possible. The StarSs model should similarly reduce sensitivity to latency as well as the off-chip bandwidth actually required.
The talk will present how we target the above issues, with special detail on the StarSs programming model and on the project's underlying idea: that tight cooperation between architecture, runtime, programming model, resource management, and application is needed to achieve exascale performance in the future.

 

The Natural Parallelism
Friday, March 19, 9:00h - 10:00h
Robert Strzodka, MPI Informatik, Saarbrücken, Germany

With the advent of multi-core processors, a new, unwanted way of parallel programming is required, which is seen as a major challenge. This talk will argue in exactly the opposite direction: our accustomed programming paradigm has been unwanted for years, and parallel processing is the natural scheduling and execution model on all levels of hardware.
Sequential processing is a long-outdated, illusory software concept, and we will expose its artificiality and absurdity with appropriate analogies from everyday life. Multi-core appears as a challenge only when looking at it from the crooked illusion of sequential processing. There are other important aspects such as specialization or data movement, and admittedly large-scale parallelism also has some issues, which we will discuss. But the main problem is changing our mindset and helping others to do so with better education, so that parallelism comes to us as a friend and not an enemy.

Short Bio of Robert Strzodka:
Robert Strzodka has been the head of the research group Integrative Scientific Computing at the Max Planck Institute for Computer Science in Saarbrücken since 2007. His research focuses on efficient interactions of mathematical, algorithmic and architectural aspects in heterogeneous high performance computing. Previously, Robert was a visiting assistant professor in computer science at Stanford University and until 2005 a postdoc at the Center of Advanced European Studies and Research in Bonn. He received his doctorate in numerical mathematics from the University of Duisburg-Essen in 2004.





Overview of Short Talks 
Session I
Thursday, March 18, 2010
16:30h - 17:10h
  • Performance Modeling and Multicore-aware Optimization for 3D Parallel Lattice Boltzmann Simulations, Johannes Habich, University of Erlangen-Nuremberg, Germany
  • A Multicore Implementation of the Lattice Boltzmann Method for Non-uniform Grids, Kostyantyn Kucher, TU Braunschweig, Germany
  • A Lattice Boltzmann CUDA-GPU-Implementation on Non-uniform Grids, Martin Schönherr, TU Braunschweig, Germany
  • Survey of the QPACE Architecture, Nils Meyer, University of Regensburg, Germany
Session II
Thursday, March 18, 2010
17:30h - 18:30h
  • A Pipelined, Multicore-aware Approach to Parallel Temporal Blocking of Stencil
    Codes for Shared and Distributed Memory, Markus Wittmann, University of Erlangen-Nuremberg, Germany
  • Autotuning Parallel Stencil Computations, Mathias Christen, University of Basel, Switzerland
  • Mixed Precision in Computational Fluid Dynamics - An Error Correcting Approach
    for Solving Linear Systems, Hartwig Anzt, Karlsruhe Institute of Technology, Germany
  • MPI, OpenMP and CUDA Approaches for Solving Large Sparse Linear Systems of
    Equations, Benedikt Galler, Karlsruhe Institute of Technology, Germany
  • Large Sparse Exact Matching Algorithms for Massive Graph Analysis, Madan Sathe, University of Basel, Switzerland
  • Numerical Simulation in Computational Finance: Option Pricing with Monte Carlo
    Methods, Philipp Werner, Karlsruhe Institute of Technology, Germany
Session III
Friday, March 19, 2010
15:00h - 15:50h
  • Applying Software Engineering Methods and Tools to Scientific Research
    Projects - ATLAS Project, Hoda Naguib, TU Munich, Germany
  • Applying Software Engineering Methods and Tools to the SeisSol Project, Yang Li, TU Munich, Germany 
  • Single Pattern Multi Value LU Decomposition - Basic Ideas and Parallelization, Martin Köhler, TU Chemnitz, Germany
  • Efficient Stereo-image-sequence Segmentation on the GPUs, Alexey Abramov, University of Goettingen, Germany
  • Moist Planetary Boundary Layer Simulation Using OpenGL and GLSL, Stefan Horn, Leibniz Institute for Tropospheric Research, Leipzig, Germany


Overview of Posters 
  • C.M.E.S.S. Solving Large Scale Matrix Equations on Multicore Processors, Jens Saak, TU Chemnitz, Germany
  • Efficient Stereo-image-sequence Segmentation on the GPUs, Alexey Abramov, University of Goettingen, Germany
  • Mixed Precision in Computational Fluid Dynamics - An Error Correcting Approach for Solving Linear Systems, Hartwig Anzt, Karlsruhe Institute of Technology, Germany
  • MPI, OpenMP and CUDA Approaches for Solving Large Sparse Linear Systems of Equations, Benedikt Galler, Karlsruhe Institute of Technology, Germany
  • Numerical Simulation in Computational Finance: Option Pricing with Monte Carlo Methods, Philipp Werner, Karlsruhe Institute of Technology, Germany
  • The QPACE Network Processor, Thilo Maurer, University of Regensburg, Germany
  • Implementation of a Sparse Linear System Solver Utilizing Commodity GPU Hardware and Application to Combustion Simulation, Markus Meingast, TU Berlin, Germany