Paradyn / Dyninst Week
University of Maryland
College Park, Maryland
April 27-28, 2009




Local Arrangements



Technical Talks (Monday 04/27/09)

Event Location:

Adele H. Stamp Student Union
Conference Map

Scheduled Talks:

Updated 03/31/09

8:00-8:30 Breakfast
8:30-8:45 Welcome, Introductions, and Overview
Jeff Hollingsworth and Bart Miller, Universities of Maryland & Wisconsin

Dyninst as a Binary Rewriter
Matt Legendre, University of Wisconsin

We present a new feature in Dyninst, the static binary rewriter. The binary rewriter allows Dyninst users to operate on binaries on disk, as opposed to the typical case of operating on an in-memory process. Operating on an on-disk binary allows users to separate the act of rewriting a binary from the act of running the rewritten binary. This can be especially advantageous when a binary needs to run on a different system from the one where the rewriting happens. This talk will discuss both the interface and implementation of the binary rewriter.


The Long and (Un)Winding Road
Mike Fagan, Rice University

The HPCToolkit project is a suite of tools for profile-based performance analysis of applications. One of the main tools in the suite is hpcrun: a tool for collecting call path profiles via statistical sampling. To accomplish this task, hpcrun must be able to unwind the call stack from any point in the program, including procedure prologs and epilogs. Accurate unwinding requires accurate information about where the return address, frame pointer, and other procedure linkage information are stored. Since compilers cannot be counted on to supply this information, hpcrun computes the necessary unwind information using binary analysis. Furthermore, the analysis is performed at run-time. This talk outlines the HPCToolkit methods employed to produce the unwind information. The emphasis is on the x86 architecture, but the methods are general.
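
The idea of unwinding from precomputed, per-address information can be illustrated with a small simulation. This is only a sketch under stated assumptions, not hpcrun's implementation: the `Recipe` type, the address ranges, and the dictionary standing in for stack memory are all hypothetical, and a real unwinder would derive the recipes from binary analysis of the machine code.

```python
from bisect import bisect_right
from typing import NamedTuple


class Recipe(NamedTuple):
    """Hypothetical unwind recipe, valid from `start` up to the next recipe."""
    start: int       # first code address the recipe covers
    ra_offset: int   # the return address is stored at sp + ra_offset
    frame_size: int  # the caller's sp is sp + frame_size


def find_recipe(recipes, pc):
    """Return the recipe whose interval contains pc (recipes sorted by start)."""
    starts = [r.start for r in recipes]
    i = bisect_right(starts, pc) - 1
    if i < 0:
        raise LookupError(f"no unwind recipe for pc={pc:#x}")
    return recipes[i]


def unwind(recipes, stack, pc, sp, max_depth=64):
    """Walk a simulated stack and return the pcs from the innermost frame outwards."""
    path = [pc]
    while len(path) < max_depth:
        recipe = find_recipe(recipes, pc)
        pc = stack.get(sp + recipe.ra_offset, 0)
        if pc == 0:  # sentinel: no caller, so this is the outermost frame
            break
        sp += recipe.frame_size
        path.append(pc)
    return path


# Toy program: main occupies 0x100.., f occupies 0x200..
# At 0x200 (f's first instruction) only the return address is on the stack;
# from 0x201 on, f has also pushed the frame pointer.
RECIPES = [
    Recipe(start=0x100, ra_offset=8, frame_size=16),
    Recipe(start=0x200, ra_offset=0, frame_size=8),
    Recipe(start=0x201, ra_offset=8, frame_size=16),
]
STACK = {0x1008: 0x150, 0x1018: 0}  # f returns to 0x150 in main; main has no caller

print([hex(a) for a in unwind(RECIPES, STACK, pc=0x201, sp=0x1000)])
print([hex(a) for a in unwind(RECIPES, STACK, pc=0x200, sp=0x1008)])
```

Both samples, one taken in the prolog and one just after it, resolve to the same caller because each address range carries its own recipe. That is the property a sampling profiler needs in order to unwind from arbitrary points.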



Nick Rutar, University of Maryland

Parallel programs are increasingly being written using programming frameworks and other environments that allow parallel constructs to be programmed with greater ease. The data structures used allow the modeling of complex mathematical structures like linear systems and partial differential equations using high-level programming abstractions. While this allows programmers to model complex systems in a more intuitive way, it also makes the debugging and profiling of these systems more difficult due to the complexity of mapping these high-level abstractions down to the low-level parallel programming constructs. This work discusses a mechanism, called variable blame, for creating these mappings and using them to assist in the profiling and debugging of programs created using advanced parallel programming techniques. We discuss a prototype implementation of the system and use this system in the profiling of three programs.

10:15-10:45 Break

Using MRNet and StackwalkerAPI to Deliver Scalable Analysis of Crashing Applications On Cray XT Systems
Bob Moench, Cray

As HPC systems have gotten ever larger, the amount of information associated with a failing parallel application has grown beyond what the beleaguered applications developer has the time, resources, and wherewithal to analyze. The "complete" picture, delivered by a core file for each of tens of thousands of processes, swamps both the hardware and the user's comprehension. Yet the single core file of the original failing process is often not sufficient to study the problem. What is presented here is a middle ground that is both manageable and sufficient.


Integrating Compiler Optimization into Active Harmony
Ananta Tiwari, University of Maryland

We describe a scalable and general-purpose framework for auto-tuning compiler-generated code. We combine Active Harmony's parallel search backend with the CHiLL compiler transformation framework to generate, in parallel, a set of alternative implementations of computation kernels and automatically select the best-performing one. The resulting system achieves performance of compiler-generated code comparable to the fully automated version of the ATLAS library for the tested kernels. Performance for various kernels is 1.4 to 3.6 times faster than the native Intel compiler without search. Our search algorithm simultaneously evaluates different combinations of compiler optimizations and converges to solutions in only a few tens of search steps.
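
The general shape of such a search, evaluating several candidate configurations at once and moving toward the best one, can be sketched as follows. This is a minimal illustration under stated assumptions, not Active Harmony's search algorithm or the CHiLL interface: the parameter names (tile size, unroll factor), the `measure` cost model, and the simple neighborhood search are all hypothetical stand-ins for generating, compiling, and timing real code variants.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tuning space: loop tile size and unroll factor.
TILES = [8, 16, 32, 64, 128]
UNROLLS = [1, 2, 4, 8]


def measure(config):
    """Stand-in for compiling and timing one variant (made-up cost model)."""
    tile, unroll = config
    return abs(tile - 32) / 8 + abs(unroll - 4)


def neighbors(config):
    """Configurations one step away along either parameter axis."""
    ti, ui = TILES.index(config[0]), UNROLLS.index(config[1])
    result = []
    for dt, du in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        t, u = ti + dt, ui + du
        if 0 <= t < len(TILES) and 0 <= u < len(UNROLLS):
            result.append((TILES[t], UNROLLS[u]))
    return result


def tune(start, max_steps=20):
    """Evaluate all neighbors of the current best in parallel at each step."""
    best, best_cost = start, measure(start)
    with ThreadPoolExecutor() as pool:
        for _ in range(max_steps):
            candidates = neighbors(best)
            costs = list(pool.map(measure, candidates))
            cost, candidate = min(zip(costs, candidates))
            if cost >= best_cost:  # no neighbor improves: stop
                break
            best, best_cost = candidate, cost
    return best, best_cost


print(tune((8, 1)))
```

In a real auto-tuner each call to `measure` would be an expensive compile-and-run cycle, which is why evaluating candidates in parallel and converging in few steps matters.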

11:45-12:15 Towards an Autonomous MRNet
Dorian Arnold, University of New Mexico

Until recently, MRNet has only supported static network topologies specified, to varying extents, by the MRNet-enabled tool/application developer or user. In this talk, I will discuss recent extensions to the MRNet infrastructure made to support fault tolerance and dynamic topology configurations. I will also talk about current and future research in run-time monitoring, modeling, and reconfiguration that will eventually lead to an MRNet infrastructure that autonomously adapts its configuration to address functional and performance failures as well as changes in offered loads.

12:15-1:30 Lunch
1:30-2:30 The Deconstruction of Dyninst: Current, New, and Upcoming Components
Bill Williams, Matt Legendre, and Nate Rosenblum, University of Wisconsin

Porting Dyninst to VxWorks
Ray Chen, University of Maryland

How do you get Dyninst running in environments such as embedded systems, where having the smallest possible footprint is a priority?

When structures typically available in traditional *nix systems, such as a user shell, file system, or virtual memory, are missing, Dyninst must be modified to meet these unique demands. This talk outlines the challenges of porting Dyninst to VxWorks: an operating system built on a highly scalable, deterministic, hard real-time kernel.

3:00-3:30 Break

Dynamic Instrumentation of Dynamically Changing Code
Kevin Roundy, University of Wisconsin

One of the biggest advantages that Dyninst provides with respect to other binary instrumentation tools is its ability to statically analyze the code in the binary and provide a control flow graph as a guide to instrumenting the program. Unfortunately, most malware binaries cannot be instrumented through this approach because they make much of their code statically unanalyzable. The most prevalent anti-analysis technique in use by malware is "code packing", wherein all or part of the binary's code is compressed (or encrypted) and packaged with a loop that decompresses it into memory at runtime. Malware binaries can cause Dyninst's analysis to be incorrect as well, primarily by overwriting code that has already been analyzed. Our approach to these anti-analysis techniques is to discover dynamically generated and modified code at runtime and update Dyninst's control flow graph representation of the program in response to changes in its code. By updating our analysis we are able to give Dyninst-based analysis and instrumentation tools access to malicious binary code, despite its being dynamically generated or modified.
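
The two behaviors described above, an unpacking loop that writes code into memory at runtime and analysis results that go stale when code is overwritten, can be mimicked in a few lines. This is a toy sketch under stated assumptions and does not reflect how Dyninst detects code changes: the `CodeTracker` class, the XOR "packer", and the digest comparison are hypothetical, and a real tool would typically detect writes as they happen rather than by rehashing memory.

```python
import hashlib


class CodeTracker:
    """Hypothetical record of which memory regions have been analyzed."""

    def __init__(self, memory: bytearray):
        self.memory = memory
        self.analyzed = {}  # (start, end) -> digest of the bytes at analysis time

    def _digest(self, start, end):
        return hashlib.sha256(bytes(self.memory[start:end])).hexdigest()

    def analyze(self, start, end):
        """Stand-in for parsing the region; here it only records a digest."""
        self.analyzed[(start, end)] = self._digest(start, end)

    def stale_regions(self):
        """Regions whose bytes changed since they were analyzed."""
        return [r for r, d in self.analyzed.items() if self._digest(*r) != d]


def unpack(memory, src, dst, length, key):
    """Toy unpacking loop: XOR-decode `length` bytes from src into dst."""
    for i in range(length):
        memory[dst + i] = memory[src + i] ^ key


KEY = 0x5A
PAYLOAD = b"\x90\x90\xc3\x00"  # arbitrary bytes standing in for real code

memory = bytearray(32)
memory[0:4] = bytes(b ^ KEY for b in PAYLOAD)  # packed payload at offset 0

tracker = CodeTracker(memory)
tracker.analyze(16, 20)          # analysis done before unpacking runs
unpack(memory, 0, 16, 4, KEY)    # the program rewrites its own code
print(tracker.stale_regions())   # the analyzed region is now out of date
tracker.analyze(16, 20)          # re-analyze to bring the record up to date
print(tracker.stale_regions())
```

The point of the sketch is the update step: once a change is detected, the stale region is re-analyzed so that later instrumentation works from the code that is actually in memory.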


Parsing Stripped Code and Compiler Identification
Nate Rosenblum, University of Wisconsin

Program binaries, contrary to expectations, contain a wealth of information that can be used to infer various properties of the binary's provenance, such as the details of the tool chain that produced it. One direct application of provenance characteristics is identification of the source compiler that produced a binary. We describe the challenges of robust parsing for instrumentation purposes that necessitate compiler-specific information, and introduce a model for binary code that facilitates compiler identification. Our model replaces heuristic solutions to finding code in multi-compiler binaries, allowing more accurate, fine-grained identification of code in stripped binaries. Our compiler identification techniques are general, and are actively being extended to investigate recovery of other types of provenance data.


Cinquecento: A Programming Language for Debugging
Vic Zandy & Dan Ridge, IDA CCS

Subtle bugs in complex software systems, especially distributed systems, are often difficult to reproduce and diagnose manually. We have developed a programming language, called Cinquecento, for writing programs that help programmers debug software systems.

The key innovations of Cinquecento are a first-class abstraction that represents a program in execution, a C-based syntactic interface to this abstraction, and language mechanisms for tailoring new instances of this abstraction to unanticipated, heterogeneous environments. These novel features are embedded in an otherwise simple dynamic language with conventional functional semantics.

In this talk, we'll introduce and illustrate the main ideas of Cinquecento.

5:00-5:15 Wrap-up
Jeff & Bart