NSF Progress Report March 2000:

(6 months into the project)
 

Overview:

The NSF KDI grant PHY 99-79985 was awarded to Rutgers University, University of Chicago, University of Illinois and Washington University for the development of an Astrophysics Simulation Collaboratory (ASC), to enable large-scale simulations in astrophysics. The driving application is the simulation of accretion-induced collapses of neutron stars. The start date of the KDI grant was September 1, 1999.
 

Management and Work Plan Established September 1999:


Management Plan:

Ian Foster: Overseeing Grid computation aspects (Globus) of the  ASC

Manish Parashar: Overseeing AMR (GrACE) development and its integration into the ASC

Mike Norman: Overseeing integration of Zeus and astrophysics applications into the ASC

John Shalf: Overseeing visualization development and its integration into the ASC

Ed Seidel: Overseeing development and integration of the Cactus toolkit into the ASC, and the interface with the Gigabit and EU network projects

Wai-Mo Suen: Overall management of the project, and overseeing general relativistic neutron star application code  development


Work Team Assembled:



Washington University in St. Louis

  Mark Miller; Research Scientist,
                          Area of responsibility: General Relativistic Hydrodynamics Treatments, Hyperbolic Treatments

  Malcolm Tobias; Research Associate,
                        Area of responsibility: Initial data Treatment (IVP), Wash U system Administration

 Achamveedu Gopakumar; Research Associate,
                        Area of responsibility: Equation of State Table coupling

  Edwin Evans; Graduate Student,
                        Area of responsibility: AMR Application code development

  Philip Gressman; Student,
                         Area of responsibility: Newtonian Shock-Capturing Code development, NS GR simulations



University of Chicago:

 Matei Ripeanu; Graduate Student,
                          Area of responsibility: Metacomputing in ASC



University of Illinois:

 Gregory Daues; Research Associate,
                          Area of responsibility: ASC GUI Development

 Pakshing Li; Research Associate,
                          Area of responsibility: Zeus integration and application code development

 Galina Pushkareva; Student,
                           Area of responsibility: Visualization development (LCA Vision)

 Brad Miksa; Student,
                           Area of Responsibility: Visualization development (LCA Vision), web interface



Rutgers University:

 Snigda Verma; Graduate Student,
                         Area of responsibility:  GrACE API

 Sivapriya Ramanathan; Graduate Student,
                         Area of responsibility: Runtime for Scalable Distributed AMR



Progress in Building AMR Capability in the ASC:

The year one goal of the ASC AMR effort is the integration of the AMR (Adaptive Mesh Refinement)
data-management framework into the ASC to enable scalable, distributed 3D AMR capability within the
central code. This capability is being built on the GrACE computational engine. The
AMR effort consists of the  ASC AMR driver PAGH (Parallel Adaptive Grid Hierarchy), and
the adaptive GrACE runtime for performance and scalability management. The current
status and future plans for these efforts are outlined below.

GRACE AND PAGH: THE ASC AMR DATA-MANAGEMENT AND DRIVER LAYERS

GrACE is an adaptive computational and data-management engine for enabling
distributed adaptive mesh-refinement computations on structured grids, and forms the
basis of the  ASC AMR effort. It builds on the DAGH infrastructure and extends it to
provide a virtual, semantically specialized distributed (and dynamic) shared memory
infrastructure with distributed objects specialized to adaptive grid hierarchies and
grid functions.  For more information, see www.caip.rutgers.edu/TASSL/GrACE.

PAGH (Parallel Adaptive Grid Hierarchy), developed by Erik Schnetter working with others in the ASC project (Evans, Goodale and Parashar), is a GrACE-based driver routine designed and developed for the ASC. It provides an interface for memory management and I/O for the grid functions used in a simulation. On parallel machines, the driver also manages necessary communication between the individual processing nodes.

PAGH replaces the standard Cactus driver PUGH (Parallel Uniform Grid
Hierarchy) which is limited to uniform structured grids. PAGH extends the driver to
distributed adaptive grid hierarchies. It also provides facilities for adaptive mesh
refinement (AMR) using the Berger-Oliger AMR formulation, where the refined regions
are rectangular boxes that have to be completely inside a coarser box. PAGH restricts
the refined regions to have an integer refinement factor and to be aligned with
respect to the coarser regions.  The number of refinement levels and the number of
refined regions per level are not restricted. The location and the size of the refined
regions can be changed at runtime.  Based on a user-defined truncation error
estimator, refined regions automatically track regions of high error and adapt to the
computational needs.  To facilitate the truncation error estimation, PAGH can provide
a shadow hierarchy, i.e. a twin to the computational domain with a lower resolution.
The truncation error can be estimated by comparing the results obtained on both
domains. This capability uses the support for a Shadow hierarchy provided by GrACE.
PAGH parameterizes all technical aspects of the refinement procedure and allows them
to be changed by the user.  These aspects include certain space/time tradeoffs, and
the prolongation and restriction stencils used to transport data between the
refinement levels. Reasonable defaults are provided for most applications.
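
The shadow-hierarchy estimate described above can be illustrated with a minimal one-dimensional sketch. This is not PAGH or GrACE code: the function names (`restrict`, `estimate_error`, `flag_cells`), the pairwise-averaging restriction, the sample data and the threshold are all assumptions chosen for illustration. The same field is held on the computational grid and on a twin with half the resolution, and cells where the two disagree are flagged for refinement.

```python
def restrict(fine):
    """Average pairs of fine cells onto a grid with half the resolution."""
    return [(fine[i] + fine[i + 1]) / 2.0 for i in range(0, len(fine) - 1, 2)]


def estimate_error(fine_result, shadow_result):
    """Compare the fine-grid result with the shadow (coarse) result.

    The fine result is restricted to the shadow resolution first, so the
    two lists have the same length and can be compared cell by cell.
    """
    restricted = restrict(fine_result)
    return [abs(a - b) for a, b in zip(restricted, shadow_result)]


def flag_cells(errors, threshold):
    """Return indices of shadow cells whose error estimate exceeds threshold."""
    return [i for i, e in enumerate(errors) if e > threshold]


# Hypothetical data: the fine and shadow solutions agree everywhere except
# near a steep feature in the second shadow cell.
fine = [0.0, 0.0, 0.1, 0.9, 1.0, 1.0, 1.0, 1.0]
shadow = [0.0, 0.2, 1.0, 1.0]

errors = estimate_error(fine, shadow)
flagged = flag_cells(errors, threshold=0.1)
print(flagged)
```

In a real driver the flagged cells would then be clustered into rectangular boxes that satisfy the nesting and alignment constraints listed above; that step is omitted here.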

PAGH is suited for all 3D applications that need to selectively enhance the spatial
resolution to meet their numerical requirements within the bounds given by the
available computational resources. The implementation of PAGH is complete, and has
been tested with Cactus application thorns for fixed refinement conditions (i.e. all
refinements are performed at runtime). The adaptive refinement capabilities are
currently being tested. Our goal is to have PAGH ultimately replace PUGH as the
default Cactus driver.  Note that PAGH is not suited for applications where the
numerical properties of the system are not known, so that no measure of the truncation
error can be given.

ADAPTIVE RUNTIME SUPPORT FOR PERFORMANCE AND SCALABILITY MANAGEMENT

Adaptive runtime support for AMR applications is currently being developed by
Sivapriya Ramanathan, a graduate student at Rutgers University supported by the grant. This effort is aimed
at designing a system- and application-sensitive distribution/load-balancing framework
for distributed adaptive grid hierarchies that underlie parallel adaptive
mesh-refinement (AMR) techniques. The framework uses application and system state
information to select the appropriate distribution scheme at run-time. The selection
is driven by an application-centric performance characterization of dynamic
partitioning and load-balancing techniques, and is governed by rules defined in a
policy database. The primary motivation for the framework is the design and
development of policy driven tools for automated configuration and run-time management
of distributed adaptive applications on dynamic and heterogeneous networked computing
environments. We have implemented a prototype multithreaded runtime engine
that dynamically selects the number of processors to be used based on the current
load, the granularity of refined patches (to ensure a favorable
computation/communication ratio given current latencies), and the distribution
scheme based on a characterization of application phases. We are currently evaluating
this implementation using stand-alone AMR applications. The adaptive runtime will be
integrated with the ASC. Further information can be found at
www.caip.rutgers.edu/TASSL.
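
The idea of a rule-governed selection can be sketched as follows. This is only an illustration of the pattern, not the Rutgers runtime: the rule names, state keys (`latency_ms`, `phase`, `load_per_proc`), thresholds and configuration values are hypothetical. Each rule in the "policy database" pairs a condition on the observed system and application state with the settings to apply when it holds.

```python
# Each rule pairs a condition on the observed state with the settings to
# apply. All rule contents and state keys are hypothetical illustrations.
POLICY_DB = [
    {"name": "high-latency network",
     "when": lambda s: s["latency_ms"] > 10.0,
     "choose": {"patch_granularity": "coarse"}},
    {"name": "regridding phase",
     "when": lambda s: s["phase"] == "regrid",
     "choose": {"scheme": "rebalance_all"}},
    {"name": "lightly loaded",
     "when": lambda s: s["load_per_proc"] < 0.25,
     "choose": {"processors": "shrink"}},
]

DEFAULTS = {"patch_granularity": "fine",
            "scheme": "incremental",
            "processors": "keep"}


def select_configuration(state, policy_db=POLICY_DB, defaults=DEFAULTS):
    """Apply every matching rule in order; later rules override earlier ones."""
    config = dict(defaults)
    fired = []
    for rule in policy_db:
        if rule["when"](state):
            config.update(rule["choose"])
            fired.append(rule["name"])
    return config, fired


# Hypothetical snapshot of system and application state.
state = {"latency_ms": 25.0, "phase": "compute", "load_per_proc": 0.8}
config, fired = select_configuration(state)
print(config, fired)
```

Keeping the rules as data rather than hard-coded branches is what would let such a policy be changed without touching the runtime engine.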

SPACETIME-GRHYDRO APPLICATION AMR DEVELOPMENT

The spacetime evolution and relativistic hydro codes (see below) have been modified (by Evans at Wash U) to use the new time-level structure in Cactus, which allows them to run on the AMR driver layer PAGH. The modified spacetime-GRHydro codes were first tested with a new, time-level-aware version of the unigrid driver layer PUGH for validation and performance; we demonstrated that the two versions of PUGH give exactly the same results. The spacetime-GRHydro codes were then tested with this new version of PUGH and with PAGH. For the case of a single, stationary neutron star evolved with fully coupled general relativity and hydrodynamics, PUGH and PAGH gave the same results with one refinement level. In the AMR PAGH version, the single-processor performance also remains as high as in the original unigrid version. Testing with multiple processors and with multiple fixed refinement levels is presently in progress, and the scaling performance on multiple processors is being studied.



 
 

Progress in Building Meta-computing Capability in the ASC:

DISTRIBUTED EXECUTION OF SIMULATION CODES

Historically, large scientific simulations have been performed almost
exclusively on dedicated supercomputer systems.
The ASC  aims at making use of the so-called computational Grids to increase the
accessibility and reduce the cost of simulation science by allowing
the astrophysics communities to pool computational resources for large scale simulations.
Here a "Grid" means a heterogeneous collection of
computational and network resources with time-varying availability,
but supporting standard mechanisms for resource location,
characterization, reservation, allocation, management, and so forth.
In the construction of the ASC, these standard mechanisms are provided by the
Globus toolkit (http://www.globus.org/).

Grid mechanisms allow the ASC to improve the
efficiency with which resources are used by matching individual
simulations to the "best" available resource; in addition, large
simulations can couple multiple distributed resources to perform
computations too large for a single system. However, the complexity,
heterogeneity, and dynamic nature of the Grid environment represent
major barriers to practical use by applications.

The initial goal of our work in this area is to investigate techniques
that can allow the application codes in the ASC to execute efficiently on
heterogeneous mixtures of network-connected computers. Our basic
approach involves modifying the PUGH driver of the Cactus computational toolkit (by Ripeanu) to support adaptive decompositions that address load-balancing problems that arise in heterogeneous environments.

We have obtained good initial results on a network of workstations
within the computer science department at the University of Chicago.
Evaluation on larger systems is a topic of current work.

RESULTS TO DATE

Our principal efforts to date have focused on modifying the Cactus computational toolkit to support load-adapted mapping of grid points to processors. This work was carried out by the Chicago team supported by the grant, and integrated into the Cactus distribution (by Allen and Goodale in Potsdam). We explored various grid partitioning strategies, with the objective of keeping the communication pattern simple and avoiding excessive modification to the PUGH code. We devised a grid partitioning strategy in which the number of grid points on a processor is proportional to its available processing power. Specifically, the heuristic computes the problem decomposition and the number of processors available, orders the processors by their available power, computes neighbors for each processor, and sets the grid slice size according to the least powerful processor on the slice (see Figure 1). This algorithm, although quite simple, ensures that slow processors do not get overloaded with computational tasks.
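
The proportionality rule at the core of this strategy can be shown with a simplified sketch. It covers only a one-dimensional split; the neighbor computation and the per-slice adjustment to the least powerful processor are not reproduced. The function names, the largest-remainder rounding and the sample power values are assumptions made for illustration, not the code integrated into PUGH.

```python
def partition_proportional(n_points, powers):
    """Split n_points grid points among processors in proportion to power.

    Largest-remainder rounding is used so the counts sum to n_points.
    """
    total = sum(powers)
    exact = [n_points * p / total for p in powers]
    counts = [int(x) for x in exact]
    leftover = n_points - sum(counts)
    # Hand the remaining points to the largest fractional parts.
    order = sorted(range(len(powers)),
                   key=lambda i: exact[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def slice_bounds(counts):
    """Convert per-processor counts to half-open index ranges [start, stop)."""
    bounds, start = [], 0
    for c in counts:
        bounds.append((start, start + c))
        start += c
    return bounds


# Hypothetical relative powers: two fast processors and two at half speed.
powers = [1.0, 1.0, 0.5, 0.5]
counts = partition_proportional(100, powers)
bounds = slice_bounds(counts)
print(counts, bounds)
```

With a uniform partitioner each of the four processors would receive 25 points, so the two slower machines would set the pace of every time step; the proportional split gives them roughly half the work of the faster ones.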
 

FIGURE 1: Example of the dynamic decompositions supported by our techniques.

To measure the efficiency of our new algorithm, we developed a
parallel performance model for Cactus running in a heterogeneous
time-sharing environment.  Traditionally, efficiency assumes identical
dedicated processors.  For a heterogeneous environment, however, one
must consider both the relative computational power of each processor
and the share of time that will be used on each processor for solving
the problem.  The model we formulated decomposes the grid space over
one to three dimensions such that a processor will have to communicate
on two directions for each dimension.
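
One way to express efficiency when processors differ in power and availability is sketched below. The report does not give the formula used in the actual model, so this definition is an assumption: the ideal run time is the total work divided by the aggregate available capacity, and efficiency is the ratio of that ideal time to the measured time. The names `ideal_time` and `heterogeneous_efficiency` and the sample numbers are hypothetical.

```python
def ideal_time(work, powers, shares):
    """Time to finish `work` if all available capacity is used perfectly.

    powers[i]: work units per second of processor i when fully dedicated.
    shares[i]: fraction of processor i's time available to the job (0..1).
    """
    capacity = sum(p * s for p, s in zip(powers, shares))
    return work / capacity


def heterogeneous_efficiency(work, powers, shares, measured_time):
    """Ratio of the ideal run time to the measured run time."""
    return ideal_time(work, powers, shares) / measured_time


# Hypothetical testbed: one dedicated fast node, one fast node shared 50%,
# and one dedicated node at half speed.
eff = heterogeneous_efficiency(work=1200.0,
                               powers=[100.0, 100.0, 50.0],
                               shares=[1.0, 0.5, 1.0],
                               measured_time=8.0)
print(eff)
```

For identical, dedicated processors this reduces to the traditional definition, sequential time divided by the product of processor count and parallel time.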
 

We then compared the execution times of Cactus with the original PUGH
partitioner and with the new load-aware algorithm.  The heterogeneous
testbed used Linux systems over a FastEthernet local area network.
The results showed consistently better performance by the load-aware
approach (see Figures 2 and 3).  Moreover, on average, the load-aware
approach improved efficiency by approximately 100 percent over the
original PUGH partitioner.

FIGURE 2: Execution time for the original and load-aware versions of Cactus when running on a network of Ethernet-connected Linux workstations. The load-aware version achieves better overall performance and scalability.

FIGURE 3: Execution time for the original and load-aware versions of
Cactus when running on 8 processors, as the load on one processor is
varied.  The performance of the load-aware version declines only
slightly, while the original Cactus slows down dramatically.

FUTURE WORK

Our results indicate that even a trivial load-aware partitioning algorithm yields good results. Nevertheless, considerable work remains to be done. Our work to date has involved a fairly small heterogeneous environment. We need to optimize the modified Cactus toolkit and execute it on large-scale Grid environments, involving multiple supercomputers. (Using this load-aware partitioning modification to PUGH, the Chicago and Potsdam teams have already started to experiment with running Cactus simulations across large clusters of supercomputers.) We also need to enhance the Cactus toolkit so that it can operate efficiently and flexibly in both departmental clusters and large-scale Grid environments; our goal is to enable the code to configure itself to the environment automatically.

Other future research and development will focus on
(1) performance optimizations targeted at alternative solvers,
(2) support for adaptive mesh refinement, and
(3) support for dynamic resource acquisition.



 
 

Progress in Building Visualization Capability in the ASC:

Visualization tools are central to the ASC project. We need them to aid code development as we try to understand algorithms with complex spatial-temporal features, such as those in the AMR/GrACE development. They are also important for understanding the results of production codes. With the developments in the Cactus infrastructure, we also expect them to become a critically important means of data reduction, so that we can evaluate the progress of simulation codes as they run on widely distributed resources and kill, steer or restart a code as necessary rather than waste precious supercomputing resources. The ASC is using features of two visualization tools for these purposes: Amira and LCA Vision.
 
 

AMIRA

Amira (http://amira.zib.de) is a proprietary visualization package developed by the Konrad Zuse Institute (ZIB) in Berlin, Germany, which has agreed to provide the package for use in the ASC. The ASC is not supporting any development on Amira, but is developing the interface needed for its use in the ASC with help from ZIB researchers (Werner Benger and Ralf Kaehler). At present, multi-level AMR data sets generated by Cactus representing scalar wave evolution have been successfully viewed with Amira. The capability to view multi-level spacetime-GRHydro data is being tested. Amira has also been extensively developed for remote I/O and visualization for use with the ASC. These developments, including HDF5 readers, are made available for use with LCA Vision, described below.

LCA VISION

LCA Vision was started in the NCSA Laboratory for Computational Astrophysics as an updated version of their successful 4D2 interactive visualization tool that was distributed with the Zeus code (http://zeus.ncsa.uiuc.edu/lca_intro_4d2.html). The development of the visualization tool has been supported by the ASC grant since September 1999. It is a non-proprietary open-source project based on the freely available VTK (http://www.kitware.com) and FLTK (http://www.fltk.org/) toolkits. Two students at UIUC, Gala Pushkareva and Brad Miksa, are supported under the ASC grant to extend the capabilities of this package to support AMR visualization. The code is distributed to the group and to researchers outside of this project on the webpage http://zeus.ncsa.uiuc.edu/~miksa/LCAVision.html. As with many of the technologies developed under this grant, we expect it to have an impact far outside the community where it started. The code is also available through the CVS revision control system, thereby providing nearly automatic updates for our collaborators.
 
 
Figure 4: Snapshot of LCA Vision in operation. The AMR dataset was produced by a production AMR/Cosmology code. The wireframe boxes are the bounding boxes for the hierarchically nested adaptive grids. The semitransparent white box is a nested grid which can be selected either by the 3D cursor (in red) or by a scrolled list of available grids. The GUI panels on the left display the list of grids on each level of refinement (for selection), and a list of tools that can be used for performing visualization on these grids either as the entire hierarchy or as individual unigrids. The blue surface in the image is a multi-resolution isosurface that descends through the entire depth of the grid hierarchy.

In the ASC project, the year 1 mission of LCA Vision is to support the integration of the GrACE AMR toolkit with the general relativistic application codes. By December 1999, we had succeeded in adding the capacity to import individual AMR grids into the unigrid visualization system. This is invaluable for people investigating and debugging the behavior of the AMR code, enabling direct comparisons between neighboring grids.

For the next two years, we will focus on developing the remote visualization and steering capabilities in the ASC based on LCA Vision. We will build on work by the German Gigabit Testbed project (TIKSL, http://www.zib.de/Visual/projects/TIKSL/, Co-I Seidel) and the NCSA HDF5 project to support distributed file access and HDF5 modifications to support direct connection to running codes. The capabilities developed include general hyperslab selection of arbitrary grid functions resident in memory of a running Cactus simulation, retrieval of this data to the local visualization client, and remote access to hyperslabs of output data already written to files on remote machines using a DPSS file server. Integration of these tools into LCA Vision will move it from a standalone application into a fully Grid-aware component of our web portal design in the ASC.
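
A hyperslab is a regularly strided sub-block of a multidimensional array, and selecting one is the data-reduction step that keeps remote visualization traffic small. The sketch below shows the selection on an in-memory 3D grid function stored as nested lists. It is not the Cactus or HDF5 interface: the function `select_hyperslab` and the sample grid are hypothetical, and only the start/stride/count description of a slab is borrowed from common HDF5 usage.

```python
def select_hyperslab(data, start, stride, count):
    """Extract a regularly strided sub-block from a 3D nested list.

    For each dimension d the selected indices are
    start[d], start[d] + stride[d], ..., with count[d] entries in total.
    """
    idx = [[start[d] + n * stride[d] for n in range(count[d])]
           for d in range(3)]
    return [[[data[i][j][k] for k in idx[2]]
             for j in idx[1]]
            for i in idx[0]]


# Hypothetical 4 x 4 x 4 grid function whose value encodes its own indices.
grid = [[[100 * i + 10 * j + k for k in range(4)]
         for j in range(4)]
        for i in range(4)]

# A 2D slice at i = 1, taking every other point in j and k.
slab = select_hyperslab(grid, start=(1, 0, 0), stride=(1, 2, 2), count=(1, 2, 2))
print(slab)
```

Here 4 values are returned instead of 64; on production grids the same kind of selection can reduce the data sent to the visualization client by orders of magnitude.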

Progress on Visualization Tools (September 1999 to March 2000)

Visualization Milestones for the remainder of Year 1 of ASC (April 2000 to September 2000)




 
 


Progress in building Web Interface Capability in the ASC:

PORTAL GUI DESIGN

A portal presents a unified interface to diverse and physically distributed resources. It attempts to make these many disparate pieces appear to be executing right on the portal user's computer. Our ASC portal has two faces: a collection of desktop tools and interfaces, and an entirely web-hosted interface. The desktop tools are designed to appeal to advanced users in computational physics and include many separate applications (the Cactus configuration interface, the Java-based Globus Resource Manager, and the LCA Vision and Amira visualization tools). These applications provide a wide degree of configuration flexibility and create a centralized view of a complicated and disparate set of resources. However, even with the level of integration these tools provide, their flexibility presents a daunting number of choices to novice users. We are therefore also pursuing a simplified web-hosted interface to support the most important portal functionalities. The goal of the web-hosted interface is to provide a view of the ASC that is easily accessible from any web browser. We expect to achieve these goals with initial prototypes deployed by the end of 2000.

THE GENERIC WORKBENCH (Desktop tool for Globus Resource Access)

The Generic Workbench is a standalone Java JDK 1.2-based application written by Jason Novotny (NLANR, NASA-Ames) and Greg Daues (at NCSA, supported by the ASC grant) that makes use of the Java CoG (Commodity Grid Kit) and the Globus system (Argonne National Laboratory) to create a connection between the user's workstation and the Grid resources. It strives to provide all of the functionality of the Globus command-line tools without the hassle and complexity of their default command-line interfaces. The user can access Grid resources, run programs, manipulate and view code input/output, etc. without learning the details of the arguments of the Globus command-line programs or their associated RSL (Resource Specification Language) scripts.
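
To give a sense of what such a tool hides from the user, the sketch below assembles a small RSL request string from ordinary function arguments. It is not part of the Generic Workbench or the Java CoG kit: `build_rsl` is a hypothetical helper, it covers only a few common attributes (`executable`, `arguments`, `count`, `stdout`), and it deliberately does not handle quoting of embedded quotes. The exact syntax should be checked against the Globus RSL documentation.

```python
def build_rsl(executable, arguments=(), count=1, stdout=None):
    """Assemble a single-job RSL request string from keyword-style inputs."""
    def quoted(value):
        text = str(value)
        if '"' in text:
            # Escaping rules are deliberately not handled in this sketch.
            raise ValueError("double quotes are not supported in this sketch")
        return '"' + text + '"'

    relations = [("executable", quoted(executable))]
    if arguments:
        relations.append(("arguments", " ".join(quoted(a) for a in arguments)))
    relations.append(("count", str(int(count))))
    if stdout is not None:
        relations.append(("stdout", quoted(stdout)))
    return "&" + "".join("(%s=%s)" % (name, value) for name, value in relations)


# Hypothetical request: run /bin/echo with two arguments on two processors.
rsl = build_rsl("/bin/echo", arguments=["hello", "grid"], count=2)
print(rsl)
```

A graphical front end can fill in such a request from form fields, so the user never has to write or read the string directly.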
 
 
Figure 5: A screenshot of the Generic Workbench user interface. This shows the configuration of the NCSA host modi4 as a remotely accessible Grid resource. Any machine that the user has access privileges to on the Grid and is running Globus services can be viewed through this interface as a list of resources that the user can submit batch jobs to from their desktop machine. 

The current version of the Generic Workbench provides the following capabilities:

Progress on Generic Workbench, Year 1 (Sept 1999 to March 2000):

Milestones for Generic Workbench by the end of Year 1 (April 2000 to Sept 2000)

THE WEB PORTAL GUI

The ASC web interface Specification 1.0 lays out the key elements of the GUI design and is under review by the entire ASC group. The portal interface will use a mix of JDK 1.1.7 and DHTML for the user interface. JDK 1.2 plug-ins for web browsers are being investigated, but do not appear to be ready enough yet to offer a good substrate for current development (we expect progress in this area by year 2). This initial interface will be used for testing the functionality of the interface and augmenting the design to fit user needs. Eventually it will be unified with the Generic Workbench technology which currently must exist as a standalone desktop tool for lack of universal availability of JDK 1.2-enabled browsers.

Figure 6: Blueprint of Web Portal Specification. Web services will be organized under the primary areas of "User Services", "Configuration Management", "Parameter File Management", "Run and Monitor Simulation", and "View/Manage Simulation Results". The entry to the system will require two levels of authentication; the first to establish a secure encrypted connection to the webserver and the second to authenticate the user to the webserver for the Globus Security Infrastructure.

Progress on the Web Interface (September 1999 to March 2000)

Major Project Milestones for the Web GUI for Year 1 of ASC (April 2000 to September 2000)

Progress in Building Neutron Star Simulation Capability in the ASC:

The integration of the general relativistic and Newtonian Astrophysics application codes into the ASC is
one of the main goals for the first year of the project.   By September 2000, in the Phase One ASC, we expect
to have the capability to carry out neutron star simulations using Newtonian gravity as well as Einstein's theory,
coupled to general relativistic hydrodynamics, and realistic equation of state tables.

The integration of the application codes into the ASC is carried out through the Cactus computational toolkit (http://www.cactuscode.org/). The toolkit is a computational layer for solving a general class of non-linear
partial differential equations (not just the Einstein or hydrodynamic equations), using finite differencing methods.
It was extracted and re-designed based on the Cactus relativity code co-developed by the NCSA/Potsdam/Wash U
collaboration.  The use of the toolkit facilitates the coupling of the application codes with GrACE (for AMR),
Globus (for grid computing) and Vision/Amira (for AMR visualization) in the ASC.

The Cactus computational toolkit was first beta released in October of 1999, and is presently being actively
developed for the ASC and other computational projects, including the EU network project of co-I Seidel (http://www.aei-potsdam.mpg.de/research/astro/eu_network/index.html). The development of the computational
toolkit is mainly carried out by our collaborators at AEI (Goodale, Allen, Lanfermann, Benger, Bruegmann,
Alcubierre, not supported by the ASC grant), and with support from the ASC grant to Wash U (Evans, Tobias,
and Lamping). The beta 6 version of the toolkit was released in February of 2000.

The integration of the application codes on top of the Cactus computational toolkit with an object oriented
programming design requires changes in all our existing application code components, and in particular their
interfaces.   This has been the major task of our ASC application code team for the last few months.  Most of
the application code components have now been brought into the new interface design, with added features for
the neutron star applications planned for the ASC; the validation of some of the
code components is presently being carried out.   In the following we give a brief summary of progress in
this direction (as of March 2000).

1. Spacetime evolution routines for evolving the ADM set and the conformal ADM set of equations
have been converted (ADM and BSSN by Allen and Alcubierre, and  Conf-Hyp by Miller).  These routines
have been validated and used to evolve black holes and gravitational waves.   Further tests and applications
are in progress.  The Conf-Hyp routine has been made time-level-aware for AMR treatment (by Evans);
validation is in progress (by Evans).

2. The general relativistic hydrodynamic (GR-Hydro) evolution routine (MAHC) has been converted and
validated (by Miller).  It has also been extended to accept arbitrary nuclear equation of state (by Miller).
It has been AMR enabled (by Evans), and validation is in progress.

3. The spacetime evolution routines and the GR-Hydro evolution routine have been coupled in a second
order accurate manner with the new interface scheme and validated (by Miller).

4. The Newtonian gravity artificial viscosity radiation hydrodynamic code (Zeus) is being integrated (by Li).
The Cactus computational toolkit has been extended to provide the staggered grid support needed by Zeus
(by Lanfermann).  The construction of the routines Zeus_Start (initializes all variables and boundary
values according to the input parameter file, then starts or restarts the run), Zeus_Source (updates source terms
in evolution), Zeus_Transport (updates the transport part in a directionally split manner), and Zeus_TimeStep
(computes the new timestep for explicit calculations) has been finished.  Testbeds including a standard
spherical blast wave problem are being carried out (routine Zeus_Blast).  More tests, such as a shock tube
test and a supersonic jet, will follow.

5. The Newtonian gravity high-resolution-shock-capturing (HRSC)  hydrodynamic code (Newton-HRSC)
has been converted and validated (by Gressman).

The above covers all evolution routines planned for the Phase One ASC.
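
The division of labor among the four Zeus routines in item 4 (start, source, transport, timestep) follows a standard operator-split pattern, which the toy sketch below illustrates in one dimension. It is not the Zeus code or its Cactus interface: the Python function names are stand-ins, and the periodic grid, constant positive velocity, first-order upwind transport and linear source term are all simplifying assumptions. The real code is multidimensional and directionally split.

```python
def start(n_cells, dx):
    """Initialize state: a density bump on a periodic 1D grid."""
    rho = [1.0] * n_cells
    rho[n_cells // 2] = 2.0
    return {"rho": rho, "dx": dx, "velocity": 1.0, "time": 0.0}


def timestep(state, courant=0.5):
    """Explicit stability limit: dt = C * dx / |v|."""
    return courant * state["dx"] / abs(state["velocity"])


def source(state, dt, rate=0.0):
    """Source step: a simple linear term d(rho)/dt = rate * rho."""
    state["rho"] = [r + dt * rate * r for r in state["rho"]]


def transport(state, dt):
    """Transport step: first-order upwind advection, periodic boundaries."""
    rho, v, dx = state["rho"], state["velocity"], state["dx"]
    # velocity > 0 is assumed, so the upwind neighbour is the cell to the
    # left; index -1 wraps around, which gives the periodic boundary.
    state["rho"] = [rho[i] - v * dt / dx * (rho[i] - rho[i - 1])
                    for i in range(len(rho))]


def run(n_steps):
    """Advance the toy problem: timestep, then source, then transport."""
    state = start(n_cells=16, dx=0.1)
    for _ in range(n_steps):
        dt = timestep(state)
        source(state, dt)
        transport(state, dt)
        state["time"] += dt
    return state


final = run(10)
print(final["time"], sum(final["rho"]))
```

With the source rate set to zero, the total of `rho` is conserved by the transport step, which is the kind of property a testbed problem can check after each code conversion.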

6. The routine for treating the Hamiltonian and momentum constraints for general relativistic initial data (IVP)
has been converted and is being validated (by Tobias).  The multi-grid scalar and vector elliptic equation
solver (BAM) on which IVP was constructed has been converted and validated (by Bruegmann).  IVP can make
use of other elliptic solvers; the interface to the PETSc elliptic equation library was converted (by Lanfermann)
and an over-relaxation solver was also converted and validated (by Lamping).

7. The routines for creating neutron star initial data, including data for neutron star binary systems in
quasi-equilibrium (MAHC_init, Quasi_Equil, Resid_Quasi_Equil) have been converted and validated (by Miller).

The above are for generating initial data in general relativistic simulations.

8. The routines for handling coordinate conditions, including maximal slicing (Maximal, by Lanfermann)
and minimal distortion shift (MinDist_Shift, by Miller) have been converted.  Validation is in progress.

9. Routines for analyzing spacetime data, including a constraint analyzer, an apparent horizon finder and various wave
analysis tools have been converted and validated (AHFinder by Alcubierre; ADMConstraints and Extract
by Allen, PsiKadelia by Baker).

10. Routines for simulation control, including time step analysis, interpolation, checkpointing, I/O and downsizing
have been converted and validated (Courant, Interp and  checkpoint by Allen).

11. Nuclear equation of state routines have been converted (by Gopakumar, Evans, Goodale and Miller), and are being
validated (by Gopakumar).  Various realistic EOS tables are being imported (by Gopakumar).

This is the complete set of application code components planned to be in Phase One ASC.  We expect
the validation process to conclude by summer of 2000.  The ASC application codes would then be able to
take advantage of the visualization tools and grid computing infrastructure, and begin simulations with
adaptive mesh refinement.