Poster Session and networking
**Moderators**
## Jan E. Odegard

Executive Director, Ken Kennedy Institute / Associate Vice President, Research Computing, Rice University

Jan E. Odegard is Executive Director of the Ken Kennedy Institute for Information Technology and Associate Vice President for Research Computing & Cyberinfrastructure at Rice University. Dr. Odegard joined Rice University in 2002 and has over 15 years of experience supporting and enabling research...

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC

Poster: A Data-centric Profiler for Parallel Programs, Xu Liu, Rice University

An asymptotic approximation of the Dirichlet-to-Neumann (DtN) map of high-contrast composite media with perfectly conducting inclusions that are close to touching is presented. The result is an explicit characterization of the map in the asymptotic limit of the distance between the inclusions tending to zero. The approximation of the DtN map is applied to nonoverlapping domain decomposition methods as a preconditioner.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

Poster: A Stability Monitoring Scheme of Drill-String Vibration based on Numerical Simulations of Wave Propagations, Yu Liu, Rice University

**DOWNLOAD POSTER PDF**

The goal of this research is to develop a real-time stability monitoring scheme for bottom-hole-assembly (BHA) vibration in drill-strings using highly efficient numerical analysis of lateral wave information. Lateral vibrations are considered severely destructive to drill-string operations, but they cannot be detected at the surface because of the strongly damped environment and their highly dispersive nature. Axial acoustic waves, on the other hand, have been used constructively to transmit information through the drill-string. In this study, the drill-string is modeled as a linear beam structure under gravitational field effects. An iterative wavelet-based spectral finite element method is developed to obtain a high-fidelity response; its high computational efficiency and suitability for parallel computing outperform existing methods. Numerical simulations of lateral wave propagation at the BHA are first conducted, and a time-frequency analysis technique is applied to the response to identify the relationship between the position of the neutral point and the dispersive properties of the lateral wave. Next, axial acoustic wave propagation through the upper drill pipe is simulated to explore the banded transmission properties introduced by the drill-string's periodic joints. Based on these results, a new scheme is proposed to monitor the stability of drill-string vibration by combining lateral wave analysis at the BHA with the axial acoustic telemetry technique.

**Speakers**
## Yu Liu

PhD Student, Rice University

I am currently a PhD student in mechanical engineering and expect to graduate in November this year. I am working on a project on stability monitoring of drill-string vibration and wave propagation.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

Poster: Accelerating an iterative Helmholtz solver with FPGAs, Art Petrenko, University of British Columbia

**DOWNLOAD POSTER PDF**

We implement the Kaczmarz row-projection algorithm (Kaczmarz (1937)) on a CPU host + FPGA accelerator platform using techniques of dataflow programming. This algorithm is then used as the preconditioning step in CGMN, a modified version of the conjugate gradient method (Björck and Elfving (1979)) that we use to solve the time-harmonic acoustic isotropic constant-density wave equation. Using one accelerator, we speed up the solution of the wave equation for one source by 2× compared with one Intel core.
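The Kaczmarz sweep at the heart of this preconditioner is simple to state: each row of the system defines a hyperplane, and the iterate is projected onto each hyperplane in turn. A minimal sketch in plain NumPy (not the authors' dataflow/FPGA implementation; the 2×2 toy system is invented for illustration):

```python
import numpy as np

def kaczmarz_sweep(A, x, b, lam=1.0):
    # One forward sweep: project x onto each row's hyperplane a_i . x = b_i,
    # with relaxation parameter lam.
    for i in range(A.shape[0]):
        a = A[i]
        x = x + lam * (b[i] - a @ x) / (a @ a) * a
    return x

# Invented 2x2 system; repeated sweeps converge to the solution of Ax = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([5.0, 5.0])
x = np.zeros(2)
for _ in range(200):
    x = kaczmarz_sweep(A, x, b)
```

CGMN symmetrizes this by following each forward sweep with a backward sweep, so the resulting operator is symmetric and can precondition conjugate gradients.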

**Speakers**
## Art Petrenko

Graduate Student, University of British Columbia

I am currently developing an implementation of an algorithm for iteratively solving large systems of linear equations using a reconfigurable computing platform. The purpose is to model propagation of seismic waves in the frequency domain as part of full-waveform inversion. The platform...

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

Poster: An aggregation algebraic multigrid method on many and multi core architectures, Rajesh Gandham, Rice University

**DOWNLOAD POSTER PDF**

We present an efficient, robust aggregation-based algebraic multigrid preconditioning technique for the solution of large sparse linear systems. These linear systems arise from the discretization of elliptic PDEs in applications such as reservoir simulation, heat equations, and the incompressible Navier-Stokes equations. Algebraic multigrid methods provide grid-independent convergence for these problems, making them among the best choices for solving elliptic PDEs in practical applications. The method involves two stages, setup and solve. In the setup stage, hierarchical coarse grids are constructed by aggregating fine-grid nodes; the aggregates are built around a maximal independent set of fine-grid nodes and are combined with piecewise-constant (unsmoothed) interpolation from the coarse-grid solution to the fine-grid solution, ensuring low setup and interpolation cost. Grid-independent convergence is achieved by using recursive Krylov iterations (K-cycles) in the solve stage. An efficient combination of K-cycles and standard multigrid V-cycles is used as the preconditioner for the conjugate gradient method. We perform the setup on the CPU using C++ and the STL, and solve using kernels written in OCCA, a unified threading language, for performance portability across traditional CPU and modern many-core GPU architectures. We present a comparison of the performance of OCCA kernels cross-compiled with OpenCL, CUDA, and OpenMP at runtime on GPUs and CPUs.
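The aggregation step described above, with roots chosen as a maximal independent set and the remaining nodes attached to a neighbouring root, can be sketched greedily and serially as follows (an illustrative sketch, not the poster's parallel OCCA implementation; the path-graph example is invented):

```python
def greedy_aggregates(adj):
    # Pass 1: greedily pick a maximal independent set of root nodes in the
    # matrix graph. Pass 2: attach every remaining node to a neighbouring
    # root's aggregate (maximality guarantees such a root exists).
    n = len(adj)
    is_root = [False] * n
    for i in range(n):
        if not any(is_root[j] for j in adj[i]):
            is_root[i] = True
    agg = [-1] * n
    k = 0
    for i in range(n):
        if is_root[i]:
            agg[i] = k
            k += 1
    for i in range(n):
        if agg[i] == -1:
            agg[i] = next(agg[j] for j in adj[i] if is_root[j])
    return agg

# Invented example: a path graph 0-1-2-3-4 yields three aggregates.
adj = [[1], [0, 2], [1, 3], [2, 4], [3]]
agg = greedy_aggregates(adj)
```

The piecewise-constant prolongation is then just P[i, agg[i]] = 1 for every fine node i.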

**Speakers**
## Rajesh Gandham

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

Graduate Student, Rice University

I am a PhD student working with Dr. Tim Warburton in the department of Computational and Applied Mathematics at Rice University. I am passionate about developing fast and scalable algorithms and implementations for large scale scientific computing applications. I am particularly interested...


Poster: An Approximate Inverse to Extended Born Modeling Operator, Jie Hou, Rice University

**DOWNLOAD POSTER PDF**

In seismic imaging, one tries to recover subsurface reflector information from seismic reflection data, usually via the linearized model of the Born approximation. The process essentially amounts to computing the inverse of the Born modeling operator. However, the common imaging technique, reverse time migration (RTM), is only the adjoint of the modeling operator: though it positions reflectors correctly, it does not produce the correct amplitudes or wavelet. An inversion would be a true-amplitude reverse time migration, where "true amplitude" is meant in the ray-theoretic (asymptotic) sense. True-amplitude migration was first developed for Kirchhoff migration by compensating the amplitudes. Ten Kroode (2012) gave a wave-equation-based Kirchhoff operator that is an approximate inverse of the extended modeling operator in 3D. Inspired by that work, I derive the approximate inverse mathematically in 2D. In this project, I apply asymptotic ray theory to the depth-oriented modeling/migration operator using a progressing-wave expansion. The normal operator is then analyzed using the principle of stationary phase. I determine that the adjoint operator differs from an asymptotic inverse only by the application of several velocity-independent filters, which I identify explicitly. In addition to the theoretical derivation, I provide a numerical implementation and illustrate the effectiveness of the asymptotic inverse via computational examples. This is very rewarding: the amplitude information itself is useful for detecting reservoirs, and the new operator can be used as a preconditioner for full waveform inversion, a process that iteratively improves an initial model by matching measured and modeled data. The new preconditioner can speed up the convergence of the iterations dramatically.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*


Poster: An Efficient Numerical Algorithm for Flash Calculations with Graphic Processor Units (GPU) in Compositional Reservoir Simulations, Guan Qin, University of Houston

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

Light oil and gas condensate reservoirs, as well as some enhanced oil recovery processes, usually exhibit complicated phase behavior that involves significant changes in fluid composition over the production life cycle. Numerical modeling of such reservoirs requires compositional simulation, which solves the coupled problem of multi-phase, multi-component flow and mass exchange among phases. Equation-of-state (EOS) based flash calculation is usually employed to compute the phase partition of the fluid composition in compositional simulation and can consume up to 40% of the total simulation time. Such a significant computational cost should be considered and mitigated when implementing compositional simulation on parallel computing architectures. The recent emergence of the graphics processing unit (GPU) as a highly parallel programmable processor provides a low-cost, high-performance parallel computing platform. In this paper, we propose and develop a GPU-based algorithm for EOS-based flash calculation to improve the numerical efficiency of compositional simulations. EOS-based flash calculations involve various types of data and operations. By exploiting the dataflow nature of the flash calculation algorithm, the number of external references can be reduced through better caching behavior and data reusability, and the memory bandwidth bottleneck can be alleviated. We first optimized the simulation code to reduce the overall operation count. In addition, a new data structure was designed and implemented to achieve coalesced access to global memory. Further optimization was done to better utilize the constant memory, shared memory, and registers on GPUs, based on data characteristics. Three compositional simulation cases, including refined SPE3 and SPE5 cases, were tested. We achieved speedup factors from 15.4 to 24.9 for the flash calculation, successfully reducing its cost to a trivial level, 1%-2% of the total computational time.
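A core kernel inside any EOS flash calculation is the phase-split (Rachford-Rice) solve for the vapor fraction at fixed K-values. A serial sketch, assuming a two-phase mixture (illustrative only, with invented feed composition and K-values; the poster's GPU algorithm covers the full EOS iteration):

```python
def rachford_rice(z, K, tol=1e-12):
    # Solve sum_i z_i*(K_i - 1)/(1 + V*(K_i - 1)) = 0 for the vapor
    # fraction V by bisection; g(V) is monotone decreasing on (0, 1).
    def g(V):
        return sum(zi * (Ki - 1.0) / (1.0 + V * (Ki - 1.0))
                   for zi, Ki in zip(z, K))
    lo, hi = 0.0, 1.0  # assumes a two-phase split: g(0) > 0 > g(1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Invented three-component feed z and equilibrium ratios K.
V = rachford_rice(z=[0.5, 0.3, 0.2], K=[2.5, 1.2, 0.3])
```

In a GPU implementation each cell's solve is independent, which is what makes the kernel embarrassingly parallel across the grid.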


Poster: Approximating Traveling Salesman and p-Median Solutions using Linear Relaxations, Caleb Fast, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

This poster presents a method for developing improved approximation algorithms for two common problems from operations research: the traveling salesman problem (TSP) and the p-median problem (PMP). Both are of interest because of their exceptionally wide applicability. The TSP is used for problems such as planning routes for geological survey or collection vehicles, as well as others such as DNA sequencing. The PMP, meanwhile, is used for facility location problems, such as determining locations for fuel stations. These problems are commonly attacked using the linear relaxation of an integer programming formulation; however, for neither problem is the error bound of the relaxation well understood. This poster presents a method both for understanding the errors of the relaxations and for finding approximate integral solutions to the problems when the linear solution is half-integral. The approximate solutions found are better than those produced by current state-of-the-art algorithms. For each problem, the method is based on solving a different, ideally tractable problem, and then using the solution of the simpler problem to compute a solution of the original. In the case of the TSP, I solve a matching problem on the support graph of the linear relaxation, while in the case of the PMP, I solve a dominating set problem. The solutions of these problems show which fractional edges of the support graph should be included in the optimal solution and which should be removed.


Poster: Collective Transport of a Large Object in a Distributed Configuration Space, Golnaz Habibi, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

Object transport has many applications in industry and agriculture, as well as in disaster relief and warehouses. This poster presents a novel distributed algorithm for multi-robot systems to collectively transport a large object while avoiding obstacles in an unknown environment. Given the size of the object, path-planner robots generate the minimum-cost path from the start to the goal position using a distributed Bellman-Ford algorithm. Transporter robots then carry the object along the path. A transport is safe if it is obstacle-free; we define transport cost as the cost of translating and rotating the object. This study trades off the cost and safety of the transport by using a distributed configuration space and tree-based path planning. We have implemented our algorithm in both simulated and real environments. As the results show, our approach is robust to the size and shape of the object and provides safe, efficient transport in unknown environments.
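The path-planning step rests on Bellman-Ford relaxation, which distributes naturally: each node repeatedly updates its own cost estimate from its neighbours' broadcasts. A serial sketch of the same relaxation, with an invented four-node graph:

```python
def bellman_ford(n, edges, source):
    # Relax every edge n-1 times; edges are (u, v, weight) triples.
    # In the distributed setting each node performs its own relaxations
    # from neighbour messages, converging to the same distances.
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0.0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Invented example: cheapest traversal cost from node 0 to each node.
edges = [(0, 1, 1.0), (1, 0, 1.0), (1, 2, 2.0), (2, 1, 2.0),
         (0, 2, 4.0), (2, 0, 4.0), (2, 3, 1.0), (3, 2, 1.0)]
dist = bellman_ford(4, edges, source=0)
```

Here the detour 0-1-2-3 (cost 4.0) beats the direct edge 0-2 followed by 2-3 (cost 5.0), which is exactly the kind of trade-off a cost-aware planner exploits.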


Poster: Convergence of Discontinuous Galerkin Methods for Poroelasticity Equations, Jun Tan, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

In reservoir engineering and environmental engineering, people are concerned with poroelasticity, the modeling of coupled fluid-solid processes. The Biot model is one important mathematical model of such processes. It couples a transport law with a balance law, and thus can model fluid transport in porous media and predict the deformation of the solid. This work provides a theoretical analysis of a new numerical method for solving the poroelasticity equations. We approximate the pressure, displacement, and dilatation by the discontinuous Galerkin method, including the symmetric, nonsymmetric, and incomplete interior penalty Galerkin cases. We show convergence of the method by deriving error estimates. Numerical examples are given.
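For context, one common quasi-static form of the Biot system (notation assumed here, not taken from the poster) couples a momentum balance for the displacement with a mass balance for the pressure:

```latex
% Quasi-static Biot poroelasticity, one standard two-field form:
% momentum balance for the displacement u, mass balance for the pressure p.
-\nabla\cdot\bigl(2\mu\,\varepsilon(\mathbf{u})
    + \lambda\,(\nabla\cdot\mathbf{u})\,\mathbf{I}\bigr)
    + \alpha\,\nabla p = \mathbf{f},
\qquad
\partial_t\bigl(c_0\,p + \alpha\,\nabla\cdot\mathbf{u}\bigr)
    - \nabla\cdot(\kappa\,\nabla p) = q,
```

where ε(u) is the symmetric gradient, λ and μ are Lamé parameters, α is the Biot-Willis constant, c₀ the storage coefficient, and κ the hydraulic conductivity. The dilatation ∇·u is the third variable the poster's method approximates alongside pressure and displacement.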


Poster: Deformable Complex Network for Refining Low Resolution Structures, Chong Zhang, Rice University

**Speakers**
## Chong Zhang

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*

In macromolecular X-ray crystallography, it is often desirable to build more accurate atomic models from lower-resolution experimental diffraction data. In this study, we report a refinement algorithm called the deformable complex network (DCN), developed by including a novel angular-network-based restraint in the target function in addition to that used in the deformable elastic network (DEN) model (Nature 464:1218 (2010)). Our results demonstrate that, across a wide range of low-resolution structures, significant improvements were achieved in multiple refinement criteria, such as the Rfree value, overfitting, and Ramachandran statistics.

Student in Computational Applied Physics, Rice Quantum Institute/Applied Physics Program

Graduated in Dec. 2013 with a Ph.D. in Computational Applied Physics.
Passionate about integrating high performance computing, physics and mathematics to tackle industrial challenges in science and engineering.


Poster: Discontinuous Galerkin method for miscible displacement simulations, Jizhou Li, Rice University

**DOWNLOAD POSTER PDF**

During the miscible displacement process, a solvent fluid is injected into a porous medium, where it mixes with a resident fluid. The fluid mixture moves in the porous medium as a single-phase flow, with a velocity that follows Darcy's law. Furthermore, the solvent concentration satisfies a convection-dominated parabolic problem, with a diffusion-dispersion tensor that depends on the fluid velocity in a nonlinear fashion. The fluid pressure equation is coupled with the concentration equation. These essential aspects constitute the miscible displacement problem, which arises in many applications, such as production of trapped oil in reservoirs by enhanced oil recovery. The poster will present numerical simulations of miscible displacement using the discontinuous Galerkin method. The high-order numerical discretization maintains mass conservation and demonstrates low sensitivity to grid distortion. The numerical method is implemented on a parallel architecture using overlapping domain decomposition. Simulation results show the robustness of the method, as well as its efficiency on a parallel cluster.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*


Poster: Fragility assessment of above ground petroleum storage tanks under storm surge, Sabarethinam Kameshwar, Rice University

**DOWNLOAD POSTER**

ABSTRACT: Design guidelines for above-ground storage tanks (ASTs), such as American Petroleum Institute (API) 620 and API 650, provide design details to prevent failure from internal liquid pressure, internal suction (vacuum), wind, and earthquake loads. However, these design codes lack guidelines to prevent failure of tanks due to loads from hurricane storm surge. Because of this gap, failures of ASTs due to surge loads were observed during hurricanes Katrina, Rita, Ike, and Gustav. During hurricane Katrina alone, tank failures released 8 million gallons of crude petroleum products into the surrounding environment. Spillage of petrochemicals and other hazardous material leads to substantial economic losses from lost product, cleanup activities, lawsuits, and repair or reconstruction of damaged tanks. Therefore, this study aims to assess the fragility of ASTs subjected to storm surge loads in order to provide a basis for future design codes. Flotation and buckling have been identified as the major failure modes for tanks during storm surge. A probabilistic analysis is performed for flotation and buckling of anchored and un-anchored tanks. Random variables are identified for the flotation analysis, and random fields are used to generate geometric imperfections of ASTs for the buckling analysis. A probabilistic analysis calls for a large number of time-consuming simulations; for this purpose, the supercomputing facilities managed by the Research Computing Support Group at Rice University are used. Using logistic regression, the fragility of an AST typical of the Houston Ship Channel area is assessed, and measures to prevent future failures are suggested.
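The final fragility fit is a logistic regression of failure outcomes against a surge measure. A minimal sketch with invented data (gradient ascent on the log-likelihood; the study's actual covariates and simulation outputs are not reproduced here):

```python
import math

def fit_logistic(x, y, lr=0.5, steps=20000):
    # Fit P(failure | x) = 1 / (1 + exp(-(a + b*x))) by gradient ascent
    # on the Bernoulli log-likelihood, a fragility-curve style fit.
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        ga = gb = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            ga += yi - p
            gb += (yi - p) * xi
        a += lr * ga / n
        b += lr * gb / n
    return a, b

# Invented data: surge depth (m) vs. simulated failure (1) or survival (0).
surge = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
fail  = [0,   0,   0,   1,   0,   1,   1,   1]
a, b = fit_logistic(surge, fail)
p_at_3m = 1.0 / (1.0 + math.exp(-(a + b * 3.0)))
```

The fitted curve maps any surge depth to a failure probability, which is the fragility curve the study reports.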

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*


Poster: GPU accelerated Lattice Boltzmann Method in Core Sample Analysis, Zheng Wang, Rice University

**DOWNLOAD POSTER PDF**

The lattice Boltzmann method (LBM) is a relatively new computational method for fluid simulation. From a microscopic and mesoscopic perspective, LBM discretizes the velocity into finitely many directions and simulates particles through collision and propagation processes. Since the LBM is particularly successful in handling complex boundary conditions, it is now used in the oil industry. In my project, LBM is applied to core sample analysis. The core sample first goes through a scanner and is reconstructed as digital data. By applying LBM to the digital data, we can determine physical properties of the rock, such as the Reynolds number and permeability, which are very useful in the oil industry. However, the LBM is computationally intensive, and an ordinary serial code is not adequate for a high-resolution LBM. To simulate the fluid in a short time, we employ GPU programming, a major trend in high performance computing. Using GPUs, the parallel code can be hundreds of times faster than a serial code, which allows us to apply LBM in more general cases. In addition, a combination of GPU and MPI enables the code to run on a cluster and addresses the memory limits of a single device.
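The collision step that such a solver parallelizes across every grid cell can be sketched for a single cell of the common D2Q9 lattice (illustrative NumPy with an invented cell state, not the poster's GPU code):

```python
import numpy as np

# D2Q9 lattice: discrete velocity set and quadrature weights.
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4/9] + [1/9] * 4 + [1/36] * 4)

def equilibrium(rho, u):
    # Second-order BGK equilibrium for one cell (lattice units).
    eu = E @ u
    return W * rho * (1 + 3 * eu + 4.5 * eu**2 - 1.5 * (u @ u))

def collide(f, tau=0.8):
    # BGK collision: recover density and velocity moments, then relax
    # the populations toward local equilibrium with relaxation time tau.
    rho = f.sum()
    u = (E.T @ f) / rho
    return f + (equilibrium(rho, u) - f) / tau

# Invented cell state: unit density, small rightward velocity.
f = equilibrium(1.0, np.array([0.05, 0.0]))
```

Collision is purely local, and the subsequent streaming step only touches neighbouring cells, which is why LBM maps so well onto GPUs.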

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, *Rice University, 6500 Main Street at University, Houston, TX 77030*


Poster: Kalman filtering for large-scale problems, Timur Takhtaganov, Rice University

**DOWNLOAD POSTER PDF**


Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

BRC Exhibit Hall

Poster: Modeling and computational challenges for transient finite element computations at dynamic gas-liquid-solid contact lines, Alex Lee, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Fluid flows with complex rheology and dynamic interfaces are important in many industrial applications, such as optimizing liquid printing and coating processes for manufacturing nanomaterials, or determining the flow of oil/brine systems in porous rockbed for oil recovery. Such transient flows featuring the evolution of solid-liquid-gas interfaces remain difficult to compute, especially when the liquid has a complex rheology. The difficulties are due to both high computing demands and the complexity of modeling various aspects of the system. For example, OpenMP-parallelized calculations of liquid transfer efficiency for a gravure printing process took roughly 8 CPU core-years on the Rice HPC clusters to obtain the data published in [1], and this was in the simplified case of a static contact line model and a non-dynamic gas phase. The physics of three-phase interfaces is still poorly understood, so there are currently several modeling strategies tailored to specific purposes. We adopt an implementation of Navier's slip law that fits naturally into our transient Petrov-Galerkin finite element method and allows more realistic physics to be applied at the contact line [2]. Still, several computational challenges remain to be addressed, including mesh resolution local to the contact line, dynamic contact angle modeling, incorporation of the conformation-tensor-based model for viscoelastic liquids, and stability of time integration for such tightly coupled systems. In the context of our gravure printing problem and some toy problems---including one of potential interest in oil recovery---we will discuss our current advances and demonstrate our successes in addressing these challenges. [1] Lee, J. A., Rothstein, J. P., & Pasquali, M. (2013). J. Non-Newtonian Fluid Mech., 199, 1–11. [2] Sprittles, J. E., & Shikhmurzaev, Y. D. (2011). Int. J. Numer. Methods Fluids, 68(10), 1257–1298.

BRC Exhibit Hall

Poster: Non-blocking Data Structures for High-performance Computing in Oil and Gas Industry, Zhipeng Wang, Rice University

Concurrent data structures are becoming increasingly popular in parallel computing, as they are widely used in operating systems and concurrent programming. There are two types of algorithms for implementing concurrent data structures: blocking and non-blocking. Blocking algorithms are essentially lock-based algorithms that enforce a sequential order in which processes complete operations on the shared data structure. However, on asynchronous multiprocessor systems with more threads than cores, they suffer severe performance degradation as a result of scheduling preemption, cache misses, page faults, etc. Non-blocking algorithms tolerate these problems and can achieve high concurrency while maintaining low overhead, so they are more robust in multithreaded programming models. Reservoir simulations play a crucial role in the oil and gas industry, and high concurrency and parallelization of numerical calculations without significant performance degradation on multiprocessor systems are becoming urgent for large-scale reservoir simulations. Here we implement our new non-blocking algorithm for concurrent data structures (FIFO queues, etc.) used to solve the discrete linear equation system (the discrete energy and mass balance equations) on the IBM Power7 with up to 128 threads. The results show that our newly designed non-blocking algorithm enhances the performance of the parallel program on systems with both dedicated and multiprogrammed processors. Our research shows significant potential applications of non-blocking concurrent data structures in high-performance computing in the oil and gas industry.
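The compare-and-swap (CAS) retry loop at the heart of non-blocking algorithms can be sketched as follows. Real implementations rely on a hardware CAS instruction; the lock inside `AtomicRef` only emulates the atomicity of that single instruction, and this simplified queue is an illustration, not the poster's algorithm:

```python
import threading

class AtomicRef:
    """CAS primitive sketch; hardware provides this as one atomic instruction."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()   # emulates atomicity of CAS only
    def get(self):
        return self._value
    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

class LockFreeQueue:
    """FIFO queue as a CAS retry loop over an immutable tuple of items."""
    def __init__(self):
        self._state = AtomicRef(())
    def enqueue(self, item):
        while True:                      # retry until our CAS wins
            old = self._state.get()
            if self._state.compare_and_swap(old, old + (item,)):
                return
    def dequeue(self):
        while True:
            old = self._state.get()
            if not old:
                return None
            if self._state.compare_and_swap(old, old[1:]):
                return old[0]

q = LockFreeQueue()
threads = [threading.Thread(target=lambda i=i: q.enqueue(i)) for i in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(q.dequeue() for _ in range(8)))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

A thread that loses the CAS race simply retries; no thread ever blocks holding a lock, which is what makes such structures resilient to preemption.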

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC - Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

BRC - Exhibit Hall

Poster: OCCA: A unified approach to multi-threading languages, David Medina, Rice University

**Speakers**
## David Medina

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Graduate Student, Rice University

My name is David Medina; I'm in my fourth year of the PhD program in the Computational and Applied Mathematics department at Rice University. I'm working under the advisement of Dr. Tim Warburton on high order numerical method applications using graphical processing units (GPUs... Read More →

BRC Exhibit Hall

Poster: On the approximation of the *DtN* map for high contrast media and its application to domain decomposition methods, Yingpei Wang, Rice University

**Speakers**
## Yingpei Wang

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

The Kalman filter uses a sequence of noisy observations of a system over time to produce a sequence of approximations of the state of the system. Many variations of the original Kalman filter have been proposed and applied in a wide variety of science and engineering fields, such as aerospace, meteorology, geophysics, oceanography, and reservoir simulation. However, often no connections between newly proposed variants and existing ones are made and no comparisons are given. I will evaluate and compare the efficiency of recently proposed Krylov-space approximate Kalman filters and of the ensemble Kalman filter on large-scale time-dependent partial differential equation models. In addition, my work establishes theoretical connections between different variations of the Kalman filter, identifies their relative advantages and weaknesses, and exposes opportunities for algorithmic improvements. Future work will include analysis and implementation of these algorithmic improvements to increase the efficiency of the Kalman filter for large-scale nonlinear problems.
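The predict-update cycle described above reduces to a few lines in its simplest scalar form: a random walk observed in noise. The models and noise levels below are illustrative, not from any of the large-scale variants the poster compares:

```python
import numpy as np

rng = np.random.default_rng(1)
Q, R = 1e-4, 0.25          # process and observation noise variances
x_true, x_est, P = 0.0, 0.0, 1.0
errs = []
for _ in range(500):
    x_true += rng.normal(0, np.sqrt(Q))      # state evolves (random walk)
    y = x_true + rng.normal(0, np.sqrt(R))   # noisy observation
    P += Q                                   # predict: covariance grows
    K = P / (P + R)                          # Kalman gain
    x_est += K * (y - x_est)                 # update with the innovation
    P *= (1 - K)                             # covariance shrinks after update
    errs.append((x_est - x_true) ** 2)
print(np.mean(errs[100:]) < R)  # filtered error beats raw observation noise
```

For PDE-scale problems the state has millions of components and `P` can no longer be stored, which is exactly the gap that ensemble and Krylov-space approximations address.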

6100 Main St, Houston, TX 77005, Rice University

I am a fifth-year graduate student in the Department of Computational and Applied Mathematics at Rice University. My research focuses on computing, numerical analysis, and partial differential equations. I will graduate in May 2014 and am looking for a job in a related area.

BRC Exhibit Hall

Poster: Optical properties of the Split Ring's Nanostructure, Yang Cao, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Pairs of split-ring resonators (SRRs) have been created in a scalable manner over large areas using a new and facile patterning technique that combines conventional colloidal lithography with stretchable poly(dimethylsiloxane) (PDMS) stamps. The polarization-dependent plasmonic resonances of the SRRs can be tuned over a wide wavelength region, from the visible to the infrared, by controlling the gap sizes. Theoretical calculations based on the finite element method show excellent agreement with experiment. This novel device has potential applications in optical sensing.

BRC Exhibit Hall

Poster: Performance Challenges for Emerging HPC Systems, Milind Chabbi, Rice University

**DOWNLOAD POSTER PDF**

Today’s supercomputers are complex and enormous in scale. As a result, harnessing their full power is exceedingly difficult. Application performance problems of interest on HPC systems include both node-level and system-wide performance issues. Node-level performance issues include utilization of memory hierarchies as well as instruction, vector, and/or thread-level parallelism, as well as power consumption. System-wide problems include load imbalance and serialization across nodes, as well as communication and I/O bottlenecks. Failure to avoid or tolerate these issues can lead to major system-wide performance bottlenecks at scale. Even worse, one application may experience performance problems that result from heavy resource consumption by other jobs. In the face of all of this complexity, tools are essential for identifying code regions that consume excessive resources (e.g., time or power) relative to the work they accomplish, quantifying their impact, and diagnosing the root causes of their inefficiency. Unique processors, memory hierarchies, accelerators, network topologies, and software stacks each require different tool support for measurement and analysis. Effective performance tools for today’s supercomputers require support ranging from hardware to application domain. To date, performance tools have focused on post-mortem analysis of application performance to pinpoint and resolve causes of performance losses. For exascale systems faced with scarce resources (especially power), efficient resource management will require programs, libraries, runtime systems, and the operating system to analyze their own performance on the fly and initiate changes, e.g., migration of work or frequency scaling, to reduce resource consumption or improve utilization. As a result, exascale systems will need new software support to analyze performance measurements on the fly and policies to determine how to react. 
Designing the necessary performance tools’ interfaces for measurement, analysis, and control, as well as the mechanisms to support them is a key ingredient for the success of exascale systems.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

BRC Exhibit Hall

Poster: Predicting Solubility Parameters of Asphaltene Molecular Models Using Molecular Simulations, Mohan Boggara, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Production of crude oil routinely suffers from asphaltene precipitation and deposition in the reservoir and in the wellbore. Such deposition leads to significant operational and economic losses of millions of dollars to the energy industry. To predict and prevent asphaltene deposition, thorough thermo-physical characterization of crude oils is an important endeavor. The current industry-standard thermo-physical characterization of crude oils still suffers from the lack of good predictive models and experiments. Our group focuses on the development and implementation of advanced thermodynamic models and experiments to address the thermo-physical characterization and phase behavior of crude oils. To improve the predictive capabilities of the thermodynamic models, a molecular-level understanding of crude oil mixtures is key. The specific focus of the work presented here is predicting the solubility parameters (SP) of asphaltene models. The SP is one of the key molecular parameters that can be directly related to derived thermodynamic properties as well as to properties such as density and refractive index (RI). Two molecules with close SPs are expected to mix at the molecular level. Previous work in the group has shown a simple relationship between RI and density (the one-third rule) that allows calculation of either property at any temperature and pressure by measuring their values at standard T&P conditions. Using MD simulations, we will predict the SP for various solvents and validate against experimental data in the literature. More importantly, we will predict SPs for putative asphaltene models and use them as starting points for our in-house experiments measuring solubility parameters (via density and RI measurements) of various crude oils containing asphaltenes.
Overall, this work will serve as a starting point in validating the accuracy of available asphaltene models and in using such validated molecular models to study the dynamics and kinetics of asphaltene precipitation and aggregation.
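The solubility parameter discussed above is, in Hildebrand's definition, the square root of the cohesive energy density. A quick calculation with approximate textbook values for toluene, shown only to illustrate the formula:

```python
import math

# Hildebrand solubility parameter: delta = sqrt((dHvap - RT) / Vm),
# the square root of the cohesive energy density. Values below are
# approximate textbook numbers for toluene at 298 K.
R = 8.314          # gas constant, J/(mol K)
T = 298.0          # temperature, K
dHvap = 38.0e3     # heat of vaporization, J/mol
Vm = 106.3e-6      # molar volume, m^3/mol

delta = math.sqrt((dHvap - R * T) / Vm)   # in Pa^0.5
print(round(delta / 1000, 1), "MPa^0.5")  # toluene is ~18 MPa^0.5
```

In the MD setting, the cohesive energy in the numerator is obtained from the simulated energies of the bulk liquid and the isolated molecule, which is how SP predictions for putative asphaltene models would be produced.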

BRC Exhibit Hall

Poster: Predictive theory of nanocarbon growth: doping, defects, chirality, Vasilii Artyukhov, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

We present our “nanoreactor” model for the kinetics of CVD graphene growth that bridges first-principles atomistic calculations and crystal growth theory. The model explains numerous experimental features such as hexagonal graphene island shapes and absence of lattice defects. It elucidates the roles of metal catalyst in shaping the lattice of graphene [1]. We demonstrate how the original model can be extended to include other chemical species by studying the effect of B-, N-, S-doping on the growth and formation of defects [2]. Our theory is validated by atomistic Monte Carlo simulations of growth of graphene islands. These simulations are further used to study the formation of grain boundaries during coalescence of misoriented islands, for which we uncover and explain the transition between straight-line and wiggling grain boundary shapes [3]. Finally, to elucidate how the energetics of carbon nanotubes may determine the chirality at nucleation, we undertake large-scale calculations of all possible nanotube caps across the whole chiral-angle range, obeying the isolated-pentagon rule. We confirm that the intrinsic energies of the caps are almost chirality-independent, leaving open possibilities for different chirality control strategies [4]. 1. V. I. Artyukhov, Y. Liu, and B. I. Yakobson, PNAS 109, 15136 (2012). 2. V. I. Artyukhov, T.R. Galeev, and B. I. Yakobson, in preparation. 3. K. V. Bets, V. I. Artyukhov, and B. I. Yakobson, in preparation. 4. E. S. Penev, V. I. Artyukhov, and B. I. Yakobson, ACS Nano (in press).

BRC Exhibit Hall

Poster: Sampling Techniques for Boolean Satisfiability, Kuldeep Meel, Rice University

**DOWNLOAD POSTER**

Boolean satisfiability (SAT) has played a key role in diverse areas spanning testing, formal verification, planning, optimization, inferencing, and the like. Apart from the classical problem of checking Boolean satisfiability, the problems of generating satisfying assignments uniformly at random and of counting the total number of satisfying assignments have also attracted significant theoretical and practical interest over the years. Prior work offered heuristic approaches with very weak or no guarantees of performance, and theoretical approaches with proven guarantees but poor performance in practice. We propose a novel approach based on limited-independence hashing that allows us to design algorithms for both problems, with strong theoretical guarantees and scalability extending to thousands of variables. Based on this approach, we present two practical algorithms, UniWit, a near-uniform generator, and ApproxMC, the first scalable approximate model counter, along with reference implementations. Our algorithms work by issuing a polynomial number of calls to a SAT solver. We demonstrate the scalability of our algorithms over a large set of benchmarks arising from different application domains.
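The hashing idea can be illustrated at toy scale: random XOR (parity) constraints split the solution space into roughly equal cells, so the count in one cell times the number of cells estimates the total count. Brute force stands in for the SAT solver here, and the formula and parameters are illustrative:

```python
import itertools, random

random.seed(0)
n = 8
def formula(assign):                      # a single CNF clause: x0 or x1
    return assign[0] or assign[1]

solutions = [a for a in itertools.product([0, 1], repeat=n) if formula(a)]
exact = len(solutions)                    # 192 of the 256 assignments satisfy it

m = 2                                     # number of random XOR constraints
estimates = []
for _ in range(25):
    # Each row: n random coefficients plus a random parity bit.
    xors = [[random.randrange(2) for _ in range(n + 1)] for _ in range(m)]
    cell = [a for a in solutions
            if all(sum(c*v for c, v in zip(row, a)) % 2 == row[n]
                   for row in xors)]
    estimates.append(len(cell) * 2**m)    # scale one cell up by 2^m cells
estimates.sort()
print(exact, estimates[len(estimates)//2])  # median estimate tracks the count
```

The real algorithms replace the brute-force enumeration with SAT-solver calls on the formula conjoined with the XOR constraints, which is what makes the approach scale to thousands of variables.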

**Speakers** *KS*
## Kuldeep S. Meel

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Graduate Student, Rice University

Kuldeep is a PhD student at Rice working with Prof. Moshe Vardi and Prof. Supratik Chakraborty; he obtained his B.Tech. from IIT Bombay in 2012. His research broadly falls into the intersection of program synthesis, computer-aided verification and formal methods. He is the recipient of... Read More →

BRC Exhibit Hall

Poster: SimSQL: A software for large scale Bayesian Machine Learning, Zhuhua Cai, Rice University

**DOWNLOAD POSTER PDF**

This paper describes the SimSQL system, which allows for SQL-based specification, simulation, and querying of database-valued Markov chains, i.e., chains whose value at any time step comprises the contents of an entire database. SimSQL extends the earlier Monte Carlo database system (MCDB), which permitted Monte Carlo simulation of static database-valued random variables. Like MCDB, SimSQL uses user-specified “VG functions” to generate the simulated data values that are the building blocks of a simulated database. The enhanced functionality of SimSQL is enabled by the ability to parametrize VG functions using stochastic tables, so that one stochastic database can be used to parametrize the generation of another stochastic database, which can parametrize another, and so on. Other key extensions include the ability to explicitly define recursive versions of a stochastic table and the ability to execute the simulation in a MapReduce environment. We focus on applying SimSQL to Bayesian machine learning.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

BRC Exhibit Hall

Poster: Solution of the black-oil problem by discontinuous Galerkin methods, Richard Rankin, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Black-oil is a commonly used model for simulating compressible flow of a water-gas-oil system in reservoirs. It is an example of a three-component three-phase flow. The phases are liquid, vapor and aqueous and the components are oil, gas and water. In this model, the gas component can exist in both the liquid and vapor phases. The water component only exists in the aqueous phase and the oil component only exists in the liquid phase. Consequently, the aqueous phase does not exchange mass with the liquid or vapor phases but the liquid and vapor phases can exchange mass. For the black-oil problem, we choose for primary unknowns the pressure of the liquid phase, the saturation of the aqueous phase and the saturation of the vapor phase. The saturation of the liquid phase can be obtained from the saturations of the other two phases by using the fact that the sum of the saturations of the three phases must equal one. The spatial discretization is based on the interior penalty discontinuous Galerkin method. At each time step, the equations for the primary unknowns are solved sequentially but each equation remains nonlinear with respect to its primary unknown. In several numerical examples we test the robustness of the method by varying the physical input data.
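The saturation closure described above is simple but load-bearing: the liquid saturation is recovered from the two primary saturation unknowns because the three phases fill the pore space. A sketch, with hypothetical variable names:

```python
def liquid_saturation(s_aqueous, s_vapor):
    """Recover the liquid-phase saturation from the two primary unknowns,
    using the constraint that the three saturations sum to one."""
    s_liquid = 1.0 - s_aqueous - s_vapor
    assert 0.0 <= s_liquid <= 1.0, "saturations must stay physical"
    return s_liquid

print(liquid_saturation(0.2, 0.3))  # 0.5
```

In the sequential solve, this closure is evaluated after each time step's updates to the aqueous and vapor saturations.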

BRC Exhibit Hall

Poster: Stochastic Approaches for Nonlinear Drillstring Dynamic Analyses, Eleazar Marquez, Rice University

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

Constrained by critical drilling factors such as hydraulic power, weight-on-bit (WOB), drill-bit rotary velocity, and circulating fluids, conventional drilling operations urgently demand alternative, reliable performance-enhancing techniques capable of simultaneously reducing catastrophic events and operational time. Extensive assessments of conventional rotary drilling regarding assembly vibration irregularities, bit wear, fatigue, buckling, whirling, well-bore damage, and equipment failure span the academic and industrial literature, targeting phenomenological characterization as a means to augment the rate of penetration (ROP). Recent investigations report reasonable ROP increases based on vibration-assisted drilling (VAD), in which high-frequency, low-amplitude excitation is transferred into a low-frequency, high-amplitude response by superimposing an axial vibratory source on the drill-string. The proposed rig-suspended dynamical model, subjected to monochromatic deterministic and stochastic excitations and exposed to a variety of material and geometric nonlinearities, captures the response (ROP) for any established downhole condition and formation type upon integrating percussion VAD technology. A two-step process follows: first, proper mathematical representation of the drill-string, the VAD source, and the rock formation, capturing the predominant physical attributes; second, delineation of the appropriate position of the vibratory source within the drill-string, which ensures maximal penetration rates and eliminates tuning of the mass near the natural frequency. Formulating adequate physical parameters for the equation of motion involves finite element techniques, where the flexibility of the drill-string and the elastic characteristics of the well-bore and formation are accounted for along the axial and lateral directions.
Modeling drill-string dynamics nonetheless requires advanced numerical simulation techniques, particularly when integrating the two-thousand-degree-of-freedom oscillator exposed to mass, damping, and stiffness nonlinearities. To synthesize compatible time histories through an adaptation of the Kanai-Tajimi power spectrum, an auto-regressive moving-average (ARMA) filter is constructed. The method of statistical linearization and Monte Carlo simulation are included within the stochastic vibration analyses.
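The Kanai-Tajimi power spectrum mentioned above has the standard form below; the ground-filter parameters `wg`, `zg`, and the intensity `S0` are generic illustrative values, not the study's:

```python
import math

# Kanai-Tajimi PSD: S(w) = S0 * (wg^4 + 4 zg^2 wg^2 w^2)
#                          / ((wg^2 - w^2)^2 + 4 zg^2 wg^2 w^2)
def kanai_tajimi(w, wg=15.0, zg=0.6, S0=1.0):
    num = wg**4 + 4 * zg**2 * wg**2 * w**2
    den = (wg**2 - w**2)**2 + 4 * zg**2 * wg**2 * w**2
    return S0 * num / den

print(kanai_tajimi(0.0))                        # S(0) = S0
print(kanai_tajimi(15.0) > kanai_tajimi(40.0))  # peaks near wg, decays beyond
```

An ARMA filter fitted to this spectrum then turns white-noise input into time histories with the prescribed frequency content, which is the synthesis step the abstract describes.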

BRC Exhibit Hall

Poster: Subsurface extended full waveform inversion, Lei Fu, Rice University

**DOWNLOAD POSTER PDF**

Least-squares waveform inversion has proven capable of reconstructing remarkably detailed models of subsurface structure. Unlike conventional tomography or migration techniques, which only make use of a specific portion of the seismic data, seismic waveform inversion takes into account essentially any physics of seismic wave propagation that can be modelled. However, without extremely low-frequency data or a good initial model that contains the long-scale structure information, seismic waveform inversion is very likely to be trapped by numerous spurious local minima. The extended modelling concept combines the global convergence of migration velocity analysis with the physical fidelity of waveform inversion by adding an additional dimension of freedom. Using Claerbout's survey-sinking concept, this study implements depth-oriented extended inversion by introducing a subsurface shift in the imaging condition. Synthetic experiments demonstrate that depth-oriented extended waveform inversion can overcome the local-minima problem and successfully estimate the true velocity model from conventional seismic field data.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

BRC Exhibit Hall

Poster: The Argonne Leadership Computing Facility and High Frequency Physics-Based Earthquake System Simulations, David Martin, Argonne National Laboratory

The Argonne Leadership Computing Facility (ALCF) is a supercomputing user facility supported by the U.S. Department of Energy (DOE). The ALCF provides the computational science community with a world-class computing capability dedicated to breakthrough science and engineering. Over 5 billion core hours on Mira, ALCF’s 10-petaflops Blue Gene/Q supercomputer, are made available to peer-reviewed projects, including explorations into renewable energy, studies of the effects of global climate change, and efforts to unravel the origins of the universe. Collaborators have access to a full range of services and support. ALCF offers expertise in novel computational methods and algorithms, application porting, performance tuning and scaling, petascale system management, and high-performance analysis and visualization.

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC

BRC

Poster: Vapor formation around heated nanoparticles: a molecular dynamic study, Vikram Kulkarni, Rice University

DOWNLOAD POSTER PDF

Strongly heated nanoparticles have many applications ranging from cancer therapy to efficient steam generation. Gold nanoparticles may be resonantly heated when exposed to light due to the excitation of surface plasmons. Here we simulate the thermodynamics of heat transfer from a gold nanoparticle into water. This is accomplished using molecular dynamics, a large-scale brute-force computational technique capable of simulating the motions of millions of atoms. We study nanoparticles of experimentally realistic size, ranging from 17 to 26 nm and containing over a hundred thousand gold atoms immersed in millions of water molecules. We show the conditions required for the formation of a vapor bubble around the nanoparticle, including the threshold laser power and the critical size of the particle. We show explicitly that small nanoparticles may be heated to the melting point without the formation of a surrounding bubble. However, for larger nanoparticles, a pronounced bubble is seen around the particle. Our results are compared to the well-known heat transfer equation, which is known to break down at the nanoscale due to the formation of interfacial thermal barriers. Our work is of vital importance for scientists and engineers who wish to utilize hot nanoparticles for changing the surrounding environment, whether for irradiating a tumor cell or to produce vapor bubbles for enhanced steam generation.
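The continuum prediction the simulations are compared against is the steady-state temperature field around a uniformly heated sphere, T(r) = T_inf + Q/(4πκr) for r ≥ R. The particle size and absorbed power below are illustrative, and, as the abstract notes, interfacial thermal barriers make this law break down at the nanoscale:

```python
import math

def temperature(r, Q=1e-6, k=0.6, T_inf=300.0):
    """Continuum steady-state temperature at distance r from a heated sphere.
    Q: absorbed power (W); k: thermal conductivity of water (W/m/K)."""
    return T_inf + Q / (4 * math.pi * k * r)

R = 10e-9                                   # 10 nm particle radius (illustrative)
surface_rise = temperature(R) - 300.0       # temperature rise at the surface
print(round(surface_rise, 1), "K at the particle surface")
```

Deviations of the simulated interface temperature from this 1/r profile are precisely the signature of the Kapitza-type interfacial thermal resistance the study probes.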

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall*Rice University 6500 Main Street at University, Houston, TX 77030*

BRC Exhibit Hall

Poster: Waveform inversion with source coordinate model extension, Yin Huang, Rice University

**Speakers**
## Yin Huang

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, Rice University, 6500 Main Street at University, Houston, TX 77030

Extended full waveform inversion combines ideas from seismic inversion and migration velocity analysis using the extended model concept. In linearized extended full waveform inversion (LEFWI), the extended model is separated into a smooth background velocity (physical) and a short-scale reflectivity (extended). Minimizing over the reflectivity yields a reduced objective function in the background velocity, which can then be minimized over velocity. We review this method and show numerical results for the source coordinate model extension. The adjoint state method is used to compute derivatives. If the adjoint relation is not satisfied, the optimization may converge slowly or even fail to converge, so testing the adjoint relation is crucial to a successful implementation. I will present a method to compute the derivative of a generalized time-step function and its adjoint using the automatic differentiation tool TAPENADE, and then show test results for the constant-density acoustic wave equation.
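The adjoint test referred to above is commonly carried out as a dot-product test: for a linear operator A and its claimed adjoint Aᵀ, one checks ⟨Ax, y⟩ = ⟨x, Aᵀy⟩ for random vectors x and y. The sketch below uses a small explicit matrix as a stand-in for the derivative of a time-step function; the function names are illustrative, not from the poster's implementation.

```python
import random

def apply(A, x):
    """Forward map: y = A x, with A stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def apply_adjoint(A, y):
    """Adjoint map: x = A^T y."""
    n = len(A[0])
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def adjoint_test(A, m, n, trials=5, tol=1e-10):
    """Dot-product test: <A x, y> should equal <x, A^T y> for random x, y."""
    for _ in range(trials):
        x = [random.gauss(0, 1) for _ in range(n)]
        y = [random.gauss(0, 1) for _ in range(m)]
        lhs = dot(apply(A, x), y)
        rhs = dot(x, apply_adjoint(A, y))
        if abs(lhs - rhs) > tol * max(1.0, abs(lhs)):
            return False
    return True

m, n = 4, 3
A = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
print(adjoint_test(A, m, n))
```

In a real waveform-inversion code, `apply` and `apply_adjoint` would be the linearized forward map and its adjoint (e.g., generated by TAPENADE); a failed dot-product test is a quick, cheap signal that the adjoint is inconsistent before any optimization is attempted.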

CAAM, Rice University, 6100 Main St, MS-134, Houston, TX

I am a fourth-year PhD student in the Computational and Applied Mathematics Department at Rice University, doing research on extended waveform inversion, imaging, and high performance computing under the supervision of Dr. William Symes.
Relevant coursework includes... Read More →

BRC Exhibit Hall

Poster: Write Aside Persistence (WrAP) for Storage Class Memory, Ellis Giles, Rice University

**Speakers**
## Ellis Giles

Thursday March 6, 2014 4:30pm - 6:30pm PST

BRC Exhibit Hall, Rice University, 6500 Main Street at University, Houston, TX 77030

Emerging memory technologies like Phase Change Memory or Memristors (generically called SCM, or Storage Class Memory) combine the ability to access data at byte granularity with the persistence of storage devices like hard disks or SSDs. With SCM, application developers can focus on a single storage abstraction rather than having to deal with both byte/word-grained accesses to DRAM locations and block-based accesses to file/disk ranges. By accessing data directly at SCM addresses instead of through slow block I/O operations, developers can gain one to two orders of magnitude in performance. However, this unification of storage into a single directly accessed persistent storage memory tier is a mixed blessing, as it pushes upon developers the burden of ensuring that SCM stores are ordered correctly, flushed from processor caches, and, if interrupted by sudden machine stoppage, do not leave objects in SCM in inconsistent states. The complexity of ensuring properly ordered and all-or-nothing updates raises significant reliability and programmability challenges. We propose a solution called Write Aside Persistence, or WrAP, that provides durability and consistency for SCM writes while ensuring fast paths to data in processor caches, DRAM, and persistent memory tiers. WrAP is presented both as a software/hardware architecture and as a software-only approach. Simulations of transactional data structures, such as the Graph 500 benchmark and Standard Template Library tests, indicate the potential for significant performance gains using Write Aside Persistence for atomic and durable writes to Storage Class Memory.
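The all-or-nothing update problem described above is typically solved by writing "aside" to a log that is made durable before any home location is touched. The toy sketch below simulates that idea in plain Python; the class and method names are illustrative assumptions, not WrAP's actual design, and the flush/fence machinery a real SCM system needs is reduced to comments.

```python
class WriteAsideSketch:
    """Toy software-only sketch of a write-aside log: transactional updates
    are first recorded in a log, and only after the log is committed are
    they copied to their home SCM locations."""

    def __init__(self):
        self.scm = {}        # home locations in persistent memory
        self.log = []        # write-aside log of (address, value) records
        self.committed = False

    def tx_write(self, addr, value):
        self.log.append((addr, value))   # written aside, not in place

    def commit(self):
        # In a real system this is where cache lines are flushed and a
        # fence plus commit record make the log durable before any
        # home location is updated.
        self.committed = True
        for addr, value in self.log:
            self.scm[addr] = value       # propagate to home locations
        self.log.clear()

    def recover(self):
        # After a crash, an uncommitted log is simply discarded, so the
        # home locations never expose a partial update.
        if not self.committed:
            self.log.clear()

w = WriteAsideSketch()
w.tx_write("x", 1)
w.tx_write("y", 2)
w.recover()            # simulated crash before commit: no partial state
print(w.scm)           # {}
w.tx_write("x", 1)
w.tx_write("y", 2)
w.commit()
print(w.scm)           # {'x': 1, 'y': 2}
```

The point of the sketch is the ordering invariant: home locations change only after the log is durably committed, so a crash at any intermediate point leaves objects in SCM either fully updated or untouched.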

Doctoral Student, Rice University

I am interested in high performance and distributed computing. I am also interested in emerging and exciting technologies. For fun I enjoy scuba diving, model rocketry, and working on classic automobiles.

BRC Exhibit Hall