**Projects**

● **Application of Novel Computational Chemistry Methods for Discovering Energy-Related Materials**

**Investigator**: Prof. Abhijit Chatterjee, Chemical Engineering, IIT Bombay

**Summary:** These studies aim to unravel the processes occurring at atomic and mesoscales, with the goal of discovering new materials that make energy devices longer-lasting, cheaper and more efficient. Recently, we have developed a new computational methodology for building accurate models. At the heart of this methodology lies the use of massively parallel molecular dynamics simulations. Given the promising developments in GPU computing in recent years and the linear scaling performance we observe with our in-house codes, we are now using CPU+GPU computing for our high-performance computational needs. GPU computing helps to accelerate our research efforts.

● **Development of CUDA-based Methods for Polynomial Global Optimization using the Bernstein Polynomial Approach**

**Investigator**: Prof. PSV Nataraj

**Summary**: We propose GPU parallelization of the Bernstein algorithm and its variants.

Bernstein polynomials are well suited to polynomial optimization due to their range enclosure property. The method suffers from the curse of dimensionality: for higher-dimensional and higher-degree polynomials, the number of Bernstein coefficients to be computed increases exponentially. Use of GPUs (CUDA) gave a speedup of up to 85x.
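The range enclosure property mentioned above can be illustrated with a minimal Python sketch for a single-variable polynomial on [0, 1]. This is an illustration only, not the project's CUDA code; the function names `bernstein_coefficients`, `range_enclosure` and `coefficient_count` are hypothetical. The smallest and largest Bernstein coefficients bound the polynomial's range, and the count of coefficients for a polynomial of the same degree in every variable shows why the cost grows so quickly with dimension.

```python
from math import comb


def bernstein_coefficients(a):
    """Bernstein coefficients on [0, 1] of p(x) = sum(a[i] * x**i)."""
    n = len(a) - 1
    return [
        sum(comb(j, i) / comb(n, i) * a[i] for i in range(j + 1))
        for j in range(n + 1)
    ]


def range_enclosure(a):
    """Return (lower, upper) bounds on p(x) over [0, 1].

    The bounds are the minimum and maximum Bernstein coefficients;
    they always contain the true range but may overestimate it.
    """
    b = bernstein_coefficients(a)
    return min(b), max(b)


def coefficient_count(degree, num_variables):
    """Number of Bernstein coefficients when every variable has this degree."""
    return (degree + 1) ** num_variables


# Example: p(x) = x**2 - x, given as power-basis coefficients [a0, a1, a2].
lower, upper = range_enclosure([0.0, -1.0, 1.0])
print(lower, upper)
print(coefficient_count(3, 10))
```

For p(x) = x² − x the enclosure is [−0.5, 0], which contains the true range [−0.25, 0]. Each coefficient can be computed independently of the others, which is one reason this kind of computation is a natural candidate for GPU parallelization.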

● **Solution of Maxwell's Equations using the FDTD Technique for Large Electromagnetic Scattering**

**Investigator:** Prof. G R Shevare, IIT Bombay

**Summary:** Application of an algorithm for computing electromagnetic field components using the FDTD method. A serial code for electromagnetic scattering computation was written in C. The code is parallelized using OpenMP, MPI and GPU (CUDA), and performance is compared for various mesh sizes and core/thread availability.

While OpenMP and MPI provide a linear scale-up, the use of GPU (CUDA) gave a scale-up of about 60x.

**Other projects of Prof. Shevare**: Porting of a density-based CFD solver on unstructured grids from ‘MPI-OpenMP’ to ‘MPI-CUDA’; simulation of the environment in a large city using CFD on GPUs.
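The FDTD method named in this project's summary can be sketched in one dimension as follows. This is a minimal illustration in normalized units, not the project's C, OpenMP, MPI or CUDA code; the function name `fdtd_1d`, the Gaussian source and all default parameters are assumptions made for the example. The electric and magnetic fields are updated in alternation, each from the spatial difference of the other.

```python
import math


def fdtd_1d(num_cells=200, num_steps=150, source_cell=100, courant=0.5):
    """Advance a 1D FDTD (Yee) scheme in normalized units.

    ez and hy are the electric and magnetic field samples. The Courant
    number must not exceed 1 for the 1D scheme to remain stable.
    """
    ez = [0.0] * num_cells
    hy = [0.0] * num_cells
    for step in range(num_steps):
        # Magnetic field update from the spatial difference of ez.
        for k in range(num_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Electric field update from the spatial difference of hy.
        for k in range(1, num_cells):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Additive Gaussian pulse source (assumed shape, for illustration).
        ez[source_cell] += math.exp(-(((step - 30.0) / 10.0) ** 2))
    return ez, hy


ez, hy = fdtd_1d()
print(max(abs(v) for v in ez))
```

Within each half-step, every cell is updated only from values computed in the previous half-step, so the inner loops have no dependencies between iterations. That independence is what makes the scheme straightforward to distribute across threads, processes or GPU threads. The end cells are simply never updated here, which acts as a reflecting boundary; a scattering code would normally use an absorbing boundary instead.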

● **Circuit Simulation Acceleration on Parallel Computing Platforms**

**Investigator**: Prof. Sachin Patkar et al.

**Summary**: Uses a random walk approach for solving power grid circuits. Implemented a circuit of 25 million nodes on GPU, clocking 4 GFLOPS. GPU Platform Used: Nvidia Tesla K40c. Paper submitted to the International Parallel & Distributed Processing Symposium, 2015. Future scope involves targeting circuits of 250 million nodes, requiring a high level of memory optimization, improved analysis, and better profiling of the code written for GPU.
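One common way to formulate a random walk solver for a resistive power grid is sketched below in Python. Whether the project uses exactly this formulation is an assumption; the function name `random_walk_voltage`, the data layout and the five-node test chain are hypothetical. A walker starts at the node of interest, pays a cost of (load current / total conductance) at every node it visits, moves to a neighbor with probability proportional to the connecting conductance, and collects the pad voltage when it reaches a supply pad. The average net gain over many walks estimates the node voltage.

```python
import random


def random_walk_voltage(start, neighbors, pad_voltage, load_current,
                        num_walks=20000, seed=0):
    """Monte Carlo estimate of the voltage at `start`.

    neighbors:    {node: [(neighbor, conductance), ...]} for non-pad nodes
    pad_voltage:  {node: voltage} for supply pads, where walks terminate
    load_current: {node: current drawn at that node}
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_walks):
        node = start
        gain = 0.0
        while node not in pad_voltage:
            edges = neighbors[node]
            next_nodes = [n for n, _ in edges]
            conductances = [g for _, g in edges]
            # Cost paid at this node: load current over total conductance.
            gain -= load_current.get(node, 0.0) / sum(conductances)
            # Move to a neighbor with probability proportional to conductance.
            node = rng.choices(next_nodes, weights=conductances)[0]
        total += gain + pad_voltage[node]
    return total / num_walks


# Hypothetical chain 0-1-2-3-4 with 1 V pads at both ends, unit
# conductances, and a 0.1 A load at each internal node.
chain = {1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0), (3, 1.0)], 3: [(2, 1.0), (4, 1.0)]}
pads = {0: 1.0, 4: 1.0}
loads = {1: 0.1, 2: 0.1, 3: 0.1}
print(random_walk_voltage(2, chain, pads, loads))
```

For this chain the exact nodal solution is 0.85 V at nodes 1 and 3 and 0.80 V at node 2, so the estimate can be checked directly. Individual walks are independent of one another, which is what makes the approach attractive for massively parallel hardware.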

● **Synthetic Aperture Radar (SAR) Image Processing using GPU**

**Investigator**: Prof. Sachin Patkar et al.

**Summary**: SAR is used to create fine-resolution images of landscapes. The objective is to accelerate the large number of computations involved in the Range Migration algorithm on SAR images. The GPU implementation achieved a 10x speedup compared to its sequential counterpart. Future work involves using a multi-GPU system to process much larger real satellite SAR data efficiently. GPU Platform Used: Nvidia Fermi GT635M, Tesla C2070 and Kepler K40.

● **GPU implementation of particle filter based object tracking algorithm**

**Investigator**: Prof. Sachin Patkar et al.

**Summary**: A particle filter based object tracking algorithm is implemented on an NVIDIA GTX560Ti GPU using OpenCV and CUDA. A 3D RGB histogram is used as the object feature. A frame rate of 880 fps is obtained for object tracking using color image sequences. Paper accepted at VLSI Design Conference 2015. GPU Platform Used: Nvidia GTX560Ti.

● **GPU implementation of LDPC decoding**

**Investigator**: Prof. Sachin Patkar et al.

**Summary**: Projective geometry based LDPC decoding is under implementation. The sum-product algorithm has been used for 100,000 data sequences of codeword length 1052, and it has been tested for 18 different SNR values. A code rate of 4.8 Mbps is achieved with the current implementation. GPU Platform Used: Nvidia K40 GPU.

● **Point Relaxation based Circuit Simulator using GPUs**

**Investigator**: Prof. Sachin Patkar et al.

**Summary**: The BREMICS simulator uses a point relaxation technique with the Gauss-Jacobi method. Using this simulator, we have performed DC and transient analysis on circuits with different network topologies and compared the results with a CPU implementation. Different types of circuits, such as RC circuits and CMOS inverters, are simulated. On the Fermi generation, an 11x speedup over the CPU was achieved with the same accuracy. A new partitioning scheme reduces the number of Gauss-Jacobi iterations. This work is done on Kepler and next-generation GPU architectures.
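The Gauss-Jacobi iteration at the core of point relaxation can be sketched for a small linear DC problem as follows. This is a minimal Python illustration and says nothing about how BREMICS itself is structured; the function name `gauss_jacobi` and the three-node example system are assumptions. Each unknown is recomputed from the previous iterate of all the other unknowns, so every node can be updated independently within an iteration.

```python
def gauss_jacobi(matrix, rhs, tol=1e-10, max_iters=10000):
    """Solve matrix * x = rhs by Gauss-Jacobi iteration.

    Returns (solution, iterations). Convergence is guaranteed for
    strictly diagonally dominant systems, among others.
    """
    n = len(rhs)
    x = [0.0] * n
    for iteration in range(1, max_iters + 1):
        # Every entry uses only values from the previous iterate, so the
        # n updates are independent and could run in parallel.
        x_new = [
            (rhs[i] - sum(matrix[i][j] * x[j] for j in range(n) if j != i))
            / matrix[i][i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new, iteration
        x = x_new
    raise RuntimeError("Gauss-Jacobi did not converge")


# Hypothetical nodal equations for three internal nodes of a resistor
# chain with 1 V sources at both ends and a 0.1 A load at each node.
conductance_matrix = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
currents = [0.9, -0.1, 0.9]
voltages, iterations = gauss_jacobi(conductance_matrix, currents)
print(voltages, iterations)
```

The example system has the exact solution 0.85 V, 0.80 V and 0.85 V. The number of iterations depends strongly on how tightly the nodes are coupled, which is consistent with the summary's remark that a partitioning scheme can reduce the iteration count.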

● **GPU based parallelization of 3D Euler solver**

**Investigator**: Prof. Sachin Patkar et al.

**Summary**: Investigation of parallelism in the Runge-Kutta based time integration approach in the finite volume formulation of the 3D Euler equations for compressible flow using GPUs. Three codes are to be parallelized on GPU hardware, with initial work for all three on the Fermi architecture. The work involves domain decomposition techniques to divide the problem so that it can run in a multi-GPU environment. Several advanced GPU techniques will be applied: CUDA streams, instruction reordering, avoiding branches in code, and controlling register spills.

● **Development of a Parallel Finite Element Code**

**Investigators:** Prof. Seshu and Prof. S. Kulkarni

**Summary:** Activities have been initiated with the broad aim of eventually developing a parallel finite element code library. The finite element method is one of the most popular tools for computational mechanics. Recent interest in very large finite element analysis problems highlights the need for efficient parallelization strategies. One current focus is on efficient solution techniques, in terms of both computational time and memory requirements, for the sets of equations that arise in the finite element method. To this end, the authors are looking at element-by-element finite element analysis, which manages the memory requirements by avoiding storage of the complete system matrix.
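The element-by-element idea described above can be sketched for a 1D bar with two-node elements. This is a minimal Python illustration, not the authors' library; the function names `ebe_matvec` and `assembled_matrix` and the uniform stiffness are assumptions. The product of the stiffness matrix with a vector, which is the main operation inside iterative solvers, is accumulated from the 2x2 element matrices without ever forming the global matrix.

```python
def element_stiffness(stiffness):
    """2x2 stiffness matrix of a two-node bar element (EA/L = stiffness)."""
    return [[stiffness, -stiffness], [-stiffness, stiffness]]


def ebe_matvec(num_elements, stiffness, x):
    """Compute y = K x element by element, without storing K."""
    y = [0.0] * (num_elements + 1)
    ke = element_stiffness(stiffness)
    for e in range(num_elements):
        dofs = (e, e + 1)  # global node numbers of element e
        for a in range(2):
            for b in range(2):
                y[dofs[a]] += ke[a][b] * x[dofs[b]]
    return y


def assembled_matrix(num_elements, stiffness):
    """Full global matrix, built only to check the result above."""
    n = num_elements + 1
    k = [[0.0] * n for _ in range(n)]
    ke = element_stiffness(stiffness)
    for e in range(num_elements):
        for a in range(2):
            for b in range(2):
                k[e + a][e + b] += ke[a][b]
    return k


x = [0.0, 1.0, 4.0, 9.0]
print(ebe_matvec(3, 2.0, x))
```

The element-by-element product needs storage proportional to the number of elements, whereas the dense global matrix in `assembled_matrix` grows with the square of the number of nodes. In a parallel setting, elements that share a node write to the same entry of `y`, so the accumulation has to be coordinated, for example by coloring the elements or by using atomic additions.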

● **Accelerating Relational Database query processing on GPUs**

**Investigator**: Prof. P S Dhabe, Research Scholar, IIT Bombay

**Summary**: The amount of data that needs to be handled is growing day by day. A Relational Database Management System (DBMS) is used to handle bulk data and to find the exact data that satisfies a given condition. A database query is a statement requesting data from the database. Query processing on voluminous data is time-consuming on a sequential machine, because it requires a considerable amount of searching through the data. The results show that DBMS query processing can be accelerated by 20x to 70x on GPUs.

● **Development of efficient Galerkin method for modern computer architectures**

**Investigator:** Prof. Shiva Gopalakrishnan

● **GPU parallelization of fuzzy hyperline segment neural network**

**Investigator:** Prof. P S Dhabe, Research Scholar, IIT Bombay

● **Efficient parallelization of cryptographic applications using Nvidia GPUs as well as multicores**

**Investigators:** Vibhor Agrawal, Prof. Bernard Menezes

● **GPGPU Based Acceleration of Tsunami Simulation**

**Investigators:** Praveen Singariya, Prof. Virendra Singh

● **GPU server for running OpenFOAM jobs using HPC-GPU solvers**

**Investigators:** Gupte Aditya, Prof. A M Pradeep

● **Using the profiler (nvvp) to better understand hotspots and find ways to reduce them**

**Investigators:** Bangera Vivek, Prof. Sachin Patkar

● **Studying the behaviour of Neural Networks and implementing learning algorithms using GPUs**

**Investigators:** Bansal Yamini, Prof. Sachin Patkar

● **Using spiking neural networks for learning and comparing them to other more conventional algorithms for machine learning**

**Investigators:** Santurkar Shibani, Prof. Sachin Patkar

● **The Method of Four Russians for Multiplication (M4RM) proves to be one of the most efficient algorithms to compute dense matrix multiplications over binary fields**

**Investigators:** Dhas David, Prof. Sachin Patkar

● **Hardware accelerators for circuit design**

**Investigators:** Thomas Jeebu, Prof. Sachin Patkar

● **Parallelization of Pollard-Rho**

**Investigators:** Rajlaxmi, Prof. Bernard Menezes

● **Molecular dynamics and simulations on biological/chemical systems**

**Investigators:** Chandan Patel, Prof. R. B. Sunoj

● **Transient heat transfer analysis of a fire-resistant safe**

**Investigators:** Anil Jaiswal, Prof. S. L. Bapat

● **Development of high performance satellite data analysis for disaster management**

**Investigators:** Ujwala Bhangale, Prof. Surya Durbha

**Doctoral theses related to GPUs**

● **Dhabe P. S., A new approach to global optimization based on Bernstein Polynomials and GPU computing, IIT Bombay** (2010, continuing)

*Thesis Supervisor:* Prof. P. S. V. Nataraj

● **J. Nayak, An Interval arithmetic library for GPU computing, IIT Bombay** (2011, continuing)

*Thesis Supervisor:* Prof. P. S. V. Nataraj

● **S. Unnikrishnan, Modeling, Simulation, and Control of Boilers using CUDA, IIT Bombay** (2011, continuing)

*Thesis Supervisor:* Prof. P. S. V. Nataraj

**Master theses related to GPUs**

● Ahmed L., **Speeding up Tsunami Waves Propagation in ocean using GPU, IIT Bombay**

*Thesis Supervisor:* Prof. Virendra Singh (Completed)

● Deshpande Varadendra Ravindra, **Implementation & Parallelization of FDTD code for Electromagnetic Scattering**

*Thesis supervisor:* Prof. Shevare (Completed)

● Jeebu Thomas, **Design of relaxation based Circuit Simulator for both linear and nonlinear devices, using NVIDIA CUDA kernels and benchmarking performance against the conventional simulators**

*Thesis supervisor:* Prof. S. Patkar