Research in 2014


●      Application of Novel Computational Chemistry Methods for discovering energy related materials

Investigator​: Prof. Abhijit Chatterjee, Chemical Engg, IITB

Summary: Studies aimed at unravelling the processes that occur at atomic and mesoscales, with the goal of discovering new materials that will make energy devices longer-lasting, cheaper, and more efficient. Recently, we have developed a new computational methodology for building accurate models. At the heart of this methodology lies the use of massively parallel molecular dynamics simulations. Given the promising developments in GPU computing in recent years and the linear scaling performance we observe with our in-house codes, we are now using CPU+GPU computing for our high-performance computational needs, which helps accelerate our research efforts.


●      Development of CUDA-based methods for Polynomial Global Optimization using the Bernstein Polynomial Approach

Investigator​: Prof. PSV Nataraj

Summary: We propose GPU parallelization of the Bernstein algorithm and its variants. Bernstein polynomials are well suited to polynomial optimization due to their range enclosure property. The method, however, suffers from the curse of dimensionality: for higher-dimensional and higher-degree polynomials, the number of Bernstein coefficients to be computed increases exponentially. Use of GPUs (CUDA) gave a speedup of up to 85x.
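The range enclosure property can be illustrated with a minimal univariate sketch in Python. It is not the project's CUDA code; the function names are hypothetical, and the domain is assumed to be [0, 1]. For a polynomial with power-basis coefficients a_0..a_n, the Bernstein coefficients are b_j = sum over i <= j of C(j, i) / C(n, i) * a_i, and the polynomial's values on [0, 1] lie between min(b) and max(b). In the tensor-product multivariate case with l variables of degree n each, there are (n + 1)^l such coefficients, which is the exponential growth mentioned above.

```python
from fractions import Fraction
from math import comb


def bernstein_coefficients(a):
    """Bernstein coefficients on [0, 1] of p(x) = sum(a[i] * x**i)."""
    n = len(a) - 1
    return [
        sum(Fraction(comb(j, i), comb(n, i)) * a[i] for i in range(j + 1))
        for j in range(n + 1)
    ]


def range_enclosure(a):
    """Interval [min b_j, max b_j] that encloses p(x) for all x in [0, 1]."""
    b = bernstein_coefficients(a)
    return min(b), max(b)


# Illustrative polynomial: p(x) = x**2 - x, whose true range on [0, 1] is [-1/4, 0].
lower, upper = range_enclosure([0, -1, 1])
print(lower, upper)  # enclosure [-1/2, 0] contains the true range
```

The enclosure is generally wider than the true range; Bernstein-based optimizers tighten it by subdividing the domain, which is where most of the coefficient computations arise.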


●      Solution of Maxwell's Equations using the FDTD technique for large Electromagnetic Scattering

Investigator:​ Prof. G R Shevare, IIT Bombay

Summary: An algorithm for computing electromagnetic field components using the FDTD method is applied. A serial C code for electromagnetic scattering computation is parallelized using OpenMP, MPI, and GPU (CUDA), and performance is compared for various mesh sizes and core/thread availability.

While OpenMP and MPI provide a linear scale-up, the use of GPU (CUDA) gave a scale-up of about 60x.

Other projects of Prof. Shevare: Porting of a density-based CFD solver on unstructured grids from 'MPI-OpenMP' to 'MPI-CUDA'; simulation of the environment in a large city using CFD on GPUs.
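As a rough illustration of the FDTD field update named in the summary above, here is a minimal one-dimensional sketch in Python. It is not the project's C/CUDA code. It assumes normalized fields, a uniform grid, and a Courant number of 0.5; the function name `fdtd_1d` and all parameters are hypothetical. Each field sample is updated only from its immediate neighbours, which is what makes the scheme amenable to OpenMP, MPI, and GPU parallelization.

```python
import math


def fdtd_1d(ez, hy, steps, courant=0.5):
    """Advance normalized 1D fields in place with the leapfrog Yee update."""
    n = len(ez)
    for _ in range(steps):
        # The magnetic field sits between electric-field nodes (staggered grid).
        for k in range(n - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # The end nodes ez[0] and ez[n - 1] are never updated (fixed walls).
        for k in range(1, n - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
    return ez, hy


# Illustrative setup: a Gaussian pulse in the middle of the grid.
n, centre = 201, 100
ez = [math.exp(-((k - centre) ** 2) / (2 * 5.0 ** 2)) for k in range(n)]
hy = [0.0] * (n - 1)
fdtd_1d(ez, hy, steps=40)
# The pulse splits into two halves travelling in opposite directions.
```

In a GPU version, each inner loop would typically become a kernel with one thread per grid point, since the updates within a loop are independent of one another.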


●      Circuit Simulation Acceleration on Parallel Computing Platforms

Investigator​: Prof. Sachin Patkar et al. 

Summary: Uses a random walk approach for solving power grid circuits. Implemented a circuit of 25 million nodes on GPU, clocking 4 GFLOPS. Paper submitted to the International Parallel & Distributed Processing Symposium, 2015. Future scope involves targeting circuits of 250 million nodes, requiring a high level of memory optimization, improved analysis, and better profiling of the code written for GPU. GPU Platform Used: Nvidia Tesla K40c.
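The random walk idea can be sketched in a few lines of Python. This is the generic textbook formulation for a resistive grid and not necessarily the variant used in the project; the function name, data layout, and the tiny example circuit are all hypothetical. A walker starting at the node of interest pays a cost proportional to the local load current at every step, moves to a neighbour with probability proportional to the connecting conductance, and collects the pad voltage when it reaches a supply pad. The average over many walks estimates the node voltage.

```python
import random


def random_walk_voltage(start, neighbors, load, pad_voltage, walks=20000, seed=0):
    """Monte Carlo estimate of the DC voltage at `start` in a resistive grid.

    neighbors:   node -> list of (neighbour, conductance) pairs
    load:        node -> current drawn from that node
    pad_voltage: node -> fixed supply voltage (walks end here)
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walks):
        node, reward = start, 0.0
        while node not in pad_voltage:
            nbrs, conds = zip(*neighbors[node])
            # Cost paid at each visit: load current over total conductance.
            reward -= load.get(node, 0.0) / sum(conds)
            node = rng.choices(nbrs, weights=conds)[0]
        total += reward + pad_voltage[node]
    return total / walks


# Hypothetical chain: pad 0 (1.0 V) - node 1 - node 2, unit conductances,
# with node 2 drawing 0.1 A. The exact nodal solution at node 2 is 0.8 V.
grid = {1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0)]}
v2 = random_walk_voltage(2, grid, load={2: 0.1}, pad_voltage={0: 1.0})
```

Because the walks are independent of one another, they map naturally onto many GPU threads, at the price of statistical rather than exact accuracy.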


●      Synthetic Aperture Radar (SAR) Image Processing using GPU

Investigator​: Prof. Sachin Patkar et al. 

Summary: SAR is used to create fine-resolution images of landscapes. The objective is to accelerate the large number of computations involved in the Range Migration algorithm on SAR images. The GPU implementation achieved a 10x speedup compared to its sequential counterpart. Future work involves using a multi-GPU system to process much larger real satellite SAR data efficiently. GPU Platform Used: Nvidia Fermi GT635M, Tesla C2070 and Kepler K40.


●      GPU implementation of particle filter based object tracking algorithm

Investigator​: Prof. Sachin Patkar et al. 

Summary: A particle filter based object tracking algorithm is implemented on an NVIDIA GTX560Ti GPU using OpenCV and CUDA. A 3D RGB histogram is used as the object feature. A frame rate of 880 fps is obtained for object tracking using color image sequences. Paper accepted at VLSI Design Conference 2015. GPU Platform Used: Nvidia GTX560Ti.


●      GPU implementation of LDPC decoding

Investigator​: Prof. Sachin Patkar et al. 

Summary: Projective geometry based LDPC decoding is under implementation. The sum-product algorithm has been used for 100000 data sequences of codeword length 1052. It has been tested for 18 different SNR values. A data rate of 4.8 Mbps is achieved with the current implementation. GPU Platform Used: Nvidia K40 GPU.


●      Point Relaxation based Circuit Simulator using GPUs

Investigator​: Prof. Sachin Patkar et al. 

Summary: The BREMICS simulator uses a point relaxation technique with the Gauss-Jacobi method. Using this simulator, we have performed DC and transient analysis on circuits with different network topologies and compared the results with a CPU implementation. Different types of circuits, such as RC circuits and CMOS inverters, are simulated. On the Fermi generation, an 11x speedup over the CPU was achieved with the same accuracy. A new partitioning scheme reduces the number of Gauss-Jacobi iterations. This work is done on Kepler and next-generation GPU architectures.
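A minimal Python sketch of Gauss-Jacobi point relaxation for the DC case follows. It is not BREMICS code; the function name, tolerance, and example circuit are assumptions made for illustration. Each node voltage is recomputed from the previous iterate of its neighbours only, so all node updates within one sweep are independent, which is the property a GPU implementation can exploit.

```python
def gauss_jacobi(G, rhs, tol=1e-10, max_iter=10000):
    """Solve the nodal equations G v = rhs by Gauss-Jacobi relaxation.

    Returns the voltage vector and the number of sweeps used.
    """
    n = len(rhs)
    v = [0.0] * n
    for sweep in range(1, max_iter + 1):
        # Every entry uses only the old iterate v, never v_new.
        v_new = [
            (rhs[i] - sum(G[i][j] * v[j] for j in range(n) if j != i)) / G[i][i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(v_new, v)) < tol:
            return v_new, sweep
        v = v_new
    raise RuntimeError("Gauss-Jacobi did not converge")


# Hypothetical resistive chain: 1.0 V source - node 1 - node 2, unit
# conductances, with node 2 drawing 0.1 A. Exact solution: 0.9 V and 0.8 V.
G = [[2.0, -1.0], [-1.0, 1.0]]
rhs = [1.0, -0.1]
voltages, sweeps = gauss_jacobi(G, rhs)
```

Convergence depends on the conductance matrix; the number of sweeps needed is the quantity that a better partitioning scheme, as mentioned in the summary, would aim to reduce.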


●      GPU based parallelization of 3D Euler solver

Investigator​: Prof. Sachin Patkar et al. 

Summary: Investigation of parallelism in the Runge-Kutta based time integration approach in the finite volume formulation of the 3D Euler equations for compressible flow using GPUs. Three codes are to be parallelized on GPU hardware, with initial work for all three on the Fermi architecture. The work involves domain decomposition techniques to divide the problem for a multi-GPU environment. Advanced GPU techniques such as CUDA streams, instruction reordering, branch avoidance, and control of register spilling will be applied.


●      Development of a Parallel Finite element code 

Investigators: ​Prof. Seshu and Prof. S. Kulkarni

Summary: Activities have been initiated with the broad aim of eventually developing a parallel finite element code library. The finite element method is one of the most popular tools for computational mechanics. Recent interest in very large finite element analysis problems highlights the need for efficient parallelization strategies. One current focus is on efficient solution techniques (in terms of both computational time and memory requirements) for the sets of equations that arise in the finite element method. To this end, the authors are looking at element-by-element finite element analysis, which manages memory requirements by avoiding storage of the complete system matrix.
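The element-by-element idea can be sketched as a matrix-free product y = K x in Python. This is an illustrative sketch, not the library under development; the function name and the two-element bar example are hypothetical. Each element gathers its local values, multiplies by its small element matrix, and scatter-adds the result, so the global matrix K is never stored.

```python
def ebe_matvec(elements, x):
    """Compute y = K x element by element without assembling the global K.

    elements: list of (node_indices, element_matrix) pairs
    """
    y = [0.0] * len(x)
    for nodes, k_e in elements:
        x_local = [x[n] for n in nodes]  # gather local values
        for a, row in zip(nodes, k_e):
            # Scatter-add this element's contribution to global row a.
            y[a] += sum(k * xv for k, xv in zip(row, x_local))
    return y


# Hypothetical 3-node bar made of two unit-stiffness 2-node elements.
k_bar = [[1.0, -1.0], [-1.0, 1.0]]
elements = [((0, 1), k_bar), ((1, 2), k_bar)]
y = ebe_matvec(elements, [1.0, 2.0, 4.0])
# Matches the assembled K = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]] times x.
```

A product of this kind is typically the core operation inside an iterative solver such as conjugate gradients; the source does not state which solver is intended.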


●      Accelerating Relational Database query processing on GPUs

Investigator​: Prof. P S Dhabe, Research Scholar, IIT Bombay

Summary: The amount of data that needs to be handled is growing day by day. A Relational Database Management System (DBMS) is used to handle bulk data and to find exactly the data that satisfies a given condition. A database query is a statement requesting data from the database. Query processing on voluminous data is time-consuming on a sequential machine, since it requires a considerable amount of searching. The results show that DBMS query processing can be accelerated by 20x to 70x on GPUs.


●      Development of efficient Galerkin method for modern computer architectures

Investigator:​ Prof. Shiva Gopalakrishnan


●      GPU parallelization of fuzzy hyper-line segment neural network

Investigator:​ Prof. P S Dhabe, Research Scholar, IIT Bombay


●      Efficiently parallelize cryptographic applications using Nvidia GPUs as well as multicores.

Investigators:​ ​Vibhor Agrawal, Prof. Bernard Menezes


●      GPGPU Based Acceleration of Tsunami Simulation

Investigators:​ Praveen Singariya, Prof. Virendra Singh


●      GPU server for running OpenFOAM jobs using HPC-GPU solvers

Investigators:​  Gupte Aditya, Prof. A M Pradeep


●      Use the profiler (nvvp) to better understand the hotspots and find ways to reduce them

Investigators:​  Bangera Vivek, Prof. Sachin Patkar


●      Studying the behaviour of Neural Networks and implementing learning algorithms using GPUs

Investigators:​  Bansal Yamini, Prof. Sachin Patkar


●      Using spiking neural networks for learning and comparing them to other more conventional algorithms for machine learning

Investigators:​ Santurkar Shibani, Prof. Sachin Patkar


●      The Method of Four Russians for Multiplication (M4RM) proves to be one of the most efficient algorithms to compute dense matrix multiplications over binary fields.

Investigators:​  Dhas David, Prof. Sachin Patkar


●      Hardware accelerators for circuit design

Investigators:​  Thomas Jeebu, Prof. Sachin Patkar


●      Parallelization of Pollard-Rho

Investigators:​  Rajlaxmi, Prof. Bernard Menezes


●      Molecular dynamics and simulations on biological/chemical systems

Investigators:​  Chandan Patel, Prof. R. B. Sunoj


●      Transient heat transfer analysis of a fire-resistant safe

Investigators:​ Anil Jaiswal, Prof. S. L. Bapat


●      Development of high performance satellite data analysis for disaster management

Investigators:​  Ujwala Bhangale, Prof. Surya Durbha


Doctoral theses related to GPUs

●      Dhabe P. S., A new approach to global optimization based on Bernstein Polynomials and GPU computing, IIT Bombay (2010-continuing)

Thesis Supervisor:​  Prof.  P. S. V. Nataraj

●      J. Nayak, An Interval arithmetic library for GPU computing, IIT Bombay (2011-continuing)

Thesis Supervisor:​ Prof. P. S. V. Nataraj

●      S. Unnikrishnan, Modeling, Simulation, and Control of Boilers using CUDA, IIT Bombay (2011-continuing)

Thesis Supervisor:​  Prof.  P. S. V. Nataraj


Master theses related to GPUs

●      Ahmed L., ​Speeding up Tsunami Waves Propagation in ocean using GPU, IIT Bombay

Thesis Supervisor:​  Prof. Virendra Singh (Completed)

●      Deshpande Varadendra Ravindra, Implementation & Parallelization of FDTD code for Electromagnetic Scattering

Thesis supervisor:​ Prof. Shevare. (Completed)

●      Jeebu Thomas, Design of relaxation based Circuit Simulator for both linear and nonlinear devices, using NVIDIA CUDA kernels and benchmarking performance against the conventional simulators.

Thesis supervisor: Prof. Patkar, S.