Dhruva Abhijit Rajwade

I'm a final-year Integrated Master's student at IIT Kharagpur in India, where I study Computational Biology. I will graduate with Bachelor's and Master's degrees in Biotechnology and Biochemical Engineering after Spring 2025.

During my undergraduate studies, I have worked with Prof. Koel Chaudhury and Prof. Soumya De at IIT Kharagpur. In the summer of 2023, I got an amazing opportunity to work with Prof. Brian Ingalls at the University of Waterloo. In the summer of 2024, I worked at Caltech with Prof. Anima Anandkumar and Dr. Shengchao Liu on understanding Protein-DNA interactions using Cross-Attention and language modelling. Currently, I am wrapping up my Master's thesis on using discrete diffusion for controllable generation of DNA-binding protein sequences, and am excited to see where this project goes.

I am a recipient of the Caltech SURF fellowship (2024) and the MITACS Globalink Research Internship (2023), and have also been selected for the EPFL E3 scholarship (2024) and the ThinkSwiss Research scholarship (2023).

Email  /  Scholar  /  Twitter  /  Github   


Research

I'm primarily interested in Deep Learning for Protein Design, Gene regulation and Biological systems, Mathematical modelling of biological networks and dynamics, and AI for science. Most of my work has involved applying robust learning methods to Biological problems in an interpretable manner. Apart from Biology, I am keenly interested in and have also worked in Deep Learning for Vision (specifically SSL, secure ML and Generative modelling), as well as Causal Inference and Graph Learning. More recently, I have been interested in Fourier Neural Operators, Geometric Deep Learning, State-space models and their applications in Biology.

News

  [Dec 2024] Our paper on backdoor and adversarial attacks targeting SSL was accepted at ICASSP 2025. See you in Hyderabad!
  [Nov 2024] Our work on understanding Protein-DNA interactions using Protein and Genomics Foundation models was accepted at the MLSB, FM4Science and AIDrugX workshops at NeurIPS 2024.

Current Projects

Discrete Diffusion For Tunable DNA-binding Protein Design
[Slides]

Using Discrete Diffusion to learn the distribution of DNA-binding protein sequences. Using our work on Seq2Contact to guide the diffusion process to generate proteins with high affinity for specified DNA targets. Working on optimizing CRISPR-Cas protein design, stitching together different functional domains by inpainting a DNA-binding domain in between, generating DNA sequences for binding to a target protein by inverting the Seq2Contact model, and engineering protein-DNA interactions. This project is in collaboration with Prof. Riddhiman Dhar at IIT Kharagpur. Image shows ESMFold-folded sequences sampled by our trained discrete diffusion model, with as little as 25% sequence similarity to the training set and the presence of different DNA-binding domains (confirmed through Pfam scans). The protein sequences were not cherry-picked, and the structures are coloured by confidence (blue: High, red: Low).

Finding Allosteric Networks in the CAP-cAMP system using Deep Learning and Molecular Dynamics
[Slides]

Using MD simulations and trajectory analysis to infer Allosteric networks in the CAP-cAMP system (where the cAMP ligand binds to the CAP protein, leading to allosteric orientation changes and, finally, transcription). Current progress includes uncovering three prospective networks using AlloReverse. Future plans include using State-space models to learn the dynamics of protein-ligand interactions, and finding a generalizable method for allosteric discovery. This work is in collaboration with Prof. Soumya De at IIT Kharagpur.

Publications

[Note: Highlighted papers indicate first authorship; (*) indicates equal contribution].

Understanding Protein-DNA Interactions by Paying Attention to Protein and Genomics Foundation Models
Dhruva Abhijit Rajwade, Erica Wang, Aryan Satpathy, Alex Brace, Hongyu Guo, Arvind Ramanathan, Shengchao Liu, Anima Anandkumar
NeurIPS 2024 Foundation Models For Science, AI for New Drug Modalities, Machine Learning in Structural Biology workshops, 2024
Paper / Code

Coupling Cross-Attention with Protein and Genomics Foundation models to understand Protein-DNA interactions speeds up inference and achieves State-of-the-art performance in predicting contacts in Protein-DNA complexes, using purely sequence data for inference.

Towards Backdoor Mitigation and Adversarial Robustness in SSL
Dhruva Abhijit Rajwade*, Aryan Satpathy*, Nilaksh*
to appear at ICASSP 2025
Paper / Code

A simple and intuitive defense strategy against frequency-based backdoor attacks that are invariant to standard SSL augmentations. Taking a leaf out of frequency-domain attacks, we also use frequency-domain patching to increase model robustness in SSL.

Attenuated Total Reflectance–Fourier Transform Infrared (ATR-FTIR) Spectroscopy Combined With Deep Learning for Classification of Idiopathic Recurrent Spontaneous Miscarriage (IRSM)
Dadoma Sherpa, Dhruva Abhijit Rajwade, Imon Mitra, Souvik Biswas, Sunita Sharma, Pratip Chakraborty, Shovandeb Kalapahar, Ratna Chattopadhyay, Koel Chaudhury
Analytical Letters (Journal), 2024
Paper / Code

An extension of our previous work (see below) on using Spectroscopy and Deep Learning for prediction of Idiopathic Recurrent Spontaneous Miscarriage (IRSM). This work focuses on the classification of IRSM using ATR-FTIR Spectroscopy, a non-invasive and cost-effective technique, and improves on our previous work on using Raman Spectroscopy in a similar problem setting.

Cells2Vec: Bridging the gap between experiments and simulations using causal representation learning
Dhruva Abhijit Rajwade, Atiyeh Ahmadi, Brian Ingalls
NeurIPS 2023 Causal Representation Learning Workshop, 2023
Paper / Poster / Code / Slides / Talk

Learning meaningful representations of multi-cell time-series (CellModeller) simulations using causal representation learning. Current work includes extending this to real-world data and to proxy-simulation generation for biological experiments.

Prediction of Idiopathic Recurrent Spontaneous Miscarriage using Machine Learning
[Best Paper Award]
Dadoma Sherpa, Dhruva Abhijit Rajwade, Imon Mitra, Dhruba Dhar, Sunita Sharma, Pratip Chakraborty, Koel Chaudhury
IEEE International Conference on Computer, Electrical & Communication Engineering, 2023
Paper / Code

Using Raman Spectroscopy and Machine Learning for prediction of Idiopathic Recurrent Spontaneous Miscarriage (IRSM). Improved this work using ATR-FTIR Spectroscopy in a follow-up study, and currently working on a multi-omics approach to the same problem.

Risk factors associated with mortality in hypersensitivity pneumonitis: a meta-analysis
Sanjukta Dasgupta, Anandita Bhattacharya, Dhruva Abhijit Rajwade, Sushmita Roy Chowdhury, Koel Chaudhury
Expert Review of Respiratory Medicine (Journal), 2022
Paper / Code

Using different statistical tests and empirical analyses to identify risk factors associated with mortality in hypersensitivity pneumonitis, a rare lung disease. Checking for publication bias and heterogeneity in the data, and using meta-analysis to combine results from different studies.
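
As a rough illustration of what "combining results from different studies" and "checking for heterogeneity" can look like, here is a minimal sketch of inverse-variance fixed-effect pooling with Cochran's Q and the I² statistic. It is only an assumption-laden example, not the analysis from the paper: the function name `pooled_fixed_effect` is hypothetical, the input numbers are made up, and the paper may well have used a different (for example, random-effects) model.

```python
import math


def pooled_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooling (illustrative helper, not from the paper).

    effects   : per-study effect sizes on an additive scale (e.g. log hazard ratios)
    variances : per-study sampling variances of those effects
    Returns (pooled effect, standard error, Cochran's Q, I^2 in percent).
    """
    weights = [1.0 / v for v in variances]  # more precise studies weigh more
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)

    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, q, i2


if __name__ == "__main__":
    # Made-up log hazard ratios and variances, purely for illustration
    log_hr = [0.40, 0.55, 0.30]
    var = [0.04, 0.09, 0.05]
    pooled, se, q, i2 = pooled_fixed_effect(log_hr, var)
    low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
    print(f"pooled HR = {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f})")
    print(f"Q = {q:.2f}, I^2 = {i2:.1f}%")
```

A large I² would suggest that the studies disagree more than sampling error alone explains, which is usually the cue to consider a random-effects model instead.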

Miscellaneous

Lung Segmentation and Disease Classification using Deep Learning
[Code]

Worked on using the Geneva HRCT dataset to segment lungs and classify diseases using Deep Learning. The final model uses a U-Net architecture for segmentation and a CNN for classification. The model was trained on a subset of the dataset and tested on CT scan images obtained externally through collaborations with Hospitals.

Functional Network Analysis of Calcium ion pathways in beta-islets of the Pancreas
[Code]

Worked on using Calcium ion imaging time-series data to extract functional networks in beta-islets of the pancreas. Used Voronoi Delaunay triangulation (see image) to extract the network, and used graph theory to analyze the network. Code is incomplete and will be updated soon.

