Saptarshi Bej

Stays hungry, stays foolish, seeks to learn!

Research interest

I am currently working on two projects from different research domains at SBI Rostock.

1) Research on imbalanced datasets: In real-world scenarios, datasets are often imbalanced. That is, datasets meant for supervised learning are divided into classes, and some classes contain a far larger number of instances than others. Training machine learning algorithms on such data is challenging. We have developed an approach that overcomes limitations of SMOTE and its extensions, the widely used oversampling algorithms, by employing Localized Random Affine Shadowsampling (LoRAS) to oversample from an approximated data manifold of the minority class. We benchmarked LoRAS on 28 publicly available datasets and showed an improved approximation of the data manifold for a given class in those datasets. In addition, we have constructed a mathematical framework to prove that LoRAS is a more effective oversampling technique, since it provides a better estimate of the mean of the underlying local data distribution of the minority class data space. We compared the performance of LoRAS, SMOTE, and several SMOTE extensions and observed that, for imbalanced datasets, LoRAS on average yields better predictive machine learning (ML) models in terms of F1-score and balanced accuracy. I am exploring several domains where LoRAS can be applied, as well as options to further enhance its performance, with the goal of learning efficiently from "small datasets". My general interests also span machine learning, deep learning and artificial intelligence.

2) I am also working on understanding the molecular master switches that lead to the rewiring of gene regulatory networks in common inflammatory bowel diseases and neuropsychiatric diseases such as schizophrenia. The concept of the gut-brain axis (GBA) has been one of the most explored fields in medical research since the turn of the millennium. The project aims to initiate a new strategic alliance between two e:Med centres (Bonn, Kiel) and one de.NBI node (Rostock). We hypothesize that a systems-guided redefinition of gut-brain cross-disease manifestation helps to recognize and understand the immunological and neurological network perturbations of the GBA, in order to identify concrete targets for therapeutic reprogramming of gene expression based on RNA-guided inactive Cas9 (CRISPR/Cas9), thus restoring normal gene expression programs in disease-affected cells. Our systems medicine approach takes several layers of potential deregulation (DNA, mRNA, non-coding RNAs, TFs and proteins) into account and aims to create a multi-dimensional model. The novel findings are expected to extend the results of successful GWAS and to build the basis for functional GBA experiments.
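Returning to the first project: the oversampling idea behind LoRAS can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not our reference implementation. It assumes that "shadowsamples" are noisy copies of points in a minority-class neighbourhood, and that a synthetic point is a random convex (affine) combination of a few shadowsamples. The function name and all parameters (k, n_shadow, sigma, n_aff) are hypothetical and chosen only for this example.

```python
import numpy as np

def loras_like_oversample(X_min, n_new, k=5, n_shadow=10, sigma=0.05,
                          n_aff=3, seed=0):
    """Generate n_new synthetic minority points (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n, d = X_min.shape
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority-class points.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    # Neighbourhood of each point: itself plus its k nearest neighbours.
    nbrs = np.argsort(dists, axis=1)[:, :k + 1]

    samples = np.empty((n_new, d))
    for i in range(n_new):
        centre = rng.integers(n)
        hood = X_min[nbrs[centre]]
        # Shadowsamples (assumption): Gaussian-perturbed copies of the
        # neighbourhood points.
        shadows = np.repeat(hood, n_shadow, axis=0)
        shadows = shadows + rng.normal(0.0, sigma, size=shadows.shape)
        # Pick a few shadowsamples and combine them with random weights
        # that are non-negative and sum to one.
        chosen = shadows[rng.choice(len(shadows), size=n_aff, replace=False)]
        weights = rng.dirichlet(np.ones(n_aff))
        samples[i] = weights @ chosen
    return samples

# Toy minority class with six two-dimensional points.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                  [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]])
synthetic = loras_like_oversample(X_min, n_new=20)
print(synthetic.shape)  # (20, 2)
```

In this sketch, setting sigma to zero turns every synthetic point into a convex combination of original minority points, so the noise term is what lets the samples spread slightly around the local neighbourhood rather than only along lines between existing points.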

 

Projects

Research Projects

Machine Learning on Imbalanced datasets

In real-world scenarios, datasets are often imbalanced. That is, datasets meant for supervised learning are divided into classes, and some classes contain a far larger number of instances than others. Training machine learning algorithms on such data is challenging. We have developed an algorithm that overcomes problems of widely used algorithms.


The TOTO Project: Towards a Theory of Tissue Organisation

 ~ In biology, the exception is the rule. ~

 ~ With our work, we are not really interested in the unique, but in what is general in the unique. ~

With this project, we want to address a biological and a methodological challenge. First, we wish to clarify how the functioning of cells, and the functioning of a tissue relate to each other. Do cells exercise a degree of autonomy, or is their behavior completely determined by the functioning of the tissue? Such questions are important in understanding the emergence and progression of diseases. For example, it remains unclear whether the causative origin of colon cancer is a cell, or a consequence of tissue organization.

 


GB-XMap: Assessing the risk of gut-brain cross-diseases

Investigating the gut-brain-axis

The gut–brain axis (GBA) provides bidirectional homeostatic communication between the gastrointestinal tract and the central nervous system. This interdisciplinary collaboration aims to build and explore a first comprehensive GBA cross-disease map of the genetic, expression and regulatory changes associated with ulcerative colitis and schizophrenia.


Academic background

 2018-present Research Assistant and PhD student, SBI, Universität Rostock
 2016-2017 Research Assistant, Universität Paderborn
 2009-2014 Integrated BS-MS degree (major in Mathematics and specialization in Graph Theory), Indian Institute of Science Education and Research, Kolkata

 

 

Selected publications

On extension of regular graphs

Banerjee A, Bej S

Coloring sums of extensions of certain graphs

Kok J, Bej S

Factors of edge-chromatic critical graphs: a brief survey and some equivalences

Bej S, Steffen E

Skills

  • Graph and Network Theory
  • Boolean modelling
  • Python
  • Machine learning
  • Deep Learning
  • RNA seq data analysis