Saptarshi Bej

Stays hungry, stays foolish, seeks to learn!

Research interest

I am currently working on several projects across diverse research domains at SBI Rostock.

1) Research on imbalanced datasets: In real-world scenarios, datasets are often imbalanced. That is, datasets meant for supervised learning are divided into classes, and some classes contain far more instances than others. Training machine learning algorithms on such data is challenging. We have developed several algorithms that overcome problems of widely used algorithms. We are looking for biological and clinical applications of our methods related to personalized treatment. Furthermore, we have already developed an application of these algorithms to single-cell technology.

2) Research on the application of ML to epidemiological data: I have also recently worked on applying ML-based algorithms to epidemiological data. We used data from the National Health Survey conducted by the Government of India. From this data, we investigated patient stratification for Type II diabetes mellitus patients. In the process, we developed a workflow for stratifying patients from epidemiological data with diverse data types.

3) Research on applications of Transformer networks: My current major interest is investigating applications of state-of-the-art NLP-based methods such as transformers and adapting them to several biomedical data types. I am investigating possible applications of such technologies to untargeted metabolomics data and epidemiological text data.

4) Network analysis: I also like to work on network analysis strategies for extracting meaningful information from protein interaction networks.

 

Projects

Research Projects

Machine Learning on Imbalanced datasets

In real-world scenarios, datasets are often imbalanced. That is, datasets meant for supervised learning are divided into classes, and some classes contain far more instances than others. Training machine learning algorithms on such data is challenging. We have developed an algorithm that overcomes problems of widely used algorithms.
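As a rough illustration of the problem, and not of the algorithms developed in our group, the following Python sketch builds a toy imbalanced dataset, shows why plain accuracy is misleading on it, and balances it with naive random oversampling (duplicating minority instances). The function name `random_oversample`, the toy data, and the 95:5 class ratio are assumptions made only for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many instances as the largest class (naive baseline)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        # All original instances of this class form the sampling pool.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy dataset (hypothetical): 95 majority instances (label 0)
# and 5 minority instances (label 1), one feature each.
samples = [[float(i)] for i in range(100)]
labels = [0] * 95 + [1] * 5

# A classifier that always predicts the majority class looks accurate
# but never detects a single minority instance.
majority_label = Counter(labels).most_common(1)[0][0]
predictions = [majority_label] * len(labels)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
minority_hits = sum(p == 1 for p, y in zip(predictions, labels) if y == 1)
minority_recall = minority_hits / labels.count(1)

balanced_samples, balanced_labels = random_oversample(samples, labels)

print("accuracy of majority-only classifier:", accuracy)
print("minority recall of majority-only classifier:", minority_recall)
print("class counts before:", Counter(labels))
print("class counts after:", Counter(balanced_labels))
```

After oversampling, both classes contain 95 instances. Plain duplication adds no new information about the minority class, which is one reason more elaborate oversampling strategies are studied.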


The TOTO Project: Towards a Theory of Tissue Organisation

 ~ In biology, the exception is the rule. ~

 ~ With our work, we are not really interested in the unique, but in what is general in the unique. ~

With this project, we want to address a biological and a methodological challenge. First, we wish to clarify how the functioning of cells and the functioning of a tissue relate to each other. Do cells exercise a degree of autonomy, or is their behavior completely determined by the functioning of the tissue? Such questions are important for understanding the emergence and progression of diseases. For example, it remains unclear whether the causative origin of colon cancer lies in a cell or is a consequence of tissue organization.

 


GB-XMap: Assessing the risk of gut-brain cross-diseases

Investigating the gut-brain-axis

The gut–brain axis (GBA) provides bidirectional homeostatic communication between the gastrointestinal tract and the central nervous system. This interdisciplinary collaboration will explore a first comprehensive GBA cross-disease map of the genetic, expression, and regulatory changes associated with ulcerative colitis and schizophrenia.


Academic background

 2018-present Research Assistant and PhD student, SBI, Universität Rostock
 2016-2017 Research Assistant, Universität Paderborn
 2009-2014 Integrated BS-MS degree (major in Mathematics and specialization in Graph Theory), Indian Institute of Science Education and Research, Kolkata

 

 

Selected publications

LoRAS: An oversampling approach for imbalanced datasets

Bej S, Davtyan N, Wolfien M, Nassar M, Wolkenhauer O

A multi-schematic classifier-independent oversampling approach for imbalanced datasets

Bej S, Schultz K, Srivastava P, Wolfien M, Wolkenhauer O

arXiv, 2021

Automated annotation of rare-cell types from single-cell RNA-sequencing data through synthetic oversampling

Bej S, Galow AM, David R, Wolfien M, Wolkenhauer O

bioRxiv 2021

Hamiltonian cycles in annular decomposable Barnette graphs (accepted in JDMSC)

Saptarshi Bej

Barnette's conjecture is an unsolved problem in graph theory. It states that every 3-regular (cubic), 3-connected, planar, bipartite (Barnette) graph is Hamiltonian. Partial results have been derived with restrictions on the number of vertices and on several properties of face-partitions and dual graphs of Barnette graphs, while some studies focus just on structural characterizations of Barnette graphs. Noting that Spider web graphs are a subclass of Annular Decomposable Barnette (ADB) graphs and are Hamiltonian, we study ADB graphs and their annular-connected subclass (ADB-AC graphs). We show that ADB-AC graphs can be generated from the smallest Barnette graph using recursive edge operations. We derive several conditions assuring the existence of Hamiltonian cycles in ADB-AC graphs without imposing restrictions on the number of vertices, the face size, or any other constraints on the face partitions. We show that there can be two types of annuli in ADB-AC graphs: ring annuli and block annuli. Our main result is that ADB-AC graphs having non-singular sequences of ring annuli are Hamiltonian.

Protein-coding variants contribute to the risk of atopic dermatitis and skin-specific gene expression

Mucha S, ... Bej S, ..., Wolfien M, ..., Wolkenhauer O, ..., Ellinghaus D

On extension of regular graphs

Banerjee A, Bej S

Coloring sums of extensions of certain graphs

Kok J, Bej S

Factors of edge-chromatic critical graphs: a brief survey and some equivalences

Bej S, Steffen E

Combining uniform manifold approximation with localized affine shadowsampling improves classification of imbalanced datasets

Bej S, Srivastava P, Wolfien M, Wolkenhauer O

Part of IJCNN 2021, IEEE Xplore (accepted)

Skills

  • Graph and Network Theory
  • Boolean modelling
  • Python
  • Machine learning
  • Deep Learning
  • RNA seq data analysis

 

Awards and Distinctions

  • DAAD Prize 2020 for outstanding achievements by international students (Universität Rostock)

Teaching Experience

  • Tutor in the 'Biosystems modelling and simulation' course offered at the University of Rostock from 2019 to 2020. My teaching covers an introduction to machine learning and deep learning and their applicability in biomedical fields.
  • Tutor in the 'Data Science with Python' undergraduate seminar course offered at the University of Rostock from 2020.