Saptarshi Bej

Stays hungry, stays foolish, seeks to learn!

Research interest

I am currently working on several projects across diverse research domains at SBI Rostock.

1) Research on imbalanced datasets: In real-world scenarios, datasets are often imbalanced. That is, datasets meant for supervised learning divide into classes, where some classes have far more instances than others. Training machine learning algorithms on such data is challenging. We have developed several algorithms that overcome problems of widely used algorithms. We are looking for biological and clinical applications of our methods related to personalized treatment. Furthermore, we have already developed an application of these algorithms to single-cell technology.

2) Research on application of ML to epidemiological data: I have also recently worked on applying ML-based algorithms to epidemiological data. We used data from the National Health Survey performed by the Government of India. Using this data, we investigated patient stratification for type II diabetes mellitus. In the process, we developed a workflow for stratifying patients from epidemiological data with diverse data types.

3) Research on applications of Transformer networks: My current major interest is investigating applications of state-of-the-art NLP methods such as transformers and adapting them to several biomedical data types. I am investigating possible applications of such technologies to untargeted metabolomics data and epidemiological text data.

4) Network analysis: I also like to work on network analysis strategies for extracting meaningful information from protein interaction networks.

 

Projects

Research Projects

Machine Learning on Imbalanced datasets

In real-world scenarios, datasets are often imbalanced. That is, datasets meant for supervised learning divide into classes, where some classes have far more instances than others. Training machine learning algorithms on such data is challenging. We have developed an algorithm that overcomes problems of widely used algorithms.

The TOTO Project: Towards a Theory of Tissue Organisation

 ~ In biology, the exception is the rule. ~

 ~ With our work, we are not really interested in the unique, but in what is general in the unique.~

With this project, we want to address a biological and a methodological challenge. First, we wish to clarify how the functioning of cells and the functioning of a tissue relate to each other. Do cells exercise a degree of autonomy, or is their behavior completely determined by the functioning of the tissue? Such questions are important for understanding the emergence and progression of diseases. For example, it remains unclear whether the causative origin of colon cancer is a cell or a consequence of tissue organization.

 

GB-XMap: Assessing the risk of gut-brain cross-diseases

Investigating the gut-brain-axis

The gut–brain axis (GBA) provides bidirectional homeostatic communication between the gastrointestinal tract and the central nervous system. This interdisciplinary collaboration will explore a first comprehensive GBA cross-disease map of genetic, expression, and regulatory changes associated with ulcerative colitis and schizophrenia.

Academic background

 2018-present Research Assistant and PhD student, SBI, Universität Rostock
 2016-2017 Research Assistant, Universität Paderborn
 2009-2014 Integrated BS-MS degree (major in Mathematics, specialization in Graph Theory), Indian Institute of Science Education and Research, Kolkata

 

 

Selected publications

LoRAS: An oversampling approach for imbalanced datasets

Bej S, Davtyan N, Wolfien M, Nassar M, Wolkenhauer O

Self-attention based models for the extraction of molecular interactions from biological texts

Prashant Srivastava, Saptarshi Bej, Kristina Yordanova, Olaf Wolkenhauer

(Accepted for publication in Biomolecules)

Preprint: https://www.preprints.org/manuscript/202110.0184/v1

Comprehensive Characterization of Multitissue Expression Landscape, Co-Expression Networks and Positive Selection in Pikeperch

Nguinkal JA, Verleih M, de los Ríos-Pérez L, Brunner RM, Sahm A, Bej S, Rebl A, Goldammer T

Cells 2021 (accepted)

A multi-schematic classifier-independent oversampling approach for imbalanced datasets

Bej S, Schultz K, Srivastava P, Wolfien M, Wolkenhauer O

IEEE Access 2021, Volume 9, 123358-123374

ISSN (print): 2169-3536

ISSN (online): 2169-3536

DOI: 10.1109/ACCESS.2021.3108450

URL: https://doi.org/10.1109/ACCESS.2021.3108450

Automated annotation of rare-cell types from single-cell RNA-sequencing data through synthetic oversampling

Bej S, Galow AM, David R, Wolfien M, Wolkenhauer O

bioRxiv 2021

Hamiltonian cycles in annular decomposable Barnette graphs

Bej S

JDMSC 2020 (accepted)

Protein-coding variants contribute to the risk of atopic dermatitis and skin-specific gene expression

Mucha S, ... Bej S, ..., Wolfien M, ..., Wolkenhauer O, ..., Ellinghaus D

On extension of regular graphs

Banerjee A, Bej S

Coloring sums of extensions of certain graphs

Kok J, Bej S

Factors of edge-chromatic critical graphs: a brief survey and some equivalences

Bej S, Steffen E

Combining uniform manifold approximation with localized affine shadowsampling improves classification of imbalanced datasets

Bej S, Srivastava P, Wolfien M, Wolkenhauer O

2021 International Joint Conference on Neural Networks (IJCNN), 2021, pp. 1-8

Skills

  • Graph and Network Theory
  • Boolean modelling
  • Python
  • Machine learning
  • Deep Learning
  • RNA seq data analysis

 

Awards and Distinctions

  • DAAD Prize 2020 for outstanding achievements of international students (Universität Rostock)

Teaching Experience

  • Tutor in the 'Biosystems modelling and simulation' course offered at the University of Rostock, 2019-2020. I taught an introduction to machine learning and deep learning and their applicability in biomedical fields.
  • Tutor in the 'Data Science with Python' undergraduate seminar course offered at the University of Rostock from 2020.