SBI – Department of Systems Biology and Bioinformatics
Faculty of Computer Science and Electrical Engineering
University of Rostock
Ulmenstrasse 69 | 18057 Rostock
+49 381 498-7571
I am currently working on two projects from different research domains at SBI Rostock.
1) Research on imbalanced datasets: In real-world scenarios, datasets are often imbalanced. That is, a dataset meant for supervised learning divides into classes, some of which contain far more instances than the others. Training machine learning algorithms on such data is challenging. We have developed an approach that overcomes limitations of widely used oversampling algorithms such as SMOTE and its extensions: Localized Random Affine Shadowsampling (LoRAS), which oversamples from an approximated data manifold of the minority class. We benchmarked LoRAS on 28 publicly available datasets and can show an improved approximation of the data manifold for a given class in those datasets. In addition, we have constructed a mathematical framework to prove that LoRAS is a more effective oversampling technique, since it provides a better estimate of the mean of the underlying local data distribution of the minority class data space. We compared the performance of LoRAS, SMOTE, and several SMOTE extensions and observed that, for imbalanced datasets, LoRAS on average yields better predictive Machine Learning (ML) models in terms of F1-score and Balanced Accuracy. I am exploring several domains where LoRAS can be applied, as well as options to further enhance its performance, with the goal of learning efficiently from "small datasets". My general interests also span machine learning, deep learning and artificial intelligence.
2) I am also working on understanding the molecular master switches that lead to the rewiring of gene regulatory networks in common inflammatory bowel diseases and neuropsychiatric diseases such as schizophrenia. The concept of the gut–brain axis (GBA) has been one of the most explored fields in medical research since the turn of the millennium. The project aims to initiate a new strategic alliance between two e:Med centres (Bonn, Kiel) and one de.NBI node (Rostock). We hypothesize that a systems-guided redefinition of gut–brain cross-disease manifestation helps to recognize and understand the immunological and neurological network perturbations of the GBA, in order to identify concrete targets for therapeutic reprogramming of gene expression based on RNA-guided inactive Cas9 (CRISPR/Cas9), thus restoring normal gene expression programs in disease-affected cells. Our systems medicine approach takes several layers of potential deregulation (DNA, mRNA, non-coding RNAs, TFs and proteins) into account and aims to create a multi-dimensional model. The novel findings are expected to extend the results of successful GWAS and to build the basis for functional GBA experiments.
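The oversampling idea behind LoRAS, mentioned in project 1 above, can be illustrated with a minimal Python sketch. This is not the published implementation. It assumes that "shadowsamples" are noisy copies of the points in a minority-class neighbourhood, and that a synthetic point is a random convex (and therefore affine) combination of a few of them. The function name and all parameters (`k`, `n_shadow`, `sigma`, `n_aff`) are hypothetical choices made for illustration.

```python
import numpy as np


def loras_like_oversample(X_min, n_new, k=5, n_shadow=10, sigma=0.005,
                          n_aff=3, seed=0):
    """Generate n_new synthetic minority points (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n, d = X_min.shape
    k = min(k, n)  # a neighbourhood cannot be larger than the minority class

    # Pairwise squared Euclidean distances between minority points.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)
    # Row i holds the indices of the k points closest to point i (itself included).
    neighbourhoods = np.argsort(dist2, axis=1)[:, :k]

    synthetic = np.empty((n_new, d))
    for i in range(n_new):
        # Pick a random minority point and take its local neighbourhood.
        hood = X_min[neighbourhoods[rng.integers(n)]]
        # Assumed shadowsamples: Gaussian-perturbed copies of each neighbour.
        shadows = np.repeat(hood, n_shadow, axis=0)
        shadows = shadows + rng.normal(0.0, sigma, size=shadows.shape)
        # Random convex combination of a few shadowsamples; the Dirichlet
        # weights are non-negative and sum to 1.
        m = min(n_aff, len(shadows))
        chosen = shadows[rng.choice(len(shadows), size=m, replace=False)]
        weights = rng.dirichlet(np.ones(m))
        synthetic[i] = weights @ chosen
    return synthetic
```

For example, `loras_like_oversample(X_minority, n_new=100)` would return 100 synthetic rows that can be appended to the minority class before training a classifier. Restricting each combination to one neighbourhood is what keeps the synthetic points local in this sketch.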
~ In biology, the exception is the rule. ~
~ With our work, we are not really interested in the unique, but in what is general in the unique. ~
With this project, we want to address a biological and a methodological challenge. First, we wish to clarify how the functioning of cells and the functioning of a tissue relate to each other. Do cells exercise a degree of autonomy, or is their behavior completely determined by the functioning of the tissue? Such questions are important for understanding the emergence and progression of diseases. For example, it remains unclear whether colon cancer originates in a cell or is a consequence of tissue organization.
Investigating the gut–brain axis
The gut–brain axis (GBA) provides bidirectional homeostatic communication between the gastrointestinal tract and the central nervous system. The interdisciplinary collaboration will explore a first comprehensive GBA cross-disease map of genetic, expression and regulatory changes associated with the disease entities ulcerative colitis and schizophrenia.
|2018-present||Research Assistant and PhD student, SBI, Universität Rostock|
|2016-2017||Research Assistant, Universität Paderborn|
|2009-2014||Integrated BS-MS degree (major in Mathematics and specialization in Graph Theory), Indian Institute of Science Education and Research, Kolkata|
LoRAS: An oversampling approach for imbalanced datasets
Saptarshi Bej, Narek Davtyan, Markus Wolfien, Mariam Nassar, Olaf Wolkenhauer
** Submitted to Machine Learning (Springer); decision pending after major revision
Protein-coding variants contribute to the risk of atopic dermatitis and skin-specific gene expression
Mucha S, ... Bej S, ..., Wolfien M, ..., Wolkenhauer O, ..., Ellinghaus D
The Journal of Allergy and Clinical Immunology
On extension of regular graphs
Banerjee A, Bej S
Coloring sums of extensions of certain graphs
Kok J, Bej S
Factors of edge-chromatic critical graphs: a brief survey and some equivalences
Bej S, Steffen E
- Graph and Network Theory
- Boolean modelling
- Machine learning
- Deep Learning
- RNA-seq data analysis