Ph.D. position in Clinical Data Science


We are looking for a mathematician or computer scientist to develop algorithms for synthetic data generation in clinical tabular datasets.

Institute of Computer Science, Department of Systems Biology and Bioinformatics, University of Rostock

We are an interdisciplinary and international team of scientists located on the north-east coast of Germany.

This is a full-time research scientist position that can be used to pursue a Ph.D. The project is funded by the German Research Foundation (DFG). The project starts in spring 2023 (between March and May, depending on the candidate's availability).

The recruited candidate is expected to develop machine learning algorithms and workflows to solve practical problems in clinical and biomedical data science. In particular, there are two research problems of concern: (i) the multimodal nature of clinical and biomedical data in patient stratification; (ii) the generation of synthetic tabular data that can facilitate federated learning, helping to overcome data privacy issues.

Questions and applications should be sent to (project leader) and

Further details

The recruited candidate will develop machine learning algorithms and workflows to solve practical problems in clinical and biomedical data science. In particular, there are two research problems of concern:

Problem 1: Clinical data is diverse in nature, consisting of multiple modalities such as tabular routine data, omics data, radiological images, and text data. This multimodal nature makes it difficult to address tasks like patient stratification and decision-making.

Approach: The selected candidate will use diverse state-of-the-art machine learning techniques to form meaningful low-dimensional embeddings, which will be achieved by integrating data across different modalities. The developed algorithms and workflows will be deployed in the form of online tools.

Problem 2: Synthetic data generation often aids in addressing diverse practical problems prevalent in clinical data science. Such problems include scarcity of data, skewed data, and privacy issues hindering data sharing.

Approach: The selected candidate will build on existing in-house research on this topic by developing, benchmarking, and deploying state-of-the-art algorithms for task-specific synthetic sample generation. The primary focus will be to generate synthetic tabular data that can serve as an alternative to real data, thereby facilitating federated learning.

A few publications from our team for reference

  • Bej, S., Sarkar, J., Biswas, S., Wolkenhauer, O. et al.; Identification and epidemiological characterization of Type-2 diabetes sub-population using an unsupervised machine learning approach. Nutrition & Diabetes, vol. 12, 27, 2022.
  • Bej, S., Schultz, K., Srivastava, P., Wolfien, M., Wolkenhauer, O.; A multi-schematic classifier-independent oversampling approach for imbalanced datasets. IEEE Access, vol. 9, pp. 123358–123374, 2021.
  • Bej, S., Davtyan, N., Wolfien, M., Nassar, M., Wolkenhauer, O.; LoRAS: An oversampling approach for imbalanced datasets. Machine Learning, vol. 110, pp. 279–301, 2021.
  • Schultz, K., Bej, S., Hahn, W., Wolfien, M., Srivastava, P., Wolkenhauer, O.; Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets. arXiv, 2022.

Requirements for applicants

Essential skills

  • Applicants must have a master’s degree in computer science, mathematics, or the physical or engineering sciences.
  • Applicants are required to have an excellent academic record.
  • Prior knowledge of machine learning and deep learning, at least at a fundamental level.
  • Excellent communication skills and academic writing skills are essential.
  • Applicants are required to have excellent coding skills in Python.
  • No one in our team works in isolation, and we are therefore seeking a team player. We employ and validate our work with partners in a clinical setting, which requires an interest in interdisciplinary collaborations.

Bonus skills

  • A deeper theoretical understanding of probability and statistics, linear algebra, and computational optimization is a plus.
  • Knowledge of efficient computation is a plus.
  • A GitHub profile and a knack for designing and deploying software-based tools are a plus.
  • An interest in teaching is a plus. This post carries no teaching duties, but we care about sharing our expertise and about the career development of our team members.
  • Prior experience in scientific writing is a plus.

What the University offers

  • Employment relationship according to the provisions of the collective agreement for the public service of the federal states (TV-L) with the employer state of Mecklenburg-Western Pomerania, represented by the University of Rostock.
  • Salary in salary group 13, provided the personal and collective agreement requirements are met.
  • 30 days annual leave and annual bonus; additional pension plan (VBL).
  • Flexible working hours.
  • A wide range of offers for health promotion and for the compatibility of family and work, e.g. through our family office or our health management URgesund.
  • A variety of further training opportunities, including language courses, IT courses, and seminars for professional development.
  • Opportunities to participate in a wide range of university sports.

What the team offers

We are an interdisciplinary and international team with a flat hierarchy. We support and complement each other. The department has an outstanding track record in research, teaching, and developing the careers of young scientists. We are consistently ranked top in evaluations. A large network of academic and industry partners provides a good basis for developing careers in academia and industry.

The University as an employer

Founded in 1419, the University of Rostock is one of the oldest universities in Northern Europe. The region around Rostock is a popular tourist destination, with old Hanseatic towns, scenic landscapes, and beaches. The Department of Systems Biology & Bioinformatics was founded in 2004 and has a long, successful track record in using mathematical modeling and data analysis in the life sciences. The department is part of the Institute of Computer Science.

Equal opportunities are important to us. We welcome applications from suitably qualified people with severe disabilities or equivalent status. We strive to increase the proportion of women in research and teaching and therefore encourage suitably qualified women to apply. Furthermore, we welcome applications from people of other nationalities or with a migration background.

Application Procedure

We look forward to receiving your online application with complete, informative documents (cover letter, resume, diploma with indication of final grade) by 22.01.2023 at the latest. Only applications received via our application portal and containing all requested documents will be considered.

If you have any questions, please contact Saptarshi Bej or Olaf Wolkenhauer.