The University of Milano-Bicocca (https://en.unimib.it/) is a young,
multidisciplinary university active in various fields: economics and
statistics, law, science, medicine, sociology, psychology, and
pedagogy. It is an innovative university that has created an
extensive network of collaborations that includes many world-famous
universities, research centers and top corporations.
In the 2015 Times Higher Education ranking of the top hundred
universities less than 50 years old, the University was ranked 24th in
the world and 1st in Italy.
The Department of Informatics, Systems and Communications (DISCo) is a
leading Computer Science research and teaching unit in Italy. The
Bioinformatics and Experimental Algorithmics (BIAS) research lab
invites applications for a Doctoral Student in Data Structures and
Algorithms for Graph pangenomes, under the supervision of Professor
Paola Bonizzoni.
The position is funded by the Innovative Training Network (ITN)
project “Algorithms for PAngenome Computational Analysis (ALPACA)” of
the Horizon 2020 Marie Skłodowska-Curie Actions (MSCA) Work Programme.
The candidate will join the BIAS team of Professor Paola Bonizzoni, who
is currently leading another EU-funded international project (PANGAIA)
on Data Structures and Algorithms for Graph Pangenomes.
Representations for the comparative and hierarchical analysis of pan-genomes
The representation of multiple genomes in a graph pangenome is a
computational problem that has been addressed by indexing paths with
compact data structures, such as the FM-index, the positional BWT, and
the graph BWT. In this framework, some important questions are still
unsolved and require the development of fast and efficient algorithmic
approaches. These include querying a graph-based data structure,
sequence-to-graph and graph-to-graph comparison, and inferring
variations (including structural variations) between genomes. A more general
question is how to deal with multiple pangenomes, such as those
emerging in the context of metagenomics (the study of multiple species
in an environment) and transcriptomics (the study of gene expression
and transcription). The main focus of this project is on developing
representations of pangenomes that allow fast and space-efficient
queries of multiple pangenomes, such as the search for a given
substring in the pangenomes (i.e. pattern matching) and the search for
approximate matches (i.e. sequence alignment or mapping).
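As background for the queries mentioned above, the following is a minimal sketch of exact pattern matching with an FM-index over a single string, which is the baseline that the project aims to extend to pangenomes. It is an illustration only and not the project's method: the function names (bwt_from_text, build_fm_index, find_occurrences) are hypothetical, the suffix array is built by naive sorting, and the occurrence table is stored uncompressed, so it is suitable only for short example strings.

```python
def bwt_from_text(text):
    """Return the suffix array and BWT of text + '$' (naive construction)."""
    text += "$"  # sentinel, assumed smaller than every other symbol
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # symbol preceding each suffix
    return sa, bwt


def build_fm_index(text):
    """Build the suffix array, the C array and the occurrence table."""
    sa, bwt = bwt_from_text(text)
    alphabet = sorted(set(bwt))
    # C[c]: number of symbols in the text that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def find_occurrences(index, pattern):
    """Backward search: return sorted start positions of pattern in the text."""
    sa, C, occ = index
    lo, hi = 0, len(sa)  # half-open interval of suffix array rows
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_fm_index("ACGTACGA")
print(find_occurrences(index, "ACG"))
```

Each step of the backward search narrows an interval of suffix array rows using only the C array and the occurrence table, so the number of steps depends on the pattern length rather than on the text length. Practical indexes replace the uncompressed table with sampled or succinct rank structures.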
We will investigate the problem under the assumption that the genomes
that are encompassed in the pangenome are evolutionarily related, and
such relations are represented with a phylogenetic tree or network.
Therefore, we need to exploit ancestral relationships.
We want to overcome the limitations of the usual BWT-based indexing of
a single genome by extending the known approaches. Moreover, we plan
to develop tools for comparing a set of reads, possibly a mixture of
short and long reads, with a set of pangenomes as well as with other
graph-based representations of gene structures. For this purpose, we
will investigate graph-based representations in which several millions
of colors encode the information of reads, together with their
applications to pangenome comparison, in order to propose novel data
structures in pangenomics.
Supervisor
Paola Bonizzoni (UNIMIB)
Co-supervisor
Gianluca Della Vedova (UNIMIB)
Host institution
University of Milano – Bicocca
Department of Computer Science, Systems, and Communication
PhD program
Computer Science (http://phd-computer-science.disco.unimib.it/)
PhD school (https://en.unimib.it/education/doctoral-research-phd-programmes)
Expected results
New data structures for representing pangenomes, new algorithms for
querying and comparing sets of pan-genomes, and for comparing a set of
pangenomes and a set of reads.
Planned secondments
Part of the research activities will be conducted together with our
collaborators from the ALPACA network, where the candidate will be
seconded.
Required profile
Strong background in Computer Science, Mathematics, or related fields;
good command of English. Good knowledge of a low-level programming
language (C, C++, Rust) and experience with bioinformatics and
advanced data structures (Burrows-Wheeler Transform, de Bruijn graphs)
are welcome.
Applicants must satisfy the requirements of an Early Stage Researcher
as defined by the MSCA Work Programme: 1) On the starting date of your
employment with the University of Milano – Bicocca, you are in the
first four years of your research career and have not (yet) been
awarded a doctoral degree; 2) You have not resided or carried out
your main activity (study, work, etc.) in Italy for more than 12
months during the 3 years prior to the starting date of your
anticipated employment with the University of Milano – Bicocca.
Applicants are expected to acquire the doctoral student status in the
Doctoral Programme in Computer Science at the University of Milano –
Bicocca during the standard 6-month probationary period.
Early Stage Researcher requirements and employment conditions in the
MSCA Work Programme are detailed at
https://ec.europa.eu/research/mariecurieactions/resources/document-libraries/information-note-fellows-innovative-training-networks-itn_en
Application deadline
February 15th, 2021
Starting date
The starting date is in early September 2021, with the exact date negotiable.
Salary and benefits
The position is full-time and funded for three years. The salary is
competitive and complies with the MSCA Work Programme: 3500 euros per
month before taxes, consisting of the Living and Mobility allowances
after compulsory deductions. A conditional Family allowance of 385
euros can be added to the salary.
How to apply
Please submit your application by email directly to Professor Paola
Bonizzoni (paola.bonizzoni@unimib), cc’ing Professor Gianluca Della
Vedova (gianluca.dellavedova@unimib.it). The application must include
the following attachments as a single PDF file (in English):
• CV with possible publications
• Cover letter describing motivation and research interests, with a
declaration that you satisfy the MSCA Work Programme requirements for
an Early Stage Researcher
• Contact details of two potential referees who agreed to provide
letters of recommendation.
Applications received by February 15, 2021, 23:59 (CET) will be given
full consideration. Applications received after this date will still
be considered until the position is filled.
More details are available at https://algolab.eu/grants/phd-position-available/