What actually is FindingPheno?
FindingPheno is a Research and Innovation Action (RIA) bringing together eight academic and industrial partners from five countries. RIAs are a specific type of international research collaboration focused on establishing new knowledge, in which the partners work together to explore new ideas, technologies, or methods that could eventually become a marketable product. Our funding came from the European Union via Horizon 2020 (H2020), part of the European Framework Programme for Research and Innovation. This framework comprises a succession of multi-year programmes offering significant public funding for cutting-edge research and new product development, with the aim of promoting scientific excellence and industrial leadership while developing solutions to societal problems. H2020 was the eighth of nine such programmes to date, awarding funding between 2014 and 2020, and all our project activities and outputs are regulated under this programme.
OK, but really, what is FindingPheno?
FindingPheno is a group of data scientists, theoretical biologists and industrial researchers from different organisations working together to find new ways of analysing biological data. We focus on omics data, i.e. high-throughput biochemical measurements of all of the molecules of one type (DNA, mRNA, proteins, metabolites, etc.) in a biological sample, using existing data sets sourced from public repositories. Because of this focus on reuse rather than new data generation, our project does not include any wet lab or fieldwork activities, allowing us to concentrate fully on the computational aspects of this task.

Why do we need FindingPheno?
Ever since the completion of the Human Genome Project in the early 2000s, the generation of biological data has exploded, with upward trends in new technology, data volume, and data complexity combining into an overwhelming tsunami of biological information. Omics data, in particular, continues to accumulate, with increasing rates of both publication (Fig 1) and data creation (Fig 2) in this area. Increasingly, these data sets contain information from more than one type of molecule within the same sample, known as multi-omics data, and may even include matching data for both a host organism (e.g. a plant or animal) and its associated microbiome, i.e. hologenomic data.
