Narrative text in Electronic Health Records (EHRs) contains a wealth of clinical information about treatments, diagnoses, medications and family history. In addition, the scientific literature is a rich source of information that summarises the latest results and research findings relevant to different diseases. These two textual sources often contain different types of valuable phenotypic information that may complement each other. Combining details from each source thus has the potential to uncover new disease-phenotype associations. In turn, these associations can help to identify high-risk patients, and they can be useful in developing solutions to control the causes of different diseases.