ANTHEM: Development of Human-Centred Healthcare AI Systems

ANTHEM (AdvaNced Technologies for Human-centrEd Medicine) is the programme under which I work as a Fixed-term Researcher (RTDA) at the University of Milano-Bicocca. My role is to build and evaluate trustworthy AI systems for healthcare, with a focus on retrieval and on systems based on (large) language models (LLMs), such as conversational search and retrieval-augmented generation (RAG).

A core part of my work is stakeholder-driven development. I translate the needs of clinicians and other healthcare stakeholders into system requirements, evaluation protocols, and usable prototypes. This includes designing evidence-grounded outputs, privacy-aware workflows, and expert-in-the-loop validation under practical constraints.

More information is available on the ANTHEM website.

Selected Sub-projects within ANTHEM

Privacy-preserving clinical trial retrieval with small LLMs
Role: Lead
Uses small open-source LLMs to generate effective search queries for clinical trial retrieval, designed for low-compute settings and clinician oversight.
Impact: Shows that queries generated by small models can match or exceed expert-written queries while remaining practical for resource-constrained deployments and expert-in-the-loop workflows.
Status: Published

Combining LLMs with knowledge bases for trustworthy health search
Role: Co-lead
Extracts health-related claims from documents and verifies them against a knowledge base to produce a correctness signal that complements relevance-based ranking.
Impact: Adds transparent trust signals by grounding correctness estimates in structured evidence rather than relying only on model judgement.
Status: Published

Factual medical QA (RAG system)
Role: Lead
Internal prototype of a conversational assistant for factual medical question answering, grounded in general knowledge and an internal document collection, with citation-first responses and reliability checks.
Impact: Targets higher answer reliability by design, combining retrieval, grounded generation, and structured error analysis.
Status: Internal prototype

Benchmark for phenotype-centric medical NLP
Role: Lead
Benchmark that standardises medical tasks involving phenotypes across heterogeneous inputs, including structured phenotype data and raw clinical text.
Impact: Enables fast and comparable evaluation of LLMs, other language models, and classical pipelines on essential tasks such as phenotype extraction, entity linking, and relation identification from text.
Status: In preparation

Selected Artifacts, Systems & Publications

Privacy-preserving clinical trial retrieval with small LLMs
Role: Lead
Small open-source LLMs for query generation under constraints of low compute and clinician oversight.
Link: IEEE Xplore

Combining LLMs with knowledge bases for trustworthy health search
Role: Co-lead
Knowledge-grounded claim verification used as a correctness signal to support trustworthy ranking.
Link: Springer

Acknowledgement

These works were supported by the National Plan for NRRP Complementary Investments (PNC, established with the decree-law 6 May 2021, n. 59, converted by law n. 101 of 2021) in the call for the funding of research initiatives for technologies and innovative trajectories in the health and care sectors (Directorial Decree n. 931 of 06-06-2022), project n. PNC0000003, AdvaNced Technologies for Human-centrEd Medicine (ANTHEM). These works reflect only the authors’ views and opinions. Neither the Ministry for University and Research nor the European Commission can be considered responsible for them.