In healthcare, differences observed between demographic groups can broadly be categorised as biological or non-biological. Non-biological differences, such as visit frequency and reporting style, are harder to track and can introduce unexpected predictive bias into machine learning algorithms. This is particularly true for complex free-text data in the mental health domain. In this talk, we will present our framework for analysing text-related bias in Natural Language Processing (NLP) models, developed for a paediatric anxiety use case with a focus on sex demographic subgroups. The framework first measures model bias and then traces its origins to statistical word distributions and the generalisation capacity of NLP algorithms. Motivated by these findings, we propose a data-centric bias mitigation strategy based on sentence informativeness filtering and masking of gender-related words. Our approach reduced bias by up to 27%, improving classification parity between sex demographic groups while maintaining overall performance.
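To make the gender-word masking step concrete, the sketch below shows one way such a preprocessing pass could look. The word list and the `[MASK]` token are illustrative assumptions for this example; the abstract does not specify the actual lexicon or masking scheme used in the framework.

```python
import re

# Hypothetical lexicon of gender-related words (illustration only;
# not the lexicon used in the presented framework).
GENDER_WORDS = {
    "he", "she", "him", "her", "his", "hers",
    "boy", "girl", "male", "female",
    "mother", "father", "son", "daughter", "brother", "sister",
}

# One compiled pattern matching any lexicon word as a whole token,
# case-insensitively.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(GENDER_WORDS)) + r")\b",
    flags=re.IGNORECASE,
)

def mask_gender_words(text: str, mask_token: str = "[MASK]") -> str:
    """Replace each gender-related word in `text` with a neutral token."""
    return _PATTERN.sub(mask_token, text)

print(mask_gender_words("She reports that her son is anxious at school."))
# -> [MASK] reports that [MASK] [MASK] is anxious at school.
```

In a data-centric pipeline of this kind, the masked text would replace the raw clinical notes before model training, so the classifier cannot condition directly on explicit gender cues.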