Protein secretion is a key process in bacteria, enabling proteins to reach the extracellular space, other microbes, or host cells. Bacteria use diverse secretion systems, each with distinct structures, substrates, and biological roles. Effector proteins delivered by these systems manipulate host processes, from immune evasion to cytoskeletal disruption, so accurate effector prediction is important for understanding bacterial pathogenicity.
We developed PLM-Effector, a hybrid framework that combines pre-trained protein language models with deep learning architectures to achieve robust, type-specific prediction of secreted effectors from five secretion systems (T1SE, T2SE, T3SE, T4SE, and T6SE). It systematically benchmarks multiple embeddings, evaluates both N-terminal and C-terminal regions, and identifies the most informative features for each secretion system. These features are integrated through a two-layer ensemble stacking strategy, capturing complex patterns that single-feature or single-system models often miss. By leveraging discriminative sequence representations and optimized neural models, PLM-Effector outperforms existing effector predictors across these secretion types, providing a generalizable framework for predicting bacterial secreted proteins.
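
The terminal-region split and the two-layer stacking idea described above can be illustrated with a minimal sketch. This is not the PLM-Effector implementation: the window lengths, the helper names (`terminal_regions`, `fit_logistic`, `predict_proba`), the logistic-regression meta-learner, and the toy layer-1 probabilities are all assumptions made for illustration. In a real pipeline the layer-1 scores would come from base models trained on protein language model embeddings, typically as out-of-fold predictions.

```python
import math

def terminal_regions(seq, n_len=100, c_len=100):
    """Return (N-terminal, C-terminal) windows of a protein sequence.
    The default window lengths are assumed values, not taken from the paper."""
    return seq[:n_len], seq[-c_len:]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient-descent logistic regression.
    Used here as a stand-in layer-2 meta-learner (an assumption)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    """Probability that a protein is an effector, given its layer-1 scores."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Layer 1 (toy values, NOT real results): each row holds the probabilities that
# three hypothetical base models assign to one protein, e.g. one model per
# (embedding, terminal region) combination.
layer1_probs = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.3],
    [0.3, 0.2, 0.2],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = effector, 0 = non-effector

# Layer 2: the meta-learner combines the layer-1 probabilities.
meta_model = fit_logistic(layer1_probs, labels)

n_term, c_term = terminal_regions("ABCDEFGHIJ", n_len=3, c_len=4)
```

The stacking step lets the second layer learn how much weight each base model's score deserves, rather than averaging the scores with fixed weights.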
