JULI: Jailbreak Large Language Models by Self-Introspection
Proceedings of the International Conference on Learning Representations (ICLR), 2026
We propose Jailbreaking Using LLM Introspection (JULI), which jailbreaks LLMs by manipulating their token log probabilities with a tiny plug-in block, BiasNet.
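As a rough illustration of the core idea only (BiasNet's actual architecture, size, and training procedure are not described here, so everything below is an assumption), a plug-in that manipulates an LLM's next-token log probabilities can be sketched as a small module that maps a hidden state to an additive bias over the vocabulary logits before sampling:

```python
import numpy as np

class BiasNet:
    """Hypothetical tiny plug-in: maps a hidden state to an additive
    bias over the vocabulary logits (a sketch, not the paper's model)."""
    def __init__(self, hidden_dim, vocab_size, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = rng.normal(scale=0.01, size=(hidden_dim, vocab_size))
        self.b = np.zeros(vocab_size)

    def __call__(self, hidden_state):
        # Linear map from hidden state to a per-token logit bias.
        return hidden_state @ self.W + self.b

def biased_log_probs(logits, hidden_state, bias_net):
    """Add BiasNet's bias to the base logits, then renormalize
    into a valid log-probability distribution."""
    shifted = logits + bias_net(hidden_state)
    shifted = shifted - shifted.max()          # numerical stability
    return shifted - np.log(np.exp(shifted).sum())

# Toy usage: 4-dim hidden state, 10-token vocabulary.
rng = np.random.default_rng(1)
net = BiasNet(hidden_dim=4, vocab_size=10)
logits = rng.normal(size=10)
hidden = rng.normal(size=4)
log_p = biased_log_probs(logits, hidden, net)
```

In a real attack setting the bias module would be trained toward some objective; here it only demonstrates how a small add-on can reshape the token distribution without touching the base model's weights.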
