Projects | Efthymios (Themis) Tsaprazlis

IARPA HIATUS: Human Interpretable Attribution of Text using Underlying Structure

Sep 2025 – Present USC ISI · UMD · UMich · UBirmingham

Formalized evaluation protocols for user and attribute privacy leakage in speech.
Developing a dataset and benchmark for authorship attribution from spoken language.
Designing methods for speaker anonymization across both acoustic and linguistic modalities.

Speaker Verification · Speaker Anonymization · DP

Visual Privacy Understanding

Sep 2024 – Present USC · Amazon

Received a gift award from the USC-Amazon Center on Secure and Trusted Machine Learning.
Developed a taxonomy of visual privacy risks grounded in legal frameworks.
Evaluated VLMs on privacy recognition and compositional reasoning.
Designing deployable VLM-as-a-judge models for privacy severity assessment, along with a dataset of compositional privacy risks.
Building a privacy advisory agent to detect, localize, and sanitize sensitive visual content for regulatory compliance.

Vision Language Models · Benchmarking · Reasoning · Agentic AI

Horizon Europe - PILLAR-Robots: Purposeful Intrinsically motivated Lifelong Learning Autonomous Robots

Jan 2024 – Jul 2024 Athena RC · UDCoruna · CNR · Sorbonne · AI2Life · PAL Robotics

Developed robotic simulation pipelines.
Enabled the robot to learn useful information about its environment with minimal interaction during the exploration phase.
Investigated domain adaptation techniques to transfer learned behaviors from simulation to real-world robotic settings.

Vision Language Models · Robotics · Lifelong Learning

Enhancing Vision-Language Pre-training

Nov 2022 – Jul 2022 NTUA · UT Austin

Proposed a multimodal extension of contrastive vision–language models by incorporating dialogue as an additional modality.
Generated synthetic dialogue data to augment image–text datasets with richer contextual descriptions.
Developed a domain adaptation method that fits models to dialogue inputs via a dual contrastive objective while preserving generalization.

Vision Language Models · Self-supervised Learning · Domain Adaptation · Synthetic Data