BiasMap: Leveraging Cross-Attentions to Discover & Mitigate Hidden Social Biases in Text-to-Image Generation

Cross-attention attribution maps expose the representational bias that distributional fairness leaves untouched.

Rajatsubhra Chakraborty*¹, Xujun Che*¹, Depeng Xu¹, Cori Faklaris¹, Xi Niu¹, Shuhan Yuan²

¹University of North Carolina at Charlotte ²Utah State University

* Equal contribution (co-first authors)


Figure: BiasMap discovery pipeline, from Stable Diffusion to OVAM attribution maps to binary masks to IoU concept entanglement.

**Bias discovery via attribution maps.** For a generated face, BiasMap uses OVAM to extract cross-attention attribution maps for a demographic concept and a profession concept, binarizes them at a high-attention threshold, and measures their spatial overlap as an Intersection-over-Union (IoU) score. A high IoU means the demographic and the profession occupy the same pixels, which is evidence of hidden representational entanglement.
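The overlap measurement described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the quantile-based threshold, the helper names (`binarize`, `iou`, `blob`), and the Gaussian toy maps standing in for OVAM attribution maps are all assumptions made for illustration.

```python
import numpy as np

def binarize(attn_map, quantile=0.9):
    """Keep pixels at or above a quantile of the map (assumed thresholding rule)."""
    threshold = np.quantile(attn_map, quantile)
    return attn_map >= threshold

def iou(mask_a, mask_b):
    """Intersection-over-Union of two boolean masks; 0.0 if both are empty."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)

# Toy 64x64 maps: Gaussian blobs standing in for real attribution maps.
ys, xs = np.mgrid[0:64, 0:64]

def blob(center_y, center_x, sigma=8.0):
    """Hypothetical attribution map peaked at (center_y, center_x)."""
    return np.exp(-((ys - center_y) ** 2 + (xs - center_x) ** 2) / (2 * sigma ** 2))

gender_map = blob(32, 32)
entangled_profession = blob(32, 34)   # attention lands almost on the same region
separated_profession = blob(54, 10)   # attention lands far from that region

iou_entangled = iou(binarize(gender_map), binarize(entangled_profession))
iou_separated = iou(binarize(gender_map), binarize(separated_profession))
```

In this toy setup the nearly coincident maps give a much higher IoU than the well-separated ones, which is the signal BiasMap reads as concept entanglement.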

40.8%: mIoU reduction for gender entanglement

39.6%: mIoU reduction for race entanglement

0.189: best combined mIoU (FD + BiasMap)

Abstract

Bias discovery is critical for black-box generative models, especially text-to-image (TTI) models. Existing work focuses predominantly on output-level demographic distributions, which do not guarantee that concept representations are disentangled after mitigation. We propose BiasMap, a framework for uncovering latent concept-level representational biases in U-Net-based Stable Diffusion models.

BiasMap leverages cross-attention attribution maps to reveal structural entanglements between demographics (gender, race) and semantics (professions). Using these maps, we quantify spatial demographic-semantic entanglement via Intersection over Union (IoU), offering a lens into bias that remains hidden in existing fairness approaches. We further use BiasMap for mitigation through energy-guided diffusion sampling that modifies the latent noise space and minimizes the expected SoftIoU during denoising. Our findings show that existing fairness interventions may reduce the output distributional gap but often fail to disentangle concept-level coupling, whereas our method mitigates concept entanglement during generation while complementing distributional bias mitigation.

Method

Mitigation by energy-guided sampling

From a biased generation, BiasMap computes a differentiable SoftIoU between the demographic and semantic attribution maps and uses its gradient as an energy term. Combined with classifier-free guidance, this steers each denoising step toward latent states with lower concept entanglement, with no retraining and no architectural change to the diffusion model.

Figure: BiasMap mitigation pipeline via energy-guided diffusion sampling.

**Energy-guided diffusion sampling.** At each timestep the SoftIoU between attribution maps acts as a differentiable energy function; its gradient is injected as a noise correction alongside classifier-free guidance, pushing the trajectory toward disentangled representations while preserving prompt fidelity.
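To make the idea of a differentiable overlap energy concrete, here is a small NumPy sketch. This page does not give the exact SoftIoU definition, so the product-based relaxation below is an assumption, as are the function names and the step size. In the actual method the gradient is taken with respect to the latent noise and combined with classifier-free guidance; this toy applies the gradient directly to one attribution map only to show that descending the energy lowers the overlap.

```python
import numpy as np

def soft_iou(a, b, eps=1e-8):
    """Product-based soft IoU for maps with values in [0, 1] (assumed form)."""
    inter = np.sum(a * b)
    union = np.sum(a + b - a * b)
    return inter / (union + eps)

def soft_iou_grad_wrt_a(a, b, eps=1e-8):
    """Analytic gradient of soft_iou with respect to map a."""
    inter = np.sum(a * b)
    union = np.sum(a + b - a * b) + eps
    # Quotient rule with d(inter)/da = b and d(union)/da = 1 - b.
    return (b * union - inter * (1.0 - b)) / union ** 2

def energy_step(a, b, step_size):
    """One gradient-descent step on the SoftIoU energy, clipped to [0, 1]."""
    return np.clip(a - step_size * soft_iou_grad_wrt_a(a, b), 0.0, 1.0)

# Toy 32x32 maps: two overlapping Gaussian blobs.
ys, xs = np.mgrid[0:32, 0:32]
demographic = np.exp(-((ys - 16) ** 2 + (xs - 16) ** 2) / (2 * 5.0 ** 2))
profession = np.exp(-((ys - 16) ** 2 + (xs - 19) ** 2) / (2 * 5.0 ** 2))

energy_before = soft_iou(profession, demographic)
updated = profession
for _ in range(10):
    # Hypothetical step size, chosen only so the toy changes visibly.
    updated = energy_step(updated, demographic, step_size=20.0)
energy_after = soft_iou(updated, demographic)
```

Because the energy is smooth in the map values, each step has a well-defined descent direction, which is what a hard, thresholded IoU would not provide.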

Qualitative Results

Where the attention actually lands

Across baselines, the profession mask and the demographic mask sit on top of each other on the face. BiasMap pushes the profession mask off the face onto professional markers while keeping the demographic mask on the face, driving the overlap (IoU) down.

Figure: Profession-Gender concept entanglement heatmaps across models.

**Profession-Gender entanglement.** Profession, gender, and overlapping attention regions across SD1.5, FairDiffusion, ITI-GEN, DiffLens, EFA, and BiasMap. BiasMap yields lower spatial overlap than the baselines, and the combined variants, ITI-GEN + BiasMap (IG+BM) and FairDiffusion + BiasMap (FD+BM), yield the lowest overlap of all.

Figure: Profession-Race concept entanglement heatmaps across models.

**Profession-Race entanglement.** The same pattern holds for race: BiasMap and its combinations separate the demographic and semantic concepts that baseline methods leave coupled.

Key Findings

Output fairness is not latent fairness

RQ1: Bias begins inside the U-Net, before the image exists

Cross-attention attribution maps reveal bias as structured spatial patterns during diffusion. The entanglement is concentrated in the early down-sampling and final up-sampling 64×64 blocks, so across layers it follows a convex (U-shaped), non-monotonic trend that mirrors the U-Net hierarchy.

RQ2: Balanced outputs can still hide stereotyped representations

Professions remain gendered or racialized inside the model even when output distributions look balanced. The mIoU metric exposes persistent spatial co-activation that distributional fairness measures completely miss.
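The gap between the two views can be shown with a toy calculation. Everything here is hypothetical: the four-image batch, the group labels, the square masks, and the helper names are invented for illustration, and `distribution_gap` is a simple stand-in for a distributional measure rather than the metric used in the paper.

```python
import numpy as np

def distribution_gap(group_labels):
    """Absolute gap between the shares of two groups labeled 0 and 1."""
    labels = np.asarray(group_labels)
    return float(abs(np.mean(labels == 1) - np.mean(labels == 0)))

def iou(mask_a, mask_b):
    """Intersection-over-Union of two boolean masks; 0.0 if both are empty."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)

def mean_iou(mask_pairs):
    """Average IoU over (demographic_mask, profession_mask) pairs."""
    return float(np.mean([iou(a, b) for a, b in mask_pairs]))

def square_mask(top, left, size=8, shape=(16, 16)):
    """Hypothetical binary attribution mask: one filled square."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + size, left:left + size] = True
    return mask

# Hypothetical batch: perceived group labels are perfectly balanced...
group_labels = [0, 1, 0, 1]
# ...yet in every image the two masks cover almost the same pixels.
mask_pairs = [(square_mask(4, 4), square_mask(4, 5)) for _ in group_labels]

gap = distribution_gap(group_labels)
miou = mean_iou(mask_pairs)
```

Here the distributional gap is zero while the mean IoU stays high, which is the situation the finding above describes: an output-level check passes even though the concepts remain spatially coupled.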

RQ3: Energy-guided sampling disentangles the concepts

BiasMap steers sampling toward lower concept entanglement, achieving a 40.8% mIoU reduction for gender and a 39.6% reduction for race while preserving image quality. It targets the root cause rather than adjusting outputs after the fact.

Contributions

What BiasMap adds

A bias localization method that precisely identifies and quantifies representational entanglement between demographic and semantic concepts.
Intersection-over-Union (IoU) as a metric for demographic-concept bias, complementing distribution-based metrics such as Risk Difference.
A debiasing method via energy-guided diffusion sampling that modifies the latent noise space and minimizes the expected SoftIoU during diffusion.
Empirical evidence that distribution-based mitigation leaves internal representational bias intact, validated across SD 1.5, SD 2.0, and SD 2.1.

Acknowledgements

This work was supported in part by the U.S. National Science Foundation under grant 2348391.

Citation

BibTeX

@inproceedings{biasmap2026,
  title     = {BiasMap: Leveraging Cross-Attentions to Discover and
               Mitigate Hidden Social Biases in Text-to-Image Generation},
  author    = {Chakraborty, Rajatsubhra and Che, Xujun and Xu, Depeng
               and Faklaris, Cori and Niu, Xi and Yuan, Shuhan},
  booktitle = {Proceedings of the 32nd ACM SIGKDD Conference on Knowledge
               Discovery and Data Mining (KDD)},
  year      = {2026},
  doi       = {10.1145/3770855.3818098}
}