Women Wearing Lipstick: Measuring the Bias Between an Object and Its Related Gender

Overview

In this paper, we investigate the impact of objects on gender bias in image captioning systems. Our results show that only gender-specific objects carry a strong gender bias (e.g. woman-lipstick). In addition, we propose a visual semantic-based gender score that measures the degree of this bias and can be used as a plug-in for any image captioning system. Our experiments demonstrate the utility of the gender score: it measures the bias relation between a caption and its related gender, and can therefore be used as an additional metric alongside the existing Object Gender Co-Occ approach.

This repository contains the implementation of the paper Women Wearing Lipstick: Measuring the Bias Between an Object and Its Related Gender (Findings of EMNLP 2023).

Quick Start

For a quick start, please have a look at this project page, the paper demo, and a recent demo with LLaMA-3.2.

Requirements

  • Python 3.7
  • sentence_transformers 2.2.2

conda create -n gender_score python=3.7 anaconda
conda activate gender_score
pip install -U sentence-transformers
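
As an optional sanity check of the environment (not part of the repository scripts), the snippet below loads the default SBERT model used throughout the examples and computes a caption-object similarity with sentence-transformers:

from sentence_transformers import SentenceTransformer, util

# Load the default SBERT model used by the scripts below.
sbert = SentenceTransformer('roberta-large-nli-stsb-mean-tokens')

# Embed a caption and a visual-context label and compare them.
emb = sbert.encode(['a man sitting on a blue motorcycle in a parking lot',
                    'motor scooter'], convert_to_tensor=True)
print(util.cos_sim(emb[0], emb[1]).item())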

Gender Score

In this work, we propose two object-to-gender bias scores: (1) a direct Gender Score (GS) and (2) a [MASK]-based Gender Score Estimation (GE). For the direct score, the model uses the visual context to predict the degree of the related gender-object bias.

To run the Gender Score

python model_GS.py

using any of the following pre-trained models:

parser.add_argument('--vis', default='visual-context_label.txt',help='class-label from the classifier (CLIP)', type=str, required=True)  
parser.add_argument('--vis_prob', default='visual-context.txt', help='prob from the classifier (ResNet152/CLIP)', type=str, required=True) 
parser.add_argument('--c',  default='caption.txt', help='caption from the baseline (any)', type=str, required=True) 
parser.add_argument('--GPT2model', default="gpt2", help='gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2', type=str, required=False)  
parser.add_argument('--BERTmodel', default='roberta-large-nli-stsb-mean-tokens', help='all-mpnet-base-v2, multi-qa-mpnet-base-dot-v1, all-distilroberta-v1', type=str, required=False) 

To run the Gender Score (e.g. man-motorcycle), we need three inputs: (1) the caption $y$ with its associated gender $a$, written $y_{a}$; (2) the object information $o$ (i.e. the visual bias) extracted from the image $I$, written $o(I)$; and (3) $\text{P}(c_{o})$, the probability confidence of the object, i.e. the bias in the image. Please refer to the paper for more details (to extract the object visual information, please refer to this page).
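
Each input is a one-line plain-text file (one line per image-caption pair). As an illustration only, the snippet below writes the example values used in the command that follows, using the default file names expected by the --c, --vis, and --vis_prob arguments:

# Illustration only: write the three one-line input files for the example below.
with open('caption.txt', 'w') as f:               # --c: caption from the baseline
    f.write('a man sitting on a blue motorcycle in a parking lot\n')
with open('visual-context_label.txt', 'w') as f:  # --vis: object label from the classifier
    f.write('motor scooter\n')
with open('visual-context.txt', 'w') as f:        # --vis_prob: classifier confidence
    f.write('0.222983188\n')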

input

Caption: a man sitting on a blue motorcycle in a parking lot
visual context: motor scooter
visual context prob: 0.222983188
python model_GS.py --GPT2model gpt2  --BERTmodel roberta-large-nli-stsb-mean-tokens --vis  man_motorcycle_GS/man_motorcycle_visual_context.txt --vis_prob  man_motorcycle_GS/man_motorcycle_visual_context_prob.txt --c man_motorcycle_GS/man_motorcycle_caption.txt

output gender_score_output.txt

a man sitting on a blue motorcycle in a parking lot,  object-gender_score: 0.3145708898422527

By also computing the object-gender_score for the woman caption (0.27773833243385865), we can estimate the object-to-gender bias ratio toward men at 53%.
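
The 53% figure is simply the normalized ratio of the two direct scores; a minimal sketch of the arithmetic:

# Normalized object-to-gender bias ratio from the two direct Gender Scores.
score_man = 0.3145708898422527
score_woman = 0.27773833243385865
ratio_to_man = 100 * score_man / (score_man + score_woman)
print(round(ratio_to_man))  # -> 53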

Gender Score Estimation

Additionally, inspired by masked language modeling, the model can estimate the [MASK] gender using the bias relation between the gender and the object information from the image.
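
Conceptually, the estimation fills the [MASK] slot with each gender candidate and scores the resulting captions against the same visual context, as in the example below; a hypothetical sketch of the substitution step:

# Hypothetical sketch: expand the [MASK] template into gendered candidate captions,
# which are then scored the same way as the direct Gender Score.
template = 'a [MASK] riding a motorcycle on a road'
candidates = ['man', 'woman']
gendered_captions = [template.replace('[MASK]', g) for g in candidates]
print(gendered_captions)
# ['a man riding a motorcycle on a road', 'a woman riding a motorcycle on a road']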

Example

input

Caption: a [MASK] riding a motorcycle on a road
visual context: motor scooter
visual context prob: 0.2183
python model_GE.py --GPT2model gpt2  --BERTmodel roberta-large-nli-stsb-mean-tokens --vis  man_motorcycle_GE/visual_context_demo_motorcycle.txt --vis_prob  man_motorcycle_GE/visual_context_prob_demo_motorcycle.txt --c man_motorcycle_GE/caption_demo_motorcycle_MASK.txt

output

# object-to-m bias 
caption_m a man riding a motorcycle on a road
LM: 0.12759140133857727 # initial bias without visual
cosine distance score (sim): 0.5452305674552917 # gender object distance 
gender score_m: 0.45320714150193153

# object-to-w bias 
caption_w a woman riding a motorcycle on a road
LM: 0.11249390989542007 # initial bias without visual
cosine distance score (sim): 0.5037289261817932 # gender object distance 
gender score_w: 0.39912252800731546

# most object-to-gender bias 
object_gender_caption: a man riding a motorcycle on a road
ratio_to_m: 53.17275201306536
ratio_to_w: 46.82724798693463
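
For intuition, the numbers above are consistent with the belief-revision style combination used in the acknowledged lm-score re-ranker: the GPT-2 caption probability (LM) is revised by the caption-object SBERT similarity (sim), weighted by the visual confidence. The sketch below is an inferred approximation, not the repository code, but it reproduces the example output:

def gender_score(lm_prob, sim, vis_prob):
    # Assumption (inferred from the lm-score visual re-ranker): raise the caption
    # LM probability to a power that shrinks as the caption-object similarity
    # and the visual confidence grow, so strongly related objects boost the score.
    alpha = ((1.0 - sim) / (1.0 + sim)) ** (1.0 - vis_prob)
    return lm_prob ** alpha

s_m = gender_score(0.12759140133857727, 0.5452305674552917, 0.2183)
s_w = gender_score(0.11249390989542007, 0.5037289261817932, 0.2183)
print(s_m, s_w)                 # ~0.4532 and ~0.3991, matching the output above
print(100 * s_m / (s_m + s_w))  # ~53.17 (ratio_to_m)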

Citation

The details of this repository are described in the following paper. If you find this repository useful, please cite it:

@article{sabir2023women,
  title={Women Wearing Lipstick: Measuring the Bias Between an Object and Its Related Gender},
  author={Sabir, Ahmed and Padr{\'o}, Llu{\'\i}s},
  journal={arXiv preprint arXiv:2310.19130},
  year={2023}
}

Acknowledgement

The implementation of the Gender Score relies on resources from lm-score, Huggingface Transformers, and SBERT. We thank the original authors for their well-organized codebases.
