---
authors:
  - Susanne RK Zajitschek
  - Felix Zajitschek
  - Russell Bonduriansky
  - Robert C Brooks
  - Will Cornwell
  - Daniel S Falster
  - Malgorzata Lagisz
  - Jeremy Mason
  - Alistair M Senior
  - Daniel WA Noble
  - Shinichi Nakagawa
  - Evolution & Ecology Research Center, School of Biological, Earth, and Environmental Sciences, University of New South Wales, Sydney, Australia; 
  - Liverpool John Moores University, School of Biological and Environmental Sciences, Liverpool, United Kingdom; 
  - European Bioinformatics Institute (EMBL- EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, United Kingdom; 
  - University of Sydney, Charles Perkins Centre, School of Life and Environmental Sciences, School of Mathematics and Statistics, Sydney, Australia; 
  - Division of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australia
bibliography: SexDiff.bib
---
    


# Sexual dimorphism in trait variability and its eco-evolutionary and statistical implications

## Abstract 
Biomedical and clinical sciences are experiencing a renewed interest in the fact that males and females differ in many anatomic, physiological, and behavioural traits. Sex differences in trait variability, however, are yet to receive similar recognition. In medical science, mammalian females are assumed to have higher trait variability due to estrous cycles (the ‘estrus-mediated variability hypothesis’); historically in biomedical research, females have been excluded for this reason. Contrastingly, evolutionary theory and associated data support the ‘greater male variability hypothesis’. Here, we test these competing hypotheses in 218 traits measured in >26,900 mice, using meta-analysis methods. Neither hypothesis could universally explain patterns in trait variability. Sex bias in variability was trait-dependent. While greater male variability was found in morphological traits, females were much more variable in immunological traits. Sex-specific variability has eco-evolutionary ramifications, including sex-dependent responses to climate change, as well as statistical implications including power analysis considering sex difference in variance.

## Introduction
Sex differences arise because selection acts on the two sexes differently, especially on traits associated with mating and reproduction (@darwin1871) (Darwin, 1871). Therefore, sex differences are widespread, a fact which is unsurprising to any evolutionary biologist. However, scientists in many (bio-)medical fields have not necessarily regarded sex as a biological factor of intrinsic interest (@clayton2016; @flanagan2014; @karp2017; @klein2015; @prendergast2014; @shansky2016) (Clayton, 2016; Flanagan, 2014; Karp et al., 2017; Klein et al., 2015; Prendergast et al., 2014; Shansky and Woolley, 2016). Therefore, many (bio-)medical studies have only been conducted with male subjects. Consequently, our knowledge is biased. For example, we know far more about drug efficacy in male compared to female subjects, contributing to a poor understanding of how the sexes respond differently to medical interventions (@nowogrodzki2017)(Nowogrodzki, 2017). This gap in knowledge is predicted to lead to overmedication and adverse drug reactions in women (@zucker2020) (Zucker and Prendergast, 2020). Only recently have (bio-)medical scientists started considering sex differences in their research (@dorris2015; @ingvorsen2017; @robinson2017; @smarr2017; @ahmad2017; @foltin2018; @thompson2018)(Dorris et al., 2015; Ingvorsen et al., 2017; Robinson et al., 2017; Smarr et al., 2017;Ahmad et al., 2017; Foltin and Evans, 2018; Thompson et al., 2018). Indeed, the National Insti- tutes of Health (NIH) have now implemented new guidelines for animal and human research study designs, requiring that sex be included as a biological variable (@clayton2016; @clayton2014; @nih2015a) (NIH, 2015a(Clayton, 2016; Clayton and Collins, 2014; NIH, 2015a). 
When comparing the sexes, biologists generally focus on mean differences in trait values, placing little or no emphasis on sex differences in trait variability (see Figure 1 for a diagram explaining differences in means and variances). Despite this, two hypotheses exist that explain why trait variability might be expected to differ between the sexes. Interestingly, these two hypotheses make opposing predictions.  

## Figure 1
 ![Fig1](./Figures/Figure1.png)

### Overview of meta-analytic methods used to detect differences in means and variances in any given trait (e.g. body size in mice).
The orange shading represents females (F), turquoise shading stands for males (M). The solid circle represents a mean trait value within the respective group. Solid lines represent standard deviation, with upper and lower bounds indicated by diamond shapes. Below, we present three types of effect sizes that can be used for comparing two groups, along with the respective formulas and interpretations. Compared to lnVR (the ratio of SD), lnCVR (the ratio of CV or relative variance) provides a more general measure of the difference in variability between two groups (mean-adjusted variability ratio).
 
 ## Figure 1—figure supplement 1 
 
 ![Fig1sup](./Figures/Figure1sup.png)
 
 ### Mean-variance relationships (log(Mean) vs log(SD, standard deviation)) across all traits for males (A) and females (B).

First, the ‘estrus-mediated variability hypothesis’ (Figure 2), which emerged in the (bio-)medical research field, assumes that the female estrous cycle (see e.g. Prendergast et al., 2014; Beery and Zucker, 2011) causes higher variability across traits in female subjects. A wide range of labile traits are presumed to co-vary with physiological changes that are induced by reproductive hormones. High variability is, therefore, expected to be particularly prominent when the stage of the estrous cycle is unknown and unaccounted for. This higher trait variability, resulting from females being at different stages of their estrous cycle, is the main reason for why female research subjects are often excluded from biomedical research trials, especially in the fields of neuroscience, physiology and pharmacology (NIH, 2015a). Female exclusion has traditionally been justified based on the grounds that including females in empirical research leads to a loss of statistical power, or that animals must be sampled across the estrous cycle for one to make valid conclusions, requiring more time and resources.    

## Figure 2
 ![Fig2](./Figures/Figure2.png)
 
### The two hypotheses (‘greater male variability’ versus ‘estrus-mediated variability’) have different predictions on how variabilities influence total observed phenotypic variance (Vtotal in the figure).
For greater male variability, the within-subject (or within-trait) variation Vwithin could be potentially negligible or is equal in males and females. This is illustrated as the shaded distributions around each individual mean (dashed vertical lines), which are of equal area for the males (turquoise) and females (orange). The greater value of Vtotal is driven by wider distribution of mean trait values in males compared to females (i.e. Vbetween, represented by a thick horizontal bar). The estrus-mediated variability hypothesis, in contrast, assumes that within-subject [or within-trait] variability is much higher in females than in males (broader orange-shaded trait distributions than turquoise distributions), while the variability of the means between individuals stays the same (thick horizontal bars).
 

Second, the ‘greater male variability hypothesis’ suggests males exhibit higher trait variability because of two different mechanisms. The first mechanism is based on males being the heteroga- metic sex in mammals. Mammalian females possess two X chromosomes, leading to an ‘averaging’ of trait expression across the genes on each chromosome. In contrast, males exhibit greater variance because expression of genes on a single X chromosome is likely to lead to more extreme trait values (Reinhold and Engqvist, 2013). The second mechanism is based on males being under stronger sex- ual selection (Pomiankowski and Moller, 1995; Cuervo and Møller, 1999; Cuervo and Møller,2001). Empirical evidence supports higher variability of traits that are sexually selected, often har- bouring high genetic variance and being condition-dependent, which makes sense as ‘condition’ as a trait is likely to be based on numerous loci (Rowe and Houle, 1996; Tomkins et al., 2004). Thus, higher genetic and, thus, phenotypic variance resulting from sexual selection is expected to charac- terise sexually selected traits. In mammals, it is likely that both mechanisms are operating concomi- tantly. So far, the ‘greater male variability hypothesis’ has gained some support in the evolutionary and psychological literature (Reinhold and Engqvist, 2013; Lehre et al., 2009).
Here, we conduct the first comprehensive test of the greater male variability and estrus-mediated variability hypotheses in mice (Figure 2; Reinhold and Engqvist, 2013; Johnson et al., 2008;Hedges and Nowell, 1995; Itoh and Arnold, 2015; Becker et al., 2016; Beery, 2018), examining sex differences in variance across 218 traits in 26,916 animals. To this end, we carry out a series of meta-analyses in two steps (Figure 3). First, we quantify the natural logarithm of the male to female coefficients of variation, CV, or relative variance (lnCVR) for each cohort (population) of mice, for dif- ferent traits, along with the variability ratio of male to female standard deviations, SD, on the log scale (lnVR, following Nakagawa et al., 2015, see Figure 1). Then, we analyse these effect sizes to quantify sex bias in variance for each trait using meta-analytic methods. To better understand our results, and match them to previously reported sex differences in trait means (Karp et al., 2017), we also quantify and analyse the log response ratio (lnRR). Next, we statistically amalgamate the trait- level results to test our hypotheses and to quantify the degree of sex bias in and across nine functional trait groups (for details on the grouping, see below). Our meta-analytic approach allows easy interpretation and comparison with earlier and future studies. Further, the proposed method using lnCVR (and lnVR) is probably the only practical method to compare variability between two sexes within and across studies (Nakagawa et al., 2015; Senior et al., 2020), as far as we are aware. Also, the use of a ratio (i.e. lnRR, lnVR, lnCVR) between two groups (males and females) naturally controls for different units (e.g. cm, g, ml) as well as for changes in traits over time and space.

## Figure 3
 ![Fig3](./Figures/Figure3.png)
 
### Workflow of data processing and meta-analysis.

## Results
Data characteristics and workflow
We used a dataset compiled by the International Mouse Phenotyping Consortium (Dickinson et al., 2016) (IMPC, dataset acquired 6/2018). To gain insight into systematic sex differences, we only included data of wildtype-strain adult mice, between 100 and 500 days of age. We removed cases with missing data, and selected measurements that were closest to 100 days of age (young adult) when multiple measurements of the same trait were available. To obtain robust estimates of sex dif- ferences, we only used data on traits that were measured in at least two different institutions (see workflow diagram, Figure 3).


 
Our dataset comprised 218 continuous traits (after initial data cleaning and pre-processing; Figure 3). It contains information from 26,916 mice from nine wildtype strains that were studied across 11 institutions. We combined mouse strain/institution information to create a biological grouping variable (referred to as ‘population’ in Figure 3B; see also Supplementary file 1, Table 1 for details), and the mean and variance of a trait for each population was quantified. We assigned traits according to related procedures into functionally and/or procedurally related trait groups to enhance interpretability (referred to as ‘functional groups’ hereafter; see also Figure 3G). Our nine functional trait groups were: behaviour, morphology, metabolism, physiology, immunology, hematology, heart, hearing and eye (for the rationale of these functional groups and related details, see Methods and Supplementary file 1, Table 3).

```{r packages1, eval=TRUE, echo=FALSE, message=FALSE, warning=FALSE, messages=FALSE, paged.print=FALSE}
library(readr)
library(dplyr)
library(devtools)
library(purrr)
library(tidyverse)
library(tidyr)
library(tibble)
library(kableExtra)
library(ggpubr)
library(ggplot2)
library(png)
library(grid)
library(splus2R)
library(metafor)
library(robumeta)

# pacman::p_load(readr, dplyr,metafor, devtools, purrr, tidyverse, tidyr, tibble, kableExtra, robumeta, ggpubr, ggplot2, png, grid, here, knitr, pander, splus2R)
```
Loading data & functions
```{r}
data <- readRDS(github://itchyshin/mice_sex_diff/export/data_clean.rds) 
procedures <- read_csv(github://itchyshin/mice_sex_diff/data/procedures.csv)
n <- length(unique(data$id))

# loads the raw data, setting some default types for various columns
load_raw <- function(filename) {
  read_csv(filename,
    col_types = cols(
      .default = col_character(),
      project_id = col_character(),
      id = col_character(),
      parameter_id = col_character(),
      age_in_days = col_integer(),
      date_of_experiment = col_datetime(format = ""),
      weight = col_double(),
      phenotyping_center_id = col_character(),
      production_center_id = col_character(),
      weight_date = col_datetime(format = ""),
      date_of_birth = col_datetime(format = ""),
      procedure_id = col_character(),
      pipeline_id = col_character(),
      biological_sample_id = col_character(),
      biological_model_id = col_character(),
      weight_days_old = col_integer(),
      datasource_id = col_character(),
      experiment_id = col_character(),
      data_point = col_double(),
      age_in_weeks = col_integer(),
      `_version_` = col_character()
    )
  )
}
# Apply some standard cleaning to the data
clean_raw_data <- function(mydata) {
  
  group <- read_csv((github://itchyshin/mice_sex_diff/data/ParameterGrouping.csv)
  
  tmp <- 
    mydata %>%
    # Filter to IMPC source (recommend by Jeremey in email to Susi on 20 Aug 2018)
    filter(datasource_name == "IMPC") %>%

# standardise trait names
    mutate(parameter_name = tolower(parameter_name)) %>%
    # remove extreme ages
    filter(age_in_days > 0 & age_in_days < 500) %>%
    # remove NAs
    filter(!is.na(data_point)) %>%
    # subset to reasonable set of variables, date_of_experiment used as an indicator of batch-level effects
    select(production_center, 
           strain_name, 
           strain_accession_id, 
           biological_sample_id, 
           pipeline_stable_id, 
           procedure_group, 
           procedure_name, 
           sex, 
           date_of_experiment, 
           age_in_days, 
           weight, 
           parameter_name, 
           data_point) %>% 
    # sort
    arrange(production_center, biological_sample_id, age_in_days)
      
    # filter to groups with > 1 centre  
    merge(tmp, 
          tmp %>% group_by(parameter_name) %>%
    summarise(center_per_trait = length(unique(production_center, na.rm = TRUE)))
        )%>%
    filter(center_per_trait >= 2) %>% 
    # Define population variable
    mutate(population = sprintf("%s-%s", production_center, strain_name)) %>% 
    # add grouping variable: these were decided based on functional groups and procedures 
    mutate(parameter_group = group$parameter[match(parameter_name, group$parameter_name)] ) %>%
    
    # Assign unique IDs (per trait)
    # each unique parameter_name (=trait,use trait variable) gets a unique number ('id')
    # We add a new variable, where redundant traits are combined
    #[note however, at this stage the dataset still contains nonsensical traits, i.e. traits that may not contain any information on variance]
    mutate(id = match(parameter_name, unique(parameter_name))) %>% 
    as_tibble()
    }
    
    data_subset_parameterid_individual_by_age <- function(mydata, parameter, age_min=0, age_center=100) {
  tmp <- mydata %>%
    filter(age_in_days >= age_min,
           id == parameter) %>%
  # Take results for single individual closest to age_center
    mutate(age_diff = abs(age_center - age_in_days)) %>%
    group_by(biological_sample_id) %>%
    filter(age_diff == min(age_diff)) %>%
    select(-age_diff)
    
  # Still some individuals with multiple records (because same individual appears under different procedures, so filter to one record)
  j <- match(unique(tmp$biological_sample_id), tmp$biological_sample_id)
  tmp[j, ] 
  }
  
  # Function re-organises the data so that the male and female data are side-by-side as columns to make effect size calculations easier.
multi_spread <- function(df, key, value) {
    # quote key
      keyq <- rlang::enquo(key)
    # break value vector into quotes
    valueq <- rlang::enquo(value)
         s <- rlang::quos(!!valueq)
        
         df                           %>% 
        gather(variable, value, !!!s) %>%
        unite(temp, !!keyq, variable) %>%
        spread(temp, value)
}
# Function will calculate population stats and exclude data that is irrelevant and not useful, it will also just convert the data straight away to wide format making it ready for effect size calculations
calculate_population_stats <- function(mydata, min_individuals = 5){
    mydata %>% 
    group_by(population,
             production_center,  
             strain_name,  
             sex) %>%
    summarise(trait  = parameter_name[1],
              x_bar  = mean(data_point, na.rm = TRUE),
              x_sd   =   sd(data_point, na.rm = TRUE),
              n_ind  = n()) %>%
    ungroup() %>%
    filter(n_ind > min_individuals) %>%   
    group_by(population) %>%
    filter(all(c("male", "female") %in% sex)) %>% # Removes any entries where only 1 sex is present
    multi_spread(sex, c(x_bar, x_sd, n_ind))      # Spreads the data out into wide format, so sexes are beside each other which is more typical of meta-analytic data
}

#This function takes the data and computes lnCVR, lnVR and ROM (lnRR) using male and female data to generate effect size statistics for meta-analyses
create_meta_analysis_effect_sizes <- function(mydata, measure = c("CVR", "VR", "ROM")){
  for(i in 1:length(measure)){
    mydata <- metafor::escalc(m1i  = male_x_bar, 
                              m2i  = female_x_bar, 
                              sd1i = male_x_sd, 
                              sd2i = female_x_sd, 
                              n1i  = male_n_ind, 
                              n2i  = female_n_ind, 
                              data = mydata, 
                           measure = measure[i], 
                         var.names = c(paste0("effect_size_", measure[i]), 
                                      paste0("sample_variance_", measure[i])),
                            append = TRUE)
    }
    return(mydata)
}


```

```{r metaanalysis prep, eval=TRUE, echo=FALSE, message=FALSE, warning=FALSE, messages=FALSE, paged.print=FALSE}
# Create dataframe to store results
results_alltraits_grouping <- 
    data.frame(tibble(id = 1:n, 
           lnCVR=0, lnCVR_lower=0, lnCVR_upper=0, lnCVR_se=0, lnCVR_I2=0,
            lnVR=0,  lnVR_lower=0,  lnVR_upper=0, lnVR_se=0, lnVR_I2=0,
            lnRR=0,  lnRR_lower=0,  lnRR_upper=0, lnRR_se=0, lnRR_I2=0,
           k=0, trait=0, male_N = 0, female_N=0, min_male_N = 0, max_male_N = 0, mean_male_N = 0,
              min_female_N = 0, max_female_N = 0, mean_female_N = 0))
for (t in 1:n) {
  tryCatch(
    {
      results <- data %>%
                 data_subset_parameterid_individual_by_age(t) %>%
                 calculate_population_stats()                 %>%
                 create_meta_analysis_effect_sizes()          %>%
                 mutate(err = seq_len(n()))
      # lnCVR,  log repsonse-ratio of the coefficient of variance
      cvr <- metafor::rma.mv(yi = effect_size_CVR, V = sample_variance_CVR, 
                             random = list(~ 1 | strain_name, 
                                           ~ 1 | production_center, 
                                           ~ 1 | err), 
                             control = list(optimizer = "optim", 
                                            optmethod = "Nelder-Mead", 
                                            maxit = 1000), 
                             verbose = F, data = results)

# lnVR, comparison of standard deviations
      cv <- metafor::rma.mv(yi = effect_size_VR, V = sample_variance_VR,
                            random = list(~ 1 | strain_name, 
                                          ~ 1 | production_center, 
                                          ~ 1 | err), 
                            control = list(optimizer = "optim", 
                                           optmethod = "Nelder-Mead", 
                                           maxit = 1000), 
                            verbose = F, data = results)
      # for means, lnRR
      means <- metafor::rma.mv(yi = effect_size_ROM, V = sample_variance_ROM, 
                               random = list(~ 1 | strain_name, 
                                             ~ 1 | production_center, 
                                             ~ 1 | err), 
                               control = list(optimizer = "optim", 
                                              optmethod = "Nelder-Mead", 
                                              maxit = 1000), 
                               verbose = F, data = results)
      
      f <- function(x) unlist(x[c("b", "ci.lb", "ci.ub", "se")])
      
      extract_I2 <- function(mod){
                  sigma2 <- mod$sigma2
                       w <- mod$vi
                       k <- mod$k
                  
                  sigmaM <- sum(w * (k-1)) / (sum(w)^2 - sum(w^2))
                  
                    I2_tot <- round(sum(sigma2) / sum(sigma2 + sigmaM), digits = 3)*100
                  return(I2_tot)
      }
      results_alltraits_grouping[t, c(2:17,19:26)] <- c(f(cvr),lnCVR_I2 = extract_I2(cvr), f(cv), lnVR_I2 = extract_I2(cv), f(means), lnRR_I2 = extract_I2(means), k=means$k, male_N = sum(results$male_n_ind), female_N = sum(results$female_n_ind), min_male_N = min(results$male_n_ind), max_male_N = max(results$male_n_ind), mean_male_N = mean(results$male_n_ind),
                                                        min_female_N = min(results$female_n_ind), max_female_N = max(results$female_n_ind), mean_female_N = mean(results$female_n_ind))
      results_alltraits_grouping[t, 18]   <- unique(results$trait)
    },
    error = function(e) {
      cat("ERROR :", t, conditionMessage(e), "\n")
    }
  )
}

results_alltraits_grouping2 <- 
  results_alltraits_grouping %>% 
  
  # Join data with results_alltraits_grouping. We filter duplicated id's to get only one unique row per id (and there is one id per parameter_name)
  left_join(by="id",
             data %>% 
             select(id, parameter_group, procedure = procedure_name, procedure_name, parameter_name) %>%  
             filter(!duplicated(id))) %>%
 
# Below we add 'procedure' (from the previously loaded 'procedures.csv') as a variable; n <- length(unique(results_alltraits_grouping2$parameter_name))) should equal 232
  left_join(by="procedure", 
            procedures %>% distinct())
# We exclude 14 parameter names for which metafor models didn't converge ("dp t cells", "mzb (cd21/35 high)"), and of parameters that don't harbour enough variation
meta_clean <- results_alltraits_grouping2 %>% 
	            filter(!parameter_name %in% 
	                     c("dp t cells", "mzb (cd21/35 high)", "number of caudal vertebrae", 
	                       "number of cervical vertebrae", "number of digits", "number of lumbar vertebrae", 
	                       "number of pelvic vertebrae", "number of ribs left","number of ribs right", 
	                       "number of signals", "number of thoracic vertebrae", "total number of acquired events in panel a",
	                       "total number of acquired events in panel b", "whole arena permanence"))
# Summary of the cleaned data for 218 traits
  sd(meta_clean$k)
  range(meta_clean$male_N)
  range(meta_clean$female_N)
  range(meta_clean$lnCVR_I2)
  range(meta_clean$lnVR_I2)
  range(meta_clean$lnRR_I2)
  range(meta_clean$k)
  mean(meta_clean$mean_male_N)
  range(meta_clean$min_male_N)
  median(meta_clean$mean_male_N)
  sd(meta_clean$mean_male_N)
  sd(meta_clean$mean_male_N) / sqrt(length(meta_clean$mean_male_N))
  mean(meta_clean$mean_female_N)
  median(meta_clean$mean_female_N)
  range(meta_clean$min_female_N)
  sd(meta_clean$mean_female_N)
  sd(meta_clean$mean_female_N) / sqrt(length(meta_clean$mean_female_N))
  
  meta1_sub <- meta_clean %>%
  # Add summary of number of parameter names in each parameter group
  group_by(parameter_group) %>%
  mutate(par_group_size = length(unique(parameter_name)), 
         sampleSize = as.numeric(k)) %>% 
  ungroup() %>% 
  # Create subsets with > 1 count (par_group_size > 1)
  filter(par_group_size > 1) # 90 observations
  
  # Create summary of number of parameter names in each parameter group, and merge back together
meta1b <-
  meta1 %>%
  group_by(parameter_group) %>% 
  summarize(par_group_size = length(unique(parameter_name, na.rm = TRUE)))
meta1$par_group_size <- meta1b$par_group_size[match(meta1$parameter_group, meta1b$parameter_group)]
# Create subsets with > 1 count (par_group_size > 1) 
           meta1_sub <- subset(meta1,par_group_size >1) # 90 observations   
meta1_sub$sampleSize <- as.numeric(meta1_sub$k)
# Nesting and meta-analyses on correlated traits, using robumeta
n_count <- meta1_sub %>%
           group_by(parameter_group) %>%
           mutate(raw_N = sum(sampleSize)) %>%
           nest() %>%
           ungroup()
model_count <- n_count %>%
  mutate(
     model_lnRR = map(data, ~ robu(.x$lnRR ~ 1,  data = .x, studynum = .x$id, modelweights = c("CORR"), rho = 0.8, small = TRUE, var.eff.size = (.x$lnRR_se)^2)),
     model_lnVR = map(data, ~ robu(.x$lnVR ~ 1,  data = .x, studynum = .x$id, modelweights = c("CORR"), rho = 0.8, small = TRUE, var.eff.size = (.x$lnVR_se)^2)),
    model_lnCVR = map(data, ~ robu(.x$lnCVR ~ 1, data = .x, studynum = .x$id, modelweights = c("CORR"), rho = 0.8, small = TRUE, var.eff.size = (.x$lnCVR_se)^2))
  )
 ```
Additional meta-analyses and preparation for plots

 ```{r extracting estimates}
#### Extracting and save parameter estimates
# Here we apply an additional function to collect the outcomes of the 'mini-meta-analysis' that has condensed our non-independent traits. Values from our second-order meta-analysis using robu-meta are then extracted
count_fun <- function(mod_sub) {
  return(c(mod_sub$reg_table$b.r, mod_sub$reg_table$CI.L, mod_sub$reg_table$CI.U, mod_sub$reg_table$SE))
} # estimate, lower ci, upper ci, SE

# Extraction of values created during meta-analyses using robumeta
robusub_RR <- model_count %>%
              transmute(parameter_group, 
                        estimatelnRR = map(model_lnRR, count_fun)) %>%
              mutate(r = map(estimatelnRR, 
                             ~ data.frame(t(.)))) %>%
              unnest(r) %>%
              select(-estimatelnRR) %>%
              purrr::set_names(c("parameter_group", "lnRR", "lnRR_lower", "lnRR_upper", "lnRR_se"))
robusub_CVR <- model_count %>%
               transmute(parameter_group, 
                         estimatelnCVR = map(model_lnCVR, count_fun)) %>%
               mutate(r = map(estimatelnCVR, 
                              ~ data.frame(t(.)))) %>%
               unnest(r) %>%
               select(-estimatelnCVR) %>%
               purrr::set_names(c("parameter_group", "lnCVR", "lnCVR_lower", "lnCVR_upper", "lnCVR_se"))
robusub_VR <- model_count %>%
              transmute(parameter_group, 
                        estimatelnVR = map(model_lnVR, count_fun)) %>%
              mutate(r = map(estimatelnVR, 
                             ~ data.frame(t(.)))) %>%
              unnest(r) %>%
              select(-estimatelnVR) %>%
              purrr::set_names(c("parameter_group", "lnVR", "lnVR_lower", "lnVR_upper", "lnVR_se"))
robu_all <- full_join(robusub_CVR, robusub_VR) %>% full_join(., robusub_RR)

#### Combining data 
# Merge the two data sets (the new [robu_all] and the initial [uncorrelated sub-traits with count = 1]) 
meta_all <- meta1 %>%
            filter(par_group_size == 1) %>%
            as_tibble()
# Step 1:  Columns are matched by name (in our case, 'parameter_group'), and any missing columns will be filled with NA
  combinedmeta <- bind_rows(robu_all, meta_all)
# glimpse(combinedmeta)
# Steps 2&3: Add information about number of traits in a parameter group, procedure, and grouping term
              metacombo <- combinedmeta
       metacombo$counts <- meta1$par_group_size[match(metacombo$parameter_group, meta1$parameter_group)] 
   metacombo$procedure2 <- meta1$procedure[match(metacombo$parameter_group, meta1$parameter_group)]
metacombo$GroupingTerm2 <- meta1$GroupingTerm[match(metacombo$parameter_group, meta1$parameter_group)]
# Clean-up, reorder, and rename 
metacombo <- metacombo[c("parameter_group", "counts","procedure2","GroupingTerm2", "lnCVR","lnCVR_lower","lnCVR_upper","lnCVR_se","lnVR","lnVR_lower","lnVR_upper","lnVR_se","lnRR","lnRR_lower","lnRR_upper","lnRR_se")] 
   names(metacombo)[names(metacombo)=="procedure2"] <- "procedure" 
names(metacombo)[names(metacombo)=="GroupingTerm2"] <- "GroupingTerm" 

metacombo_final <- metacombo %>%
                   group_by(GroupingTerm) %>%
                   nest()

metacombo_final <- metacombo_final %>% 
                   mutate(para_per_GroupingTerm = map_dbl(data, nrow))
# For all grouping terms
metacombo_final_all <- metacombo %>%
                       nest(data = everything())
# Final random effects meta-analyses within grouping terms, with SE of the estimate
overall1 <- metacombo_final %>%
  mutate(
    model_lnCVR = map(data, ~ metafor::rma.uni(
                      yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
                      control = list(optimizer = "optim", 
                                     optmethod = "Nelder-Mead", 
                                     maxit = 1000), 
                      verbose = F)),
    model_lnVR = map(data, ~ metafor::rma.uni(
                    yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
                    control = list(optimizer = "optim", 
                                   optmethod = "Nelder-Mead", 
                                   maxit = 1000), 
                    verbose = F)),
    model_lnRR = map(data, ~ metafor::rma.uni(
                      yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
                      control = list(optimizer = "optim", 
                                     optmethod = "Nelder-Mead", 
                                     maxit = 1000), 
                      verbose = F)))
# Final random effects meta-analyses ACROSS grouping terms, with SE of the estimate
overall_all1 <- metacombo_final_all %>%
  mutate(
    model_lnCVR = map(data, ~ metafor::rma.uni(
                      yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
                      control = list(optimizer = "optim", 
                                     optmethod = "Nelder-Mead", 
                                     maxit = 1000), 
                      verbose = F)),
    model_lnVR = map(data, ~ metafor::rma.uni(
                    yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
                    control = list(optimizer = "optim", 
                                   optmethod = "Nelder-Mead", 
                                   maxit = 1000), 
                    verbose = F)),
    model_lnRR = map(data, ~ metafor::rma.uni(
                    yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
                    control = list(optimizer = "optim", 
                                   optmethod = "Nelder-Mead", 
                                   maxit = 1000), 
                    verbose = F)))
                    
  # Function will take the averall results and extract the relevant trait of interest
extract_trait <- function(data, trait){
  tmp <- data %>% 
        filter(., GroupingTerm == trait) %>% 
        mutate(      lnCVR = .[[4]][[1]]$b, 
               lnCVR_lower = .[[4]][[1]]$ci.lb, 
               lnCVR_upper = .[[4]][[1]]$ci.ub, 
                  lnCVR_se = .[[4]][[1]]$se, 
                  lnCVR_I2 = .[[4]][[1]]$I2,
                      lnVR = .[[5]][[1]]$b,  
                lnVR_lower = .[[5]][[1]]$ci.lb,  
                lnVR_upper = .[[5]][[1]]$ci.ub,  
                   lnVR_se = .[[5]][[1]]$se,  
                   lnVR_I2 = .[[5]][[1]]$I2,
                      lnRR = .[[6]][[1]]$b,  
                lnRR_lower = .[[6]][[1]]$ci.lb,  
                lnRR_upper = .[[6]][[1]]$ci.ub,  
                   lnRR_se = .[[6]][[1]]$se,  
                   lnRR_I2 = .[[6]][[1]]$I2)  %>%
        select(., GroupingTerm, lnCVR:lnRR_I2)
  
  return(tmp)
}
All <- overall_all1 %>% 
       mutate(      lnCVR = .[[2]][[1]]$b, 
              lnCVR_lower = .[[2]][[1]]$ci.lb, 
              lnCVR_upper = .[[2]][[1]]$ci.ub, 
                 lnCVR_se = .[[2]][[1]]$se, 
                 lnCVR_I2 = .[[2]][[1]]$I2,
                     lnVR = .[[3]][[1]]$b,  
               lnVR_lower = .[[3]][[1]]$ci.lb,  
               lnVR_upper = .[[3]][[1]]$ci.ub,  
                  lnVR_se = .[[3]][[1]]$se,  
                  lnVR_I2 = .[[3]][[1]]$I2,
                     lnRR = .[[4]][[1]]$b,  
               lnRR_lower = .[[4]][[1]]$ci.lb,  
               lnRR_upper = .[[4]][[1]]$ci.ub,  
                  lnRR_se = .[[4]][[1]]$se,  
                  lnRR_I2 = .[[4]][[1]]$I2,)  %>%
       select(., lnCVR:lnRR_I2)
       
All <- All %>% mutate(GroupingTerm = "All")

overall2 <- bind_rows(extract_trait(overall1, "Behaviour"), 
                      extract_trait(overall1, "Morphology"), 
                      extract_trait(overall1, "Metabolism"), 
                      extract_trait(overall1, "Physiology"), 
                      extract_trait(overall1, "Immunology"), 
                      extract_trait(overall1, "Hematology"), 
                      extract_trait(overall1, "Heart"), 
                      extract_trait(overall1, "Hearing"), 
                      extract_trait(overall1, "Eye"), 
                      All) 

meta_clean$GroupingTerm <- factor(meta_clean$GroupingTerm, 
                                  levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye"))
meta_clean$GroupingTerm <- factor(meta_clean$GroupingTerm, 
                                  rev(levels(meta_clean$GroupingTerm)))
#Preparing data for all traits
meta.plot2.all <- meta_clean %>%
                  select(lnCVR, lnVR, lnRR, GroupingTerm) %>%
                  arrange(GroupingTerm)
#lnVR has been removed here and in the steps below, as this is only included in the supplemental figure
      meta.plot2.all.b <- gather(meta.plot2.all, trait, value, c(lnCVR, lnRR)) 
meta.plot2.all.b$trait <- factor(meta.plot2.all.b$trait, levels = c("lnCVR", "lnRR")) 
      meta.plot2.all.c <- meta.plot2.all.b %>%
                          group_by_at(vars(trait, GroupingTerm)) %>%
                          summarise(
                            malebias = sum(value > 0), femalebias = sum(value <= 0), total = malebias + femalebias,
                            malepercent = malebias * 100 / total, femalepercent = femalebias * 100 / total
                          )
meta.plot2.all.c$label <- "All traits"
# Re-structure to create stacked bar plots
meta.plot2.all.d <- as.data.frame(meta.plot2.all.c)
meta.plot2.all.e <- gather(meta.plot2.all.d, 
                           key = sex, 
                           value = percent, 
                           malepercent:femalepercent, 
                           factor_key = TRUE)
# Create new sample size variable
meta.plot2.all.e$samplesize <- with(meta.plot2.all.e, 
                                    ifelse(sex == "malepercent", 
                                           malebias, 
                                           femalebias))
# Add summary row ('All') and re-arrange rows into correct order for plotting (warnings about coercing 'id' into character vector are ok)
meta.plot2.all.f <- meta.plot2.all.e %>% 
                    group_by(trait, sex) %>% 
	                  summarise(GroupingTerm = "All", 
	                            malebias = sum(malebias),
	                            femalebias = sum(femalebias),
	                            total = malebias + femalebias, 
	                            label = "All traits", 
	                            samplesize = sum(samplesize)) %>%
	                  mutate(percent = ifelse(sex == "femalepercent", femalebias*100/(malebias+femalebias), malebias*100/(malebias+femalebias))) %>%
	                  bind_rows(meta.plot2.all.e, .) %>%
	                  mutate(rownumber = row_number()) %>%
	                  .[c(37, 1:9, 39, 10:18, 38, 19:27, 40, 28:36), ] 
  #line references in previous code line corresponding to: 
  #'lnCVR(male(All)), lnCVR(male('single grouping terms'), lnRR(male(All)), lnRR(male('single grouping terms')),
  #lnCVR(female(All)), lnCVR(female('single grouping terms'), lnRR(female(All)), lnRR(female('single grouping terms'))'
meta.plot2.all.f$GroupingTerm <- factor(meta.plot2.all.f$GroupingTerm,
                                        levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All")) 
meta.plot2.all.f$GroupingTerm <- factor(meta.plot2.all.f$GroupingTerm, 
                                        rev(levels(meta.plot2.all.f$GroupingTerm)))
malebias_Fig2_alltraits <-
    ggplot(meta.plot2.all.f) +
    aes(x = GroupingTerm, y = percent, fill = sex) +
    ylab("Percentage") +
    geom_col() +
    geom_hline(yintercept = 50, linetype = "dashed", color = "gray40") +
    geom_text(
      data = subset(meta.plot2.all.f, samplesize != 0), aes(label = samplesize), position = position_stack(vjust = .5),
      color = "white", size = 3.5
    ) +
    facet_grid(
      cols = vars(trait), rows = vars(label), labeller = label_wrap_gen(width = 18),
      scales = "free", space = "free"
    ) +  
    scale_fill_brewer(palette = "Set2") +
    theme_bw(base_size = 18) +
    theme(
      strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
      strip.text.x = element_text(size = 12),
      strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
      text = element_text(size = 14),
      panel.spacing = unit(0.5, "lines"),
      panel.border = element_blank(),
      axis.line = element_line(),
      panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
      panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
      panel.grid.minor.y = element_blank(),
      panel.grid.minor.x = element_blank(),
      legend.position= "none",
      #axis.title.x = Percentage,
      axis.title.y = element_blank()
    ) +
    coord_flip()
    overall3 <- gather(overall2, parameter, value, c(lnCVR, lnRR), factor_key = TRUE) 
lnCVR.ci <- overall3 %>%
           filter(parameter == "lnCVR") %>%
           mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3 %>%
          filter(parameter == "lnVR") %>%
          mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3 %>%
          filter(parameter == "lnRR") %>%
          mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)
overall4 <- bind_rows(lnCVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high) 
# Re-order grouping terms
overall4$GroupingTerm <- factor(overall4$GroupingTerm, 
                                levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall4$GroupingTerm <- factor(overall4$GroupingTerm, 
                                rev(levels(overall4$GroupingTerm)))
overall4$label <- "All traits"
####  PLOT 4B
#adding male / female symbols for Figure 4B
# additional packages for intergating male / female symbols in plot 4
library(png)
library(grid)
  male <- readPNG(here("images", "MaleAqua.png"))
female <- readPNG(here("images", "FemaleSalmon.png"))
Metameta_Fig3_alltraits <- overall4 %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(shape = parameter),
    fill = "black",
    color = "black", size = 2.2,
    show.legend = FALSE
  ) +
  scale_x_continuous(
    limits = c(-0.24, 0.25),
    breaks = c(-0.2, -0.1, 0, 0.1, 0.2),
    name = "Effect size"
  ) +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed"
  ) +
  facet_grid(
    cols = vars(parameter), rows = vars(label),
    labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_text(hjust = 0.5, size = 14),
    axis.title.y = element_blank()) +
  # addition of male & female symols
  annotation_custom(rasterGrob(female), xmin = - 0.2, xmax = -0.1, ymin = 2.3, ymax = 4) +
  annotation_custom(rasterGrob(male), xmin = 0.1, xmax = 0.2, ymin = 2.3, ymax = 4)

```

## Figure 4

```{r  fig.cap = "Sex bias in trait groups.
Panel (A) shows the numbers of traits that are either male-biased (turquoise) or female-biased (orange) across functional groups. The x-axes in Panel A represents the overall percentages of traits with a given direction of sex bias: orange shading when meta-analytic mean < 0 (female-biased), turquoise shading when meta-analytic mean > 0 (male-biased). White numbers inside the turquoise bars represent numbers of traits that show male bias within a given group of traits, numbers inside the orange bars represent the number of female-biased traits. Panel (B) shows effect sizes and 95% CI from separate meta-analysis for each functional group (Figure 3H). Traits that are male-biased in Panel B are shifted towards the righthand side of the zero-midline (near the turquoise male symbol), whereas female-biased traits are shifted towards the left (near the orange female symbol)."}

Fig4 <- ggarrange(malebias_Fig2_alltraits, Metameta_Fig3_alltraits,  nrow = 2, align = "v", heights = c(10, 9), labels = c("A", "B"))
Fig4

```

## Figure 4—figure supplement 1
```{r code for heterogeneity}
### Preparing heterogeneity
# Create dataframe to store results
results.allhetero.grouping <- as.data.frame(cbind(c(1:n), matrix(rep(0, n * 30), ncol = 30)))
names(results.allhetero.grouping) <- c(
  "id", "sigma2_strain.CVR", "sigma2_center.CVR", "sigma2_error.CVR", "s.nlevels.strain.CVR",
  "s.nlevels.center.CVR", "s.nlevels.error.CVR", "sigma2_strain.VR", "sigma2_center.VR", "sigma2_error.VR", "s.nlevels.strain.VR",
  "s.nlevels.center.VR", "s.nlevels.error.VR", "sigma2_strain.RR", "sigma2_center.RR", "sigma2_error.RR", "s.nlevels.strain.RR",
  "s.nlevels.center.RR", "s.nlevels.error.RR", "lnCVR", "lnCVR_lower", "lnCVR_upper", "lnCVR_se", "lnVR", "lnVR_lower", "lnVR_upper",
  "lnVR_se", "lnRR", "lnRR_lower", "lnRR_upper", "lnRR_se"
)
 
#Loop
for (t in 1:n) {
  tryCatch(
    {
      results <- data %>% 
                  data_subset_parameterid_individual_by_age(t) %>% 
                  calculate_population_stats() %>% 
                  create_meta_analysis_effect_sizes() %>%
                  mutate(err = seq_len(n()))
                  
      # lnCVR, logaritm of the ratio of male and female coefficients of variance
      cvr. <- metafor::rma.mv(yi = effect_size_CVR, 
                              V = sample_variance_CVR, 
                              random = list(~ 1 | strain_name, ~ 1 | production_center,~ 1 | err), 
                              control = list(optimizer = "optim", 
                                             optmethod = "Nelder-Mead", 
                                             maxit = 1000), 
                              data = results)
      
      results.allhetero.grouping[t, 2] <- cvr.$sigma2[1]
      results.allhetero.grouping[t, 3] <- cvr.$sigma2[2]
      results.allhetero.grouping[t, 4] <- cvr.$sigma2[3]
      results.allhetero.grouping[t, 5] <- cvr.$s.nlevels[1]
      results.allhetero.grouping[t, 6] <- cvr.$s.nlevels[2]
      results.allhetero.grouping[t, 7] <- cvr.$s.nlevels[3]
      results.allhetero.grouping[t, 20] <- cvr.$b
      results.allhetero.grouping[t, 21] <- cvr.$ci.lb
      results.allhetero.grouping[t, 22] <- cvr.$ci.ub
      results.allhetero.grouping[t, 23] <- cvr.$se
      
       # lnVR, male to female variability ratio (logarithm of male and female standard deviations)
      vr. <- metafor::rma.mv(yi = effect_size_VR, 
                             V = sample_variance_VR, 
                             random = list(~ 1 | strain_name, ~ 1 | production_center, ~ 1 | err), 
                             control = list(optimizer = "optim", 
                                            optmethod = "Nelder-Mead", 
                                            maxit = 1000), 
                             data = results)
      
      results.allhetero.grouping[t, 8] <- vr.$sigma2[1]
      results.allhetero.grouping[t, 9] <- vr.$sigma2[2]
      results.allhetero.grouping[t, 10] <- vr.$sigma2[3]
      results.allhetero.grouping[t, 11] <- vr.$s.nlevels[1]
      results.allhetero.grouping[t, 12] <- vr.$s.nlevels[2]
      results.allhetero.grouping[t, 13] <- vr.$s.nlevels[3]
      results.allhetero.grouping[t, 24] <- vr.$b
      results.allhetero.grouping[t, 25] <- vr.$ci.lb
      results.allhetero.grouping[t, 26] <- vr.$ci.ub
      results.allhetero.grouping[t, 27] <- vr.$se
      # lnRR, response ratio (logarithm of male and female means)
      rr. <- metafor::rma.mv(yi = effect_size_ROM, 
                             V = sample_variance_ROM, 
                             random = list(~ 1 | strain_name, ~ 1 | production_center, ~ 1 | err), 
                             control = list(optimizer = "optim", 
                                            optmethod = "Nelder-Mead", 
                                            maxit = 1000), 
                             data = results)
      
      results.allhetero.grouping[t, 14] <- rr.$sigma2[1]
      results.allhetero.grouping[t, 15] <- rr.$sigma2[2]
      results.allhetero.grouping[t, 16] <- rr.$sigma2[3]
      results.allhetero.grouping[t, 17] <- rr.$s.nlevels[1]
      results.allhetero.grouping[t, 18] <- rr.$s.nlevels[2]
      results.allhetero.grouping[t, 19] <- rr.$s.nlevels[3]
      results.allhetero.grouping[t, 28] <- rr.$b
      results.allhetero.grouping[t, 29] <- rr.$ci.lb
      results.allhetero.grouping[t, 30] <- rr.$ci.ub
      results.allhetero.grouping[t, 31] <- rr.$se
    },
    error = function(e) {
      cat("ERROR :", conditionMessage(e), "\n")
    }
     )
}

              results.allhetero.grouping2 <- results.allhetero.grouping[results.allhetero.grouping$s.nlevels.strain.VR != 0, ]
# nrow(results.allhetero.grouping) #232  
# Merging dataset with correct procedure names
# procedures <- read.csv(here("export", "procedures.csv"))
results.allhetero.grouping2$parameter_group <- data$parameter_group[match(results.allhetero.grouping2$id, data$id)]
      results.allhetero.grouping2$procedure <- data$procedure_name[match(results.allhetero.grouping2$id, data$id)]
   results.allhetero.grouping2$GroupingTerm <- procedures$GroupingTerm[match(results.allhetero.grouping2$procedure, procedures$procedure)]
 results.allhetero.grouping2$parameter_name <- data$parameter_name[match(results.allhetero.grouping2$id, data$id)]
#We deal with correlated parameters following the procedures described above (extraction of effects sizes).
metahetero1 <- results.allhetero.grouping2
# length(unique(metahetero1$procedure)) #19
# length(unique(metahetero1$GroupingTerm)) #9 
# length(unique(metahetero1$parameter_group)) #152
# length(unique(metahetero1$parameter_name)) #223
# Count of number of parameter names (correlated sub-traits) in each parameter group (par_group_size)
metahetero1b <- metahetero1                    %>%
                group_by(parameter_group)      %>%
                mutate(par_group_size = n_distinct(parameter_name))
metahetero1$par_group_size <- metahetero1b$par_group_size[match(metahetero1$parameter_group, metahetero1b$parameter_group)]
# Create subsets with > 1 count (par_group_size > 1)
metahetero1_sub <- subset(metahetero1, par_group_size > 1) # 92 observations
# Nest data
n_count. <- metahetero1_sub %>%
  group_by(parameter_group) %>%
  nest()

# meta-analysis preparation
model_count. <- n_count. %>%
  mutate(
    model_lnRR = map(data, ~ robu(.x$lnRR ~ 1,
      data = .x, studynum = .x$id, modelweights = c("CORR"), rho = 0.8,
      small = TRUE, var.eff.size = (.x$lnRR_se)^2
    )),
    model_lnVR = map(data, ~ robu(.x$lnVR ~ 1,
      data = .x, studynum = .x$id, modelweights = c("CORR"), rho = 0.8,
      small = TRUE, var.eff.size = (.x$lnVR_se)^2
    )),
    model_lnCVR = map(data, ~ robu(.x$lnCVR ~ 1,
      data = .x, studynum = .x$id, modelweights = c("CORR"), rho = 0.8,
      small = TRUE, var.eff.size = (.x$lnCVR_se)^2
    ))
  )
# Robumeta object details:  str(model_count.$model_lnCVR[[1]])
# Perform meta-analyses on correlated sub-traits, using robumeta
# Extract and save parameter estimates
count_fun. <- function(mod_sub) {
  return(c(as.numeric(mod_sub$mod_info$term1), mod_sub$N))
}
robusub_RR. <- model_count. %>%
  transmute(estimatelnRR = map(model_lnRR, count_fun.)) %>%    
  mutate(r = map(estimatelnRR, ~ data.frame(t(.)))) %>%
  unnest(r) %>%
  select(-estimatelnRR) %>%
  purrr::set_names(c("parameter_group", "var.RR", "N.RR"))
robusub_CVR. <- model_count. %>%
  transmute(estimatelnCVR = map(model_lnCVR, count_fun.)) %>%
  mutate(r = map(estimatelnCVR, ~ data.frame(t(.)))) %>%
  unnest(r) %>%
  select(-estimatelnCVR) %>%
  purrr::set_names(c("parameter_group", "var.CVR", "N.CVR"))
robusub_VR. <- model_count. %>%
  transmute(estimatelnVR = map(model_lnVR, count_fun.)) %>%
  mutate(r = map(estimatelnVR, ~ data.frame(t(.)))) %>%
  unnest(r) %>%
   select(-estimatelnVR) %>%
  purrr::set_names(c("parameter_group", "var.VR", "N.VR"))
robu_all. <- full_join(robusub_CVR., robusub_VR.) %>% full_join(., robusub_RR.)
#Merge the two data sets (the new [robu_all.] and the initial [uncorrelated sub-traits with count = 1])
#In this step, we 	1) merge the N from robumeta and the  N from metafor (s.nlevels.error) together into the same columns (N.RR, N.VR, N.CVR)
# 2) calculate the total variance for metafor models as the sum of random effect variances and the residual error, then add in the same columns together with the residual variances from robumeta 
  
  
metahetero_all <- metahetero1 %>%
  filter(par_group_size == 1) %>%
  as_tibble()
metahetero_all$N.RR <- metahetero_all$s.nlevels.error.RR
metahetero_all$N.CVR <- metahetero_all$s.nlevels.error.CVR
metahetero_all$N.VR <- metahetero_all$s.nlevels.error.VR
metahetero_all$var.RR <- log(sqrt(metahetero_all$sigma2_strain.RR + metahetero_all$sigma2_center.RR + metahetero_all$sigma2_error.RR))
metahetero_all$var.VR <- log(sqrt(metahetero_all$sigma2_strain.VR + metahetero_all$sigma2_center.VR + metahetero_all$sigma2_error.VR))
metahetero_all$var.CVR <- log(sqrt(metahetero_all$sigma2_strain.CVR + metahetero_all$sigma2_center.CVR + metahetero_all$sigma2_error.CVR))
# str(metahetero_all)
# str(robu_all.)

metahetero_all <- metahetero_all %>% mutate(
  var.RR = if_else(var.RR == -Inf, -7, var.RR),   # numbers chosen as limits; based on the values in the dataset
  var.VR = if_else(var.VR == -Inf, -5, var.VR),
  var.CVR = if_else(var.CVR == -Inf, -6, var.CVR)
)
# Combine data
# Step1
combinedmetahetero <- bind_rows(robu_all., metahetero_all)
# glimpse(combinedmetahetero)
# Steps 2&3
              metacombohetero <- combinedmetahetero
       metacombohetero$counts <- metahetero1$par_group_size[match(metacombohetero$parameter_group, metahetero1$parameter_group)]
   metacombohetero$procedure2 <- metahetero1$procedure[match(metacombohetero$parameter_group, metahetero1$parameter_group)]
metacombohetero$GroupingTerm2 <- metahetero1$GroupingTerm[match(metacombohetero$parameter_group, metahetero1$parameter_group)]
# **Clean-up and rename
metacombohetero <- metacombohetero %>% select(parameter_group, var.CVR, N.CVR, var.VR, N.VR, var.RR, N.RR, counts, procedure = procedure2, GroupingTerm = GroupingTerm2)  
# Invidual table for heterogeneity across all traits
 kable(metacombohetero,, digits = 3) %>%
  kable_styling() %>%
  scroll_box(width = "100%", height = "200px")
#### Meta-analysis of heterogeneity
## Perform meta-meta-analysis (3 for each of the 9 grouping terms: var.CVR, var.VR, var.RR)
metacombohetero_final <- metacombohetero %>%
  group_by(GroupingTerm) %>%
  nest()
# Final random effects meta-analyses within grouping terms, with SE of the estimate
heterog1 <- metacombohetero_final %>%
  mutate(
    model_heteroCVR = map(data, ~ metafor::rma.uni(
      yi = .x$var.CVR, sei = sqrt(1 / 2 * (.x$N.CVR - 1)),
      control = list(optimizer = "optim", 
                     optmethod = "Nelder-Mead",
		                          maxit = 10000, 
                     stepadj = 0.5), verbose = F)),
    model_heteroVR = map(data, ~ metafor::rma.uni(
      yi = .x$var.VR, sei = sqrt(1 / 2 * (.x$N.VR - 1)),
      control = list(optimizer = "optim", 
                     optmethod = "Nelder-Mead", 
                     maxit = 10000, 
                     stepadj = 0.5), verbose = F)),
    model_heteroRR = map(data, ~ metafor::rma.uni(
      yi = .x$var.RR, sei = sqrt(1 / 2 * (.x$N.RR - 1)),
      control = list(optimizer = "optim", 
                     optmethod = "Nelder-Mead", 
                     maxit = 10000, 
                     stepadj = 0.5), verbose = F)) )
# Across all grouping terms   
metacombohetero_all_final <- metacombohetero %>%
  nest(data = everything()) 
# Final random effects meta-analyses ACROSS grouping terms, with SE of the estimate
heterog1_all <- metacombohetero_all_final %>%
  
  mutate( model_heteroCVR = map(data, ~ metafor::rma.uni(
      yi = .x$var.CVR, sei = sqrt(1 / 2 * (.x$N.CVR - 1)),
      control = list(optimizer = "optim", 
                     optmethod = "Nelder-Mead", 
                     maxit = 10000, 
                     stepadj = 0.5), verbose = F)),
    model_heteroVR = map(data, ~ metafor::rma.uni(yi = .x$var.VR, sei = sqrt(1 / 2 * (.x$N.VR - 1)),
      control = list(optimizer = "optim", 
                     optmethod = "Nelder-Mead", 
                     maxit = 10000, 
                     stepadj = 0.5), verbose = F)),
    model_heteroRR = map(data, ~ metafor::rma.uni(yi = .x$var.RR, sei = sqrt(1 / 2 * (.x$N.RR - 1)),
      control = list(optimizer = "optim", 
                     optmethod = "Nelder-Mead", 
                     maxit = 10000, 
                     stepadj = 0.5), verbose = F)))

# Re-structure data for each grouping term; extract heterogenenity/variance terms; delete un-used variables
extract_heterogeneity <- function(data, type){
  
  tmp <- data %>%
         filter(., GroupingTerm == type) %>%
         select(., -data) %>%
         mutate(heteroCVR       = .[[2]][[1]]$b, 
                heteroCVR_lower = .[[2]][[1]]$ci.lb, 
                heteroCVR_upper = .[[2]][[1]]$ci.ub, 
                heteroCVR_se    = .[[2]][[1]]$se,
                heteroVR        = .[[3]][[1]]$b, 
                heteroVR_lower  = .[[3]][[1]]$ci.lb, 
                heteroVR_upper  = .[[3]][[1]]$ci.ub, 
                heteroVR_se     = .[[3]][[1]]$se,
                heteroRR        = .[[4]][[1]]$b, 
                heteroRR_lower  = .[[4]][[1]]$ci.lb, 
                heteroRR_upper  = .[[4]][[1]]$ci.ub, 
                heteroRR_se     = .[[4]][[1]]$se) %>%
        select(., GroupingTerm, heteroCVR:heteroRR_se)
  
  return(tmp)
}
   Behaviour. <- extract_heterogeneity(heterog1, type = "Behaviour")
  Immunology. <- extract_heterogeneity(heterog1, type = "Immunology")
  Hematology. <- extract_heterogeneity(heterog1, type = "Hematology")
     Hearing. <- extract_heterogeneity(heterog1, type = "Hearing")
  Physiology. <- extract_heterogeneity(heterog1, type = "Physiology")
  Metabolism. <- extract_heterogeneity(heterog1, type = "Metabolism")
  Morphology. <- extract_heterogeneity(heterog1, type = "Morphology")
       Heart. <- extract_heterogeneity(heterog1, type = "Heart")
         Eye. <- extract_heterogeneity(heterog1, type = "Eye")
#Reorder to be able to keep cell referencing 
heterog1_all <- heterog1_all %>% 
                mutate(GroupingTerm = "All") %>% 
                select(GroupingTerm, everything())
		
		All. <- heterog1_all %>% 
        select(., -data) %>%
        mutate(heteroCVR       = .[[2]][[1]]$b, 
               heteroCVR_lower = .[[2]][[1]]$ci.lb, 
               heteroCVR_upper = .[[2]][[1]]$ci.ub, 
               heteroCVR_se    = .[[2]][[1]]$se, 
               heteroVR        = .[[3]][[1]]$b, 
               heteroVR_lower  = .[[3]][[1]]$ci.lb, 
               heteroVR_upper  = .[[3]][[1]]$ci.ub, 
               heteroVR_se     = .[[3]][[1]]$se, 
               heteroRR        = .[[4]][[1]]$b, 
               heteroRR_lower  = .[[4]][[1]]$ci.lb, 
               heteroRR_upper  = .[[4]][[1]]$ci.ub, 
               heteroRR_se     = .[[4]][[1]]$se) %>%
        select(., GroupingTerm, heteroCVR:heteroRR_se)
	
heterog2 <- bind_rows(Behaviour., Morphology., Metabolism., Physiology., Immunology., Hematology., Heart., Hearing., Eye., All.)

#### Preparation of the Heterogeneity PLOT
#Restructure data for plotting
heterog3 <- gather(heterog2, parameter, value, c(heteroCVR, heteroVR, heteroRR), factor_key = TRUE)
heteroCVR.ci <- heterog3 %>%
                filter(parameter == "heteroCVR") %>%
                mutate(ci.low = heteroCVR_lower, 
                       ci.high = heteroCVR_upper)
heteroVR.ci <-  heterog3 %>%
                filter(parameter == "heteroVR") %>%
                mutate(ci.low = heteroVR_lower, 
                       ci.high = heteroVR_upper)
heteroRR.ci <-  heterog3 %>%
                filter(parameter == "heteroRR") %>%
                mutate(ci.low = heteroRR_lower, 
                       ci.high = heteroRR_upper)
heterog4 <- bind_rows(heteroCVR.ci, heteroVR.ci, heteroRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high)
# **Re-order grouping terms
heterog4$GroupingTerm <- factor(heterog4$GroupingTerm, 
                                levels = c("Behaviour", "Morphology", "Metabolism", 
                                           "Physiology", "Immunology", "Hematology", "Heart", 
                                           "Hearing", "Eye", "All"))
heterog4$GroupingTerm <- factor(heterog4$GroupingTerm, 
                                rev(levels(heterog4$GroupingTerm)))
heterog4$label <- "All traits"
# write.csv(heterog4, "heterog4.csv")
#### Plot S1 C (Second-order meta analysis on heterogeneity)
heterog5 <- heterog4
heterog5$mean <- as.numeric(exp(heterog5$value))
heterog5$ci.l <- as.numeric(exp(heterog5$ci.low))
heterog5$ci.h <- as.numeric(exp(heterog5$ci.high))
heterog6 <- heterog5
HeteroS1 <-
  heterog6 %>%
  ggplot(aes(y = GroupingTerm, x = mean)) +
  geom_errorbarh(aes(
    xmin = ci.l,
    xmax = ci.h
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(shape = parameter),
    fill = "black",
    color = "black", size = 2.2,
    show.legend = FALSE
  ) +
  scale_x_continuous(
    limits = c(-0.1, 1.4),
    # breaks = c(0, 0.1, 0.2),
    name = "sigma^2"
  ) +
  # geom_vline(xintercept=0,
  # color='black',
  # linetype='dashed')+
  facet_grid(
    cols = vars(parameter), rows = vars(label),
        labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_text(hjust = 0.5, size = 14),
    axis.title.y = element_blank())
```

```{r Fig4S1, fig.height = 8, fig.width = 7, fig.cap = "Percentages and numbers of either male (turqoise bars) or female (orange bars) biased traits (Panel A) across functional groups, this time for lnCVR (left hand side), lnVR (middle) and lnRR (right hand side).
Panel B shows effect sizes from separate meta-analysis for each functional group, and Panel C contains results of heterogeneity analyses. All three panels represent results evaluated across all traits."}

#### Combined Figure 4S1: overall Count data, Meta anlysis results, Heterogeneity

Fig4S1 <- ggarrange(malebias_FigS1_alltraits, Metameta_FigS1_alltraits, HeteroS1, nrow = 3, align = "v", heights = c(1.1, 1, 1), labels = c("A", "B", "C"))
Fig4S1

```



## Figure 4—figure supplement 2

```{r Fig4S2_prep, echo = FALSE, eval = TRUE}
 meta.plot2.sig <- meta_clean %>%
                   mutate(lnCVRsig = ifelse(lnCVR_lower * lnCVR_upper > 0, 1, 0), 
                           lnVRsig = ifelse( lnVR_lower * lnVR_upper  > 0, 1, 0),
                           lnRRsig = ifelse( lnRR_lower * lnRR_upper  > 0, 1, 0))
meta.plot2.sig.b <- meta.plot2.sig[, c("lnCVR", "lnRR", "lnCVRsig", "lnVRsig", "lnRRsig", "GroupingTerm")] 
    meta.plot2.sig.c <- gather(meta.plot2.sig.b, trait, value, lnCVR:lnRR)
meta.plot2.sig.c$sig <- "placeholder"
meta.plot2.sig.c$trait <- factor(meta.plot2.sig.c$trait, levels = c("lnCVR", "lnRR")) 
  meta.plot2.sig.c$sig <- ifelse(meta.plot2.sig.c$trait == "lnCVR", 
                                 meta.plot2.sig.c$lnCVRsig,
                                 ifelse(meta.plot2.sig.c$trait == "lnVR", 
                                        meta.plot2.sig.c$lnVRsig, 
                                        meta.plot2.sig.c$lnRRsig))
# Choosing sex biased ln-ratios significantly larger than 0
meta.plot2.sig.malebias <- meta.plot2.sig.c %>%
                           group_by_at(vars(trait, GroupingTerm)) %>%
                           filter(sig == 1) %>%
                           summarise(male_sig = sum(value > 0), female_sig = sum(value < 0), total = male_sig + female_sig)
meta.plot2.sig.malebias <- ungroup(meta.plot2.sig.malebias) %>%
                          # add "Hearing" for lnCVR (not filtered as only zeros)
                          add_row(trait = "lnCVR", GroupingTerm = "Hearing", male_sig = 0, female_sig = 0, .before = 4) %>% 
                          mutate(malepercent = male_sig * 100 / total, femalepercent = female_sig * 100 / total)
meta.plot2.sig.malebias$label <- "CI not overlapping zero"
# Re-structure to create stacked bar plots
  meta.plot2.sig.bothsexes <- as.data.frame(meta.plot2.sig.malebias)
meta.plot2.sig.bothsexes.b <- gather(meta.plot2.sig.bothsexes, key = sex, value = percent, malepercent:femalepercent, factor_key = TRUE)
# create new sample size variable
meta.plot2.sig.bothsexes.b$samplesize <- with(meta.plot2.sig.bothsexes.b, ifelse(sex == "malepercent", male_sig, female_sig))
# Add summary row ('All') and re-arrange rows into correct order for plotting 
meta.plot2.sig.bothsexes.c <- meta.plot2.sig.bothsexes.b %>% 
                              group_by(trait, sex) %>% 
                              summarise(GroupingTerm = "All", 
                                        male_sig = sum(male_sig), 
                                        female_sig = sum(female_sig), 
                                        total = male_sig + female_sig, 
                                        label = "CI not overlapping zero", 
                                        samplesize = sum(samplesize)) %>%
                              mutate(percent = ifelse(sex == "femalepercent", 
                                                      female_sig*100/(male_sig+female_sig), 
                                                      male_sig*100/(male_sig+female_sig))) %>%
                              bind_rows(meta.plot2.sig.bothsexes.b, .) %>%
                              mutate(rownumber = row_number()) %>%
                              .[c(37, 1:9, 39, 10:18, 38, 19:27, 40, 28:36), ] 
  #line references in previous code line corresponding to: 
  #'lnCVR(male(All)), lnCVR(male('single grouping terms'), lnRR(male(All)), lnRR(male('single grouping terms')),
  #lnCVR(female(All)), lnCVR(female('single grouping terms'), lnRR(female(All)), lnRR(female('single grouping terms'))'
meta.plot2.sig.bothsexes.c$GroupingTerm <- factor(meta.plot2.sig.bothsexes.c$GroupingTerm,
                                                  levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", 
                                                             "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All")) 
meta.plot2.sig.bothsexes.c$GroupingTerm <- factor(meta.plot2.sig.bothsexes.c$GroupingTerm, 
                                                  rev(levels(meta.plot2.sig.bothsexes.c$GroupingTerm)))
# Plot Fig2 all significant results (CI not overlapping zero)
# Several grouping terms are added post-hoc (with no data to display): no significant lnCVR for 'Hearing' in either sex; no sig. male-biased lnCVR for 'Immunology' and 'Eye, and no significant female-biased lnRR for 'Eye'.
malebias_Fig2_sigtraits <-
  ggplot(meta.plot2.sig.bothsexes.c) +
  aes(x = GroupingTerm, y = percent, fill = sex) +
  ylab("Percentage") +
  geom_col() +
  geom_hline(yintercept = 50, linetype = "dashed", color = "gray40") +
  geom_text(
    data = subset(meta.plot2.sig.bothsexes.c, samplesize != 0), aes(label = samplesize), position = position_stack(vjust = .5),
    color = "white", size = 3.5) +
  facet_grid(cols = vars(trait), 
             rows = vars(label), 
             labeller = label_wrap_gen(width = 18),
             scales = "free", space = "free") +
  scale_fill_brewer(palette = "Set2") +
  theme_bw(base_size = 18) +
  theme(strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
        strip.text.x = element_text(size = 12),
        strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
        text = element_text(size = 14),
        panel.spacing = unit(0.5, "lines"),
        panel.border = element_blank(),
        axis.line = element_line(),
        panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
        panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.position = "none",
        axis.title.y = element_blank()) +
  coord_flip()
### Preparation for Plots on significant sex-bias (Second-order meta analysis results)
#### Figure 5 B - traits with CI not overlapping 0 
# Prepare data 
# create column with 1 = different from zero, 0 = zero included in CI
#### Male-biased (significant) traits:
meta.male.plot3.sig <- metacombo %>%
                       mutate(sigCVR = ifelse(lnCVR_lower > 0, 1, 0),
                               sigVR = ifelse( lnVR_lower > 0, 1, 0),
                               sigRR = ifelse( lnRR_lower > 0, 1, 0))
# Significant subset for lnCVR
metacombo_male.plot3.CVR <- meta.male.plot3.sig    %>%
                            filter(sigCVR == 1)    %>%
                            group_by(GroupingTerm) %>%
                            nest()
metacombo_male.plot3.CVR.all <- meta.male.plot3.sig %>%
                                filter(sigCVR == 1) %>%
                                nest(data = everything())   
# Significant subset for lnVR
metacombo_male.plot3.VR <- meta.male.plot3.sig    %>%
                           filter(sigVR == 1)     %>%
                           group_by(GroupingTerm) %>%
                           nest()
metacombo_male.plot3.VR.all <- meta.male.plot3.sig    %>%
                               filter(sigVR == 1)     %>%
                               nest(data = everything())    
# Significant subset for lnRR
metacombo_male.plot3.RR <- meta.male.plot3.sig    %>%
                           filter(sigRR == 1)     %>%
                           group_by(GroupingTerm) %>%
                           nest()
metacombo_male.plot3.RR.all <- meta.male.plot3.sig %>%
                               filter(sigRR == 1)  %>%
                               nest(data = everything())   
# **Final random effects meta-analyses within grouping terms, with SE of the estimate
plot3.male.meta.CVR <- metacombo_male.plot3.CVR %>%
                       mutate(model_lnCVR = map(data, 
                                                ~ metafor::rma.uni(yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96), 
                                              control = list(optimizer = "optim", 
                                                             optmethod = "Nelder-Mead", 
                                                             maxit = 1000), 
                                              verbose = F)))
plot3.male.meta.VR <- metacombo_male.plot3.VR %>%
                      mutate(model_lnVR = map(data, 
                                              ~ metafor::rma.uni(yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
                                            control = list(optimizer = "optim", 
                                                           optmethod = "Nelder-Mead", 
                                                           maxit = 1000), 
                                            verbose = F)))
plot3.male.meta.RR <- metacombo_male.plot3.RR %>%
                      mutate(model_lnRR = map(data, 
                                              ~ metafor::rma.uni(yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
                                              control = list(optimizer = "optim", 
                                                             optmethod = "Nelder-Mead", 
                                                             maxit = 1000), 
                                              verbose = F)))
# Across all grouping terms #
plot3.male.meta.CVR.all <- metacombo_male.plot3.CVR.all %>%
                            mutate(model_lnCVR = map(data,
                                                     ~ metafor::rma.uni(yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
                                                  control = list(optimizer = "optim", 
                                                                 optmethod = "Nelder-Mead", 
                                                                 maxit = 1000), 
                                                  verbose = F)))
plot3.male.meta.CVR.all <- plot3.male.meta.CVR.all %>% 
                            mutate(GroupingTerm = "All")
plot3.male.meta.VR.all <- metacombo_male.plot3.VR.all %>%
                          mutate(model_lnVR = map(data, 
                                                  ~ metafor::rma.uni(yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
                                                  control = list(optimizer = "optim", 
                                                                 optmethod = "Nelder-Mead", 
                                                                 maxit = 1000), 
                                                  verbose = F)))
plot3.male.meta.VR.all <- plot3.male.meta.VR.all %>% 
                          mutate(GroupingTerm = "All")
plot3.male.meta.RR.all <- metacombo_male.plot3.RR.all %>%
                          mutate(model_lnRR = map(data, 
                                                  ~ metafor::rma.uni(yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
                                                control = list(optimizer = "optim", 
                                                               optmethod = "Nelder-Mead", 
                                                               maxit = 1000), 
                                                verbose = F)))
plot3.male.meta.RR.all <- plot3.male.meta.RR.all %>% 
                          mutate(GroupingTerm = "All")
# Combine with separate grouping term results
plot3.male.meta.CVR <- bind_rows(plot3.male.meta.CVR, plot3.male.meta.CVR.all)
 plot3.male.meta.VR <- bind_rows(plot3.male.meta.VR,   plot3.male.meta.VR.all)
 plot3.male.meta.RR <- bind_rows(plot3.male.meta.RR,   plot3.male.meta.RR.all)
# **Re-structure data for each grouping term; delete un-used variables
plot3.male.meta.CVR.b <- as.data.frame(plot3.male.meta.CVR %>% 
                                       group_by(GroupingTerm) %>%
                                       mutate( lnCVR = map_dbl(model_lnCVR, pluck(2)), 
                                               lnCVR_lower = map_dbl(model_lnCVR, pluck(6)), 
                                               lnCVR_upper = map_dbl(model_lnCVR, pluck(7)), 
                                               lnCVR_se = map_dbl(model_lnCVR, pluck(3)), 
                                               lnCVR_I2 = map_dbl(model_lnCVR, pluck("I2"))))[, c(1, 4:8)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA, NA))) %>% 
                                  setNames(names(plot3.male.meta.CVR.b))
plot3.male.meta.CVR.b <- rbind(plot3.male.meta.CVR.b, add.row.hearing) #use bind_rows in R3.6
plot3.male.meta.CVR.b <- plot3.male.meta.CVR.b[order(plot3.male.meta.CVR.b$GroupingTerm), ]
plot3.male.meta.VR.b <- as.data.frame(plot3.male.meta.VR %>% 
                                      group_by(GroupingTerm) %>%
                                      mutate(lnVR = map_dbl(model_lnVR, pluck(2)), 
                                             lnVR_lower = map_dbl(model_lnVR, pluck(6)),
                                             lnVR_upper = map_dbl(model_lnVR, pluck(7)), 
                                             lnVR_se = map_dbl(model_lnVR, pluck(3)),
                                             lnVR_I2 = map_dbl(model_lnVR, pluck("I2"))))[, c(1, 4:8)]
plot3.male.meta.VR.b <- plot3.male.meta.VR.b[order(plot3.male.meta.VR.b$GroupingTerm), ]
plot3.male.meta.RR.b <- as.data.frame(plot3.male.meta.RR %>% 
                                      group_by(GroupingTerm) %>% 
                                      mutate(lnRR = map_dbl(model_lnRR, pluck(2)), 
                                             lnRR_lower = map_dbl(model_lnRR, pluck(6)),
                                             lnRR_upper = map_dbl(model_lnRR, pluck(7)), 
                                             lnRR_se = map_dbl(model_lnRR, pluck(3)),
                                             lnRR_I2 = map_dbl(model_lnRR, pluck("I2"))))[, c(1, 4:8)]
plot3.male.meta.RR.b <- plot3.male.meta.RR.b[order(plot3.male.meta.RR.b$GroupingTerm), ]
overall.male.plot3 <- full_join(plot3.male.meta.CVR.b, plot3.male.meta.VR.b)
overall.male.plot3 <- full_join(overall.male.plot3,    plot3.male.meta.RR.b)
overall.male.plot3$GroupingTerm <- factor(overall.male.plot3$GroupingTerm, 
                                          levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", 
                                                     "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall.male.plot3$GroupingTerm <- factor(overall.male.plot3$GroupingTerm, 
                                          rev(levels(overall.male.plot3$GroupingTerm)))
overall.male.plot3$GroupingTerm <- factor(overall.male.plot3$GroupingTerm, 
                                          levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", 
                                                     "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall.male.plot3$GroupingTerm <- factor(overall.male.plot3$GroupingTerm, 
                                          rev(levels(overall.male.plot3$GroupingTerm)))
# Restructure MALE data for plotting 
overall3.male.sig <- gather(overall.male.plot3, parameter, value, c(lnCVR, lnRR), factor_key = TRUE) 
lnCVR.ci <- overall3.male.sig            %>%
            filter(parameter == "lnCVR") %>%
            mutate(ci.low = lnCVR_lower, 
                   ci.high = lnCVR_upper)
lnRR.ci <- overall3.male.sig           %>%
           filter(parameter == "lnRR") %>%
           mutate(ci.low = lnRR_lower, 
                 ci.high = lnRR_upper)
overall4.male.sig <- rbind(lnCVR.ci, lnRR.ci) %>%  #use bind_rows in R3.6
                     select(GroupingTerm, 
                            parameter, 
                            value, 
                            ci.low, 
                            ci.high) 
overall4.male.sig$label <- "CI not overlapping zero"
# Plot Fig 5B all significant results (CI not overlapping zero) for males. This is the right panel in Figure 5B.
Metameta_Fig3_male.sig <- overall4.male.sig %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(xmin = ci.low,
                     xmax = ci.high),
                 height = 0.1, 
                 show.legend = FALSE) +
  geom_point(aes(shape = parameter),
             fill = "mediumaquamarine", 
             color = "mediumaquamarine", 
             size = 2.2, 
             show.legend = FALSE) +
  scale_x_continuous(limits = c(0, 0.4),
                     breaks = c(0, 0.3),
                       name = "Effect size") +
  geom_vline(xintercept = 0,
                  color = "black",
              linetype = "dashed") +
  facet_grid(cols = vars(parameter), 
             rows = vars(label),
             labeller = label_wrap_gen(width = 23),
             scales = "free",
             space = "free") +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 270, 
                                    size = 10, 
                                    margin = margin(t = 15, r = 15, b = 15, l = 15)),
        strip.text.x = element_text(size = 12),
        strip.background = element_rect(colour = NULL, 
                                        linetype = "blank", 
                                        fill = "gray90"),
        text = element_text(size = 14),
        panel.spacing = unit(0.5, "lines"),
        panel.border = element_blank(),
        axis.line = element_line(),
        panel.grid.major.x = element_line(linetype = "solid", 
                                          colour = "gray95"),
        panel.grid.major.y = element_line(linetype = "solid", 
                                          color = "gray95"),
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank())
# Metameta_Fig3_male.sig
#### Female part, significant traits: Female Fig5B sig
# as above: Prepare data for traits with CI not overlapping 0
# create column with 1 = different from zero, 0 = zero included in CI
# Female-biased traits
meta.female.plot3.sig <- metacombo %>%
  mutate(sigCVR = ifelse(lnCVR_upper < 0, 1, 0),
          sigVR = ifelse(lnVR_upper < 0, 1, 0),
          sigRR = ifelse(lnRR_upper < 0, 1, 0))
# Significant subset for lnCVR
metacombo_female.plot3.CVR <- meta.female.plot3.sig  %>%
                              filter(sigCVR == 1)    %>%
                              group_by(GroupingTerm) %>%
                              nest()
metacombo_female.plot3.CVR.all <- meta.female.plot3.sig %>%
                                  filter(sigCVR == 1)   %>%
                                  nest(data = everything())   
# Significant subset for lnVR
metacombo_female.plot3.VR <- meta.female.plot3.sig  %>%
                             filter(sigVR == 1)     %>%
                             group_by(GroupingTerm) %>%
                             nest()
metacombo_female.plot3.VR.all <- meta.female.plot3.sig  %>%
                                 filter(sigVR == 1)     %>%
                                 nest(data = everything())   
# Significant subset for lnRR
metacombo_female.plot3.RR <- meta.female.plot3.sig  %>%
                             filter(sigRR == 1)     %>%
                             group_by(GroupingTerm) %>%
                             nest()
metacombo_female.plot3.RR.all <- meta.female.plot3.sig %>%
                                 filter(sigRR == 1)    %>%
                                 nest(data = everything())   
# **Final random effects meta-analyses within grouping terms, with SE of the estimate
plot3.female.meta.CVR <- metacombo_female.plot3.CVR %>%
                         mutate(model_lnCVR = map(data, 
                                                  ~ metafor::rma.uni(yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
                                                  control = list(optimizer = "optim", 
                                                                 optmethod = "Nelder-Mead", 
                                                                 maxit = 1000), 
                                                  verbose = F)))
plot3.female.meta.VR <- metacombo_female.plot3.VR %>%
                        mutate(model_lnVR = map(data, ~ metafor::rma.uni(yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
                                                control = list(optimizer = "optim", 
                                                               optmethod = "Nelder-Mead", 
                                                               maxit = 1000), 
                                                verbose = F)))
plot3.female.meta.RR <- metacombo_female.plot3.RR %>%
                        mutate(model_lnRR = map(data, ~ metafor::rma.uni(yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
                                                control = list(optimizer = "optim", 
                                                               optmethod = "Nelder-Mead", 
                                                               maxit = 1000), 
                                                verbose = F)))
# Across all grouping terms #
plot3.female.meta.CVR.all <- metacombo_female.plot3.CVR.all %>%
                             mutate(model_lnCVR = map(data, ~ metafor::rma.uni(yi = .x$lnCVR, sei=(.x$lnCVR_upper-.x$lnCVR_lower)/(2 * 1.96),
                                                       control = list(optimizer = "optim", 
                                                                      optmethod = "Nelder-Mead", 
                                                                      maxit = 1000),
                                                       verbose = F)))
plot3.female.meta.CVR.all <- plot3.female.meta.CVR.all %>% mutate(GroupingTerm = "All")
plot3.female.meta.VR.all <- metacombo_female.plot3.VR.all %>%
  mutate(model_lnVR = map(data, ~ metafor::rma.uni(yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
                          control = list(optimizer = "optim", 
                                         optmethod = "Nelder-Mead", 
                                         maxit = 1000), 
                          verbose = F)))
plot3.female.meta.VR.all <- plot3.female.meta.VR.all %>% mutate(GroupingTerm = "All")
plot3.female.meta.RR.all <- metacombo_female.plot3.RR.all %>%
                            mutate(model_lnRR = map(data, 
                                                    ~ metafor::rma.uni(yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
                                                    control = list(optimizer = "optim", 
                                                                   optmethod = "Nelder-Mead", 
                                                                   maxit = 1000), 
                                                    verbose = F)))
plot3.female.meta.RR.all <- plot3.female.meta.RR.all %>% 
                            mutate(GroupingTerm = "All")
# Combine with separate grouping term results
plot3.female.meta.CVR <- bind_rows(plot3.female.meta.CVR, plot3.female.meta.CVR.all) #use bind_rows in R3.6
 plot3.female.meta.VR <- bind_rows(plot3.female.meta.VR,   plot3.female.meta.VR.all)
 plot3.female.meta.RR <- bind_rows(plot3.female.meta.RR,   plot3.female.meta.RR.all)
# Re-structure data for each grouping term; delete un-used variables
plot3.female.meta.CVR.b <- as.data.frame(plot3.female.meta.CVR %>% 
                                           group_by(GroupingTerm) %>%
                                          mutate(    lnCVR = map_dbl(model_lnCVR, pluck(2)), 
                                               lnCVR_lower = map_dbl(model_lnCVR, pluck(6)),
                                               lnCVR_upper = map_dbl(model_lnCVR, pluck(7)), 
                                                  lnCVR_se = map_dbl(model_lnCVR, pluck(3)),
                                                  lnCVR_I2 = map_dbl(model_lnCVR, pluck("I2"))))[, c(1, 4:8)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA,NA))) %>% 
                                 setNames(names(plot3.female.meta.CVR.b))
plot3.female.meta.CVR.b <- rbind(plot3.female.meta.CVR.b, add.row.hearing) #use bind_rows in R3.6
plot3.female.meta.CVR.b <- plot3.female.meta.CVR.b[order(plot3.female.meta.CVR.b$GroupingTerm), ]
plot3.female.meta.VR.b <- as.data.frame(plot3.female.meta.VR %>% 
                                        group_by(GroupingTerm) %>%
                                        mutate(      lnVR = map_dbl(model_lnVR, pluck(2)), 
                                               lnVR_lower = map_dbl(model_lnVR, pluck(6)),
                                               lnVR_upper = map_dbl(model_lnVR, pluck(7)), 
                                                  lnVR_se = map_dbl(model_lnVR, pluck(3)),
                                                  lnVR_I2 = map_dbl(model_lnVR, pluck("I2"))))[, c(1, 4:8)]
plot3.female.meta.VR.b <- plot3.female.meta.VR.b[order(plot3.female.meta.VR.b$GroupingTerm), ]
plot3.female.meta.RR.b <- as.data.frame(plot3.female.meta.RR %>% 
                                        group_by(GroupingTerm) %>%
                                        mutate(      lnRR = map_dbl(model_lnRR, pluck(2)), 
                                               lnRR_lower = map_dbl(model_lnRR, pluck(6)), 
                                               lnRR_upper = map_dbl(model_lnRR, pluck(7)), 
                                                  lnRR_se = map_dbl(model_lnRR, pluck(3)),
                                                  lnRR_I2 = map_dbl(model_lnRR, pluck("I2"))))[, c(1, 4:8)]
plot3.female.meta.RR.b <- plot3.female.meta.RR.b[order(plot3.female.meta.RR.b$GroupingTerm), ]
overall.female.plot3 <- full_join(plot3.female.meta.CVR.b, plot3.female.meta.VR.b)
overall.female.plot3 <- full_join(overall.female.plot3, plot3.female.meta.RR.b)
overall.female.plot3$GroupingTerm <- factor(overall.female.plot3$GroupingTerm, 
                                            levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", 
                                                       "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall.female.plot3$GroupingTerm <- factor(overall.female.plot3$GroupingTerm, 
                                            rev(levels(overall.female.plot3$GroupingTerm)))
# Re-structuring data for plotting
overall3.female.sig <- gather(overall.female.plot3, parameter, value, c(lnCVR, lnRR), factor_key = TRUE) 
lnCVR.ci <- overall3.female.sig %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, 
         ci.high = lnCVR_upper)
# lnVR.ci <- overall3.female.sig  %>% filter(parameter == "lnVR") %>% mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3.female.sig %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)
overall4.female.sig <- rbind(lnCVR.ci, lnRR.ci) %>%  #use bind_rows in R3.6
                       select(GroupingTerm, parameter, value, ci.low, ci.high) # lnVR.ci,
overall4.female.sig$label <- "CI not overlapping zero"
# PLOT preparation for Fig5B: all significant results (CI not overlapping zero, female )
Metameta_Fig3_female.sig  <- overall4.female.sig %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high),
  height = 0.1, show.legend = FALSE) +
  geom_point(aes(shape = parameter),
    fill = "salmon1", color = "salmon1", size = 2.2,
    show.legend = FALSE) +
  scale_x_continuous(
    limits = c(-0.4, 0),
    breaks = c(-0.3, 0),
    name = "Effect size") +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed") +
  facet_grid(
    cols = vars(parameter), # rows = vars(label),
    # labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free") +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  )

### Preparing the data
#The supplemental figures include lnVR (in addition to lnCVR and lnRR, as presented in Figures 4 and 5)

# *Prepare data for all traits

meta.plot2.all <- meta_clean %>%
  select(lnCVR, lnVR, lnRR, GroupingTerm) %>%
  arrange(GroupingTerm)

meta.plot2.all.bS1 <- gather(meta.plot2.all, trait, value, c(lnCVR, lnVR, lnRR))

meta.plot2.all.bS1$trait <- factor(meta.plot2.all.bS1$trait, levels = c("lnCVR", "lnVR", "lnRR"))

meta.plot2.all.cS1 <- meta.plot2.all.bS1 %>%
  group_by_at(vars(trait, GroupingTerm)) %>%
  summarise(
    malebias = sum(value > 0), femalebias = sum(value <= 0), total = malebias + femalebias,
    malepercent = malebias * 100 / total, femalepercent = femalebias * 100 / total
  )

meta.plot2.all.cS1$label <- "All traits"

# Re-structure to create stacked bar plots

meta.plot2.all.dS1 <- as.data.frame(meta.plot2.all.cS1)
meta.plot2.all.eS1 <- gather(meta.plot2.all.dS1, key = sex, value = percent, malepercent:femalepercent, factor_key = TRUE)

# Create new sample size variable

meta.plot2.all.eS1$samplesize <- with(meta.plot2.all.eS1, ifelse(sex == "malepercent", malebias, femalebias))

# Add summary row ('All') and re-arrange rows into correct order for plotting (warnings about coercing 'id' into character vector are ok)

meta.plot2.all.fS1 <- meta.plot2.all.eS1 %>% group_by(trait, sex) %>% 
  summarise(GroupingTerm = "All", malebias = sum(malebias), femalebias = sum(femalebias), total = malebias + femalebias, 
            label = "All traits", samplesize = sum(samplesize)) %>%
  mutate(percent = ifelse(sex == "femalepercent", femalebias*100/(malebias+femalebias), malebias*100/(malebias+femalebias))) %>%
  bind_rows(meta.plot2.all.eS1, .) %>%
  mutate(rownumber = row_number()) %>%
  .[c(55, 1:9, 57, 10:18, 59, 19:27, 56, 28:36, 58, 37:45, 60, 46:54), ] 

meta.plot2.all.fS1$GroupingTerm <- factor(meta.plot2.all.fS1$GroupingTerm, levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All")) 
meta.plot2.all.fS1$GroupingTerm <- factor(meta.plot2.all.fS1$GroupingTerm, rev(levels(meta.plot2.all.fS1$GroupingTerm)))

malebias_FigS1_alltraits <-
  ggplot(meta.plot2.all.fS1) +
  aes(x = GroupingTerm, y = percent, fill = sex) +
  ylab("Percentage") +
  geom_col() +
  geom_hline(yintercept = 50, linetype = "dashed", color = "gray40") +
  geom_text(
    data = subset(meta.plot2.all.fS1, samplesize != 0), aes(label = samplesize), position = position_stack(vjust = .5),
    color = "white", size = 3.5
  ) +
  facet_grid(
    cols = vars(trait), rows = vars(label), labeller = label_wrap_gen(width = 18),
    scales = "free", space = "free"
  ) +
  scale_fill_brewer(palette = "Set2") +
  theme_bw(base_size = 18) +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = "none",
    #axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) +
  coord_flip()

# malebias_FigS1_alltraits     #(panel A in Figure S1)


###  Overall results of second order meta analysis, INCLUDING VR
#### Re-structure MALE data for plotting 

overall3.male.sigS <- gather(overall.male.plot3, parameter, value, c(lnCVR, lnVR, lnRR), factor_key = TRUE)

lnCVR.ci <- overall3.male.sigS %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3.male.sigS %>%
  filter(parameter == "lnVR") %>%
  mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3.male.sigS %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)

overall4.male.sigS <- rbind(lnCVR.ci, lnVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high) #use bind_rows in R3.6

overall4.male.sigS$label <- "CI not overlapping zero"

# Data are re-structured, and grouping terms are being re-ordered

overall3S <- gather(overall2, parameter, value, c(lnCVR, lnVR, lnRR), factor_key = TRUE)

lnCVR.ci <- overall3S %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3S %>%
  filter(parameter == "lnVR") %>%
  mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3S %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)

overall4S <- bind_rows(lnCVR.ci, lnVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high)

# Re-order grouping terms

overall4S$GroupingTerm <- factor(overall4S$GroupingTerm, levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall4S$GroupingTerm <- factor(overall4S$GroupingTerm, rev(levels(overall4S$GroupingTerm)))
overall4S$label <- "All traits"

#### Preparation for plot, including lnVR
# Sub-Plot  for Figure S1: all traits (S1 B)

Metameta_FigS1_alltraits <- overall4S %>%

  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(shape = parameter),
    fill = "black",
    color = "black", size = 2.2,
    show.legend = FALSE
  ) +
  scale_x_continuous(
    limits = c(-0.24, 0.25),
    breaks = c(-0.2, -0.1, 0, 0.1, 0.2),
    name = "Effect size"
  ) +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed"
  ) +
  facet_grid(
    cols = vars(parameter), rows = vars(label),
    labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_text(hjust = 0.5, size = 14),
    axis.title.y = element_blank()
  )


#This Figure extends Figure 4, as it includes results not only for lnCVR and lnRR but also lnCVR. In addition, we compare two different assessments of sex-bias, significance (CI not overlapping zero) and sex differences in male / female ratios > 10%
# Plot FigS2 all significant results (CI not overlapping zero) for males, count data

meta.plot2.sig.bS <- meta.plot2.sig[, c("lnCVR", "lnVR", "lnRR", "lnCVRsig", "lnVRsig", "lnRRsig", "GroupingTerm")]

meta.plot2.sig.cS <- gather(meta.plot2.sig.bS, trait, value, lnCVR:lnRR)
meta.plot2.sig.cS$sig <- "placeholder"

meta.plot2.sig.cS$trait <- factor(meta.plot2.sig.cS$trait, levels = c("lnCVR", "lnVR", "lnRR"))

meta.plot2.sig.cS$sig <- ifelse(meta.plot2.sig.cS$trait == "lnCVR", meta.plot2.sig.cS$lnCVRsig,
  ifelse(meta.plot2.sig.cS$trait == "lnVR", meta.plot2.sig.cS$lnVRsig, meta.plot2.sig.cS$lnRRsig))

# choosing sex biased ln-ratios significantly larger than 0
meta.plotS2.sig.malebias <- meta.plot2.sig.cS %>%
  group_by_at(vars(trait, GroupingTerm)) %>%
  filter(sig == 1) %>%
  summarise(male_sig = sum(value > 0), female_sig = sum(value < 0), total = male_sig + female_sig)

meta.plotS2.sig.malebias <- ungroup(meta.plotS2.sig.malebias) %>%
  add_row(trait = "lnCVR", GroupingTerm = "Hearing", male_sig = 0, female_sig = 0, .before = 4) %>% # add "Hearing" for lnCVR (not filtered as only zeros)
  mutate(malepercent = male_sig * 100 / total, femalepercent = female_sig * 100 / total)

meta.plotS2.sig.malebias$label <- "CI not overlapping zero"

# restructure to create stacked bar plots

meta.plotS2.sig.bothsexes <- as.data.frame(meta.plotS2.sig.malebias)
meta.plotS2.sig.bothsexes.b <- gather(meta.plotS2.sig.bothsexes, key = sex, value = percent, malepercent:femalepercent, factor_key = TRUE)

# create new sample size variable

meta.plotS2.sig.bothsexes.b$samplesize <- with(meta.plotS2.sig.bothsexes.b, ifelse(sex == "malepercent", male_sig, female_sig))

# Add summary row ('All') and re-arrange rows into correct order for plotting (warnings about coercing 'id' into character vector are ok)

meta.plotS2.sig.bothsexes.c <- meta.plotS2.sig.bothsexes.b %>% group_by(trait, sex) %>% 
  summarise(GroupingTerm = "All", male_sig = sum(male_sig), female_sig = sum(female_sig), total = male_sig + female_sig, 
            label = "CI not overlapping zero", samplesize = sum(samplesize)) %>%
  mutate(percent = ifelse(sex == "femalepercent", female_sig*100/(male_sig + female_sig), male_sig*100/(male_sig + female_sig))) %>%
  bind_rows(meta.plotS2.sig.bothsexes.b, .) %>%
  mutate(rownumber = row_number()) %>%
  .[c(55, 1:9, 57, 10:18, 59, 19:27, 56, 28:36, 58, 37:45, 60, 46:54), ] 

meta.plotS2.sig.bothsexes.c$GroupingTerm <- factor(meta.plotS2.sig.bothsexes.c$GroupingTerm, levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All")) 
meta.plotS2.sig.bothsexes.c$GroupingTerm <- factor(meta.plotS2.sig.bothsexes.c$GroupingTerm, rev(levels(meta.plotS2.sig.bothsexes.c$GroupingTerm)))

# *Plot Fig2 all significant results (CI not overlapping zero):
#     no sig. lnCVR for 'Hearing' in either sex; no sig. male-biased lnCVR for 'Immunology' and 'Eye, and no sig. male-biased lnVR for 'Eye'

malebias_FigS2_sigtraits <-
  ggplot(meta.plotS2.sig.bothsexes.c) +
  aes(x = GroupingTerm, y = percent, fill = sex) +
  ylab("Percentage") +
  geom_col() +
  geom_hline(yintercept = 50, linetype = "dashed", color = "gray40") +
  geom_text(
    data = subset(meta.plotS2.sig.bothsexes.c, samplesize != 0), aes(label = samplesize), position = position_stack(vjust = .5),
    color = "white", size = 3.5
  ) +
  facet_grid(
    cols = vars(trait), rows = vars(label), labeller = label_wrap_gen(width = 18),
    scales = "free", space = "free"
  ) +
  scale_fill_brewer(palette = "Set2") +
  theme_bw(base_size = 18) +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = "none",
    #axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) +
  coord_flip()

# malebias_FigS2_sigtraits # this is Figure S2 A

#Prepare data for traits with effect size ratios > 10% larger in males, supplemental Figure S2

#### Over 10% male bias, count data (first- order metanalysis) 

meta.plot2.over10 <- meta_clean %>%
                    select(lnCVR, lnVR, lnRR, GroupingTerm) %>%
                    arrange(GroupingTerm) 

meta.plot2.over10.b <- gather(meta.plot2.over10, trait, value, c(lnCVR, lnVR, lnRR)) 

meta.plot2.over10.b$trait <- factor(meta.plot2.over10.b$trait, levels = c("lnCVR", "lnVR", "lnRR")) 

meta.plot2.over10.c <- meta.plot2.over10.b %>%
                      group_by_at(vars(trait, GroupingTerm)) %>%
                      summarise(malebias = sum(value > log(11 / 10)), 
                                femalebias = sum(value < log(9 / 10)), 
                                total = malebias + femalebias,
                                malepercent = malebias * 100 / total, 
                                femalepercent = femalebias * 100 / total)

meta.plot2.over10.c$label <- "Sex difference in m/f ratios > 10%"

# restructure to create stacked bar plots

meta.plot2.over10.c <- as.data.frame(meta.plot2.over10.c)
meta.plot2.over10.d <- gather(meta.plot2.over10.c, key = sex, value = percent, malepercent:femalepercent, factor_key = TRUE)

# create new sample size variable

meta.plot2.over10.d$samplesize <- with(meta.plot2.over10.d, ifelse(sex == "malepercent", malebias, femalebias))

# Add summary row ('All') and re-arrange rows into correct order for plotting (warnings about coercing 'id' into character vector are ok)

meta.plot2.over10.e <- meta.plot2.over10.d %>% 
                      group_by(trait, sex) %>% 
                      summarise(GroupingTerm = "All", malebias = sum(malebias), 
                                femalebias = sum(femalebias), 
                                total = malebias + femalebias, 
                                label = "Sex difference in m/f ratios > 10%", 
                                samplesize = sum(samplesize)) %>%
                      mutate(percent = ifelse(sex == "femalepercent", 
                                              femalebias*100/(malebias + femalebias), 
                                              malebias*100/(malebias + femalebias))) %>%
                      bind_rows(meta.plot2.over10.d, .) %>%
                      mutate(rownumber = row_number()) %>%
                      .[c(55, 1:9, 57, 10:18, 59, 19:27, 56, 28:36, 58, 37:45, 60, 46:54), ] 

meta.plot2.over10.e$GroupingTerm <- factor(meta.plot2.over10.e$GroupingTerm, levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All")) 
meta.plot2.over10.e$GroupingTerm <- factor(meta.plot2.over10.e$GroupingTerm, rev(levels(meta.plot2.over10.e$GroupingTerm)))


# PLOT:  Fig2 Sex difference in m/f ratio > 10% (Fig S2 A, lower panel)

malebias_Fig2_over10 <-
  ggplot(meta.plot2.over10.e) +
  aes(x = GroupingTerm, y = percent, fill = sex) +
  ylab("Percentage") +
  geom_col() +
  geom_hline(yintercept = 50, linetype = "dashed", color = "gray40") +
  geom_text(
    data = subset(meta.plot2.over10.e, samplesize != 0), aes(label = samplesize), position = position_stack(vjust = .5),
    color = "white", size = 3.5
  ) +
  facet_grid(
    cols = vars(trait), rows = vars(label), labeller = label_wrap_gen(width = 18),
    scales = "free", space = "free"
  ) +
  scale_fill_brewer(palette = "Set2") +
  theme_bw(base_size = 18) +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_blank(),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = "none",
    #axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) +
  coord_flip()

#### 3.2: Preparation for Figure 4S2 B (second-order meta-analysis)
#### 3.3: Female Figure, significant and highly biased traits (Figure S2 B, left hand side panels)
#Preparing data for traits with CI not overlapping 0: 
#create column with 1= different from zero, 0= zero included in CI (for Figure S2 B, top left panel)

#Restructure data for plotting


overall3.female.sigS <- gather(overall.female.plot3, parameter, value, c(lnCVR, lnVR, lnRR), factor_key = TRUE)

lnCVR.ci <- overall3.female.sigS %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3.female.sigS %>%
  filter(parameter == "lnVR") %>%
  mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3.female.sigS %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)

overall4.female.sigS <- rbind(lnCVR.ci, lnVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high) # Note: bind_rows may need to be replaced by rbind

overall4.female.sigS <- overall4.female.sigS %>%
                          mutate(value = as.numeric(value),
                                 ci.low = as.numeric(ci.low),
                                 ci.high = as.numeric(ci.high))

overall4.female.sigS$label <- "CI not overlapping zero"

# PLOT female-bias:  significant traits

Metameta_FigS2_female.sig <- overall4.female.sigS %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(shape = parameter),
    fill = "salmon1", color = "salmon1", size = 2.2,
    show.legend = FALSE
  ) +
  scale_x_continuous(
    limits = c(-0.4, 0),
    breaks = c(-0.3, 0),
    name = "Effect size"
  ) +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed"
  ) +
  facet_grid(
    cols = vars(parameter), # rows = vars(label),
    # labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  )

#### Prepare data for traits with m/f difference > 10%, Female bias (Figure 4S2 B, bottom left)

meta.plot3.perc <- metacombo %>%
  mutate(
    percCVR = ifelse(lnCVR < log(9 / 10), 1, 0),
    percVR = ifelse(lnVR < log(9 / 10), 1, 0),
    percRR = ifelse(lnRR < log(9 / 10), 1, 0)
  )

# Significant subset for lnCVR
metacombo_plot3.CVR.perc <- meta.plot3.perc %>%
  filter(percCVR == 1) %>%
  group_by(GroupingTerm) %>%
  nest()

metacombo_plot3.CVR.perc.all <- meta.plot3.perc %>%
  filter(percCVR == 1) %>%
  nest(data = everything())

# Significant subset for lnVR
metacombo_plot3.VR.perc <- meta.plot3.perc %>%
  filter(percVR == 1) %>%
  group_by(GroupingTerm) %>%
  nest()

metacombo_plot3.VR.perc.all <- meta.plot3.perc %>%
  filter(percVR == 1) %>%
  nest(data = everything())

# Significant subset for lnRR
metacombo_plot3.RR.perc <- meta.plot3.perc %>%
  filter(percRR == 1) %>%
  group_by(GroupingTerm) %>%
  nest()

metacombo_plot3.RR.perc.all <- meta.plot3.perc %>%
  filter(percRR == 1) %>%
  nest(data = everything())


# **Final random effects meta-analyses within grouping terms, with SE of the estimate

plot3.meta.CVR.perc <- metacombo_plot3.CVR.perc %>%
  mutate(model_lnCVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.meta.VR.perc <- metacombo_plot3.VR.perc %>%
  mutate(model_lnVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.meta.RR.perc <- metacombo_plot3.RR.perc %>%
  mutate(model_lnRR = map(data, ~ metafor::rma.uni(
    yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

# Across all grouping terms #

plot3.meta.CVR.perc.all <- metacombo_plot3.CVR.perc.all %>%
  mutate(model_lnCVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.meta.CVR.perc.all <- plot3.meta.CVR.perc.all %>% mutate(GroupingTerm = "All")

plot3.meta.VR.perc.all <- metacombo_plot3.VR.perc.all %>%
  mutate(model_lnVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.meta.VR.perc.all <- plot3.meta.VR.perc.all %>% mutate(GroupingTerm = "All")

plot3.meta.RR.perc.all <- metacombo_plot3.RR.perc.all %>%
  mutate(model_lnRR = map(data, ~ metafor::rma.uni(
    yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.meta.RR.perc.all <- plot3.meta.RR.perc.all %>% mutate(GroupingTerm = "All")

# Combine with separate grouping term results

plot3.meta.CVR.perc <- bind_rows(plot3.meta.CVR.perc, plot3.meta.CVR.perc.all)
plot3.meta.VR.perc <- bind_rows(plot3.meta.VR.perc, plot3.meta.VR.perc.all)
plot3.meta.RR.perc <- bind_rows(plot3.meta.RR.perc, plot3.meta.RR.perc.all)


# **Re-structure data for each grouping term; delete un-used variables: "Hearing missing for all 3 parameters"

plot3.meta.CVR.perc.b <- as.data.frame(plot3.meta.CVR.perc %>% group_by(GroupingTerm) %>%
  mutate(
    lnCVR = map_dbl(model_lnCVR, pluck(2)), lnCVR_lower = map_dbl(model_lnCVR, pluck(6)),
    lnCVR_upper = map_dbl(model_lnCVR, pluck(7)), lnCVR_se = map_dbl(model_lnCVR, pluck(3))
  ))[, c(1, 4:7)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA))) %>% setNames(names(plot3.meta.CVR.perc.b))
plot3.meta.CVR.perc.b <- rbind(plot3.meta.CVR.perc.b, add.row.hearing)
plot3.meta.CVR.perc.b <- plot3.meta.CVR.perc.b[order(plot3.meta.CVR.perc.b$GroupingTerm), ]

plot3.meta.VR.perc.b <- as.data.frame(plot3.meta.VR.perc %>% group_by(GroupingTerm) %>%
  mutate(
    lnVR = map_dbl(model_lnVR, pluck(2)), lnVR_lower = map_dbl(model_lnVR, pluck(6)),
    lnVR_upper = map_dbl(model_lnVR, pluck(7)), lnVR_se = map_dbl(model_lnVR, pluck(3))
  ))[, c(1, 4:7)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA))) %>% setNames(names(plot3.meta.VR.perc.b))
plot3.meta.VR.perc.b <- rbind(plot3.meta.VR.perc.b, add.row.hearing)
plot3.meta.VR.perc.b <- plot3.meta.VR.perc.b[order(plot3.meta.VR.perc.b$GroupingTerm), ]

plot3.meta.RR.perc.b <- as.data.frame(plot3.meta.RR.perc %>% group_by(GroupingTerm) %>%
  mutate(
    lnRR = map_dbl(model_lnRR, pluck(2)), lnRR_lower = map_dbl(model_lnRR, pluck(6)),
    lnRR_upper = map_dbl(model_lnRR, pluck(7)), lnRR_se = map_dbl(model_lnRR, pluck(3))
  ))[, c(1, 4:7)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA))) %>% setNames(names(plot3.meta.RR.perc.b))
plot3.meta.RR.perc.b <- rbind(plot3.meta.RR.perc.b, add.row.hearing)
add.row.hematology <- as.data.frame(t(c("Hematology", NA, NA, NA, NA))) %>%
  setNames(names(plot3.meta.RR.perc.b))
plot3.meta.RR.perc.b <- rbind(plot3.meta.RR.perc.b, add.row.hematology)


plot3.meta.RR.perc.b <- plot3.meta.RR.perc.b[order(plot3.meta.RR.perc.b$GroupingTerm), ]

plot3.meta.CVR.perc.c <- full_join(plot3.meta.CVR.perc.b, plot3.meta.VR.perc.b)
overall.plot3.perc <- full_join(plot3.meta.CVR.perc.c, plot3.meta.RR.perc.b)


overall.plot3.perc$GroupingTerm <- factor(overall.plot3.perc$GroupingTerm, levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall.plot3.perc$GroupingTerm <- factor(overall.plot3.perc$GroupingTerm, rev(levels(overall.plot3.perc$GroupingTerm)))

# Restructure data for plotting
# Female bias, 10 percent difference

overall3.perc <- gather(overall.plot3.perc, parameter, value, c(lnCVR, lnVR, lnRR), factor_key = TRUE) 

lnCVR.ci <- overall3.perc %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3.perc  %>% filter(parameter == "lnVR") %>% mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3.perc %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)

overall4.perc <- bind_rows(lnCVR.ci, lnVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high)

overall4.perc$label <- "Sex difference in m/f ratios > 10%"

overall4.perc$value <- as.numeric(overall4.perc$value)
overall4.perc$ci.low <- as.numeric(overall4.perc$ci.low)
overall4.perc$ci.high <- as.numeric(overall4.perc$ci.high)

# Plot FigS2 all >10% difference (female)
# Figure S2B, bottom left

Metameta_Fig3_female.perc <- overall4.perc %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(shape = parameter),
    fill = "salmon1", color = "salmon1", size = 2.2,
    show.legend = FALSE
  ) +

  # scale_shape_manual(values =

  scale_x_continuous(
    limits = c(-0.53, 0.2),
    breaks = c(-0.3, 0),
    name = "Effect size"
  ) +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed"
  ) +
  facet_grid(
    cols = vars(parameter), # rows = vars(label),
    # labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_blank(),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_text(hjust = 0.5, size = 14),
    axis.title.y = element_blank()
  )


#### 3.4: Male Figure, significant and highly biased traits (Figure 4S2 B, right hand side panels)
#Restructuring data with significant male bias (for Figure 4S2 B, top right panel)

overall3.male.sigS <- gather(overall.male.plot3, parameter, value, c(lnCVR, lnVR, lnRR), factor_key = TRUE)


lnCVR.ci <- overall3.male.sigS %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3.male.sigS %>%
  filter(parameter == "lnVR") %>%
  mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3.male.sigS %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)

overall4.male.sigS <- rbind(lnCVR.ci, lnVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high) %>% 
  mutate(value = as.numeric(value), ci.low = as.numeric(ci.low), ci.high = as.numeric(ci.high))
# R 3.6: do not use "%>% mutate(value = as.numeric(value), ci.low = as.numeric(ci.low), ci.high = as.numeric(ci.high))"
overall4.male.sigS$label <- "CI not overlapping zero"


# Plot FigS2 all significant results (CI not overlapping zero, male )

Metameta_FigS2_male.sig <- overall4.male.sigS %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(shape = parameter),
    fill = "mediumaquamarine", color = "mediumaquamarine", size = 2.2,
    show.legend = FALSE
  ) +
  scale_x_continuous(
    limits = c(0, 0.4),
    breaks = c(0, 0.3),
    name = "Effect size"
  ) +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed"
  ) +
  facet_grid(
    cols = vars(parameter), rows = vars(label),
    labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_text(size = 12),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  )

# Metameta_FigS2_male.sig

## Prepare data for traits with m/f difference > 10%. Male bias
### Create column with 1= larger, 0= difference not larger than 10% between male/female ratios

meta.male.plot3.perc <- metacombo %>%
  mutate(
    percCVR = ifelse(lnCVR > log(11 / 10), 1, 0),
     percVR = ifelse( lnVR > log(11 / 10), 1, 0),
     percRR = ifelse( lnRR > log(11 / 10), 1, 0)
  )

# Significant subset for lnCVR
metacombo_male.plot3.CVR.perc <- meta.male.plot3.perc %>%
                                filter(percCVR == 1) %>%
                                group_by(GroupingTerm) %>%
                                nest()

metacombo_male.plot3.CVR.perc.all <- meta.male.plot3.perc %>%
                filter(percCVR == 1) %>%
  nest(data = everything())

# Significant subset for lnVR
metacombo_male.plot3.VR.perc <- meta.male.plot3.perc %>%
  filter(percVR == 1) %>%
  group_by(GroupingTerm) %>%
  nest()

metacombo_male.plot3.VR.perc.all <- meta.male.plot3.perc %>%
  filter(percVR == 1) %>%
  nest(data = everything())

# Significant subset for lnRR
metacombo_male.plot3.RR.perc <- meta.male.plot3.perc %>%
  filter(percRR == 1) %>%
  group_by(GroupingTerm) %>%
  nest()

metacombo_male.plot3.RR.perc.all <- meta.male.plot3.perc %>%
  filter(percRR == 1) %>%
  nest(data = everything())


# Final random effects meta-analyses within grouping terms and across grouping terms, with SE of the estimate

plot3.male.meta.CVR.perc <- metacombo_male.plot3.CVR.perc %>%
  mutate(model_lnCVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.male.meta.VR.perc <- metacombo_male.plot3.VR.perc %>%
  mutate(model_lnVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.male.meta.RR.perc <- metacombo_male.plot3.RR.perc %>%
  mutate(model_lnRR = map(data, ~ metafor::rma.uni(
    yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

# Across all grouping terms 

plot3.male.meta.CVR.perc.all <- metacombo_male.plot3.CVR.perc.all %>%
  mutate(model_lnCVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnCVR, sei = (.x$lnCVR_upper - .x$lnCVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.male.meta.CVR.perc.all <- plot3.male.meta.CVR.perc.all %>% mutate(GroupingTerm = "All")

plot3.male.meta.VR.perc.all <- metacombo_male.plot3.VR.perc.all %>%
  mutate(model_lnVR = map(data, ~ metafor::rma.uni(
    yi = .x$lnVR, sei = (.x$lnVR_upper - .x$lnVR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.male.meta.VR.perc.all <- plot3.male.meta.VR.perc.all %>% mutate(GroupingTerm = "All")

plot3.male.meta.RR.perc.all <- metacombo_male.plot3.RR.perc.all %>%
  mutate(model_lnRR = map(data, ~ metafor::rma.uni(
    yi = .x$lnRR, sei = (.x$lnRR_upper - .x$lnRR_lower) / (2 * 1.96),
    control = list(optimizer = "optim", optmethod = "Nelder-Mead", maxit = 1000), verbose = F
  )))

plot3.male.meta.RR.perc.all <- plot3.male.meta.RR.perc.all %>% mutate(GroupingTerm = "All")

# Combine with separate grouping term results

plot3.male.meta.CVR.perc <- bind_rows(plot3.male.meta.CVR.perc, plot3.male.meta.CVR.perc.all)
plot3.male.meta.VR.perc <- bind_rows(plot3.male.meta.VR.perc, plot3.male.meta.VR.perc.all)
plot3.male.meta.RR.perc <- bind_rows(plot3.male.meta.RR.perc, plot3.male.meta.RR.perc.all)


# **Re-structure data for each grouping term; delete un-used variables: "Hearing missing for all 3 parameters"

plot3.male.meta.CVR.perc.b <- as.data.frame(plot3.male.meta.CVR.perc %>% group_by(GroupingTerm) %>%
  mutate(
    lnCVR = map_dbl(model_lnCVR, pluck(2)), lnCVR_lower = map_dbl(model_lnCVR, pluck(6)),
    lnCVR_upper = map_dbl(model_lnCVR, pluck(7)), lnCVR_se = map_dbl(model_lnCVR, pluck(3))
  ))[, c(1, 4:7)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA))) %>% setNames(names(plot3.male.meta.CVR.perc.b))
plot3.male.meta.CVR.perc.b <- rbind(plot3.male.meta.CVR.perc.b, add.row.hearing)
plot3.male.meta.CVR.perc.b <- plot3.male.meta.CVR.perc.b[order(plot3.male.meta.CVR.perc.b$GroupingTerm), ]

plot3.male.meta.VR.perc.b <- as.data.frame(plot3.male.meta.VR.perc %>% group_by(GroupingTerm) %>%
  mutate(
    lnVR = map_dbl(model_lnVR, pluck(2)), lnVR_lower = map_dbl(model_lnVR, pluck(6)),
    lnVR_upper = map_dbl(model_lnVR, pluck(7)), lnVR_se = map_dbl(model_lnVR, pluck(3))
  ))[, c(1, 4:7)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA))) %>% setNames(names(plot3.male.meta.VR.perc.b))
plot3.male.meta.VR.perc.b <- rbind(plot3.male.meta.VR.perc.b, add.row.hearing)
plot3.male.meta.VR.perc.b <- plot3.male.meta.VR.perc.b[order(plot3.male.meta.VR.perc.b$GroupingTerm), ]

plot3.male.meta.RR.perc.b <- as.data.frame(plot3.male.meta.RR.perc %>% group_by(GroupingTerm) %>%
  mutate(
    lnRR = map_dbl(model_lnRR, pluck(2)), lnRR_lower = map_dbl(model_lnRR, pluck(6)),
    lnRR_upper = map_dbl(model_lnRR, pluck(7)), lnRR_se = map_dbl(model_lnRR, pluck(3))
  ))[, c(1, 4:7)]
add.row.hearing <- as.data.frame(t(c("Hearing", NA, NA, NA, NA))) %>%
  setNames(names(plot3.male.meta.RR.perc.b))
plot3.male.meta.RR.perc.b <- rbind(plot3.male.meta.RR.perc.b, add.row.hearing)

add.row.eye <- as.data.frame(t(c("Eye", NA, NA, NA, NA))) %>%
  setNames(names(plot3.male.meta.RR.perc.b))
plot3.male.meta.RR.perc.b <- rbind(plot3.male.meta.RR.perc.b, add.row.eye)

plot3.male.meta.RR.perc.b <- plot3.male.meta.RR.perc.b[order(plot3.male.meta.RR.perc.b$GroupingTerm), ]

plot3.male.meta.CVR.Vr.perc <- full_join(plot3.male.meta.CVR.perc.b, plot3.male.meta.VR.perc.b)
overall.male.plot3.perc <- full_join(plot3.male.meta.CVR.Vr.perc, plot3.male.meta.RR.perc.b)


overall.male.plot3.perc$GroupingTerm <- factor(overall.male.plot3.perc$GroupingTerm, levels = c("Behaviour", "Morphology", "Metabolism", "Physiology", "Immunology", "Hematology", "Heart", "Hearing", "Eye", "All"))
overall.male.plot3.perc$GroupingTerm <- factor(overall.male.plot3.perc$GroupingTerm, rev(levels(overall.male.plot3.perc$GroupingTerm)))


# Restructure data for plotting : Male biased, 10% difference

overall3.perc <- gather(overall.male.plot3.perc, parameter, value, c(lnCVR, lnVR, lnRR), factor_key = TRUE) 

lnCVR.ci <- overall3.perc %>%
  filter(parameter == "lnCVR") %>%
  mutate(ci.low = lnCVR_lower, ci.high = lnCVR_upper)
lnVR.ci <- overall3.perc  %>% filter(parameter == "lnVR") %>% mutate(ci.low = lnVR_lower, ci.high = lnVR_upper)
lnRR.ci <- overall3.perc %>%
  filter(parameter == "lnRR") %>%
  mutate(ci.low = lnRR_lower, ci.high = lnRR_upper)

overall4.male.perc <- bind_rows(lnCVR.ci,lnVR.ci, lnRR.ci) %>% select(GroupingTerm, parameter, value, ci.low, ci.high) 

overall4.male.perc$label <- "Sex difference in m/f ratios > 10%"

overall4.male.perc$value <- as.numeric(overall4.male.perc$value)
overall4.male.perc$ci.low <- as.numeric(overall4.male.perc$ci.low)
overall4.male.perc$ci.high <- as.numeric(overall4.male.perc$ci.high)

# Plot Fig S2 all >10% difference (male bias)
# S2 B, bottom right 

Metameta_Fig3_male.perc <- overall4.male.perc %>% # filter(., GroupingTerm != "Hearing") %>%
  ggplot(aes(y = GroupingTerm, x = value)) +
  geom_errorbarh(aes(
    xmin = ci.low,
    xmax = ci.high
  ),
  height = 0.1, show.legend = FALSE
  ) +
  geom_point(aes(
    shape = parameter,
    fill = parameter
  ),
  color = "mediumaquamarine", size = 2.2,
  show.legend = FALSE
  ) +
  scale_x_continuous(
    limits = c(-0.2, 0.62),
    breaks = c(0, 0.3),
    name = "Effect size"
  ) +
  geom_vline(
    xintercept = 0,
    color = "black",
    linetype = "dashed"
  ) +
  facet_grid(
    cols = vars(parameter), rows = vars(label),
    labeller = label_wrap_gen(width = 23),
    scales = "free",
    space = "free"
  ) +
  theme_bw() +
  theme(
    strip.text.y = element_text(angle = 270, size = 10, margin = margin(t = 15, r = 15, b = 15, l = 15)),
    strip.text.x = element_blank(),
    strip.background = element_rect(colour = NULL, linetype = "blank", fill = "gray90"),
    text = element_text(size = 14),
    panel.spacing = unit(0.5, "lines"),
    panel.border = element_blank(),
    axis.line = element_line(),
    panel.grid.major.x = element_line(linetype = "solid", colour = "gray95"),
    panel.grid.major.y = element_line(linetype = "solid", color = "gray95"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.title = element_blank(),
    axis.title.x = element_text(hjust = 0.5, size = 14),
    axis.title.y = element_blank()
  )

# Metameta_Fig3_male.perc (Figure S2 right panel)
```


```{r Fig4S1, fig.height = 8, fig.width = 7, fig.cap=" "**Sex bias in trait groups for lnCVR, lnVR and lnRR.**
(A) Percentages and numbers of affected traits, for variance (lnCVR and lnVR) and means (lnRR), where there is a significant difference between the sexes (i.e. CI not overlapping zero), and where the sex bias is greater than 10% difference (regardless of significance). Panel B depicts results for the effect size magnitudes in those traits that differ between the sexes (second-order meta-analysis). Triangles represent sex bias in means (response ratio) and black circles differences in the coefficient of variation ratio (mean-adjusted variability). The orange bars represent trait groups with a female bias, turqouise bars male-biased traits."}

#### Combined Figure 4S1: overall Count data, Meta anlysis results, Heterogeneity

Fig4S1 <- ggarrange(malebias_FigS1_alltraits, Metameta_FigS1_alltraits, HeteroS1, nrow = 3, align = "v", heights = c(1.1, 1, 1), labels = c("A", "B", "C"))
Fig4S1

```



### Testing the two hypotheses
We found that some means and variabilities of traits were biased towards males (i.e. ‘male-biased’, hereafter; turquoise shaded traits, Figure 4), but others towards females (i.e. ‘female-biased’, here- after; orange shading, Figure 4) within all functional groups. These sex-specific biases occur in mean trait sizes and also in our measures of trait variability. There were strong positive relationships between mean and variance across traits (r > 0.94 on the log scale; Figure 1—figure supplement 1), and therefore, we report the results of lnCVR, which controls for differences in means, in the main text. Results on lnVR are presented as figure supplements (Figure 4—figure supplements 1 and 2).
There was no consistent pattern in which sex has more variability (lnCVR) in the examined traits (left panel in Figure 4A). Our meta-analytic results also did not support a consistent pattern of either higher male variability or higher female variability (see Figure 4B, left panel: ‘All’ indicates that across all traits and functional groups, there was no significant sex bias in variances; lnCVR = 0.005, 95% confidence interval, 95% CI = [-0.009 to 0.018]). However, there was high heterogeneity among traits (I2 = 76.5%, Supplementary file 1, Table 4 and see also Table 5), indicating sex differ- ences in variability are trait-dependent, corroborating our general observation that variability in some traits was male-biased but others female-biased (Figure 4A).
As expected, specific functional trait groups showed significant sex-specific bias in variability (Figure 4B). The variability among traits within a functional group was lower than that of all the traits combined (Supplementary file 1, Table 4). For example, males exhibited an 8.05% increase in CV relative to females for morphological traits (lnCVR = 0.077; CI = [0.041 to 0.113], I2 = 67.3%), but CV was female-biased for immunological traits (6.59% higher in females, lnCVR = -0.068, CI = [-0.098 to 0.038], I2 = 40.8%) and eye morphology (7.85% higher in females, lnCVR = -0.081, CI = [-0.147 to (-0.016)], I2 = 49.8%).
The pattern was similar for overall sexual dimorphism in mean trait values (here, a slight male bias is indicated by larger ‘turquoise’ than ‘orange’ areas; Figure 4B, right and Figure 4B, lnRR: ‘All’, lnRR = 0.012, CI = [-0.006 to 0.31]). Trait means (lnRR) were 7% larger for males (lnRR = 0.067; CI =[0.007 to 0.128]) in morphological traits and 15.3% larger in males for metabolic traits (lnRR = 0.142; CI = [0.036 to 0.248]). In contrast, females had 5.59% (lnRR = 0.057, CI = [-0.107 to (-0.007)]) larger means than those of males for immunological traits. We note that these meta-analytic estimates were accompanied by very large between-trait heterogeneity values (morphology I2 = 99.7%, metabolism I2 = 99.4%, immunology I2 = 96.2; see Supplementary file 1, Table 4), indicating that even within the same functional groups, the degree and direction of sex bias in the mean was not consistent among traits.

## Discussion
We tested competing predictions from two hypotheses explaining why sex biases in trait variability exist. Neither the ‘greater male variability’ hypothesis nor the ‘estrus-mediated variability’ hypothesis explain the observed patterns in sex-biased trait variation on their own. Therefore, our results add further empirical weight to calls that question the basis for the routine exclusion of one sex in bio- medical research based on the estrus-mediated variability hypothesis (Flanagan, 2014; Klein et al., 2015; Prendergast et al., 2014; Shansky and Woolley, 2016; Becker et al., 2016). It is important to know that for each trait we estimated the mean effect size (i.e. lnCVR) over strains and locations. As such, our results may not necessarily apply to every group of mice, which may or may not result in stronger support for either of the two hypotheses.

### Greater male variability vs. estrus-mediated variability?
Evolutionary biologists commonly expect greater variability in the heterogametic sex than the homo- gametic sex. In mammals, males are heterogametic, and hence are expected to exhibit higher trait variability compared to females, which is also consistent with an expectation from sexual selection theory (Reinhold and Engqvist, 2013). Our results provide only partial support for the greater male variability hypothesis, because the expected pattern only manifested for morphological traits (see Figures 4 and 5). This result corroborates a previous analysis across animals, which found that the heterogametic sex was more variable in body size (Reinhold and Engqvist, 2013). However, our data do not support the conclusion that higher variability in males occurs across all traits, including for many other morphological traits.

## Figure 5
 ![Fig5](./Figures/Figure5.png)

### Summary of sex differences in the mean trait values (lnRR) and variances (lnCVR) across nine functional trait groups, and overall.

The estrus-mediated variability hypothesis was, at least until recently (Prendergast et al., 2014; Smarr et al., 2017), regularly used as a rationale for including only male subjects in many biomedical studies. So far, we know very little about the relationship between hormonal fluctuations and general trait variability within and among female subjects. Our results are consistent with the estrus-medi- ated variability hypothesis for immunological traits only. Immune responses can strongly depend on sex hormones (Zuk and McKean, 1996; Grossman, 1989), which may explain higher female variabil- ity in these traits. However, if estrus status affects traits through variation in hormone levels, we would expect to also find higher female variability in physiological and hematological traits. This was not the case in our dataset. Interestingly, however, eye morphology (structural traits, which should fluctuate little across the estrous cycle) also appeared to be more variable in females than males, but little is known about sex differences in ocular traits in general (Wagner et al., 2008; Shaqiri et al., 2018). Overall, we find no consistent support for the female estrus-mediated variability hypothesis.
In line with our findings, recent studies have refuted the prediction of higher female variability (Prendergast et al., 2014; Smarr et al., 2017; Beery and Zucker, 2011; Becker et al., 2016; Beery, 2018). For example, several rodent studies have found that males are more variable than females (Prendergast et al., 2014; Smarr et al., 2017; Becker et al., 2016; Beery, 2018; Fritz et al., 2017; Mogil and Chanda, 2005). Further studies should investigate whether higher female variability in immunological traits is indeed due to the estrous cycle, or generally because of greater between-individual variation (Figure 2).
In general, we found many traits to be sexually dimorphic (Figure 5) in accordance with the previ- ous study, which used the same database (Karp et al., 2017). Although the original study also pro- vided estimates for sex differences in traits both with and without controlling for weight (we did not control for weight; Nakagawa et al., 2017). More specifically, males are larger than females, while females have higher immunological parameters (see Figure 5). Notably, the most sexually dimorphic trait means also show the greatest differences in trait variance (Figures 4 and 5). Indeed, theory pre- dicts that sexually selected traits (e.g. larger body size for males due to male-male competition) are likely more variable, as these traits are often condition-dependent (Rowe and Houle, 1996). There- fore, this sex difference in variability could be more pronounced under natural conditions compared to laboratory settings. This relationship may explain why male-biased morphological traits are larger and more variable.

### Eco-evolutionary implications
We have used lnCVR values to compare phenotypic variability (CV) between the sexes. When lnCVR is used for fitness-related traits, it can signify sex differences in the ‘opportunity for selection’ between females and males (Rowe and Houle, 1996). If we assume that phenotypic variation (i.e. variability in traits) has a heritable basis, then large ratios of lnCVR may indicate differences in the evolutionary potential of each sex to respond to selection, at least in the short term (Hansen and Houle, 2008). For example, more variable morphological traits of males could potentially provide them with better capacity than females to adapt morphologically to a changing climate. We note, however, that in our study, lnCVR reflects sex differences in trait variability within strains, such that the variability differences we observe between the sexes may be partially the result of phenotypic plasticity.
Demographic parameters, such as age-dependent mortality rate (Lemaıˆtre et al., 2020) can often be different for each sex. For example, a study on European sparrowhawks found that variability in mortality was higher in females compared to males (Colchero et al., 2017). In this species, sex-spe- cific variation affects age-dependent mortality and results in higher average female life expectancy. Therefore, population dynamic models, which make predictions about how populations change in their size over time, should take sex differences in variability into account to produce more accurate predictions (Caswell and Weeks, 1986; Lindstro ̈m and Kokko, 1998). In our rapidly changing world, better predictions on population dynamics are vital for understanding whether climate change is likely to result in population extinction and lead to further biodiversity loss.

### Statistical and practical implications
It is now mandatory to include both sexes in biomedical experiments and clinical trials funded by the NIH, unless there exists strong justification against the inclusion of both sexes (NIH, 2015a; NIH, 2015b). In order to conduct meaningful research and make sound clinical recommendations for both male and female patients, it is necessary to understand both how trait means and variances dif- fer between the sexes. If one sex is systematically more variable in a trait of interest than the other, then experiments should be designed to accommodate relative differences in statistical power between the sexes (which has not been considered before, see Flanagan, 2014; Klein et al., 2015; Prendergast et al., 2014; Shansky and Woolley, 2016). For example, female immunological traits are generally more variable (i.e. having higher CV and SD). Therefore, in an experiment measuring immunological traits, we would need to include a larger sample (N) of females than males (N[female] > N[male]; N[total] = N[female] + N[male]) to achieve the same power as when the experiment only includes males (N[total*] = 2N[male]). In other words, in an experiment with both sexes we would need a larger sample size than the same experiment with males only (N[total] > N[total*]).
To help researchers adjust their sex-specific sample size to achieve optimal statistical power, we provide an online tool (ShinyApp; https://bit.ly/sex-difference). This tool may serve as a starting point for checking baseline variability for each sex in mice. The sex bias (indicated by the % differ- ence between the sexes) is provided for separate traits, procedures, and functional groups. These meta-analytic results are based on our analyses of more than 2 million rodent data points, from 26,916 individual mice. We note, however, that variability in a trait measured in untreated individuals maintained under carefully standardized environmental conditions, as reported here, may not directly translate into the same variability when measured in experimentally treated individuals, or individuals exposed to a range of environments (i.e. natural populations or human cohorts). Further, these estimates are overall mean differences across strains and locations. Therefore, these may not be particularly informative if one’s experiment only includes one specific strain. Nonetheless, we point out that our estimates may be useful in the light of a recent recommendation of using ‘hetero- genization’ where many different strains are systematically included (i.e. randomized complete block design) to increase the robustness of experimental results (Voelkl et al., 2020). However, note that an experiment with heterogenization might only include a few strains with several animals per strain. Even in such a case, using just a few strains, our tool could provide potentially useful benchmarks. Incidentally, heterogenization would be key to making one’s experimental outcome more generaliz- able (Webster and Rutz, 2020).
Importantly, when two groups (e.g. males and females) show differences in variability, we violate homogeneity of variance or homoscedasticity assumptions. Such a violation is detrimental because it leads to a higher Type I error rate. Therefore, we should consider incorporating heteroscedasticity (different variances) explicitly or using robust estimators of variance (also known as ‘the sandwich variance estimator’) to prevent an inflated Type I error rate (Cleasby and Nakagawa, 2011), espe- cially when we compare traits between the sexes.

## Conclusion
We have shown that sex biases in variability occur in many mouse traits, but that the directions of those biases differ between traits. Neither the ‘greater male variability’ nor the ‘estrus-mediated var- iability’ hypothesis provides a general explanation for sex differences in trait variability. Instead, we have found that the direction of the sex bias varies across traits and among trait types (Figures 4 and 5). Our findings have important ecological and evolutionary ramifications. If the differences in variability correspond to the potential of each sex to respond to changes in specific environments, this sex difference needs to be incorporated into demographic and population dynamic modelling. Moreover, in the (bio-)medical field, our results should inform decisions during study design by pro- viding more rigorous power analyses that allow researchers to incorporate sex-specific differences for sample size. We believe that taking sex differences in trait variability into account will help avoid misleading conclusions and provide new insights into sex differences across many areas of biological and bio-medical research. Ultimately, such considerations will not only better our knowledge, but also close the current gaps in our biased knowledge (Tannenbaum et al., 2019).

## Materials and methods
### Data selection and process
The IMPC (International Mouse Phenotyping Consortium) provides a comprehensive catalogue of mammalian gene function for investigating the genetics of health and disease, by systematically col- lecting phenotypes of knock-out and wildtype mice. To investigate differences in trait variability between the sexes, we only considered the data for wildtype control mice. We retrieved the dataset from the IMPC server in June 2018 and filtered it to contain non-categorical traits for wildtype mice. The initial dataset comprised over 2,500,000 data points for 340 traits. In cases where multiple meas- urements were taken over time, data cleaning started with selecting single measurements for each individual and trait. In these cases, we selected the measurement closest to ‘100 days of age’. All data are from unstaged females (with no information about the stage of their estrous cycle). We excluded data for juvenile and unsexed mice (Figure 3A; this dataset and scripts can be found on https://rpubs.com/SusZaj/ESF; https://bit.ly/code-mice-sex-diff; raw data: https://doi.org/10.5281/ zenodo.3759701).

### Grouping and effect-size calculation
We created a grouping variable called ‘population’ (Figure 3B). A population comprised a group of individuals belonging to a distinct wildtype strain maintained at one particular location (institution); populations were identified for every trait of interest. Our data were derived from 11 different loca- tions/institutions, and a given location/institution could provide data on multiple populations (see Supplementary file 1, Table 1 for details on numbers of strains and institutions). We included only populations that contained data points for at least six individuals, and which had information for members of both sexes; further, populations for a particular trait had to come from at least two insti- tutions to be eligible for inclusion. After this selection process, the dataset contained 2,300,000 data points across 232 traits. Overall, we meta-analysed traits with between 2–18 effect sizes (mean = 9.09 effects, SD = 4.47). However, each meta-analysis contained a total number of individ- ual mice that ranged from 83/91 to 13467/13449 (males/females). While a minimum of N = 6 mice were used to create effect sizes for any given group (male or female), in reality samples sizes of male/female groups were much larger (males: mean = 396.66 (SD = 238.23), median = 465.56;females: mean = 407.35 (SD = 240.31), median = 543.89). We used the function escalc in the R pack- age, metafor (Viechtbauer, 2010) to obtain lnCVR, lnVR and lnRR and their corresponding sampling variance for each trait for each population; we worked in the R environment for data cleaning, proc- essing and analyses (R Development Core Team, 2017, version 3.6.0; for the versions of all the soft- ware packages used for this article and all the details and code for the statistical analyses, see the Source code 1 and repositories). As mentioned above, the use of ratio-based effect sizes, such as lnCVR, lnVR and lnRR, controls for baseline changes over time and space, assuming that these changes affect males and females similarly. However, we acknowledge that we could not test this assumption.

### Meta-analyses: overview
We conducted meta-analyses at two different levels (Figure 3C–J). First, we conducted a meta-anal- ysis for each trait for all three effect-size types (lnRR, lnVR and lnCVR), calculated at the ‘population’ level (i.e. using population as a unit of analysis). Second, we statistically amalgamated overall effect sizes estimated at each trait (i.e. overall trait means as a unit of analysis) after accounting for depen- dence among traits. In other words, we conducted second-order meta-analyses (Nakagawa et al., 2019). We used the second-order meta-analyses for three different purposes: (A) estimating overall sex biases in variance (lnCVR and lnVR) and mean (lnRR) in the nine functional groups (for details, see below) and in all these groups combined (the overall estimates); (B) visualizing heterogeneities across populations for the three types of effect size in the nine functional trait groups, which comple- mented the first set of analyses (Figure 3I, Table 6 in Supplementary file 1); and (C) when traits were found to be significantly sex-biased, grouping such traits into either male-biased and female- biased traits, and then, estimating overall magnitudes of sex bias for both sexes again for the nine functional trait groups. Only the first second-order meta-analysis (A) directly related to the testing of our hypotheses, results of B and C are found in Supplementary file 1 and figures and reported in our freely accessible code.

### Meta-analyses: population as an analysis unit
To obtain degree of sex bias for each trait mean and variance (Figure 3C), we used the function rma.mv in the R package metafor (Viechtbauer, 2010) by fitting the following multilevel meta-ana- lytic model, an extension of random-effects models (sensu Nakagawa and Santos, 2012):
ESi ~ 1 + (1 | Strainj) + (1 | Locationk) + (1 | Uniti) + Errori, where ‘ESi’ is the ith effect size (i.e. lnCVR, lnVR and lnRR) for each of 232 traits, the ‘1’ is the overall intercept (other ‘1’s are random intercepts for the following random effects), ‘Strainj’ is a random effect for the jth strain of mice (among nine strains), ‘Locationk’ is a random effect for the kth location (among 11 institutions), ‘Uniti’ is a residual (or effect-size level or ‘population-level’ random effect) for the ith effect size, ‘Errori’ is a random effect of the known sampling error for the ith effect size. Given the model above, meta-ana- lytic results had two components: (1) overall means with standard errors (95% confidence intervals), and (2) total heterogeneity (the sum of the three variance components, which is estimated for the random effects). Note that overall means indicate average (marginalised) effect sizes over different strains and locations, and total heterogeneities reflect variation around overall means due to differ- ent strains and locations.
We excluded traits which did not carry useful information for this study (i.e. fixed traits, such as number of vertebrae, digits, ribs and other traits that were not variable across wildtype mice; note that this may be different for knock-down mutant strains) or where the meta-analytic model for the trait of interest did not converge, most likely due to small sample size from the dataset (14 traits, see SI Appendix, for details: Meta-analyses; 1. Population as analysis unit). We therefore obtained a dataset containing meta-analytic results for 218 traits, at this stage, to use for our second-order meta-analyses (Figure 3D).

### Meta-analyses: accounting for correlated traits
Our dataset of meta-analytic results included a large number of non-independent traits. To account for dependence, we identified 90 out of 218 traits, and organized them into 19 trait sub-groups (containing 2–10 correlated traits, see Figure 3E). For example, many measurements (i.e. traits) from hematological and immunological assays were hierarchically clustered or overlapped with each other (e.g. cell type A, B and A+B). We combined the meta-analytic results from 90 traits into 19 meta- analytic results (Figure 3F) using the function robu in the R package robumeta with the assumption of sampling errors being correlated with the default value of r = 0.8 (Fisher et al., 2017). Conse- quently, our final dataset for secondary meta-analyses contained 147 traits (i.e. the newly condensed 19 plus the remaining 128 independent traits, see Figure 3, Supplementary file 1, Table 2), which we assume to be independent of each other.
Second-order meta-analyses: trait as an analysis unit
We created our nine overarching functional groups of traits (Figure 3G) by condensing the IMPC’s 26 procedural categories (‘procedures’) into related clusters. The categories were based on proce- dures that were biologically related, in conjunction with measurement techniques and the number of available traits in each category (see Supplementary file 1, Table 3 for a list of clustered traits, procedures and grouping terms). To test our two hypotheses about how trait variability changes in relation to sex, we estimated overall effect sizes for nine functional groups by aggregating meta-ana- lytic results via ‘classical’ random-effect models using the function rma.uni in the R package metafor (Viechtbauer, 2010). In other words, we conducted three sets of 10 second-order meta-analyses (i.e. meta-analyzing 3 types of effect size: lnRR, lnVR and lnCVR for nine functional groups and one for all the groups combined, Figure 3H). Although we present the frequencies of male- and female- biased traits in Figure 4A, we did not run inferential statistical tests on these counts because such tests would be considered as vote-counting, which has been severely criticised in the meta-analytic literature (Higgins, 2019).

### Acknowledgements
SRKZ and ML were supported by the Australian (ARC) Discovery Grant (DP180100818) awarded to SN. JM was supported by EMBL core funding and the NIH Common Fund (UM1-H G006370). AMS was supported by an ARC fellowship (DE180101520).

## References
Ahmad AA, Randall MD, Roberts RE. 2017. Sex differences in the role of phospholipase A2 -dependent arachidonic acid pathway in the perivascular adipose tissue function in pigs. The Journal of Physiology 595:6623–6634. DOI: https://doi.org/10.1113/JP274831, PMID: 28877347. 

Becker JB, Prendergast BJ, Liang JW. 2016. Female rats are not more variable than male rats: a meta-analysis of neuroscience studies. Biology of Sex Differences 7:34. DOI: https://doi.org/10.1186/s13293-016-0087-5, PMID: 27468347. 

Beery AK. 2018. Inclusion of females does not increase variability in rodent research studies. Current Opinion in Behavioral Sciences 23:143–149. DOI: https://doi.org/10.1016/j.cobeha.2018.06.016. 

Beery AK, Zucker I. 2011. Sex bias in neuroscience and biomedical research. Neuroscience & Biobehavioral Reviews 35:565–572. DOI: https://doi.org/10.1016/j.neubiorev.2010.07.002, PMID: 20620164. 

Caswell H, Weeks DE. 1986. Two-Sex models: chaos, extinction, and other dynamic consequences of sex. The American Naturalist 128:707–735. DOI: https://doi.org/10.1086/284598. 

Clayton JA. 2016. Studying both sexes: a guiding principle for biomedicine. The FASEB Journal 30:519–524. DOI: https://doi.org/10.1096/fj.15-279554. 

Clayton JA, Collins FS. 2014. Policy: NIH to balance sex in cell and animal studies. Nature 509:282–283. DOI: https://doi.org/10.1038/509282a, PMID: 24834516. 

Cleasby IR, Nakagawa S. 2011. Neglected biological patterns in the residuals. Behavioral Ecology and Sociobiology 65:2361–2372. DOI:https://doi.org/10.1007/s00265-011-1254-7. 

Colchero F, Aliaga AE, Jones OR, Conde DA. 2017. Individual heterogeneity determines sex differences in mortality in a monogamous bird with reversed sexual dimorphism. Journal of Animal Ecology 86:899–907. DOI: https://doi.org/10.1111/1365-2656.12677, PMID: 28393353.

Cuervo JJ, Møller AP. 1999. Phenotypic variation and fluctuating asymmetry in sexually dimorphic feather ornaments in relation to sex and mating system. Biological Journal of the Linnean Society 68:505–529. DOI: https://doi.org/10.1111/j.1095-8312.1999.tb01186.x. 

Cuervo JJ, Møller AP. 2001. Components of phenotypic variation in avian ornamental and non-ornamental feathers. Evolutionary Ecology 15:53–72. DOI: https://doi.org/10.1023/A:1011913804309. 

Darwin C. 1871. The descent of man and Selection in Relation to Sex. Journal of Anatomy and Physiology 5:363– 372.  

Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, White JK, Meehan TF, Weninger WJ, Westerberg H, Adissu H, Baker CN, Bower L, Brown JM, Caddle LB, Chiani F, Clary D, Cleak J, Daly MJ, Denegre JM, Doe B, et al. 2016. High-throughput discovery of novel developmental phenotypes. Nature 537:508–514. DOI: https:// doi.org/10.1038/nature19356. 

Dorris DM, Cao J, Willett JA, Hauser CA, Meitzen J. 2015. Intrinsic excitability varies by sex in prepubertal striatal medium spiny neurons. Journal of Neurophysiology 113:720–729. DOI: https://doi.org/10.1152/jn. 00687.2014, PMID: 25376786. 

Fisher Z, Tipton E, Zhipeng H, Fisher MZ. 2017. robumeta. 2.0. Robust Variance Meta-Regression. https://cran.r- project.org/web/packages/robumeta/index.html. 
Flanagan KL. 2014. Sexual dimorphism in biomedical research: a call to analyse by sex. Transactions of the Royal Society of Tropical Medicine and Hygiene 108:385–387. DOI: https://doi.org/10.1093/trstmh/tru079, PMID: 24 934286. 

Foltin RW, Evans SM. 2018. Sex differences in the anorexigenic effects of dexfenfluramine and amphetamine in baboons. Experimental and Clinical Psychopharmacology 26:335–340. DOI: https://doi.org/10.1037/ pha0000201, PMID: 29792471. 

Fritz A, Amrein I, Wolfer DP. 2017. Similar reliability and equivalent performance of female and male mice in the open field and water-maze place navigation task. American Journal of Medical Genetics Part C: Seminars in Medical Genetics 175:380–391. DOI: https://doi.org/10.1002/ajmg.c.31565. 

Grossman C. 1989. Possible underlying mechanisms of sexual dimorphism in the immune response, fact and hypothesis. Journal of Steroid Biochemistry 34:241–251. DOI: https://doi.org/10.1016/0022-4731(89)90088-5, PMID: 2696846.  

Hansen TF, Houle D. 2008. Measuring and comparing evolvability and constraint in multivariate characters. Journal of Evolutionary Biology 21:1201–1219. DOI: https://doi.org/10.1111/j.1420-9101.2008.01573.x, PMID: 18662244. 

Hedges LV, Nowell A. 1995. Sex differences in mental test scores, variability, and numbers of high-scoring individuals. Science 269:41–45. DOI: https://doi.org/10.1126/science.7604277, PMID: 7604277. 

Higgins JP. 2019. Cochrane Handbook for Systematic Reviews of Interventions. John Wiley & Sons. DOI: https:// doi.org/10.1002/9781119536604. 

Ingvorsen C, Karp NA, Lelliott CJ. 2017. The role of sex and body weight on the metabolic effects of high-fat diet in C57BL/6N mice. Nutrition & Diabetes 7:e261. DOI: https://doi.org/10.1038/nutd.2017.6, PMID: 283 94359. 

Itoh Y, Arnold AP. 2015. Are females more variable than males in gene expression? Meta-analysis of microarray datasets. Biology of Sex Differences 6:8. DOI: https://doi.org/10.1186/s13293-015-0036-8, PMID: 26557976. 

Johnson W, Carothers A, Deary IJ. 2008. Sex differences in variability in general intelligence: a new look at theold question. Perspectives on Psychological Science 3:518–531. DOI: https://doi.org/10.1111/j.1745-6924.2008.00096.x, PMID: 26158978. 

Karp NA, Mason J, Beaudet AL, Benjamini Y, Bower L, Braun RE, Brown SDM, Chesler EJ, Dickinson ME,Flenniken AM, Fuchs H, Angelis MH, Gao X, Guo S, Greenaway S, Heller R, Herault Y, Justice MJ, Kurbatova N, Lelliott CJ, et al. 2017. Prevalence of sexual dimorphism in mammalian phenotypic traits. Nature Communications 8:15475. DOI: https://doi.org/10.1038/ncomms15475, PMID: 28650954. 

Klein SL, Schiebinger L, Stefanick ML, Cahill L, Danska J, de Vries GJ, Kibbe MR, McCarthy MM, Mogil JS, Woodruff TK, Zucker I. 2015. Opinion: sex inclusion in basic research drives discovery. Proceedings of the National Academy of Sciences 112:5257–5258. DOI: https://doi.org/10.1073/pnas.1502843112, PMID: 25 902532. 
Lehre AC, Lehre KP, Laake P, Danbolt NC. 2009. Greater intrasex phenotype variability in males than in females is a fundamental aspect of the gender differences in humans. Developmental Psychobiology 51:198–206. DOI: https://doi.org/10.1002/dev.20358, PMID: 19031491. 

Lemaˆıtre JF, Ronget V, Tidiere M, Allaine ́ D, Berger V, Cohas A, Colchero F, Conde DA, Garratt M, Liker A, Marais GAB, Scheuerlein A, Sze ́ kely T, Gaillard JM. 2020. Sex differences in adult lifespan and aging rates of mortality across wild mammals. Proceedings of the National Academy of Sciences 117:8546–8553.
DOI: https://doi.org/10.1073/pnas.1911999117, PMID: 32205429.  

Lindstrom J, Kokko H. 1998. Sexual reproduction and population dynamics: the role of polygyny and demographic sex differences. Proceedings of the Royal Society of London. Series B: Biological Sciences 265: 483–488. DOI: https://doi.org/10.1098/rspb.1998.0320. 

Mogil JS, Chanda ML. 2005. The case for the inclusion of female subjects in basic science studies of pain. Pain 117:1–5. DOI: https://doi.org/10.1016/j.pain.2005.06.020, PMID: 16098670. 

Nakagawa S, Poulin R, Mengersen K, Reinhold K, Engqvist L, Lagisz M, Senior AM. 2015. Meta-analysis of variation: ecological and evolutionary applications and beyond. Methods in Ecology and Evolution 6:143–152. DOI: https://doi.org/10.1111/2041-210X.12309. 

Nakagawa S, Noble DW, Senior AM, Lagisz M. 2017. Meta-evaluation of meta-analysis: ten appraisal questions for biologists. BMC Biology 15:18. DOI: https://doi.org/10.1186/s12915-017-0357-7, PMID: 28257642. 

Nakagawa S, Samarasinghe G, Haddaway NR, Westgate MJ, O’Dea RE, Noble DWA, Lagisz M. 2019. Research weaving: visualizing the future of research synthesis. Trends in Ecology & Evolution 34:224–238. DOI: https:// doi.org/10.1016/j.tree.2018.11.007, PMID: 30580972. 

Nakagawa S, Santos ESA. 2012. Methodological issues and advances in biological meta-analysis. Evolutionary Ecology 26:1253–1274. DOI: https://doi.org/10.1007/s10682-012-9555-5. 

NIH. 2015a. Consideration of Sex as a Biological Variable in NIH-Funded Research, Notice NOT-OD-102: National Institutes of Health.

NIH. 2015b. Enhancing Reproducibility Through Rigor and Transparency, Notice NOT-OD-103: National Institutes of Health.  

Nowogrodzki A. 2017. Inequality in medicine. Nature 550:S18–S19. DOI: https://doi.org/10.1038/550S18a Pomiankowski A, Moller AP. 1995. A resolution of the lek paradox. Proceedings of the Royal Society of London. Series B, Containing Papers of a Biological Character 260:21–29. DOI: https://doi.org/10.1098/rspb.1995.0054. 

Prendergast BJ, Onishi KG, Zucker I. 2014. Female mice liberated for inclusion in neuroscience and biomedical research. Neuroscience & Biobehavioral Reviews 40:1–5. DOI: https://doi.org/10.1016/j.neubiorev.2014.01.001, PMID: 24456941. 

R Development Core Team. 2017. R: A Language and Environment for Statistical Computing. Vienna, Austria, R Foundation for Statistical Computing. http://www.r-project.org. 

Reinhold K, Engqvist L. 2013. The variability is in the sex chromosomes. Evolution 67:3662–3668. DOI: https:// doi.org/10.1111/evo.12224. 
Robinson CM, Wang Y, Pfeiffer JK. 2017. Sex-Dependent intestinal replication of an enteric virus. Journal of Virology 91:e02101-16. DOI: https://doi.org/10.1128/JVI.02101-16, PMID: 28100612. 

Rowe L, Houle D. 1996. The lek paradox and the capture of genetic variance by condition dependent traits. Proceedings of the Royal Society B: Biological Sciences 263:1415–1421. DOI: https://doi.org/10.1098/rspb. 1996.0207. 

Senior AM, Viechtbauer W, Nakagawa S. 2020. Revisiting and expanding the meta-analysis of variation: the log coefficient of variation ratio. Research Synthesis Methods 11:1423:553–567. DOI: https://doi.org/10.1002/jrsm. 1423.  

Shansky RM, Woolley CS. 2016. Considering sex as a biological variable will be valuable for neuroscience research. The Journal of Neuroscience 36:11817–11822. DOI: https://doi.org/10.1523/JNEUROSCI.1390-16. 2016, PMID: 27881768.  

Shaqiri A, Roinishvili M, Grzeczkowski L, Chkonia E, Pilz K, Mohr C, Brand A, Kunchulia M, Herzog MH. 2018. Sex-related differences in vision are heterogeneous. Scientific Reports 8:8. DOI: https://doi.org/10.1038/ s41598-018-25298-8, PMID: 29760400. 

Smarr BL, Grant AD, Zucker I, Prendergast BJ, Kriegsfeld LJ. 2017. Sex differences in variability across timescales in BALB/c mice. Biology of Sex Differences 8:3. DOI: https://doi.org/10.1186/s13293-016-0125-3, PMID: 2 8203366. 

Tannenbaum C, Ellis RP, Eyssel F, Zou J, Schiebinger L. 2019. Sex and gender analysis improves science and engineering. Nature 575:137–146. DOI: https://doi.org/10.1038/s41586-019-1657-6, PMID: 31695204. 

Thompson LP, Chen L, Polster BM, Pinkas G, Song H. 2018. Prenatal hypoxia impairs cardiac mitochondrial and ventricular function in guinea pig offspring in a sex-related manner. American Journal of Physiology. Regulatory, Integrative and Comparative Physiology 315:R1232–R1241. DOI: https://doi.org/10.1152/ajpregu. 00224.2018, PMID: 30365351. 

Tomkins JL, Radwan J, Kotiaho JS, Tregenza T. 2004. Genic capture and resolving the lek paradox. Trends in Ecology & Evolution 19:323–328. DOI: https://doi.org/10.1016/j.tree.2004.03.029, PMID: 16701278. 

Viechtbauer W. 2010. Conducting Meta-Analyses in R with the metafor Package. Journal of Statistical Software 36:i03. DOI: https://doi.org/10.18637/jss.v036.i03. 
Voelkl B, Altman NS, Forsman A, Forstmeier W, Gurevitch J, Jaric I, Karp NA, Kas MJ, Schielzeth H, Van de Casteele T, Wu ̈ rbel H. 2020. Reproducibility of animal research in light of biological variation. Nature Reviews Neuroscience 21:384–393. DOI: https://doi.org/10.1038/s41583-020-0313-3. 

Wagner H, Fink BA, Zadnik K. 2008. Sex- and gender-based differences in healthy and diseased eyes. Optometry - Journal of the American Optometric Association 79:636–652. DOI: https://doi.org/10.1016/j.optm.2008.01. 024, PMID: 19811761. 

Webster MM, Rutz C. 2020. How STRANGE are your study animals? Nature 582:337–340. DOI: https://doi.org/ 10.1038/d41586-020-01751-5, PMID: 32541916. 

Zucker I, Prendergast BJ. 2020. Sex differences in pharmacokinetics predict adverse drug reactions in women. Biology of Sex Differences 11:32. DOI: https://doi.org/10.1186/s13293-020-00308-5, PMID: 32503637. 

Zuk M, McKean KA. 1996. Sex differences in parasite infections: patterns and processes. International Journal for Parasitology 26:1009–1024. DOI: https://doi.org/10.1016/S0020-7519(96)80001-4, PMID: 8982783.