In this tutorial we will explore and analyse the results of kraken2 read profiling data using the re-estimatd abundances of bracken. These exercises are meant to show how to conceptually approach your data analysis but there are many more and different ways to explore your data. The most important thing to keep in mind is that you have to understand your own data and analyses. One way to achieve this is to perform visual explorations that help you to judge whether the data are appropriate for your question.
Now let’s start the fun!
library(tidyverse)
library(vegan)
library(coin)
library(pheatmap)
First we need to load our data. Usually the biggest bottleneck between raw data and analyses is to get the data in the right shape for your purpose. Often this requires a little bit of data mingling. On this road - google is your best friend to master the R universe :)
Let’s first load the relative abundance table of the bracken results.
brel<-read_csv(file = "merged_rel_abund_extended.csv")
Then we shorten the column names to just keep the sample IDs.
colnames(brel)<-gsub(".breport", "", colnames(brel))
And finally, we are replacing all NAs with 0. This is because by meging the samples into a single dataframes we introduced these NAs.
brel[is.na(brel)] <- 0
Let’s check how the dataframe (tibble) looks like
head(brel)
## # A tibble: 6 x 35
## taxlevel taxid taxa Sample9 Sample8 Sample7 Sample6 Sample5 Sample4 Sample32
## <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 R 1.00e0 root 100 100 100 100 100 100 100
## 2 R1 1.32e5 cell… 100 100 100 100 100 100 100
## 3 D 2.00e0 Bact… 99.8 99.8 99.1 98.9 99.1 99.2 98.9
## 4 D1 1.78e6 FCB … 72.8 76.0 35.3 77.3 65.2 57.4 85.0
## 5 D2 6.83e4 Bact… 72.8 76.0 35.3 77.3 65.2 57.4 85.0
## 6 P 9.76e2 Bact… 72.8 76.0 35.3 77.3 65.2 57.4 85.0
## # … with 25 more variables: Sample31 <dbl>, Sample30 <dbl>, Sample3 <dbl>,
## # Sample29 <dbl>, Sample28 <dbl>, Sample27 <dbl>, Sample26 <dbl>,
## # Sample25 <dbl>, Sample24 <dbl>, Sample23 <dbl>, Sample22 <dbl>,
## # Sample21 <dbl>, Sample20 <dbl>, Sample2 <dbl>, Sample19 <dbl>,
## # Sample18 <dbl>, Sample17 <dbl>, Sample16 <dbl>, Sample15 <dbl>,
## # Sample14 <dbl>, Sample13 <dbl>, Sample12 <dbl>, Sample11 <dbl>,
## # Sample10 <dbl>, Sample1 <dbl>
Looking good!
Since we want to get most from our data and ideally interpret the patterns, we need to import the metadata.
meta<-read_delim(file="./metadata_extended.csv", delim="\t")
head(meta)
## # A tibble: 6 x 5
## Sample group MouseID timepoint mocktreat
## <chr> <chr> <chr> <chr> <chr>
## 1 Sample18 Yp 1 Pre-Abx A
## 2 Sample1 Yp 8 Pre-Abx A
## 3 Sample2 Yc 2 Pre-Abx B
## 4 Sample3 Yc 5 Pre-Abx A
## 5 Sample4 Yc 8 Pre-Abx B
## 6 Sample5 Pools old feac water Pre-Abx B
Before we start anything, let’s just check out or data a little bit. Never go blind into your analyses.
Q1: How many samples do we have in the matadata and in out data tibble
nrow(meta)
## [1] 32
# excluding the columns not referring to samples
ncol(brel%>%select(-c("taxlevel","taxid","taxa")))
## [1] 32
A1: We have 32 samples in metadata and data tibble. This means we are good to go.
Since the report file contains all taxonomic levels - this is not really easy to understand. Let’s explore how many taxa we have at each taxonomic level. To do that we will first specify the taxonomic levels we are interested in.
# number of taxa for each taxon level
taxcat<-c("S","G","F","O","C","P","D")
The we will use sapply to iterate a function over our data for each taxonomic level. To understand sapply better check out ?sapply
.
Q2: How many families were detected in our dataset?
taxcount<-sapply(taxcat, function(tax){
nrow(subset(brel, taxlevel == tax) %>% # subset only rows that correspond to tax level
select(-c("taxid","taxlevel")) %>% # remove the columns taxid and taxlevel and only keep the rest
filter(rowSums(select_if(., is.numeric)) != 0)) # remove all taxa that are 0 in all samples
})
# check out the number of taxa per taxonomic level
taxcount
## S G F O C P D
## 85 48 29 16 12 6 2
A2: From the output of taxcount
we see that we have 29 families detected
Now that we know a little bit about our data we can start selecting what we want to look at. Let’s extract a separate tibble for species, family and phylum level. At each tibble we remove all the taxa that are not present in any of the samples.
Let’s start with species
brel_spec<-subset(brel, taxlevel == "S") %>%
select(-c("taxid","taxlevel")) %>%
filter(rowSums(select_if(., is.numeric)) != 0)
head(brel_spec)
## # A tibble: 6 x 33
## taxa Sample9 Sample8 Sample7 Sample6 Sample5 Sample4 Sample32 Sample31
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Dunc… 12.6 9.34 3.08 20.7 12.9 14.9 26.4 19.6
## 2 Dunc… 3.13 2.54 3.76 1.08 0.9 1.02 1.72 3.12
## 3 Dunc… 11.0 8.68 4.15 20.1 12.2 12.6 14.8 5.46
## 4 Muri… 17.7 25.1 5.38 10.4 7.53 6.81 9.27 13.5
## 5 Muri… 23.2 22.8 6.06 16.4 23.4 15.0 25.4 11.0
## 6 Bact… 1.51 2.83 3.27 2.99 0.34 0.43 0.17 4.48
## # … with 24 more variables: Sample30 <dbl>, Sample3 <dbl>, Sample29 <dbl>,
## # Sample28 <dbl>, Sample27 <dbl>, Sample26 <dbl>, Sample25 <dbl>,
## # Sample24 <dbl>, Sample23 <dbl>, Sample22 <dbl>, Sample21 <dbl>,
## # Sample20 <dbl>, Sample2 <dbl>, Sample19 <dbl>, Sample18 <dbl>,
## # Sample17 <dbl>, Sample16 <dbl>, Sample15 <dbl>, Sample14 <dbl>,
## # Sample13 <dbl>, Sample12 <dbl>, Sample11 <dbl>, Sample10 <dbl>,
## # Sample1 <dbl>
Can you come up with the command for family (F) and phylum (P) level? Tip: make sure you save it to a different variable. Otherwise you are overwriting your species data.
brel_fam<-subset(brel, taxlevel == "F") %>%
select(-c("taxid","taxlevel")) %>%
filter(rowSums(select_if(., is.numeric)) != 0)
brel_phy<-subset(brel, taxlevel == "P") %>%
select(-c("taxid","taxlevel")) %>%
filter(rowSums(select_if(., is.numeric)) != 0)
Let’s plot and explore microbial composition and relative abundances in our samples by visualizing them.
As a first overview of coarse differences we can create a stacked bar plot of phyla. We need to mingle our data structure a bit to make it compliant with ggplot and add the metadata to get an extra level of information.
# data mingling
brel_phy_gg1 <- brel_phy %>% gather(key="Sample",value="rel_abun",-taxa)
# now join with metadata by column 'Sample'. We are using left join in case the metadata file contains additional samples not included in our dataset
brel_phy_gg1 <- left_join(brel_phy_gg1,meta,by="Sample")
# check how it looks
head(brel_phy_gg1)
## # A tibble: 6 x 7
## taxa Sample rel_abun group MouseID timepoint mocktreat
## <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
## 1 Bacteroidetes Sample9 72.8 O 616 Sch1 A
## 2 Firmicutes Sample9 24.4 O 616 Sch1 A
## 3 Actinobacteria Sample9 0.18 O 616 Sch1 A
## 4 Proteobacteria Sample9 2.38 O 616 Sch1 A
## 5 Verrucomicrobia Sample9 0.02 O 616 Sch1 A
## 6 Chordata Sample9 0.21 O 616 Sch1 A
# bar plot phyla relative abundances
ggplot(brel_phy_gg1, aes(x=Sample, y=rel_abun)) +
geom_bar(stat="identity",position="stack", aes(fill=taxa)) + # chose bar plot
theme(axis.text.x = element_text(angle=45, hjust=1)) + # put x-axis label at 45 degree angle
facet_grid(. ~ mocktreat, scales="free_x",space = "free_x") # produce two panels according to metatadata category 'mocktreat'
Q3: Do the phyla profiles look similar between samples? Can you spot any trends?
A3: We see that the same two Phyla dominate all samples but there is some variability between samples. From a first look there does not seem to be a major difference between the two groups A and B at this taxonomic level.
If you have time you can also visualize the other taxonomic levels (e.g. species) with the same approach. Try to come up with the code yourself. Hint: Omit legend using legend.position
.
# bar plot phyla relative abundances
brel_spec_gg1 <- brel_spec %>% gather(key="Sample",value="rel_abun",-taxa)
brel_spec_gg1 <- left_join(brel_spec_gg1,meta,by="Sample")
ggplot(brel_spec_gg1, aes(x=Sample, y=rel_abun)) +
geom_bar(stat="identity",position="stack", aes(fill=taxa)) +
theme(axis.text.x = element_text(angle=45, hjust=1),
legend.position = "none") +
facet_grid(. ~ mocktreat, scales="free_x",space = "free_x")
Q4: Does the profile change according to taxonomic level? Is the stacked bar plot helpful in all scenarios?
A4: When too many taxa are present, such as at species level, it becomes difficult to distinguish the colors.
As you might realize, when there are too many taxa it becomes very difficult to spot anything in the stacked bar plot. Another way to visualize the relative abundances is by creating a bubble plot. Let’s do that for the family composition.
# bubble plot
brel_fam_gg1 <- brel_fam %>% gather(key="Sample",value="rel_abun",-taxa)
brel_fam_gg1 <- left_join(brel_fam_gg1,meta,by="Sample")
ggplot(brel_fam_gg1, aes(x=Sample, y=taxa)) +
geom_point(aes(size=rel_abun, color=timepoint), alpha=0.7) + # this time we use points
theme(axis.text.x = element_text(angle=45, hjust=1)) +
scale_size_continuous(limits = c(0.00001,max(brel_phy_gg1$rel_abun))) + # sets minimum above '0'
facet_grid(. ~ mocktreat, scales="free_x",space = "free_x")
## Warning: Removed 310 rows containing missing values (geom_point).
Q5: Did you notice that here we added an extra level of information? Can you spot what it is?
A5: Now we produced panels according to mocktreat and colored according to time point. Like this you can see that we only have two time points in one of the groups. This plot allows us therefore to combine multiple metadata layers.
If you have time you can play with the different information levels and see what this can tell you about your data.
When we want to look at species composition, the communities are often very complex and looking at abundance profiles is not as informative. One way to explore species-level community composition is to filter you data so you focus only on the most abundant taxa across samples. What cutoff and approach you use here of course depends on your question and data.
Let’s create a heatmap of the 20 most abundant spexies in our samples.
# sort species by abundance across samples and select top 20
brel_spec_gg1<-brel_spec[order(rowSums(select_if(brel_spec, is.numeric)),decreasing=T),] %>%
head(20) %>%
column_to_rownames("taxa")
# shape the metadata
meta_h <- subset(meta, Sample %in% colnames(brel_spec_gg1)) %>%
column_to_rownames("Sample") # shoft the 'Sample' column to rownames
# plot the heatmap
pheatmap::pheatmap(brel_spec_gg1,
cluster_rows = TRUE,
cluster_cols = TRUE,
annotation_col = meta_h[,c(1,3,4)],
annotation_names_col=TRUE)
Q6: what can you learn from the heatmap. Are there any informative clusters?
A6: We can see that 4 species dominate the communities in most samples. Adding the metadata we can also see that data do not cluster strongly according to group or time point, but there is some degree of structuring in mocktreat.
Often we want to know whether the microbiomes are different between conditions or groups. One way to explore this is to look at the beta-diversity in an ordination. There are different distances and approaches that can be done and explored. We will perform an NMDS on bray curtis dissimilarities of the species profiles.
# To ensure reproducibility we can fix the seed here. This will ensure you always get the same result each time you run your data.
set.seed(34521)
# Data mingling
brel_spec_gg2 <- brel_spec %>%
column_to_rownames("taxa") %>%
t() # transpose
# Calculate distance matrix
brel_spec_gg2_dist <- vegdist(brel_spec_gg2, method = "bray")
# Perform NMDS on distance matrix
nmds_spec<-metaMDS(brel_spec_gg2_dist,distance = "bray",k = 2)
## Run 0 stress 0.14795
## Run 1 stress 0.1795765
## Run 2 stress 0.1936137
## Run 3 stress 0.2378564
## Run 4 stress 0.14795
## ... Procrustes: rmse 4.309274e-05 max resid 0.0002094073
## ... Similar to previous best
## Run 5 stress 0.1644519
## Run 6 stress 0.169285
## Run 7 stress 0.1835296
## Run 8 stress 0.1920566
## Run 9 stress 0.14795
## ... Procrustes: rmse 4.490045e-05 max resid 0.0002129123
## ... Similar to previous best
## Run 10 stress 0.1839323
## Run 11 stress 0.14795
## ... Procrustes: rmse 4.075948e-05 max resid 0.0001818523
## ... Similar to previous best
## Run 12 stress 0.1835298
## Run 13 stress 0.14795
## ... New best solution
## ... Procrustes: rmse 1.20264e-05 max resid 5.395265e-05
## ... Similar to previous best
## Run 14 stress 0.1692845
## Run 15 stress 0.2115505
## Run 16 stress 0.1792405
## Run 17 stress 0.14795
## ... Procrustes: rmse 1.608663e-05 max resid 7.697304e-05
## ... Similar to previous best
## Run 18 stress 0.14795
## ... Procrustes: rmse 3.475153e-05 max resid 0.0001652939
## ... Similar to previous best
## Run 19 stress 0.1908263
## Run 20 stress 0.1782935
## *** Solution reached
# Check the output
nmds_spec
##
## Call:
## metaMDS(comm = brel_spec_gg2_dist, distance = "bray", k = 2)
##
## global Multidimensional Scaling using monoMDS
##
## Data: brel_spec_gg2_dist
## Distance: bray
##
## Dimensions: 2
## Stress: 0.14795
## Stress type 1, weak ties
## Two convergent solutions found after 20 tries
## Scaling: centring, PC rotation
## Species: scores missing
Here you see a kind of summary of the analysis. For example, you can see that you used 2 dimensions and the stress was approx. 0.15. In general if a stress is above 0.2 then the clustering is not reliably representing the data and should be interpreted with caution. But here the stress is below 0.2, so we are okay.
Now let’s look at the ordination. To plot the data with ggplot, we need to extract the coordinaties of each point from nmds_spec$points
.
# Extract and reshape the data to plot ordination as ggplot and add the metadata
nmds_spec_gg<-as.data.frame(nmds_spec$points) %>%
rownames_to_column("Sample") %>%
left_join(meta, by="Sample")
Then we can greate the plot easily and color according to the metadata. We are choosing timepoint
and mocktreat
for the coloring respectively. But feel free to explore other parameters.
# Let's plot and color according to time point
ggplot(nmds_spec_gg, aes(x=MDS1,y=MDS2)) +
geom_point(aes(color=timepoint), size=3, alpha=0.5) +
ggtitle("NMDS colored according to timepoint")
# Now let's plot according to mocktreat
ggplot(nmds_spec_gg, aes(x=MDS1,y=MDS2)) +
geom_point(aes(color=mocktreat), size=3, alpha=0.5)+
ggtitle("NMDS colored according to mocktreat")
Q7: Can you spot clusters? Do they correspond to metadata groupings or conditions?
A7: The timepoints are not well separated. But we also have a lot more data from one of the two groups. The groups of mocktreat data appear nicely in two clusters.
If we want to test whether the variables significantly explain some of the variance we see, we can perform an Permutational analysis of variance
(PERMANOVA) using the adonis2
package. Check out ?adonis2
is you want to learn more.
To test the overall effect of the timepoint
we can run the following
set.seed(34521) # set seed for reproducibility
# PERMANOVA
adonis2(brel_spec_gg2_dist ~ timepoint, data = as.data.frame(nmds_spec_gg))
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = brel_spec_gg2_dist ~ timepoint, data = as.data.frame(nmds_spec_gg))
## Df SumOfSqs R2 F Pr(>F)
## timepoint 1 0.12839 0.06337 2.0296 0.068 .
## Residual 30 1.89778 0.93663
## Total 31 2.02617 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How would you run the PERMANOVA for mocktreat
?
set.seed(34521)
adonis2(brel_spec_gg2_dist ~ mocktreat, data = as.data.frame(nmds_spec_gg))
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = brel_spec_gg2_dist ~ mocktreat, data = as.data.frame(nmds_spec_gg))
## Df SumOfSqs R2 F Pr(>F)
## mocktreat 1 0.28407 0.1402 4.8918 0.001 ***
## Residual 30 1.74210 0.8598
## Total 31 2.02617 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Q8: Explore both PERMANOVA outputs. Does any of the two factors timepoint and mocktreat significantly explain the variance observed in the data?
A8: The timepoints do not significantly explain any of the variance. The mocktreat does significantly explain approx. 14% of the variance.
Differential abundance analysis is a huge discussion field in the statistical community and we cannot cover this in this course. We will give you an example of a very simple non-parametric approach. However, most of the times your data are much more complex, which will require you to use more sophisticated methods. It is important to discuss this with a statistician or someone with the statistical knowledge about these time of anaylsis. Most importantly, if you want to perform a differential abundance analysis you need to start planning your experimental set up accordingly from the beginning and include proper controls. Always make sure you understand what you are doing, to be able to judge the aproprietness of the method and interpret the results correctly.
First we need to extract the species abundance table.
# create a df with all taxa and all variables
taxtab<-as.data.frame(t(brel_spec)) # transpose the data
names<-as.character(unlist(taxtab["taxa",])) # save column names to variable
taxtab<-taxtab[!rownames(taxtab) %in% "taxa", ] # only keep rows with sample names
sampnames<-rownames(taxtab) # save rownames to variable
taxtab<-sapply(taxtab, as.character) # transform class of values to characters
taxtab<-as.data.frame(matrix(as.numeric(taxtab), ncol = ncol(taxtab))) # transform to numeric matrix
names(taxtab)<-names # add back the column names
rownames(taxtab)<-sampnames # add back the rownames
# add metadata
taxtab<-taxtab %>%
rownames_to_column("Sample") %>%
left_join(meta, by="Sample") %>%
mutate_if(is.character,as.factor) # make sure variables are factors
Now let’s pick one taxon Muribaculum intestinale
and check how the data are distributed.
Q9: Are the data normally distributed?
We can quickly check this for one taxon.
# create a Q-Q plot
qqnorm((taxtab[,3]))
qqline((taxtab[,3]), col = "steelblue")
# plot the data distribution
ggplot(taxtab, aes(x=`Muribaculum intestinale`)) +
geom_density(alpha = 0.5)
A9: The data of Muribaculum intestinale are clearly not normally distributed as can be seen from the distribution and the Q-Q plot.
Microbiome data often are not normally distributed, which means assumptions like data normality may not be appropriate. One possibility is to perform a non-parametric test which does not assume normailty of the data. For two-group comparisons this can be a Mann-Whitney-Wilcoxon test (wilcoxon) and for more than 2 groups you could use a Kruskal-Wallis test. These tests can be paired if this information (e.g. the same individual before and after treatment) is avaible but in our case we don’t have this Information.
Let’s practise and perform a wilcoxon test from the coin
package on one taxon.
# run wilcoxon test
wtsingle <- coin::wilcox_test(taxtab[,"Muribaculum intestinale"] ~ mocktreat, data = taxtab, confint=TRUE)
# get the p-value
coin::pvalue(wtsingle)
## [1] 0.4168496
Q10: Is this taxon significantly different between timepoints?
A10: No, this taxon is not significantly different between timepoints (p-value > 0.05).
Why is this the case? Let’s look at the box plot to see.
Mint <- taxtab %>%
select(c("Muribaculum intestinale","mocktreat")) %>%
gather(key=taxa,value = rel_abund,-mocktreat)
ggplot(Mint, aes(x=mocktreat,y=rel_abund)) +
geom_boxplot(aes(color=mocktreat)) +
geom_jitter(aes(x=mocktreat,color=mocktreat))
First we create an empty dataframe to collect the results from the multiple tests for each species.
# save all taxa names
taxnames<-brel_spec$taxa
# empty dataframe, setting up all required columns
df_w<-data.frame(pvalue= vector(length = length(taxnames)),
padjust = vector(length = length(taxnames)))
rownames(df_w)<-taxnames # set rownames to taxa
Then we run the wilcoxon test in a for-loop
for each species and save the p-value to our dataframe.
for (taxa in taxnames) {
wt <- coin::wilcox_test(taxtab[,taxa] ~ mocktreat, data = taxtab)
df_w[taxa,"pvalue"] <- coin::pvalue(wt)
}
Since now we performed many statistical tests, we need to adjust our p-value for multiple testing. Find out more here ?p.adjust
. Here we do this according to the Benjamini-Hochberg (BH) correction.
# Correct p-values for multiple testing (using BH)
df_w$padjust <- p.adjust(df_w$pvalue, method = "BH")
head(df_w)
## pvalue padjust
## Duncaniella sp. C9 0.664091243 0.77325693
## Duncaniella sp. B8 0.039547249 0.15931022
## Duncaniella dubosii 0.416806869 0.56666797
## Muribaculum intestinale 0.416849571 0.56666797
## Muribaculum gordoncarteri 0.002366559 0.02235084
## Bacteroides caecimuris 0.075761641 0.21671976
You can see that we have unadjusted and adjusted p-values for each species.
Q11: How many species were differentially abundant after p-vajue correction? Which species?
# only significant taxa
df_w_sig <- subset(df_w, padjust<0.05)
head(df_w_sig)
## pvalue padjust
## Muribaculum gordoncarteri 0.0023665591 0.02235084
## Acutalibacter muris 0.0006311980 0.01660328
## Faecalibacterium prausnitzii 0.0008833756 0.01660328
## Flavonifractor plautii 0.0009766638 0.01660328
## Lachnoclostridium phocaeense 0.0045816321 0.03894387
## Blautia producta 0.0004400265 0.01660328
sigtax<-rownames(df_w_sig)
sigtax
## [1] "Muribaculum gordoncarteri" "Acutalibacter muris"
## [3] "Faecalibacterium prausnitzii" "Flavonifractor plautii"
## [5] "Lachnoclostridium phocaeense" "Blautia producta"
## [7] "Blautia sp. SC05B48" "Enterococcus faecium"
## [9] "[Clostridium] innocuum" "Adlercreutzia equolifaciens"
A11: 10 species are significantly different between the two mocktreats: Muribaculum gordoncarteri, Acutalibacter muris, Faecalibacterium prausnitzii, Flavonifractor plautii, Lachnoclostridium phocaeense, Blautia producta, Blautia sp. SC05B48, Enterococcus faecium, [Clostridium] innocuum, Adlercreutzia equolifaciens
Now plot the significantly different species as boxplot to see the shift.
sigtax<-taxtab %>%
select(c(all_of(sigtax),mocktreat)) %>%
gather(key=taxa,value = rel_abund,-mocktreat)
ggplot(sigtax, aes(x=mocktreat,y=rel_abund)) +
geom_boxplot(aes(color=mocktreat)) +
theme(axis.text.x = element_text(angle=45,hjust = 1)) +
geom_jitter(aes(color=mocktreat), alpha=0.5, size=2) +
ggtitle("Significantly different taxa") +
facet_grid(. ~ taxa)
Q12: What do you see in the box plots?
A12: The boxplots show the shift in relative abundance between mocktreats. For most taxa it is obvious that despite a significant p-value the shift in relative abundance is extremely minor and it may be questionable whether it reflects biologically meaningful results. Acutalibacter muris and Muribaculum gordoncarteri instead shows a stronger effect and may be more interesting to explore further.