The data set, studentdata, was obtained from the LearnBayes package in R. The data contain 657 observations and 11 variables, including height, gender, and the number of shoes owned by each student, among others. For this blog, I randomly sampled 100 observations from the data set; the goal is to compare the number of shoes owned by males and females. Of the 100 observations, 38 are males and 62 are females.
First, I constructed a parallel one-dimensional scatterplot of the number of shoes owned, split by gender, to compare the number of shoes owned by females with the number of shoes owned by males. The one-dimensional scatterplot helps to show the individual distributions of the male and female values.
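For readers who want to reproduce the sampling step, here is a minimal sketch in Python (the analysis itself was done in R). The seed and the names `N_ROWS`, `SAMPLE_SIZE`, and `sample_row_indices` are my own assumptions. The post does not record which seed, if any, was used, so a rerun should not be expected to give the same 38/62 split.

```python
import random

N_ROWS = 657       # observations in studentdata, as stated above
SAMPLE_SIZE = 100  # size of the random sample used in this post

def sample_row_indices(n_rows, size, seed=None):
    """Draw `size` distinct zero-based row indices, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(range(n_rows), size))

# seed=1 is an arbitrary, hypothetical choice
rows = sample_row_indices(N_ROWS, SAMPLE_SIZE, seed=1)
print(len(rows), rows[:5])
```

The selected indices would then be used to pull the gender and shoe columns for the 100 sampled students.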

From the graph, females tend to have more shoes than males (which is quite obvious!!!). While the smallest number of shoes owned by a female is 5, males tend to have 10 or fewer shoes. The females have a larger spread than the males, with one female having about 40 shoes. In general, though, the parallel one-dimensional scatterplot does not provide enough information to compare the distributions.
Next, I constructed a parallel quantile plot, which is often more effective for comparing data distributions. The data set is split into two groups, male shoes and female shoes, and the f-values are computed separately for each group. The plot of the number of shoes against the f-values is shown below:

From the graph, we see that the quantile values for females are higher than those for males. For example, the median number of shoes owned by females is 15, compared with about 5 for males. Typically, females have more shoes than males. Comparing the upper quartiles (Q3), 75% of the males have about 8 shoes or fewer, while about 75% of the females have at most 25 shoes. The females also have a larger spread, or variability, in their data than the males.
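To make the f-value step concrete, here is a small Python sketch of how each group's sorted values can be paired with f-values. It assumes the common convention f_i = (i - 0.5) / n for the i-th smallest of n observations; the post does not state which formula was used. The function names and the toy shoe counts are hypothetical and are not taken from the studentdata sample.

```python
def f_values(n):
    """f-value of the i-th smallest of n observations: (i - 0.5) / n (assumed convention)."""
    return [(i - 0.5) / n for i in range(1, n + 1)]

def quantile_pairs(values):
    """Pair each sorted observation with its f-value."""
    ordered = sorted(values)
    return list(zip(f_values(len(ordered)), ordered))

# Hypothetical toy counts, NOT the studentdata sample
male_shoes = [3, 5, 4, 8, 10]
female_shoes = [12, 5, 40, 15, 25, 20]

# f-values are computed separately for each group
for label, group in (("male", male_shoes), ("female", female_shoes)):
    for f, shoes in quantile_pairs(group):
        print(f"{label}: f = {f:.3f}, shoes = {shoes}")
```

Plotting the shoe counts against the f-values, one panel per group, gives the parallel quantile plot. The value at f = 0.5 is the median and the value at f = 0.75 is the upper quartile.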
I also constructed a quantile-quantile (q-q) plot of the male and female values, since it is an effective way to compare the quantiles. Here, we graph the quantiles of the females against the corresponding quantiles of the males. It is a simple but powerful tool for comparing two distributions. The graph is shown below:

The line y = x is overlaid (it is the black line on the graph). All the points are above the line, so I don’t think it is a good idea to summarize the points with this equation. Throughout most of the range of the distribution, the female quantiles are higher than the male quantiles. The corresponding quantiles do not differ by a constant, so it is hard to say by how many more shoes the females own than the males. The medians of the two groups differ, and the high quantiles for females are larger than the high quantiles for males. Also, the higher quantiles differ by more than the lower quantiles.
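Because the two groups have different sizes (38 males and 62 females), the "corresponding" quantiles have to be matched somehow. One common approach, sketched below in Python, is to use the f-values of the smaller group and linearly interpolate the larger group's quantiles at those same f-values. This is an assumption about the method, not a record of what was actually run for the plot; the function names and toy data are hypothetical.

```python
def quantile_at(sorted_values, f):
    """Linearly interpolated quantile at f, assuming f_i = (i - 0.5) / n."""
    n = len(sorted_values)
    pos = f * n - 0.5                   # fractional zero-based position
    pos = max(0.0, min(pos, n - 1.0))   # clamp to the observed range
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

def qq_pairs(x, y):
    """(x quantile, y quantile) pairs at the f-values of the smaller group."""
    sx, sy = sorted(x), sorted(y)
    m = min(len(sx), len(sy))
    fs = [(i - 0.5) / m for i in range(1, m + 1)]
    return [(quantile_at(sx, f), quantile_at(sy, f)) for f in fs]

# Hypothetical toy counts, NOT the studentdata sample
male_shoes = [3, 5, 4, 8, 10]
female_shoes = [12, 5, 40, 15, 25, 20]

pairs = qq_pairs(male_shoes, female_shoes)
for male_q, female_q in pairs:
    # a point lies above the line y = x when the female quantile is larger
    print(male_q, female_q, female_q > male_q)
```

Each pair is one point on the q-q plot, with the male quantile on the horizontal axis and the female quantile on the vertical axis.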
Lastly, I constructed a Tukey mean-difference plot from the quantiles of the two distributions. Graphing the difference between the female and male quantiles against the mean of each pair of quantiles, we see an upward trend. That is, as the mean quantiles increase, the difference between the quantiles of males and females tends to increase as well. From the median to the top quantiles, most of the quantiles for the females are about 15 to 25 shoes higher than those for the males, but the difference decreases to less than 10 shoes for the lower quantiles. This means there is a larger gender difference in the number of shoes at larger mean values. The way the number of shoes owned by males and females differs is complicated, unlike a q-q plot with a simple linear pattern.
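The coordinates of a Tukey mean-difference plot follow directly from the q-q pairs: for each (male quantile, female quantile) pair, the horizontal value is their mean and the vertical value is their difference. A minimal Python sketch, with a hypothetical function name and made-up pairs that are not the values from the sample:

```python
def mean_difference(qq_pairs):
    """Map each (male_q, female_q) pair to (mean, female_q - male_q)."""
    return [((x + y) / 2, y - x) for x, y in qq_pairs]

# Hypothetical (male quantile, female quantile) pairs, NOT from the sample
pairs = [(3, 6), (4, 12), (5, 15), (8, 24), (10, 40)]

for mean, diff in mean_difference(pairs):
    print(f"mean = {mean:.1f}, difference = {diff}")
```

If the two distributions differed only by a constant shift, the differences would sit on a horizontal line. An upward trend like the one described above means the gap grows with the number of shoes.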

From the four graphs, we can conclude that females tend to own more shoes than males. I believe the parallel quantile plot and the Tukey mean-difference plot give the most useful graphical comparison of the shoes owned by males and females in the statistics class. The parallel quantile plot helps to compare the medians and quartiles of the two distributions, and the Tukey mean-difference plot gives further information about the mean and difference between the quantiles to compare the two distributions.