Blog 10: Color

Part A

For this part, I obtained the data set from the Broadway CSV library. The data set describes Broadway shows, grouped over week-long periods. It contains 12 variables, including the name of the production, attendance, year, theater, type (whether it is a “Musical”, “Play”, or “Special”), and so on.

I am interested in comparing the average capacity of Broadway shows as a function of week number for four years: 2011, 2012, 2013 and 2014. Using this information, I constructed a time series plot with all four years on the same graph. The corresponding graph is given below.

The above connected symbol plot compares the average capacity of Broadway shows in each week for four different years. The time series shown for each year is the weekly average capacity, which reflects attendance at the Broadway shows. The connected symbol plot lets us see both the individual data points and their ordering through time.
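
The weekly averages behind a plot like this could be computed roughly as follows. This is only a sketch: the column names `year`, `week` and `capacity` are my assumptions about the CSV header, not names confirmed by the data set, and the plotting step itself is left out.

```python
import csv
from collections import defaultdict
from statistics import mean


def weekly_mean_capacity(rows, years=(2011, 2012, 2013, 2014)):
    """Return {year: [(week, mean capacity), ...]} with weeks in order."""
    buckets = defaultdict(list)
    for row in rows:
        # Assumed column names; rename to match the real CSV header.
        year, week = int(row["year"]), int(row["week"])
        if year in years:
            buckets[(year, week)].append(float(row["capacity"]))
    series = defaultdict(list)
    for (year, week), values in sorted(buckets.items()):
        series[year].append((week, mean(values)))
    return dict(series)


def load_rows(path):
    """Read the CSV into a list of dicts (the path is supplied by the caller)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

Each year's list of (week, mean) pairs can then be drawn as one connected line, so that all four years share the same week axis.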

There was a noticeable decline in average capacity in the latter part of 2011, especially after September. Any link to the September 11 terror attacks can be ruled out, since those took place in 2001 rather than 2011; the cause of this dip is not evident from the data alone. In 2012, mean capacity was clearly lower than in the other years. This may be because fewer movies were released that year, although I have not verified this. Weather may also have played a role, though I have not checked this against climate records. Further, mean capacity tends to decrease in the latter part of the year. I believe people spend more of that time shopping and visiting friends and loved ones than watching Broadway shows. However, in the next two years mean capacity began to rise in the latter part of the year, reaching its peak in 2013.

Further, I also added a smoothing curve for each of the four years.

The above graph shows the time series plot for each year together with its smoothing curve. The loess curves follow the same general pattern in all four years. In general, average capacity tends to be higher during the summer (the middle part of the year) than at other times. Also, in the latter part of the year the loess curves show a decreasing trend in every year except 2013.
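
To make the smoothing step concrete, here is a simplified loess-style smoother written from scratch. It is not the routine used for the graph above: it fits a local straight line with tricube weights and has no robustness iterations, and the `span` value is an arbitrary choice of mine.

```python
def loess(xs, ys, span=0.5):
    """Local linear regression with tricube weights, evaluated at each x."""
    n = len(xs)
    k = max(2, int(round(span * n)))  # number of neighbours used per local fit
    fitted = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        # Weighted least-squares line through the neighbourhood of x0.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted
```

Calling `loess(weeks, capacities)` once per year gives one smooth curve per year. A larger `span` uses more neighbours for each fit and produces a smoother curve.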

Part B

Here I simulate a sample of size 200 from a bivariate normal distribution with correlation rho = -0.9 and use a bivariate density estimation algorithm to construct a contour graph of the density estimate. First, I visualized the contour graph using a “Spectral” palette to color the regions. The corresponding graph is given below.
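
The simulation and the density estimate can be sketched as follows. The seed, the bandwidth `h`, and the choice of a Gaussian product kernel are my own assumptions for illustration; the density estimation algorithm used for the graph may differ.

```python
import math
import random


def simulate_bivariate_normal(n=200, rho=-0.9, seed=1):
    """Draw n points from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # y mixes z1 with an independent z2 so that corr(x, y) = rho.
        points.append((z1, rho * z1 + math.sqrt(1 - rho ** 2) * z2))
    return points


def gaussian_kde_at(points, x, y, h=0.4):
    """Gaussian kernel density estimate at (x, y) with bandwidth h."""
    total = sum(
        math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * h * h))
        for px, py in points
    )
    return total / (len(points) * 2 * math.pi * h * h)


def sample_correlation(points):
    """Pearson correlation of the simulated (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    return sxy / math.sqrt(sxx * syy)
```

Evaluating `gaussian_kde_at` over a grid of (x, y) values gives the surface whose level sets are drawn as contours. With rho = -0.9 the estimate should be much higher along the line y = -x than along y = x.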

Further, I used ggplot2 to construct a similar graph with a different set of colors, shown below. I would say this graph is much better than the one above, because the higher levels can be identified without looking at the key and the colors follow a clear order: as the value changes from larger to smaller, the color changes from darker to lighter. The edges of each level are also easy to tell apart. In a nutshell, my graph gives a better visual impression than the given graph.
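
The ordering idea can be illustrated with a small sequential palette in which higher levels get darker colors. The two endpoint colors here are arbitrary choices of mine, not the ones used in my graph, and the luminance check is a rough approximation that ignores gamma correction.

```python
def sequential_palette(n, light=(239, 243, 255), dark=(8, 81, 156)):
    """Return n hex colours running from light (lowest level) to dark (highest)."""
    colours = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0
        # Interpolate each RGB channel linearly between the two endpoints.
        rgb = [round(a + t * (b - a)) for a, b in zip(light, dark)]
        colours.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colours


def relative_luminance(hex_colour):
    """Rough luminance from 0 (black) to 1 (white), ignoring gamma correction."""
    r, g, b = (int(hex_colour[i:i + 2], 16) / 255 for i in (1, 3, 5))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b
```

Because luminance falls steadily from the first color to the last, the level order can be read from the colors alone, which a diverging palette such as “Spectral” does not guarantee.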
