Showing posts with label distributions. Show all posts
Showing posts with label distributions. Show all posts

Wednesday, April 3, 2013

Plotting continous distributions (Beta, Gamma, Chi-square, t etc) and discrete distributions (eg. Binomial) in Excel using REXCEL

See the related posts on RExcel (for basic, Excel 2003 and Excel 2007) for basic information.

In previous posts we discussed how to plot different distributions by creating bins and formulas where Y is function of X. But in RExcel it is just easy as click.

Go to continous distributions and then select the distribition you want to plot from the list, and then select submenu plot the distribution. You might need to provide parameter for that distribition, that is it ! click and you will see a curve in plot window.


Similarly other distributions:


In the similarway we can plot discrete distributions are listed under distributions -> discrete distributions. Select distribution you like to plot and select plot the distribution sub-menu.


In the above example we have 200 samples in Binomial distribution.
Look the shape of the distribution, is not it like normal curve. How about when we have 5000 sample.






Histogram and stem leaf plot in Excel using RExcel

See the related posts on RExcel (for basic, Excel 2003 and Excel 2007) for basic information.

It is very easy to plot histogram using RExcel in Excel. Please follow previous posts regarding RExcel installation, loading and data reading. We will use the ready data - variable to demostrate how to create histogram.

Once the data is loaded, select Graphics then histograms. Select the variable you want to create histogram from.You can change number of bins or used auto. Also you can choose whether you want density or exact count (Frequency) or percentages in Y axis. Click OK.






In the sameway we can create stem-leaf plot. From Graphics menus select stem and leaf plot. Then select a variable and other parameters in dialog box and click OK.




The output is not something fancy graph, rather on the script window like this.



Monday, April 1, 2013

generating and plotting of statistical distribution functions or mathmatical formulas in Excel

First of all we need to find a formula for inverse probability distribution, search one in office formula list. Here are some we will plot them.

We need to create a small bin data series (for example with bin of 0.005, from 0 to 1). Then we can simply use scatter plot with smoothed line to visualize the curve.



For example first two formula will plot a beta distribution (regular and cumulative), while the second will plot a normal distribution.  Following are the formula.

=BETA.DIST(A2, 4,5, FALSE,0,1)
=BETA.DIST(A2, 4,5, TRUE,0,1)
=NORM.DIST(A2, 0.5, 0.1, FALSE)


Formula for two more distributions, here the x value is from 1 to 20 with bin of 0.1. The following are the formula where the bins are in cell E2.

 =CHISQ.DIST(E2,1, FALSE)
=T.DIST(E2, 10, FALSE)


In the similar way we can write own formula and plot the lines. Here we have bins of 0.01 to 2.47 in cell I2 and different formula are used to generate the values and plotted.

=(I2^2+3)
=(I2^2-3)
=(I2^3)

Apply the function to whole data range generate the smothed line xy plot.



Shading under curves:

If we want to highlight some portion of these curves we can do so by generating a new data series with values in just areas of intrest and rest is blank. For example if we are interested in Beta (1,4) in x values between 0.2 and 0.5, then first generate a column with full  (in B).

=BETA.DIST(A2, 1,4, FALSE,0,1)

Now in C you can filter values B with following formula.
=IF(A2<1,NA,IF(A2 > 3, NA, B2))



Now we generate the curve with all three values. Remove the lines generated for the selected region after adding error bars. Set length of error bars to 100 so that it fills the regions of interest. Remove X error bars. Increase the width of the error bars to fill the space. You can change color of the filled region.











Friday, March 29, 2013

Multiple histograms: Overlayed or Back to Back



We can calculate frequency with same Bins, then use then to plot histogram with grouped bar chart.  






We can overylay the histograms, with transparency. Look the following plots:




Back to back histograms 
The trick to create back to back histogram is similar to above, here we need to make the frequency negative to the series that will be plotted in the apposite side. We need to overlap the bars (perhaps in opposite direction) and optionally you can set gap width to 0. The following example are in excel 2003, tricks can be applied in similar way in another excel versions. 


Perhaps the negative number are purpose of plotting histograms, we do not want to present them as negative. We can format the axis to look like different color (such as negative appearing as red, in the following example), perhaps back to black color. Use custom format if you want negative color displayed as different color ([Blue]0;[Green]0 format will turn positive numbers to blue and negative number of green as shown in following graph or [Black]0;[Black]0 will turn the both numbers to black and remove the - sign in negative number).