Beginner's guide to R: Painless data visualization

Part 4 of our hands-on guide covers simple graphics, bar graphs and more complex charts.

Add main="Graph of demand" if you want a main headline on your graph:

barplot(BOD$demand, main="Graph of demand")

Bar chart with R's barplot() function.

To label the bars on the x axis, use the names.arg argument and set it to the column you want to use for labels:

barplot(BOD$demand, main="Graph of demand", names.arg = BOD$Time)
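As a minimal sketch of how these arguments fit together, the snippet below also adds axis titles with xlab and ylab. Those two arguments are standard base graphics parameters rather than something covered above, and the label wording is an assumption based on how R's help page describes the BOD columns.

```r
# BOD is a small data frame that ships with R (columns: Time, demand)
bar_mids <- barplot(
  BOD$demand,
  main = "Graph of demand",
  names.arg = BOD$Time,    # one label per bar, taken from the Time column
  xlab = "Time (days)",    # assumed wording for the axis titles
  ylab = "Demand (mg/l)"
)

# barplot() invisibly returns the x position of each bar's midpoint,
# which is handy if you later want to add text or points above the bars
bar_mids
```

Saving the return value is optional; calling barplot() on its own draws the same chart.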

Sometimes you'd like to graph the counts of a particular variable, but you have only raw data, not a table of frequencies. R's table() function is a quick way to generate counts for each distinct value of a variable.

The R Graphics Cookbook uses an example of a bar graph for the number of 4-, 6- and 8-cylinder vehicles in the mtcars data set. Cylinders are listed in the cyl column, which you can access in R using mtcars$cyl.

Here's code to get the count of how many entries there are by cylinder with the table() function; it stores results in a variable called cylcount:

cylcount <- table(mtcars$cyl)

Creating a bar plot.

That creates a table called cylcount containing:

 4  6  8
11  7 14

Now you can create a bar graph of the cylinder count:

barplot(cylcount)
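The two steps can be combined into one short script. This is only a sketch: the title and axis labels are illustrative choices, not part of the original example.

```r
# Count how many cars fall into each cylinder group
cylcount <- table(mtcars$cyl)

# The names of the table are the distinct values; the entries are the counts
names(cylcount)        # "4" "6" "8"
as.vector(cylcount)    # 11 7 14

# barplot() uses the table's names as bar labels automatically
barplot(cylcount,
        main = "Cars by number of cylinders",
        xlab = "Cylinders",
        ylab = "Number of cars")
```

Because the table carries its own names, there is no need for names.arg here.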

ggplot2's qplot() quick plotting function can also create bar graphs:

qplot(mtcars$cyl)

What happens to your bar chart when you don't tell R to treat the variable as discrete groups instead of a continuous range.

However, qplot() assumes by default that cyl is a continuous numeric variable that could take any value from 4 through 8, so the chart leaves blank positions for 5 and 7.

To treat cylinders as distinct groups -- that is, you've got a group with 4 cylinders, a group with 6 and a group with 8, not the possibility of entries anywhere between 4 and 8 -- you want cylinders to be treated as a statistical factor:

qplot(factor(mtcars$cyl))
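To see what factor() changes, it can help to inspect the column before and after conversion. The sketch below uses only base R; the variable name cyl_groups is made up for illustration.

```r
# As a plain numeric vector, cyl looks like a continuous measurement
is.numeric(mtcars$cyl)        # TRUE

# Converted to a factor, it has exactly three levels and nothing in between
cyl_groups <- factor(mtcars$cyl)
levels(cyl_groups)            # "4" "6" "8"
nlevels(cyl_groups)           # 3
```

With only three levels to draw, a plotting function has no reason to reserve space for 5 or 7.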

To create a bar graph with the more robust ggplot() function, you can use syntax such as:

ggplot(mtcars, aes(factor(cyl))) + geom_bar()
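A slightly fuller version of that call might look like the following. It assumes the ggplot2 package is installed; the labs() layer and its label text are additions for illustration, not part of the original example.

```r
library(ggplot2)

# Wrap cyl in factor() so each cylinder count gets its own bar
p <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +                              # counts rows per group by default
  labs(title = "Cars by number of cylinders",
       x = "Cylinders",
       y = "Number of cars")

# Typing p at the console draws it; an explicit print() is needed
# inside functions or scripts run with source()
print(p)
```

Storing the plot in a variable such as p makes it easy to add further layers later without retyping the whole expression.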

Histograms

Histograms work pretty much the same, except you want to specify how many buckets or bins you want your data to be separated into. For base R graphics, use:

hist(mydata$columnName, breaks = n)

where columnName is the name of the column in a mydata data frame that you want to visualize, and n is the number of bins you want. R treats a single number for breaks as a suggestion and may adjust it to get tidy bin boundaries.
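As a rough illustration, the sketch below uses mtcars$mpg as a stand-in for mydata$columnName (an assumption, since the text does not name a data set) and shows one way to check which bins R actually used.

```r
# mtcars$mpg stands in for mydata$columnName here
h <- hist(mtcars$mpg, breaks = 5,
          main = "Distribution of mpg", xlab = "Miles per gallon")

# hist() invisibly returns the bin edges and counts it actually used
h$breaks
h$counts

# For exact control, pass a vector of bin edges instead of a single number
hist(mtcars$mpg, breaks = seq(10, 35, by = 5))
```

Passing explicit edges is the more predictable option when the bin boundaries matter, for example when comparing several histograms side by side.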
