Interactive Data Visualizations Using R

Picture of Data-Mania Writer's Guild

Data-Mania Writer's Guild

Reading Time: 9 minutes

Data visualization is as much an art as science. There are many tools for producing visualizations in R. Some of the popular ones are ggplot and ggvis. Although ggplot is an amazing tool for creating static visualizations, it tends to fall short in performance when you need to create interactive data visualizations. Same is the case with ggvis. This is where plotly comes in handy. Plotly makes it easy to create rich interactive data visualizations without having the knowledge of either CSS or Javascript.

Using Interactive Data Visualizations

What is Plotly?

how to use Interactive Data Visualizations Using RPlotly is a data visualization tool built on top of visualization libraries such as HTML, D3.js and CSS. It is created using the Django framework. It is compatible with a number of languages. The plots produced by plotly can be hosted online using the plotly API’s.

Advantages of Plotly:

  • Easy to build visualizations using D3.js without knowing the language of D3.js
  • Compatible with a number of languages
  • The plots produced from Plotly can be hosted online so that other can access them
  • Charts created by using Chart Studio which requires no coding, hence non coders can use them with ease
  • The syntax for Plotly is simple
  • Plotly is compatible with ggplot2 on R

Disadvantages of Plotly:

  • Plots created using the community version of Plotly are accessible by everyone
  • Limit on daily API calls

Building Interactive Data Visualizations using Plotly in R

As we have already looked at the advantages and disadvantages of Plotly. Let’s start building interactive data visualizations. First thing to do, install Plotly

install.packages("plotly")
Once the installation is complete, we need to call the Plotly library
> require(plotly)
Loading required package: plotly
Loading required package: ggplot2

Attaching package: ‘plotly’

The following object is masked from ‘package:ggplot2’: last_plot
The following object is masked from ‘package:stats’: filter
The following object is masked from ‘package:graphics’: layout

Before creating the graphs, let’s have a look at the syntax for creating the visualizations
plot_ly( x, y, type, mode, color, size)

Where –

  • x = values for the x-axis
  • y = values for the y-axis
  • type = specifies the type of plot that needs to be created. Eg – ‘scatter’, ’box’ or ‘histogram’
  • mode = the mode in which the plot needs to be represented
  • color = represents the color of the data points

Scatterplot

Let’s start off by creating a scatterplot type from among our choices of interactive data visualizations. We will be using the iris dataset (resides within R, no need to load external file) to create the scatterplot. We attach the iris dataset and have a look at the first few rows. Also we will have a look at the structure of the data using glimpse from reply.

> require(dplyr)
> attach(iris)
> head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
> glimpse(iris)
Observations: 150
Variables: 5
$ Sepal.Length  5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4...
$ Sepal.Width  3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2...
$ Petal.Length  1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1...
$ Petal.Width  0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0...
$ Species  setosa, setosa, setosa, setosa, setosa, ...

Looking at the data we can observe that there are four readings which are continuous, whereas species is categorical. The syntax to create a scatter plot is

> sca <- plot_ly(x = ~Sepal.Length, y = ~Petal.Length, type = 'scatter') > layout(sca, title = 'Scatter Plot', xaxis = list(title = 'Sepal length'),
+ yaxis = list(title = 'Petal length'))

The layout command is used to define the title, xaxis and yaxis of the plot.

Let’s add some more layers to the plot by adding colors so as to differentiate the species from one another.

> sca <- plot_ly(x = ~Sepal.Length, y = ~Petal.Length, type = 'scatter', color = ~Species) 
> layout(sca, title = 'Scatter Plot', xaxis = list(title = 'Sepal length'),
+ yaxis = list(title = 'Petal length'))

Now that the plot has been formed, we can –

  • Zoom into the plot

  • Reset the axis
  • Do Box select, lasso select

  • Download the plot as png

Let’s try to add the size variable to make the scatter plot display the readings for the Sepal length.

> sca <- plot_ly(x = ~Sepal.Length, y = ~Petal.Length, type = 'scatter', color = ~Species, size = ~Sepal.Length) 
> layout(sca, title = 'Scatter Plot', xaxis = list(title = 'Sepal length'), yaxis = list(title = 'Petal length'))

As seen from the above diagram the markers tend to increase in size as the reading for sepal length tends to increase.

Line Charts and Time Series

For creating the line charts we shall use the airquality data set

> attach(airquality)

> glimpse(airquality)
Observations: 153
Variables: 6
$ Ozone  41, 36, 12, 18, NA, 28, 23, 19, 8, NA, 7, 16, ...
$ Solar.R  190, 118, 149, 313, NA, NA, 299, 99, 19, 194, ...
$ Wind  7.4, 8.0, 12.6, 11.5, 14.3, 14.9, 8.6, 13.8, 2...
$ Temp  67, 72, 74, 62, 56, 66, 65, 59, 61, 69, 74, 69...
$ Month  5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5...
$ Day  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,...
> head(airquality)
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6

Looking at the structure of the data, it can be seen that all the variables are of ratio/interval type. We shall be plotting the Solar.R values as a time series.

> ti <- plot_ly(y = ~Solar.R, type = 'scatter', mode = 'lines') 
> layout(ti, title = 'Time Series', yaxis = list(title = 'Solar Reading'))

Let’s add markers to the above chart

> ti <- plot_ly(y = ~Solar.R, type = 'scatter', mode = 'lines+markers') 
> layout(ti, title = 'Time Series', yaxis = list(title = 'Solar Reading'))

Histograms

Nextup we create histograms from the iris dataset. We use the Sepal Length to create the counts.

> #Histogram
> hist <- plot_ly(x = ~Sepal.Length, type = 'histogram') > layout(hist, title = 'Histogram', xaxis = list(title = 'Sepal length'),
+ yaxis = list(title = 'count'))

Bar Plot

With the iris data we create a bar plot.

> bar <- plot_ly(x = ~Sepal.Length, type = 'bar') > layout(bar, title = 'Bar Plot', xaxis = list(title = 'Species'),
+ yaxis = list(title = 'count'))

This is just a normal bar plot, we can create stacked bar plots. For this we create some custom data.

> Animals <- c("giraffes", "orangutans", "monkeys") > SF_Zoo <- c(20, 14, 23) > LA_Zoo <- c(12, 18, 29) > data <- data.frame(Animals, SF_Zoo, LA_Zoo) > data
Animals SF_Zoo LA_Zoo
1 giraffes 20 12
2 orangutans 14 18
3 monkeys 23 29

p <- plot_ly(data, x = ~Animals, y = ~SF_Zoo, type = 'bar', name = 'SF Zoo') %>%
add_trace(y = ~LA_Zoo, name = 'LA Zoo')

layout(p,yaxis = list(title = 'Count'), barmode = 'stack')

Line charts with Scatter plots

As we have already created both the Line chart and Scatter plots individually. It’s time to add both of these to create a single chart with both of these in one. To accomplish this we need to use the command add_trace. First let’s create some data.

> #Scatter plot with Time series
> trace_1 <- rnorm(100, mean = 0) > trace_2 <- rnorm(100, mean = -5) > x <- c(1:100) > data <- data.frame(x, trace_1, trace_2)
  • trace_1 creates a random value from the normal distribution with a count 100 with mean zero
  • trace_2 creates a random value from the normal distribution with a count 100 with mean negative five

The head of the data frame

> head(data)
x trace_1 trace_2
1 1 2.4854207 -5.067471
2 2 0.7854973 -3.974228
3 3 0.4401011 -3.496160
4 4 0.9688235 -4.369723
5 5 -0.6280533 -4.452010
6 6 1.6121642 -4.691630
> plot_ly(data, x = ~x, y = ~trace_1, type = 'scatter', mode = 'lines+markers', name = 'trace1') %>%
+ add_trace(y = ~trace_2, mode = 'markers', name = 'trace2')

Box Plots

Plotly can also be used to create excellent box plots. Using the data from mtcars, we create a boxplot for the horse power values.

>attach(mtcars)
> glimpse(mtcars)
Observations: 32
Variables: 11
$ mpg  21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 2...
$ cyl  6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8...
$ disp  160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, ...
$ hp  110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 12...
$ drat  3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3...
$ wt  2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, ...
$ qsec  16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, ...
$ vs  0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0...
$ am  1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ gear  4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3...
$ carb  4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4...
> #Box Plot
> box <- plot_ly(y = ~hp, type = 'box') > layout(box, title = 'Box Plot', yaxis = list(title = 'Horse Power'))

Heat Maps

Let’s start creating some more advanced interactive data visualizations with Plotly starting off with heat maps.For this particular exercise we will be using the volcano data set.

> #Heat maps
> data(volcano)
> glimpse(volcano)
num [1:87, 1:61] 100 101 102 103 104 105 105 106 107 108 ...
> dim(volcano)
[1] 87 61
> plot_ly(z = ~volcano,type = 'heatmap')

3D Scatter Plots

This is by far one of the coolest of interactive data visualizations tricks in Plotly. For this chart we shall once again use the iris data set. And define values for the three axises.

> #3D Scatter Plots
> plot_ly(x = ~Sepal.Length, y = ~Sepal.Width, z = ~Petal.Length, type="scatter3d", mode = 'markers', 
size = ~Petal.Width, color = ~Species)

Conclusion

Interactive Data Visualizations Using RReading through the article you must now be familiar with the working of Plotly in R. Plotly is used extensively in dashboards to create interactive data visualizations. It has an edge over ggplot in the interactive aspect, yet it can used along with ggplot to create plots to be pushed to the cloud. So, it’s time for you guys to go ahead and start creating awesome visualizations with Plotly.

 

More resources to get ahead...

Get Income-Generating Ideas For Data Professionals

Are you tired of relying on one employer for your income? Are you dreaming of a side hustle that won’t put you at risk of getting fired or sued? Well, my friend, you’re in luck.

Take The Data Superhero Quiz

You can take a much more direct path to the top once you understand how to leverage your skillsets, your talents, your personality and your passions in order to serve in a capacity where you’ll thrive. That’s why I’m encouraging you to take the data superhero quiz.

Author Bio:

This article was contributed by Perceptive Analytics. Rohit, Vishnu Reddy, Saneesh Veetil and Chaitanya Sagar contributed to this article.

Perceptive Analytics provides Tableau Consulting, data analytics, business intelligence and reporting services to e-commerce, retail, healthcare and pharmaceutical industries. Our client roster includes Fortune 500 and NYSE listed companies in the USA and India.

HI, I’M LILLIAN PIERSON.
I’m a fractional CMO that specializes in go-to-market and product-led growth for B2B tech companies.
If you’re looking for marketing strategy and leadership support with a proven track record of driving breakthrough growth for B2B tech startups and consultancies, you’re in the right place. Over the last decade, I’ve supported the growth of 30% of Fortune 10 companies, and more tech startups than you can shake a stick at. I stay very busy, but I’m currently able to accommodate a handful of select new clients. Visit this page to learn more about how I can help you and to book a time for us to speak directly.

Get Featured

We love helping tech brands gain exposure and brand awareness among our active audience of 530,000 data professionals. If you’d like to explore our alternatives for brand partnerships and content collaborations, you can reach out directly on this page and book a time to speak.

Join The Convergence Newsletter

See what 26,000 other data professionals have discovered from the powerful data science, AI, and data strategy advice that’s only available inside this free community newsletter.

Join The Convergence Newsletter for free below.

Our newsletter is exclusively written for operators in the data & AI industry.

Hi, I'm Lillian Pierson, Data-Mania's founder. We welcome you to our little corner of the internet. Data-Mania offers fractional CMO and marketing consulting services to deep tech B2B businesses.

The Convergence community is sponsored by Data-Mania, as a tribute to the data community from which we sprung. You are welcome anytime.

Get more actionable advice by joining The Convergence Newsletter for free below.

See what 26,000 other data professionals have discovered from the powerful data science, AI, and data strategy advice that’s only available inside this free community newsletter.

Join The Convergence Newsletter for free below.
We are 100% committed to you having an AMAZING ✨ experience – that, of course, involves no spam.

Fractional CMO for deep tech B2B businesses. Specializing in go-to-market strategy, SaaS product growth, and consulting revenue growth. American expat serving clients worldwide since 2012.

© Data-Mania, 2012 - 2024+, All Rights Reserved - Terms & Conditions - Privacy Policy | PRODUCTS PROTECTED BY COPYSCAPE

The Convergence is sponsored by Data-Mania, as a tribute to the data community from which we sprung.

Get The Newsletter

See what 26,000 other data professionals have discovered from the powerful data science, AI, and data strategy advice that’s only available inside this free community newsletter.

Join The Convergence Newsletter for free below.
* Zero spam. Unsubscribe anytime.