Style Guide: File Names in R

Should be machine readable:
- Stick with numbers, letters, -, and _
- Avoid spaces, symbols, characters specific to the local machine, and other special characters

# Good
fit_models.R
utility_functions.R

# Bad
fit models.R
foo.r
stuff.r
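
As a rough check of the machine-readable rule above, a file name can be tested against the allowed character set. The helper name is_machine_readable() is hypothetical, and the pattern also allows the dot before the extension; treat it as a sketch rather than an official rule.

```r
# Hypothetical helper: TRUE when a name uses only letters, digits, "_", "-", and "."
is_machine_readable <- function(file_name) {
  grepl("^[A-Za-z0-9_.-]+$", file_name)
}

is_machine_readable(c("fit_models.R", "fit models.R", "foo.r"))
# TRUE FALSE TRUE
```

Note that foo.r passes this check: it is a bad name because it says nothing about the file's contents, not because of its characters.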

Prefix them with numbers if they should be run in a specific order

00_download.R
01_explore.R
...
09_model.R
10_visualize.R
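
The leading zero matters because file listings are usually sorted as text. A small sketch, assuming three made-up step names and using method = "radix" so the ordering does not depend on the local collation settings:

```r
steps <- c("download", "explore", "visualize")
step_numbers <- c(1L, 2L, 10L)

# Without padding, "10_..." sorts before "1_..." and "2_..."
unpadded <- paste0(step_numbers, "_", steps, ".R")
sort(unpadded, method = "radix")

# With two-digit padding, alphabetical order matches run order
padded <- sprintf("%02d_%s.R", step_numbers, steps)
sort(padded, method = "radix")
```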

Pay attention to capitalization, since some operating systems treat file names as case sensitive
Should be human readable:
- Use file names to describe what’s in the file

# good
report-draft-notes.txt

# bad
temp.r

Use the same structure for closely related files

# good
fig-eda.png
fig-model-3.png

# bad
figure eda.PNG
fig model three.png

Pay attention to internal structure
In R script files and Rmd files, comment your code frequently. Files should also be broken up into readable chunks using commented section lines:

# Load data ---------------------------

...

# Summarize data ----------------------

... 

# Plot data ---------------------------

...

# Save data ---------------------------

Style Guide: Object names in R

Use lowercase letters, numbers, and _ to make object names easier to read. Keep names short, but still meaningful

# Good
day_one
day_1

# Bad
DayOne
dayone
first_day_of_the_month
djm1

Avoid reusing the names of common functions and variables

# Bad
T <- FALSE
c <- 10
mean <- function(x) sum(x)

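Note: T is a built-in shorthand for TRUE (an ordinary variable, not a reserved word, so it can be overwritten), c() is the combine function, and mean() is a built-in function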
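
A short sketch of why overwriting T is risky, and how it differs from the reserved word TRUE. The variable name result and its message text are only for illustration.

```r
# T is an ordinary variable, so it can be silently redefined
T <- FALSE
isTRUE(T)        # FALSE: code relying on T now misbehaves

rm(T)            # remove the user-defined copy; the built-in T is visible again
isTRUE(T)        # TRUE

# TRUE is a reserved word, so assigning to it is an error
result <- tryCatch(
  eval(parse(text = "TRUE <- FALSE")),
  error = function(e) "assignment to TRUE failed"
)
result
```

This is one reason to write TRUE and FALSE in full rather than T and F.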

Style Guide: Spacing, Parentheses

Always put a space after a comma, never before

# Good
x[, 1]

# Bad
x[,1]
x[ ,1]
x[ , 1]

Parentheses

Do not put spaces inside or outside parentheses for regular function calls

# Good
mean(x, na.rm = TRUE)

# Bad
mean (x, na.rm = TRUE)
mean( x, na.rm = TRUE )

Note: Be careful with operators (==, +, -, <-, etc.). Surround them with spaces, and use parentheses to make the order of operations explicit

# Good
height <- (feet * 12) + inches
mean(x, na.rm = TRUE)

# Bad
height<-feet*12+inches
mean(x, na.rm=TRUE)

Place a space before and after () when used with if, for, or while (we will see these constructs later)

# Good
if (debug) {
  show(x)
}

# Bad
if(debug){
  show(x)
}

Avoid assignment in function calls (will discuss functions later)

# Good
x <- complicated_function()
if (nzchar(x) < 1) {
  # do something
}

# Bad
if (nzchar(x <- complicated_function()) < 1) {
  # do something
}

Read the rest of the Tidyverse Style Guide as we cover R scripting/Programming

R Objects

R has five basic or “atomic” classes of objects:
- character (“a”, “1”, “TRUE” - with quotation marks)
- numeric (real numbers, 1.234, 18, 1e25)
- integer (1L, 2L, 100L - the L suffix marks an integer; a plain 1 is numeric)
- complex (1 - 2i, 3 + 5i)
- logical (TRUE, FALSE; T and F are shorthand for them)
The most basic type of R object is a vector
A vector can only contain objects of the same class
Before moving to creating different data types and using/manipulating them, let’s discuss some basic built-in functions we will use frequently
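
Because a vector can only hold one class, mixing the classes listed above makes R convert everything to the most flexible one. A minimal sketch using c() and class():

```r
class(c(1.5, 2L))          # "numeric": the integer is promoted to numeric
class(c(1, TRUE))          # "numeric": TRUE becomes 1
class(c(1, "a", TRUE))     # "character": everything becomes a string
c(1, "a", TRUE)            # "1" "a" "TRUE"
```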

c() Function

c(): Combine Values into a Vector or List

1:4 # integers from 1 to 4 included

## [1] 1 2 3 4

c(1,2,3,4) # explicit numeric (double) vector, not integer

## [1] 1 2 3 4

all.equal(1:4,c(1,2,3,4))

## [1] TRUE

c()

## NULL

x <- c()
x

## NULL

c(7, -4, 2, 0)

## [1]  7 -4  2  0

x <- c(7, -4, 2, 0) 

is.vector(x)

## [1] TRUE

length(x) # length of a vector

## [1] 4
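
The two vectors compared above print the same but are stored differently, which is why all.equal() reports TRUE while a strict comparison does not. A small sketch; the names int_vec and dbl_vec are only for illustration:

```r
int_vec <- 1:4
dbl_vec <- c(1, 2, 3, 4)

typeof(int_vec)               # "integer"
typeof(dbl_vec)               # "double"
typeof(c(1L, 2L, 3L, 4L))     # "integer": the L suffix requests integers

identical(int_vec, dbl_vec)          # FALSE: the storage types differ
isTRUE(all.equal(int_vec, dbl_vec))  # TRUE: the values agree
```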

print() Function

print(): prints its argument

# print() function, see ?print (Print Values)
# examples
print("Hello World") # no object specification

## [1] "Hello World"

print(3)

## [1] 3

# print(3, "hello world") # Error in print
print(1:5)

## [1] 1 2 3 4 5

print(1,6) # 6 is taken as the digits argument, so it is not printed

## [1] 1

print(1.23456789)

## [1] 1.234568

print(1.23456789,5)

## [1] 1.2346

print(1.23456789,6)

## [1] 1.23457
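
In the calls above, the second argument of print() is the number of significant digits. Naming the argument makes that clearer, and format() and signif() apply the same idea when a string or a number is needed instead of printed output. A brief sketch:

```r
x <- 1.23456789

format(x, digits = 5)       # "1.2346"
format(x, digits = 6)       # "1.23457"
signif(x, digits = 5)       # 1.2346 as a number
print(x, digits = 3)        # the named argument makes the intent clear
```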

cat() Function

cat(): useful for producing output in user-defined functions. It converts its arguments to character vectors, concatenates them to a single character vector, appends the given sep = string(s) to each element and then outputs them.

cat("Hello World") # no object specification

## Hello World

cat(3)

## 3

cat(3, "Hello World")

## 3 Hello World

cat(3, "Hello World", sep = "-")

## 3-Hello World

cat(2,3, "Hello World", sep = "-")

## 2-3-Hello World

cat(c(2,3), "Hello World", sep = "-")

## 2-3-Hello World

1:5

## [1] 1 2 3 4 5

cat(1:5, "Hello World", sep = "-")

## 1-2-3-4-5-Hello World
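
Since cat() is described above as useful inside user-defined functions, here is a minimal sketch of that use. The function name report_mean() is hypothetical; note that cat() does not add a newline on its own, so "\n" is passed explicitly.

```r
# Hypothetical helper that reports a mean and returns it invisibly
report_mean <- function(x) {
  avg <- mean(x)
  cat("Mean of ", length(x), " values: ", avg, "\n", sep = "")
  invisible(avg)
}

result <- report_mean(c(2, 4, 6))
# prints: Mean of 3 values: 4
```

Returning the value with invisible() lets the caller keep the result without printing it a second time.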

paste() Function

paste(): Concatenate vectors after converting to character

paste(1," One")

## [1] "1  One"

paste(1," One",sep="")

## [1] "1 One"

paste(1,"One",sep=", ")

## [1] "1, One"

paste(1:5," are numbers")

## [1] "1  are numbers" "2  are numbers" "3  are numbers" "4  are numbers"
## [5] "5  are numbers"

paste(1:5, sep = ", ")

## [1] "1" "2" "3" "4" "5"

paste(1:5, collapse = ", ")

## [1] "1, 2, 3, 4, 5"

paste(paste(1:5,collapse=", ")," are numbers")

## [1] "1, 2, 3, 4, 5  are numbers"

paste0(1:5)

## [1] "1" "2" "3" "4" "5"

paste0(1:5, collapse = ",")

## [1] "1,2,3,4,5"

paste0(1:5," What")

## [1] "1 What" "2 What" "3 What" "4 What" "5 What"

paste0(1:5, collapse = " and ")

## [1] "1 and 2 and 3 and 4 and 5"
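
The examples above show two different joins: sep goes between the arguments, element by element, while collapse merges the elements of the result into a single string. A short sketch that builds file names in the style recommended earlier; the vector parts is made up for illustration:

```r
parts <- c("eda", "model", "residuals")

# sep joins the arguments element by element (vectorized)
file_names <- paste("fig", parts, sep = "-")
file_names <- paste0(file_names, ".png")
file_names
# "fig-eda.png" "fig-model.png" "fig-residuals.png"

# collapse joins the elements of the result into one string
paste(parts, collapse = ", ")
# "eda, model, residuals"
```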

Explicit Coercion: as.xxx() Functions

Objects can be explicitly coerced from one class to another using the as.XXXX functions, if available. Some basic functions are:
- as.character()
- as.numeric()
- as.logical()
- as.complex()
- as.integer()
is.XXX() functions are also available
- is.character()
- is.numeric()
- is.logical()
- is.complex()
- is.integer()

Explicit Coercion: as.xxx() Functions

x <- 6
class(x)

## [1] "numeric"

is.integer(x)

## [1] FALSE

is.numeric(x)

## [1] TRUE

as.character(x)

## [1] "6"

as.logical(x)

## [1] TRUE

as.complex(x)

## [1] 6+0i

y <- "a"
is.character(y)

## [1] TRUE

as.numeric(y) # returns NA (Not Available) with a warning

## [1] NA

When nonsensical coercion takes place, you will usually get a warning from R
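
One way to handle such coercions, sketched under the assumption that the input is a character vector of possibly invalid numbers: silence the warning deliberately and then check for NA values yourself.

```r
values <- c("1", "2.5", "abc")

# suppressWarnings() hides the coercion warning; the NA still marks the failure
numbers <- suppressWarnings(as.numeric(values))
numbers            # 1.0 2.5 NA
is.na(numbers)     # FALSE FALSE TRUE
values[is.na(numbers)]   # "abc": the entries that could not be converted

as.integer(3.9)      # 3: the fractional part is dropped, not rounded
as.logical("yes")    # NA: not a recognized logical string
as.logical(0)        # FALSE: zero is FALSE, any other number is TRUE
```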

rep() function: Replicate

rep() function, see ?rep: Replicates the values

rep(1,5) # rep(x=1, times=5)

## [1] 1 1 1 1 1

rep(c(1,2),5) # rep(x=c(1,2), times=5)

##  [1] 1 2 1 2 1 2 1 2 1 2

rep(c("a","b","c","d"),2)

## [1] "a" "b" "c" "d" "a" "b" "c" "d"

rep(c("a","b","c","d"), each=2, times=2)

##  [1] "a" "a" "b" "b" "c" "c" "d" "d" "a" "a" "b" "b" "c" "c" "d" "d"

rep(c("a","b","c","d"), each=2, length.out=6)

## [1] "a" "a" "b" "b" "c" "c"

rep("abc", 10*(1-0.7))

## [1] "abc" "abc" "abc"

rep("abc", 3.9)

## [1] "abc" "abc" "abc"
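
The last two calls show that a fractional times value is truncated toward zero. This can surprise when the count is computed, because floating-point results are sometimes slightly below the intended whole number. A sketch of the pitfall and of rounding first; n_times is an illustrative name:

```r
n_times <- 10 * (1 - 0.9)   # mathematically 1, but stored as slightly less than 1

length(rep("abc", n_times))          # 0: the fraction is truncated toward zero
length(rep("abc", round(n_times)))   # 1: rounding first gives the intended count

# times can also be a vector, giving one count per element
rep(c("a", "b"), times = c(2, 3))    # "a" "a" "b" "b" "b"
```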

seq() function: Sequence

seq() function, see ?seq: sequence generation

seq(0, 10, length.out=5) # seq(from=0, to=10, length.out=5)

## [1]  0.0  2.5  5.0  7.5 10.0

seq(0, 10, by=2) # seq(from=0, to=10, by=2)

## [1]  0  2  4  6  8 10

# for dates (we will cover dates a bit later)
## first days of years
seq(as.Date("1999/1/1"), as.Date("2008/1/1"), "years")

##  [1] "1999-01-01" "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01"
##  [6] "2004-01-01" "2005-01-01" "2006-01-01" "2007-01-01" "2008-01-01"

## by month
seq(as.Date("2023/1/1"), by = "month", length.out = 12)

##  [1] "2023-01-01" "2023-02-01" "2023-03-01" "2023-04-01" "2023-05-01"
##  [6] "2023-06-01" "2023-07-01" "2023-08-01" "2023-09-01" "2023-10-01"
## [11] "2023-11-01" "2023-12-01"

## quarters
seq(as.Date("2020/1/1"), as.Date("2023/1/1"), by = "quarter")

##  [1] "2020-01-01" "2020-04-01" "2020-07-01" "2020-10-01" "2021-01-01"
##  [6] "2021-04-01" "2021-07-01" "2021-10-01" "2022-01-01" "2022-04-01"
## [11] "2022-07-01" "2022-10-01" "2023-01-01"
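
For numeric sequences, by and length.out describe the same sequence in two ways: the step is (to - from) / (length.out - 1). A small sketch of that relationship, plus seq_len(), which is a safer counter than 1:n when n can be zero. The variable names are only for illustration.

```r
from <- 0
to <- 10
n <- 5

step <- (to - from) / (n - 1)    # 2.5
by_length <- seq(from, to, length.out = n)
by_step <- seq(from, to, by = step)
isTRUE(all.equal(by_length, by_step))   # TRUE

# seq_len() is safer than 1:n when n may be zero
seq_len(0)    # integer(0)
1:0           # 1 0, usually not what a loop wants
```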

sample() function

sample() function: takes a sample of the specified size from the elements with or without replacement
set.seed() fixes the starting point of the random number generator, so the same "random" results can be reproduced

sample(x, size, replace = FALSE, prob = NULL)

sample(1:10)

##  [1]  1  5  8  2  4  7  9  6 10  3

sample(1:10,2)

## [1] 1 7

sample(1:10,size=2)

## [1] 5 9

set.seed(1)
sample(1:10,size=2)

## [1] 9 4

sample(1:10,size=2)

## [1] 7 1

set.seed(1)
sample(1:10,size=2)

## [1] 9 4

# 100 random samples with associated probabilities (STAT course is useful)
x <- sample(c("a","b"), prob=c(0.8,0.2), size=100, replace=TRUE)
table(x) # table function

## x
##  a  b 
## 83 17

# 1000 random samples with associated probabilities
x <- sample(c("a","b"), prob=c(0.8,0.2), size=1000, replace=TRUE)
table(x) # table function

## x
##   a   b 
## 798 202

# 1000 random samples with associated probabilities
# Relative weights (0.5/0.6, 0.1/0.6)
x <- sample(c("a","b"), prob=c(0.5,0.1), size=1000, replace=TRUE)
table(x) # table function

## x
##   a   b 
## 837 163
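
The last example works because prob holds relative weights that sample() rescales to sum to 1. A sketch that makes the rescaling explicit and compares it with the observed proportions; the seed value 207 and the variable names are arbitrary choices for this illustration.

```r
set.seed(207)   # arbitrary seed, chosen only for this illustration
weights <- c(a = 0.5, b = 0.1)
normalized <- weights / sum(weights)   # about 0.833 and 0.167

draws <- sample(names(weights), size = 10000, replace = TRUE, prob = weights)
observed <- prop.table(table(draws))

round(normalized, 3)
round(observed, 3)   # close to the normalized weights

# Resetting the seed reproduces exactly the same draws
set.seed(207)
draws_again <- sample(names(weights), size = 10000, replace = TRUE, prob = weights)
identical(draws, draws_again)   # TRUE
```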

MIS 207

Introduction to R Language
Style Guide, Basic Objects, Coercion

