Making vectors and combining them
Checking Dataframe properties
Installing a Package in R
We can make vectors with the following commands:
a <- seq(1, 10) # make a sequence of numbers from 1 to 10 and store it in 'a'
b <- seq(10, 1) # the same sequence in reverse order, stored in 'b'
c <- cbind(a, b) # make a matrix by combining the two vectors as columns
c # check what is in 'c'
Now try to find the mean, standard deviation and variance of the matrix c above.
mean(c) # mean of all values in 'c'
sd(c) # standard deviation of all values in 'c'
var(c) # for a matrix, this gives the variance-covariance matrix of the columns of 'c'
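To see what these three functions return for a matrix, the sketch below rebuilds the same matrix under the name m (a name chosen here only to avoid confusion with R's built-in c() function) and inspects the results. On a matrix, mean and sd treat all 20 values as one sample, while var works column by column.

```r
a <- seq(1, 10)          # 1, 2, ..., 10
b <- seq(10, 1)          # 10, 9, ..., 1
m <- cbind(a, b)         # 10 x 2 matrix with columns 'a' and 'b'

mean(m)                  # mean of all 20 values: 5.5
sd(m)                    # standard deviation of all 20 values pooled together
v <- var(m)              # 2 x 2 variance-covariance matrix of the columns
dim(v)                   # 2 2
diag(v)                  # variance of each column, same as var(a) and var(b)
```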
read.table('clipboard', header=T) -> a.df # a command can be composed this way as well
attach(a.df) # define the object to work with
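colMeans(a.df) # the mean of each column (mean(a.df) returns NA in current versions of R)
max(a.df) # the maximum or largest value
min(a.df) # the minimum value
sapply(a.df, sd) # standard deviation of each column (sd(a.df) gives an error in current versions of R)
var(a.df) # variance-covariance matrix of the columns
summary(a.df) # to see a summary of all variables at once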
dim(a.df) # dimensions of a matrix or data frame
ncol(a.df) # number of columns
nrow(a.df) # number of rows
colnames(a.df) # give headings of the columns
rownames(a.df) # row headings
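Because the clipboard import only works when data has been copied first, the sketch below builds a small made-up data frame with the text argument of read.table and runs the same property checks on it. The values and the column names age, weight and height are assumptions for illustration, not the contents of the demo data file.

```r
# Made-up stand-in for the clipboard data (values are illustrative only)
a.df <- read.table(text = "
age weight height
21  60     165
25  72     178
30  68     170
35  80     182
", header = TRUE)

dim(a.df)        # 4 3  (rows, columns)
ncol(a.df)       # 3
nrow(a.df)       # 4
colnames(a.df)   # "age" "weight" "height"
rownames(a.df)   # "1" "2" "3" "4"
colMeans(a.df)   # mean of each column
summary(a.df)    # minimum, quartiles, mean and maximum per column
```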
Adding a column to a.df, called avg, which will hold the average of the columns age and weight of a.df. The 'dollar' sign ($) refers to a column of a data frame.
a.df$avg <- (a.df$age + a.df$weight) / 2 # adding avg col
a.df$sum<-rowSums(a.df) # adding sum col
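The pattern for adding a derived column can be sketched on a small made-up data frame (df and its values are hypothetical). Note that rowSums adds up every column present at that moment, so a sum computed after avg has been added would include avg as well; the sketch therefore restricts the sum to the two original columns.

```r
df <- data.frame(age = c(21, 25, 30), weight = c(60, 72, 68))

df$avg <- (df$age + df$weight) / 2             # average of age and weight, row by row
df$sum <- rowSums(df[, c("age", "weight")])    # row sums of the two original columns only
df                                             # print the data frame with the new columns
```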
Try adding a column yourself, with multiple = 'age' * 'weight'.
Some simple plots
We are working with the same data as before; if needed, import it again (copy the data from sheet 1 of the demodata Excel file). Here, practice some basic plots.
bio.df <- read.table('clipboard', header=T)
attach (bio.df)
plot (age,weight) # plot (predictor, response) i.e. x, y
plot (age,height)
plot (height,weight)
hist (age) # plot histogram
hist (weight)
hist (height)
hist (age, nclass=6)
boxplot(weight,height) # plot boxplot
To make the work easier, we can first attach the data frame/matrix (this tells R to work with the assigned object). This shortens the commands when we deal with each variable of an object separately. If 'attach' is forgotten, the following commands will result in an error.
mean (age)
mean (weight)
mean (height)
median (age)
median (weight)
median (height)
If we do not use 'attach(file)', we need to specify both the object and the variable name in each command, e.g.
mean(a.df $ age)
sd(a.df $ weight)
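As a sketch of the difference, the lines below compute the same mean twice on a made-up data frame (df is hypothetical): once with the $ notation, and once with base R's with(), a different technique that evaluates an expression inside the data frame without attaching it.

```r
df <- data.frame(age = c(21, 25, 30), weight = c(60, 72, 68))

m1 <- mean(df$age)         # object and variable named explicitly
m2 <- with(df, mean(age))  # 'age' is looked up inside df for this one call
m1 == m2                   # TRUE
```

Using with() keeps the commands short, as attach does, but leaves nothing attached afterwards.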
In the R software main menu 'Packages', go to 'Install Package(s)'. Then choose a CRAN mirror from the list (any will do; the nearest location is usually better). Then find the required package name in the list, select it and click 'OK'. Here, we install the package prettyR.
Or, if you would like to install offline, first download the zip file of the R package from the CRAN page and install it later from the main menu 'Packages', 'Install Package(s) from local zip files...'.
Loading the package to make it functional in R
To calculate the mode, we will load the library 'prettyR', as this function is not directly available in the default libraries.
The library does not come with the installation of R, so we installed the package first (above) and now we load the library.
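The menu steps above also have a command-line form. A minimal sketch, assuming an internet connection for the install step (left commented out here so that it is run only once, by hand); the variable name has_prettyR is chosen for illustration:

```r
# install.packages("prettyR")          # one-time installation from CRAN

# Load the package only if it is installed, so the script does not stop with an error
has_prettyR <- requireNamespace("prettyR", quietly = TRUE)
if (has_prettyR) {
  library(prettyR)                     # makes Mode() available in this session
} else {
  message("prettyR is not installed yet; run install.packages('prettyR') first")
}
```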
Mode(x) # mode calculation (x is any variable)
Mode (age)
Mode (weight)
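For intuition about what the mode is, the most frequent value can also be found in base R with table(). The helper simple_mode below is a hypothetical illustration, not the prettyR implementation; when several values tie, it returns only the first one in sorted order.

```r
simple_mode <- function(x) {
  counts <- table(x)                      # frequency of each distinct value
  names(counts)[which.max(counts)]        # label of the most frequent value (as text)
}

simple_mode(c(21, 25, 25, 30, 25, 21))    # "25"
```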
You may want to practice installing a few more packages.
Using 'apply' function
The apply function is used for applying functions to rows or columns of matrices or data frames.
apply(x,1,sum) #1 = row, 2 = col
In this case, the above command is equivalent to
rowSums(x) # row sums; colSums(x) corresponds to apply(x,2,sum)
Try a few more
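A few more calls to try, sketched on a small made-up matrix x (the values are illustrative). The second argument selects the direction: 1 applies the function to each row, 2 to each column.

```r
x <- matrix(1:6, nrow = 2)   # 2 x 3 matrix, filled column by column

apply(x, 1, sum)             # row sums: 9 12
apply(x, 2, sum)             # column sums: 3 7 11
apply(x, 2, max)             # largest value in each column: 2 4 6
apply(x, 1, mean)            # row means: 3 4
```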
To apply a function to each vector/variable, i.e. each column of a data frame, use 'sapply' (rather than 'apply').
read.table('clipboard',header=T)->bio.df # import again (from demo.xls, sheet 1) in case the previous one was replaced or cleared
sapply(bio.df,mean) # find the mean of each column
The 'sapply' call above is equivalent to
apply(bio.df,2,mean) # find the mean of each column
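The equivalence can be checked on a small made-up data frame (df and its values are hypothetical stand-ins for bio.df). Both calls return one mean per column.

```r
df <- data.frame(age = c(21, 25, 30),
                 weight = c(60, 72, 68),
                 height = c(165, 178, 170))

s <- sapply(df, mean)        # treats the data frame as a list of columns
a <- apply(df, 2, mean)      # converts the data frame to a matrix, then works column by column
all.equal(s, a)              # TRUE
```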
Try these as well
sapply(bio.df,sum) # gives the sum of each column
sapply(bio.df,sqrt) # square root of all values
sapply(bio.df,sample) # random reordering of the values in each column
sapply(bio.df,levels) # lists the levels of categorical variables
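Which of these calls make sense depends on the column type. A sketch on a made-up data frame with one numeric and one categorical column (df, its column names and its values are hypothetical): numeric functions such as sum are restricted to the numeric columns first, and levels is informative only for factors.

```r
df <- data.frame(age = c(21, 25, 30, 25),
                 sex = factor(c("f", "m", "f", "m")))

num <- sapply(df, is.numeric)    # TRUE for numeric columns, FALSE otherwise
sapply(df[num], sum)             # column sums of the numeric columns only: age = 101
sapply(df, levels)               # NULL for 'age', "f" "m" for 'sex'
```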
