Making vectors and combining them
We can make vectors with the following commands
a <- seq(1, 10)   # make a sequence of numbers from 1 to 10 and store it in 'a'
b <- seq(10, 1)   # the same sequence in reverse order
c <- cbind(a, b)  # make a matrix by combining the two vectors as columns
c                 # check what is in 'c'
Now try to find the mean, standard deviation and variance of the above matrix c
mean(c)  # mean of all values in 'c'
sd(c)    # standard deviation of all values in 'c'
var(c)   # for a matrix, this gives the variance-covariance matrix of the columns of 'c'
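As a quick check of what these functions return for a matrix, the sketch below rebuilds the same two vectors under a different name (m is used here only so that the built-in c() function is not masked) and adds per-column variances with apply(), which is covered later in this tutorial.

```r
a <- seq(1, 10)
b <- seq(10, 1)
m <- cbind(a, b)                  # same matrix as 'c' above, under a different name

overall_mean <- mean(m)           # mean of all 20 values: 5.5
overall_sd   <- sd(m)             # standard deviation of all 20 values pooled together
cov_matrix   <- var(m)            # 2 x 2 variance-covariance matrix of columns a and b
col_vars     <- apply(m, 2, var)  # variance of each column separately

overall_mean
cov_matrix
col_vars
```

The diagonal of cov_matrix holds the same numbers as col_vars; the off-diagonal entries are the covariance between a and b.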
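read.table('clipboard', header = TRUE) -> a.df  # the assignment can be written this way (left to right) as well
attach(a.df)        # attach the data frame so that its columns can be used by name
sapply(a.df, mean)  # the mean of each column (mean(a.df) returns NA in current versions of R)
max(a.df)           # the maximum or largest value
min(a.df)           # the minimum value
sapply(a.df, sd)    # standard deviation of each column (sd(a.df) gives an error in current versions of R)
var(a.df)           # variance-covariance matrix of the columns
summary(a.df)       # to see a summary of all variables at once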
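Because the clipboard contents cannot be reproduced here, the sketch below builds a small data frame by hand. The name demo.df and all of its values are made up for illustration; only the column names age, weight and height follow the tutorial's data. It shows per-column summaries that work on a data frame in current versions of R.

```r
# Hypothetical stand-in for the clipboard data (values are invented)
demo.df <- data.frame(
  age    = c(21, 25, 30, 35, 40),
  weight = c(55, 60, 72, 80, 68),
  height = c(160, 165, 175, 180, 170)
)

col_means <- sapply(demo.df, mean)  # mean of each column
col_sds   <- sapply(demo.df, sd)    # standard deviation of each column
largest   <- max(demo.df)           # largest value in the whole data frame
smallest  <- min(demo.df)           # smallest value in the whole data frame

col_means
col_sds
summary(demo.df)                    # quartiles, mean, minimum and maximum per column
```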
Checking Dataframe properties
dim(a.df) # dimensions of a matrix or data frame
ncol(a.df) # number of columns
nrow(a.df) # number of rows
colnames(a.df) # give headings of the columns
rownames(a.df) # row headings
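A small worked case may make the return values concrete. The data frame demo.df below is hypothetical (made-up values), not the tutorial's data.

```r
demo.df <- data.frame(
  age    = c(21, 25, 30, 35),
  weight = c(55, 60, 72, 80)
)

dim(demo.df)       # number of rows first, then number of columns
nrow(demo.df)
ncol(demo.df)
colnames(demo.df)
rownames(demo.df)  # default row names are "1", "2", ... stored as character strings
```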
Adding a column to a.df, called avg, which will hold the average of the columns age and weight of a.df. The 'dollar' sign ($) refers to a column of a data frame
a.df$avg <- (a.df$age + a.df$weight) / 2
rowSums(a.df)
a.df$sum<-rowSums(a.df) # adding sum col
colSums(a.df)
rowSums(a.df)
rowMeans(a.df)
colMeans(a.df)
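One detail worth seeing in a sketch: once avg has been added, rowSums() includes it in the total. The example below uses a hypothetical data frame demo.df with invented values, and restricts the sum to the original columns by name.

```r
demo.df <- data.frame(
  age    = c(20, 30, 40),
  weight = c(50, 60, 70)
)

demo.df$avg <- (demo.df$age + demo.df$weight) / 2   # new column via '$'

sum_all  <- rowSums(demo.df)                        # adds age + weight + avg
sum_orig <- rowSums(demo.df[, c("age", "weight")])  # adds age + weight only

demo.df$sum <- sum_orig
demo.df
```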
Try adding a column yourself, called multiple, with the values
multiple = age * weight
Some simple plots
We are working with the same data as before; if needed, import it again (copy the data from sheet 1 of the demodata Excel file). Here, we practice some basic plots
bio.df <- read.table('clipboard', header = TRUE)
attach(bio.df)
plot(age, weight)        # plot(predictor, response), i.e. x, y
plot(age, height)
plot(height, weight)
hist(age)                # plot a histogram
hist(weight)
hist(height)
hist(age, nclass = 6)    # histogram with about 6 classes (bins)
boxplot(weight, height)  # plot boxplots
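The same plots can be tried without the clipboard. The sketch below uses a hypothetical data frame demo.df with invented values and adds axis labels and a title; xlab, ylab and main are standard arguments of plot(). The columns are reached with $ so that the example does not depend on attach().

```r
demo.df <- data.frame(
  age    = c(21, 25, 30, 35, 40, 45),
  weight = c(55, 60, 72, 80, 68, 75),
  height = c(160, 165, 175, 180, 170, 172)
)

# Scatter plot with labelled axes: predictor on x, response on y
plot(demo.df$age, demo.df$weight,
     xlab = "Age", ylab = "Weight", main = "Weight against age")

# hist() also returns the bin counts, which can be stored and inspected
h <- hist(demo.df$age, main = "Histogram of age", xlab = "Age")
h$counts

# Side-by-side boxplots with a name under each box
boxplot(demo.df$weight, demo.df$height, names = c("weight", "height"))
```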
To make work easier, we can first attach the data frame (this tells R to look up variable names in the attached object)
attach(a.df)
This shortens the commands when we deal with each variable of an object separately. If 'attach' is forgotten, the following commands will result in an error.
mean (age)
mean (weight)
mean (height)
median (age)
median (weight)
median (height)
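An alternative worth knowing, offered here as a sketch rather than as the tutorial's method, is with(), which evaluates an expression inside a data frame without attaching it. The data frame demo.df and its values are hypothetical.

```r
demo.df <- data.frame(
  age    = c(21, 25, 30, 35, 40),
  weight = c(55, 60, 72, 80, 68)
)

mean_age      <- with(demo.df, mean(age))       # same as mean(demo.df$age)
median_weight <- with(demo.df, median(weight))  # same as median(demo.df$weight)

mean_age
median_weight
```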
If we do not use 'attach', we need to specify both the data frame and the variable name in each command, e.g.
mean(a.df$age)
sd(a.df$weight)
Installing Package in R
In the R software's main menu 'Packages', go to 'Install package(s)'. Then choose a CRAN mirror (any will do, but the nearest location is usually better). Then find the required package name in the list, click it and click 'OK'. Here, we install the package prettyR.
Or, if you would like to install offline, first download the zip file of the R package from the CRAN page and install it later from the main menu 'Packages' with 'Install package(s) from local zip files...'
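The same installation can be done from the R console with install.packages(). The sketch below wraps it in a small helper, ensure_package, which is a hypothetical name introduced here and not part of R; it installs a package only when it is not already available. It assumes an internet connection and a configured CRAN mirror.

```r
# Hypothetical helper: install a package only if it is missing
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)                # downloads from the selected CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)  # TRUE if the package is now available
}

# Usage (commented out so that nothing is downloaded by accident):
# ensure_package("prettyR")

# Offline installation from a downloaded file; the path is a placeholder:
# install.packages("path/to/package_file.zip", repos = NULL)
```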
Loading the package to make it functional in R
To calculate the mode, we load the library 'prettyR', as this function is not directly available in the default libraries. The library does not come with the installation of R, so we installed the package first (above) and now we load it
library(prettyR)
# Mode(x) calculates the mode (x is any variable)
Mode(age)
Mode(weight)
Mode(height)
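For comparison, the mode can also be found with base R alone. Note that base R's own mode() function reports the storage type of an object, not the statistical mode. The function mode_value below is a hypothetical helper written for this illustration (it is not the prettyR implementation); it counts the values with table() and returns the most frequent one(s) as character strings.

```r
# Hypothetical helper: most frequent value(s) of a vector, using base R only
mode_value <- function(x) {
  counts <- table(x)                    # frequency of each distinct value
  names(counts)[counts == max(counts)]  # value(s) with the highest frequency
}

ages <- c(21, 25, 25, 30, 25, 21)       # invented example values
mode_value(ages)                        # "25"
mode_value(c("a", "b", "b", "a"))       # a tie returns all modes: "a" "b"
```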
You may want to practice installing a few more packages.
Using 'apply' function
The apply function is used for applying functions to rows or columns of matrices or data frames
x <- matrix(1:24, nrow = 4)
x
apply(x, 2, sum)  # second argument: 1 = rows, 2 = columns; here, column sums
apply(x, 1, sum)  # row sums
In this case, the above commands are equivalent to
colSums(x)
rowSums(x)
Try a few more
apply(x,1,sqrt)
apply(x,2,sqrt)
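The equivalence with rowSums() and colSums() can be checked directly, as in the sketch below. It also shows a detail that is easy to miss: when the function returns a vector for each row, apply(x, 1, ...) places each row's result in a column, so the output is transposed relative to x.

```r
x <- matrix(1:24, nrow = 4)        # 4 rows, 6 columns

row_totals <- apply(x, 1, sum)     # one total per row
col_totals <- apply(x, 2, sum)     # one total per column

all.equal(row_totals, rowSums(x))  # TRUE
all.equal(col_totals, colSums(x))  # TRUE

dim(apply(x, 2, sqrt))             # 4 6: same shape as x
dim(apply(x, 1, sqrt))             # 6 4: transposed
```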
To apply a function to each variable, i.e. each column of a data frame, use 'sapply' (rather than 'apply'). On a matrix, 'sapply' works element by element instead
read.table('clipboard', header = TRUE) -> bio.df  # import again (from demo.xls, sheet 1) in case the previous object was replaced or cleared
sapply(bio.df, mean)    # find the mean of each variable
The 'sapply' call above is equivalent to
apply(bio.df, 2, mean)  # find the mean of each variable
Try these as well
sapply(bio.df, sum)     # gives the column sums
sapply(bio.df, sqrt)    # square root of all values
sapply(bio.df, sample)  # randomly reorders the values in each column
sapply(bio.df, levels)  # lists the levels of categorical variables
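How these calls behave depends on the column types. The sketch below uses a hypothetical data frame demo.df (invented values) with two numeric columns and one factor, sex, which is an assumption made for illustration. It selects the numeric columns before taking means and shows that levels() returns NULL for columns that are not factors.

```r
demo.df <- data.frame(
  age    = c(21, 25, 30, 35),
  weight = c(55, 60, 72, 80),
  sex    = factor(c("F", "M", "F", "M"))
)

is_num    <- sapply(demo.df, is.numeric)    # TRUE for age and weight, FALSE for sex
num_means <- sapply(demo.df[is_num], mean)  # means of the numeric columns only
lev       <- sapply(demo.df, levels)        # a list: NULL for the numeric columns

num_means
lev$sex                                     # "F" "M"
```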
More under construction .....