Let’s start with a dataframe built for training purposes.
df = mtcars
To show the first rows, we use the head(df, n) function, where df is the dataframe and n is the number of rows we want to show.
head(df, n = 5)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
To show the last rows, we use the tail(df, n) function, where df is the dataframe and n is the number of rows we want to show.
tail(df, n = 5)
##                 mpg cyl  disp  hp drat    wt qsec vs am gear carb
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
## Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
## Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
## Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
## Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2
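Two details of head() and tail() are easy to miss: both default to 6 rows when n is omitted, and a negative n drops rows instead of keeping them. The sketch below illustrates this on a scratch copy; the name demo is only an assumption made for this example, so that df is left untouched.

```r
# `demo` is a hypothetical scratch copy; the tutorial's df is not modified.
demo = mtcars

# With n omitted, head() and tail() each return 6 rows.
default_head = head(demo)
default_tail = tail(demo)

# A negative n drops rows: everything except the last 2 rows ...
all_but_last_two = head(demo, n = -2)
# ... and everything except the first 2 rows.
all_but_first_two = tail(demo, n = -2)

nrow(default_head)
nrow(all_but_last_two)
rownames(all_but_first_two)[1]
```

The negative form is handy when the number of rows to skip is known but the total number of rows is not.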
We can select a single value with df[row#, col#], where row# and col# are the row and column numbers.
df[1,1]
## [1] 21
We can also use the row name and the column name.
df["Mazda RX4","mpg"]
## [1] 21
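Numbers and names do not have to be used separately. As a small sketch (again on a scratch copy named demo, which is an assumption of this example), the same cell can be reached in several equivalent ways:

```r
# `demo` is a hypothetical scratch copy of the built-in mtcars data.
demo = mtcars

# Row and column by position.
by_position = demo[1, 1]
# Positions and names can be mixed in the same call.
mixed = demo[1, "mpg"]
# The $ operator extracts the column first, then [ ] picks the element.
via_dollar = demo$mpg[1]
# which() converts a column name into its position when a number is needed.
mpg_position = which(colnames(demo) == "mpg")
```

Names are usually the safer choice, because they keep working if the column order changes.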
We use df[row, column], where row or column can be a number, a name, a vector of numbers, or a vector of names. For instance, to show rows 9 and 12 of columns 1 and 2:
df[c(9,12), c(1,2)]
##             mpg cyl
## Merc 230   22.8   4
## Merc 450SE 16.4   8
We can achieve the same results with the row names and column names.
df[c("Merc 230", "Merc 450SE"), c("mpg","cyl")]
##             mpg cyl
## Merc 230   22.8   4
## Merc 450SE 16.4   8
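The row or the column index can also be left empty, which keeps every row or every column. One related behaviour is worth knowing: selecting a single column returns a plain vector unless drop = FALSE is passed. The following is a minimal sketch on a scratch copy (demo is a name assumed for this example only).

```r
# `demo` is a hypothetical scratch copy of the built-in mtcars data.
demo = mtcars

# An empty row index keeps every row; an empty column index keeps every column.
first_two_cols = demo[, c("mpg", "cyl")]
two_rows = demo[c("Merc 230", "Merc 450SE"), ]

# A single column comes back as a plain vector by default ...
mpg_vector = demo[, "mpg"]
# ... unless drop = FALSE is passed, which keeps the dataframe structure.
mpg_frame = demo[, "mpg", drop = FALSE]

is.data.frame(mpg_vector)
is.data.frame(mpg_frame)
dim(two_rows)
```

Passing drop = FALSE matters when later code expects a dataframe, for example when calling colnames() on the result.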
We use the colnames(df) function to change all the column names at once by assigning a vector of names, as shown in the example below.
colnames(df) = c("MPG", "CYL", "DISP", "HP", "DRAT", "WT", "QSEC", "VS", "AM", "GEAR", "CARB")
head(df, n = 5)
##                    MPG CYL DISP  HP DRAT    WT  QSEC VS AM GEAR CARB
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
We can also change a specific column name using the column number.
colnames(df)[2] = "CYL"
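Renaming by position depends on the column order staying the same. As an alternative sketch, a column can be renamed by matching its current name, and toupper() can upper-case every name without typing the full vector. The scratch copy demo is an assumption of this example, used so that df is not changed.

```r
# `demo` is a hypothetical scratch copy of the built-in mtcars data.
demo = mtcars

# Rename one column by matching its current name instead of its position.
colnames(demo)[colnames(demo) == "cyl"] = "CYL"

# Upper-case every column name in one step.
colnames(demo) = toupper(colnames(demo))
colnames(demo)
```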
We create a new column named CAR ($ separates the dataframe name from the column name) and copy the row names into it using the row.names() function. Then we assign NULL to the row names, which resets them to row numbers.
df$CAR = row.names(df)
rownames(df) = NULL
head(df, n = 5)
##    MPG CYL DISP  HP DRAT    WT  QSEC VS AM GEAR CARB               CAR
## 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4         Mazda RX4
## 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4     Mazda RX4 Wag
## 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1        Datsun 710
## 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    Hornet 4 Drive
## 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 Hornet Sportabout
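A column added with $ always lands at the end. If the car name should come first, the columns can be reordered with the same [ ] selection used earlier. This is only a sketch on a scratch copy (demo is a name assumed for the example); setdiff() returns every column name except the one we list.

```r
# `demo` is a hypothetical scratch copy of the built-in mtcars data.
demo = mtcars
demo$CAR = rownames(demo)
rownames(demo) = NULL

# Put CAR first, followed by every other column in its existing order.
demo = demo[, c("CAR", setdiff(colnames(demo), "CAR"))]
colnames(demo)[1:3]
```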
We use the which function on the rows of df to apply a condition. In the example, we select the rows where MPG is greater than 25.
df2 = df[which(df$MPG > 25),]
df2
##     MPG CYL  DISP  HP DRAT    WT  QSEC VS AM GEAR CARB            CAR
## 18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1       Fiat 128
## 19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2    Honda Civic
## 20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1 Toyota Corolla
## 26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1      Fiat X1-9
## 27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  Porsche 914-2
## 28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2   Lotus Europa
If we only need the car models with MPG greater than 25, we select the CAR column in the same call.
df3 = df[which(df$MPG > 25), "CAR"]
df3
## [1] "Fiat 128"       "Honda Civic"    "Toyota Corolla" "Fiat X1-9"
## [5] "Porsche 914-2"  "Lotus Europa"
If we need the car models with MPG greater than 25 and GEAR equal to 5, we combine the two conditions with &.
df4 = df[which(df$MPG > 25 & df$GEAR == 5), "CAR"]
df4
## [1] "Porsche 914-2" "Lotus Europa"
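The reason for wrapping the condition in which() shows up once the data contains missing values. The sketch below deliberately sets one MPG value to NA (mtcars itself has none, so this is purely for illustration) and compares a bare logical index, which(), and the base function subset(). The scratch copy demo is an assumption of this example.

```r
# `demo` is a hypothetical scratch copy prepared like df in the text.
demo = mtcars
colnames(demo) = toupper(colnames(demo))
demo$CAR = rownames(demo)
rownames(demo) = NULL

# Introduce a missing value for illustration only.
demo$MPG[1] = NA

# A bare logical index turns the NA comparison into an all-NA row ...
with_logical = demo[demo$MPG > 25, ]
# ... while which() returns only the positions where the condition is TRUE.
with_which = demo[which(demo$MPG > 25), ]
# subset() also drops NA conditions and accepts bare column names.
with_subset = subset(demo, MPG > 25 & GEAR == 5, select = CAR)

nrow(with_logical)
nrow(with_which)
with_subset$CAR
```

Here the logical index returns one more row than which() does, and that extra row contains only NA values.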
We use the order function to sort the rows. We can sort by multiple columns, and we can sort a numeric column in descending order by putting the - sign in front of it, as shown in the example below.
df2 = df[which(df$MPG > 25),]
df2 = df2[order(df2$GEAR, -df2$HP),]
df2
##     MPG CYL  DISP  HP DRAT    WT  QSEC VS AM GEAR CARB            CAR
## 18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1       Fiat 128
## 26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1      Fiat X1-9
## 20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1 Toyota Corolla
## 19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2    Honda Civic
## 28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2   Lotus Europa
## 27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  Porsche 914-2
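The - sign only works on numeric columns; applying it to a character column such as CAR raises an error. One possible way around this, sketched below on a scratch copy (demo is a name assumed for the example), is the decreasing argument of order(). Mixing ascending and descending columns requires method = "radix", which compares character values in C-locale order.

```r
# `demo` is a hypothetical scratch copy prepared like df2 in the text.
demo = mtcars
colnames(demo) = toupper(colnames(demo))
demo$CAR = rownames(demo)
rownames(demo) = NULL
demo = demo[which(demo$MPG > 25), ]

# Descending sort on a character column: use decreasing = TRUE, not the - sign.
by_name_desc = demo[order(demo$CAR, decreasing = TRUE), ]

# Mixed directions: GEAR ascending, then CAR descending.
# A vector-valued `decreasing` is only accepted with method = "radix".
mixed = demo[order(demo$GEAR, demo$CAR,
                   decreasing = c(FALSE, TRUE), method = "radix"), ]
mixed$CAR
```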