Import and Export CSV and Excel files

Import from CVS

CSV (comma-separated values) file are very common to exchange data from one system or application to another. The data is stored a plain text with no compression and no size limitations compared to some other formats.

Let’s use the read.csv function with three arguments:

the path of the CSV file ("D:\\Data\\Example.csv")
the confirmation the first row corresponds to the column header (header = TRUE)
the list of strings corresponding to “not available” or blank (na.strings = c("", "NaN", "#N/A", "#REF!")))

Note:

if we are using Windows, the path should use a slash character (/) or a double back-slash (\\) in replacement of a slash (\).
if “NA” does not mean “not available” in our file, we need to create an na.strings list without it (it is part of the default list otherwise).

MyDataframe = read.csv("D:\\Data\\Example.csv", header = TRUE, na.strings = c("", "NaN", "#N/A", "#REF!"))

Selection of the file

If we want the file to be selected during the execution of the script, we can replace the path of the file by the command file.choose() as shown below. A pop-up window will invite the user to select the file.

MyDataframe = read.csv(file.choose(), header = TRUE, na.strings = c("", "NaN", "#N/A", "#REF!"))

Elements modified

space characters in the column names have been replaced by a point “.” (in general all non-alphanumerical characters are replaced by “.”)
if a column name started with a digit, an “X” have been inserted
the blank cells are showing as NA for “not available”

Original CSV file
Col A	Col B	Col C	1 Col D
a	b	3
f	d	4

Dataframe
Col.A	Col.B	Col.C	X1.Col.D
a	b	3	NA
f	d	4	NA

After importing a CVS file, we should verify the numbers have been correctly interpreted (it works fine in most cases).

Export to CVS

To export a dataframe to a CVS file, let’ use the write.csv function with three arguments:

the name of the dataframe to export (MyDataFrame)
the path of the csv file ("D:\\Data\\Export.csv")
the confirmation we did not named the rows (row.names = FALSE)

If we did not rename the columns, the csv file will use the the column names of the dataframe (with “.” replacing spaces, etc.) and the blank cells will contain “NA”.

write.csv(MyDataframe, "D:\\Data\\Export.csv", row.names = FALSE)

Tips: we can add the date to the file name by pasting the system date. For that we will use:

the paste function to paste strings: the folder, the date and the file name
the Sys.time function to obtain the date from the system and use the “YYYY-MM-DD” naming convention

write.csv(MyDataframe,
          paste("D:\\Data\\", format(Sys.time(), "%Y-%m-%d"), " Export.csv", sep = ""),
          row.names = FALSE)

Import from Excel

To import an Excel spreadsheet let’s use the read_excel function part of the readxl package with the following arguments:

the part of the excel file
the tab number to import

library(readxl)
MyDataframe2 = read_excel("D:\\Data\\Example.xlsx", sheet = 1)

When we import from an Excel spreadsheet, the column names have not changed: they still contains spaces and they can start with a digit. The blank cells are marked NA.

Dataframe
Col A	Col B	Col C	1 Col D
a	b	3	NA
f	d	4	NA

In the example above, we imported the first tab of the file but we could have used the name of the tab as shown below.

library(readxl)
MyDataframe2 = read_excel("D:\\Data\\Example.xlsx", sheet = "Example")

Export to Excel

We will simply use the write_xlsx from the writexl package with the following arguments:

the name of the dataframe
the path of the excel file

library(writexl) 
write_xlsx(MyDataframe, "D:\\Data\\Export.xlsx")