Introduction:
In the day-to-day work of data analysts and data scientists, you often need to write out your results and present them in a neat CSV or Excel file. In this article we discuss how to write CSV files from Python scripts. This matters not only in data science but to anyone who has to present data on a regular basis.
The basic option: pandas.DataFrame.to_csv method
This option uses the pandas library. If you are not introduced to pandas yet, read about pandas here first. When you have a data table in your environment that you want to save, the most direct way is to put it into a dataframe and then call the dataframe's to_csv method to write it to a specific file. The general form of a to_csv call on a dataframe is:
dataframe.to_csv(destination_file_path_as_string, index=False)
A typical example is saving marks for 4 students. We first create an empty dataframe, then store each column under a specific name, and finally save the dataframe as marksheet.csv with no index. The file is written to the script's working directory, which in this case was the home directory, and there we find marksheet.csv.
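The steps just described can be sketched as follows. This is a minimal sketch rather than the original listing: the student names, the marks and the column names are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical data: names and marks are made up for illustration.
marks = pd.DataFrame()                       # start from an empty dataframe
marks["name"] = ["Asha", "Bilal", "Chen", "Dara"]
marks["marks"] = [82, 91, 77, 68]

# index=False leaves out the 0..3 row labels pandas would otherwise write.
marks.to_csv("marksheet.csv", index=False)

# marksheet.csv now contains:
# name,marks
# Asha,82
# Bilal,91
# Chen,77
# Dara,68
```

Without index=False, the file would get an extra unnamed first column holding the row numbers.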
With this process covered, we move on to a more dynamic way to write CSV files.
Row-by-row CSV writing:
Often our data arrives row by row and we need to store it that way. For this we open the file ourselves and write through the csv module. The general format of the code is:
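import csv

# filename, list_columns, row_list and n are placeholders for your own values.
# Write the header row once; 'w' creates the file or empties an existing one.
with open(filename, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(list_columns)

# Append one data row per iteration; 'a' adds to the end of the file.
# The with block closes the file on exit, so no explicit f.close() is needed.
for i in range(n):
    with open(filename, 'a', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(row_list[i])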
Now I will break down the code above so that you can modify the necessary parts and use it. First of all, to write CSV we import Python's CSV reading and writing library, named csv. It is part of the standard library, so there is nothing to install. If you get a ModuleNotFoundError saying csv is not found, that points to a problem with the Python installation itself rather than a missing package.
Once csv is imported, the second crucial step is to open the file in write mode and create a writer object for it. A writer object has a writerow method, which writes one row at a time.
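To see what writerow actually emits, here is a small sketch that writes into an in-memory buffer instead of a file. The buffer and the sample values are assumptions made for illustration only.

```python
import csv
import io

# An in-memory buffer stands in for a real file so the result can be inspected.
buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow(["name", "comment"])             # header row
writer.writerow(["Asha", "good, steady work"])   # this field contains a comma

text = buffer.getvalue()
print(text)
```

Each call takes one list and produces one line. With the default settings, the writer wraps the field that contains a comma in double quotes, so the row still splits into two columns when read back.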
You will need to change the final loop. I have used a simple loop that adds rows from an already created row_list, but the better approach is to create each row inside the loop itself, so that the file always holds your results up to the latest calculation.
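A minimal sketch of creating the row inside the loop is given here. The file name squares.csv and the squaring step are hypothetical stand-ins for your own output path and calculation.

```python
import csv

filename = "squares.csv"   # hypothetical output path

# Write the header once, replacing any previous file of the same name.
with open(filename, "w", newline="") as f:
    csv.writer(f).writerow(["n", "square"])

for n in range(1, 6):
    row = [n, n * n]       # stand-in for a real calculation
    # Append the row as soon as it has been computed.
    with open(filename, "a", newline="") as f:
        csv.writer(f).writerow(row)
```

Reopening the file on every iteration costs a little time, but each row is flushed to disk when its with block ends.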
This kind of dynamic CSV creation matters most when you run many calculations in one go. In such cases row creation may stop partway through because of an exception or a bug. Because every completed row is already on disk, you can inspect the script and save time without losing the earlier outputs.
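The sketch below illustrates that point under assumed inputs: a hypothetical list of numbers in which a zero makes the calculation fail partway through. The rows computed before the failure remain in the file.

```python
import csv

filename = "reciprocals.csv"   # hypothetical output path
inputs = [4, 2, 1, 0, 5]       # the 0 triggers a ZeroDivisionError

with open(filename, "w", newline="") as f:
    csv.writer(f).writerow(["x", "reciprocal"])

try:
    for x in inputs:
        row = [x, 1 / x]       # fails when x is 0
        with open(filename, "a", newline="") as f:
            csv.writer(f).writerow(row)
except ZeroDivisionError as exc:
    print(f"stopped early: {exc}")

# Read the file back to see what was saved before the failure.
with open(filename, newline="") as f:
    saved = list(csv.reader(f))
print(saved)
```

The header and the three rows for 4, 2 and 1 are saved; the rows for 0 and 5 are never written, which shows where the run stopped.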