Pandas is a popular Python library for data manipulation. Its central structure is the DataFrame, a two-dimensional table of rows and columns in which we store our datasets. Programs often hold more than one DataFrame, and these need to be combined. Pandas offers several ways to concatenate DataFrames; in this article, we’ll look at three ways of appending DataFrames inside a for loop, with an example of each.
Examples of Appending DataFrames in Pandas
In this section, we’ll look at three examples that append DataFrames using for loops, covering both numeric and textual values. Before we can start working with DataFrames, we must make sure that Pandas is installed on our system.
You can install Pandas using the command:
pip install pandas
Now, we are ready to work with DataFrames!
1. Appending DataFrames Using append() Method & For Loop
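The simplest method for appending DataFrames is the append() method of the DataFrame class. It appends the rows of another DataFrame to the end of the given DataFrame and returns a new DataFrame; the original is not modified. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so this method only works on older versions; on current versions, pd.concat() does the same job. Let us look at append() in detail.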
Syntax:
DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)
- other: The DataFrame or Series (or a list of these) containing the rows to be appended.
- ignore_index: If True, the rows of the result are relabelled 0, 1, …, n - 1. If False (the default), the original index labels are kept.
- verify_integrity: If True, raises a ValueError when the resulting index contains duplicate labels.
- sort: If True, sorts the columns alphabetically when the columns of the two DataFrames are not aligned.
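The effect of these parameters can be tried out with pd.concat(), which accepts ignore_index, verify_integrity and sort as well and is available on every pandas version. The following is a small sketch; the frames left and right and the variable names are made up for illustration.

```python
import pandas as pd

# Two illustrative one-row frames; both use the default index label 0
# and their columns only partly overlap.
left = pd.DataFrame({"B": [1], "A": [2]})
right = pd.DataFrame({"A": [3], "C": [4]})

# Default behaviour: the original index labels are kept, so 0 appears twice.
kept = pd.concat([left, right])

# ignore_index=True discards the old labels and renumbers the rows 0..n-1.
renumbered = pd.concat([left, right], ignore_index=True)

# sort=True orders the non-aligned columns alphabetically.
sorted_cols = pd.concat([left, right], ignore_index=True, sort=True)

# verify_integrity=True raises ValueError because label 0 would be duplicated.
try:
    pd.concat([left, right], verify_integrity=True)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(list(kept.index))
print(list(renumbered.index))
print(list(sorted_cols.columns))
print(type(duplicate_error).__name__)
```

Cells for which a frame has no value (for example column C in the row taken from left) are filled with NaN.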
Now let us understand its implementation with an example.
Example:
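import pandas as pd

# DataFrame.append() exists only in pandas versions before 2.0.
result_df = pd.DataFrame()
for i in range(5):
    # Build a one-row DataFrame from the loop counter.
    data = {'A': [i], 'B': [i + 1], 'C': [i + 2]}
    df = pd.DataFrame(data)
    # append() returns a new DataFrame, so reassign the result.
    result_df = result_df.append(df, ignore_index=True)
print(result_df)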
First, we import pandas as pd so that we can use DataFrames and functions such as append() and concat(). We then initialize an empty DataFrame, result_df, which will hold the result. The for loop runs 5 times. In each iteration, we build a dictionary called data with three columns, A, B and C, whose values are derived from the loop counter i (i, i+1 and i+2), and convert it to a DataFrame using pd.DataFrame(). Finally, we append that DataFrame to result_df and assign the returned DataFrame back to result_df, since append() does not modify a DataFrame in place. Setting ignore_index=True renumbers the rows of the result from 0 to 4 instead of repeating the label 0 from every one-row DataFrame.
Output:
   A  B  C
0  0  1  2
1  1  2  3
2  2  3  4
3  3  4  5
4  4  5  6
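On pandas 2.0 and later, where DataFrame.append() is no longer available, the same loop can be approximated with pd.concat(). This is a minimal sketch rather than a drop-in replacement for every use of append(); starting from None instead of an empty DataFrame is a choice made here so that an empty frame never takes part in the concatenation.

```python
import pandas as pd

result_df = None  # no rows collected yet

for i in range(5):
    df = pd.DataFrame({"A": [i], "B": [i + 1], "C": [i + 2]})
    if result_df is None:
        # First iteration: there is nothing to concatenate with yet.
        result_df = df
    else:
        # pd.concat() returns a new DataFrame, so reassign the result.
        result_df = pd.concat([result_df, df], ignore_index=True)

print(result_df)
```

Like the append() version, this copies all previously collected rows on every iteration, so it is best kept to small loops.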
2. Appending DataFrames Using Lists & For Loop
In this method, we’ll add each individual DataFrame to a Python list using list.append() and then call pd.concat() once to concatenate all the DataFrames in the list.
Example:
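import pandas as pd

data_list = []
for i in range(4):
    data = {'A': [i], 'B': [i + 1], 'C': [i * 2]}
    df = pd.DataFrame(data)
    # list.append() only stores a reference; no data is copied here.
    data_list.append(df)
# Concatenate once, after the loop has collected every DataFrame.
result_df = pd.concat(data_list, ignore_index=True)
print(result_df)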
Here we import pandas as pd and then initialize a list, data_list, which will store our DataFrames. The for loop runs 4 times; in each iteration, we build a dictionary called data with three columns, A, B and C, convert it to a DataFrame using the pd.DataFrame() constructor and store it in the variable df. We then add df to data_list using list.append(). Once the loop has finished, we concatenate all the DataFrames in data_list with a single call to pd.concat() and store the result as result_df. As before, ignore_index=True renumbers the rows from 0. Because the data is copied only once, in the final concat() call, this approach is generally faster than growing a DataFrame inside the loop. Finally, we print result_df.
Output:
   A  B  C
0  0  1  0
1  1  2  2
2  2  3  4
3  3  4  6
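One detail worth knowing about this pattern is that pd.concat() raises a ValueError when it is given an empty list, which can happen if the loop never runs. The sketch below wraps the call in a small helper; combine_frames is a hypothetical name introduced here for illustration, not a pandas function, and returning an empty DataFrame in that case is just one possible choice.

```python
import pandas as pd


def combine_frames(frames):
    """Concatenate a list of DataFrames; return an empty frame for an empty list."""
    # pd.concat() raises ValueError on an empty list, so handle that case first.
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# The per-iteration frames can also be built with a list comprehension.
frames = [pd.DataFrame({"A": [i], "B": [i + 1], "C": [i * 2]}) for i in range(4)]

result_df = combine_frames(frames)
empty_df = combine_frames([])

print(result_df)
print(empty_df.empty)
```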
3. Appending DataFrames Containing Textual Values
We have looked at two examples with numeric DataFrames. Now we’ll look at an example where the values are text. Here too we collect values in a list inside the loop, but this time the list holds the text values themselves, and a single DataFrame is built from the list once the loop has finished.
Example:
import pandas as pd

data_list_text = []
name_list = ['Lara', 'Katniss', 'Diana']
for name in name_list:
    # Collect each name; any per-item processing would go here.
    df_values = name
    data_list_text.append(df_values)
# Build the DataFrame once from the collected values.
result_df = pd.DataFrame(data_list_text, columns=['Name'])
print(result_df)
Here, we import pandas as pd and declare a list, data_list_text, which will store the text values for our DataFrame. Next, we create a list, name_list, containing the names that will become those values. The for loop visits each name in name_list and adds it to data_list_text using list.append(). After the loop, we create a DataFrame from the data stored in data_list_text, with a single column labelled Name. This is our resultant DataFrame, stored in the variable result_df.
Output:
      Name
0     Lara
1  Katniss
2    Diana
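The example above collects plain strings rather than DataFrames. If each iteration produces a DataFrame of its own, the list-and-concat pattern from the second method works for text columns too. The following is a sketch under that assumption; the Length column is a hypothetical extra added only to show text and numeric columns side by side.

```python
import pandas as pd

name_list = ["Lara", "Katniss", "Diana"]
frames = []

for name in name_list:
    # One single-row DataFrame per name, mixing a text and a numeric column.
    frames.append(pd.DataFrame({"Name": [name], "Length": [len(name)]}))

# Concatenate once after the loop, renumbering the rows from 0.
result_df = pd.concat(frames, ignore_index=True)
print(result_df)
```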
Conclusion
Pandas provides many ways to append DataFrames. In this article, we’ve looked at three that use for loops: the append() method, a list of DataFrames combined with concat(), and a list of text values turned into a DataFrame. Growing a DataFrame inside a loop is not the most efficient approach, since every append() call copies all the rows gathered so far, and append() itself is no longer available from pandas 2.0 onwards; collecting the pieces in a list and concatenating once is the better default. All three are nevertheless easy to implement and understand!
Reference
https://stackoverflow.com/questions/28669482/appending-pandas-dataframes-generated-in-a-for-loop