4 Ways to Add a Column to DataFrame in Pandas (With Examples)

Pandas is a popular Python library for analyzing and manipulating tabular data sets. When it comes to cleaning up messy data, Pandas does the job! Using this library we can make data more readable and easier to work with. It offers two main structures to store data: Series (a one-dimensional, labeled, array-like structure) and DataFrame (a two-dimensional, labeled structure similar to a table). In this article, we'll look at four ways to add columns to a DataFrame, with examples.

Methods to Add Columns To a DataFrame

Before we start working with the Pandas library, it is important to install it on our system or in our virtual environment.

Below is the command to install Pandas:

pip install pandas

After installation, Pandas can be imported into our file in order to use DataFrames. Using DataFrames, we can visualize our data as a two-dimensional structure (like a table), making it easier to read and manipulate. Each DataFrame contains rows and columns that are indexed to help us locate and access our data better.
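As a minimal sketch of these two structures, the snippet below builds a Series and a DataFrame and looks up values by their row and column labels. The names and marks are illustrative values only.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
marks = pd.Series([81, 89, 93, 72], name='Marks')

# A DataFrame is a table whose columns are Series sharing one index.
dataframe = pd.DataFrame({'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
                          'Marks': marks})

print(dataframe.shape)             # (4, 2): four rows, two columns
print(dataframe.columns.tolist())  # column labels
print(dataframe.index.tolist())    # row labels (0 to 3 by default)
print(dataframe.loc[2, 'Name'])    # value at row label 2, column 'Name': Natasha
```

The default row labels are the integers starting at 0, which is why loc[2, 'Name'] picks the third row.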

There are times when we may need to add a new column to an already existing DataFrame in Pandas. This can be done in many ways; here we'll look at four of the most common ones.

Below are the methods of adding a column to a DataFrame:

  1. Assigning a list as a new column
  2. Using insert() method
  3. Using assign() method
  4. Using apply() method

We’ll now discuss each of these methods in detail with some examples.

1. Assigning a List as a New Column

The easiest way in which we can add a new column to an existing DataFrame is by assigning a list to a new column name. This new column will then be added by Pandas to the end of our DataFrame.

Let us look at an example of adding a new column of grades into a DataFrame containing names and marks.

Example:

import pandas as pd

data = {'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
        'Marks': [81, 89, 93, 72]}

dataframe = pd.DataFrame(data)

dataframe['Grade'] = ['B', 'B', 'A', 'C']

print(dataframe)

Here, we first store our data in a dictionary whose keys are the column names (Name and Marks, in this case) and whose values are lists holding the data for each column. We then pass the dictionary to the pd.DataFrame() constructor, which converts it into a DataFrame (a tabular form), and store the result in a variable called dataframe.

Next, we use bracket notation to add a new column Grade to our DataFrame and assign it a list containing the values for that column. The list must contain exactly one value per row; otherwise Pandas raises a ValueError. Pandas adds this column to the end of our existing DataFrame. If a column named Grade already existed, its values would be overwritten instead.

Finally, we can print the DataFrame which gives us the following output.

Output:

Adding a New Column to a DataFrame using a List
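Bracket assignment accepts more than lists. The following is a small sketch, using hypothetical column names Passed and Bonus, of how a scalar, a list of the wrong length, and a Series behave when assigned as a new column.

```python
import pandas as pd

dataframe = pd.DataFrame({'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
                          'Marks': [81, 89, 93, 72]})

# A scalar is broadcast to every row.
dataframe['Passed'] = True

# A list needs one value per row; a shorter list raises ValueError
# and the column is not added.
try:
    dataframe['Grade'] = ['B', 'B', 'A']  # 3 values for 4 rows
    length_error = None
except ValueError as exc:
    length_error = exc
print(length_error)

# A Series is aligned on the index, so rows without a match get NaN.
dataframe['Bonus'] = pd.Series([5, 10], index=[0, 2])
print(dataframe)
```

Index alignment is worth keeping in mind when the new values come from another DataFrame whose index differs from ours.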

2. Using insert() Method

While adding a new column by assigning a list, the column is always added to the end of the existing DataFrame. However, by using the insert() method we can add the new column at an index of our choice.

Syntax:

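Below is the syntax of the insert() method.

DataFrame.insert(loc, column, value, allow_duplicates=False)
  • loc is the integer position at which to insert the column; it must be between 0 and the current number of columns,
  • column is the name of the new column,
  • value holds the values for the column and can be a scalar, a Series, or an array-like object such as a list,
  • allow_duplicates is a boolean value that controls whether a column with an already existing name may be inserted. By default it is not allowed, and insert() raises a ValueError if the name is already in use.

Note that insert() modifies the DataFrame in place and returns None.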

We will now look at an example to add the column Grade at index 1, to the DataFrame which contains Name and Marks.

Example:

import pandas as pd

data = {'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
        'Marks': [81, 89, 93, 72]}

dataframe = pd.DataFrame(data)

dataframe.insert(1, 'Grade', ['B', 'B', 'A', 'C'])

print(dataframe)

The code works similarly to the previous example: we enter our data as a dictionary that stores column names and their values as key-value pairs, and then convert it into a DataFrame using the pd.DataFrame() constructor.

The only difference from the previous example is that we now add the new column Grade at index 1 using the insert() method. Since positions are counted from 0, Grade becomes the second column, between Name and Marks. The resulting DataFrame is printed and the output is as follows.

Output:

Adding a New Column to a DataFrame using insert()
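The sketch below illustrates two edge cases of insert(): reusing an existing column name, and inserting at a position equal to the number of columns. The Passed column is a hypothetical addition for illustration.

```python
import pandas as pd

dataframe = pd.DataFrame({'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
                          'Marks': [81, 89, 93, 72]})
dataframe.insert(1, 'Grade', ['B', 'B', 'A', 'C'])

# Inserting a second column named 'Grade' fails by default.
try:
    dataframe.insert(0, 'Grade', ['A', 'A', 'A', 'A'])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc
print(duplicate_error)

# loc may equal the number of columns, which appends at the end.
# A scalar value is repeated for every row.
dataframe.insert(len(dataframe.columns), 'Passed', True)
print(dataframe.columns.tolist())
```

Because insert() changes the DataFrame in place, its return value should not be assigned back to the variable.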

3. Using assign() Method

The assign() method returns a new DataFrame that includes the added column, leaving the original DataFrame intact. The new column is added to the end of the new DataFrame, which also contains all the columns from our existing DataFrame. If the name matches an existing column, that column is overwritten in the returned DataFrame.

Syntax:

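Below is the syntax of the assign() method.

DataFrame.assign(**kwargs)

**kwargs indicates that columns are passed as keyword arguments, where the keyword is the column name and the value is what you want to assign to that column. The value can be a list, a Series, a scalar, or a callable that receives the DataFrame and returns the column's values.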

We will look at how to implement the above example of adding a new column Grade to an existing DataFrame containing Name and Marks using assign().

Example:

import pandas as pd

data = {'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
        'Marks': [81, 89, 93, 72]}

dataframe_org = pd.DataFrame(data)

dataframe_new = dataframe_org.assign(Grade=['B', 'B', 'A', 'C'])

print(dataframe_org)

print(dataframe_new)

Here, we store the original DataFrame obtained from our data dictionary in a variable dataframe_org. Then we call the assign() method on dataframe_org, which returns a copy of the DataFrame with the new column Grade added, and we store that copy in the dataframe_new variable.

On printing dataframe_org we get our original DataFrame without the addition of the Grade column and on printing dataframe_new, we get the new DataFrame with the Grade column added to its end.

Output:

Adding a New Column to a DataFrame using assign()
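Since assign() also accepts callables, a column can be computed from existing ones. The following is a minimal sketch that assumes two hypothetical columns, Scaled and AboveAverage, derived from Marks.

```python
import pandas as pd

dataframe_org = pd.DataFrame({'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
                              'Marks': [81, 89, 93, 72]})

# Each callable receives the DataFrame and returns the new column's values.
dataframe_new = dataframe_org.assign(
    Scaled=lambda df: df['Marks'] / 100,
    AboveAverage=lambda df: df['Marks'] > df['Marks'].mean(),
)

print(dataframe_org.columns.tolist())  # original is unchanged
print(dataframe_new)
```

Because assign() returns a new DataFrame, it fits naturally into method chains where each step builds on the previous result.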

4. Using apply() Method

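The apply() method helps us add a new column based on a function of our choice. Called on an existing column (a Series), it applies the function to each value and returns a new Series, which we can assign as a new column. Called on the whole DataFrame, it applies the function to each column or each row.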

Syntax:

Below is the syntax of the apply() method for a DataFrame. A single column (Series) has its own apply() method, which calls func on each value.

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
  • func is the function to be applied to each column or row,
  • axis has a default value of 0, which applies the function to each column. If you specify axis=1, it applies the function to each row,
  • raw is a boolean value that indicates whether each row or column is passed to func as an ndarray object instead of a Series,
  • result_type controls how list-like results returned by func are shaped; it only takes effect when axis=1,
  • args is a tuple containing additional positional arguments to pass to func,
  • **kwargs are additional keyword arguments to pass to func.

Let us look at how we can add the Grade column to our DataFrame containing Name and Marks, by applying a function to Marks that calculates the grade according to the values in Marks.

Example:

import pandas as pd 

data = {'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
        'Marks': [81, 89, 93, 72]}

df = pd.DataFrame(data)

def grade_calc(marks):
    # Map a numeric mark to a letter grade.
    if marks > 90:
        return 'A'
    elif 80 <= marks <= 90:
        return 'B'
    elif marks < 80:
        return 'C'
    else:
        # Only reached when every comparison fails, e.g. for NaN.
        return 'D'

df['Grade'] = df['Marks'].apply(grade_calc)

print(df)

Here we have a function grade_calc() that calculates a grade from a mark. Calling apply() on the Marks column runs the function on each value and returns a Series of grades, which we assign to the new Grade column.

Output:

Adding a New Column to a DataFrame using apply()
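The axis and **kwargs parameters described in the syntax can be sketched as follows. The helper functions describe() and is_pass(), the pass_mark parameter, and the Summary and Passed columns are all hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Bruce', 'Tony', 'Natasha', 'Steve'],
                   'Marks': [81, 89, 93, 72]})

def describe(row):
    # With axis=1, each row arrives as a Series indexed by column name.
    return f"{row['Name']}: {row['Marks']}"

def is_pass(marks, pass_mark):
    # pass_mark is forwarded by apply() as a keyword argument.
    return marks >= pass_mark

# Row-wise apply on the DataFrame can combine several columns.
df['Summary'] = df.apply(describe, axis=1)

# Element-wise apply on a single column, with an extra keyword argument.
df['Passed'] = df['Marks'].apply(is_pass, pass_mark=80)

print(df)
```

Row-wise apply() calls a Python function once per row, so for large DataFrames a vectorized expression such as df['Marks'] >= 80 is usually faster when one is available.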

Conclusion

Adding new columns to a DataFrame is a fundamental operation that we may need to perform often while working with Pandas. There are many ways to accomplish this, and in this article we've looked at four of the simplest. The most straightforward is assigning a list to a new column name. We have also looked at the built-in methods insert(), which places the column at a chosen position, assign(), which returns a new DataFrame and leaves the original intact, and apply(), which computes the column from a function. Knowing how to use these methods will help us manipulate data effectively in Pandas.


Nandana Pradosh