Pandas apply

I often use Pandas to process NLP data. In many cases I want to create a new column from the information in an existing column, for example the number of characters or tokens in a text.

This can easily be done with the help of the apply function of Pandas.
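As a minimal sketch of that single-column case: the data, the column name `text`, and the whitespace-based token count below are assumptions made for illustration, not part of the original example.

```python
import pandas as pd

# Hypothetical example data; the column name "text" is an assumption.
texts = pd.DataFrame({"text": ["hello world", "pandas apply", "a b c"]})

# Series.apply calls the function once for every value in the column.
texts["n_chars"] = texts["text"].apply(len)

# Whitespace splitting stands in for a real tokenizer here.
texts["n_tokens"] = texts["text"].apply(lambda s: len(s.split()))
```

Each call reads one column and writes one new column, which is the simple pattern the rest of this post extends.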

However, a trickier case is when you want to apply a single function that creates two new columns from the information in two existing columns.

Here I show you how it’s done.

import pandas as pd
df = pd.DataFrame(
    {
        "int1": [1, 2, 3],
        "int2": [11, 12, 13],
        "strings": ["string1", "string2", "string3"],
    }
)
df
   int1  int2  strings
0     1    11  string1
1     2    12  string2
2     3    13  string3
def add_and_multiply(x, y):
    add_result = x + y
    multiply_result = x * y
    return add_result, multiply_result
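# axis=1 passes each row to the lambda, which unpacks it into x and y.
# result_type="expand" turns the returned tuple into separate columns.
df[["int1_plus_int2", "int1_times_int2"]] = df[["int1", "int2"]].apply(
    lambda x: add_and_multiply(*x),
    axis=1,
    result_type="expand",
)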
df
   int1  int2  strings  int1_plus_int2  int1_times_int2
0     1    11  string1              12               11
1     2    12  string2              14               24
2     3    13  string3              16               39
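
To see why the assignment works, it can help to inspect what apply returns before assigning it. The sketch below rebuilds the same data under the hypothetical name `demo` and compares the result with and without `result_type="expand"`. The variable names are assumptions for illustration.

```python
import pandas as pd

# Same numbers as above, under a separate (hypothetical) name.
demo = pd.DataFrame({"int1": [1, 2, 3], "int2": [11, 12, 13]})


def add_and_multiply(x, y):
    return x + y, x * y


# With result_type="expand", each returned tuple is spread across columns,
# so the result is a DataFrame with the integer column labels 0 and 1.
expanded = demo[["int1", "int2"]].apply(
    lambda row: add_and_multiply(*row),
    axis=1,
    result_type="expand",
)

# Without it, the result is a single Series whose values are tuples,
# which cannot be assigned directly to two new columns.
as_tuples = demo[["int1", "int2"]].apply(
    lambda row: add_and_multiply(*row),
    axis=1,
)
```

Because `expanded` has exactly two columns, assigning it to a list of two new column names fills them in order.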