Pandas - reset_index()

Posted on Mar 12, 2022

Functions like sklearn.model_selection.train_test_split accept dataframes, and every shuffled row keeps the index label it had in the original dataframe:

Fruit
2 Apple
3 Banana
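
As a minimal sketch of that behaviour, the snippet below splits a small dataframe and prints both halves. The four-row frame, the extra fruit names, and the test_size and random_state values are assumptions chosen for illustration; which rows land in which half depends on the seed.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: only Apple and Banana come from the post itself.
df = pd.DataFrame({"Fruit": ["Cherry", "Durian", "Apple", "Banana"]})

# shuffle=True is the default; random_state only makes the split repeatable.
train, test = train_test_split(df, test_size=0.5, random_state=0)

# Each half is still a dataframe, and each row carries the label it had in df,
# so neither half is guaranteed to be labelled 0..n-1.
print(train)
print(test)
```

The labels in train and test are a partition of the labels in df, which is why a split dataframe can start at 2 instead of 0.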

I have a tendency to use reset_index() to clean up my dataframe:

Fruit
0 Apple
1 Banana
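
A small sketch of that cleanup, assuming the frame shown earlier with labels 2 and 3. Since the output above has no extra column, it was presumably produced with drop=True; the variable names here are made up for the example.

```python
import pandas as pd

# The two-row frame from above, still carrying its original labels.
fruit = pd.DataFrame({"Fruit": ["Apple", "Banana"]}, index=[2, 3])

# drop=True throws the old labels away and renumbers the rows from 0.
clean = fruit.reset_index(drop=True)
print(clean)

# Without drop=True the old labels survive as a new column named "index".
kept = fruit.reset_index()
print(kept)
```

Keeping the old labels in a column is one way to retain a reference to the original rows while still getting a tidy 0-based index.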

It looks much cleaner, but any later operation that relies on the index, such as pandas.concat, will misbehave. Sharing code with other developers becomes painful, because everything that happens after reset_index() loses its reference to the original row:

Weight (Grams)
0 100
1 120

A joined table should be:

Fruit Weight (Grams)
0 Apple 100
1 Banana 120

But instead:

Fruit Weight (Grams)
2 Apple NaN
3 Banana NaN
0 NaN 100.0
1 NaN 120.0
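
The mismatch above can be reproduced with a short sketch. It assumes the fruit frame still carries labels 2 and 3 while the weight frame has been reset to 0 and 1, and that the join is done with pandas.concat along axis=1; the variable names are made up for the example.

```python
import pandas as pd

fruit = pd.DataFrame({"Fruit": ["Apple", "Banana"]}, index=[2, 3])
weight = pd.DataFrame({"Weight (Grams)": [100, 120]}, index=[0, 1])

# With axis=1, concat lines rows up by index label, not by position.
# No labels are shared here, so every row is half empty.
mismatched = pd.concat([fruit, weight], axis=1)
print(mismatched)

# Giving both frames the same labels restores the intended two-row table.
aligned = pd.concat([fruit.reset_index(drop=True), weight], axis=1)
print(aligned)
```

Resetting both frames works here only because their rows are already in the same order; if they are not, the reset silently pairs the wrong rows, which is the risk described above.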