Pandas - reset_index()

Posted on Mar 12, 2022

Functions like sklearn.model_selection.train_test_split can be used on dataframes, and each shuffled row keeps its original index label:

	Fruit
2	Apple
3	Banana
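A minimal sketch of that behaviour, using only pandas. DataFrame.sample stands in for train_test_split here (an assumption made to keep the example free of scikit-learn), and the four-row fruits frame is made up for illustration:

```python
import pandas as pd

# Hypothetical data; the post only shows the two rows that survived the split.
fruits = pd.DataFrame({"Fruit": ["Cherry", "Durian", "Apple", "Banana"]})

# Stand-in for one half of a shuffled split: pick half of the rows at random.
subset = fruits.sample(frac=0.5, random_state=0)

# The subset keeps the labels the rows had in the original frame,
# so each label still points back at the same fruit.
print(subset)
print(fruits.loc[subset.index, "Fruit"].equals(subset["Fruit"]))
```

Because the labels survive, the subset can still be matched against any other table that shares the original index.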

I have a tendency to use reset_index() to clean up my dataframe:

	Fruit
0	Apple
1	Banana

It looks much cleaner, but any operation that relies on the index, such as pandas.concat, will then cause issues. Sharing code with other developers becomes painful, because anything that happens after reset_index() loses its reference to the original row. For example, given a second table indexed from 0:

	Weight (Grams)
0	100
1	120

The joined table should be:

	Fruit	Weight (Grams)
0	Apple	100
1	Banana	120

But instead, when only one of the two tables has had its index reset, the result is:

	Fruit	Weight (Grams)
2	Apple	nan
3	Banana	nan
0	nan	100
1	nan	120
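The mismatch can be reproduced with a short sketch. The two frames below are rebuilt by hand from the tables in this post (the variable names are my own), and the "fix" shown is just one option, assuming the rows of both frames are already in the same order:

```python
import pandas as pd

# Fruit still carries the labels from the split; weight was indexed from 0.
fruit = pd.DataFrame({"Fruit": ["Apple", "Banana"]}, index=[2, 3])
weight = pd.DataFrame({"Weight (Grams)": [100, 120]}, index=[0, 1])

# concat along the columns aligns on index labels, not on row position,
# so labels that exist in only one frame are padded with NaN.
misaligned = pd.concat([fruit, weight], axis=1)
print(misaligned)

# Resetting the index on the remaining frame lines the rows up by position.
# drop=True discards the old labels instead of keeping them as a column.
aligned = pd.concat([fruit.reset_index(drop=True), weight], axis=1)
print(aligned)
```

This only works when both frames are known to be in the same row order. If that is not guaranteed, keeping the original index on both frames is the safer choice.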