Pandas - reset_index()
Functions like `sklearn.model_selection.train_test_split` can be used on DataFrames, and the rows they return keep their original index labels even after shuffling:
| | Fruit |
|---|---|
| 2 | Apple |
| 3 | Banana |
I have a tendency to use `reset_index()` to clean up my DataFrame:
| | Fruit |
|---|---|
| 0 | Apple |
| 1 | Banana |
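A minimal sketch of that cleanup, assuming a small hypothetical `fruits` DataFrame in which rows 2 and 3 stand in for one half of a shuffled split. Note that a bare `reset_index()` keeps the old labels in a new `index` column; passing `drop=True` is what discards them and gives the table above.

```python
import pandas as pd

# Hypothetical data: four fruits, of which rows 2 and 3 are selected,
# standing in for one half of a shuffled split.
all_fruits = pd.DataFrame({"Fruit": ["Cherry", "Durian", "Apple", "Banana"]})
fruits = all_fruits.iloc[[2, 3]]
print(fruits.index.tolist())  # [2, 3] -- the original labels are preserved

# Default behaviour: the old labels move into a new column named "index".
kept = fruits.reset_index()
print(kept.columns.tolist())  # ['index', 'Fruit']

# drop=True discards the old labels and renumbers the rows from 0.
clean = fruits.reset_index(drop=True)
print(clean)
```

Keeping the old labels with the default call may be the safer choice when other code still needs to trace rows back to the original data.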
This looks much cleaner, but operations that rely on index labels, such as `pandas.concat`, can then cause issues. Sharing code with other developers becomes painful, because any operation that happens after `reset_index()` loses its reference to the original row. Take a second DataFrame whose index starts at 0:
| | Weight (Grams) |
|---|---|
| 0 | 100 |
| 1 | 120 |
A joined table should be:
| | Fruit | Weight (Grams) |
|---|---|---|
| 0 | Apple | 100 |
| 1 | Banana | 120 |
But instead, because the index labels of the two DataFrames no longer match, the result is:
| | Fruit | Weight (Grams) |
|---|---|---|
| 2 | Apple | NaN |
| 3 | Banana | NaN |
| 0 | NaN | 100 |
| 1 | NaN | 120 |
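The mismatch can be reproduced with a short sketch. It assumes the join was a column-wise `pandas.concat(..., axis=1)`, which is not stated explicitly here, and it uses hypothetical `fruits` and `weights` frames built to match the tables above.

```python
import pandas as pd

# Hypothetical frames matching the tables above.
fruits = pd.DataFrame({"Fruit": ["Apple", "Banana"]}, index=[2, 3])
weights = pd.DataFrame({"Weight (Grams)": [100, 120]}, index=[0, 1])

# concat with axis=1 aligns rows by index label, not by position,
# so the non-overlapping labels give four half-empty rows.
mismatched = pd.concat([fruits, weights], axis=1)
print(mismatched)

# Giving both frames the same 0..n-1 labels lets the rows line up.
aligned = pd.concat([fruits.reset_index(drop=True), weights], axis=1)
print(aligned)
```

Resetting both frames only helps when their rows are already in the same positional order. If that is not guaranteed, leaving the original index in place on both sides is likely the more reliable way to keep rows matched.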