Delete duplicate rows in a pandas DataFrame

Import modules

import pandas as pd

Create dataframe with duplicates

raw_data = {'first_name': ['Jason', 'Jason', 'Tina', 'Jake', 'Amy'],
        'last_name': ['Miller', 'Miller', 'Ali', 'Milner', 'Cooze'],
        'age': [42, 42, 36, 24, 73],
        'preTestScore': [4, 4, 31, 2, 3],
        'postTestScore': [25, 25, 57, 62, 70]}
df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age', 'preTestScore', 'postTestScore'])
df

	first_name	last_name	age	preTestScore	postTestScore
0	Jason	Miller	42	4	25
1	Jason	Miller	42	4	25
2	Tina	Ali	36	31	57
3	Jake	Milner	24	2	62
4	Amy	Cooze	73	3	70

Identify which observations are duplicates

df.duplicated()

0    False
1     True
2    False
3    False
4    False
dtype: bool
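By default, duplicated() marks only the second and later occurrences, which is why row 0 comes back False. As a minimal sketch (rebuilding a trimmed copy of the frame so the snippet runs on its own), the keep=False option flags every member of a duplicated group, and summing the default mask counts the repeated rows. The variable names here are illustrative only.

```python
import pandas as pd

# Trimmed copy of the example data so this snippet is self-contained
df = pd.DataFrame({
    'first_name': ['Jason', 'Jason', 'Tina', 'Jake', 'Amy'],
    'last_name': ['Miller', 'Miller', 'Ali', 'Milner', 'Cooze'],
    'age': [42, 42, 36, 24, 73],
})

# keep=False flags every row that belongs to a duplicated group,
# including the first occurrence
all_dupes = df.duplicated(keep=False)

# True counts as 1 when summed, so this is the number of repeated rows
n_repeats = int(df.duplicated().sum())

# Use the mask to inspect the affected rows before dropping anything
dupe_rows = df[all_dupes]
print(dupe_rows)
print(n_repeats)
```

Looking at the flagged rows first is a cheap way to confirm that they really are redundant before removing them.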

Drop duplicates

df.drop_duplicates()

	first_name	last_name	age	preTestScore	postTestScore
0	Jason	Miller	42	4	25
2	Tina	Ali	36	31	57
3	Jake	Milner	24	2	62
4	Amy	Cooze	73	3	70
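Note that drop_duplicates() returns a new DataFrame and leaves df unchanged, and that the surviving rows keep their original index labels (0, 2, 3, 4 above). The sketch below, which assumes you want a clean 0..n-1 index afterwards, assigns the result to a new name (deduped, chosen here for illustration) and resets the index.

```python
import pandas as pd

# Trimmed copy of the example data so this snippet is self-contained
df = pd.DataFrame({
    'first_name': ['Jason', 'Jason', 'Tina', 'Jake', 'Amy'],
    'last_name': ['Miller', 'Miller', 'Ali', 'Milner', 'Cooze'],
    'age': [42, 42, 36, 24, 73],
})

# Assign the result; the original frame is not modified
deduped = df.drop_duplicates()

# drop=True discards the old index labels instead of adding them as a column
deduped = deduped.reset_index(drop=True)

print(len(df))       # original still has all rows
print(deduped)
```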

Drop rows that repeat a value in the first_name column, keeping the last observation in each duplicated set

df.drop_duplicates(['first_name'], keep='last')

	first_name	last_name	age	preTestScore	postTestScore
1	Jason	Miller	42	4	25
2	Tina	Ali	36	31	57
3	Jake	Milner	24	2	62
4	Amy	Cooze	73	3	70
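Deduplicating on a single column can merge records that only happen to share that value. The sketch below uses hypothetical data (a second Jason with a different last name, not part of the example above) to compare a one-column subset, a two-column subset, and keep=False, which discards every row of a duplicated group.

```python
import pandas as pd

# Hypothetical data: two different people share a first name
people = pd.DataFrame({
    'first_name': ['Jason', 'Jason', 'Jason', 'Tina'],
    'last_name': ['Miller', 'Miller', 'Smith', 'Ali'],
    'age': [42, 42, 51, 36],
})

# One column: all three Jasons collapse into the last one (Smith)
by_first = people.drop_duplicates(subset=['first_name'], keep='last')

# Two columns: Jason Miller and Jason Smith are treated as distinct
by_full = people.drop_duplicates(subset=['first_name', 'last_name'], keep='last')

# keep=False: remove every row whose name pair appears more than once
no_repeats = people.drop_duplicates(subset=['first_name', 'last_name'], keep=False)

print(by_first)
print(by_full)
print(no_repeats)
```

Choosing the subset is a judgment about what identifies a record, so it is worth checking which columns together make a row unique before dropping anything.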

Attribution, NonCommercial, NoDerivatives
