Handling Missing Values in Pandas: Detection and Deletion
2. Detecting Missing Values
DataFrame.isna()
import numpy as np
import pandas as pd

df = pd.DataFrame({'age': [5, 6, np.nan],
                   'born': [pd.NaT, pd.Timestamp('1939-05-27'), pd.Timestamp('1940-04-25')],
                   'name': ['Alfred', 'Batman', ''],
                   'toy': [None, 'Batmobile', 'Joker']})
df
   age       born    name        toy
0  5.0        NaT  Alfred       None
1  6.0 1939-05-27  Batman  Batmobile
2  NaN 1940-04-25              Joker

df.isna()
     age   born   name    toy
0  False   True  False   True
1  False  False  False  False
2   True  False  False  False

ser = pd.Series([5, 6, np.nan])
ser.isna()
0    False
1    False
2     True

# For a DataFrame we usually care more about how many missing values each column has, so count them
df.isna().sum()
age     1
born    1
name    0
toy     1

DataFrame.isnull()
df.isnull()
     age   born   name    toy
0  False   True  False   True
1  False  False  False  False
2   True  False  False  False

# Count the missing values in a single column
df['age'].isnull().sum()
1

DataFrame.notna()
df.notna()
     age   born  name    toy
0   True  False  True  False
1   True   True  True   True
2  False   True  True   True
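The boolean table returned by notna() is often used as a mask to keep only the rows that have data. The following is a minimal sketch of that idea on a trimmed-down version of the frame above; the variable names has_toy and complete are illustrative only and do not come from the original example.

```python
import numpy as np
import pandas as pd

# Trimmed-down version of the example frame (two columns only).
df = pd.DataFrame({
    'age': [5, 6, np.nan],
    'toy': [None, 'Batmobile', 'Joker'],
})

# Keep only the rows where 'toy' is present.
has_toy = df[df['toy'].notna()]

# Keep only the rows where every column is present.
complete = df[df.notna().all(axis=1)]
```

Here has_toy keeps the rows labelled 1 and 2, while complete keeps only row 1, because row 2 is missing its age.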
DataFrame.notnull()
df.notnull()
     age   born  name    toy
0   True  False  True  False
1   True   True  True   True
2  False   True  True   True
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   age     2 non-null      float64
 1   born    2 non-null      datetime64[ns]
 2   name    3 non-null      object
 3   toy     2 non-null      object
dtypes: datetime64[ns](1), float64(1), object(2)
memory usage: 224.0+ bytes
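Note that the name column above reports no missing values even though one entry is an empty string: isna() only recognizes None, NaN and NaT. The sketch below shows one possible way to handle that, together with the per-column share of missing values. Treating '' as missing is an assumption about the data, not something the example above requires, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'age': [5, 6, np.nan],
    'name': ['Alfred', 'Batman', ''],
})

# Fraction of missing values per column (the mean of the boolean mask).
missing_ratio = df.isna().mean()

# Assumption: empty strings should count as missing in this dataset.
cleaned = df.replace('', np.nan)
missing_after = cleaned.isna().sum()
```

Before the replacement, name has a missing ratio of 0; afterwards both columns report one missing value each.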
3. Deleting Missing Values
DataFrame.dropna
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})
df
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Drop rows that contain missing values
df.dropna()
     name        toy       born
1  Batman  Batmobile 1940-04-25

# Drop columns that contain missing values; this requires axis='columns'
df.dropna(axis='columns')
       name
0    Alfred
1    Batman
2  Catwoman

# Drop only the rows in which every value is missing (none here)
df.dropna(how='all')
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Keep only the rows with at least 2 non-missing values
df.dropna(thresh=2)
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Look for missing values only in the 'name' and 'born' columns
df.dropna(subset=['name', 'born'])
     name        toy       born
1  Batman  Batmobile 1940-04-25

# Modify df itself instead of returning a new DataFrame
df.dropna(inplace=True)
df
     name        toy       born
1  Batman  Batmobile 1940-04-25
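As the outputs above show, dropna() keeps the original row labels, so the surviving rows may no longer be numbered from 0. The sketch below shows one common way to renumber them with reset_index(drop=True). The variable names kept and renumbered are illustrative, and the frame is a two-column cut of the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Alfred', 'Batman', 'Catwoman'],
    'toy': [np.nan, 'Batmobile', 'Bullwhip'],
})

# Without inplace=True, df is left untouched and a new DataFrame is returned.
kept = df.dropna(subset=['toy'])

# The surviving rows keep labels 1 and 2; renumber them from 0.
renumbered = kept.reset_index(drop=True)
```

Reassigning the result, as in this sketch, is an alternative to inplace=True that leaves the original frame available for comparison.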